Babies are awesome… Humans are the OG neural net.
Even though AI and neuroscience are similar in many ways, they are not identical.
“Babies are awesome… Humans are the OG neural net,” said Elon Musk during a recent Joe Rogan interview, while discussing his newborn, X Æ A-12.
Indeed, there are many similarities between how our brains are wired and how a neural network works. The building blocks of an artificial neural network are loosely modeled on the human brain and imitate some of what the brain does during the learning process. Even though AI and neuroscience are similar in many ways, they are not identical.
By analogy, we don’t build submarines to swim like fish; instead, we borrow the principles of hydrodynamics and apply them to submarine design. Before the Wright brothers, people designed wings that flapped like a bird’s. The Wright brothers solved the problem of flight once they stopped trying to build perfect bird wings. Instead, they studied wing shapes and the way air flowing over and under a wing generates lift.
In the same way, we can look to the human brain for inspiration and borrow valuable concepts. AI researchers have long tried to mimic the internal processes of the human brain, and a better understanding of biological brains could play a vital role in building intelligent machines.
One key to building an intelligent system is a memory system that remembers past experiences. In the brain, this role falls largely to the hippocampus, which plays a vital part in the consolidation of information, learning, and memory.
In reinforcement learning, the value of actions is learned incrementally through repeated experience. A store of individual past experiences, analogous to episodic memory, supports that learning. One key ingredient of the Deep Q-Network (DQN) is “experience replay,” whereby the network stores its experiences and later “replays” them during training. Each stored experience records a situation, such as an Atari game screen or a StarCraft scenario, together with the action taken and the reward that followed. Related episodic approaches go a step further and select actions based on the similarity between the current situation and the experiences stored in memory, favoring the actions that yielded the highest reward.
Experience replay lets a reinforcement learning agent learn from past successes and failures by internally re-enacting the action sequences that led to rewards or punishments. The replay buffer in DQN thus acts like a primitive hippocampus, allowing consolidation, learning, and memory to take place.
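To make the idea concrete, here is a minimal sketch of a replay buffer. It is not DQN’s actual code: the `Transition` and `ReplayBuffer` names, the field layout, and the toy integer states are assumptions chosen for illustration. It only shows the two operations described above, storing experiences and replaying a random batch of them.

```python
import random
from collections import deque, namedtuple

# One stored experience: what was seen, what was done, and what happened next.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size memory of past transitions (hypothetical helper for illustration)."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest transition once the buffer is full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def replay(self, batch_size):
        # Uniform random sampling mixes old and new experiences in one batch.
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Toy usage: integers stand in for game screens.
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.store(
        state=step,
        action=step % 2,
        reward=1.0,
        next_state=step + 1,
        done=(step == 4),
    )

batch = buffer.replay(batch_size=2)
```

In a real agent, each replayed batch would be used to update the network’s action-value estimates. Sampling at random rather than in order is a common design choice because consecutive game frames are highly correlated.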
Humans don’t start their thinking from scratch every second; instead, our thoughts have persistence. As you read this sentence, you understand each word in light of the words that came before it. You combine existing knowledge with what you have just read to form new information.
Human intelligence is characterized in part by our remarkable ability to maintain and manipulate information within an active store, known as working memory. Unlike episodic memory, which is about remembering the past, working memory supports cognition: the mental process of acquiring knowledge and understanding through thought, experience, and the senses.
Our ability to maintain and process information over time has inspired AI researchers to develop recurrent neural network architectures. These networks have loops in them, allowing information to persist from one step to the next. They have been applied to a variety of tasks, such as natural language processing and speech recognition. They have also been used to generate captions describing an image, by picking a part of the image to look at for every word the network outputs. These networks work well on a wide range of problems, particularly long short-term memory (LSTM) networks, which have achieved state-of-the-art performance across a variety of domains.
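The “loop” can be sketched in a few lines. The example below is a deliberately tiny, single-unit recurrent cell, not an LSTM; the function names and the weight values (`w_x`, `w_h`, `b`) are assumptions picked for illustration. It shows how the state from the previous step feeds into the next one, so the same input can produce different results depending on what came before.

```python
import math


def rnn_step(x, h, w_x, w_h, b):
    """One step of a single-unit recurrent cell: tanh(w_x * x + w_h * h + b)."""
    return math.tanh(w_x * x + w_h * h + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0  # empty memory before the sequence starts
    states = []
    for x in inputs:
        # The loop: the previous state h is fed back in with the new input x.
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# Both sequences end with the same input (0.0), but their histories differ.
states_a = run_rnn([1.0, 0.0])
states_b = run_rnn([-1.0, 0.0])
```

After the final step, `states_a[-1]` is positive and `states_b[-1]` is negative, even though both cells just received the same input. The difference comes entirely from the state carried over from the first step.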
Most CNN models work directly on the entire image to determine whether, say, a cat is present. Our visual attention, by contrast, shifts strategically from object to object. Instead of processing the whole image, we concentrate our processing resources on the information that is relevant at any given moment.
Such attention mechanisms have been a source of inspiration for AI architectures that aim to ignore irrelevant objects in an image and focus on what is relevant. Because only the selected regions are processed in detail, this approach can also keep computational cost from growing in proportion to the size of the input image. Attention mechanisms have produced impressive performance on difficult multi-object recognition tasks in the presence of clutter. They have also enabled image-to-caption generation.
While attention was initially thought of as an orienting mechanism for perception, it has led to state-of-the-art performance in machine translation and has been applied in Google Translate. Its success is owed to its ability to generalize well to long sentences by selectively focusing on sub-parts of the sentence during translation.
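As a rough illustration of “selectively focusing,” the sketch below implements dot-product attention, one common form of the idea; the article does not say which variant any particular system uses. The function names and the toy vectors are assumptions. A query is compared with every key, the scores are turned into weights that sum to one, and the output is the weighted average of the values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, context) for dot-product attention."""
    # Score each key by its dot product with the query.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    # The context is the weighted average of the value vectors.
    dim = len(values[0])
    context = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(dim)
    ]
    return weights, context


# Three "words," each with a key (what it is about) and a value (its content).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [4.0, 0.0]  # asks for items related to the first dimension

weights, context = attend(query, keys, values)
```

Here the first and third items match the query equally well and share almost all of the weight, while the second item is nearly ignored. The model still sees every item, but most of its “focus” lands on the relevant ones.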
Humans have an inherent ability to transfer knowledge gained in one context and apply it to new situations. We can generalize what we learned on one task to solve related tasks. We don’t learn everything from the ground up when we attempt something new. Instead, we leverage what we have learned in the past.
For example, we recognize cars as objects that have wheels and doors and that come in a particular range of shapes and sizes. We can use this knowledge when trying to recognize trucks, without re-learning what a wheel looks like. Similarly, we can learn Italian more efficiently after learning French, because we generalize from common grammatical features and similar words.
Despite being second nature for humans, transfer learning remains a research problem in machine learning. AI researchers focus on taking the knowledge gained from one application and applying it to a different but related problem. The key motivation for transfer learning in deep learning is the lack of annotated data in many domains. Machine learning approaches rely on an adequate amount of annotated training data, and obtaining enough labeled data is often costly.
In computer vision, one successful use of transfer learning builds on the ImageNet Challenge, where participants were provided a subset of ImageNet training data containing 1,000 categories and 1.2 million images. Models trained on this data are widely reused as starting points for other vision tasks.
In natural language processing, word embeddings such as Word2vec learn the semantic meaning of words from source data like Wikipedia, and the learned embeddings are then applied to tasks such as sentiment analysis and document classification.
In speech recognition, models developed for English have been successfully used to improve speech recognition for other languages.
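The pattern shared by these examples can be sketched as: keep the layers learned on the source task fixed, and train only a small new “head” on the limited labeled data for the target task. The toy below is an assumption-laden illustration, not a real pretrained model: `PRETRAINED_W` is a made-up stand-in for weights learned elsewhere, and the four-example dataset is invented.

```python
import math

# Hypothetical "pretrained" weights, standing in for layers learned on a
# large source task. They are never updated below.
PRETRAINED_W = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: a fixed linear layer followed by ReLU."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)))
        for row in PRETRAINED_W
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(x, w, b):
    features = extract_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, features)) + b)


def train_head(data, epochs=500, lr=0.5):
    """Fit only a new logistic-regression head on top of the frozen features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            features = extract_features(x)
            err = score(x, w, b) - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * fi for wi, fi in zip(w, features)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    return 1 if score(x, w, b) >= 0.5 else 0


# A tiny labeled target dataset: label 1 when the first coordinate is larger.
target_data = [
    ([2.0, 0.0], 1),
    ([3.0, 1.0], 1),
    ([0.0, 2.0], 0),
    ([1.0, 3.0], 0),
]

head_w, head_b = train_head(target_data)
predictions = [predict(x, head_w, head_b) for x, _ in target_data]
```

Only three numbers are learned here (two head weights and a bias), which is why four labeled examples are enough. In practice, the frozen part would be a large network trained on something like ImageNet, and the head would be trained on the smaller target dataset.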