Google’s New AI Gets Smarter Thanks to a Working Memory
“The behavior of the computer at any moment is determined by the symbols which he is observing and his ‘state of mind’ at that moment.” – Alan Turing
Artificial intelligence has a memory problem.
Back in early 2015, Google’s mysterious DeepMind unveiled an algorithm that could teach itself to play Atari games. Based on deep neural nets, the AI impressively mastered nostalgic favorites such as Space Invaders and Pong without needing any explicit programming — it simply learned through millions of examples.
But the algorithm had a weakness: memory. Without a memory module, it couldn’t store away any information it had already mastered. When faced with problems requiring multi-step reasoning, the algorithm faltered.
Now, the DeepMind team is back with an updated deep neural net dubbed the “differentiable neural computer (DNC).” Taking inspiration from plasticity mechanisms in the hippocampus, our brain’s memory storage system, the team has added a memory module to a deep learning neural network that allows it to quickly store and access learned bits of knowledge when needed.
With training, the algorithm can flexibly solve difficult reasoning problems that stump conventional neural networks — for example, navigating the London Underground subway system or reasoning about interpersonal relationships based on a family tree.
That might not sound impressive, but DNCs could be a gateway to more powerful computational engines that marry deep learning with rational thinking.
Deep learning already excels at extracting structure from big data. By adding a dynamic knowledge base to these powerful algorithms, DNCs could eventually be capable of “intuiting the variable structure and scale of the world within a single, generic model.” The team recently published their work in Nature.
Memorable Deep Neural Nets
In a nutshell, a DNC is a hybrid between a neural network and a conventional computer equipped with random-access memory (better known as RAM).
Neural networks, true to their name, loosely resemble the biological neurons firing away in our brains. Made up of layers of artificial neurons, deep neural nets take in millions of training examples and learn by adjusting the strength of the connections between the neurons. This gradually moves the AI’s response closer to a given problem’s solution, kind of like tuning a guitar.
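To make "adjusting the strength of the connections" concrete, here is a minimal sketch of gradient descent on a single connection weight. It is a toy, not DeepMind's training code: the data, learning rate and step count are made-up values chosen only for illustration.

```python
# Toy illustration: one artificial "connection" with weight w learns y = 3 * x.
# The examples, learning rate and step count are invented for this sketch.

def train_single_weight(examples, learning_rate=0.05, steps=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # start with an untrained connection
    for _ in range(steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * grad  # nudge the connection strength downhill
    return w

examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
learned_w = train_single_weight(examples)
print(round(learned_w, 3))
```

A real deep net repeats the same kind of small nudge across millions of weights at once, which is why it needs so many training examples.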
Just like our brains, deep neural nets don’t run on preprogrammed commands — they come up with their own by finding structures through pattern recognition.
These powerful algorithms are extremely good at recognizing faces in photos and translating languages. The problem is they’re one-trick ponies: when asked to perform a new task, the AI needs to be retrained with another data set, which in turn wipes out everything it had previously learned.
This is where RAM comes in.
In conventional computers, the processor dynamically bundles information into combinations stored in RAM, which it can then read from and write to. This way, the memory module frees up space in the processor to organize intermediate computational results. Each piece of knowledge (say, a number, a list, a tree or a map) is represented by a variable stored in memory that links to other variables, making each piece of related knowledge easily accessible.
The DNC combines the best of both worlds.
“When we designed DNCs, we wanted machines that could learn to form and navigate complex data structures on their own,” DeepMind researchers wrote in a detailed blog post explaining their work.
At the heart of the DNC is a neural network dubbed the controller, which takes in training data, communicates with the memory module and outputs answers — somewhat like a conventional computer CPU, but without the need for prior programming.
When the team feeds the DNC sets of training data, such as a map of the London Underground, the controller chooses if and where to store each bit of information in the memory module and links associated pieces in the order in which they were stored. This way, the algorithm not only builds a database of knowledge to draw upon, it also recognizes patterns among the pieces of information in that database.
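The idea of linking memories in the order they were stored can be sketched in a few lines. The update rule below follows the temporal link matrix described in the DNC paper, but it is a simplified illustration: the three-slot memory, the one-hot write weights and all function names are assumptions made for this example, and a trained DNC would use soft, learned weights instead.

```python
# Simplified sketch of temporal linking. link[i][j] close to 1 means
# "slot i was written right after slot j". Names and sizes are illustrative.

def update_links(link, precedence, write_w):
    """Apply one write step and return the new link matrix and precedence."""
    n = len(write_w)
    new_link = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a slot never links to itself
                new_link[i][j] = ((1 - write_w[i] - write_w[j]) * link[i][j]
                                  + write_w[i] * precedence[j])
    total = sum(write_w)
    # Precedence tracks which slot was written most recently.
    new_precedence = [(1 - total) * p + w for p, w in zip(precedence, write_w)]
    return new_link, new_precedence

def read_forward(link, read_w):
    """Shift attention from the slots in read_w to the slots written next."""
    n = len(read_w)
    return [sum(link[i][j] * read_w[j] for j in range(n)) for i in range(n)]

n_slots = 3
link = [[0.0] * n_slots for _ in range(n_slots)]
precedence = [0.0] * n_slots
for slot in [2, 0, 1]:  # write to slot 2, then slot 0, then slot 1
    one_hot = [1.0 if k == slot else 0.0 for k in range(n_slots)]
    link, precedence = update_links(link, precedence, one_hot)

next_after_2 = read_forward(link, [0.0, 0.0, 1.0])  # attention moves to slot 0
next_after_0 = read_forward(link, [1.0, 0.0, 0.0])  # attention moves to slot 1
```

Because every step is ordinary arithmetic on weights rather than a hard pointer jump, the whole chain stays differentiable, so the controller can learn when to follow it.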
Rather than being programmed what to do, the controller, as a neural net, learns through examples. With training, it figures out how to produce better and better answers and learns how to best use its memory in the process.
“Together, these operations give DNCs the ability to make choices about how they allocate memory, store information in memory, and easily find it once there,” explained the authors.
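Two of those operations, finding information and storing it, can be sketched as "soft" reads and writes. The DNC paper describes looking up memory by cosine similarity to a key and writing by erasing and then adding; the sketch below illustrates that idea under simplifying assumptions. The memory contents, the sharpness value and the function names are invented for this example, and memory allocation is left out entirely.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (small epsilon avoids 0/0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-8)

def content_weights(memory, key, sharpness=10.0):
    """Softmax over similarities: how strongly each slot matches the key."""
    scores = [sharpness * cosine(row, key) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_read(memory, weights):
    """Weighted blend of all memory rows."""
    width = len(memory[0])
    return [sum(w * row[c] for w, row in zip(weights, memory))
            for c in range(width)]

def soft_write(memory, weights, erase, add):
    """Erase part of each attended row, then add new content to it."""
    return [[cell * (1 - w * e) + w * a for cell, e, a in zip(row, erase, add)]
            for w, row in zip(weights, memory)]

memory = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
weights = content_weights(memory, key=[0.9, 0.1, 0.0])  # mostly slot 0
read_vector = soft_read(memory, weights)
updated = soft_write(memory, [1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.5, 0.5])
```

In a DNC the key, erase and add vectors would all be produced by the controller network, which is what lets training shape how the memory gets used.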
Equipped with memory, the AI outperformed unaided neural networks at multiple reasoning tasks.
For example, a DNC trained on a map of the London Underground gradually intuited what the stations are and how they’re connected, bundled those parameters as variables and off-loaded them to the memory module to free up processing space.
When asked how to get from point A to B, the AI correctly found the optimized route 98.8% of the time after a million training samples. In stark contrast, unaided networks struggled even after double the amount of training data, charting out the correct path with a measly 37% success rate.
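For comparison, this is what the same route-finding task looks like when a human writes the algorithm by hand. The sketch below uses breadth-first search on a made-up five-station network; the station labels and connections are hypothetical and are not taken from the real Underground map or from the paper. The DNC had to learn comparable behavior from examples alone.

```python
from collections import deque

def shortest_route(connections, start, goal):
    """Breadth-first search: returns a path with the fewest stops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        station = path[-1]
        if station == goal:
            return path
        for neighbor in connections.get(station, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route exists

# Hypothetical toy network: each station lists its direct neighbors.
toy_network = {
    "A": ["C", "D"],
    "C": ["A", "E"],
    "D": ["A", "E"],
    "E": ["C", "D", "B"],
    "B": ["E"],
}
route = shortest_route(toy_network, "A", "B")
print(route)
```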
The DNC also displayed admirable deductive skills.
When given a family tree that only describes parent, child and sibling relationships, the network could correctly answer questions such as “Who is A’s maternal great uncle?”, a kind of generalization far beyond the ability of conventional neural networks.
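To see why that question takes several hops, here is a hand-written lookup that chains the same basic relations. It assumes a great uncle is a brother of a grandparent, and the family, the names and the record of who is male are all invented for this sketch. A DNC is given no such rules and has to work out the chain from examples.

```python
# Hypothetical toy family: child -> (mother, father). All names are invented.
parents = {
    "A": ("Mum", "Dad"),
    "Mum": ("Gran", "Grandpa"),
    "Gran": ("GreatGran", "GreatGrandpa"),
    "Bert": ("GreatGran", "GreatGrandpa"),
}
male = {"Dad", "Grandpa", "GreatGrandpa", "Bert"}  # assumed extra information

def siblings(person):
    """People who share both recorded parents with the given person."""
    if person not in parents:
        return set()
    return {other for other, pair in parents.items()
            if pair == parents[person] and other != person}

def maternal_great_uncles(person):
    """Hop 1: mother. Hop 2: her parents. Hop 3: their brothers."""
    if person not in parents:
        return set()
    mother = parents[person][0]
    uncles = set()
    for grandparent in parents.get(mother, ()):
        uncles |= {s for s in siblings(grandparent) if s in male}
    return uncles

answer = maternal_great_uncles("A")
print(answer)
```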
In another example, the DNC successfully solved block puzzles that required a series of correct operations. Using its memory to store information about the goal of each step, the DNC could logically plan a string of actions that eventually led to the correct answer.
“Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read–write memory,” the authors concluded in their paper.
Deep Neural Reasoning
It’s hard to ignore the parallels between DNCs and human memory.
“Here is a learning machine that, without prior programming, can organize information into connected facts and use those facts to solve problems,” wrote the authors.
Just like the brain, a DNC can reason about symbolic representations — for example, family members, underground stations — using purely electrical impulses. How the brain performs these neural-symbolic computations is still unclear, but the team cautiously suggests that the DNC may be a very rudimentary model of the biological process.
Humans can reason about things on a very long timescale, said the authors in an interview with Nature. “This is a small step towards getting something that can operate towards a longer horizon of time and over more complicated relational data than we have before,” they said.
“The authors’ demonstrations are not particularly complex as demands on rational reasoning go, and could be solved by the algorithms of symbolic artificial intelligence of the 1970s,” says Dr. Herbert Jaeger at Jacobs University in Bremen, Germany, who was not involved in the study.
However, Jaeger quickly pointed out that older programs were “handcrafted by humans and do not learn from examples.”
The promise of DNCs and their future successors is big: a flexible and extensible neural network equipped with memory could allow us to harness deep learning to solve big-data problems that call for rational reasoning, such as automatically generating video commentaries or performing semantic text analysis.
“A precursor to the DNC…certainly sent thrills through the deep-learning community,” says Jaeger.
Whether DNCs will receive the same standing ovation is up in the air, but one thing is clear: memory is powerful, and it’s here to stay.