Modern conceptions of memory are a little more nuanced than those from the past, holding that memories are reconstructed and reprocessed based on the problem at hand, and that the questions that are posed influence how memories are recalled. To replicate this process in neural networks, they must be able to both store their memories and logically reason about this data in order to respond to specific queries.
Combining Memory Systems with Neural Networks
DeepMind has introduced a new machine learning model dubbed the differentiable neural computer (DNC), that combines memory systems with neural networks so that it can both store knowledge and reason about it. The DNC augments its neural network with the ability to “read from and write to an external memory matrix”, similar to random access memory (RAM).
In a conventional computer, RAM is used to store variables and complex data structures. The addition of a neural network allows a model to learn to form and manipulate those data structures on its own. Within a DNC, the neural network is called a controller, and it is tasked with “taking input in, reading from and writing to memory, and producing output that can be interpreted as an answer.”
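The memory matrix itself can be pictured as an ordinary array that the controller reads and writes through soft attention weightings rather than hard indices. The following is a minimal sketch, assuming the DNC-style erase/add write and weighted-sum read; the toy sizes and one-hot weighting are illustrative, not the trained behavior.

```python
import numpy as np

N, W = 8, 4                  # toy memory: N locations, each a W-dim word
memory = np.zeros((N, W))

def write(memory, weights, erase, add):
    """Soft write: each row is erased and updated in proportion
    to its write weighting (simplified DNC erase/add step)."""
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

def read(memory, weights):
    """Soft read: a weighted sum over the memory rows."""
    return weights @ memory

w = np.zeros(N); w[2] = 1.0  # weighting focused entirely on slot 2
memory = write(memory, w, np.ones(W), np.array([1., 2., 3., 4.]))
print(read(memory, w))       # recovers the stored vector [1. 2. 3. 4.]
```

With fully focused weights this reduces to ordinary RAM-style access; a trained controller emits soft, differentiable weightings instead, which is what makes the whole system trainable end to end.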
Controllers can write to memory, either by sending a vector of information to a new, unused location or by updating information at a pre-existing location. Whenever information is written at a location, it is connected to other locations by associative temporal links that represent the order in which the information was stored, helping the controller keep track of its information.
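Those temporal links can be kept in a matrix whose entry (i, j) records how strongly location i was written right after location j. Below is a simplified sketch of that bookkeeping, loosely following the DNC link-matrix update; the loop form and toy one-hot writes are for clarity, not efficiency.

```python
import numpy as np

N = 4
link = np.zeros((N, N))    # link[i, j] ~ "i was written right after j"
precedence = np.zeros(N)   # soft record of the most recently written slot

def update_links(link, precedence, w):
    """After a write with weighting w, refresh the temporal links
    (simplified form of the DNC link-matrix update)."""
    for i in range(N):
        for j in range(N):
            link[i, j] = (1 - w[i] - w[j]) * link[i, j] + w[i] * precedence[j]
    np.fill_diagonal(link, 0)                    # no self-links
    precedence = (1 - w.sum()) * precedence + w  # shift "last written" focus
    return link, precedence

# write to slot 0, then slot 2: the link matrix records that order
for slot in (0, 2):
    w = np.zeros(N); w[slot] = 1.0
    link, precedence = update_links(link, precedence, w)
print(link[2, 0])   # 1.0: location 2 was written after location 0
```

Reading the link matrix forwards or backwards is what lets the controller later replay stored information in the order it arrived.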
Controllers can also read from memory, searching locations based on their content or via the associative temporal links. As such, within this system, the controller receives an external input, interacts with the memory by reading from or writing to it, and generates an output.
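Content-based lookup can be sketched concretely: compare a lookup key against every memory row by cosine similarity, then turn the similarities into a soft weighting. This is a minimal sketch assuming a softmax with a sharpness parameter beta, in the spirit of the DNC's content addressing.

```python
import numpy as np

def content_weights(memory, key, beta=10.0):
    """Content-based addressing: cosine similarity between the
    lookup key and each memory row, sharpened by beta, softmaxed."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sim = memory @ key / norms
    e = np.exp(beta * sim)
    return e / e.sum()

memory = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.9, 0.1, 0.0]])
w = content_weights(memory, np.array([1.0, 0.0, 0.0]))
print(w.argmax())   # 0: the first row matches the key best
```

Because the result is a soft distribution over locations rather than a single index, the read stays differentiable and the controller can be trained by gradient descent.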
Answering Complex, Structured Questions
With training, the DNC can learn to answer questions that demand reasoning and inference with increasing accuracy. Using randomly generated graphs, the DNC can be trained to find the shortest path between two points and infer missing links, and then generalize these tasks to complex data structures.
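For intuition about the task itself, the ground-truth answer the DNC is trained to reproduce on such graphs is an ordinary shortest path, computable with breadth-first search. This sketch is not the DNC's mechanism, just the target behavior; the example graph and labels are made up.

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search over an undirected graph: the
    ground-truth shortest route a trained DNC must reproduce."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]
print(shortest_path(edges, "A", "E"))   # a 4-stop route from A to E
```

The striking part of the DNC result is that the network is never given this algorithm; it learns to produce the same answers purely from examples, using its read-write memory to hold the graph.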
DeepMind tested the model’s ability to construct transport networks and family trees and navigate them in order to answer questions. For instance, after DeepMind described the layout of the stations and lines of the London Underground to the DNC, it was able to figure out the shortest route between two stops and follow a sequence of directions from a starting point to determine the final destination.
In the context of large family trees, the DNC could use information about parent, child, and sibling relationships to deduce uncle-niece relationships and answer complex questions such as, “Who is Freya’s maternal great-uncle?”
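The inference itself is a short chain of relation lookups: a maternal great-uncle is a brother of a maternal grandparent. A toy sketch of that chain, following only the matriline for simplicity and using entirely made-up names, might look like this; the DNC learns to perform equivalent multi-step lookups over facts it has stored in memory.

```python
# Toy family facts (all names hypothetical, not from the DNC paper).
mother = {"Freya": "Ida", "Ida": "Ada"}   # child -> mother
brothers = {"Ada": ["Bob"]}               # person -> list of brothers

def maternal_great_uncles(person):
    """Chain two 'mother' lookups to reach the maternal grandmother,
    then return her brothers (simplified: matriline only)."""
    grandmother = mother[mother[person]]
    return brothers.get(grandmother, [])

print(maternal_great_uncles("Freya"))   # ['Bob']
```

Each hop is trivial on its own; the difficulty for a neural network is storing the facts and composing the hops in the right order, which is exactly what the external memory enables.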
Finally, the DNC can be trained with reinforcement learning to achieve a sequence of goals and store those goals in memory. Then, when asked to achieve a certain outcome, the DNC can execute the relevant subroutines stored in its memory.
All of the aforementioned complex tasks are only solvable with the aid of external read-write memory.