Neural Network “Learning Rules”


Most neurocomputational models are not hard-wired to perform a task. Instead, they are typically equipped with some kind of learning process. In this post, I'll introduce some notions of how neural networks can learn. Understanding learning processes is important for cognitive neuroscience because they may underlie the development of cognitive ability.

Let's begin with a theoretical question that is of general interest to cognition: how can a neural system learn sequences, such as the actions required to reach a goal? 

Consider a neuromodeler who hypothesizes that a particular kind of neural network can learn sequences. He might start his modeling study by "training" the network on a sequence. To do this, he stimulates (activates) some of its neurons in a particular order, representing objects on the way to the goal. 

After the network has been trained through multiple exposures to the sequence, the modeler can test his hypothesis by stimulating only the neurons from the beginning of the sequence and observing whether the neurons in the rest of the sequence activate in order, completing the sequence.

Successful learning in any neural network depends on how the connections between the neurons are allowed to change in response to activity. The manner of change is what most researchers call a "learning rule". However, we will call it a "synaptic modification rule": although the network as a whole learned the sequence, it is not clear that the *connections* between the neurons in the network "learned" anything in particular.

The particular synaptic modification rule selected is an important ingredient in neuromodeling because it may constrain the kinds of information the neural network can learn.

There are many categories of mathematical synaptic modification rules that describe how synaptic strengths should be changed in a neural network. Some of these categories include: backpropagation of error, correlative Hebbian, and temporally-asymmetric Hebbian.

  • Backpropagation of error states that connection strengths should change throughout the entire network in order to minimize the difference between the actual activity and the "desired" activity at the "output" layer of the network.
  • Correlative Hebbian states that any two interconnected neurons that are active at the same time should strengthen their connections, so that if one of the neurons is activated again in the future the other is more likely to become activated too.
  • Temporally-asymmetric Hebbian is described in more detail in the example below, but essentially emphasizes the importance of causality: if a neuron reliably fires before another, its connection to the other neuron should be strengthened. Otherwise, it should be weakened.
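
To make the two Hebbian rules concrete, here is a minimal Python sketch. It is only an illustration, not the rule from any particular model: the function names, the learning rate of 0.1, the starting weight of 0.5, and the reading of "otherwise" as "the receiving neuron fired without the sender firing just before" are all assumptions made for this example.

```python
def correlative_hebbian(w, pre, post, rate=0.1):
    """Strengthen the connection when both neurons are active at the same time."""
    return w + rate * pre * post


def asymmetric_hebbian(w, pre_before, post_now, rate=0.1):
    """Strengthen if the sending neuron fired one step before the receiver.

    Assumption: "otherwise" is read as "the receiver fired without the
    sender having fired just before", in which case the weight is reduced.
    """
    if post_now == 1:
        w += rate if pre_before == 1 else -rate
    return w


# Neuron A repeatedly fires one time step before neuron B.
a = [1, 0, 0, 1, 0]
b = [0, 1, 0, 0, 1]

w_ab = w_ba = 0.5  # A->B and B->A start out equally strong
for t in range(1, len(a)):
    w_ab = asymmetric_hebbian(w_ab, a[t - 1], b[t])
    w_ba = asymmetric_hebbian(w_ba, b[t - 1], a[t])

print(round(w_ab, 2), round(w_ba, 2))
```

Because A reliably precedes B, the A-to-B connection ends up stronger than it started and the B-to-A connection ends up weaker. The correlative rule, by contrast, has no notion of order: it would treat both directions identically whenever the two neurons are active together.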

Why are there so many different rules?  Some synaptic modification rules are selected because they are mathematically convenient.  Others are selected because they are close to currently known biological reality.  Most of the informative neuromodeling is somewhere in between.

An Example

Let's look at an example of a learning rule used in a neural network model that I have worked with: imagine you have a network of interconnected neurons that can either be active or inactive. If a neuron is active, its value is 1; otherwise its value is 0. (The use of 1 and 0 to represent simulated neuronal activity is only one of many ways to do so; this approach goes by the name "McCulloch-Pitts".)
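
The binary convention can be sketched in a few lines of Python. This is a generic illustration rather than the model referred to above: the threshold of 1.0, the three-neuron chain, and the hand-set weights are assumptions chosen so that the behavior is easy to follow.

```python
def mcculloch_pitts_step(activity, weights, threshold=1.0):
    """Return the next 0/1 state of every neuron.

    activity:      list of 0/1 values, one per neuron
    weights[i][j]: connection strength from neuron j to neuron i
    """
    next_activity = []
    for incoming in weights:
        total = sum(w * a for w, a in zip(incoming, activity))
        next_activity.append(1 if total >= threshold else 0)
    return next_activity


# Hand-set chain: neuron 0 excites neuron 1, which excites neuron 2.
weights = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]

state = [1, 0, 0]  # cue: activate only the first neuron of the sequence
history = [state]
for _ in range(2):
    state = mcculloch_pitts_step(state, weights)
    history.append(state)

print(history)
```

Cueing only the first neuron makes the remaining neurons activate in order, which is the kind of test described earlier for a network that has learned a sequence. Here the weights are fixed by hand; in a learning network they would be the result of a synaptic modification rule.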


Computational models of cognition in neural systems: WHY?

In my most recent post I gave an overview of the "simple recurrent network" (SRN), but I'd like to take a step back and talk about neuromodeling in general.  In particular I'd like to talk about why neuromodeling is going to be instrumental in bringing about the cognitive revolution in neuroscience.

A principal goal of cognitive neuroscience should be to explain how cognitive phenomena arise from the underlying neural systems.  How do the neurons and their connections result in interesting patterns of thought?  Or to take a step up, how might columns, or nuclei, interact to result in problem solving skills, thought or consciousness?

If a cognitive neuroscientist believes they know how a neural system gives rise to a behavior, they should be able to construct a model to demonstrate how this is the case.

That, in brief, is the answer.

But what makes a good model?  I'll partially answer this question below, but in future posts I'll bring up specific examples of models, some good, some poor.

First, any "model" is a simplification of the reality.  If the model is too simple, it won't be interesting.  If it's too realistic, it will be too complex to understand.  Thus, a good model is at that sweet spot where it's as simple as possible but no simpler.

Second, a model whose ingredients spell out the result you're looking for won't be interesting.  Instead, the results should emerge from the combination of the model's resources, constraints and experience.

Third, a model with too many "free" parameters is less likely to be interesting.  So an important requirement is that the "constraints" should be realistic, mimicking the constraints of the real system that is being modeled.

A common question I have gotten is:  "Isn't a model just a way to fit inputs to outputs?  Couldn't it just be replaced with a curve fitter or a regression?"  Well, perhaps the answer should be yes IF you consider a human being to just be a curve fitting device. A human obtains inputs and generates outputs.  So if you wish to say that a model is just a curve fitter, I will say that a human is, too.

What's interesting about neural systems, whether real or simulated, is the emergence of complex function from seemingly "simple" parts.

In future posts, I'll talk more about "constraints" by giving concrete examples.  In the meantime, feel free to bring up any questions you have about the computational modeling of cognition.

[Image by Santiago Ramon y Cajal, 1914.] 

Can a Neural Network be Free…

…from a knee-jerk reaction to its immediate input?

[Image: Simple Recurrent Network]

Although one of the first things that a neuroscience student learns about is "reflex reactions" such as the patellar reflex (also known as the knee-jerk reflex), the cognitive neuroscientist is interested in the kind of processing that might occur between inputs and outputs in mappings that are not as direct as the knee-jerk reaction.

An example of a system which is a step up from the knee-jerk reflex is in the reflexes of the sea slug named "Aplysia".  Unlike the patellar reflex, Aplysia's gill and siphon retraction reflexes seem to "habituate" over time — the original input-output mappings are overridden by being repeatedly stimulated.  This is a simple form of memory, but no real "processing" can be said to go on there.

Specifically, cognitive neuroscientists are interested in mappings where "processing" seems to occur before the output decision is made.  As MC pointed out earlier, the opportunity for memory (past experience) to affect those mappings is probably important for "free will". 

But how can past experience affect future mappings in interesting ways? One answer to this question appeared in 1990, which began a new era in experimentation with neural network models capable of indirect input-output mappings. In that year, Elman (inspired by Jordan's 1986 work) demonstrated the Simple Recurrent Network in his paper "Finding Structure in Time". The concept behind this network is shown in the picture associated with this entry.

The basic idea of the Simple Recurrent Network is that as information comes in (through the input units), an on-line memory of that information is preserved and recirculated (through the "context" units).  Together, the input and context units both influence the hidden layer which can trigger responses in the output layer.  This means that the immediate output of the network is dependent not only on the current input, but also on the inputs that came before it.
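
A single time step of such a network can be sketched as follows. This is a rough illustration under stated assumptions, not Elman's code: the sigmoid activation, the layer sizes, and the hand-picked (untrained) weights are all chosen for the example, and the context units are modeled as a copy of the previous hidden layer.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def srn_step(inputs, context, w_in, w_ctx, w_out):
    """Run one time step; return (output, new_context)."""
    hidden = []
    for h in range(len(w_in)):
        total = sum(w * x for w, x in zip(w_in[h], inputs))
        total += sum(w * c for w, c in zip(w_ctx[h], context))
        hidden.append(sigmoid(total))
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    # The context for the next step is a copy of the current hidden layer.
    return output, list(hidden)


# Hypothetical weights: 1 input unit, 2 hidden/context units, 1 output unit.
W_IN = [[1.0], [-1.0]]
W_CTX = [[0.5, -0.5], [0.3, 0.8]]
W_OUT = [[1.0, -1.0]]


def run(sequence):
    """Feed a sequence one item at a time and return the final output."""
    context = [0.0, 0.0]
    output = [0.0]
    for x in sequence:
        output, context = srn_step([x], context, W_IN, W_CTX, W_OUT)
    return output[0]


after_zero_one = run([0, 1])
after_one_one = run([1, 1])
print(after_zero_one, after_one_one)
```

Both sequences end with the same input, yet the two final outputs differ, because the context units carry a trace of what came before. The sketch covers only the forward pass; how the weights are adjusted during training is a separate matter.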

The most interesting aspect of the Simple Recurrent Network, however, is that the connections among all the individual units in the network change depending on what the modeler requires the network to output.   The network learns to preserve information in the context layer loops so that it can correctly produce the desired output. For example, if the task of the network is to remember the second word in a sentence, it will amplify or maintain the second word when it comes in, while ignoring the intervening words, so that at the end of the sentence it outputs the target word.

Although this network cannot be said to have "free" will — especially because of the way its connections are forcefully trained — its operation can hint at the type of phenomena researchers should seek in trying to understand cognition in neural systems.