Combining Simple Recurrent Networks and Eye-Movements to study Language Processing

BBS image of GLENMORE model

Modern technologies allow eye movements to be used as a tool for studying language processing during tasks such as natural reading. Saccadic eye movements during reading turn out to be highly sensitive to a number of linguistic variables. A number of computational models of eye movement control have been developed to explain how these variables affect eye movements. Although these models have focused on relatively low-level cognitive, perceptual and motor variables, there has been a concerted effort in the past few years (spurred by psycholinguists) to extend these computational models to syntactic processing.

During a modeling symposium at ECEM2007 (the 14th European Conference on Eye Movements), Dr. Ronan Reilly presented a first attempt to take syntax into account in his eye-movement control model (GLENMORE; Reilly & Radach, Cognitive Systems Research, 2006).

Redefining Mirror Neurons

Monkey imitating human

In 1992 Rizzolatti and his colleagues found a special kind of neuron in the premotor cortex of monkeys (Di Pellegrino et al., 1992).

These neurons, which respond to an action whether it is performed by the recorded monkey itself or by another monkey (or person) it is watching, are called mirror neurons.

Many neuroscientists, such as V. S. Ramachandran, have seized upon mirror neurons as a potential explanatory ‘holy grail’ of human capabilities such as imitation, empathy, and language. However, to date there are no adequate models explaining exactly how such neurons would provide such amazing capabilities.

Perhaps related to the lack of any clear functional model, mirror neurons have another major problem: Their functional definition is too broad.

Typically, mirror neurons are defined as cells that respond selectively to an action both when the subject performs it and when that subject observes another performing it. A basic assumption is that any such neuron reflects a correspondence between self and other, and that such a correspondence can turn an observation into imitation (or empathy, or language).

However, there are several other reasons a neuron might respond both when an action is performed and observed.

First, there may be an abstract concept (e.g., open hand), which is involved in but not necessary for the action, the observation of the action, or any potential imitation of the action.

Next, there may be a purely sensory representation (e.g., of hands / objects opening) which becomes involved independently of action by an agent.

Finally, a neuron may respond to another subject’s action not because it is performing a mapping between self and other but because the other’s action is a cue to load up the same action plan. In this case the ‘mirror’ mapping is performed by another set of neurons, and this neuron is simply reflecting the action plan, regardless of where the idea to load that plan originated. For instance, a tasty piece of food may cause that neuron to fire because the same motor plan is loaded in anticipation of grasping it.

It is clear that mirror neurons, of the type first described by Rizzolatti et al., exist (how else could imitation occur?). However, the practical definition for these neurons is too broad.

How might we improve the definition of mirror neurons? Possibly by verifying that a given cell (or population of cells) responds only while observing a given action and while carrying out that same action.
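As a rough illustration of this stricter criterion, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `is_mirror_candidate`, the condition labels (`execute`, `observe`, `object_only`), the firing rates, and the rule that a cell "responds" when its rate is at least twice baseline. It is not an established analysis method, only one way the idea could be written down.

```python
def is_mirror_candidate(rates, action, baseline, factor=2.0):
    """Return True if the cell responds only when the given action is
    executed or observed, and under no other tested condition.

    rates    -- dict mapping (condition, action) -> mean rate in spikes/s
    baseline -- resting firing rate in spikes/s
    factor   -- hypothetical cutoff: "responds" means rate >= factor * baseline
    """
    threshold = baseline * factor
    # Every tested condition in which the cell fires above threshold.
    responsive = {key for key, rate in rates.items() if rate >= threshold}
    # The strict definition: exactly these two conditions, nothing else.
    required = {("execute", action), ("observe", action)}
    return responsive == required


# Made-up recordings for two hypothetical cells.
strict_cell = {
    ("execute", "grasp"): 40.0,
    ("observe", "grasp"): 35.0,
    ("execute", "tear"): 6.0,
    ("observe", "tear"): 5.0,
    ("object_only", "grasp"): 7.0,   # food shown, nobody acting
}
broad_cell = dict(strict_cell)
broad_cell[("object_only", "grasp")] = 30.0  # fires to the food alone

print(is_mirror_candidate(strict_cell, "grasp", baseline=5.0))
print(is_mirror_candidate(broad_cell, "grasp", baseline=5.0))
```

The second cell would pass the usual broad definition, since it fires during both execution and observation, but it fails here because it also fires when the object appears with no agent acting on it, as in the food example above.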

Alternatively, subtractive methods may be more effective at defining mirror neurons than response properties. For instance, removing a mirror neuron population should make imitation less accurate or impossible. Using this kind of method avoids the possibility that a neuron could respond like a mirror neuron but not actually contribute to behavior thought to depend on mirror neurons.

Of course, the best approach would involve both observing response properties and using controlled lesions. Even better would be to do this with human mirror neurons using less invasive techniques (e.g., fMRI, MEG, TMS), since we are ultimately interested in how mirror neurons contribute to higher-level behaviors most developed in Homo sapiens, such as imitation, empathy, and language.


Image from The Phineas Gage Fan Club (originally from Ferrari et al. (2003)).

Grand Challenges of Neuroscience: Day 5

Topic: Language

Everyday (spoken) language use involves the production and perception of sounds at a very fast rate. One of my favorite quotes on this subject is in “The Language Instinct” by Steven Pinker, on page 157.

“Even with heroic training [on a task], people could not recognize the sounds at a rate faster than good Morse code operators, about three units a second. Real speech, somehow, is perceived an order of magnitude faster: ten to fifteen phonemes per second for casual speech, twenty to thirty per second for the man in the late-night Veg-O-Matic ads […]. Given how the human auditory system works, this is almost unbelievable. […P]honemes cannot possibly be consecutive bits of sound.”

One thing to point out is that there is a lot of context in language.  At a high level, there is context from meaning which is constantly anticipated by the listener: meaning imposes restrictions on the possibilities of the upcoming words.  At a lower level there’s context from phonetics and co-articulation; for example, it turns out that the “l” in “led” sounds different from the “l” in “let”, and this may give the listener a good idea of what’s coming next.
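To make the high-level point concrete, here is a minimal sketch of how prior context can narrow down the upcoming word, using simple word-pair (bigram) counts. The tiny corpus and the function names `train_bigrams` and `predict_next` are made up for illustration. This is a toy statistical model, not a claim about how listeners or any particular speech recognizer actually work.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev_word):
    """Return {candidate: probability} for the word after prev_word.

    An empty dict means the context word was never seen in training.
    """
    following = counts.get(prev_word.lower())
    if not following:
        return {}
    total = sum(following.values())
    return {word: n / total for word, n in following.items()}


# Hypothetical three-sentence "corpus".
corpus = [
    "the dog chased the cat",
    "the dog ate the bone",
    "the cat ate the fish",
]
counts = train_bigrams(corpus)

# After hearing "dog", only two continuations remain plausible.
print(predict_next(counts, "dog"))
```

Even this crude model rules out most of the vocabulary once a single preceding word is known. Richer context (meaning, syntax, co-articulation) would presumably constrain the possibilities further.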

Although this notion of context at multiple levels may sound difficult to implement in a computer program, the brain is fundamentally different from a computer. It’s important to remember that the brain is a massively parallel processing machine, with millions upon millions of signal processing units (neurons).

(I think this concept of context and prediction is lost on more traditional linguists. On the following page of his book, Pinker misrepresents the computer program Dragon NaturallySpeaking by saying that you have to speak haltingly, one word at a time, to get it to recognize words. This is absolutely not the case: the software works by taking context into account, and performs best if you speak at a normal, continuous rate. Reading the software’s instructions often leads to better results.)

Given that the brain is a massively parallel computer, it’s really not difficult to imagine that predictions on several different timescales are taken into account during language comprehension. Various experiments from experimental psychology have indicated that this is, in fact, the case.

The study of the brain and how neural systems process language will be fundamental to advancing the field of theoretical linguistics, which thus far seems to be stuck in old ideas from early computer science.


Because language operates on such a rapid timescale and involves so many different brain areas, we need multiple non-invasive (and possibly invasive) recording techniques, such as ERP, MEG, fMRI and microelectrodes, to get at how language is perceived and produced.

In addition to recording from the brain, real-time measurements of behavior are important in assessing language perception. Two candidate behaviors come to mind:  eye movements and changes in hand movements.

Eye movements are a really good candidate for tracking real-time language perception because they are so quick: you can move your eyes before a word has been completely said. Also, there has been some fascinating work done with continuous mouse movements towards various targets to measure participants’ on-line predictions of what is about to be said. These kinds of experimental approaches promise to provide insight into how continuous speech signals are perceived.


History’s Top Brain Computation Insights: Day 15

A split brain patient verbally reports the left-brain information, while reporting with his hand for right-brain information

15) Consciousness depends on cortical communication; the cortical hemispheres are functionally specialized (Sperry & Gazzaniga – 1969)

It is quite difficult to localize the epileptic origin in some seizure patients. Rather than removing the gray matter of origin, neurosurgeons sometimes remove white matter to restrict the seizure to one part of the brain.

One particularly invasive procedure (the callosotomy) restricts the seizure to one half of cortex by removing the connections between the two halves. This is normally very effective in reducing the intensity of epileptic events. However, Sperry & Gazzaniga found that it comes at a price.

They found that presenting a word to the right visual field of a patient without a corpus callosum allowed only the patient’s left hemisphere to become aware of that word (and vice versa). When only the hemisphere that had not been shown the word was allowed to respond, it had no idea what word had been presented.

The two hemispheres of cortex could not communicate, and thus two independent consciousnesses emerged.

Sperry & Gazzaniga also found that the left hemisphere, and not the right, could typically respond linguistically. This suggested that language is largely localized in the left hemisphere. (See the above figure for illustration.)

The functional distinction between the left and right hemispheres is supported by many lesion studies. Generally, the left hemisphere is specialized for language and abstract reasoning, while the right hemisphere is specialized for spatial, body, emotional, and environment awareness. The boundary between these specializations has been trivialized in the popular media; it is actually quite complex and, in healthy brains, quite subtle.

Implication: The mind, largely governed by reward-seeking behavior, is implemented in an electro-chemical organ with distributed and modular function consisting of excitatory and inhibitory neurons communicating via ion-induced action potentials over convergent and divergent synaptic connections strengthened by correlated activity. The cortex, a part of that organ composed of functional column units whose spatial dedication determines representational resolution, is composed of many specialized regions involved in perception (e.g., touch: parietal lobe, vision: occipital lobe), action (e.g., frontal lobe), and memory (e.g., temporal lobe), which depend on inter-regional communication for functional integration.

[This post is part of a series chronicling history’s top brain computation insights (see the first of the series for a detailed description). See the history category archive to see all of the entries.]