Everyday (spoken) language use involves the production and perception of sounds at a very fast rate. One of my favorite quotes on this subject is in “The Language Instinct” by Steven Pinker, on page 157:
“Even with heroic training [on a task], people could not recognize the sounds at a rate faster than good Morse code operators, about three units a second. Real speech, somehow, is perceived an order of magnitude faster: ten to fifteen phonemes per second for casual speech, twenty to thirty per second for the man in the late-night Veg-O-Matic ads […]. Given how the human auditory system works, this is almost unbelievable. […P]honemes cannot possibly be consecutive bits of sound.”
One thing to point out is that language is saturated with context. At a high level, there is context from meaning, which the listener is constantly anticipating: meaning imposes restrictions on the possibilities for the upcoming words. At a lower level there’s context from phonetics and co-articulation; for example, it turns out that the “l” in “led” sounds different from the “l” in “let”, and this may give the listener a good idea of what’s coming next.
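The way high-level context narrows down the candidates for the next word can be sketched with a toy bigram model. (The tiny corpus and the function names here are invented purely for illustration; a real system would be trained on vastly more data and use much richer context.)

```python
from collections import Counter, defaultdict

# Invented toy corpus, just for illustration.
corpus = (
    "the dog chased the cat . "
    "the dog ate the bone . "
    "the cat chased the mouse ."
).split()

# Count how often each word follows each context word.
following = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    following[prev][word] += 1

def predict(context_word, k=3):
    """Return up to k likely next words, with probabilities,
    given a single word of context."""
    counts = following[context_word]
    total = sum(counts.values())
    return [(w, n / total) for w, n in counts.most_common(k)]

# Even one word of context sharply restricts what can come next:
# after "dog", only "chased" and "ate" are possible in this corpus.
print(predict("dog"))
```

The point of the sketch is just that each word of context prunes the space of continuations, which is (very loosely) the kind of anticipation a listener benefits from.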
Although this notion of context at multiple levels may sound difficult to implement in a computer program, the brain is fundamentally different from a computer. It’s important to remember that the brain is a massively parallel processing machine, with billions of signal-processing units (neurons).
(I think this concept of context and prediction is lost on more traditional linguists. On the following page of his book, Pinker misrepresents the computer program Dragon NaturallySpeaking by saying that you have to speak haltingly, one word at a time, to get it to recognize words. This is absolutely not the case: the software works by taking context into account, and performs best if you speak at a normal, continuous rate. Reading the software’s instructions first often leads to better results.)
Given that the brain is a massively parallel computer, it’s really not difficult to imagine that predictions on several different timescales are taken into account during language comprehension. Findings from experimental psychology indicate that this is, in fact, the case.
The study of the brain and how neural systems process language will be fundamental to advancing the field of theoretical linguistics — which thus far seems to be stuck in old ideas from early computer science.
Because language operates on such a rapid timescale, and involves so many different brain areas, there is a need to use multiple non-invasive (as well as possibly invasive) recording techniques, such as ERP, MEG, fMRI, and microelectrode recordings, to get at how language is perceived and produced.
In addition to recording from the brain, real-time measurements of behavior are important in assessing language perception. Two candidate behaviors come to mind: eye movements and changes in hand movements.
Eye movements are a really good candidate for tracking real-time language perception because they are so quick: you can move your eyes before a word has been completely uttered. Also, there has been some fascinating work done with continuous mouse movements towards various targets to measure participants’ on-line predictions of what is about to be said. These kinds of experimental approaches promise to provide insight into how continuous speech signals are perceived.