Statistical physics reveals the power of simple word-learning strategies

Mathematical model shows two simple strategies can work together to allow large numbers of words to be learned almost as quickly as they are encountered.

Learning new words feels easy, and humans are prolific word learners: the average high-school graduate knows approximately 60,000 words, and even young children learn around ten words a day. But word learning should be really hard, since there are many things that any one word could mean. For instance, whenever a child hears the word “cup”, there will be lots of other objects in the room that “cup” could refer to (juice, spoon, table, chair, etc.).

In a paper published in Physical Review Letters on 21st June 2013, a team of researchers based at the School of Physics & Astronomy and the School of Philosophy, Psychology and Language Sciences at the University of Edinburgh used a statistical physics approach to show that two very simple word-learning strategies can work together to allow large numbers of words to be learned almost as quickly as they are encountered.

Mathematical model

The mathematical model was developed by Rainer Reisenauer, a visiting undergraduate student from Ludwig-Maximilians-Universität Munich, along with physicist Richard Blythe and linguist Kenny Smith, both Edinburgh-based researchers. The model is grounded in small-scale laboratory experiments that reveal tricks children use to deal with uncertainty when learning the meaning of words. For instance, they might notice that cups tend to be around whenever the word "cup" is used, an approach called cross-situational learning. Children also seem to assume that words are mutually exclusive: that no object has two names.

The problem with experiments is that they are necessarily limited to very small numbers of words: there is no way to condense a lifetime of word learning into a single hour-long experiment. This is where statistical physics comes into play: it is a theory that allows the structure of large systems – traditionally, condensed matter systems comprising vast numbers of molecules – to be predicted from interactions between the microscopic components.

By treating each encounter with a word as the microscopic component in the statistical physics model, the researchers found that the simple combination of cross-situational learning and mutual exclusivity allows learners to acquire large numbers of words at a surprisingly rapid rate. In fact, when the level of uncertainty in each word’s meaning lies below a critical value (fewer than around fifteen potential alternative meanings to consider each time a word is used), the entire lexicon can be learned in about the same time it takes to hear the least common word.
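The way the two strategies combine can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the researchers' statistical physics model: the function name `simulate`, its parameters, the Zipf-like word frequencies and the uniformly random confounding meanings are all choices made here for illustration. Each word starts with every meaning it has co-occurred with as a candidate, cross-situational learning keeps only the meanings present every time the word is heard, and mutual exclusivity discards meanings that already belong to another learned word.

```python
import random


def simulate(num_words=50, num_confounders=5, use_mutual_exclusivity=True,
             seed=0, max_episodes=1_000_000):
    """Toy word learner (illustrative assumptions, not the published model).

    Word i has true meaning i. Returns a tuple
    (episodes until every word is learned, episodes until every word is heard).
    """
    rng = random.Random(seed)
    words = list(range(num_words))
    # Assumption: Zipf-like frequencies, so a few words are common and many are rare.
    weights = [1.0 / (rank + 1) for rank in range(num_words)]

    candidates = [None] * num_words  # None means the word has not been heard yet
    learned = set()                  # words whose meaning is pinned down
    heard_all_at = None

    for episode in range(1, max_episodes + 1):
        word = rng.choices(words, weights=weights)[0]

        # The true meaning is always present, plus random confounding meanings.
        others = [m for m in words if m != word]
        context = set(rng.sample(others, num_confounders)) | {word}

        # Mutual exclusivity: ignore meanings already assigned to other words.
        if use_mutual_exclusivity:
            context -= (learned - {word})

        # Cross-situational learning: keep meanings seen on every exposure.
        if candidates[word] is None:
            candidates[word] = context
        else:
            candidates[word] &= context

        if len(candidates[word]) == 1:
            learned.add(word)

        if heard_all_at is None and all(c is not None for c in candidates):
            heard_all_at = episode
        if len(learned) == num_words:
            return episode, heard_all_at

    raise RuntimeError("lexicon not learned within max_episodes")


if __name__ == "__main__":
    for flag in (False, True):
        learn_time, hear_time = simulate(use_mutual_exclusivity=flag)
        print(f"mutual exclusivity={flag}: learned all words after {learn_time} "
              f"episodes; every word heard by episode {hear_time}")
```

Because both runs use the same seed, the learner hears exactly the same sequence of words and contexts in each, so any difference in learning time comes from mutual exclusivity alone. Comparing the two numbers printed for each run gives a rough feel for how close the time to learn the whole lexicon is to the time needed just to hear the rarest word.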

"What appears to be happening here is that the incorrect meanings of common words get eliminated quickly, so that when rare words are heard, there is essentially only one possible meaning remaining." Richard Blythe

This work shows that a small number of simple mechanisms may be enough for children to learn words to describe the complex world around them. Models of this type may also help shed some light on the nature of the developments that took place in early humans' brains as they developed their language ability.