PLEASE HELP WITH ASSIGNMENT
Discussion
Provide at least two references for your initial post.
Recent developments within cognitive psychology have contributed to the development of the interdisciplinary field of cognitive science.
Describe how the contributions of both neurophysiology and computer science have helped us to understand more about how people think.
Meltzoff, A. N., Kuhl, P. K., Movellan, J., & Sejnowski, T. J. (2009). Foundations for a new science of learning. Science, 325(5938), 284–288. DOI: 10.1126/science.1175626
Foundations for a New Science of Learning
Andrew N. Meltzoff,1,2,3* Patricia K. Kuhl,1,3,4 Javier Movellan,5,6 Terrence J. Sejnowski5,6,7,8
Human learning is distinguished by the range and complexity of skills that can be learned and the
degree of abstraction that can be achieved compared with those of other species. Homo sapiens is
also the only species that has developed formal ways to enhance learning: teachers, schools, and
curricula. Human infants have an intense interest in people and their behavior and possess powerful
implicit learning mechanisms that are affected by social interaction. Neuroscientists are beginning
to understand the brain mechanisms underlying learning and how shared brain systems for
perception and action support social learning. Machine learning algorithms are being developed that
allow robots and computers to learn autonomously. New insights from many different fields are
converging to create a new science of learning that may transform educational practices.
Cultural evolution, which is rare among
species and reaches a pinnacle in Homo
sapiens, became possible when new forms
of learning evolved under selective pressure in
our ancestors. Culture underpins achievements in
language, arts, and science that are unprecedented
in nature. The origin of human intelligence is still
a deep mystery. However, the study of child de-
velopment, the plasticity of the human brain, and
computational approaches to learning are laying
the foundation for a new science of learning
that provides insights into the origins of human
intelligence.
Human learning and cultural evolution are sup-
ported by a paradoxical biological adaptation: We
are born immature. Young infants cannot speak,
walk, use tools, or take the perspective of others.
Immaturity comes at a tremendous cost, both to
the newborn, whose brain consumes 60% of its
entire energy budget (1), and to the parents. Dur-
ing the first year of life, the brain of an infant is
teeming with structural activity as neurons grow
in size and complexity and trillions of new con-
nections are formed between them. The brain
continues to grow during childhood and reaches
the adult size around puberty. The development of
the cerebral cortex has “sensitive periods” during
which connections between neurons are more
plastic and susceptible to environmental influence:
The sensitive periods for sensory processing areas
occur early in development, higher cortical areas
mature later, and the prefrontal cortex continues
to develop into early adulthood (2).
Yet immaturity has value. Delaying the matu-
ration and growth of brain circuits allows initial
learning to influence the developing neural archi-
tecture in ways that support later, more complex
learning. In computer simulations, starting the learn-
ing process with a low-resolution sensory system
allows more efficient learning than starting with a
fully developed sensory system (3).
What characterizes the exuberant learning that
occurs during childhood? Three principles are
emerging from cross-disciplinary work in psychol-
ogy, neuroscience, machine learning, and education,
contributing to a new science of learning (Fig. 1).
These principles support learning across a range of
areas and ages and are particularly useful in ex-
plaining children’s rapid learning in two unique
domains of human intelligence: language and so-
cial understanding.
Learning is computational. Discoveries in de-
velopmental psychology and in machine learning
are converging on new computational accounts
of learning. Recent findings show that infants and
young children possess powerful computational
skills that allow them automatically to infer struc-
tured models of their environment from the statis-
tical patterns they experience. Infants use statistical
patterns gleaned from experience to learn about
both language and causation. Before they are three,
children use frequency distributions to learn which
phonetic units distinguish words in their native
language (4, 5), use the transitional probabilities
between syllables to segment words (6), and use
covariation to infer cause-effect relationships in
the physical world (7).
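The transitional-probability idea can be made concrete with a small sketch. This is a minimal, hypothetical Python example (not the procedure used in the cited studies), assuming a toy stream of syllables from an invented three-word vocabulary; it shows that the probability of one syllable following another is high within words and drops at word boundaries, which is the statistical cue infants appear to exploit.

```python
import random
from collections import Counter

random.seed(0)

# Toy "language" of three two-syllable words (hypothetical data, loosely in the
# spirit of ref. 6): words are concatenated in random order with no pauses, so
# word boundaries are marked only by statistics.
words = [("bi", "da"), ("ku", "pa"), ("go", "ti")]
stream = [syllable for _ in range(300) for syllable in random.choice(words)]

bigrams = Counter(zip(stream, stream[1:]))
unigrams = Counter(stream[:-1])

def transitional_probability(s1, s2):
    """Estimate P(next syllable = s2 | current syllable = s1) from the stream."""
    return bigrams[(s1, s2)] / unigrams[s1]

# Within-word transitions are near 1.0; across-word transitions hover near 1/3.
# Dips in transitional probability therefore signal likely word boundaries.
print(transitional_probability("bi", "da"))  # within a word  -> 1.0
print(transitional_probability("da", "ku"))  # across a boundary -> ~0.33
```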
Machine learning has the goal of developing
computer algorithms and robots that improve
automatically from experience (8, 9). For exam-
ple, BabyBot, a baby doll instrumented with a
video camera, a microphone, and a loudspeaker
(10), learned to detect human faces using the
temporal contingency between BabyBot's pro-
grammed vocalizations and humans that tended to
1Institute for Learning and Brain Sciences, University of Wash-
ington, Seattle, WA 98195, USA. 2Department of Psychology,
University of Washington, Seattle, WA, 98195, USA. 3Learning
in Informal and Formal Environments (LIFE) Center, University of
Washington, Seattle, WA 98195, USA. 4Department of Speech
and Hearing Sciences, University of Washington, Seattle, WA,
98195, USA. 5Institute for Neural Computation, University of
California at San Diego, La Jolla, CA 92093, USA. 6Temporal
Dynamics of Learning Center (TDLC), University of California at
San Diego, La Jolla, CA 92093, USA. 7Howard Hughes Medical
Institute, Salk Institute for Biological Studies, La Jolla, CA
92037, USA. 8Division of Biological Sciences, University of
California at San Diego, La Jolla, CA 92093, USA.
*To whom correspondence should be addressed. E-mail:
Meltzoff@u.washington.edu
Fig. 1. The new science of learning has arisen from several disciplines. Researchers in developmental
psychology have identified social factors that are essential for learning. Powerful learning algorithms from
machine learning have demonstrated that contingencies in the environment are a rich source of infor-
mation about social cues. Neuroscientists have found brain systems involved in social interactions and
mechanisms for synaptic plasticity that contribute to learning. Classrooms are laboratories for discovering
effective teaching practices. [Photo credits: R. Goebel (neuroscience), iStockphoto.com/J. Bryson (education),
Y. Tsuno/AFP/Getty Images (machine learning)]
respond to these babylike vocalizations. After 6
min of learning, BabyBot detected novel faces
and generalized to the schematic faces used in
studies of infant face recognition.
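The description of BabyBot above does not spell out its algorithm, so the following is only an assumed, simplified sketch of contingency-based detection: image regions whose activity is time-locked to the robot's own vocalizations are singled out as likely social partners. The time series, lag, and probabilities below are illustrative inventions, not the published system.

```python
import random

random.seed(1)
T = 2000    # number of time steps (illustrative)
LAG = 2     # assumed response delay of a social partner (illustrative)

# Robot "vocalizes" at random moments (1 = vocalizing at that time step).
voc = [1 if random.random() < 0.1 else 0 for _ in range(T)]

# Region A behaves like a person: it tends to become active shortly after a
# vocalization. Region B is background motion unrelated to the robot.
region_a = [1 if (t >= LAG and voc[t - LAG] and random.random() < 0.8)
            else (1 if random.random() < 0.05 else 0) for t in range(T)]
region_b = [1 if random.random() < 0.2 else 0 for t in range(T)]

def contingency(vocalizations, region, lag):
    """P(region active | vocalization lag steps earlier) minus its base rate."""
    responses = [region[t] for t in range(lag, T) if vocalizations[t - lag]]
    base_rate = sum(region) / T
    return sum(responses) / max(len(responses), 1) - base_rate

# The socially contingent region stands out; a face detector could then be
# trained on patches drawn from that region rather than from the whole image.
print(contingency(voc, region_a, LAG))  # clearly positive
print(contingency(voc, region_b, LAG))  # near zero
```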
Statistical regularities and covariations in the
world thus provide a richer source of information
than previously thought. Infants’ pickup of this
information is implicit; it occurs without parental
training and begins before infants can manipulate
the physical world or speak their first words. New
machine learning programs also succeed with-
out direct reinforcement or supervision. Learning
from probabilistic input provides an alternative to
Skinnerian reinforcement learning and Chomskian
nativist accounts (11, 12).
Learning is social. Children do not compute
statistics indiscriminately. Social cues highlight
what and when to learn. Even young infants are
predisposed to attend to people and are motivated
to copy the actions they see others do (13). They
more readily learn and reenact an event when it is
produced by a person than by an inanimate device
(14, 15).
Machine learning studies show that system-
atically increasing a robot’s social-like behaviors
and contingent responsivity elevates young chil-
dren’s willingness to connect with and learn from
it (16). Animal models may help explain how
social interaction affects learning: In birds, neuro-
steroids that affect learning modulate brain activity
during social interaction (17). Social interaction
can extend the sensitive period for learning in
birds (18). Social factors also play a role in life-
long learning—new social technologies (for ex-
ample, text messaging, Facebook, and Twitter)
tap humans’ drive for social communication. Edu-
cational technology is increasingly embodying the
principles of social interaction in intelligent tutor-
ing systems to enhance student learning (19).
Learning is supported by brain circuits linking
perception and action. Human social and language
learning are supported by neural-cognitive systems
that link the actions of self and other. Moreover, the
brain machinery needed to perceive the world and
move our bodies to adapt to the movements of
people and objects is complex, requiring contin-
uous adaptation and plasticity. Consider what is
necessary to explain human imitative learning.
Newborns as young as 42 min old match gestures
shown to them, including tongue protrusion and
mouth opening (20). This is remarkable because
infants cannot see their own faces, and newborns
have never seen their reflection in a mirror. Yet,
newborns can map from observed behavior to
their own matching acts, suggesting shared repre-
sentations for the acts of self and others (15, 20).
Neuroscientists have discovered a striking over-
lap in the brain systems recruited both for the
perception and production of actions (21, 22).
For example, in human adults there is neuronal
activation when observing articulatory movements
in the cortical areas responsible for producing those
articulations (23). Social learning, imitation, and
sensorimotor experience may initially generate, as
well as modify and refine, shared neural circuitry
for perception and action. The emerging field of
social neuroscience is aimed at discovering brain
mechanisms supporting close coupling and at-
tunement between the self and other, which is the
hallmark of seamless social communication and
interaction.
Social Learning and Understanding
Human children readily learn through social in-
teractions with other people. Three social skills
are foundational to human development and rare
in other animals: imitation, shared attention, and
empathic understanding.
Imitation. Learning by observing and imitating
experts in the culture is a powerful social learning
mechanism. Children imitate a diverse range of
acts, including parental mannerisms, speech pat-
terns, and the use of instruments to get things
done. For example, a toddler may see her father
using a telephone or computer keyboard and
crawl up on the chair and babble into the receiver
or poke the keys. Such behavior is not explicitly
trained (it may be discouraged by the parent), and
there is no inborn tendency to treat plastic boxes
in this way—the child learns by watching others
and imitating.
Imitation accelerates learning and multiplies
learning opportunities. It is faster than individual
discovery and safer than trial-and-error learning.
Children can use third-person information (ob-
servation of others) to create first-person knowl-
edge. This is an accelerator for learning: Instead
of having to work out causal relations themselves,
children can learn from watching experts. Imitative
learning is valuable because the behavioral ac-
tions of others “like me” serve as a proxy for
one’s own (15).
Children do not slavishly duplicate what they
see but reenact a person’s goals and intentions.
For example, suppose an adult tries to pull apart
an object but his hand slips off the ends. Even at
18 months of age, infants can use the pattern of
unsuccessful attempts to infer the unseen goal of
another. They produce the goal that the adult was
striving to achieve, not the unsuccessful attempts
(14). Children choose whom, when, and what to
imitate and seamlessly mix imitation and self-
discovery to solve novel problems (24, 25).
Imitation is a challenging computational prob-
lem that is being intensively studied in the robotic
and machine learning communities (26, 27). It
requires algorithms capable of inferring complex
sensorimotor mappings that go beyond the repeti-
tion of observed movements. The match must be
achieved despite the fact that the teacher may be
different from the observer in several ways (e.g.,
size, spatial orientation, morphology, dexterity).
The ultimate aim is to build robots that can learn
like infants, through observation and imitation
(28). Current computational approaches to imi-
tation can be divided into direct and goal-based
approaches. Direct approaches learn input-action
mappings that reproduce the observed behaviors
(26); goal-based approaches, which are more re-
cent and less explored, infer the goal of the ob-
served behaviors and then produce motor plans
that achieve those goals (29, 30).
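The contrast between direct and goal-based imitation can be illustrated with a deliberately toy example (not any of the cited systems). In a one-dimensional world, a direct imitator replays the demonstrator's observed actions, whereas a goal-based imitator infers the end state the demonstrator was aiming for and plans its own actions to reach it, which still succeeds when the imitator starts from a different position.

```python
def demonstrator_actions(start, goal):
    """Observed behavior in a toy 1-D world: step toward the goal until reached."""
    position, actions = start, []
    while position != goal:
        step = 1 if goal > position else -1
        actions.append(step)
        position += step
    return actions

demonstration = demonstrator_actions(start=0, goal=4)  # observed: [1, 1, 1, 1]

def imitate_direct(my_start, observed_actions):
    """Direct approach: replay the observed action sequence verbatim."""
    position = my_start
    for action in observed_actions:
        position += action
    return position

def imitate_goal_based(my_start, demo_start, observed_actions):
    """Goal-based approach: infer the demonstrator's end state, then plan anew."""
    inferred_goal = demo_start + sum(observed_actions)
    return imitate_direct(my_start, demonstrator_actions(my_start, inferred_goal))

# Starting where the demonstrator started, both imitators reach the goal (4).
print(imitate_direct(0, demonstration), imitate_goal_based(0, 0, demonstration))
# Starting elsewhere, only the goal-based imitator still reaches the goal.
print(imitate_direct(2, demonstration), imitate_goal_based(2, 0, demonstration))
```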
Shared attention. Social learning is facilitated
when people share attention. Shared attention
to the same object or event provides a common
ground for communication and teaching. An early
component of shared attention is gaze following
(Fig. 2). Infants in the first half year of life look
more often in the direction of an adult’s head turn
when peripheral targets are in the visual field (31).
By 9 months of age, infants interacting with a
responsive robot follow its head movements, and
the timing and contingencies, not just the visual
appearance of the robot, appear to be key (32). It
is unclear, however, whether young infants are
trying to look at what another is seeing or are
simply tracking head movements. By 12 months,
sensitivity to the direction and state of the eyes
exists, not just sensitivity to the direction of head
turning. If a person with eyes open turns to one of
Fig. 2. Gaze following is a mechanism that brings adults and infants
into perceptual contact with the same objects and events in the world,
facilitating word learning and social communication. After interacting
with an adult (left), a 12-month-old infant sees an adult look at one
of two identical objects (middle) and immediately follows her gaze
(right).
two equidistant objects, 12-month-old infants look
at that particular target, but not if the person makes
the same head movement with eyes closed (33).
A blindfold covering the person’s eyes causes
12-month-olds to make the mistake of following
the head movements. They understand that eye
closure, but not a blindfold, blocks the other per-
son’s view. Self-experience corrects this error. In a
training study, 1-year-olds were given experience
with a blindfold so they understood that it made it
impossible to see. When the adult subsequently
wore the blindfold, infants who had received self-
experience with it treated the adult as if she could
not see (34), whereas control infants did not.
Infants project their own experience onto other
people. The ability to interpret the behavior and
experience of others by using oneself as a model
is a highly effective learning strategy that may be
unique to humans and impaired in children with
autism (35, 36). It would be useful if this could be
exploited in machine learning, and preliminary
progress is being made (37).
Empathy and social emotions. The capacity
to feel and regulate emotions is critical to under-
standing human intelligence and has become an
active area of research in human-computer inter-
action (38). In humans, many affective processes
are uniquely social. Controlled experiments lead
to the conclusion that prelinguistic toddlers en-
gage in altruistic, instrumental helping (39). Chil-
dren also show primitive forms of empathy. When
an adult appears to hurt a finger and cry in pain,
children under 3 years of age comfort the adult,
sometimes offering a bandage or teddy bear (40).
Related behavior has been observed with children
helping and comforting a social robot that was
“crying” (16, 41).
Brain imaging studies in adults show an over-
lap in the neural systems activated when people
receive a painful stimulus themselves or perceive
that another person is in pain (42, 43). These
neural reactions are modulated by cultural expe-
rience, training, and perceived similarity between
self and other (43, 44). Atypical neural patterns
have been documented in antisocial adolescents
(45). Discovering the origins of individual differ-
ences in empathy and compassion is a key issue
for developmental social-cognitive neuroscience.
Language Learning
Human language acquisition poses a major chal-
lenge for theories of learning, and major ad-
vances have been made in the last decade (46). No
computer has cracked the human speech code and
achieved fluent speech understanding across talk-
ers, which children master by 3 years of age (11).
Human language acquisition sheds light on the
interaction among computational learning, social
facilitation of learning, and shared neural circuitry
for perception and production.
Behavioral development. Early in development,
infants have a capacity to distinguish all sounds
across the languages of the world, a capacity shared
by nonhuman primates (47). However, infants’
universal capacities narrow with development,
and by one year of age, infants’ ability to perceive
sound distinctions used only in foreign languages
and not their native environment is weakened.
Infants’ universal capacities become language-
specific between 9 and 12 months of age. Ameri-
can and Japanese infants, who at 7 months of age
discriminated /ra/ from /la/ equally well, both
change by 11 months: American infants improve
significantly while Japanese infants' skills show a
sharp decline (48).
This transition in infant perception is strongly
influenced by the distributional frequency of sounds
contained in ambient language (4, 5). Infants’ com-
putational skills are sufficiently robust that labo-
ratory exposure to artificial syllables in which the
distributional frequencies are experimentally ma-
nipulated changes infants’ abilities to discriminate
the sounds.
However, experiments also show that the com-
putations involved in language learning are "gated"
by social processes (49). In foreign-language learn-
ing experiments, social interaction strongly influ-
ences infants’statistical learning. Infants exposed
to a foreign language at 9 months learn rapidly,
but only when experiencing the new language
during social interchanges with other humans.
American infants exposed in the laboratory to
Mandarin Chinese rapidly learned phonemes and
words from the foreign language, but only if ex-
posed to the new language by a live human being
during naturalistic play. Infants exposed to the
same auditory input at the same age and for the
same duration via television or audiotape showed
no learning (50) (Fig. 3). Why infants learned bet-
ter from people and what components of social
interactivity support language learning are currently
being investigated (51). Determining the key stim-
ulus and interactive features will be important for
theory. Temporal contingencies may be critical (52).
Other evidence that social input advances lan-
guage learning comes from studies showing that
infants vocally imitate adult vowel sounds by
Fig. 3. The need for social interaction in language acquisition is shown by foreign-language learning
experiments. Nine-month-old infants experienced 12 sessions of Mandarin Chinese through (A) natural
interaction with a Chinese speaker (left) or the identical linguistic information delivered via television
(right) or audiotape (not shown). (B) Natural interaction resulted in significant learning of Mandarin
phonemes when compared with a control group who participated in interaction using English (left). No
learning occurred from television or audiotaped presentations (middle). Data for age-matched Chinese
and American infants learning their native languages are shown for comparison (right). [Adapted from
(50) and reprinted with permission.]
5 months but not acoustically matched nonspeech
sounds that are not perceived as human speech
(53, 54). By 10 months, even before speaking
words, the imitation of social models results in
a change in the types of vocalizations children
produce. Children raised in Beijing listening to
Mandarin babble by using tonelike pitches char-
acteristic of Mandarin, which make them sound
distinctly Chinese. Children being raised in Seattle
listening to English do not babble by using such
tones and sound distinctly American.
Children react to a social audience by increas-
ing the complexity of their vocal output. When
mothers’ responses to their infants’ vocalizations
are controlled experimentally, a mother’s imme-
diate social feedback results both in greater num-
bers and more mature, adultlike vocalizations from
infants (55). Sensory impairments affect infant
vocalizations: Children with hearing impairments
use a greater preponderance of sounds (such as
“ba”) that they can see by following the lip move-
ments of the talker. Infants who are blind babble
by using a greater proportion of sounds that do
not rely on visible articulations (“ga”) (56).
Birdsong provides a neurobiological model of
vocal learning that integrates self-generated sen-
sorimotor experience and social input. Passerine
birds learn conspecific song by listening to and
imitating adult birds. Like humans, young birds
listen to adult conspecific birds sing during a sen-
sitive period in development and then practice that
repertoire during a “sub-song” period (akin to bab-
bling) until it is crystallized (57). Neural models
of birdsong learning can account for this gradual
process of successive refinement (58). In birds, as
in humans, a social context enhances vocal learn-
ing (59).
Neural plasticity. In humans, a sensitive pe-
riod exists between birth and 7 years of age when
language is learned effortlessly; after puberty, new
language learning is more difficult, and native-
language levels are rarely achieved (60, 61). In
birds, the duration of the sensitive period is ex-
tended in richer social environments (18, 62).
Human learning beyond the sensitive period may
also benefit from social interaction. Adult foreign-
language learning improves under more social
learning conditions (63).
A candidate mechanism governing the sensi-
tive period for language in humans is neural com-
mitment (11). Neural commitment is the formation
of neural architecture and circuitry dedicated to the
detection of phonetic and prosodic characteristics
of the particular ambient language(s) to which the
infant is exposed. The neural circuitry maximizes
detection of a particular language and, when fully
developed, interferes with the acquisition of a new
language.
Neural signatures of children’s early language
learning can be documented by using event-related
potentials (ERPs). Phonetic learning can be docu-
mented at 11 months of age; responses to known
words, at 14 months; and semantic and syntactic
learning, at 2.5 years (64). Early mastery of the
sound patterns of one’s native language provides a
foundation for later language learning: Children
who show enhanced ERP responses to phonemes
at 7.5 months show faster advancement in language
acquisition between 14 and 30 months of age (65).
Children become both native-language listen-
ers and speakers, and brain systems that link per-
ception and action may help children achieve
parity between the two systems. In adults, func-
tional magnetic resonance imaging studies show
that watching lip movements appropriate for speech
activates the speech motor areas of the brain (66).
Early formation of linked perception-production
brain systems for speech has been investigated by
using brain imaging technology called magneto-
encephalography (MEG). MEG reveals nascent
neural links between speech perception and pro-
duction. At 6 months of age, listening to speech
activates higher auditory brain areas (superior
temporal), as expected, but also simultaneously
activates Broca’s area, which
controls speech production,
although listening to non-
speech sounds does not [(67);
see also (68)]. MEG technol-
ogy will allow linguists to ex-
plore how social interaction
and sensorimotor experience
affects the cortical process-
ing of language in children
and why young children can
learn foreign language mate-
rial from a human tutor but
not a television.
New interactive robots
are being designed to teach
language to children in a
social-like manner. Engineers
created a social robot that
autonomously interacts with
toddlers, recognizing their
moods and activities (16)
(Fig. 4). Interaction with the
social robot over a 10-day period resulted in a
significant increase in vocabulary in 18- to 24-
month-old children compared with the vocabu-
lary of an age-matched control group (41). This
robotic technology is now being used to test
whether children might learn foreign language
words through social games with the robot.
Education
During their long period of immaturity, human
brains are sculpted by implicit social and statis-
tical learning. Children progress from relatively
helpless, observant newborns to walking, talking,
empathetic people who perform everyday exper-
iments on cause and effect. Educators are turning
to psychology, neuroscience, and machine learn-
ing to ask: Can the principles supporting early
exuberant and effortless learning be applied to
improve education?
Progress is being made in three areas: early
intervention programs, learning outside of school,
and formal education.
Children are born learning, and how much
they learn depends on environmental input, both
social and linguistic. Many children entering kinder-
garten in the United States are not ready for school
(69), and children who start behind in school-entry
academic skills tend to stay behind (70). Neuro-
science work suggests that differences in learning
opportunities before first grade are correlated with
neural differences that may affect school learning
(71, 72).
The recognition that the right input at the right
time has cascading effects led to early interventions
for children at risk for poor academic outcomes.
Programs enhancing early social interactions and
contingencies produce significant long-term im-
provements in academic achievement, social ad-
justment, and economic success and are highly
cost effective (73–75).
The science of learning has also affected the
design of interventions with children with dis-
abilities. Speech perception requires the ability to
perceive changes in the speech signal on the time
scale of milliseconds, and neural mechanisms for
plasticity in the developing brain are tuned to
these signals. Behavioral and brain imaging ex-
periments suggest that children with dyslexia have
difficulties processing rapid auditory signals; com-
puter programs that train the neural systems re-
sponsible for such processing are helping children
with dyslexia improve language and literacy (76).
The temporal “stretching” of acoustic distinctions
that these programs use is reminiscent of infant-
directed speech (“motherese”) spoken to infants in
natural interaction (77). Children with autism spec-
trum disorders (ASD) have deficits in imitative
learning and gaze following (78–80). This cuts
them off from the rich socially mediated learning
opportunities available to typically developing chil-
dren, with cascading developmental effects. Young
children with ASD prefer an acoustically matched
nonspeech signal over motherese, and the degree
of preference predicts the degree of severity of
their clinical autistic symptoms (81). Children
Fig. 4. A social robot can operate autonomously with children in a
preschool setting. In this photo, toddlers play a game with the robot. One
long-term goal is to engineer systems that test whether young children
can learn a foreign language through interactions with a talking robot.
with ASD are attracted to humanoid robots with
predictable interactivity, which is beginning to be
used in diagnosis and interventions (82).
Elementary and secondary school educators
are attempting to harness the intellectual curiosity
and avid learning that occurs during natural so-
cial interaction. The emerging field of informal
learning (83) is based on the idea that informal
settings are venues for a significant amount of
childhood learning. Children spend nearly 80%
of their waking hours outside of school. They
learn at home; in community centers; in clubs;
through the Internet; at museums, zoos, and aquar-
iums; and through digital media and gaming.
Informal learning venues are often highly social
and offer a form of mentoring, apprenticeship,
and participation that maximizes motivation and
engages the learner’s sense of identity; learners
come to think of themselves as good in tech-
nology or as budding scientists, and such self-
concepts influence children’s interests, goals, and
future choices (84, 85). A recent National Research
Council study on science education (83) cataloged
factors that enliven learning in informal learning
venues with the long-term goal of using them to
enhance learning in school.
In formal school settings, research shows that
individual face-to-face tutoring is the most ef-
fective form of instruction. Students taught by
professional tutors one on one show achievement
levels that are two standard deviations higher than
those of students in conventional instruction (86).
New learning technologies are being developed
that embody key elements of individual human
tutoring while avoiding its extraordinary financial
cost. For example, learning researchers have de-
veloped intelligent tutoring systems based on cog-
nitive psychology that provide an interactive
environment with step-by-step feedback, feed-
forward instructional hints to the user, and dy-
namic problem selection (19). These automatic
tutors have been shown to approximate the ben-
efits of human tutoring by adapting to the needs of
individual students, as good teachers do. Class-
rooms are becoming living laboratories as research-
ers and educators use technology to track and
collect data from individual children and use this
information to test theories and design curricula.
Conclusions
A convergence of discoveries in psychology, neu-
roscience, and machine learning has resulted in
principles of human learning that are leading to
changes in educational theory and the design of
learning environments. Reciprocally, educational
practice is leading to the design of new experi-
mental work. A key component is the role of “the
social” in learning. What makes social interaction
such a powerful catalyst for learning? Can key
elements be embodied in technology to improve
learning? How can we capitalize on social factors
to teach children better and to foster their natural
curiosity about people and things? These are
deep questions at the leading edge of the new
science of learning.
References and Notes
1. J. M. Allman, Evolving Brains (Freeman, New York, 1999).
2. S. R. Quartz, T. J. Sejnowski, Behav. Brain Sci. 20, 537
(1997).
3. R. A. Jacobs, M. Dominguez, Neural Comput. 15, 761 (2003).
4. P. K. Kuhl, K. A. Williams, F. Lacerda, K. N. Stevens,
B. Lindblom, Science 255, 606 (1992).
5. J. Maye, J. F. Werker, L. Gerken, Cognition 82, B101 (2002).
6. J. R. Saffran, R. N. Aslin, E. L. Newport, Science 274,
1926 (1996).
7. A. Gopnik et al., Psychol. Rev. 111, 3 (2004).
8. T. M. Mitchell, Machine Learning (McGraw-Hill, New York,
1997).
9. R. Douglas, T. Sejnowski, “Future challenges for the
science and engineering of learning,” www.nsf.gov/sbe/
SLCWorkshopReportjan08 .
10. N. J. Butko, I. R. Fasel, J. R. Movellan, in Proceedings of
the 5th IEEE International Conference on Development
and Learning, Bloomington, IN, 31 May to 3 June 2006.
11. P. K. Kuhl, Nat. Rev. Neurosci. 5, 831 (2004).
12. A. Gopnik, J. B. Tenenbaum, Dev. Sci. 10, 281 (2007).
13. A. N. Meltzoff, M. K. Moore, Science 198, 75 (1977).
14. A. N. Meltzoff, Dev. Psychol. 31, 838 (1995).
15. A. N. Meltzoff, Acta Psychol. 124, 26 (2007).
16. F. Tanaka, A. Cicourel, J. R. Movellan, Proc. Natl. Acad.
Sci. U.S.A. 104, 17954 (2007).
17. L. Remage-Healey, N. T. Maidment, B. A. Schlinger,
Nat. Neurosci. 11, 1327 (2008).
18. M. S. Brainard, E. I. Knudsen, J. Neurosci. 18, 3929 (1998).
19. K. R. Koedinger, V. Aleven, Educ. Psychol. Rev. 19, 239
(2007).
20. A. N. Meltzoff, M. K. Moore, Early Dev. Parenting 6, 179
(1997).
21. R. Hari, M. Kujala, Physiol. Rev. 89, 453 (2009).
22. G. Rizzolatti, L. Fogassi, V. Gallese, Nat. Rev. Neurosci. 2,
661 (2001).
23. R. Möttönen, J. Järveläinen, M. Sams, R. Hari, Neuroimage
24, 731 (2004).
24. R. A. Williamson, A. N. Meltzoff, E. M. Markman, Dev. Psychol.
44, 275 (2008).
25. B. M. Repacholi, A. N. Meltzoff, B. Olsen, Dev. Psychol.
44, 561 (2008).
26. S. Schaal, Trends Cogn. Sci. 3, 233 (1999).
27. A. P. Shon, J. J. Storz, A. N. Meltzoff, R. P. N. Rao, Int. J.
Hum. Robotics 4, 387 (2007).
28. Y. Demiris, A. N. Meltzoff, Infant Child Dev. 17, 43 (2008).
29. A. Y. Ng, S. Russell, in Proceedings of the 17th International
Conference on Machine Learning, Morgan Kaufmann,
Stanford, CA, 29 June to 2 July 2000, pp. 663–670.
30. D. Verma, R. P. N. Rao, in Advances in Neural
Information Processing Systems (MIT Press, Cambridge,
MA, 2006), pp. 1393–1400.
31. R. Flom, K. Lee, D. Muir, Eds., Gaze-Following (Erlbaum,
Mahwah, NJ, 2007).
32. J. R. Movellan, J. S. Watson, in Proceedings of the 2nd IEEE
International Conference on Development and Learning,
Cambridge, MA, 12 to 15 June 2002, pp. 34–42.
33. R. Brooks, A. N. Meltzoff, Dev. Psychol. 38, 958 (2002).
34. A. N. Meltzoff, R. Brooks, Dev. Psychol. 44, 1257 (2008).
35. A. N. Meltzoff, Dev. Sci. 10, 126 (2007).
36. M. Tomasello, M. Carpenter, J. Call, T. Behne, H. Moll,
Behav. Brain Sci. 28, 675 (2005).
37. J. Bongard, V. Zykov, H. Lipson, Science 314, 1118 (2006).
38. R. W. Picard, Affective Computing (MIT Press, Cambridge,
MA, 1997).
39. F. Warneken, M. Tomasello, Science 311, 1301 (2006).
40. C. Zahn-Waxler, M. Radke-Yarrow, R. A. King, Child Dev.
50, 319 (1979).
41. J. R. Movellan, M. Eckhart, M. Virnes, A. Rodriguez, in
Proceedings of the International Conference on Human
Robot Interaction, San Diego, CA, 11 to 13 March 2009,
pp. 307–308.
42. T. Singer et al., Science 303, 1157 (2004).
43. G. Hein, T. Singer, Curr. Opin. Neurobiol. 18, 153 (2008).
44. C. Lamm, A. N. Meltzoff, J. Decety, J. Cogn. Neurosci., in
press; available online at www.mitpressjournals.org/doi/
abs/10.1162/jocn.2009.21186.
45. J. Decety, K. J. Michalska, Y. Akitsuki, B. Lahey, Biol. Psychol.
80, 203 (2009).
46. P. Kuhl, L. Gleitman, “Opportunities and challenges for
language learning and education,” www.nsf.gov/sbe/slc/
NSFLanguageWorkshopReport .
47. P. K. Kuhl, J. D. Miller, Science 190, 69 (1975).
48. P. K. Kuhl et al., Dev. Sci. 9, F13 (2006).
49. P. K. Kuhl, Dev. Sci. 10, 110 (2007).
50. P. K. Kuhl, F.-M. Tsao, H.-M. Liu, Proc. Natl. Acad. Sci. U.S.A.
100, 9096 (2003).
51. B. T. Conboy, P. K. Kuhl, in On Being Moved: From Mirror
Neurons to Empathy, S. Bråten, Ed. (John Benjamins,
Philadelphia, 2007), pp. 175–199.
52. P. R. Montague, T. J. Sejnowski, Learn. Mem. 1, 1 (1994).
53. P. K. Kuhl, A. N. Meltzoff, Science 218, 1138 (1982).
54. P. K. Kuhl, A. N. Meltzoff, J. Acoust. Soc. Am. 100, 2425
(1996).
55. M. H. Goldstein, A. P. King, M. J. West, Proc. Natl. Acad. Sci.
U.S.A. 100, 8030 (2003).
56. C. Stoel-Gammon, in Phonological Development,
C. A. Ferguson, L. Menn, C. Stoel-Gammon, Eds. (York,
Timonium, MD, 1992), pp. 273–282.
57. M. S. Brainard, A. J. Doupe, Nature 417, 351 (2002).
58. K. Doya, T. J. Sejnowski, in Advances in Neural Information
Processing Systems, G. Tesauro, D. S. Touretzky, T. Leen,
Eds. (MIT Press, Cambridge, MA, 1995), pp. 101–108.
59. A. J. Doupe, P. K. Kuhl, in Neuroscience of Birdsong,
H. P. Zeigler, P. Marler, Eds. (Cambridge Univ. Press,
Cambridge, 2008), pp. 5–31.
60. J. S. Johnson, E. L. Newport, Cognit. Psychol. 21, 60 (1989).
61. R. I. Mayberry, E. Lock, Brain Lang. 87, 369 (2003).
62. L. F. Baptista, L. Petrinovich, Anim. Behav. 34, 1359 (1986).
63. Y. Zhang et al., Neuroimage 46, 226 (2009).
64. P. K. Kuhl, M. Rivera-Gaxiola, Annu. Rev. Neurosci. 31,
511 (2008).
65. P. K. Kuhl et al., Philos. Trans. R. Soc. London Ser. B.
363, 979 (2008).
66. N. Nishitani, R. Hari, Neuron 36, 1211 (2002).
67. T. Imada et al., Neuroreport 17, 957 (2006).
68. G. Dehaene-Lambertz et al., Proc. Natl. Acad. Sci. U.S.A.
103, 14240 (2006).
69. J. P. Shonkoff, D. A. Philips, Eds., From Neurons to
Neighborhoods (National Academy Press, Washington,
DC, 2000).
70. G. J. Duncan et al., Dev. Psychol. 43, 1428 (2007).
71. R. D. S. Raizada, T. L. Richards, A. N. Meltzoff, P. K. Kuhl,
Neuroimage 40, 1392 (2008).
72. D. A. Hackman, M. J. Farah, Trends Cogn. Sci. 13, 65 (2009).
73. C. T. Ramey, S. L. Ramey, Merrill Palmer Q. 50, 471 (2004).
74. J. J. Heckman, Science 312, 1900 (2006).
75. E. I. Knudsen, J. J. Heckman, J. L. Cameron, J. P. Shonkoff,
Proc. Natl. Acad. Sci. U.S.A. 103, 10155 (2006).
76. P. Tallal, Nat. Rev. Neurosci. 5, 721 (2004).
77. P. K. Kuhl et al., Science 277, 684 (1997).
78. S. J. Rogers, J. H. G. Williams, Eds., Imitation and the
Social Mind: Autism and Typical Development (Guilford,
New York, 2006).
79. P. Mundy, L. Newell, Curr. Dir. Psychol. Sci. 16, 269 (2007).
80. K. Toth, J. Munson, A. N. Meltzoff, G. Dawson, J. Autism
Dev. Disord. 36, 993 (2006).
81. P. K. Kuhl, S. Coffey-Corina, D. Padden, G. Dawson,
Dev. Sci. 8, F1 (2005).
82. B. Scassellati, in Proceedings of the 14th IEEE International
Workshop on Robot and Human Interactive Communication,
Nashville, TN, 13 to 15 August 2005, pp. 585–590.
83. P. Bell, B. Lewenstein, A. W. Shouse, M. A. Feder, Eds.,
Learning Science in Informal Environments (National
Academy Press, Washington, DC, 2009).
84. C. D. Lee, Educ. Res. 37, 267 (2008).
85. J. Bruner, The Culture of Education (Harvard Univ. Press,
Cambridge, MA, 1996).
86. B. S. Bloom, Educ. Res. 13, 4 (1984).
87. Supported by NSF Science of Learning Center grants
SBE-0354453 (A.M. and P.K.) and SBE-0542013 (J.M. and
T.S.), NIH grants HD-22514 (A.M.) and HD-37954 (P.K.),
NSF IIS-0808767 (J.M.), and Howard Hughes Medical
Institute (T.S.). The views in this article do not necessarily
reflect those of NSF or NIH. We thank M. Asada for
assistance with the robotic image in Fig. 1; A. Gopnik,
J. Bongard, P. Marshall, S. Cheryan, P. Tallal, and J. Watson
for valuable comments; and the members of the NSF
Science of Learning Centers for helpful discussions.
10.1126/science.1175626
Perspectives on Psychological Science, 8(5), 573–585 (2013). DOI: 10.1177/1745691613498098
In 1916, Margaret Floy Washburn, the first woman to
receive a doctorate in psychology, championed the need
to connect the facts of mental life with those of bodily
movement. In contrast, the behaviorists that followed,
led by John B. Watson, ousted the study of mental
life from scientific psychology—for a time—while retain-
ing the study of motor responses. In the mid-1960s,
a backlash to behaviorism, the cognitive revolution,
occurred. Mental life was back, not only for people but
even for computers, with their gargantuan size, kludgy
switches, fans, and paper tapes. By 1988, the cognitive
revolution was complete: Behaviorism was vanquished,
but in the ensuing enthusiasm for studying the mind, the
relation that Washburn had seen as so worthy of study—
that human consciousness is grounded in the human
body and movements—was nearly forgotten. Thought,
and with it consciousness, was seen, by the standard
view, as disembodied, with the contribution of action
being relegated, along with behaviorism, to the third sub-
basement. Today, though, the view of disembodied cog-
nition is being challenged by approaches that emphasize
the importance of embodiment.
Here we present an idiosyncratic account of what the
field of cognition was and how it has evolved since 1988.
We then describe our approach to embodied cognition.
In preview, the fundamental tenet of embodied cognition
research is that thinking is not something that is divorced
from the body; instead, thinking is an activity strongly
influenced by the body and the brain interacting with the
environment. To say it differently, how we think depends
on the sorts of bodies we have. Furthermore, the reason
why cognition depends on the body is becoming clear:
Cognition exists to guide action. We perceive in order to
act (and what we perceive depends on how we intend to
act); we have emotions to guide action; and understand-
ing even the most abstract cognitive processes (e.g., the
self, language) is benefited by considering how they are
grounded in action. This concern for action contrasts
with standard cognitive psychology that, for the most
part, considers action (and the body) as secondary to
cognition.
We believe that this approach is pushing the evolution
of the field with a surprising (for some of us) but happy
Corresponding Author:
Arthur M. Glenberg, Department of Psychology, Arizona State
University, Mail Code 1104, Tempe, AZ 85287-1104
E-mail: glenberg@asu.edu
From the Revolution to Embodiment:
25 Years of Cognitive Psychology
Arthur M. Glenberg1,2, Jessica K. Witt3, and Janet Metcalfe4
1Arizona State University, 2University of Wisconsin-Madison, 3Colorado State University,
and 4Columbia University
Abstract
In 1988, the cognitive revolution had become institutionalized: Cognition was the manipulation of abstract symbols
by rules. But, much like institutionalized political parties, some of the ideas were becoming stale. Where was action?
Where was the self? How could cognition be smoothly integrated with emotions, with social psychology, with
development, with clinical analyses? Around that time, thinkers in linguistics, philosophy, artificial intelligence, biology,
and psychology were formulating the idea that just as overt behavior depends on the specifics of the body in action, so
might cognition depend on the body. Here we characterize (some would say caricature) the strengths and weaknesses
of cognitive psychology of that era, and then we describe what has come to be called embodied cognition: how
cognition arises through the dynamic interplay of brain controlling bodily action controlling perception, which changes
the brain. We focus on the importance of action and how action shapes perception, the self, and language. Having
the body in action as a central consideration for theories of cognition promises, we believe, to help unify psychology.
Keywords
action, performance, cognition, language, communication, memory
conclusion: Simple principles of embodiment may pro-
vide a unifying perspective for psychology (Glenberg,
2010; Schubert & Semin, 2009).
Cognition in 1988: Thinking Is Symbol
Manipulation
The standard approach to the study of cognition in 1988
was formalized a decade earlier by two of the giants in the
field, Allen Newell and Herbert Simon, under the guise of
the physical symbol system hypothesis (PSSH; Newell &
Simon, 1976). Of course, not every cognitive theory was a
perfect exemplar of the PSSH, but it provided the back-
ground and a set of default assumptions. It was also bril-
liant. Start with the following conundrum: Computers
appear to think. People appear to think. But computers
are lifeless silicon and humans are living flesh. What could
they have in common that could produce thinking? The
answer, according to the PSSH, was that both are symbol
processors with three features in common.
Three features of PSSH thinking
“machines”
First, the symbols have a physical instantiation (e.g.,
bistable memory cells that store patterns of zeros and
ones in computer memory or, in its strong form, physical
objects such as physical tokens). These physical symbols
have representational properties, that is, the symbol
stands in for real things like colors, emotions, images, and
activities. But in contrast to William James’s insight that
mentally we never descend twice into the same stream,
these physical symbols were thought to be context invari-
ant, static, and disembodied. They were characterized by
what came to be known as the “trans-situational identity
assumption” (Tulving, 1983)—the assumption that a par-
ticular symbol was static and immutable, remaining the
same regardless of when or how it was used.
Second, the symbols are manipulated by rules, namely
the if–then operations of programs in a computer, which
were equated to learned rules, associations, and produc-
tions in humans. Thought was taken to be the manipula-
tion of symbols using rules, in both computers and
people.
Third, and importantly, both the computer symbols
and human symbols were seen as arbitrarily (i.e., by con-
vention alone) related to the referent. Thus, just as a
sequence of zeros and ones in a computer representing
the concept, say, bird, does not look or sound or behave
as a real bird, neither does the human mental symbol
resemble a bird. This arbitrariness ensured that the sym-
bols could be manipulated solely by the properties rele-
vant to the rules; it ensured efficiency because there is
nothing in the symbol that is extraneous; it ensured the
possibility of limitless symbols even if the human ability
to discriminate along some dimension was limited; and
finally, it ensured that computer symbols and human
symbols were the same in kind.
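A minimal, hypothetical sketch of this picture (not a model from the literature): arbitrary tokens that in no way resemble their referents are rewritten by if–then productions, and the "thinking" consists entirely of rule application.

```python
# Arbitrary symbols: the tokens in no way resemble what they stand for.
working_memory = {"BIRD", "HAS-WINGS"}

# If-then productions over symbols (the "rules" of the PSSH picture).
productions = [
    (frozenset({"BIRD"}), "ANIMAL"),                  # if BIRD then add ANIMAL
    (frozenset({"ANIMAL", "HAS-WINGS"}), "CAN-FLY"),  # winged animal -> CAN-FLY
]

# Thought, on this view, is nothing more than repeatedly firing any rule whose
# conditions are present in working memory and adding its conclusion.
changed = True
while changed:
    changed = False
    for conditions, conclusion in productions:
        if conditions <= working_memory and conclusion not in working_memory:
            working_memory.add(conclusion)
            changed = True

print(working_memory)  # {'BIRD', 'HAS-WINGS', 'ANIMAL', 'CAN-FLY'}
```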
A good analogy for understanding the PSSH is lan-
guage. In many analyses, languages have words that act
like symbols and syntax that acts like rules for combining
the symbols into new thoughts. It is easy to see that many
words are arbitrarily related to their referents. The word
bird no more looks like, sounds like, or acts like a bird
than does the equally arbitrary French oiseau or German
Vogel. This sort of analogy led Fodor (1975) to explicitly
propose that thinking is a language-like activity: the lan-
guage of thought. Just as language has symbols and syn-
tax, thought also has symbols and syntax.
The PSSH provided the background for many cogni-
tive theories of the day, even though the particular
assumptions were not often explicitly stated or tested.
For example, Anderson’s ACT theory (Anderson, 1990),
Collins and Quillian’s (1969) spreading activation model,
and Kintsch’s construction-integration theory (Kintsch,
1988) were built on the notion of propositions, or units
of meaning consisting of symbols and the relations
among them. Those symbols were the arbitrary symbols
of the PSSH, and the relations and processes were the
rules.
Four problems with PSSH
Symbols. Given the success of PSSH approaches to cog-
nition, what is the problem? Even before the advent of
embodied cognition, cognitive theories were evolving
away from the PSSH. First, if symbols are part of human
cognition, they are not arbitrary but grounded, that is,
traceable to referents in the world. Mental operations
depend on the brain’s neural layers interacting in particu-
lar ways that could be thought of as transformations and
that have particular functions. These operations take
information that originates in the world and subject it to
transformations (such as lateral inhibition in the retina,
autocorrelation in the memory systems, and Fourier
transforms in the auditory system) that are useful to the
person or animal. What is important, however, is that
even with the transformations, the internal representa-
tions at deeper and deeper brain levels are ultimately
traceable to the outside world, that is, grounded.
An example of an early model attempting to ground
symbols is Metcalfe’s CHARM (Metcalfe, 1990; Metcalfe
Eich, 1982, 1985) model. Whereas it is true, even in this
model, that the numbers used to represent a bird do not
in any way look like a bird, the representations were not
completely arbitrary—namely, the model used complex
operations of convolution and correlation for memory
encoding and retrieval, and these operations preserved
similarity among symbols across transformations. Also,
this model and other related non-PSSH models (e.g.,
McClelland & Elman, 1986; Murdock, 1982) were begin-
ning to strive for representations that could potentially be
instantiated in neurons and that bore a more realistic
relation to the structure of semantic memory.
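The convolution and correlation operations mentioned above can be sketched numerically. The code below is not CHARM itself; it is a generic illustration, under assumed random item vectors, of how circular convolution binds two items into a composite memory trace and circular correlation, cued with one item, retrieves a noisy version of its associate while preserving similarity.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256  # dimensionality of the item vectors (an arbitrary illustrative choice)

def convolve(a, b):
    """Circular convolution: bind two item vectors into one associative trace."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def correlate(a, b):
    """Circular correlation: approximately unbind b using a as the retrieval cue."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def random_item():
    return rng.normal(0.0, 1.0 / np.sqrt(N), N)

dog, bone, cat, yarn = (random_item() for _ in range(4))

# Two associations are superimposed in a single composite memory vector.
memory = convolve(dog, bone) + convolve(cat, yarn)

# Cueing with "dog" retrieves a noisy pattern that resembles "bone" far more
# than it resembles "yarn".
retrieved = correlate(dog, memory)
print(float(retrieved @ bone), float(retrieved @ yarn))
```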
Furthermore, the notion of transsituational identity of
symbols, although fine for computers, did not work well
for humans. The experimental data demonstrated that it
was simply untrue that the symbols used by humans
were immutable. The “train” that a person imagined in
the context of freight was a different “train” than was
encoded in the context of black, and memory was con-
text specific—dependent on the specificity of encoding
(Tulving & Thomson, 1973).
Representations separate from processes. The PSSH
makes a strong distinction between representations and
the processes that act on them. Whereas this distinction
holds for some computer languages, it is not at all clear
that the distinction can be made for humans. For exam-
ple, from a connectionist–constraint satisfaction perspec-
tive, representations arise from patterns of activation
produced by processing, and they are not separable from
the processing. From the dynamic systems perspective,
there is no representation without process. From the per-
ception–action perspective, what we perceive is neces-
sarily related to how we act (Gibson, 1979).
The Cartesian distinction. The PSSH makes a Carte-
sian distinction between thought and action, treating
mind as disembodied. That is, according to PSSH, the
exact same thoughts occur when a computer is manipu-
lating symbols by using rules and when a person is
manipulating the same symbols by using the same rules.
The particulars of the body housing the symbol manipu-
lation were thought to be irrelevant. Later, we review
work within the tradition of embodied cognition that
shows that the Cartesian distinction is false.
The role of the self in cognitive processing. Although
PSSH models accounted for much of the data on human
memory and cognition, there was no hint of a “self” in
these models. Theorists such as Tulving (1993) had
insisted that episodic memory involves a special kind of
self-knowing or autonoetic consciousness, and evidence
was mounting for this view (Wheeler, Stuss, & Tulving,
1997). Autobiographical memory for events that were
specific to the individual was extensively studied. But
conceptualizing and formally implementing something
like a “self” and characterizing its function were then
(and largely remain) beyond the scope of models of
memory and cognition. One reason may have been that
although the operations used in these models
were neurologically plausible, they relied only on the
perceptual system, not the motor system or any feedback
from the latter.
In contrast to the PSSH, memory is radically enhanced
when the self is involved. Thus, for example, literally
enacting a task, such as breaking a toothpick, produces
much better memory for the event than watching another
perform the task (e.g., Engelkamp, 1995). Similarly, mem-
ory is also enhanced when the encoder relates the items
to his- or herself (Cloutier & Macrae, 2008; Craik et al.,
1999; Macrae, Moran, Heatherton, Banfield, & Kelley,
2004). What is this self-knowledge, self-involvement, and
self-reflective consciousness, what are its effects, and
how did it come about? We propose that some notion of
embodiment is the needed but missing ingredient.
Cognition in 2013: Embodiment
Just as there are many “standard” approaches to cognition,
not just the PSSH, there are also many embodied
approaches. And these approaches can differ in some fun-
damental ways. For example, Lakoff and Johnson’s (1980)
work on metaphor and Barsalou’s (1999) work on con-
cepts rely strongly on involvement of representations,
whereas Beer’s (1996) work on understanding “minimally”
cognitive tasks and Gibson’s (1979) work on direct per-
ception—which has inspired many embodied cognition
theorists—are explicitly representation-less. Nonetheless,
there are also some general themes that resonate in the
embodiment community.
Cartesian dualism is wrong
As noted above, thinking is not something that is divorced
from the body; instead, thinking is influenced by the
body and the brain interacting with the environment.
This claim can be fleshed out in a number of ways. For
example, Proffitt’s (2006) work on visual perception
demonstrates that perceived distance is affected by the
energetic demands of the body needed to traverse the
distance. Thus, the same distance looks longer when you
are tired, when wearing a heavy backpack, or when of
low physical fitness. Casasanto’s (2011) studies reveal
surprising differences between left-handers and right-
handers. Left-handers and right-handers use different
parts of the brain when thinking about action verbs; they
think about abstract concepts such as “goodness” differ-
ently; and, perhaps most amazingly, a few minutes of
experimentally induced changes in use of the hands pro-
duces differences in how people think about good and
bad. Work on brain imaging (e.g., Hauk, Johnsrude, &
Pulvermüller, 2004) has shown that when people hear a
verb such as “pick,” areas of the motor cortex used to
control the hands are quickly activated, whereas on
hearing a verb such as “kick,” areas of motor cortex used
to control the legs are activated.
Furthermore, changes in the body produce changes in
cognition. Blocking the use of the corrugator (frowning)
muscle by cosmetic injection of Botox selectively slows
the processing of sentences describing angry and sad
events but not happy events (Havas, Glenberg, Gutowski,
Lucarelli, & Davidson, 2010). Patients with severe spinal
cord injury have a reduction in ability to perceive differ-
ences in human gaits, although not in perception of grat-
ings (Arrighi, Cartocci, & Burr, 2011). These results
suggest the possibility of what philosophers call “consti-
tution”: Activity in the body and sensorimotor cortices of
the brain not only contribute to cognition—that activity is
cognition.
These data help to explicate the notion that thinking is
grounded in the sensorimotor system. Thus, the psycho-
logical meaning of distance is grounded in the body’s
energy consumption; the meanings of words like “kick”
and “pick” are grounded in how we interact with the
world by using our legs and hands; and the meaning of
anger is at least in part its expression using the facial
muscles. Ideas are not platonic abstractions that can be
as easily implemented in a computer as in a person.
Instead, ideas and the very act of thinking are simulations
using sensory, motor, and emotional systems.
The importance of action
A second theme of embodiment is an emphasis on action.
It seems certain that one evolutionary pressure on cogni-
tion is the need to control action: Without action there is
no survival. Furthermore, according to the noted biolo-
gist Rodolfo Llinas (2001, p. 15), “A nervous system is
only necessary for multicellular creatures . . . that can
orchestrate and express active movement.” And ever
since the seminal work of James Gibson (1979) on direct
perception and affordances, psychologists have become increasingly convinced that action changes perception and that animals have perceptual systems because of the need to
control action. Thus, the perception–action cycle is at the
heart of cognition.
The amazing Held and Hein (1963) kitten carousel
experiment illustrates the role of action on perception in
development. Pairs of kittens were exposed to a visual
environment in which one kitten controlled its own loco-
motion while the other was in a yoked gondola—receiving
the same visual input but without self-controlled locomo-
tion. Whereas all kittens had a normal response to light
and visually pursued moving objects, the passive kittens
exhibited impaired blink responses and impaired visually
guided paw placement. They also failed to discriminate
depth, frequently walking over a visual cliff (which in
the real world could, of course, be fatal). Campos et al.
(2000) documented a similar need to link perception and
self-locomotion in human development. Whereas tradi-
tional approaches consider perception to operate inde-
pendently of action, these results demonstrate the
necessity of action even for learning how to perceive.
Overview
Below we expand on the action theme in three ways.
First, we describe the role of action in perception.
Traditional models consider perception to be indepen-
dent and prior to action; here we describe evidence that
action should be considered as an intricate part of per-
ception. Second, we speculate on how an understanding
of the underlying mechanisms of action can reveal insights
into what appears to be the most abstract of concepts: the
self. Consequently, the inclusion of action in processes
such as perception and memory results in the inclusion of
the self in these processes. Third, we demonstrate the
practical side of embodiment by describing an embodied
reading comprehension intervention based on action.
These three areas of research are not typically associated
with action; thus, demonstrating the connection helps to
justify our claim that cognition is thoroughly embodied
because it is for action.
Action’s effect on perception
During the past 25 years, much of the research on per-
ception has resonated with themes found in other areas
of cognitive research. Starting with Marr’s (1982) seminal
book, researchers used the analogy of computers to
study perception. Marr distinguished three levels at which
a process could be assessed: a computational level (What
does a process do?), an algorithmic level (How does the
process work?), and an implementation level (How is the
process realized?). In this framework, it is irrelevant
whether the implementation occurs on neural hardware
or computer hardware. The focus was instead on the
computations needed to perceive.
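Marr's distinction can be made concrete with a toy example outside of vision. In the sketch below (our own illustration, with made-up function names), the computational level fixes what is computed, the two functions are different algorithms for that same computation, and the implementation level is whatever hardware happens to run them.

```python
# Computational level: what is computed -- the arithmetic mean of a list.
# Algorithmic level: two different procedures that compute the same thing.
# Implementation level: the hardware running the interpreter (not modeled here).

def mean_running_total(xs):
    total = 0.0
    for x in xs:
        total += x
    return total / len(xs)

def mean_divide_and_conquer(xs):
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    left, right = xs[:mid], xs[mid:]
    return (mean_divide_and_conquer(left) * len(left) +
            mean_divide_and_conquer(right) * len(right)) / len(xs)

data = [1.0, 2.0, 3.0, 4.0]
print(mean_running_total(data), mean_divide_and_conquer(data))  # same answer: 2.5 2.5
```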
The prevalence of this approach—treating human
perception as analogous to the processes of computer
symbol manipulators—is revealed by the methods
used to study perception. A typical setup is shown in
Figure 1. The observer’s head is stabilized on a chin rest
so as to eliminate head motion, thus allowing the experi-
menter to have complete control over what the observer
views. The observer’s task is nearly always a judgment
of some sort but not in the context of action. Indeed, the
setup intentionally removes any potential for action, forc-
ing on the human observer the constraints and limita-
tions of a computer. Furthermore, the conjectured goals
of vision also align with those that could be equally
shared by a computer program, namely, of formulating a
representation of shape (Marr, 1982), object identification
(Yantis, 2000), or spatial layout (Palmer, 1999), based on
the impoverished information given in the display unin-
formed by human action and its consequents.
These PSSH-oriented approaches were challenged by
Gibson (1979) in favor of an approach that captures sev-
eral of the key ideas of embodied cognition, including a
role for the self. According to Gibson’s theory of direct
perception, the information for perception comes from
invariants and variants within the optic array as the per-
ceiver moves. These invariants and variants specify both
the external environment and the perceiver. For example,
the close correlation between self-movement and change
in the optic array indicates that the self is moving (e.g.,
Blakemore, Wolpert, & Frith, 1998); changes in the optic
array without self-action indicate a changing world; and
the two together specify the dimensionality of space
(Philipona, O’Regan, & Nadal, 2003). Consequently, the
self is necessarily perceived when perceiving the envi-
ronment, and perception of the environment could not
be achieved without also perceiving the self.
Gibson’s theory of affordances emphasized the role of
action in perception. Affordances are possibilities for
action or for a consequence on the perceiver in a given
environment. According to Gibson, affordances are the
main object of perception. In making this claim, Gibson
stressed action as a key concept in perception, rather
than behaviorally independent representations of spatial
layout and object identity. But although this concept was
accepted for animal vision, mainstream researchers, such
as Marr (1982), held to the idea that perception’s goal is
to recover platonic geometric properties rather than to
uncover affordances:
The usefulness of a representation depends upon
how well suited it is to the purpose for which it is
used. A pigeon uses vision to help it navigate, fly,
and seek out food. . . . Human vision, on the other
hand, seems to be very much more general. (p. 32)
Perhaps it is a testament to the importance of action in
perception that its role continues to reemerge. Today,
there are many different findings of effects of action on
perception. For instance, a slight modification to the typi-
cal setup for perception experiments—putting one’s
hands next to the display—modifies visual processes
related to attention, such as detection, search, and inhibi-
tion (see Brockmole, Davoli, Abrams, & Witt, 2013). Thus,
perception is influenced by the mere proximity of one’s
hands and their potential to act (e.g., Reed, Betz, Garza,
& Roberts, 2010). Another example of action’s influence
is apparent in action observation—the perception of oth-
ers performing actions (Knoblich, Thornton, Grosjean, &
Shiffrar, 2006). For example, apparent motion is the illu-
sion of motion from static images of the endpoints of the
motion. What is important is that when the endpoints
depict humans, people perceive biologically plausible
paths rather than paths along the shortest distance, as
would be expected when observing apparent motion
with nonbiological objects (Shiffrar & Freyd, 1990).
Perception of object features, such as position and direc-
tion, is also influenced by the perceiver’s actions and
intentions to act (Lindemann & Bekkering, 2009; Müsseler
& Hommel, 1997).
Action-specific account of perception. The main
claim of the action-specific account of perception (Prof-
fitt, 2006; Witt, 2011a) is that perception is influenced by
a person’s ability to act. For example, softball players
who are hitting better than others see the ball as bigger
(Witt & Proffitt, 2005). In addition to effects of sports per-
formance on apparent target size (e.g., Gray, in press;
Lee, Lee, Carello, & Turvey, 2012; Witt & Dorsch, 2009;
Witt & Sugovic, 2010), the energetic costs associated with
performing an action also influence perception. Hills
look steeper and distances look farther to perceivers who
are fatigued, out of shape, encumbered by a heavy load,
or elderly (Bhalla & Proffitt, 1999; Sugovic & Witt, in
press). Furthermore, changes to a perceiver’s ability to
perform an action, such as via tool use, also influence
perception. Targets that would otherwise be beyond
reach look closer when the perceiver intends to reach
with a tool (Witt, 2011b; Witt & Proffitt, 2008; Witt, Prof-
fitt, & Epstein, 2005; see also Bloesch, Davoli, Roth,
Brockmole, & Abrams, 2012; Brockmole et al., 2013;
Davoli, Brockmole, & Witt, 2012; Kirsch, Herbort, Butz, &
Kunde, 2012; Osiurak, Morgado, & Palluel-Germain, 2012).
Fig. 1. A typical setup for studying perception. Chin rests are used to minimize head movements.
In addition, changing the apparent size of one's
body in virtual reality (Linkenauger, Mohler, & Bülthoff,
2011; Mohler, Creem-Regehr, Thompson, & Bülthoff,
2010; Van der Hoort, Guterstam, & Ehrsson, 2011) or
through optic magnification (Linkenauger, Ramenzoni, &
Proffitt, 2010; Linkenauger, Witt, & Proffitt, 2011) influ-
ences perceived distance to and size of objects. This
research demonstrates that the same object, which gives
rise to the same optical information, looks different
depending on the perceiver’s abilities.
That the perceptual system is tightly connected to the
motor system seems necessary from an evolutionary per-
spective. Sensory processes are costly in terms of ener-
getics, and only sensory processes that are useful will
confer an adaptive advantage that will likely be promoted
through reproduction. Both nonhuman animals and
humans have perceptual processes relevant to their spe-
cific bodies and capabilities. The time has come to con-
sider the perceptual system as part of an integrated
perception–action system.
Necessity of embodiment. The studies mentioned above
make a case for a role of embodiment in perception, but
they do not speak to whether embodiment is necessary for
perception. Indeed, people can perceive objects for which
action is not very likely (such as the moon, although of
course perception of the moon is not very accurate). There
are some reasons to believe that action might be necessary
for perception. First, developing the ability to perceive in a
way that is useful for acting requires experience pairing perceptual information with self-produced action (Held &
Hein, 1963; see also Campos et al., 2000).
In addition, a new proposal for the mechanism of
action-specific effects called the perceptual ruler hypoth-
esis suggests that embodiment may be necessary for per-
ception. The perceptual ruler hypothesis (Proffitt &
Linkenauger, 2013) solves a problem that is largely
ignored: All optical information takes the form of angles,
such as angles of retinal projection or angles of disparity,
or changes in these angles. To perceive dimensions such
as distance and size, these angles must be transformed
accordingly. Yet little is known about these transforma-
tion processes or “rulers.” According to the perceptual
ruler hypothesis, these rulers are based on the body. For
example, eye height is used to perceive distance and
object height (Sedgwick, 1986), the hand is used for scal-
ing the size of graspable objects (Linkenauger, Witt, et al.,
2011), and the arm is used for scaling the size of reach-
able objects (Witt et al., 2005). Similarly, the body’s abili-
ties in terms of physiological and behavioral potential
scale other aspects of the environment, such as hills, far
distances, and the size of targets.
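As a minimal illustration of what a body-scaled "ruler" could look like, eye height can convert purely angular information into distance and size. The snippet below is our own illustrative geometry, not a formula taken from Proffitt and Linkenauger (2013); the function names and numbers are hypothetical.

```python
import math

def distance_from_eye_height(eye_height_m: float, declination_deg: float) -> float:
    """Distance along the ground to a target, scaled by the body-based 'ruler'
    of eye height: d = h / tan(angle of the target below the horizon)."""
    return eye_height_m / math.tan(math.radians(declination_deg))

def object_height_from_horizon(eye_height_m: float, horizon_ratio: float) -> float:
    """Eye-height scaling of size: the horizon cuts every object at eye height,
    so if it cuts an object at proportion `horizon_ratio` of its height (0-1),
    the object is eye_height / horizon_ratio tall."""
    return eye_height_m / horizon_ratio

# A 1.6 m eye height and a target 5 degrees below the horizon:
print(round(distance_from_eye_height(1.6, 5.0), 1))    # ~18.3 m away
print(round(object_height_from_horizon(1.6, 0.5), 1))  # ~3.2 m tall
```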
In summary, the perceptual ruler account solves the
important problem of scaling visual information (i.e.,
turning visual angles into meaningful information), and
the ruler used to perform the scaling is the body. In fact,
no one has proposed a ruler that is not body-based.
Thus, this solution emphasizes that the body not only
influences perception, it is necessary for perception, too.
Forward models: A mechanism for
action simulation, prediction, and
the self
Barsalou (1999) made a strong case that tracing the trans-
formations from perception at the level of the retina or
the ear, or action at the level of external bodily move-
ment, through to coherent high-level representations is
necessary and a central goal of embodied cognition
research. In fact, only by grounding representational cog-
nition in the world can symbols take on meaning (Searle,
1980). Although a general well-specified model integrat-
ing cognition and action does not yet exist, some inroads
have been made. As we shall see, to model actions, both
internal representations (simulations) and a computation
that provides the basis for a “self”—a missing component
in cognitive models—are required.
One of the most striking of these action models comes
from endeavors undertaken by Wolpert (e.g., Wolpert,
Ghahramani, & Jordan, 1995). The argument is that action,
from the simplest finger movement to the most complex
social interactions (e.g., Wolpert, Doya, & Kawato, 2003),
requires two types of models. A controller model gener-
ates the efferents or commands to move the muscles. But
how does the system determine that the movement is, or
is likely to be, successful? Of course, sensory feedback is
important, but for many situations in which action timing
is important (e.g., playing the piano, walking down stairs,
control of the speech articulators, or holding a conversa-
tion), sensory feedback is too slow. The solution is a sec-
ond type of model variously called a predictor model or
forward model. The forward model uses an exact copy of
the efferent signal to simulate the action and predict the
effects of the action. A comparator can then make several
types of computations. First, it can be used to determine
whether the action is unfolding correctly: Do the predic-
tions from the forward model match the desired outcomes?
Second, it allows knowledge of whether the action was
effective: Does the sensory feedback match the prediction
from the forward model?
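One way to see how the controller model, the forward model, and the comparator fit together is the toy sketch below. This is our own schematic Python, with hypothetical names and trivial "body" dynamics; the models described by Wolpert and colleagues are continuous, probabilistic, and learned.

```python
import random

def controller(goal_state: float, current_state: float) -> float:
    """Controller model: generate the efferent command that should move
    the current state to the goal (a crude one-step rule)."""
    return goal_state - current_state

def forward_model(current_state: float, efference_copy: float) -> float:
    """Forward (predictor) model: simulate the action's effect from a copy
    of the efferent signal, without waiting for slow sensory feedback."""
    return current_state + efference_copy  # toy body dynamics

def comparator(predicted: float, desired: float, sensed: float, tol: float = 0.05) -> dict:
    """The two comparisons described in the text: prediction vs. desired outcome
    (is the action unfolding correctly?) and prediction vs. sensory feedback
    (was the action effective?)."""
    return {
        "on_track": abs(predicted - desired) < tol,
        "prediction_error": abs(predicted - sensed),
    }

state, goal = 0.0, 1.0
command = controller(goal, state)                    # efferent command to the muscles
prediction = forward_model(state, command)           # fast internal simulation
actual = state + command + random.gauss(0.0, 0.01)   # delayed, noisy sensory feedback
print(comparator(prediction, desired=goal, sensed=actual))
```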
Critically, the forward model can also be used imagi-
natively, in mental practice, to evaluate outcomes without
performing them (e.g., Grush, 2004). That is, imagination
is generating motor commands but inhibiting the efferent
signal before the muscles are moved. Nonetheless, gen-
erating the efferent commands also generates the signal
used by the forward model, and the predictions gener-
ated by the forward model are tantamount to imagery.
Combining the notions of imagination, prediction, and
action prompted several independent applications of for-
ward models to language. Pickering and Garrod (2013)
developed the idea that forward models are used to pre-
dict how and what conversational partners are going to
say, and thus forward models provide a mechanism for
coordinating discourse through alignment in speaking
rate, word choice, and syntax. The action-based language
theory of Glenberg and Gallese (2012) suggests that for-
ward models play an essential role in predicting (simulat-
ing), and thus understanding, the content of language. In
fact, Lesage, Morgan, Olson, Meyer, and Miall (2012)
demonstrated that disrupting forward models in the
motor system (using repetitive transcranial magnetic
stimulation of the cerebellum) disrupted prediction in
language processing.
Forward models and the self. Although the compara-
tor model rests firmly in the domain of the motor system,
an extension of this model to the role of the self in cogni-
tion has been explicitly proposed. Many investigators
(e.g., Blakemore et al., 1998) realized that a close match
of actual and predicted outcomes justifies the inference
that the person himself or herself was in control. A mis-
match implies the work of external forces and a lack of
personal control. This simple mechanism, then, provides
a basis for people’s knowledge of their own agency. Dis-
crepancies between one’s own intention and the out-
come could be used in making judgments of self- or
other attribution. Indeed, Blakemore, Wolpert, and Frith
(2002; and see Blakemore & Frith, 2003; Decety & Lamm,
2007) soon proposed brain-based frameworks using the
comparator model to make testable predictions concern-
ing people’s feelings of agency—their feelings of the
involvement of self.
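Continuing the toy sketch above, the extra step needed to turn the comparator into an agency detector is a single threshold on the prediction error; the fragment below is our own hypothetical illustration of that logic.

```python
def attribute_agency(prediction_error: float, threshold: float = 0.05) -> str:
    """A close match between predicted and actual outcomes supports the inference
    that the person was in control; a large mismatch implies external forces."""
    return "self-generated" if prediction_error < threshold else "externally caused"

print(attribute_agency(0.01))  # 'self-generated'
print(attribute_agency(0.40))  # 'externally caused'
```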
Action, the self, and memory. The “self” in memory
tasks has always been mysterious. William James referred
to it in his seminal writings. Tulving, too, makes use of
this construct, in distinguishing between semantic mem-
ory and episodic memory and in discussing different pur-
ported kinds of consciousness. The “highest” of these, he
claimed, is self-knowing—autonoetic—consciousness.
Nevertheless, the construct of self has always been
slightly disreputable for cognitive psychologists, perhaps
because it is difficult to define and elusive to model.
Even so, the many cases where the involvement of the
“self” has a large impact on memory involve either physi-
cal or mental action. The connection to the motor system
may not be accidental. Enactment effects, wherein mem-
ory is enhanced for those things one does oneself, as
compared with those that someone else does (e.g.,
Engelcamp, 1995), are pervasive for all but some people
with autism. The benefits of active versus passive
learning are probably due to involvement of the self.
Nairne and Pandeirada’s (2008) findings that survival-
relevant scenarios are easily remembered may also be
due to “self” involvement or some implicit threat to the
self (and see Humphrey, 2006). The generation effect,
whereby memory is greatly improved when the answers
are self-generated rather than given externally, also impli-
cates an active self, which alters memory processing. It
seems, then, that despite its elusiveness, the self has
importance for human memory, and the motor system—
which appears to be the basis of this construct—may be
implicated at deep levels.
Action and language
At first blush, the notion that action systems play a role in
language seems close to absurd: Language appears to be
a mental activity, not a bodily one (e.g., Hauser, Chomsky,
& Fitch, 2002). But in fact, the data are clear that action
systems play an important role in language production,
perception, and comprehension (e.g., Glenberg &
Gallese, 2012). In this section, we review the data sup-
porting this claim and then demonstrate its importance
for both theory and practice by describing an embodied
reading comprehension intervention called Moved by
Reading (MbR).
Language comprehension as a simulation pro-
cess. Work in cognitive neuroscience demonstrates
many links between language and action. As noted
before, Hauk et al. (2004) used functional MRI to demon-
strate that just hearing words such as “pick” activated
motor areas of the brain controlling the hand. Further-
more, language–action links are bidirectional (e.g., Arav-
ena et al., 2010): Preparing to move the hand affects
understanding language about hand movements, and lan-
guage understanding affects motor preparation. Even the
specifics of motor experience matter. For instance, hockey
experience modifies motor areas of the brain (including
those used in prediction), and those areas are then used
in the understanding of language about hockey (Beilock,
Lyons, Mattarella-Micke, Nusbaum, & Small, 2008).
Behavioral data tell a similar story. For example, turn-
ing a knob clockwise or counterclockwise to reveal the
next portion of a story affects the reading of sentences
implying clockwise (e.g., “He turned the key to start the
car”) or counterclockwise actions as shown by Zwaan,
Taylor, and De Boer (2010).
These results support the claim that language compre-
hension is a simulation process (Aziz-Zadeh, Wilson,
Rizzolatti, & Iacoboni, 2006; Barsalou, 1999; Glenberg
& Gallese, 2012). In brief, words and phrases are
transformed into a simulation of the situation described.
Furthermore, the simulation takes place in neural
systems normally used for action, sensation, and emo-
tion. Thus, language about human action is simulated
using motor cortices; comprehending descriptions of
visual scenes activates areas of the brain used in visual
perception (e.g., Rueschemeyer, Glenberg, Kaschak,
Mueller, & Friederici, 2010); and understanding language
about emotional situations calls on neural systems
engaged by emotion (e.g., Havas et al., 2010). In all of
these cases, forward models based in the motor system
are used to guide the simulations, but the forward mod-
els make use of sensory and emotional systems.
Action and language: An implication for educa-
tion. Given that language comprehension is a process
of simulation, it follows that one component in teaching
children how to read for comprehension should be
teaching them to simulate. That is the goal of the MbR
intervention.
The MbR computer program consists of two parts. In
the first part, physical manipulation (PM), children read
multiple texts from a scenario (e.g., stories that take place
on a farm). After reading an action-oriented sentence
(indicated to the child by a green traffic light), the child
manipulates images on the computer screen to simulate
the content of the sentence (See Fig. 2). For example,
if the child reads, “The farmer drives the tractor to the
barn,” the child moves the image of the farmer to the trac-
tor and then moves the conjoined image of the tractor and
farmer to the image of the barn. In the second stage,
called imagine manipulation (IM), the child is taught to
imagine manipulating the scene in the same way that he
or she physically manipulated the scene. IM, in contrast to
simple imagery instructions, is likely to engender a signifi-
cant motor component in addition to visual imagery.
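To make the two stages concrete, here is a deliberately simplified sketch of what a physical-manipulation trial could look like in code. Everything in it (the Sentence structure, the move list, the matching rule) is our own hypothetical scaffolding, not the actual MbR implementation.

```python
from dataclasses import dataclass

@dataclass
class Sentence:
    text: str
    agent: str     # who acts
    patient: str   # what is acted on

def show_green_light() -> None:
    # Signal to the child that this sentence should be acted out.
    print("GREEN LIGHT: act out the sentence")

def pm_trial(sentence: Sentence, moves) -> bool:
    """Physical-manipulation (PM) trial: after the green light, the child drags
    images; here the trial counts as a correct simulation if some move places
    the agent's image onto the patient's image (the who-does-what-to-whom of
    the sentence)."""
    show_green_light()
    return (sentence.agent, sentence.patient) in moves

s = Sentence("The farmer drives the tractor to the barn", agent="farmer", patient="tractor")
print(pm_trial(s, moves=[("farmer", "tractor"), ("tractor+farmer", "barn")]))  # True
```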
The PM and IM procedures enhance simulation and
comprehension by encouraging (a) vocabulary develop-
ment by grounding word meaning in the neural represen-
tation of the pictures and objects, (b) appreciation of the
function of syntax by grounding the syntax (i.e., the who
does what to whom) in the child’s own actions, and
(c) the integration of words in a sentence and the integra-
tion of sentences across the text as the visually depicted
scene is updated by the child. The advantages of MbR have been demonstrated for texts trained
with the technique (Glenberg, Gutierrez, Levin, Japuntich, & Kaschak, 2004); the benefits
generalize to texts read after training, and the technique can be implemented in small reading
groups (Glenberg, Brown, & Levin, 2007), extends to solving math problems (Glenberg, Willford,
Gibson, Goldberg, & Zhu, 2012), and benefits nonnative speakers (Marley, Levin, & Glenberg,
2007, 2010).
If a child understands oral language, and thus is simu-
lating language by using appropriate neural systems,
why does the child need to be taught how to simulate
when reading? In learning an oral language, the linking
of symbols (spoken words and phrases) to sensorimotor
activity is frequent and immediate. For example, a
mother will say, “Here is your bottle” and give the
baby the bottle; or, a father will say, “Wave bye-bye”
and gesture waving. From these interactions, the process
of moving from the auditory linguistic symbol to the
neural representations of objects and actions is highly
practiced and becomes fast and automatic. The key to
MbR is to make reading more like oral language: Teach
the child how to ground written words in sensorimotor
activity.
Conclusion
Approaching solutions to four
problems
Earlier, we noted four problems with the PSSH. Does the
embodied approach help to solve those problems?
Consider first arbitrary, ungrounded symbols. Although
there are debates among embodied cognition theorists as
to the necessity of representations, they all agree that cogni-
tion is grounded in the body’s actions in the world and is
not just the manipulation of arbitrary symbols. Instead,
what we perceive is related to how we can act in the
world; our sense of self is determined by the relations
among our actions, their expected effects, and the
observed effects; our understanding of language depends
on simulations using neural and bodily systems of action,
perception, and emotion.
Second, are there static symbols that form the core of
our knowledge? Perhaps there are some, but the data
strongly imply that our activity influences and coordi-
nates that knowledge. Our skill in acting affects how we
perceive the world, and changing that skill changes
what we perceive (Lee et al., 2012; Witt, Linkenauger,
Bakdash, & Proffitt, 2008; Witt & Proffitt, 2005).
Third, is the mind disembodied? Can there be a brain
or mind in a vat? Not if that vat is unconnected to a sens-
ing and acting body. But is it right to consider the bodily
activity as part of cognition rather than just the mecha-
nism of input and output? Shapiro (2011) makes an inter-
esting analogy. A car engine’s exhaust seems to be just an
external waste product of the generation of energy. But
consider when the exhaust powers a turbocharger that
forces more air into the cylinders, thereby boosting
energy output. The exhaust now becomes an integral
component of energy production. Similarly, the predic-
tions of sensory feedback generated by forward models,
bodily activity, and the feedback from activity become
integral to cognition.
Fig. 2. The screenshots illustrate physical manipulation (a) before reading the sentence “He drives the tractor to
the barn,” (b) midway through manipulating for the sentence, and (c) after successful manipulation. The green
traffic light is the signal for the child to manipulate the toys to correspond to the sentence.
Finally, as a result of the interaction between action
and feedback, the very perception of the self is grounded
in activity—it is embodied.
The notion that embodiment can help to unify psy-
chology is hinted at in this essay where we have tried to
illustrate links between perception, action, memory, lan-
guage, the self, and applications of psychology. Given
that the body is present when we are sensing, thinking,
acting, emoting, socializing, and obeying cultural impera-
tives, it is a good bet that considering the effects of the
body in these endeavors will lead to a more unified,
coherent, comprehensive, and useful psychology.
Cognition in 2038
Even the best forward models are hard-pressed to predict
more than a second or two into the future, let alone
25 years. Whether or not embodiment survives as a via-
ble theoretical framework for decades to come, it has set
a salutary course that we hope will continue—namely, it
provides new perspectives, new theories, and new meth-
ods that may help to unify psychology. If this unification
is to emerge by 2038, much work needs to be done. In
particular, embodiment researchers must move from
demonstrations of embodiment to theoretical approaches
that have broad reach. These theoretical approaches
must make strong predictions about the directions of the
effects, and must be able to describe the underlying
mechanisms, including how information from forward
models is incorporated into other processes, such as
memory and perception. In addition, the concept of the
self needs development. Here, we described the self as if
it were a singular concept, but it is likely there are mul-
tiple aspects of the self that play different roles in cogni-
tion (e.g., Damasio, 2010).
Theoretical approaches also need to continue to make
the distinction between the idea that the body can influ-
ence cognition and the idea that the body is necessary to
understand cognition. Given the omnipresence of the
body in human activity, we think that developing such an
approach is not only possible but essential for develop-
ment and unification of our field.
Acknowledgments
The authors contributed equally to this article. We thank Sian
Beilock, Douglas Hintzman, Dennis Proffitt, and David Sbarra
for helping to improve this article.
Declaration of Conflicting Interests
Arthur M. Glenberg is co-owner of Moved by Reading, LLC.
Funding
In preparing this article, Arthur M. Glenberg was partially sup-
ported by the National Science Foundation (Grant Number
1020367), Jessica K. Witt was supported by the National Science
Foundation (Grant Number BCS-0957051), and Janet Metcalfe
was supported by the James S. McDonnell Foundation (Grant
Number 220020166). Any opinions, findings, and conclusions
or recommendations expressed in this material are those of the
authors and do not necessarily reflect the views of the funding
agencies.
References
Anderson, J. R. (1990). The adaptive character of thought.
Mahwah, NJ: Erlbaum.
Aravena, P., Hurtado, E., Riveros, R., Cardona, J. F., Manes, F.,
& Ibáñez, A. (2010). Applauding with closed hands: Neural
signature of action-sentence compatibility effects (P. L.
Gribble, Ed.). PLoS ONE, 5(7), e11751. doi:10.1371/journal
.pone.0011751
Arrighi, R., Cartocci, G., & Burr, D. (2011). Reduced percep-
tual sensitivity for biological motion in paraplegia patients.
Current Biology, 21, R910–R911.
Aziz-Zadeh, L., Wilson, S. M., Rizzolatti, G., & Iacoboni, M.
(2006). Congruent embodied representations for visually
presented actions and linguistic phrases describing actions.
Current Biology, 16, 1818–1823.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral
& Brain Sciences, 22, 577–660.
Beer, R. D. (1996). Toward the evolution of dynamical neural
networks for minimally cognitive behavior. In P. Maes, M.
Mataric, J. Meyer, J. Pollack, & S. Wilson (Eds.), From ani-
mals to animats 4: Proceedings of the Fourth International
Conference on Simulation of Adaptive Behavior (pp. 421–
429). Cambridge, MA: MIT Press.
Beilock, S. L., Lyons, I. M., Mattarella-Micke, A., Nusbaum,
H. C., & Small, S. L. (2008). Sports experience changes
the neural processing of action language. Proceedings
of the National Academy of Sciences, USA, 105, 13269–
13272.
Bhalla, M., & Proffitt, D. R. (1999). Visual-motor recalibration
in geographical slant perception. Journal of Experimental
Psychology: Human Perception and Performance, 25,
1076–1096.
Blakemore, S.-J., & Frith, C. (2003). Self-awareness and action.
Current Opinion in Neurobiology, 13, 219–224.
Blakemore, S.-J., Wolpert, D. M., & Frith, C. D. (1998). Central
cancellation of self-produced tickle sensation. Nature
Neuroscience, 1, 635–640.
Blakemore, S.-J., Wolpert, D. M., & Frith, C. D. (2002).
Abnormalities in the awareness of action. Trends in Cognitive
Science, 6, 237–242. doi:10.1016/S1364-6613(02)01907-1
Bloesch, E. K., Davoli, C. C., Roth, N., Brockmole, J. R., &
Abrams, R. A. (2012). Watch this! Observed tool use affects
perceived distance. Psychonomic Bulletin & Review, 19,
177–183.
Brockmole, J. R., Davoli, C. C., Abrams, R. A., & Witt, J. K.
(2013). The world within reach: Effects of hand posture
and tool-use on visual cognition. Current Directions in
Psychological Science, 22, 38–44.
Campos, J. J., Anderson, D. I., Barbu-Roth, M. A., Hubbard, E.
M., Hertenstein, M. J., & Witherington, D. (2000). Travel
broadens the mind. Infancy, 1, 149–219.
Casasanto, D. (2011). Different bodies, different minds:
The body specificity of language and thought. Current
Directions in Psychological Science, 20, 378–383.
Cloutier, J., & Macrae, C. N. (2008). The feeling of choosing:
Self-involvement and the cognitive status of things past.
Consciousness and Cognition, 17, 125–135.
Collins, A. M., & Quillian, M. R. (1969). Retrieval time from
semantic memory. Journal of Verbal Learning and Verbal
Behavior, 8, 240–247.
Craik, F. I. M., Moroz, T. M., Moscovitch, M., Stuss, D. T.,
Winocur, G., Tulving, E., & Kapur, S. (1999). In search of
the self: A positron emission tomography study. Psychological
Science, 10, 26–34.
Damasio, A. (2010). Self comes to mind. New York, NY: Random
House.
Davoli, C. C., Brockmole, J. R., & Witt, J. K. (2012). Compressing
perceived distance with remote tool-use: Real, imagined,
and remembered. Journal of Experimental Psychology:
Human Perception and Performance, 38, 80–89.
Decety, J., & Lamm, C. (2007). The role of the right tempo-
roparietal junction in social interaction: How low-level
computational processes contribute to meta-cognition.
Neuroscientist, 13, 580–593. doi:10.1177/1073858407304654
Engelcamp, J. (1995). Visual imagery and enactment of actions
in memory. British Journal of Psychology, 86, 227–240.
Fodor, J. (1975). The language of thought. Cambridge, MA:
Harvard University Press.
Gibson, J. J. (1979). The ecological approach to visual percep-
tion. Boston, MA: Houghton Mifflin.
Glenberg, A. M. (2010). Embodiment as a unifying perspective
for psychology. Wiley Interdisciplinary Reviews: Cognitive
Science, 1, 586–596. doi:10.1002/wcs.55
Glenberg, A. M., Brown, M., & Levin, J. R. (2007). Enhancing
comprehension in small reading groups using a manipu-
lation strategy. Contemporary Educational Psychology, 32,
389–399.
Glenberg, A. M., & Gallese, V. (2012). Action-based Language:
A theory of language acquisition, comprehension, and pro-
duction. Cortex, 48, 905–922.
Glenberg, A. M., Gutierrez, T., Levin, J. R., Japuntich, S., &
Kaschak, M. P. (2004). Activity and imagined activity can
enhance young children’s reading comprehension. Journal
of Educational Psychology, 96, 424–436.
Glenberg, A. M., Willford, J., Gibson, B., Goldberg, A., & Zhu,
X. (2012). Improving reading to improve math. Scientific
Studies of Reading, 16, 316–340. doi:10.1080/10888438.20
11.564245
Gray, R. (in press). Being selective at the plate: Processing
dependence between perceptual variables relates to hit-
ting goals and performance. Journal of Experimental
Psychology: Human Perception and Performance.
Grush, R. (2004). The emulation theory of representation:
Motor control, imagery, and perception. Behavioral &
Brain Sciences, 27, 377–442.
Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004). Somatotopic
representation of action words in human motor and premo-
tor cortex. Neuron, 41, 301–307.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty
of language: What is it, who has it, and how did it evolve?
Science, 298, 1569–1579.
Havas, D. A., Glenberg, A. M., Gutowski, K. A., Lucarelli, M. J.,
& Davidson, R. J. (2010). Cosmetic use of botulinum toxin-
A affects processing of emotional language. Psychological
Science, 21, 895–900.
Held, R., & Hein, A. (1963). Movement-produced stimulation
in the development of visually guided behavior. Journal
of Comparative and Physiological Psychology, 56, 872–876.
Humphrey, N. (2006). Seeing red: A study in consciousness.
Boston, MA: Harvard University Press.
Kintsch, W. (1988). The role of knowledge in discourse compre-
hension: A construction-integration model. Psychological
Review, 95, 163–182.
Kirsch, W., Herbort, O., Butz, M. V., & Kunde, W. (2012).
Influence of motor planning on distance perception within
the peripersonal space. PLoS ONE, 7(4), e34880.
Knoblich, G., Thornton, I. M., Grosjean, M., & Shiffrar, M.
(2006). Human body perception from the inside out. Oxford,
England: Oxford University Press.
Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago,
IL: University of Chicago Press.
Lee, Y., Lee, S., Carello, C., & Turvey, M. T. (2012). An archer’s
perceived form scales the “hitableness” of archery targets.
Journal of Experimental Psychology: Human Perception
and Performance, 38, 1125–1131.
Lesage, E., Morgan, B. E., Olson, A. C., Meyer, A. S., & Miall,
R. C. (2012). Cerebellar rTMS disrupts predictive language
processing. Current Biology, 22, R794–R795. doi:10.1016/
j.cub.2012.07.006
Lindemann, O., & Bekkering, H. (2009). Object manipulation
and motion perception: Evidence of an influence of action
planning on visual processing. Journal of Experimental
Psychology: Human Perception and Performance, 35,
1062–1071.
Linkenauger, S. A., Mohler, B. J., & Bülthoff, H. H. (2011, May).
Welcome to wonderland: The apparent size of the self-
representing avatar’s hands and arms influences perceived
size and shape in virtual environments. Poster presented
at the 10th Annual meeting of the Vision Sciences Society,
Tampa, FL.
Linkenauger, S. A., Ramenzoni, V., & Proffitt, D. R. (2010).
Illusory shrinkage and growth: Body-based rescaling
affects the perception of size. Psychological Science, 21,
1318–1325.
Linkenauger, S. A., Witt, J. K., & Proffitt, D. R. (2011). Taking a
hands-on approach: Apparent grasping ability scales the per-
ception of object size. Journal of Experimental Psychology:
Human Perception and Performance, 37, 1432–1441.
Llinas, R. (2001). The I of the vortex. Cambridge, MA: MIT Press.
Macrae, C. N., Moran, J. M., Heatherton, T. F., Banfield, J. F.,
& Kelley, W. M. (2004). Medial prefrontal activity predicts
memory for self. Cerebral Cortex, 14, 647–654.
Marley, S. C., Levin, J. R., & Glenberg, A. M. (2007). Improving
Native American children’s listening comprehension through
concrete representations. Contemporary Educational Psy-
chology, 32, 537–550.
Marley, S. C., Levin, J. R., & Glenberg, A. M. (2010). What cogni-
tive benefits does an activity-based reading strategy afford
young Native American readers? Journal of Experimental
Education, 78, 395–417. doi:10.1080/00220970903548061
Marr, D. (1982). Vision: A computational investigation into the
human representation and processing of visual information.
San Francisco, CA: W. H. Freeman.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of
speech perception. Cognitive Psychology, 18, 1–86.
Metcalfe, J. (1990). A composite holographic associative recall
model (CHARM) and blended memories in eyewitness testi-
mony. Journal of Experimental Psychology: General, 119,
145–160.
Metcalfe Eich, J. (1982). A composite holographic associative
recall model. Psychological Review, 89, 627–661.
Metcalfe Eich, J. (1985). Levels of processing, encoding speci-
ficity, elaboration, and CHARM. Psychological Review, 91,
1–38.
Mohler, B. J., Creem-Regehr, S. H., Thompson, W. B., & Bülthoff,
H. H. (2010). The effect of viewing a self-avatar on distance
judgments in an HMD-based virtual environment. Presence:
Teleoperators and Virtual Environments, 19, 230–242.
Murdock, B. B., Jr. (1982). A theory for the storage and retrieval
of associative information. Psychological Review, 89, 609–
626.
Müsseler, J., & Hommel, B. (1997). Blindness to response-
compatible stimuli. Journal of Experimental Psychology:
Human Perception and Performance, 23, 861–872.
Nairne, J. S., & Pandeirada, J. N. S. (2008). Adaptive memory:
Remembering with a stone-age brain. Current Directions in
Psychological Science, 17, 239–243.
Newell, A., & Simon, H. A. (1976). Computer science as empiri-
cal enquiry. Communications of the ACM, 19, 113–126.
Osiurak, F., Morgado, N., & Palluel-Germain, R. (2012). Tool
use and perceived distance: When unreachable becomes
spontaneously reachable. Experimental Brain Research,
218, 331–339.
Palmer, S. E. (1999). Vision science: From photons to phenom-
enology. Cambridge, MA: MIT Press.
Philipona, D., O’Regan, J., & Nadal, J. (2003). Is there some-
thing out there? Inferring space from sensorimotor depen-
dencies. Neural Computation, 15, 2029–2049.
Pickering, M., & Garrod, S. (2013). An integrated view of lan-
guage production and comprehension. Behavioral & Brain
Sciences, 36, 329–347.
Proffitt, D. R. (2006). Embodied perception and the economy
of action. Perspectives on Psychological Science, 1, 110–
122.
Proffitt, D. R., & Linkenauger, S. A. (2013). Perception viewed
as a phenotypic expression. In W. Prinz, M. Beisert, &
A. Herwig (Eds.), Tutorials in action science (pp. 171–197).
Cambridge, MA: MIT Press.
Reed, C. L., Betz, R., Garza, J. P., & Roberts, R. J. (2010). Grab
it! Biased attention in functional hand and tool space.
Attention, Perception, & Psychophysics, 72, 236–245.
Rueschemeyer, S.-A., Glenberg, A. M., Kaschak, M. P., Mueller,
K., & Friederici, A. D. (2010). Top-down and bottom-
up contributions to understanding sentences describing
objects in motion. Frontiers in Psychology, 1. doi:10.3389/
fpsyg.2010.00183
Schubert, T. W., & Semin, G. R. (2009). Embodiment as a unify-
ing perspective for psychology. European Journal of Social
Psychology, 39, 1135–1141. doi:10.1002/ejsp.670
Searle, J. R. (1980). Minds, brains, and programs. Behavioral &
Brain Sciences, 3, 417–457.
Sedgwick, H. A. (1986). Space perception. In K. L. Boff, L.
Kaufman, & J. P. Thomas (Eds.), Handbook of perception
and human performance, Vol. 1: Sensory processes and per-
ception (pp. 1–56). New York, NY: Wiley.
Shapiro, L. (2011). Embodied cognition. New York, NY:
Routledge.
Shiffrar, M., & Freyd, J. (1990). Apparent motion of the human
body. Psychological Science, 1, 257–264.
Sugovic, M., & Witt, J. K. (in press). An older view of distance
perception: Older adults perceive walkable extents as far-
ther. Experimental Brain Research.
Tulving, E. (1983). Elements of episodic memory. Oxford,
England: Oxford University Press.
Tulving, E. (1993). Varieties of consciousness and levels of
awareness in memory. In A. Baddeley & L. Weiskrantz
(Eds.), Attention: Selection, awareness, and control: A trib-
ute to Donald Broadbent (pp. 283–299). London, England:
Oxford University Press.
Tulving, E., & Thomson, D. M. (1973). Encoding specificity
and retrieval processes in episodic memory. Psychological
Review, 80, 352–373.
Van der Hoort, B., Guterstam, A., & Ehrsson, H. H. (2011).
Being Barbie: The size of one’s own body determines the
perceived size of the world. PLoS ONE, 6, e20195.
Washburn, M. F. (1916). Introduction. In M. F. Washburn (Ed.),
Movement and mental imagery: Outlines of a motor theory
of the complexer mental processes (pp. xi–xv). Boston, MA:
Houghton Mifflin.
Wheeler, M. A., Stuss, D. T., & Tulving, E. (1997). Toward
a theory of episodic memory: The frontal lobes and
autonoetic consciousness. Psychological Bulletin, 121,
331–354.
Witt, J. K. (2011a). Action’s effect on perception. Current
Directions in Psychological Science, 20, 201–206.
Witt, J. K. (2011b). Tool use influences perceived shape and
parallelism: Indirect measures of perceived distance.
Journal of Experimental Psychology: Human Perception
and Performance, 37, 1148–1156.
Witt, J. K., & Dorsch, T. (2009). Kicking to bigger uprights:
Field goal kicking performance influences perceived size.
Perception, 38, 1328–1340.
Witt, J. K., Linkenauger, S. A., Bakdash, J. Z., & Proffitt, D. R.
(2008). Putting to a bigger hole: Golf performance relates to
perceived size. Psychonomic Bulletin & Review, 15, 581–585.
Witt, J. K., & Proffitt, D. R. (2005). See the ball, hit the ball:
Apparent ball size is correlated with batting average.
Psychological Science, 16, 937–938.
Witt, J. K., & Proffitt, D. R. (2008). Action-specific influences on
distance perception: A role for motor simulation. Journal
of Experimental Psychology: Human Perception and
Performance, 34, 1479–1492.
Witt, J. K., Proffitt, D. R., & Epstein, W. (2005). Tool use affects
perceived distance but only when you intend to use it.
Journal of Experimental Psychology: Human Perception
and Performance, 31, 880–888.
Witt, J. K., & Sugovic, M. (2010). Performance and ease influ-
ence perceived speed. Perception, 39, 1341–1353.
Wolpert, D. M., Doya, K., & Kawato, M. (2003). A unifying
computational framework for motor control and social
interaction. Philosophical Transactions of the Royal Society
B: Biological Sciences, 358, 593–602.
Wolpert, D. M., Ghahramani, Z., & Jordan, M. I. (1995). An
internal model for sensorimotor integration. Science, 269,
1880–1882. doi:10.1126/science.7569931
Yantis, S. (Ed.). (2000). Visual perception: Essential readings.
Philadelphia, PA: Psychology Press.
Zwaan, R. A., Taylor, L. J., & De Boer, M. (2010). Motor reso-
nance as a function of narrative time: Further tests of the
linguistic focus hypothesis. Brain & Language, 112, 143–149.
Language experienced in utero affects vowel perception after
birth: a two-country study
Christine Moon1, Hugo Lagercrantz2, and Patricia K Kuhl3
1Pacific Lutheran University, Tacoma, Washington, USA
2Neonatology Unit, Karolinska Institute, Stockholm, Sweden
3Institute for Learning & Brain Sciences, University of Washington, Seattle, Washington USA
Abstract
Aims—To test the hypothesis that exposure to ambient language in the womb alters phonetic
perception shortly after birth. This two-country study aimed to see if neonates demonstrated
prenatal learning by how they responded to vowels in a category from their native language and
another nonnative language, regardless of how much postnatal experience the infants had.
Method—A counterbalanced experiment was conducted in Sweden (n=40) and the USA (n=40)
using Swedish and English vowel sounds. The neonates (mean postnatal age = 33 hrs) controlled
audio presentation of either native or nonnative vowels by sucking on a pacifier, with the number
of times they sucked their pacifier being used to demonstrate what vowel sounds attracted their
attention. The vowels were either the English /i/ or Swedish /y/ in the form of a prototype plus 16
variants of the prototype.
Results—The infants in the native and nonnative groups responded differently. As predicted, the
infants responded to the unfamiliar nonnative language with higher mean sucks. They also sucked
more to the nonnative prototype. Time since birth (range: 7–75 hours) did not affect the outcome.
Conclusion—The ambient language to which foetuses are exposed in the womb starts to affect
their perception of their native language at a phonetic level. This can be measured shortly after
birth by differences in responding to familiar vs. unfamiliar vowels.
Keywords
fetal; language; learning; neonatal; vowels
Introduction
It is now well established that listening to a specific language early in life alters young
infants’ perception of speech sounds even before they begin to produce their first words.
Within the first months of postnatal life, infants discern differences between phonemes 1–4
and they group consonants into perceptual categories regardless of whether the sounds are
from their native or a nonnative language5,6. Between six and 12 months, their ability to
discriminate between different native language speech sounds increases, but it declines
sharply for nonnative sounds7,8.
A question often raised, both for theoretical and practical reasons, is: how early in life does
experience with language affect infants? A laboratory study showed that the effect of
Corresponding author: Christine M. Moon, Dept. of Psychology, Pacific Lutheran University, Tacoma, WA 98447, Tel (253)
535-7471, Fax (253) 535-8305, mooncm@plu.edu.
Published in final edited form as: Acta Paediatr. 2013 February; 102(2): 156–160. doi:10.1111/apa.12098.
experience appears earlier for vowels than for consonants and can be measured by six
months of age. Testing in Stockholm and Seattle revealed that infants in both Sweden and
the USA were able to group native vowel sounds into categories, but were unable to do so
for nonnative vowels9. It is not surprising that experience with vowels would affect
perception earlier than consonants, because vowels are louder, longer in duration and carry
salient prosodic information (melody, rhythm and stress).
This paper reports the results of an investigation into an area that has not been studied
before: does experience in the womb affect infants’ perception of vowel sounds? We already
know that hearing begins at the onset of the third trimester of pregnancy and that sound is
transmitted primarily through bone conduction, from the amniotic fluid through the foetal
skull and into the inner ear10. Reliable foetal responses to pure tones presented via a
loudspeaker have been recorded as early as 12–13 weeks before birth 11 and late-term
foetuses not only detect vowels12,13 but show heart rate changes when the vowels /a/ as in
pot and /i/ as in fee shift position from [ba]-[bi] to [bi]-[ba]14. Recordings made in utero show
sufficient preservation of formant frequencies to make vowel discrimination possible and
intelligibility tests show that adults can identify about 30–40% of phonetic content of speech
recorded in the womb 15,16. What is unknown, however, is whether an ability to detect and
to discriminate vowel sounds in the womb is accompanied by an ability to learn vowels from
prenatal experience17.
Learning from natural exposure to sound has been shown in neonatal experiments using
entire sentences or phrases, and this is thought to reflect learning about prosodic aspects of
language 2,18–22. However, newborns have been considered to be phonetically naïve—
influenced only by their innate, universal capacities—and ready for learning through
postnatal experience. The aim of the current study was to investigate whether neonates are
capable of learning phonetically in utero by examining the effects of language experience on
phonetic perception at the youngest feasible age.
Study participants were from Stockholm and Tacoma, Washington, and the stimuli for the
neonate study were the same as those used in the Sweden/USA study of six-month-olds by
Kuhl et al., 19929. The native and nonnative vowels were the English vowel /i/ as in fee and
the Swedish vowel /y/ as in fy (like the German /ü/). The 1992 experiment9 used a prototype
of each of the two native vowels and 32 variants of each category to test vowel category
perception. Speech prototypes are defined as the best example of a category as judged by
adult speakers of the language23. In the present study, our goal was to ascertain whether
neonates in the two countries showed any tendency to group native vowels into a category,
pre-dating the abilities of six-month-olds. If neonates showed an effect of learning, and if
time since birth did not affect how they responded, it would provide evidence of learning in
utero. We used a contingent procedure in which each sucking response by the infant would
produce a particular vowel sound and the number of sucking responses provided a measure
of the infant’s interest in each sound. We hypothesised that infants would show less interest
in the familiar sound stream (native language vowels) due to the vowels’ equivalency with
each other as members of a category and their repetitive nature. In contrast, the nonnative
vowels would seem to be in a constantly changing sound stream and therefore more likely to
attract the infant’s attention24,25. We used the prototype vowel of the English and Swedish
vowel category as well as variants of the prototype of each of these two categories used by
Kuhl et al. (1992)9.
We specifically predicted that infants in the native language group would show a lower level
of attention to the familiar category of native vowels and, as a result, suck their pacifiers
less often when they heard these than infants in the nonnative group. We also predicted that
the distinction could be particularly evident for the prototype vowels, given that prototypes
are considered representative of categories as a whole26.
Method
Participants
Neonates (M = 32.8 hrs postnatal age, range = 7–75 hrs) were tested in hospitals in Tacoma,
Washington, USA (N = 40) and Stockholm, Sweden (N = 40). Infants were eligible if: a)
pregnancy, birth and neonatal hearing tests were typical, b) the maternal native language
was English (in Tacoma) or Swedish (in Stockholm) and c) the infant’s mother did not speak
a second language more than rarely during the final trimester of pregnancy, as assessed
informally in the USA and by a questionnaire in Sweden. In addition to the 80 total infants
in the study, 15 additional infants were excluded from the analysis for: cessation of sucking
during the experiment (5), fussiness (1), drowsiness (5), parental interference (1) and
equipment or experimenter error (3). This represents a low (16 %) attrition rate for neonatal
behavioural tests of perception. Half of the infants in each country were assigned to hear the
Swedish vowel stimuli and half the English. There were no statistically significant
differences between the two groups in terms of gender, test age, gestational age, birth weight
or number of sucks in five minutes.
Stimuli
The stimuli from Kuhl et al. (1992)9 were computer-generated sound files of 17 different
examples of the English front unrounded vowel /i/ as in “fee” and 17 instances of the
Swedish front rounded vowel /y/ as in “fy.” The vowel inventories in both Washington State
and Sweden do not include the nonnative vowels used in the study. The prototype vowel for
each language had been experimentally determined by native speakers in Seattle and
Stockholm, who rated the vowels as the best representatives of the category.9,27 The 16
vowel variants were created by altering the first and second formant frequencies of the
prototype in a step-wise fashion, to create two concentric rings of eight stimuli around each
prototype (see Figure 1). The vowel stimuli were each 500 ms in duration and were
delivered at 72 dB (Bruel & Kjaer Model 2235, Scale A) through headphones (Grado 225).
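The variant-generation scheme amounts to placing eight stimuli on each of two concentric rings around the prototype in formant space. The sketch below is our own reconstruction of that geometry with placeholder values; the actual formant steps and prototype values were those of Kuhl et al. (1992)9 and are not reproduced here.

```python
import math

def ring_variants(prototype_f1: float, prototype_f2: float,
                  radii=(50.0, 100.0), per_ring=8):
    """Generate (F1, F2) pairs for variants arranged in two concentric rings
    around a prototype vowel. The radii are illustrative Hz offsets only."""
    variants = []
    for radius in radii:
        for k in range(per_ring):
            angle = 2 * math.pi * k / per_ring
            variants.append((prototype_f1 + radius * math.cos(angle),
                             prototype_f2 + radius * math.sin(angle)))
    return variants

# Illustrative /i/-like prototype (placeholder values, not the study's):
stimuli = [(270.0, 2290.0)] + ring_variants(270.0, 2290.0)
print(len(stimuli))  # 17: prototype + 16 variants (2 rings x 8)
```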
Design, equipment and procedure
A between-subjects design was chosen because exposure to one vowel category could affect
performance on the second vowel category. This also reduced the session time and attrition
rate. Sessions were conducted in a quiet room with infants lying supine in their bassinets
with headphones placed next to their ears (see Figure 2). Each infant was offered a pacifier
and the data collection began if the infant accepted it and started sucking in a typical,
rhythmic burst/pause pattern. The protocol stipulated a five-minute test period to allow
presentation of each of the 17 vowels in the set.
Pacifiers were fitted with a sensor connected to a computer that delivered auditory stimuli
when the infants sucked on the pacifier. When they completed the second suck in a burst,
one of the 17 stimuli was presented in random order. Each vowel was repeated until there
was a pause in sucking of one second or longer. Resumption of sucking resulted in a new
vowel stimulus from the set. Equipment consisted of an air pressure sensor for the pacifier,
A/D converter, sound amplifier (Klevang Electronics, Portland, Oregon) and a Dell laptop
computer with custom software (Klevang Electronics). An air pressure threshold was set so
that virtually all the sucks after the first one in a sucking burst resulted in sound
presentation. Stimuli were delivered in random order without replacement. The equipment
used in both countries was identical and the same experimenter (CM) conducted the tests in
both countries using the same protocol. Examination of the randomisation showed that the
prototype appeared in all 17 possible serial positions, was roughly equally distributed among
positions (χ2 = 15.6, p = .48) and the test groups did not differ when it came to the serial
position of the prototype (Kruskal–Wallis χ2 = 1.61, p = .66). Average time to complete the
series of 17 stimuli was 135.1 seconds (SD 50.2).
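The contingency just described reduces to a small control loop: shuffle the 17 stimuli once, replay the current vowel while sucking continues, and advance to the next vowel when sucking resumes after a pause of a second or more. The sketch below is our own schematic of that logic, with hypothetical sensor and playback functions standing in for the pressure threshold and audio hardware.

```python
import random
import time

PAUSE_S = 1.0  # a pause in sucking of one second or longer ends the current stimulus

def run_session(stimuli, suck_detected, play_vowel, session_s=300):
    """Contingent presentation: stimuli in random order without replacement;
    sucking replays the current vowel, and a >=1 s pause followed by renewed
    sucking advances to the next vowel. (The real protocol also skipped the
    first suck of each burst; that detail is omitted here.)"""
    order = random.sample(stimuli, k=len(stimuli))   # random order, no replacement
    counts = {s: 0 for s in stimuli}                 # sucks per vowel stimulus
    start, last_suck, idx = time.time(), None, 0
    while time.time() - start < session_s:
        if suck_detected():                          # pacifier pressure above threshold
            now = time.time()
            if last_suck is not None and now - last_suck >= PAUSE_S:
                idx += 1                             # pause ended: advance to next vowel
                if idx >= len(order):
                    break                            # all 17 stimuli presented
            play_vowel(order[idx])
            counts[order[idx]] += 1
            last_suck = now
    return counts
```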
Results
Data from the first presentation of each of the 17 stimuli were analysed and the dependent
measure was number of sucks per vowel stimulus. A mixed ANOVA was conducted to
examine the effects of Language Experience: (native vs. nonnative), Prototypicality
(prototype vs. nonprototype) and Country (USA vs. Sweden), with hours since birth as a
covariate. We examined hours since birth to determine whether learning effects could be
attributed to pre- vs. postnatal experience. There was no significant main effect (F1,75 = .04,
p =.85) of hours since birth, nor any interaction effects with hours since birth. As a result,
this factor was not included in subsequent analyses.
The effect of Language Experience was significant (F1,75 = 4.95, p = .029), with a greater
number of sucks overall during the nonnative vowels (MNonnative = 7.1, SD = 2.9) than during the
native-language vowels (MNative = 6.5, SD = 3.3). The results show that the native prototype and its
variants received fewer sucking responses than the nonnative prototype and its variants.
Results also showed an interaction effect of Language Experience and Prototypicality
(F(1,75) = 4.6, p = .035) (see Figure 3). To further examine the interaction, planned t-tests were
conducted. For the 40 infants who heard the native vowel, there was no difference in mean
sucks during the prototype vs. the nonprototype (t(39) = −1.08, p = .29). However, for the 40
infants who heard the nonnative vowel, the response to the prototype was
significantly greater than the response to the nonprototype (t(39) = 2.03, p = .049).
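Because these planned comparisons have 39 degrees of freedom for 40 infants, one plausible reading is a within-infant (paired) contrast between sucks to the prototype and the infant's mean sucks to the 16 variants. The sketch below shows that reading under assumed column names; it is not a verified reproduction of the reported tests.

```python
from scipy import stats

def planned_comparison(group):
    """Paired t-test: sucks to the prototype vs. the infant's mean sucks to
    the 16 nonprototype variants. `group` is an assumed long-format pandas
    DataFrame (columns infant_id, sucks, prototypicality) for one exposure
    group of 40 infants."""
    proto = (group[group.prototypicality == "prototype"]
             .set_index("infant_id")["sucks"])
    nonproto = (group[group.prototypicality == "nonprototype"]
                .groupby("infant_id")["sucks"].mean())
    proto, nonproto = proto.align(nonproto, join="inner")
    return stats.ttest_rel(proto, nonproto)

# Example (df as in the previous sketch):
# planned_comparison(df[df.language == "native"])
```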
In the overall ANOVA, the effect of Country was significant (F(1,75) = 18.4, p < .001), with mean
sucks per stimulus greater in the USA (M = 8.1, SD = 3.0) than in Sweden (M = 5.4, SD = 2.6). No
other main or interaction effects were significant.
Although there were no significant interaction effects involving Country, a further analysis
was undertaken to measure the infants’ responses to the native and nonnative prototypes in
each of the two language communities. The difference between native and nonnative
prototypes was significant in the Swedish infants (t(38) = 3.58, p = .001), but not in the USA
infants (t(38) = 0.68, p > .05). A nonparametric analysis was conducted in which the 17 stimuli
were rank ordered for individual infants according to how many sucks they received from
the infant with Rank 1 being the stimulus with the highest number of sucks and Rank 17
being the stimulus with the lowest number of sucks from that infant. For the USA nonnative
group, 16 of 20 infants had the prototype among ranks 1–8, roughly the top half of the 17
ranks (binomial test, p < .05, two-tailed). For the USA native language group, 9 of 20 infants
had the prototype in the top eight ranks (binomial test, p > .05). The difference between the
USA native and nonnative groups was significant (Pearson χ2 = 5.2, p = .024, two-tailed).
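In outline, the rank analysis can be reproduced as follows: rank the 17 stimuli within each infant by suck count, count how many infants place the prototype in ranks 1–8, and compare that count with a binomial null. The sketch below assumes a chance probability of 8/17 and the same hypothetical long-format data layout as above.

```python
from scipy import stats

def prototype_rank_test(group):
    """For each infant, rank the 17 stimuli by suck count (rank 1 = most
    sucks) and test how often the prototype lands in ranks 1-8.  Assumes a
    long-format DataFrame with columns infant_id, sucks and prototypicality;
    the null probability of 8/17 is an assumption, not stated in the paper."""
    in_top_half = 0
    n_infants = 0
    for _, infant in group.groupby("infant_id"):
        ranks = infant["sucks"].rank(ascending=False, method="min")
        proto_rank = ranks[infant["prototypicality"] == "prototype"].iloc[0]
        in_top_half += int(proto_rank <= 8)
        n_infants += 1
    return stats.binomtest(in_top_half, n_infants, p=8/17,
                           alternative="two-sided")

# E.g. 16 of 20 infants in the top half, as reported for the USA nonnative group:
print(stats.binomtest(16, 20, p=8/17, alternative="two-sided").pvalue)
```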
Discussion
The results of our study support the hypothesis that language experienced in utero affects
vowel perception. Neonates in the two different language communities responded to vowels
in their familiar native language category as though they were equivalent to each other. In
contrast, they did not treat vowels in an unfamiliar, nonnative category as equivalent. The
infants did this in two ways: 1) they sucked less overall when the sound stream was from
their native language, presumably because successive sounds were from the same category
and repetitive and 2) they behaved as though the native prototype was equivalent to the other
vowels in the category. The absence of significant interaction effects for Country indicates
that responding to native/nonnative vowels and prototype/nonprototype did not differ
between the groups from Sweden and the USA. We argue that the difference in response to
the native and nonnative vowels can be attributed to prenatal perceptual experience, even
though infants had between 7 and 75 hours of postnatal exposure to their native
language. Our results show that there was no effect of the number of hours of postnatal
exposure to ambient language on infant responding to familiar and unfamiliar vowels.
We found that neonates, having heard native vowels many times before birth, sucked less
when they were exposed to them after birth and they sucked as much to the native prototype
as they did to the native variants. Our interpretation, following the prototype literature, is
that experience with native vowels renders the native prototype very similar to its variants
(see Bremner, 2011 [28], for a discussion of face prototypes, and Kuhl, 1991 [29], for a discussion
on prototypes and speech categories). Because the infants have never experienced nonnative
vowels in the womb, they perceive them as more distinct from one another. One interesting
result is the large difference between sucking responses to the nonnative prototype and its
variants. Sucking responses were largest to the nonnative prototype. There are data to
suggest that, regardless of experience, vowel prototypes are more salient than
nonprototypes [30]. Our results, however, show enhanced prototype salience only for unfamiliar vowels.
Further studies with equivalent methods are needed to examine this difference in results.
Additional studies will be necessary to examine whether the results reported here can be
generalised to other vowels and languages. It will be especially interesting to compare
newborn perception of vowels that are more extreme with regard to their location in the
vowel space (such as the vowels used in the present study), with ones that are less extreme.
Data suggest that more extreme vowels are generally more salient for infants [31].
We found that overall pacifier sucking rates were higher in the USA than in Sweden. This
could be due to differences in pre- and perinatal care practices in the two countries, such as
rates of neonatal pacifier use, epidural anesthesia and breast-feeding.
These results suggest that birth is not a benchmark that reflects a complete separation
between the effects of nature and those of nurture on infants’ perception of the phonetic
units of speech. Our results indicate that neonates’ perception of speech sounds already
reflects some degree of learning. Although technically daunting, research during the foetal
period is warranted to provide a complete developmental picture of phonetic perception. The
finding also raises questions regarding what sounds are available to the developing foetus,
how they are processed by the developing brain, and how further experience during
development continues to shape perception, both in typically developing children and in
clinical populations.
Acknowledgments
The opinions or assertions contained in this paper are the private views of the authors and should not be
construed as official or as reflecting the views of the United States Department of Defense. This research was
supported by a National Institutes of Health Grant HD 37954 to Patricia K. Kuhl and by the S. Erving Severtson
Forest Foundation Undergraduate Research Program. We would like to thank the families and staff at Madigan
Army Medical Center in Tacoma and The Astrid Lindgren Children’s Hospital in Stockholm. Student research
assistants were: K. Fashay, A. Gaboury, J. Jette, K. Kerr, S. Rebar, M. Spadaro, G. Stock, and C. Zhao.
References
1. Cheour-Luhtanen M, Alho K, Kujala T, et al. Mismatch negativity indicates vowel discrimination in
newborns. Hearing Research. 1995; 82:53–58. [PubMed: 7744713]
2. Moon C, Panneton Cooper R, Fifer WP. Two-day-olds prefer their native language. Infant Behav
Dev. 1993; 16:495–500.
3. Dehaene-Lambertz G, Pena M. Electrophysiological evidence for automatic phonetic processing in
neonates. Neuroreport. 2001; 12:3155–3158. [PubMed: 11568655]
4. Bertoncini J, Bijeljac-Babic R, Jusczyk PW, Kennedy LJ, Mehler J. An investigation of young
infants’ perceptual representations of speech sounds. J Exp Psychol. 1988; 117:21–33.
5. Eimas PD, Siqueland ER, Jusczyk P, Vigorito J. Speech perception in infants. Science. 1971;
171:303–306.
6. Streeter LA. Language perception of 2-mo-old infants shows effects of both innate mechanisms and
experience. Nature. 1976; 259:39–41. [PubMed: 1256541]
7. Kuhl PK, Stevens E, Hayashi A, Deguchi T, Kiritani S, Iverson P. Infants show a facilitation effect
for native language phonetic perception between 6 and 12 months. Developmental Science. 2006;
9:F13–F21. [PubMed: 16472309]
8. Werker JF, Tees RC. Cross-language speech perception: Evidence for perceptual reorganization
during the first year of life. Infant Behavior and Development. 1984; 7:49–63.
9. Kuhl PK, Williams KA, Lacerda F, Stevens KN, Lindblom B. Linguistic experience alters phonetic
perception in infants by 6 months of age. Science. 1992; 255:606–608. [PubMed: 1736364]
10. Granier-Deferre C, Ribeiro A, Jacquet A-Y, Bassereau S. Near-term fetuses process temporal
features of speech. Dev Sci. 2011; 14:336–352. [PubMed: 22213904]
11. Hepper PG, Shahidullah BS. Development of fetal hearing. Arch Dis Child. 1994; 71:F81–87.
[PubMed: 7979483]
12. Groome LJ, Mooney DM, Holland SB, Smith YD, Atterbury JL, Dykman RA. Temporal pattern
and spectral complexity as stimulus parameters for eliciting a cardiac orienting reflex in human
fetuses. Percept Psychophys. 2000; 62:313–320. [PubMed: 10723210]
13. Zimmer EZ, Fifer WP, Kim Y, Rey HR, Chao CR, Myers MM. Response of the premature fetus to
stimulation by speech sounds. Early Hum Dev. 1993; 33:207–215. [PubMed: 8223316]
14. Lecanuet J, Granier-Deferre C, DeCasper AJ, Maugeais R, Andrieu A, Busnel M. Foetal perception
and discrimination of speech stimuli, demonstrated by cardiac reactivity; preliminary results
[article in French]. C R Acad Sci Paris. 1987; 305:161–164. [PubMed: 3113681]
15. Querleu D, Renard X, Boutteville C, Crepin G. Hearing by the human fetus? Semin Perinatol.
1989; 13:409–420. [PubMed: 2683112]
16. Griffiths SK, Brown WS, Gerhardt KJ, Abrams RM, et al. The perception of speech sounds
recorded within the uterus of a pregnant sheep. J Acoust Soc Am. 1994; 96:2055–2063. [PubMed:
7963021]
17. Huotilainen M. Building blocks of fetal cognition: emotion and language. Infant and Child
Development. 2010; 19:94–98.
18. Byers-Heinlein K, Burns TC, Werker JF. The roots of bilingualism in newborns. Psychological
Science. 2010; 21:343–348. [PubMed: 20424066]
19. Mampe B, Friederici AD, Christophe A, Wermke K. Newborns’ cry melody is shaped by their
native language. Curr Biol. 2009; 19:1–4. [PubMed: 19135370]
20. Mehler, J.; Christophe, A.; Ramus, F.; Marantz, A.; Miyashita, Y.; O’Neil, W. Image, language,
brain: Papers from the first mind articulation project symposium. The MIT Press; 2000. How
infants acquire language: Some preliminary observations; p. 51-75.
21. Ramus F, Nespor M, Mehler J. Correlates of linguistic rhythm in the speech signal. Cognition.
1999; 73:265–292. [PubMed: 10585517]
22. May L, Byers-Heinlein K, Gervain J, Werker JF. Language and the newborn brain: does prenatal
language experience shape the neonate neural response to speech? Front Psychol. 2011; 2:222–
222. [PubMed: 21960980]
23. Miller JL, Eimas PD. Internal structure of voicing categories in early infancy. Percept
Psychophys. 1996; 58:1157–1167.
24. Cowan N, Suomi K, Morse PA. Echoic storage in infant perception. Child Development. 1982;
53:984–990. [PubMed: 7128263]
25. Floccia C, Christophe A, Bertoncini J. High-amplitude sucking and newborns: The quest for
underlying mechanisms. J Exp Child Psychol. 1997; 64:175–198. [PubMed: 9120380]
26. Rosch E. Cognitive reference points. Cogn Psychol. 1975; 7:532–547.
27. Kuhl, PK. Psychoacoustics and speech perception: Internal standards, perceptual anchors, and
prototypes. In: Werner, LA.; Rubel, EW., editors. Developmental Psychoacoustics. Washington
DC: American Psychological Association; 1992.
28. Bremner JG. Four themes from 20 years of research on infant perception and cognition. Inf Child
Dev. 2011; 20:137–147.
29. Kuhl PK. Human adults and human infants show a “perceptual magnet effect” for the prototypes of
speech categories, monkeys do not. Perception & Psychophysics. 1991; 50:93–107. [PubMed:
1945741]
30. Aldridge MA, Stillman RD, Bower TGR. Newborn categorization of vowel-like sounds. Dev Sci.
2001; 4:220–232.
31. Polka L, Bohn O-S. Natural Referent Vowel (NRV) framework: An emerging view of early
phonetic development. J Phonetics. 2011; 39:467–478.
Key Notes
• Being exposed to ambient language in the womb affects foetal phonetic
perception.
• Eighty neonates (mean: 33 hours since birth) in Sweden and the USA responded
differently to vowel sounds, depending on whether they were from their
familiar native or an unfamiliar nonnative language.
• The neonates sucked their pacifiers more frequently to activate recordings of
unfamiliar nonnative vowel sounds, and the hours that had elapsed since birth
had no effect on these rates.
Figure 1.
The 17 stimuli for the English vowel /i/ and Swedish /y/ mapped according to their formant
frequencies converted to mels, which are psychophysical units. The figure is adapted from
Kuhl et al., 1992 [9].
Figure 2.
An infant, 20 hours after birth, takes part in the procedure in which hearing speech sounds
through headphones is contingent on sucks on a pacifier.
Figure 3.
Infant sucking response to contingent auditory presentations of vowels from the familiar
native language or the unfamiliar nonnative language. Means represent the mean number of sucks
that produced the prototype vowel and the mean across the 16 nonprototype vowels.