Chuo Online

Education course

The Very First Word

-Why mothers are called 'mama' and fathers 'papa'-

Keiko Masuda
Associate Professor of Linguistics & Phonetics, Faculty of Commerce, Chuo University

1. Introduction

We come into the world without any innate language and gradually acquire a language from scratch. We can only cry when we are born, but we soon start to voice sounds such as 'er' or 'ur' before we begin to produce sounds which bear meaning at around the age of one. Many parents have a fond memory of the moment when their children uttered a 'word' for the first time. What was that first word? In the case of Japanese parents, many would say it was 'mama', and it is natural that the word for mother should be first. What kind of word do children who speak other languages use to refer to their mothers and fathers?

2. Mothers are called [mama] and fathers [papa]

The world's languages can be divided into groups called language families. Languages belonging to the same family are assumed to share a protolanguage, or ancestral language. For instance, many European languages, including English, German, and French, belong to the Indo-European family, whereas Arabic and Hebrew belong to the Afro-Asiatic family, and Mandarin Chinese and Tibetan to the Sino-Tibetan family. Because languages in the same family are thought to descend from the same protolanguage, they tend to have sounds, words, and syntactic structures in common. Such similarities are generally not observed among languages from different families, apart from loanwords and the odd coincidence. The words used by young children to refer to mothers and fathers, however, seem to be an exception. Although they speak languages from different families, many young children around the world use very similar sounds and phonological structures to refer to their parents. About fifty years ago the anthropologist George Murdock noticed this, and in the following year the linguist Roman Jakobson examined the finding and formulated a hypothesis to account for Murdock's observation. Let us see what they said.

Murdock found that in many languages, the words used by infants to refer to 'mother' combine a consonant [m] or [n] with an open vowel such as [a] or [o] (Figure 1). Such words include [mama], [na], and [ama]. The word denoting 'father', on the other hand, tends to contain a consonant [p] or [t] and an open vowel (Figure 2). Why do languages from different families have such similar sound patterns when it comes to describing parents?

Figure 1: Combinations of sounds used by infants to refer to their mothers (Number of Samples: 531 from 474 language communities)

Figure 2: Combinations of sounds used by infants to refer to their fathers (Number of Samples: 541 from 474 language communities)
(Source: Murdock (1959))

3. Language acquisition during the first 12 months

The similarity in the terms used by babies to describe parents (especially mothers) seems to have something to do with the language acquisition process in children. A newborn baby's vocal tract, the air passage from the vocal cords up to the lips, is undeveloped and differs greatly from that of an adult. It develops quickly, and after the first six to eight months it begins to look like that of an adult. Even so, young children cannot pronounce certain sounds or sound sequences properly. For instance, they tend to palatalize consonants, e.g., [ʧu] for [ʦu], pronouncing tsumetai (cold to the touch) as chumetai. This can be attributed to their underdeveloped vocal organs.

Although there are individual differences, infants begin to produce vowel-like sounds after the first six weeks or so, and after the first four or five months they start to utter sound sequences consisting of a consonant and a vowel (babbling), such as [ma] or [bubu]. Babbling begins with a monosyllable such as [ba], which then evolves into a repetitive form (e.g., [bababa]). At this stage, the babbling sounds do not refer to any specific object. At around 12 months of age, infants begin to use the same sound or sound sequence consistently to refer to a specific object or person. In other words, they begin to use words and begin to speak.

Adults often speak to young children in a distinctive way called motherese or baby-talk. Motherese is not created solely by an adult, nor does it consist only of sounds uttered by a child. Instead, it is established through reciprocal exchanges between children and adults. Motherese therefore reflects the features of and tendencies in the development of children's speech, and it inevitably contains sounds and sound sequences that are easy for children to pronounce. The words used to refer to parents, above all, must have sounds and a phonetic structure that children can pronounce easily, because parents are the most important presence in their lives and a child needs some means of referring to them.

4. Speech sounds

In order to work out what sounds are easy for children to pronounce, we need to understand a little about speech sounds. Speech sounds can be divided into vowels and consonants. Vowels are produced with vocal cord vibration by letting air from the lungs escape unobstructed in the vocal tract through the opening of the mouth. The five vowel sounds in Japanese are [a, i, u, e, o]. Vowels are further divided depending on the position of the tongue and the degree of lip-rounding. Try pronouncing the Japanese [a, i, u] slowly. When pronouncing [a], your mouth is wide open and your tongue is lowered. Your tongue is then raised and your lips are spread to pronounce [i]. Then your lips are rounded and protruded and your tongue is retracted when pronouncing [u]. Vowels are classified as high, mid, and low vowels by the height of the tongue, and as front, central, and back vowels by the horizontal position of the tongue. [i], for example, is a high front vowel.

Consonants are sounds produced by obstructing the airflow in the vocal tract. For instance, plosives or stops (such as [p, b, t, d, k, ɡ]) are produced by stopping the airflow and then releasing the air, while fricatives (such as [s, z, ʃ]) are produced by letting the air flow through a narrowing made by the tongue in the oral cavity, causing friction. Other consonants include affricates (such as [ʧ]) and liquids ([r, l]). Although [m] and [n] are classified as consonants, they have a feature different from the other consonants. They are called nasals and are produced by releasing the air through the nostrils via the nasal cavity, not the oral cavity.

In addition to the way the airflow is obstructed (manner of articulation) and the presence or absence of vocal cord vibration (voiced or voiceless), consonants can also be classified by the place in the vocal tract where the airflow is obstructed (place of articulation). For instance, sounds such as [p, b, m] are produced by using both the upper and the lower lips (bilabial consonants). Sounds such as [t, d, s, z] use the tip of the tongue and the base of the upper teeth or the alveolar ridge (alveolar consonants), while sounds such as [k, ɡ] use the back of the tongue and the soft palate, or velum (velar consonants). [p], for instance, is a voiceless bilabial plosive.
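The three-way classification described above (voicing, place, manner) can be written down as a small lookup table. The following is only an illustrative sketch: the names Consonant, CONSONANTS, and describe are made up for this example, the table covers only the consonants mentioned in this section, and the voicing values for sounds other than [p], [m], and [n] are standard phonetic descriptions rather than statements taken from the article.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    voicing: str  # "voiced" or "voiceless"
    place: str    # place of articulation
    manner: str   # manner of articulation


# Hypothetical table for illustration; plain "g" stands in for IPA [ɡ].
CONSONANTS = {
    "p": Consonant("voiceless", "bilabial", "plosive"),
    "b": Consonant("voiced", "bilabial", "plosive"),
    "m": Consonant("voiced", "bilabial", "nasal"),
    "t": Consonant("voiceless", "alveolar", "plosive"),
    "d": Consonant("voiced", "alveolar", "plosive"),
    "s": Consonant("voiceless", "alveolar", "fricative"),
    "z": Consonant("voiced", "alveolar", "fricative"),
    "k": Consonant("voiceless", "velar", "plosive"),
    "g": Consonant("voiced", "velar", "plosive"),
}


def describe(symbol: str) -> str:
    """Return the conventional three-part label for a consonant."""
    c = CONSONANTS[symbol]
    return f"{c.voicing} {c.place} {c.manner}"


print(describe("p"))  # voiceless bilabial plosive
print(describe("m"))  # voiced bilabial nasal
```

Reading the label from left to right gives voicing, then place, then manner, which is the order used in the description of [p] above.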

5. Easy sounds and structure for children

So what sounds and structures are easy for children to pronounce?

In terms of structure, it is safe to say that the combination of a consonant and a vowel (CV), which is often observed in babbling, is the easiest for children to pronounce. The basic phonological unit of Japanese, the mora, has the same CV structure. Research shows that consonant clusters, such as those in the English word street [stri:t], are unlikely to be used in babbling, and that syllables ending in a consonant are also rare at the babbling stage. Native speakers of Japanese naturally tend to pronounce street as [sutori:to], that is, with an extra vowel between consonants and after the final consonant, following the basic CV structure of Japanese. Consonant clusters and syllable-final consonants are difficult to pronounce even for adults with fully developed vocal organs if they are not used to them. It is easy to imagine the difficulty that infants have in saying them.
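The way street becomes [sutori:to] can be sketched as a simple vowel-insertion rule. The code below is a rough illustration under stated assumptions, not a full model of Japanese loanword adaptation: the function names is_vowel and to_cv are invented for this example, and the choice of inserted vowel ([o] after [t] or [d], [u] after any other consonant) is a simplification that happens to reproduce the example in the text. It ignores, for instance, the syllable-final nasal of Japanese.

```python
VOWELS = {"a", "i", "u", "e", "o"}


def is_vowel(segment: str) -> bool:
    # A long vowel is written with a trailing length mark, e.g. "i:".
    return segment.rstrip(":") in VOWELS


def to_cv(segments: list[str]) -> list[str]:
    """Insert a vowel after every consonant that is not followed by a vowel."""
    result = []
    for index, segment in enumerate(segments):
        result.append(segment)
        if is_vowel(segment):
            continue
        is_last = index + 1 == len(segments)
        if is_last or not is_vowel(segments[index + 1]):
            # Assumed simplification: [o] after t/d, [u] elsewhere.
            result.append("o" if segment in {"t", "d"} else "u")
    return result


print("".join(to_cv(["s", "t", "r", "i:", "t"])))  # sutori:to
print("".join(to_cv(["m", "a", "m", "a"])))        # mama
```

A word such as [mama] already consists of CV units, so the rule leaves it unchanged.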

In terms of sounds, mid and low vowels are by far the most common at the babbling phase, and high vowels are very rare. In Japanese, [a, e, o] are among such vowels. The vocalized sound of an unconscious sigh is, for most Japanese people, a vowel similar to [a]. This sound is produced quite easily with relaxed lips, tongue, and muscles around the mouth. Imagine the voice you unconsciously make when soaking in a bath after a long, tiring day. Sounds that are easy for adults to produce are also easy for children.

The most common consonants in babbling are, in order of frequency, [m] = [b] > [p] > [d] > [h] = [n] > [t]. In general, children acquire consonants that are articulated around the entrance of the mouth first and those at the back of the oral cavity later. During babbling, bilabial and alveolar sounds are frequently observed in terms of the place of articulation, and nasals and plosives in terms of the manner of articulation. You can easily understand why if you pronounce these sounds while paying attention to the movement and tension of your lips, your tongue, and the muscles around your mouth. [m] in particular is an easy sound, as it is pronounced simply by closing your mouth and releasing the air through the nostrils (with vocal cord vibration). [b] and [p] also require little effort; they are pronounced just by closing and opening your mouth. These sounds can be pronounced easily even by small infants.

We have seen why the combination of a nasal (an easy consonant) and a mid or low vowel (an easy vowel), i.e., [ma(ma)], is used by young children to refer to their mothers. It is interesting that 'food' or 'meal', which is equally important to them, is called [mamma] in Japanese, which sounds very similar to [mama]. This might have something to do with the fact that the mother is the source of food until a baby is weaned and that she is usually the one who feeds the infant after weaning.

What about fathers? They are scarcely less familiar than mothers and are an important presence for infants. It would be most natural for infants to use easily pronounced sounds when referring to the father. However, the word must be clearly distinguishable from the word used to refer to the mother, otherwise confusion may arise. Considering the order of frequency of consonants in babbling, [b] may seem like a strong candidate for the word for 'father'. Nevertheless, more languages use [p] or [t] than [b]. This may be because the voiced sounds [b] and [d] are avoided to prevent confusion, since the nasals [m, n] are voiced as well. The voiceless sounds [p] and [t] are easier to distinguish from the nasals. In this way, the word used for 'father' by young children needs to contain sounds that are easy to pronounce and to be clearly different from the word used to refer to mothers.
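The reasoning in this section can be summarized as a filter over the babbling consonants listed earlier. The sketch below is only an illustration of that argument: the names BABBLING_TIERS, FEATURES, and candidates are invented here, the feature labels are standard phonetic descriptions, and restricting the 'father' candidates to plosives is an assumption of this example (the article discusses voicing only, and does not say why [h] is not used).

```python
# Frequency tiers from the text: [m] = [b] > [p] > [d] > [h] = [n] > [t]
BABBLING_TIERS = [["m", "b"], ["p"], ["d"], ["h", "n"], ["t"]]

# (voicing, manner) for each babbling consonant; labels are assumptions
# based on standard phonetic descriptions.
FEATURES = {
    "m": ("voiced", "nasal"),
    "b": ("voiced", "plosive"),
    "p": ("voiceless", "plosive"),
    "d": ("voiced", "plosive"),
    "h": ("voiceless", "fricative"),
    "n": ("voiced", "nasal"),
    "t": ("voiceless", "plosive"),
}


def candidates(voicing: str, manner: str) -> list[str]:
    """Return matching consonants, most frequent in babbling first."""
    ranked = [c for tier in BABBLING_TIERS for c in tier]
    return [c for c in ranked if FEATURES[c] == (voicing, manner)]


print(candidates("voiced", "nasal"))       # ['m', 'n']
print(candidates("voiceless", "plosive"))  # ['p', 't']
```

Under these assumptions the filter returns [m] and [n] for 'mother' and [p] and [t] for 'father', which matches the pattern Murdock reported.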


Jakobson, R. (1960) "Why 'mama' and 'papa'?" In Jakobson, R. Selected Writings, Vol. I: Phonological Studies, pp. 538-545. The Hague: Mouton.
Murdock, G. P. (1959) "Cross-language Parallels in Parental Kin Terms." Anthropological Linguistics, Vol. 1, No. 9. pp. 1-5.

(Offered by: Kusa no Midori (The Greenness of Grass) No.214)

Keiko Masuda
Associate Professor of Linguistics & Phonetics, Faculty of Commerce, Chuo University
Born in 1971 in Gifu Prefecture. Graduated from the Department of English, Faculty of Foreign Languages at Sophia University in 1994. Obtained an M.Phil. in Linguistics in 1998 and a Ph.D. in Linguistics in 2003 from the University of Cambridge. She joined Chuo University in 2004 as Assistant Professor before assuming her current position. Her area of expertise is linguistics and phonetics. Her research is particularly focused on the symbolic aspects of speech sounds.