Arguably the most distinctive human ability is communicating highly complex concepts, for which we need language. Language can take many forms, from sign language to writing, but the most efficient form we have is speech, a skill that today is unique to living modern humans. The apes have a remarkable capacity for language and communication: Kanzi the bonobo (fig. 1) has a working knowledge of perhaps thousands of words and Koko the gorilla understands and uses American Sign Language. But they are unable to speak. Why is this? What are the differences between human and ape anatomy that allow us to produce these sounds, and what selection pressures may have driven the evolution of our highly specialised anatomy?
The Anatomy of Speech
In human babies the entrance to the windpipe (trachea) is very high up in the throat, allowing them to swallow and breathe through their nose at the same time – milk simply flows around the trachea as it enters the throat (diagram). This also means that the voice box (larynx) is very high up in the throat. Starting around 3 months of age, the larynx begins to descend further down the trachea and stops its migration by the time we reach 4 years old.
The final position of the larynx is possibly the most important difference in the large-scale anatomy of our vocal apparatus. Its low position increases the overall length of our vocal tract, the cavity made up of the pharynx, oral cavity and nasal cavity (fig. 2). The human nasal cavity is also significantly longer than in apes and the tongue is far more compact, allowing air to resonate more freely.
I know a song that’ll get on your nerves…
It is this resonance of the air that allows us to produce sounds: the vibration of the vocal cords inside the larynx produces a tone that is bounced around inside our vocal tract before leaving our mouth. But obviously we can produce more than one type of sound, which means we must be able to change the shape of our vocal tract. Our two most distinct vowel sounds are ‘ee’ and ‘oo’, but to produce all the sounds in between, such as ‘ah’, ‘uh’ and ‘oh’, we need much finer control over the shape of our vocal tract.
This is where we really have the upper hand over our ape-y cousins. We can independently control the size and shape of the pharynx and the oral cavity, which greatly increases the range of sounds we can make, even down to the difference between ‘aahhh’ (relaxation) and ‘awww’ (cuteness). Our ability here is aided by the great flexibility and accuracy of our tongue; there’s even research being done into using it to help paralysed people control electric wheelchairs. But what could have caused us to evolve these abilities in the first place?
Stand up, speak up.
A few possible explanations exist for why we evolved a larynx which descends over time: to help us breathe during heavy exercise, as a consequence of our upright posture, or to make us sound bigger than we actually are. However, plenty of animals breathe just fine during exercise and even manage to pant at the same time to regulate heat, all without a low larynx. The upright posture hypothesis doesn’t really hold up either, as both gibbons and kangaroos are upright creatures with a high larynx. Sounding bigger than we are may be a more promising explanation. Birds, deer and lions all possess longer vocal tracts than you would expect for their size, and also settle disputes using vocal cues.
Given that we also see a secondary descent of the larynx in male humans as their voice breaks, it looks likely that a lower larynx may aid in communication of your place in the social hierarchy, or to put off predators by sounding bigger. These two roles may have been particularly useful early on in hominin evolution before we evolved complex speech or the ability to use weapons to protect ourselves. Either way, speech did evolve and it must have been incredibly important – the very low position of our larynx and trachea in our throat puts us in great danger of choking on our food if we breathe at just the wrong time.
Aping around – mimicry and speech
So the descent of the larynx may have played an early role in communication, whether between ourselves or to other animals, but this is a long way from the intricate vocal patterns and linguistic meanings which comprise true speech. The closest that we find in the animal kingdom is bird song, refined through a mixture of instinct and imitation, leading to the development of recognisable songbird ‘dialects’. This imitation is taken to the extreme in species such as parrots and lyrebirds.
Such mimicry is rare in the animal kingdom but has evolved separately in birds, seals, and whales and dolphins. Dolphins use it to learn their mother’s song before they leave their pod. When they come across other dolphins, they sing their mother’s song, which can be recognised by siblings they have never known.
Vocal mimicry could have followed a similar path in our own ancestors, with dispersing females (New Scientist) needing to mimic the ‘dialect’ of a neighbouring population in order to be accepted into the social structure. With competition for mating, food and water running high, dialects may have become ever more complex to keep outsiders out, which would have required a better ear to recognise and distinguish between different dialects. It would also have been ever more important to accurately mimic more complex dialects in order to be accepted. This may eventually have led to runaway selection for vocal complexity, understanding and mimicry, the three vital ingredients needed to assign meaning to sounds and, crucially, to learn and remember that meaning for the future.
Speak now, or forever hold your peace…
But can we tell when these abilities evolved just from looking at the fossil record? The bone most likely to tell us anything about the state of our larynx is the hyoid bone, which sits at the front of our throat below the tongue and is the only bone in the body which doesn’t touch another bone (fig. 3). Instead, it is held in place by muscles, tendons, ligaments and cartilage.
Sadly, the hyoid only contains indirect clues about the anatomy of our vocal tract. It is also a very fragile bone which does not preserve well in the fossil record. Its shape remains the same as our larynx descends, so it is not very useful for finding out how complex Neanderthal speech was, although it is widely agreed that they had spoken language. The main difference in hyoid shape between humans and apes is the presence of a hollow in apes to accommodate vocal sacs, which were lost at some point along the hominin line. The size and shape of the bony canal for the hypoglossal nerve, which controls muscle movements below the tongue, is also uninformative, as there is a large range of overlap between humans and apes. The diameter of the spinal cord is not a reliable guide either: it is more likely to have first allowed greater control of breath for endurance running and only later been co-opted to allow fine control of breath for speech.
In conclusion, while we can explain how we are able to speak, the questions of precisely why we speak and when we evolved this ability remain unanswered.
If you’ve got any more questions, just grab me in the comments or on Twitter!
This blog post is based on Fitch, 2000.
Arcadian / Wikimedia Commons, 2007. Illu01 head neck. Available from: http://en.wikipedia.org/wiki/Pharynx#mediaviewer/File:Illu01_head_neck.jpg (Accessed: 01/09/14).
Fitch, W., 2000. The evolution of speech: a comparative review. Trends in Cognitive Sciences, 4, 258–267.
Great Ape Trust, 2013. Available from: greatapetrust.org (Accessed: 30/08/14).
The Worlds of David Darling, 1999-2014. Hyoid. Available from: www.daviddarling.info (Accessed: 01/09/14).