Music and the Human Brain

Our perception and enjoyment of music is both wonderful and mysterious. No known human culture, now or at any time in the recorded past, has lacked music. Some of the oldest artifacts found in human and proto-human excavation sites are musical instruments (in 2009, a 35,000-year-old bone flute was found in Germany). Why is music such an important part of human culture? Is there an evolutionary purpose for music, or is it just an evolutionary accident? Did it evolve before, after, or simultaneously with speech?

A study of the brain provides at least a few clues regarding these questions. It is clear that the human brain has a highly developed capacity for processing music. It is difficult for me to believe that this very specialized capability evolved without an evolutionary benefit. However, the benefit might not directly involve music. Pinker believes music does not serve an evolutionary purpose; he states: "As far as biological cause and effect are concerned, music could vanish from our species and the rest of our lifestyle would be virtually unchanged." He also states that music builds on language structures, implying that music evolved after language. Pinker is not alone in his opinion, but there is also no lack of experts who vigorously disagree.

Acknowledgement: much of the material in this section is derived from Levitin's book "This Is Your Brain on Music" and Oliver Sacks's book "Musicophilia". If you find this subject interesting, I also highly recommend the 52-minute YouTube lecture by Aniruddh Patel.

Speculations on the Evolution of Music

Again I will make a comparison between human hearing and a military Electronic Counter Measures (ECM) system (I used to design these things). The function of an ECM system is to detect a potential threat. As I note in the section on Music and the Human Ear, there is a lot of functional similarity between the human ear and the detector of an ECM system. The analogy can be carried forward another level. After an ECM system detects a radar signal, it immediately attempts to identify the source. It does this by determining the frequency of the radar signal (pitch), the pulse repetition and/or scan rate (rhythm), or, for a swept-frequency radar, the rate and range of the sweep (melody). These characteristics are compared to a database of friendly and hostile radars. Again we see a great functional similarity to what a brain does with music. So it does not seem at all implausible that the brain's tool kit for processing music originally evolved for defensive purposes.
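The identification step in this analogy can be sketched in a few lines of code. Everything here - the emitter names, frequencies, and tolerances - is invented purely for illustration:

```python
# Toy sketch of the ECM identification step: compare a detected signal's
# measured characteristics against a database of known emitters.
# All names and numbers are hypothetical.

# Known emitters: (frequency in GHz, pulse repetition rate in kHz)
EMITTER_DB = {
    "friendly_nav_radar": (9.4, 1.0),
    "hostile_tracking_radar": (10.2, 4.0),
}

def identify(freq_ghz, prf_khz, freq_tol=0.2, prf_tol=0.5):
    """Return the name of the closest matching emitter, or None."""
    for name, (f, p) in EMITTER_DB.items():
        if abs(freq_ghz - f) <= freq_tol and abs(prf_khz - p) <= prf_tol:
            return name
    return None

print(identify(10.1, 3.8))  # matches the hostile tracker
print(identify(6.0, 1.0))   # unknown emitter -> None
```

The brain's version is of course vastly more sophisticated, but the structure - extract a few characteristic features, then match against stored patterns - is the same.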

A very common characteristic of evolved features is that evolution initially selects for a capability to solve one problem, but then it turns out that the solution can be adapted to solve another problem, and evolution then selects for that as well. Thus it is plausible that a tool kit first evolved to identify threats, and was subsequently adapted for music appreciation. What could be the evolutionary benefit of music? Dance and music are intimately associated. In some cultures there is only one verb meaning both "to sing" and "to dance." For many creatures, singing and mating dances are a central feature of their courting rituals. A likely evolutionary purpose for this is that it is one way to display fitness to a potential mate. Several contemporary authors agree with this possibility. The March 2009 issue of Smithsonian magazine reports that when two mosquitoes of the opposite sex approach each other, they harmonize their wing beats to produce a love duet. If the male can't keep up with the female's rhythm, he is history. Levitin also argues that music is an evolutionary adaptation, and suggests mating and several other possible purposes, including social bonding and improved cognitive development. Sacks mentions that in preliterate cultures music played a huge role in oral traditions of storytelling and liturgy, greatly enhancing the ability to memorize long texts. Music could be an adaptation for any or all of these things. In his book Music, Language, and the Brain, Patel devotes an entire chapter to the adaptation issue, and concludes that there is not enough evidence to reach a definite answer.

The master himself, Charles Darwin, believed that music was connected with mating: "When we treat of sexual selection we shall see that primeval man, or rather some early progenitor of man, probably first used his voice in producing true musical cadences ... this power would have been especially exerted during the courtship of the sexes" (The Descent of Man, page 138).

On a personal level, music is a major part of my life. I will soon turn 70, and I can't even remember the last time I performed a dance to attract a potential mate. Although I always go to concerts with my wife, at home I usually listen to music alone. So music does not seem to be adaptive for me at this point. However, I still believe that music is an evolutionary adaptation, that it was one for me at one time, and that it most likely arose as part of human courtship rituals.

Independence of Music and Speech Processing

Sacks provides abundant evidence that the brain mainly processes music and speech separately, although there is some overlap. Speaking and speech recognition are mainly left-hemisphere functions. Although some believe that music is primarily a right-hemisphere function, Levitin states that virtually all parts of the brain are involved. Sacks gives many examples of people who have lost one or more music processing functions, but have no problem with a similar speech function. For example, he describes a woman with gross dystimbria. She described music as sounding like pots and pans being thrown on the floor, and opera like screaming. But she had no such problem with speech. He also refers to research where "voice selective" areas have been found in the auditory cortex that are anatomically separate from the areas involved in the perception of musical timbre.

Sacks also describes an interesting situation regarding people who have lost the capability to speak, but retain the capability of singing musical lyrics. This led to "melodic intonation therapy" (MIT), where patients are taught to sing short phrases, and then the musical elements are slowly removed. In some cases patients regained the power to speak a little. One case he describes is a man who could only produce meaningless grunts, and had received three months of speech therapy without any improvement. After two days of MIT he started to produce words. It is not clear if this is a result of the right hemisphere gaining speech capability, or if the therapy has a beneficial effect on the damaged left-hemisphere function. Sacks reports a case where the entire left hemisphere was surgically removed from a young child, who subsequently developed a right-hemisphere language capability, so the brain does have that adaptive capability.

Specialized Brain Functions

Historically, much of the knowledge of specialized functions of the brain was deduced from the failure of those functions following a stroke or an accident. Recently, magnetic resonance imaging (MRI) has shed a lot of additional light on how the brain operates. Nerve impulses from the cochlea arrive at the brain and are first processed to extract specific categories of information. Additional levels of processing eventually result in our emotional reaction to music.

Levitin lists eight "dimensions" of music, meaning attributes that can be individually varied without affecting the other dimensions. Among the most important dimensions are: perceived pitch; rhythm; timbre; melody; and reverberation. It seems that the brain contains specialized modules for extracting each of these attributes. The evidence for this is that brain damage can cause a loss of one of the functions without affecting the others. (There is similar evidence in the case of speech, where it is possible to lose the ability to speak verbs, or to speak nouns, for example.) The brain appears to be a collection of highly specialized modules that are seamlessly integrated by a hierarchy of higher order modules.

Consider the perception of timbre, the attributes that distinguish a saxophone from a trumpet. When both are playing the same note, A4, each instrument creates a fundamental tone at 440 Hz, and the same spectrum of harmonics at frequencies of 880, 1320, 1760 Hz and so on. The relative amplitudes of the harmonics, and their variations over time, are what give the instrument its characteristic sound. Nerves from the regions of the cochlea excited by these frequencies send signals to the brain, and presumably a brain module recognizes the different pitches, similar to the way we recognize different colors - although at this level the recognition is probably not conscious. Now what? How on earth does the brain disentangle the overlapping saxophone and trumpet harmonic series, and re-assemble them, so we hear two distinct instruments, each playing a single note, rather than a mishmash of pitches? Levitin suggests that a difference of a few milliseconds between the arrival times of the two harmonic series is the basis for this amazing feat. Directional clues might be used as well. I also suspect that our memory of what each of these instruments sounds like when played solo helps in this process. The brain module that does this will even fix the bass response of an inferior stereo system. If an instrument plays a note that produces tones at 39, 78, 117, 156 Hz and so on, but your stereo system can't produce a 39 Hz tone, your brain will fill in the gap and you will hear a 39 Hz pitch! This phenomenon is called "restoration of the missing fundamental." All of this processing occurs automatically, largely in parallel, and without any conscious effort.
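The missing-fundamental effect can be demonstrated numerically: a signal containing only the harmonics 78, 117, and 156 Hz still repeats 39 times per second, and a simple autocorrelation pitch detector (a rough stand-in for whatever the brain actually does) finds that period. A minimal sketch:

```python
import numpy as np

# Build a tone from harmonics at 78, 117, and 156 Hz only -- the 39 Hz
# fundamental is absent -- and recover the pitch from its periodicity.
fs = 8000                      # sample rate in Hz
t = np.arange(int(fs * 0.5)) / fs
signal = sum(np.sin(2 * np.pi * f * t) for f in (78, 117, 156))

# Autocorrelation; keep non-negative lags only.
ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]

# Find the strongest peak in a plausible pitch range (20-500 Hz).
lo = int(fs / 500)
lag = lo + np.argmax(ac[lo:int(fs / 20)])
print(round(fs / lag, 1))      # detected pitch: about 39 Hz
```

The detector reports the 39 Hz "pitch" even though no energy exists at that frequency, because the waveform as a whole repeats every 1/39 of a second.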

Different pitches heard at the same time are processed by a module that extracts harmony. Different pitches heard at different times are processed by another module that extracts melody. Again we know this because Sacks describes patients who have lost one capability without affecting the other. He describes a gifted musician who had a stroke. Suddenly he was unable to recognize a tune as simple as "Happy Birthday." Yet his perception of pitch and rhythm was intact, and he could read music and hum a melody. So the problem was specifically an inability to process a sequence of pitches.

Emotional response to music is arguably the highest response level - although the most primitive areas of the brain appear to be heavily involved. It also apparently arises from a specialized part of the brain. Sacks gives examples of people who have the full tool kit for music processing, including a strong emotional response, and then suffer an accident. They retain the ability to perceive all of the structural characteristics of music, but completely lose the emotional response. He quotes one patient: "[music] had always been the primary unfailing source that nourished my spirit. Now it just didn't mean anything."

Absolute Pitch

Absolute pitch is an interesting specialization because it is quite rare, occurring in maybe one in ten thousand people, but there is some evidence that all (or many) humans are capable of extracting this information. A person with absolute pitch can identify a note - middle C for example - regardless of the source, which can be a musical instrument, a car engine, or the wind. It is frequently described as being similar to perceiving color, and people with absolute pitch have a hard time understanding why others are aurally "color blind."

It turns out that absolute pitch is much more common among Chinese and Vietnamese than most other populations. Their languages rely heavily on tonal quality, and it is speculated that they learn absolute pitch as part of their acquisition of language. It also appears that the capability of learning absolute pitch disappears at the same age that our ease of learning a language becomes diminished. Absolute pitch is also much more common among people with early blindness. I wonder if the area of the brain used to perceive color is co-opted in this case and applied to sound.

There is speculation that absolute pitch was important in our evolutionary past, perhaps as part of a proto-language, but has become much less relevant (for most of us) at the present. So the capability could well be an evolutionary relic.

Spatial Imaging

The brain also has modules that extract information on sound source location and the characteristics of the space you are in. Location depends primarily on interaural differences in time of arrival, on the order of 0.5 milliseconds, but also on reflections from the pinna, where time differences are on the order of 0.1 milliseconds. A spectrum of sounds arriving along with a time-delayed copy will have its amplitude contour altered by partial comb filtering. This only works when both the direct and delayed signals are present in the same ear, so the brain could be using this amplitude information for the pinna reflections. The amplitude difference between the two ears is also used. As noted in the section on "Music and the Human Ear," I created an audio test file where both time and amplitude differences between the ears were retained, and two other files where only one difference was retained. I was surprised to observe that I could detect the location almost equally well in all three cases. The brain adapts to the available data, and wrings out every last drop of information. These three audio files can be found in the Sound Demo Section so you can listen for yourself; scroll to the bottom of the page and look for three HRTF files. Headphones should be used, but it works with my computer speakers as well. The HRTF processing is described in the same section.
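The comb-filtering cue is easy to quantify: when a signal is summed with a copy of itself delayed by a time t, cancellation occurs at every frequency where the delay equals an odd number of half-periods, i.e. f = (2k+1)/(2t). A sketch:

```python
def comb_notches(delay_s, max_freq_hz=20000):
    """Notch (cancellation) frequencies produced when a signal is summed
    with a copy of itself delayed by delay_s seconds."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2 * delay_s)
        if f > max_freq_hz:
            return notches
        notches.append(f)
        k += 1

# A 0.1 ms pinna-reflection delay puts the first notch at 5 kHz,
# squarely in the audible range.
print(comb_notches(0.0001))   # notches at 5 kHz and 15 kHz
```

This is why a pinna reflection, which is far too quick to hear as an echo, can still carry directional information: it reshapes the amplitude spectrum in a direction-dependent way.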

Reverberation provides information on the size of the room you are in, or the presence of large reflecting objects if you are outside. I think most people have probably had the experience of sensing the size of a room, even in total darkness. This is essentially a primitive version of the sonar system used by bats, but using time differences in echoes of ambient sound. This information, and sound source location, have obvious utility for defensive purposes, but the results of this processing have also been co-opted for musical enjoyment. Any music lover who has heard an organ in a grand cathedral is keenly aware of the difference when the same music is played in a small room. It is possible to add artificial reverb, but it is not possible to get rid of the early reflections in a small room, and the brain is not fooled at all.
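The echo-ranging idea above is simple arithmetic: sound travels at roughly 343 m/s in air, and an echo covers the distance to the reflecting surface twice. A sketch, assuming the source and listener are at the same spot:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 C

def wall_distance(echo_delay_s):
    """Distance to a reflecting surface, assuming the sound source and
    listener are co-located, so the echo travels out and back."""
    return SPEED_OF_SOUND * echo_delay_s / 2

# A 20 ms gap between the direct sound and the first reflection implies
# a surface roughly 3.4 m away -- a small room.
print(round(wall_distance(0.020), 2))
```

In a grand cathedral the first strong reflections arrive tens of milliseconds later than in a small room, and no amount of artificial reverb can remove the early reflections that give the small room away.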

Audiophiles are fanatic about the importance of stereo imaging. Sacks describes patients who lost their spatial imaging capability and suddenly found that music had become "flat and lifeless." The acoustic characteristics of a concert hall are universally recognized as a critical factor in musical enjoyment. All of this is abundant evidence of the importance of the spatial aspect of music processing.


Rhythm

It is easy to imagine that rhythm was a very early form of musical expression. Anyone can pick up a stick and rhythmically beat it against something. A universal human reaction is a synchronized motor response: foot tapping or other body movements. This appears to be (almost) a uniquely human trait. Sacks quotes Aniruddh Patel: "there is not a single report of an animal being trained to tap, peck, or move in synchrony with an auditory beat." However, the Patel video I reference shows a bird that he now believes really is dancing. This linking of auditory and motor systems depends on interactions between the auditory cortex and the dorsal premotor cortex, and only the human brain (or perhaps the birdbrain as well) has a functional connection between these areas. Patel goes on to argue that rhythm "... cannot be explained as a by-product of linguistic rhythm" and must have evolved separately from speech.

Rhythm can have an amazing unifying effect on a group of people. Anybody who has heard Stevie Wonder's rendition of "Fingertips," and the electrifying effect it had on the crowd, is aware of this. My son frequently participates in drum circles, and I vividly remember an experience I had many years ago in Yosemite National Park. A bunch of us were in a meadow, making random noises on various kinds of percussion instruments. After a few minutes somehow a unifying rhythm emerged. Everyone tuned into it and played ecstatically. I totally lost my sense of "I" and for a few fleeting minutes a ragtag mob became a single throbbing organism. A few other times in my life I have had a transcendent experience dancing to very loud, very rhythmic music, and it was intimately connected with the fact that I was one member of a group caught up in the same frenzy.

Another context where rhythm is a crucial unifying component (for better or for worse) is martial music, and some religious celebrations, such as gospel music.


I would like to end with a quote from Sacks: "Music, uniquely among the arts, is both completely abstract and profoundly emotional. It has no power to represent anything particular or external, but it has a unique power to express states or feelings. Music can pierce the heart directly; it needs no mediation." And all of this due to the most incredible musical instrument of all, the human brain.
