Audibility of Phase Distortion

Any musical segment can be decomposed into harmonic components - that is, a sum of fixed frequency tones. There is universal agreement that changes in the relative amplitude of these components are audible. The "frequency response" of a sound system is normally understood to mean the amplitude response vs. frequency. But the relative phase of the harmonic components can also vary. Technically this is distortion, and it can have a huge effect on the time-variation of sound pressure on your eardrum. But is it audible?

In case you have any doubt, listen to two .wav files with identical spectrum magnitudes: three piano notes as originally recorded [128 kb], and the same three notes with modified phase [128 kb]. The difference is gross. (Several readers have correctly pointed out that the phase modification, described below, would never occur with real audio equipment. It is intended simply as a demonstration that two files with identical spectrum magnitudes can sound completely different.) The observation that phase is very audible under some conditions is certainly not new, and is stated emphatically in an excellent paper by Lipshitz (however, it doesn't seem to be universally accepted for some odd reason). A better question is: are phase changes audible under realistic listening conditions? Here the answer is not so clear-cut. I'm going to review some of the evidence, provide you with the opportunity to do realistic listening tests, and you will have to make up your own mind. Another good paper on this topic by Hawksford and Greenfield can be downloaded here (scan down to publication C23).

Arguments for Inaudibility

As noted in the Lipshitz paper, in 1843 G. S. Ohm wrote that the relative phase of the harmonic components has no audible effect. This was widely accepted for some time, but is now known to be incorrect. A revised premise is stated by Hartmann (page 257): "In general the relative phase between two signal components should be irrelevant if the two components are separated by more than a critical bandwidth. Phase should not matter then because there is no neural element [of the cochlea] that sees both components." However this revised argument is also inconsistent with results discussed below regarding tones an octave apart - which is more than a critical bandwidth.

There is extensive data indicating that under normal listening conditions, with real music, even experienced listeners have great difficulty perceiving phase effects. To quote from a recent survey paper by a top engineer at Harman International, Dr. Floyd Toole: "It turns out that, within very generous tolerances, humans are insensitive to phase shifts. Under carefully contrived circumstances, special signals auditioned in anechoic conditions, or through headphones, people have heard slight differences. However, even these limited results have failed to provide clear evidence of a 'preference' for a lack of phase shift. When auditioned in real rooms, these differences disappear." (The piano note demo shows that if you accept a broad definition of "contrived" you can get more than a "slight" difference.)

As discussed in other sections, I have done many listening tests comparing 1st and 4th order crossovers, which have very different phase responses. Arny Krueger provides test files and a double-blind test program, and you can do these tests yourself. In contrast to the artificial phase modification used for the piano notes, these have very realistic phase responses. I didn't find any differences that could not be attributed to amplitude. If there are audible differences due to typical crossover phase responses, they are pretty subtle, and in my opinion minor compared to other problems in the reproduction of music.

Arguments for Audibility

Hartmann notes (page 509) that if non-linearities are present, the relative phase between two frequencies producing intermodulation distortion can drastically affect the amplitude of the distortion. In fact, as he shows, in the case of intermodulation between a fundamental tone and its second harmonic, it is possible for the fundamental tone to completely disappear for certain (improbable) combinations of amplitude and phase! Therefore the level of distortion produced by a loudspeaker can depend on the phase response of a crossover feeding it. But the distortion could be either higher or lower, so this argument says that a flat phase response is different from, not better than, a non-flat response. It also doesn't address the issue of audibility of the differences.
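Hartmann's observation is easy to verify numerically. The sketch below (my own illustration, not taken from his book) passes a fundamental plus its second harmonic through a memoryless quadratic nonlinearity y = x + eps*x^2; the amplitudes and distortion coefficient are arbitrary, chosen large enough to make the phase dependence obvious. The cross term of the squared signal lands right on the fundamental frequency, where it adds constructively or destructively depending on the relative phase.

```python
import numpy as np

def fundamental_level(phi, a1=1.0, a2=0.5, eps=0.5, f0=200, fs=8000):
    """Amplitude at f0 after the quadratic nonlinearity y = x + eps*x**2.

    phi is the phase of the 2nd harmonic relative to the fundamental.
    All parameter values are illustrative, not taken from Hartmann.
    """
    t = np.arange(fs) / fs                    # exactly 1 s -> integer cycles
    x = a1 * np.sin(2*np.pi*f0*t) + a2 * np.sin(2*np.pi*2*f0*t + phi)
    y = x + eps * x**2                        # memoryless 2nd-order distortion
    spectrum = np.fft.rfft(y)
    return 2 * np.abs(spectrum[f0]) / len(t)  # bin f0 holds the fundamental

# The intermodulation product eps*a1*a2*cos(w*t + phi) adds to or subtracts
# from the original fundamental depending on phi:
print(fundamental_level(np.pi/2))    # destructive: ~0.75
print(fundamental_level(3*np.pi/2))  # constructive: ~1.25
```

With eps*a2 pushed all the way to 1 the fundamental vanishes entirely at the destructive phase, which is Hartmann's "improbable" extreme case.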

Regarding cochlea non-linearities (see discussion of ear nonlinearities in another section), Hartmann states (page 257) that if two tones are in different critical bands, intermodulation products are not expected, and thus would not introduce phase sensitivity. But harmonic distortion products due to ear nonlinearities do introduce phase sensitivity. It is well known that a loud tone can make an otherwise audible weaker tone completely inaudible (masking). In a very interesting paper, Nelson and Bilger show that the masking level of a tone for its 2nd harmonic depends on the relative phase of the tones. The difference is as large as 30 dB. Apparently this is a result of a second harmonic produced by nonlinearity in the ear itself, adding constructively or destructively with the externally produced 2nd harmonic. As noted in their paper, this effect varies quite a bit among individuals, and when I tested myself, I couldn't detect much difference. But for some people this effect could change perception of harmonic distortion produced by a sound system, again for better or worse. This effect can also alter perception of harmonics naturally produced by musical instruments, so here we have a valid argument that fidelity can be audibly degraded by a realistic phase change.
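The size of the Nelson and Bilger effect is easy to rationalize with a toy calculation. If the ear-generated 2nd harmonic happens to be nearly equal in amplitude to the external one, the level of their sum swings enormously with relative phase. The amplitudes below are purely hypothetical, chosen only to show how a swing of 30 dB or more can arise:

```python
import numpy as np

# Hypothetical amplitudes (arbitrary units, not measured data):
external = 1.0   # externally produced 2nd harmonic
internal = 0.97  # ear-generated 2nd harmonic, assumed nearly equal

# Two components at the same frequency sum as phasors, so the combined
# level depends strongly on their relative phase:
for phase_deg in (0, 90, 180):
    phase = np.radians(phase_deg)
    combined = abs(external + internal * np.exp(1j * phase))
    print(phase_deg, "deg:", 20 * np.log10(combined), "dB")
```

At 0 degrees the components reinforce (about +6 dB); at 180 degrees they nearly cancel (about -30 dB), a swing comparable to the 30 dB reported in the paper.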

The Lipshitz paper is often cited as one of the best studies of this subject, so it is worth restating the author's conclusions (given in the abstract) in their entirety; quote:

1) Even quite small midrange phase nonlinearities can be audible on suitably chosen signals.

2) Audibility is far greater on headphones than on loudspeakers.

3) Simple acoustic signals generated anechoically display clear phase audibility on headphones.

4) On normal music or speech signals phase distortion appears not to be generally [emphasis by the authors] audible, although it was heard with 99% confidence on some recorded vocal material.

End quote.


One of the tests mentioned in the Lipshitz paper is a combination of two tones where the phase "slips 360°...every few seconds." Fundamentally, frequency is the rate of phase change, so a phase slip of 360° in four seconds is the same as saying one frequency is increased or decreased by 1/4 Hz. I duplicated this test with two tones at 200 and 400 Hz, compared with two tones at 200 and 400.25 Hz, and indeed I could distinguish a difference. But it seems to me that I could be hearing a 1/4 Hz beat between a 2nd harmonic at 400 Hz created by ear nonlinearities and the 400.25 Hz tone.
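The equivalence between a steady phase slip and a small frequency offset can be checked directly. The snippet below (sample rate is my arbitrary choice) shows that a 400.25 Hz tone is sample-for-sample identical to a 400 Hz tone whose phase advances by a full 360 degrees every four seconds:

```python
import numpy as np

fs = 8000                                    # assumed sample rate
t = np.arange(4 * fs) / fs                   # 4 seconds of samples

# A tone at 400.25 Hz...
tone_offset = np.sin(2 * np.pi * 400.25 * t)

# ...is the same as a 400 Hz tone whose phase slips by 2*pi radians
# (360 degrees) every 4 seconds:
tone_slipping = np.sin(2 * np.pi * 400 * t + 2 * np.pi * t / 4)

print(np.max(np.abs(tone_offset - tone_slipping)))  # ~0: identical signals
```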

The Lipshitz paper also refers to an experiment where the phase of a signal containing a 200 Hz and a 400 Hz tone is reversed: "No one has failed to hear the timbral change with phase, and discern the polarity reversal on this signal with unvarying accuracy." I have created two .wav files with equal amplitudes of a 200 Hz tone at a reference phase of zero, and a 400 Hz tone at a phase of 90 degrees. Nelson and Bilger determined that this is the most sensitive phase setting. The first .wav file [215 kb] contains two identical 1/2-second bursts, separated by 1/8 second; for the second .wav file [215 kb] the phase is inverted in the second burst (see note). This is also a realistic case since some audio equipment inverts phase. If you open these files using CoolEdit you can zoom in and see the waveforms. I found that at low volume levels I could not distinguish between the two, but at a moderately loud level I could easily score 100% using Arny's blind tester. However it turns out that if you do an FFT of the two files the spectrum amplitudes are not identical. The spectrum amplitude of a single burst and a single phase-reversed burst is of course identical. But the spectrum of two sequential bursts is different. This illustrates the fact that it can be very tricky to avoid collateral amplitude changes when investigating the effect of phase. The differences in the spectrum amplitudes are small, but I still think it muddies the water.
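A sketch of how such a pair of files can be generated, and of the spectrum subtlety just described, follows. The sample rate and amplitudes are my assumptions; the actual test files may differ in those details:

```python
import numpy as np

fs = 44100                                 # assumed sample rate
t = np.arange(int(0.5 * fs)) / fs          # one 1/2-second burst

# Equal amplitudes: 200 Hz at reference phase 0, 400 Hz at 90 degrees
burst = 0.5 * np.sin(2*np.pi*200*t) + 0.5 * np.sin(2*np.pi*400*t + np.pi/2)
gap = np.zeros(fs // 8)                    # 1/8 second of silence

file_a = np.concatenate([burst, gap, burst])   # two identical bursts
file_b = np.concatenate([burst, gap, -burst])  # second burst phase-inverted

# The magnitude spectrum of a single burst is unchanged by inversion, but
# the spectrum of the two-burst sequence differs slightly between files:
spec_a = np.abs(np.fft.rfft(file_a))
spec_b = np.abs(np.fft.rfft(file_b))
print(np.max(np.abs(spec_a - spec_b)) > 0)  # True: small collateral differences
```

The difference arises because the two bursts add coherently in the spectrum: the sequence's spectrum is the single-burst spectrum multiplied by (1 + e^(-jwT)) in one file and (1 - e^(-jwT)) in the other, where T is the burst-to-burst delay, and those factors have different magnitudes at most frequencies.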

In a correspondence with Dr. Lipshitz he pointed out that if the phase of the second tone is 0 degrees instead of 90 degrees, the asymmetry between the positive and negative directions is greater. In other words, in one case there is a large positive excursion in the waveform, and in the phase-reversed case a large negative excursion. If you theorize that the phase effect is due to the ear behaving like a half-wave rectifier, the 0-degree case would appear to be the most distinguishable. If you use CoolEdit to examine the waveforms in the sample files given here, you will see that the positive and negative excursions are exactly equal. But when I do the test, the 90-degree case is actually easier to distinguish. And as noted above this is consistent with the Nelson and Bilger masking tests. So perhaps the ear does behave like a half-wave rectifier, but perhaps constructive and destructive interference with ear harmonic distortion is occurring as well.

The final conclusion of the Lipshitz paper is, I believe, the most important. It directly addresses the question of audibility with real music under realistic conditions. I interpret this result to mean that a flat phase response is a meaningful goal for a sound system, even if it may not be the most important goal. Certainly, if I could obtain an accurate phase response at relatively little cost, I'd take it.

The Sample .wav File of Three Notes

The notes in the original file were Fourier transformed, the phase of each frequency component randomized, then inverse Fourier transformed, to obtain the second file. Theoretically the magnitudes of the frequency spectra of the two files should be identical. Due to quantization errors there are small differences, but a comparison of FFTs of the two files shows agreement within 0.1 dB for the uppermost 56 dB, and within 1 dB for the uppermost 86 dB. This is much closer than the pair of files for the phase reversal test. If you wish to confirm this for yourself, note that you will not get agreement if you use a windowed transform (which CoolEdit uses). The window attenuates the signal towards each end of the time sample, which affects the spectrum differently in the two cases. (For example, suppose the signal were a single spike. The theoretical spectrum magnitude is exactly the same regardless of where the spike appears in the time interval, but with a windowed transform the spectrum magnitude would drop towards zero if the spike were near one end.)
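The procedure can be sketched as follows. This is a minimal illustration using a synthetic decaying tone in place of the recorded piano notes; note that the DC and Nyquist bins of a real signal's transform must keep their original (real-valued) phase for the inverse transform to come out exactly real:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in test signal; the original used three recorded piano notes.
fs = 8000
t = np.arange(2 * fs) / fs
signal = np.sin(2 * np.pi * 262 * t) * np.exp(-2 * t)   # decaying "note"

# Forward transform, randomize the phase of each component, invert.
spectrum = np.fft.rfft(signal)
random_phase = rng.uniform(0, 2 * np.pi, spectrum.shape)
random_phase[0] = np.angle(spectrum[0])    # DC and Nyquist bins must stay
random_phase[-1] = np.angle(spectrum[-1])  # real-valued for a real signal
scrambled = np.fft.irfft(np.abs(spectrum) * np.exp(1j * random_phase),
                         n=len(signal))

# The magnitude spectra match even though the waveforms are completely
# different (the scrambled version no longer decays like a struck note):
print(np.allclose(np.abs(np.fft.rfft(scrambled)), np.abs(spectrum)))  # True
```

In floating point the match is essentially exact; the 0.1 dB differences mentioned above come from quantizing the result back to 16-bit .wav samples.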


I would like to thank Dave Dal-Farra for sharing the results of his research on this question, for stimulating my interest in aspects such as the relationships between phase and distortion, and for bringing the Lipshitz paper to my attention.


Note: Delaying the composite signal by one full cycle of the 400 Hz tone leaves the 400 Hz tone unchanged, but shifts the 200 Hz tone by 180 degrees. It follows that shifting the phase of the 400 Hz tone by 180 degrees produces the same waveform (apart from a time shift) as shifting both tones by 180 degrees.
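This equivalence can be checked numerically; the sketch below uses steady tones and an arbitrary sample rate, with the delay of one 400 Hz cycle (2.5 ms) implemented as a circular shift of 20 samples:

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs                       # 1 s; both tones repeat exactly

s = np.sin(2*np.pi*200*t) + np.sin(2*np.pi*400*t + np.pi/2)

# Shift only the 400 Hz tone by 180 degrees:
shifted_400 = np.sin(2*np.pi*200*t) + np.sin(2*np.pi*400*t + np.pi/2 + np.pi)

# Invert the whole signal (shift both tones by 180 degrees) and delay it
# by one 400 Hz cycle: 2.5 ms = 20 samples at fs = 8000.
inverted_delayed = np.roll(-s, 20)

print(np.max(np.abs(shifted_400 - inverted_delayed)))  # ~0: same waveform
```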
