`
`On the Perception of Phase Distortion*
`
`HIDEO SUZUKI, SHIGERU MORITA, AND TAKEO SHINDO
`
`Consumer Products Research Laboratory, Mitsubishi Electric Corporation, Kamakura, Kanagawa, Japan
`
Differences in the sound qualities of linear- and nonlinear-phase loudspeakers are evaluated
by simulating the phase responses of loudspeakers using all-pass filters. Test results indicate
that the perception of phase distortion is highly dependent on individual ability, and that it is
easier to detect phase distortion by headphone listening than by loudspeaker listening.
Instantaneous frequency and instantaneous envelope are tentatively introduced as
representations of a transient signal to give some indication of how transient signals sound.
`
`0 INTRODUCTION
`
For many years, phase perception has been the subject of a controversy among acoustic
engineers. Some argue that phase shift is essentially immaterial, while others believe that
phase response is as important as amplitude response. It seems to us that those who neglect
the importance of phase response are referring to the type of phase changes that occur
gradually over a wide frequency region. In contrast, most of the phase-oriented people, in
testing, use synthetic signals that undergo large phase changes within a relatively narrow
frequency band. We think that both opinions are correct, and the important thing is the degree
to which the sound quality is affected by the phase response. This should be investigated
quantitatively and systematically for many kinds of phase responses and sound sources. Quite a
few studies have been done on the perception of phase changes [1]-[7], so we would like to lay
stress on the effect of the phase response of a loudspeaker.

Actual loudspeakers, even the so-called linear-phase loudspeakers, have nonlinear phase
responses. The object of our study is to determine whether these phase distortions have any
effect and, if so, to what extent they degrade high-fidelity reproduction. The variety of
signals and phase responses makes our problem more difficult to deal with quantitatively.
Using all-pass filters is one way to meet this problem, because all-pass filters have constant
amplitude responses and can simulate the phase responses of loudspeakers.
`
`* Manuscript received 1979 March; revised 1980 February.
`
`1 PHASE AND GROUP DELAY OF SINGLE-POLE
`FILTERS
`
The all-pass filter, one of the nonminimum-phase networks, is a very convenient device for
studying the effect of phase response on sound quality because the phase response can be
changed independently of the amplitude response. Generally the transfer function $H(\omega)$,
with amplitude response $A(\omega)$ and phase response $\phi(\omega)$, is represented by the
following equation:

$$H(\omega) = A(\omega) \cdot e^{j\phi(\omega)}. \tag{1}$$

For all-pass filters $A(\omega)$ is a constant such that $A(\omega) = A_0$. The group delay
$t_g$ of Eq. (1) is defined as [8]:

$$t_g = -\frac{d\phi(\omega)}{d\omega}. \tag{2}$$

Single-pole all-pass filters composed of active elements are shown in Fig. 1. The transfer
functions $H_A(\omega)$ and $H_B(\omega)$ for types A and B are given, respectively, by

$$H_A(\omega) = 1 \cdot e^{j\phi_A(\omega)} = e^{j\{-2\tan^{-1}(f/f_0)\}} \tag{3}$$

$$H_B(\omega) = 1 \cdot e^{j\phi_B(\omega)} = e^{j\{\pi - 2\tan^{-1}(f/f_0)\}} \tag{4}$$
`
where $f_0$ is the 90-degree phase-shift frequency given by the equation

$$f_0 = \frac{1}{2\pi R_0 C_0}. \tag{5}$$

Eqs. (3) and (4) indicate a phase difference of $\pi$. The group delays of $H_A(\omega)$ and
$H_B(\omega)$ are given by the same equation:

$$t_{gA,B} = \frac{1}{\pi f_0 \{1 + (f/f_0)^2\}}. \tag{6}$$

For small $f$ ($\ll f_0$), $t_g$ is constant at $1/\pi f_0$, and it gradually decreases to zero
as the frequency increases. The phase responses $\phi_A(\omega)$ and $\phi_B(\omega)$ and the
group delay $t_g$ are shown in Fig. 2.

JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 1980 SEPTEMBER, VOLUME 28, NUMBER 9
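As a numerical check of Eqs. (3), (4), and (6), the following Python sketch evaluates the phase and group delay of the two single-pole all-pass filters. The function names are illustrative and do not come from the paper; frequencies are assumed to be in hertz, and only the standard library is used.

```python
import math

def phase_lag_deg(f, f0):
    """Phase of the type A (phase-lag) filter, Eq. (3), in degrees."""
    return math.degrees(-2.0 * math.atan(f / f0))

def phase_lead_deg(f, f0):
    """Phase of the type B (phase-lead) filter, Eq. (4), in degrees."""
    return math.degrees(math.pi - 2.0 * math.atan(f / f0))

def group_delay_s(f, f0):
    """Group delay common to both filter types, Eq. (6), in seconds."""
    return 1.0 / (math.pi * f0 * (1.0 + (f / f0) ** 2))

if __name__ == "__main__":
    # 300 Hz and 1 kHz are the two values of f0 used in the experiment.
    for f0 in (300.0, 1000.0):
        print(f"f0 = {f0:.0f} Hz: "
              f"phase lag at f0 = {phase_lag_deg(f0, f0):.1f} deg, "
              f"phase lead at f0 = {phase_lead_deg(f0, f0):.1f} deg, "
              f"low-frequency group delay = {group_delay_s(0.0, f0) * 1e3:.2f} ms")
```

For $f_0$ = 1 kHz the low-frequency group delay $1/\pi f_0$ evaluates to about 0.32 ms, and at $f = f_0$ it has fallen to half of that value.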
`
Fig. 1. Single-pole all-pass filters composed of active elements. (a) Phase lag. (b) Phase lead.

Fig. 2. Phase responses and group delays of single-pole all-pass filters. [Horizontal axis:
frequency (×$f_0$ Hz); vertical axes: phase (deg) and delay (×0.32/$f_0$ s).]

2 SIMULATION OF LOUDSPEAKER PHASE
RESPONSE BY ALL-PASS FILTERS

Now we will show how we can simulate the phase response of a loudspeaker using single-pole
all-pass filters. The phase response of a two-way loudspeaker system is shown in Fig. 3,
curve a. The fundamental resonance frequency of the woofer in its enclosure is 65 Hz, and the
tweeter is connected in reversed polarity at a crossover frequency of 1.5 kHz. The phase
response of a phase shifter consisting of two single-pole all-pass filters connected in series
is shown in Fig. 3, curve b. Here $f_0$ of the phase-lead type is 70 Hz, and that of the
phase-lag type is 2 kHz. The phase responses of the actual loudspeaker system and the phase
shifter are in good agreement. This loudspeaker can be made phase linear by the conventional
linear-phase technique, resulting in the response shown in Fig. 3, curve c. This phase
response is that of a single-pole all-pass filter of the phase-lead type in which $f_0$ = 70 Hz.
It is clear, in this case, that the difference between the so-called linear- and
nonlinear-phase loudspeaker systems amounts to the presence or absence of a single-pole
all-pass filter of the phase-lag type with $f_0$ = 2 kHz. Therefore the advantage of
linear-phase response can be evaluated by checking the effect of this all-pass filter on sound
quality.
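The relation between curves b and c of Fig. 3 can be sketched numerically by adding the phases of the cascaded filters. In the Python sketch below the function names are hypothetical; the 70-Hz and 2-kHz values are those quoted in the text, and the phases are assumed to follow Eqs. (3) and (4). It does not model the measured loudspeaker (curve a).

```python
import math

def phase_lag_deg(f, f0):
    """Phase-lag type filter, Eq. (3), in degrees."""
    return math.degrees(-2.0 * math.atan(f / f0))

def phase_lead_deg(f, f0):
    """Phase-lead type filter, Eq. (4), in degrees."""
    return math.degrees(math.pi - 2.0 * math.atan(f / f0))

def curve_b_deg(f, f_lead=70.0, f_lag=2000.0):
    """Series connection of lead and lag filters: the phases add."""
    return phase_lead_deg(f, f_lead) + phase_lag_deg(f, f_lag)

def curve_c_deg(f, f_lead=70.0):
    """Phase-lead filter alone (the 'linear-phase' case in the text)."""
    return phase_lead_deg(f, f_lead)

if __name__ == "__main__":
    for f in (20.0, 70.0, 500.0, 2000.0, 10000.0):
        b, c = curve_b_deg(f), curve_c_deg(f)
        print(f"{f:8.0f} Hz   b = {b:8.1f} deg   c = {c:8.1f} deg   b - c = {b - c:8.1f} deg")
```

At every frequency the difference b - c equals the phase of the 2-kHz phase-lag filter alone, which is why testing that single filter is enough to evaluate the advantage of the linear-phase design.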
`
3 TRANSIENT SIGNALS FOR HEARING TEST

Our preliminary tests indicated that single-pole all-pass filters do not affect the sound
quality of such continuous signals as rectangular or sawtooth waves, where the subjects and
the setup were the same as for the experiment to be described in Section 4. Therefore we used
transient signals of short duration, as shown in Fig. 4, where they are designated S1, S2, R1,
R2, T1, and T2. The time interval of one cycle $T_0$ for each signal was chosen such that
$T_0 = 2/f_0$, where $f_0$ is the 90-degree phase-shift frequency of the phase-lag type
all-pass filter defined by Eq. (5).
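The scaling of the test signals with $f_0$ can be illustrated with a short Python sketch. It takes the relation $T_0 = 2/f_0$ exactly as printed and evaluates it for 300 Hz and 1 kHz, the two values of $f_0$ used in the experiment of Section 4; the function name is hypothetical.

```python
def cycle_time_s(f0_hz):
    """One-cycle interval T0 = 2 / f0, taken as stated in the text."""
    return 2.0 / f0_hz

if __name__ == "__main__":
    for f0 in (300.0, 1000.0):
        print(f"f0 = {f0:6.0f} Hz -> T0 = {cycle_time_s(f0) * 1e3:.2f} ms")
    ratio = cycle_time_s(1000.0) / cycle_time_s(300.0)
    print(f"T0(1 kHz) / T0(300 Hz) = {ratio:.2f}")
```

The ratio of 3/10 between the two intervals agrees with the shortening of the signal intervals reported for the 1-kHz test.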
What we actually hear is a sound, and sound signals converted from electrical signals by a
loudspeaker or a headphone have different waveforms. These waveforms are shown in Fig. 5 for
S2, R2, and T2, including the electrical output of the amplifier. In each block the unfiltered
waveform is at the top and the filtered waveform is at the bottom. The loudspeaker used for
the hearing test was a bass-reflex two-way system with a 30-cm woofer and a 5-cm tweeter. The
loudspeaker measurement was made in a listening room with a reverberation time of 0.3 second,
and a dummy head was used for the headphone measurement.
`
`4 EXPERIMENT
`
Fig. 6 shows a schematic diagram of the instrumentation. The transient signals generated by a
function generator are phase shifted by the phase-lag type all-pass filter. Switch positions 1
and 2 give the unfiltered and filtered signals, respectively. Intervals of the transient
signals were changed according to the frequencies of 300 Hz and 1 kHz selected for $f_0$ in our
experiment (see Fig. 4). The unfiltered and filtered signals were presented to the subjects
through a loudspeaker or a headphone in either a listening room or an anechoic chamber. At
their peaks, the levels of the transient signals measured by a B&K 2606 measuring amplifier
were about 100 dB sound pressure level at the listening position. The loudness of the
headphone test was about the same as that of the loudspeaker test. All the subjects
participating in our experiment were members of the audio engineering group of our laboratory,
with ages ranging from 25 to 35.
`
Fig. 3. Phase responses of an actual loudspeaker and all-pass filters. a: two-way loudspeaker
system, tweeter reversed in polarity; b: combination of phase-lead and phase-lag type all-pass
filters; c: phase-lead type all-pass filter. [Axes: phase (deg) versus frequency (Hz).]
`
They were not tested for normal hearing, but all of them had enough experience to judge the
sound quality of audio equipment.

There are four combinations of unfiltered signals (position 1) and filtered signals (position
2): 1-1, 1-2, 2-1, and 2-2. For each combination the first signal, A, was presented to the
subjects five times at 1-second intervals. After this, with the same 1-second intervals, the
second signal, B, was presented five times. This process was repeated five times. We thus have
a row of 50 transient signals, which is represented as A-A-A-A-A-B-B-B-B-B-A-A-A-A-A ...
B-B-B-B-B, where A and B denote the first and second signals, respectively, and - stands for
the 1-second pause between signals. It was necessary to present the same signal several times
consecutively, because a single presentation of a transient signal of very short duration was
not enough for the subjects to grasp and keep in mind its sound quality. Also, a number of
comparisons were necessary to reach a decision for one specific combination because the
difference in sound quality between filtered and unfiltered signals was so small.
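The presentation order described above can be written out as a small Python sketch. The function and variable names are hypothetical; the counts (five repetitions of each signal, five cycles, four switch-position combinations) are those given in the text.

```python
def presentation_sequence(first, second, repeats=5, cycles=5):
    """Build the row A x5, B x5, repeated `cycles` times (50 signals by default)."""
    seq = []
    for _ in range(cycles):
        seq.extend([first] * repeats)
        seq.extend([second] * repeats)
    return seq

# Switch positions: 1 = unfiltered, 2 = filtered.
COMBINATIONS = [(1, 1), (1, 2), (2, 1), (2, 2)]

if __name__ == "__main__":
    for first, second in COMBINATIONS:
        seq = presentation_sequence(first, second)
        head = "-".join(str(p) for p in seq[:15])
        print(f"combination {first}-{second}: {head} ... ({len(seq)} signals)")
```

With 1-second pauses between signals, each such row takes roughly 50 seconds to present, not counting the duration of the signals themselves.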
`
`
Then each subject was asked whether the first and second signals sounded the same or
different. If the listener judged sequences 1-1 and 2-2 as the same and sequences 1-2 and 2-1
as different, the answer was counted as correct. From 50 judgments the percentage of correct
answers was obtained for each subject. If a subject could detect the effect of phase shift
perfectly, the score would be 100 percent. On the contrary, if the listener could not
distinguish any difference, the score would be about 50 percent. Our impression from the
experiment is that a score in the low 70-percent range is a reasonable line for separating
subjects who can distinguish phase shift from those who cannot. (If this experiment had been
conducted by the normal A-B-A paradigm, the results would have been much worse.)
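The scoring rule can be sketched in Python as follows. The data layout and function names are hypothetical. The last function goes beyond the paper: it is a simple binomial estimate of how often a listener who only guesses would reach a given score, assuming the 50 judgments are independent and each guess is correct with probability 0.5.

```python
from math import comb

def is_correct(first_pos, second_pos, answer):
    """A judgment is correct if 'same'/'different' matches the switch positions."""
    truth = "same" if first_pos == second_pos else "different"
    return answer == truth

def percent_correct(judgments):
    """judgments: iterable of (first_pos, second_pos, answer) tuples."""
    judgments = list(judgments)
    hits = sum(is_correct(a, b, ans) for a, b, ans in judgments)
    return 100.0 * hits / len(judgments)

def prob_at_least(k, n=50, p=0.5):
    """Probability of k or more correct answers out of n by guessing alone."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

if __name__ == "__main__":
    # A listener who always answers "same" is right only for 1-1 and 2-2.
    always_same = [(a, b, "same") for a, b in [(1, 1), (1, 2), (2, 1), (2, 2)]]
    print("always 'same':", percent_correct(always_same), "percent")
    # 38 of 50 is the smallest count above 75 percent.
    print("P(at least 38 of 50 by guessing) =", prob_at_least(38))
```

Under these assumptions a score above 75 percent is very unlikely to arise from guessing, which is consistent with the authors' choice of a dividing line in the low 70-percent range.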
`
`5 RESULTS AND DISCUSSIONS
`
The percentages of correct answers for each subject are shown in Fig. 7, where $f_0$ = 300 Hz
and a loudspeaker in a listening room was used. Subject A scored more than 75 percent for
every signal tested. Other scores above 75 percent were obtained by subject B for S2 and T2, by
subject C for T1, and by subject H for S2. The scores of the other subjects are lower than 75
percent for all signals. The widely distributed scores show the important fact that the
sensitivity to a change of phase response is highly individual. Judging from the mean values,
S2 seems to be the signal for which changes in sound quality are easiest to detect.
The results obtained in an anechoic chamber are shown in Fig. 8, where $f_0$ = 300 Hz and
signals S2 and R2 were used. There seems to be no meaningful difference between the results of
Fig. 7 and those of Fig. 8, indicating that the reverberation of the listening room does not
have much effect on phase perception. The effect of reverberation may also be discussed by
comparing the results of loudspeaker and headphone listening. Fig. 9 illustrates the results
of headphone listening for $f_0$ = 300 Hz. Headphone listening shows much greater sensitivity
than loudspeaker listening. The reason for this may be that the headphone sound has greater
detail, which is lost in loudspeaker reproduction because of the difference in frequency
response rather than because of the room reverberation.

Fig. 4. Transient signals used for hearing test.

Fig. 5. Waveforms of amplifier (AMP), headphone (HP), and loudspeaker (SP) output for S2, R2,
and T2. Units of the horizontal axes are 4.4 ms for amplifier and headphone and 8.9 ms for
loudspeaker.

Fig. 6. Schematic diagram of instrumentation. [Blocks: function generator, all-pass filter,
A or B switch, amplifier, listening room or anechoic chamber.]

Fig. 7. Percentages of correct answers of loudspeaker listening for various signals in a
listening room, where $f_0$ = 300 Hz.

Fig. 8. Percentages of correct answers of loudspeaker listening for S2 and R2 in an anechoic
chamber, where $f_0$ = 300 Hz.

Further results of headphone listening are shown in Fig. 10, where the frequency $f_0$ was
raised to 1 kHz and, accordingly, the intervals of signals were shortened to 3/10. It is
evident that even for headphone listening the phase shift of Fig. 2 when $f_0$ = 1 kHz is not
discernible. The effect of the all-pass filter is to give a frequency-dependent time delay to
the system, but the resolution of the ear in the time domain may not be enough for the
perception of this delay. For $f_0$ = 1 kHz the group delay is 0.32 ms in the low-frequency
region.

Our major concern is to find a permissible level of phase distortion in loudspeaker
reproduction, and from this aspect we can interpret the results of Figs. 7-10 in two ways.
First, some people certainly can hear the phase distortion in the low frequencies of the type
shown in Fig. 2 when highly artificial signals are used. In this sense, phase distortion is
not permissible for high-fidelity reproduction. A nonlinear-phase multiway loudspeaker system
that has a relatively low crossover frequency, say at several hundred hertz, may have some
coloration due to the phase distortion. Second, however, as the average values indicate, most
subjects were much less sensitive to the phase distortion. In fact, no one found even the
slightest change in sound quality by the phase shift when popular music from several
commercial disks was used for a qualitative loudspeaker listening test.

When we listen to S2 via a loudspeaker or headphone, two pitches can be discerned, a louder
low pitch and a quieter high pitch. It is difficult to tell which comes first, but when we
listen to a filtered signal, the higher pitch is much more distinct than that of the
unfiltered signal. This may be explained as follows. When S2 is presented without the all-pass
filter, the low-frequency and high-frequency components of the waveform are perceived almost
simultaneously, and the low-frequency content partially masks the high-frequency components.
When S2 is presented through the all-pass filter, the low-frequency components are delayed
relative to the high-frequency components. This allows the high-frequency components to be
perceived before the low-frequency masking starts, giving the sensation of a high-frequency
tone just at the beginning of the filtered pulse. This could be further explained
quantitatively by using the filter bank model [9].

Another way, we think, to successfully explain the perception of signal S2 is to use the
instantaneous frequency and the instantaneous envelope. These are defined as follows. When we
represent an arbitrary time function $f(t)$ by
`
$$f(t) = A(t) \cdot \cos\phi(t) \tag{7}$$

where $A(t)$ is called the instantaneous envelope and $\phi(t)$ the instantaneous phase,
$A(t)$ and $\phi(t)$ are given, respectively, by

$$A(t) = \frac{f(t)}{\cos[\tan^{-1}\{-\hat{f}(t)/f(t)\}]} \tag{8}$$

$$\phi(t) = \tan^{-1}\{-\hat{f}(t)/f(t)\}. \tag{9}$$

Instantaneous frequency is obtained by differentiating Eq. (9) with respect to time:

$$\omega(t) = \frac{\hat{f}(t) \cdot f'(t) - f(t) \cdot \hat{f}'(t)}{f^2(t) + \hat{f}^2(t)} \tag{10}$$

where the time function $\hat{f}(t)$ is the Hilbert transform of $f(t)$, which is given by

$$\hat{f}(t) = f(t) * (1/\pi t) \tag{11}$$

where $*$ denotes convolution.

Fig. 9. Percentages of correct answers of headphone listening for various signals, where
$f_0$ = 300 Hz.

Fig. 10. Percentages of correct answers of headphone listening for S2 and R2, where
$f_0$ = 1 kHz.

The instantaneous frequency and the instantaneous envelope of S2 for headphone reproduction
are shown in Fig. 11. The waveform was digitized by an analog-to-digital converter with a
sampling time of 50 µs and 1024 samples. The instantaneous envelope is shown as a relative
quantity, and the negative portions of its curve were reversed in sign by program
manipulation. Fig. 11 shows that the instantaneous frequency of the unfiltered S2 is about 340
Hz at the start and about 150 Hz during the remaining portion. These frequencies seem to
correspond to the two pitches of the sound. When S2 is filtered, the peak of the instantaneous
frequency becomes much higher at the start than for the unfiltered S2. It is reasonable to
think in this case that the higher pitch will be enhanced when we listen to this filtered
signal. The small value of the instantaneous envelope at the time of maximum instantaneous
frequency for both filtered and unfiltered S2 shows that the higher pitch will sound quieter
than the lower pitch.

The above discussion may show that the instantaneous frequency and the instantaneous envelope
are sometimes successful in explaining the perception of phase distortion of transient sounds.

6 CONCLUSION

For a long time too much of our attention has been directed to amplitude response, perhaps
because measurements were so easy. Another reason might be that amplitude response has such an
extremely large influence on the sound quality that we tend to forget the other
characteristic, phase response.

Many high-grade loudspeaker systems have been designed without any consideration of the phase
response. But once we have started to investigate phase response and have the tools for
measurement, it is no longer easy to ignore the effect of phase response. No one can deny the
necessity of linear phase response to reproduce a waveform with high accuracy. Still, we must
be careful that reproduction of the waveform with high accuracy does not become the sole
criterion for high-fidelity reproduction. The results obtained by our experiment show that the
effect of phase change on the quality of transient sound is much smaller than one would expect
from the change of the waveform due to the phase distortion.

For music signals the quantity of phase distortion used in our experiment was not enough to be
detected by any subject. It will be our future goal to find the permissible level of phase
distortion for music sources.

Fig. 11. Instantaneous frequencies (IF) and instantaneous envelopes (IE) for headphone
listening. (a) Unfiltered S2. (b) Filtered S2. [Axes: frequency (Hz) and amplitude versus
time (ms).]
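Curves of the kind shown in Fig. 11 can be approximated numerically from a sampled waveform. The following Python sketch is one possible way to do so and is not the authors' program: it forms the analytic signal $f + j\hat{f}$ with a discrete Fourier transform, takes its magnitude as the instantaneous envelope, and differentiates its unwrapped phase to obtain the instantaneous frequency. The phase used here is $\tan^{-1}(\hat{f}/f)$, which is opposite in sign to $\phi(t)$ as printed in Eq. (9), so that the frequency of an ordinary tone comes out positive. All function names are hypothetical, the test tone is synthetic, and 256 samples are used instead of 1024 only to keep the naive transform fast.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for short records."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Return f + j*f_hat for a real sequence with an even number of samples."""
    n = len(x)
    if n % 2:
        raise ValueError("an even number of samples is expected")
    spec = dft(x)
    for k in range(1, n // 2):
        spec[k] *= 2.0          # double the positive-frequency bins
    for k in range(n // 2 + 1, n):
        spec[k] = 0.0           # discard the negative-frequency bins
    return dft(spec, inverse=True)

def envelope_and_frequency(x, dt):
    """Instantaneous envelope (same units as x) and frequency in Hz."""
    z = analytic_signal(x)
    envelope = [abs(v) for v in z]   # |A(t)|, i.e. negative portions reversed in sign
    phase = [cmath.phase(z[0])]
    for v in z[1:]:
        step = cmath.phase(v) - phase[-1]
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))  # unwrap
        phase.append(phase[-1] + step)
    freq = [(phase[i + 1] - phase[i]) / (2.0 * math.pi * dt)
            for i in range(len(phase) - 1)]
    return envelope, freq

if __name__ == "__main__":
    dt = 50e-6                       # sampling time quoted in the text
    n = 256
    tone_hz = 10.0 / (n * dt)        # exactly 10 cycles in the record
    x = [math.cos(2.0 * math.pi * tone_hz * i * dt) for i in range(n)]
    env, freq = envelope_and_frequency(x, dt)
    print(f"tone = {tone_hz:.2f} Hz, envelope[0] = {env[0]:.3f}, frequency[0] = {freq[0]:.2f} Hz")
```

For a steady tone the envelope is constant and the instantaneous frequency equals the tone frequency; for a transient such as S2 both quantities vary with time, which is the behaviour discussed in Section 5.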
`
`7 REFERENCES
`
[1] H. D. Harwood, "Audibility of Phase Effects in Loudspeakers," Wireless World (1976 Jan.).
[2] B. B. Bauer, "Audibility of Phase Distortion," Wireless World (1974 Mar.).
[3] R. C. Cabot, M. G. Mino, D. A. Dorans, I. S. Tackel, and H. E. Breed, "Detection of Phase
Shifts in Harmonically Related Tones," J. Audio Eng. Soc., vol. 24, pp. 568-571 (1976 Sept.).
[4] H. von Fleischer, "Gerade wahrnehmbare Phasenänderungen bei Drei-Ton-Komplexen," Acustica,
vol. 32, p. 44 (1975).
[5] H. von Fleischer, "Über die Wahrnehmbarkeit von Phasenänderungen," Acustica, vol. 35,
p. 202 (1976).
[6] J. Blauert, "Group Delay Distortions in Electroacoustical Systems," J. Acoust. Soc. Am.,
vol. 63, p. 1478 (1978).
[7] R. Berkovitz and B. E. Edvardsen, "Phase Sensitivity in Music Reproduction," presented at
the 58th Convention of the Audio Engineering Society, New York, 1977 November 4-7.
[8] R. C. Heyser, "Loudspeaker Phase Characteristics and Time Delay Distortion," J. Audio Eng.
Soc., vol. 17, p. 130 (1969).
[9] C. L. Searle, J. Zachary Jacobson, and S. G. Rayment, "Stop Consonant Discrimination Based
on Human Audition," J. Acoust. Soc. Am., vol. 65, p. 799 (1979).
[10] J. B. Thomas, Statistical Communication Theory (Wiley, New York, 1969).

The authors' biographies appeared in the July/August issue.
`
`