Article in Journal of Voice · April 1996
DOI: 10.1016/S0892-1997(96)80019-1
Journal of Voice
Vol. 10, No. 1, pp. 59-66
© 1996 Lippincott-Raven Publishers, Philadelphia

Differences in Voice Quality Between Men and Women:
Use of the Long-Term Average Spectrum (LTAS)

Elvira Mendoza, Nieves Valencia, Juana Muñoz, and *Humberto Trujillo

Department of Personality, Evaluation and Psychological Treatment, and *Area of Methodology of the Behavioral Sciences, Department of Social Psychology, University of Granada, Granada, Spain
Summary: The goal of this study was to determine whether there are acoustical differences between male and female voices and, if so, exactly where these differences lie. Extended speech samples were used. Recorded readings of a text by 31 women and 24 men were analyzed by means of the Long-Term Average Spectrum (LTAS), extracting the amplitude values (in decibels) at intervals of 160 Hz over a range of 8 kHz. The results showed a significant difference between genders, as well as an interaction of gender and frequency level. The female voice showed greater levels of aspiration noise, located in the spectral regions corresponding to the third formant, which causes the female voice to have a more "breathy" quality than the male voice. The lower spectral tilt in the women's voices is another consequence of this greater aspiration noise. Key Words: Long-Term Average Spectrum; Voice quality; Gender differences; Breathiness; Aspiration noise; Spectral tilt.
`
Accepted March 15, 1995.
Address correspondence and reprint requests to Dr. Elvira Mendoza at Facultad de Psicología, Campus Universitario de Cartuja, 18071 Granada, Spain.

The ability of the human ear to identify an individual's gender on the basis of voice quality, regardless of linguistic content, has been discussed previously by various investigators (1,2). Yet the perceptual parameters and strategies used to discriminate between male and female voices are not well understood. O'Kane (1) believes that this discrimination is performed routinely by human listeners by extracting a limited number of perceptual cues; these may include various sociological factors such as cultural stereotyping. However, Murry and Singh (3) have suggested that listeners are able to distinguish a speaker's gender on the basis of such acoustic characteristics as stress and pitch levels, in addition to nasality versus hoarseness in male and female voices, respectively.

In speech studies involving gender identification, the acoustic correlates usually submitted to the judgment of listeners have been related to a set of laryngeal and supralaryngeal parameters. Regarding the laryngeal variables, the importance given to the fundamental frequency (F0) as an indicator of the speaker's sex is noteworthy (4-7). A woman's pitch has a higher frequency value than a man's, although the absolute size of this difference is in question. Depending on the study, women's pitch is higher by as much as 0.45 times (8) to 1.7 times (9) and even to an octave (10). Given that an inverse relation exists between the mean F0 and the length of the membranous vocal fold, the physiological substratum appears to reside in the greater length of the male vocal folds (11). Daniloff et al. (12) stated, "An individual's modal frequency is governed in large part by the physical size, shape, and mass of vocal folds and larynx. Also in part, our vocal habits and training accustom us to select a frequency range that is comfortable, so that modal frequency is the result of a compromise between personal habit and optimum mechanical buzz frequency" (pp. 203-4). Nevertheless, some sociolinguistic studies have suggested that these differences in voice quality across sexes may be due more to sociocultural than to physiological factors, since listeners are able to distinguish between male and female voices even when the speakers are children (i.e., when the speaker's laryngeal physiology may be identical across sexes) (13).
Regarding variables of vocal tract resonance (VTR), research is even scarcer. Early research considered that the vocal tract's contribution to the perception of the speaker's gender lay in the formant frequencies (14). Bladon (15) found that the vowels produced by men presented narrower formant bandwidths with a less profound drop (a flatter profile) in the spectrum than the vowels produced by women. Others have suggested that the amplitude of the first harmonic relative to the second is greater in the female voice than in the male voice (9,16). However, this difference may come from the interaction between F0 and harmonic structure. Klatt and Klatt (9) have suggested that the voice differentiation between the sexes comes from the generation of noisier aspiration in women's larynxes than in men's. These greater levels of aspiration noise, centered in the high-frequency spectral regions corresponding to the third formant, give the female voice a more "breathy" quality than the male voice (17). As a consequence of this aspiration noise at high frequencies, the source spectrum can be expected to have a lower spectral tilt, because increasing the aspiration noise in the region of the third formant makes the general spectral tilt shallower. Löfqvist and Mandersson (18), using the Long-Term Average Spectrum (LTAS) as an analytic procedure, determined this overall tilt of the source spectrum through the ratio of energy between 0-1 and 1-5 kHz. However, Klatt and Klatt (9) established a greater spectral drop in "breathy" voice down to ~2 kHz (-18 dB/octave in breathy vowels compared with -12 dB/octave in laryngealized and modal vowels), with the aspiration noise maintained from that frequency onwards. The differences between the types of spectral drop proposed by these authors are probably due to the fact that Löfqvist and Mandersson (18) centered their investigation on pathological voices rather than on normal voices, as in the Klatt and Klatt study (9).
Despite the great body of knowledge that has progressively accumulated concerning the differences between the anatomy, physiology, and acoustics of male and female voices, very few attempts have been made to classify both types of voices by means of objective acoustic measures, with the exception of the studies by Childers et al. (19,20), Childers and Wu (21), and Wu and Childers (2). Following the line of these investigators, we tried to discover the acoustical differences between male and female voices by means of mid- to long-term averaging techniques such as the LTAS. This type of analysis has proven to be most valuable as an averaging measure because it looks at long speech segments and disregards linguistic content.
Moreover, very few studies base male and female voice differentiation on long-term averaging measures. Tarnóczy and Fant (22) compared the spectra of male and female Hungarian, Swedish, and German speakers with the objective of studying the differences in the LTAS due to variations among these languages. Although the results were confusing with respect to the main objective, Tarnóczy and Fant (22) were able to detect differences across sexes in the different languages. These differences centered in the 0.7-1.5 kHz range for male speakers and in the 1-2 kHz range for female speakers. The differences between the sexes were greater than expected.
In another study, Schlorhaufer et al. (23) compared the LTAS of different genders and age groups. Five men, five women, and five children, all German speakers, were studied. Although the spectra of these subject groups demonstrated differences, the researchers did not attempt to quantify these differences. Wu and Childers (2) conducted a study aimed at establishing different templates for both sexes, stating that gender information should be invariant, phoneme independent, and speaker independent for a given gender. They added that these conditions can be better ensured by employing long-term averaging measures. Following their suggestions, we believe that averaging measures, such as LTAS, emphasize the information specific to each subject's gender.
Improvement of the systems of synthesis of the female voice has been one of the major goals of previous studies using methodological procedures similar to ours. That is the reason for studying in depth the objective acoustic differentiation between male and female voices. Given that the majority of current voice synthesizers function with male voices, it is difficult to obtain voice synthesis of women's or children's voices with an acceptable level of naturalness. According to Titze (11), this may be due to the fact that the main parameter utilized in the generation of synthesized voices has been the F0. However, the differential synthesis of male and female voices implies much more than a mere scaling of F0, and some basic differences in the phonatory and articulatory mechanisms need to be considered. Titze's suggestion leads one to believe that a great advance in the acoustic differentiation of male and female voices is required in order to improve the current systems of analysis and synthesis of both genders' voices. Klatt and Klatt (9) stated that the principal difficulty in achieving this objective stemmed from the diversity of acoustic indexes employed in the majority of the studies.
Acoustic phonetics has established the frequency distribution of formants as the relevant variable in the identification of sounds, generally doing so by specifying a formant's frequency by its central value. The determination of these central values becomes more difficult as the fundamental frequency increases. Because the F0 is higher in the voices of women and children, the systems of analysis lose resolution, which impedes the evaluation of the formant frequency points in these cases (24). Furthermore, the informal observations of Klatt (17) suggest that the vocal spectra obtained from female voices do not completely conform to the all-pole model, possibly because of their tracheal joint and source-filter interactions. Titze (11) questions whether the source-filter theory of speech production would have followed the same development if the earlier models had been based on female voices.
In the present study, we first attempted to determine, by means of the LTAS, if a spectral profile characteristic of a speaker's gender exists and, if so, to delineate the existing differences between male and female voice profiles obtained by this method. Second, we sought to demonstrate that the differences between both types of voices can be attributed to the existence of aspiration noise in the spectral regions corresponding approximately to the third formant. As described earlier (9), it is believed that this causes female voices to be emitted with a more "breathy" quality than that of male voices.
`
METHOD

Subjects
Fifty-five subjects (24 men and 31 women), whose ages ranged from 20 to 50 years (with an average of 28 and 30 years, respectively), participated voluntarily in this study. All subjects were native Spanish speakers. None had a history of speech or auditory problems, and none suffered from colds or respiratory infections during their involvement in the study. The voices of all subjects were determined to be normal (nondysphonic) by two expert speech-language pathologists.
`
Experimental task
The experimental task involved the reading of a standard text, taken from the Spanish translation of Lewis Carroll's Alice in Wonderland, which lasted ~3 minutes and was composed of three paragraphs. The subjects were instructed to read the text in their natural voice and at a normal speed. All recordings took place in a soundproof room at the University of Granada's Voice Laboratory. The recording microphone was held at a distance of 20 cm from the mouth, in order to avoid possible aerodynamic interferences (25).
`
Apparatus
The recording was performed with an AKG D 222 EB microphone with a flat response and a SONY 77 ES Digital Audio Tape (DAT) recorder with a sampling frequency of 48 kHz, keeping the recording level of the DAT between -30 and -20 dB. The voice samples were fed via a direct connection into a DSP Sona-Graph, model 5500 (Kay Elemetrics), and were analyzed with the LTAS portion of the Voice Analysis Program. LTAS calculates a power magnitude spectrum across the frequency range of the input signals. LTAS differs from the power spectrum in that it includes only voiced segments and continuously averages the input signal for 30-90 s. The advantage of screening out unvoiced signals is that they may corrupt the average of the voiced segments and can mask the information of the voice source (18). The LTAS program screens the input signal for voiced information based on simple zero-crossing and energy criteria (26). When a segment is judged to be voiced, its discrete power spectrum is added to the accumulated average. The program does not include a segment's spectrum if voicing is in doubt.
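The screening-and-averaging procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the algorithm of the Voice Analysis Program: the function names, the non-overlapping 128-sample frames, and the energy and zero-crossing thresholds are all hypothetical, and a naive discrete Fourier transform stands in for whatever transform the instrument applies.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0 .. N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def is_voiced(frame, energy_floor, max_zero_crossings):
    """Crude voicing decision: enough energy and few zero crossings.

    Both thresholds are assumptions chosen for illustration only.
    """
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return energy >= energy_floor and crossings <= max_zero_crossings


def ltas_db(signal, frame_size=128, energy_floor=1e-4, max_zero_crossings=40):
    """Average the power spectra of voiced frames; return (dB values, frames used)."""
    total = [0.0] * (frame_size // 2 + 1)
    used = 0
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        if not is_voiced(frame, energy_floor, max_zero_crossings):
            continue  # leave out frames where voicing is in doubt
        for k, p in enumerate(power_spectrum(frame)):
            total[k] += p
        used += 1
    if used == 0:
        raise ValueError("no voiced frames found")
    # Small constant avoids log10(0) for empty bins.
    return [10 * math.log10(t / used + 1e-12) for t in total], used
```

Silent frames fail the energy test and noise-like frames fail the zero-crossing test, so only frames that pass both contribute to the running average.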
The following settings were selected for the analysis: a frequency range of 8 kHz, input shaping set to FLAT, maintenance of the memory's channel at 38 s, a transform size of 128 points, a channel sensitivity of 45 dB, and the AC-coupled option.
Acoustic analysis
The acoustic analysis was conducted with the second paragraph of the text in order to avoid any influence of possible vacillations at the beginning of the reading and any fall in intensity or intonation at the end of the text.
The analysis was performed on the amplitude values, in decibels, at intervals of 160 Hz, thus obtaining a total of 50 measurements for each subject, one for each of the frequency levels in the total range of 8 kHz (0.160, 0.320, 0.480, 0.640 ... 8 kHz).
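As a quick check of the arithmetic, a 160 Hz step over an 8 kHz range does yield 50 frequency levels. The short Python sketch below enumerates them; the variable names are illustrative only.

```python
STEP_HZ = 160     # spacing between analyzed frequency levels
RANGE_HZ = 8000   # upper limit of the analyzed range

# Frequency levels in kHz: 0.16, 0.32, ..., 8.0
levels_khz = [round(k * STEP_HZ / 1000, 2)
              for k in range(1, RANGE_HZ // STEP_HZ + 1)]
```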
`
RESULTS

To evaluate possible differences in the distribution of energy between the male and female spectra and to assess at what frequencies they may exist, a 2 × 50 analysis of variance (ANOVA) was conducted. The factors were gender (G), at two levels (male and female), and frequency level (L), in kHz, at 50 levels. The amplitude, measured in decibels, was analyzed as the dependent variable. Figure 1 presents the means at each frequency level. The results of the ANOVA are shown in Table 1. A significance level of 0.05 for the gender factor and a level of 0.001 for the frequency level factor were used. As Table 1 shows, there was a significant main effect for the gender factor [F(1,53) = 6.678; p < 0.013]. Likewise, the main effect for the frequency level factor was significant [F(49,2597) = 1509.978; p < 0.001]. A significant effect was also seen for the interaction between these two factors, G × L [F(49,2597) = 9.336; p < 0.001].
Given the first objective of the study, the significant interaction between gender and frequency level was analyzed with a one-way ANOVA for each frequency level. The results indicated that the spectral amplitude of women's voices is greater (p < 0.001) at the following frequency levels: 0.8, 0.96, 2.88, 3.04, 4.16, 4.32, 4.48, 4.64, 4.80, and 4.96 kHz.
`
[Figure 1: line plot of mean amplitude against frequency level (kHz) for males and females; the frequency axis is marked from 0.16 to 7.36 kHz.]
FIG. 1. Graphic representation of the mean values of amplitude (in decibels) corresponding to female and male voices at each frequency level analyzed (in kilohertz).
`
TABLE 1. Results of analysis of variance for the mixed factorial design G × (L), G being the gender factor (men and women), manipulated between subjects, and L the frequency level factor (50 levels), manipulated within subjects. The dependent variable is the amplitude in decibels

Source            Sum of squares   Degrees of freedom   Mean square   F
Gender (G)               584.547                    1       584.547       6.678 (a)
Error                  4,638.928                   53        48.135
Level (L)            381,473.979                   49     7,785.183   1,509.978 (b)
Level × Gender         2,358.630                   49        48.135       9.336 (b)
Error                 13,389.684                2,597         5.156

(a) p = 0.013. (b) p < 0.001.
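The F ratios in Table 1 can be rechecked from the sums of squares and degrees of freedom, since each mean square is a sum of squares divided by its degrees of freedom and each F is an effect mean square divided by the matching error mean square. The Python sketch below does this arithmetic; the helper name is illustrative. One caveat: the between-subjects error mean square implied by 4,638.928 / 53 is about 87.53, which reproduces the reported F for gender, whereas the table as extracted lists 48.135 in that cell. Whether that is a typesetting or an extraction slip cannot be determined from this text.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


# Values taken from Table 1.
ms_gender = mean_square(584.547, 1)
ms_error_between = mean_square(4638.928, 53)    # about 87.53
ms_level = mean_square(381473.979, 49)
ms_interaction = mean_square(2358.630, 49)
ms_error_within = mean_square(13389.684, 2597)

# Between-subjects effect is tested against the between-subjects error;
# within-subjects effects are tested against the within-subjects error.
f_gender = ms_gender / ms_error_between
f_level = ms_level / ms_error_within
f_interaction = ms_interaction / ms_error_within
```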
`
Later, a discriminant analysis using amplitude as the criterion factor and the frequency levels as the predictor factor was conducted to assess whether all the frequency levels were equally important in the differentiation of voices across gender. The results of this analysis are presented in Table 2. As seen, the frequency levels included in the gender discrimination equation are 0.96, 1.44, 1.92, 3.04, 3.20, 3.36, and 8 kHz. The discriminant function classified 100% of the subjects in this study correctly.
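To make the idea of a discriminant function with jackknifed (leave-one-out) classification concrete, the following Python sketch fits a two-group Fisher linear discriminant on two features and then reclassifies each case with a function fitted without it. It is a simplified stand-in, not the stepwise procedure reported in Table 2: the two-feature restriction, the function names, and the amplitude values are all hypothetical and are not data from this study.

```python
def mean_vec(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_cov_2d(group_a, group_b):
    """Pooled within-group covariance matrix for two-feature data."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in row] for row in s]


def fit_lda(group_a, group_b):
    """Return (weights, threshold); a score above the threshold means group A."""
    ma, mb = mean_vec(group_a), mean_vec(group_b)
    (a, b), (c, d) = pooled_cov_2d(group_a, group_b)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]


def classify(w, threshold, row):
    return "A" if w[0] * row[0] + w[1] * row[1] > threshold else "B"


def jackknife_accuracy(group_a, group_b):
    """Leave-one-out: refit without each case, then classify that case."""
    correct = 0
    for label, rows, other in (("A", group_a, group_b), ("B", group_b, group_a)):
        for i, row in enumerate(rows):
            rest = rows[:i] + rows[i + 1:]
            w, t = fit_lda(rest, other) if label == "A" else fit_lda(other, rest)
            correct += classify(w, t, row) == label
    return correct / (len(group_a) + len(group_b))


# Hypothetical amplitudes (dB) at two frequency levels; group A = women, B = men.
women = [(-30.0, -41.0), (-31.5, -40.0), (-29.0, -42.5), (-30.5, -43.0), (-32.0, -41.5)]
men = [(-36.0, -48.0), (-37.5, -47.0), (-35.0, -49.5), (-36.5, -50.0), (-38.0, -48.5)]
accuracy = jackknife_accuracy(women, men)
```

Because each case is classified by a function that never saw it, the jackknifed rate is a less optimistic estimate than the rate obtained by reclassifying the cases used to fit the function.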
TABLE 2. Discriminant analysis utilizing amplitude as the criterion factor and the frequency levels as the predictor factor

Variable   F to enter/remove   U statistic   Approximate F statistic   Degrees of freedom
F096                   7.326        0.1468                    56.958   5, 49
F144                   7.156        0.1613                    50.942   5, 49
F192                   4.031        0.1849                    55.103   4, 50
F304                  44.916        0.5413                    44.916   1, 53
F320                   5.321        0.1105                    54.024   7, 47
F336                   4.567        0.1214                    48.610   7, 47
F800                   4.949        0.1332                    52.081   6, 48

Classification matrix (no. of cases classified into group)
Group    Percent correct   Women   Men
Women              100.0      31     0
Men                100.0       0    24
Total              100.0      31    24

Jackknifed classification (no. of cases classified into group)
Group    Percent correct   Women   Men
Women              100.0      31     0
Men                100.0       0    24
Total              100.0      31    24

To evaluate whether female voices presented greater levels of aspiration noise in the spectral regions corresponding to the third formant and a lower spectral tilt than male voices, as Klatt and Klatt (9) suggested, the energy concentration at the level of the third formant and the overall tilt of the source spectrum were analyzed. The amplitude at these frequency points had previously been examined, showing significantly higher values for female voices. In analyzing the overall tilt of the source spectrum, the ratio of energy between 0-1 kHz and 1-5 kHz was calculated, as suggested by Löfqvist and Mandersson (18). The results indicated that this ratio is greater among male speakers (mean = 5.215; SD = 1.286) than among females (mean = 4.565; SD = 0.731). These differences are statistically significant [F(1,53) = 5.600; p < 0.022].
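The reported F value can be approximately recovered from the group sizes, means, and standard deviations alone. The Python sketch below computes a one-way ANOVA F for two groups from summary statistics; it assumes the reported SDs are sample standard deviations (n - 1 in the denominator), and the function name is illustrative. Under that assumption the result is about 5.60 on 1 and 53 degrees of freedom, in line with the value given above up to rounding of the published means and SDs.

```python
def one_way_f_two_groups(n1, mean1, sd1, n2, mean2, sd2):
    """One-way ANOVA F for two groups, from summary statistics.

    Assumes sd1 and sd2 are sample standard deviations (n - 1 denominator).
    Returns (F, error degrees of freedom); the effect has 1 degree of freedom.
    """
    grand_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_within = n1 + n2 - 2
    return ss_between / (ss_within / df_within), df_within


# Men: n = 24, mean = 5.215, SD = 1.286; women: n = 31, mean = 4.565, SD = 0.731.
f_value, df_error = one_way_f_two_groups(24, 5.215, 1.286, 31, 4.565, 0.731)
```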
`
DISCUSSION

The results of this study showed that (a) significant differences were present between genders in the distribution of energy throughout the analyzed frequency values, taken from voice samples. This is reflected in the interaction effects of the gender and the frequency level factors found in the ANOVA. (b) Significant differences were not found at all of the spectrum's frequency levels, but rather were concentrated in the frequencies between 0.80 and 5 kHz, particularly at 0.96, 1.44, 1.92, 3.04, 3.20, and 3.36 kHz. According to the results of the discriminant analysis, this is the spectral region that best differentiates the speaker's gender. (c) The spectra corresponding to women's voices showed a lower overall tilt; this was found in the ratio of 0-1/1-5 kHz. (d) The LTAS, as an average measure of continuous voice signals, is a useful instrument for detecting these sex-related differences and for determining the spectral regions where such differences are centered.
From the results of the discriminant analysis, it is seen that the frequency points of 0.96, 1.44, 1.92, 3.04, 3.20, 3.36, and 8.00 kHz are most important in voice quality differentiation. Of these frequency points, those at 3.04, 3.20, and 3.36 kHz are located in the spectral regions near the third formant, and the higher values correspond to the female voices. These results agree with the proposal of Klatt and Klatt (9) that the acoustic characteristics of female voices lead to a "breathier" quality than in male voices. These authors, as indicated in the introduction, suggest that this quality can be explained by a longer opening and the presence of a posterior opening between the vocal folds, which would generate aspiration noise in the region of the third formant.
Klatt and Klatt (9) locate another consequence of these physioanatomical characteristics in the lower spectral tilt, because of the greater concentration of aspiration noise at higher frequencies. Löfqvist and Mandersson (18) indicate a way of quantifying the general spectral tilt via LTAS. They determined the energy drop in the spectra of hyperfunctional voices by the ratio 0-1/1-5 kHz. The present study has confirmed that male and female voices differ in this ratio. As the lower values were registered in the female voices, a shallower general tilt was seen in this group. However, as seen in Fig. 1, the spectral tilt in women's voices is greater by >1.60 kHz. This means that the general lowering of the spectral tilt (up to 5.0 kHz) is due to a greater concentration of energy in the higher frequencies (1.60-5.0 kHz), in line with Klatt and Klatt (9). These results suggest that the spectral tilt ratio should locate the cutoff point between high and low frequencies at 1.60 kHz (0-1.60/1.60-5.0) instead of at 1 kHz, as proposed by Löfqvist and Mandersson (18). We believe that the different cutoff point put forth by the latter authors is due to their having attempted to distinguish between hyper- and hypofunctional voices, whereas this study's subjects presented with no vocal pathology.
An ANOVA was conducted with this new cutoff point, 1.60 kHz, to see whether this value could establish clearer differences between male and female voices. With this cutoff, the statistical significance increased [F(1,53) = 9.023; p < 0.004].
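A band-energy ratio of the kind discussed above, with an adjustable cutoff, might be computed from LTAS samples as in the Python sketch below. This is one plausible reading only: the text does not specify how the band energies were summed or in what units the ratio was expressed, so the conversion from decibels to linear power, the result in decibels, the function name, and the test spectra are all assumptions.

```python
import math


def band_energy_ratio_db(levels_khz, amplitudes_db, cutoff_khz=1.0, upper_khz=5.0):
    """Ratio (in dB) of summed power at or below the cutoff to summed power
    above the cutoff and up to upper_khz, from LTAS samples given in dB."""
    low = high = 0.0
    for freq, amp_db in zip(levels_khz, amplitudes_db):
        power = 10 ** (amp_db / 10)  # convert dB back to linear power
        if freq <= cutoff_khz:
            low += power
        elif freq <= upper_khz:
            high += power
    return 10 * math.log10(low / high)


# Hypothetical spectra sampled every 160 Hz up to 8 kHz (not data from the study).
levels = [round(k * 0.16, 2) for k in range(1, 51)]
flat = [0.0 for _ in levels]            # no tilt at all
falling = [-3.0 * f for f in levels]    # amplitude drops 3 dB per kHz

ratio_flat_1k = band_energy_ratio_db(levels, flat, cutoff_khz=1.0)
ratio_flat_1k6 = band_energy_ratio_db(levels, flat, cutoff_khz=1.6)
ratio_falling_1k = band_energy_ratio_db(levels, falling, cutoff_khz=1.0)
```

A spectrum that falls more steeply with frequency gives a larger ratio, which is the sense in which a lower ratio corresponds to a lower overall tilt.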
According to the discriminant analysis, another frequency point, 8.0 kHz, exists that indicates differences between the speaker groups. In all likelihood, another procedure of acoustical analysis is necessary to further investigate this question.
The existence of noise energy at frequencies >8.0 kHz has already been studied. Shoji et al. (27) studied the energy present at frequencies >8.0 kHz in vowel emission by normal subjects. These authors detected significant differences in the energy distribution between the vowels /a/ and /u/. Following a methodology similar to that of Shoji et al., we discovered differences in the configuration of the spectral energy in the regions ranging from 6-10 kHz and 10-16 kHz between vowels, and between dysphonic and nondysphonic speakers (28). We believe that the investigation of the differentiation established by the discriminant analysis at the frequency point 8.0 kHz should move in this direction. Nevertheless, as LTAS requires a great amount of memory when dealing with long speech segments, our currently available equipment does not allow the study of the spectral zone >8.0 kHz using LTAS.
Our results agree with those of Klatt and Klatt (9) regarding the presence of greater aspiration noise in the region of the third formant in the female voice, noise that causes, according to these authors, the female voice to present a "breathier" quality than the male voice. This quality may possibly be due to learning/imitation of models and perhaps restricted to American women. The existence of similar effects in the results of analysis of the speech of Spanish women indicates that this characteristic may not be restricted exclusively to one female nationality subject group. It would be necessary to study this particular aspect in various other subject groups before generalizing this finding.
The differences in the methodological procedures in this study and in previous studies make it difficult to compare results. The materials that have served as stimuli in the acoustical differentiation of the speaker's sex have consisted of syllables (29), sustained vowels (30), and vowels in syllabic contexts, as well as prolonged voiced and unvoiced fricatives (2,20). The VTR parameters used most frequently in these studies have been the frequency, amplitude, and bandwidth of the first four formants. However, using a long-term averaged spectrum, such as LTAS, one cannot affirm that the points of greater amplitude that appear along this spectrum correspond to formant values, as these are relative to specific sounds. In addition, the procedures employed in the earlier studies differ from those of the present study: electroglottography, inverse filtering, spectral analysis, and linear predictive coding (LPC) analysis prevail in the literature, whereas this study used the LTAS. Nevertheless, despite the procedural and analytical differences between previous studies and the present research, the results of this study coincide with those found by other authors, and in our case with speech samples that were natural and independent of phonetic content.
With the data obtained in this study, we intended to identify the acoustical physiological relations in the human voice. The existing body of knowledge on LTAS does not permit the identification of these relations, nor does this necessarily have to be the final goal of acoustical investigation. Our intention has been to contribute a model with which to compare, using sufficient statistical evidence, the profile of the spectral energy's distribution in male and female voices averaged on a long-term basis. It was our intention to contribute significant evidence that would aid in improving the current systems of synthesis and recognition of women's voices.
The determination of a spectral area, corresponding approximately to the third formant, that is particularly sensitive to the differential establishment of male and female voice models can be seen as one of the more important contributions of this study. According to the data, it is this area of the spectrum that presents a significantly different profile in the two sexes and toward which more investigative efforts should be directed. Future research might use a perceptual validation instrument in looking at spectral representations for the two voice groups. This might include, for example, first masking or filtering out the spectral regions that are irrelevant in voice identification before having listeners decide whether a particular LTAS sample corresponds to a male or female voice.
To conclude, different profiles of energy distribution in the spectrum can be established for male and female voices, and these differences, apparently, are due to the presence of greater aspiration noise in the women's voices. This causes the female voice, in contrast to the male voice, to present a "breathier" quality. Because of this, the spectral tilt in women's voices is smaller than that in men's voices. Finally, the LTAS is a technique that is sufficiently sensitive for detecting these differences.
`
Acknowledgment: This study was supported by DGICYT (Dirección General de Investigación Científica y Técnica), Ministerio de Educación y Ciencia (Spain), Project PS93-0203.
`
`REFERENCES
`
`1. O'Kane M. Recognition of speech and recognition of
`speaker sex: parallel or concurrent processes? J Acoust Soc
`Am 1900;82 (suppl 1):S84.
`
`10.
`
`il.
`
`12.
`
`13.
`
`14.
`
`15.
`
`16.
`
`17.
`
`18.
`
`. Wu K, Childers DG. Gender recognition from speech. Part I.
`Coarse analysis. J Acoust Soc Am 1991;90:1820—40.
`. Murry T, Singh S. Multidimensional analysis of male and
`female voices. J Acoust Soc Am 1980;68: 1294-300.
`. Henton CG. Tact andfiction in the description of female and
`male pitch. J Acoust Soc Am 1987;82 (supp! 1):S91.
`. Hollien H, Malzik E. Evaluation of cross-sectional studies of
`adolescent voice changes in males. Speech Monograph
`1967 ;34:80-4.
`. Saxman J, Burk K. Speaking fundamental frequency char-
`acteristics of middle-aged female. Folia Phoniatr (Basel)
`1967519: 167-72.
`. Stoicheff M. Speaking fundamental frequency characteris-
`tics of non-smoking female adults. J Speech Hear Res 1981;
`24:437-41.
`. Monsen RB, Engebretson AM. Study of variations in the
`male and female glottal wave. J Acoust Soc Am 1977;62:981-
`93.
`. Klatt DH, Klatt LC. Analysis, synthesis and perception of
`voice quality variations among female and male talkers. J
`Acoust Soc Am 1990;87:820-57.
`Linke CE. A study of pitch characteristics of female and
`their relationship to vocal effectiveness. Folia Phoniatr
`(Basel) 1973;25:173-85.
`Titze IR. Physiological and acoustic differences between
`male and female voices. J Acoust Soc Am 1989;85: 1699-707.
`Daniloff R, Schuckers G, Feth L. The physiology of speech
`and hearing. An introduction. Englewood Cliffs, NJ: Pren-
`tice-Hail, 1980.
`Woods N, College L. It’s not what she says. It’s the way that
`she says it: The influence of speaker-sex on pitch andinto-
`national patterns. Res Speech Percept Indiana University
`1992;18 (Progress Report):84-95.
`Coleman RO. A comparison of the contributions of two
`voice quality characteristics to the perception of maleness
`and femaleness in the voice. J Speech Hear Res 1976;19:
`168-80.
`Bladon A. Acoustic phonetics, auditory phonetics, speaker
`sex and speech recognition: a thread. In: Fallside F, Woods
`A, eds. Computer speech processing. Englewood Cliffs, NJ:
`Prentice-Hall, 1983.
`HentonC, Bladon R. Breathiness in normal female speech:
`inefficiency versus desirability. Lang Commun 1985;5:
`221-7.
`Klatt DH. Detailed spectral analysis of a female voice. J
`Acoust Soc Am 1986;81 (suppl 1):S80.
`Léfqvist A, Manderson, B. Long-time average spectrum of
`speech and voice analysis. Folia Phoniatr (Basel) 1987;39:
`221-9.
`19. Childers DG, Wu K, Hicks DM. Factors in voice quality:
`acoustic features related to gender. Proceedings of the IEEE
`International Conference on Acoustics, Speech, and Signal
`Processing 1987;1:293-6.
`20. Childers DG, Wu K, Hicks DM, Yegnanarayana B. Voice
`conversion. Speech Commun 1989;8:147-58.
`21. Childers DG, Wu K. Gender recognition from speech. Part
`II. Fine analysis. J Acoust Soc Am 1991;90:1841-56.
`22. Tarnóczy T, Fant G. Some remarks on the average speech
`spectrum. Speech Transmission Laboratory. Quarterly
`Progress and Status Reports (Royal Institute of Technology,
`Stockholm) 1964;4:13-4.
`23. Schlorhaufer W, Miller WG, Hussl B, Scharfetter L.
`Energieverteilung und Dynamik bei der Mutationsfistelstimme
`im Vergleich zur Normalstimme. Folia Phoniatr (Basel)
`1972;24:7-18.
`24. O’Shaughnessy D, ed. Speech communication: human and
`machine. Woburn, MA: Addison-Wesley, 1987.
`25. Titze IR, Winholtz WS. Effect of microphone type and
`placement on voice perturbation measurements. J Speech
`Hear Res 1993;36:1177-90.
`26. Rabiner L, Schafer R, eds. Digital processing of speech
`signals. Englewood Cliffs, NJ: Prentice-Hall, 1978.
`27. Shoji K, Regenbogen E, Yu JD, Blaugrund E. High-
`frequency components of normal and dysphonic voices. J
`Voice 1991;5:29-35.
`28. Valencia N, Mendoza E, Mateo I, Carballo G.
`High-frequency components of normal and dysphonic voices. J
`Voice 1994;8:157-62.
`29. Nittrouer S, McGowan RS, Milenkovic PH, Beehler D.
`Acoustic measurements of men’s and women’s voices: a
`study of context effects and covariations. J Speech Hear
`Res 1990;33:761-75.
`30. Kuwabara H, Ohgushi K. Experiments on voice qualities of
`vowels in males and females and correlation with acoustic
`fe
