Horner et al.
`
`[54] DIGITAL VOICE DETECTION APPARATUS
`AND METHOD USING TRANSFORM
`DOMAIN PROCESSING
[75] Inventors: Robert W. Horner, Whittier; Khiem V. Cai, Brea; Ronald L. Bergen, Irvine; Keith A. Lane, Huntington Beach, all of Calif.
[73] Assignee: Hughes Aircraft Company, Los Angeles, Calif.
[21] Appl. No.: 555,114
[22] Filed: Jul. 19, 1990
`
[51] Int. Cl. ...
[52] U.S. Cl. ...
[58] Field of Search ... 381/42, 43, 45, 46, 47, 49, 50; 364/513.5; 367/135, 198, 199, 43
`
US005365592A
[11] Patent Number: 5,365,592
[45] Date of Patent: Nov. 15, 1994

[56] References Cited
U.S. PATENT DOCUMENTS

3,566,035 2/1971 Noll et al. ... 381/41
4,058,676 11/1977 Wilkes et al.
4,219,695 8/1980 Wilkes et al.
4,829,574 5/1989 Dewhurst et al. ... 381/43 X
4,884,247 11/1989 Hadidi et al. ... 367/43

OTHER PUBLICATIONS
"Cepstrum Pitch Determination", The Journal of the Acoustical Society of America, vol. 41, 1967, pp. 293-309, A. M. Noll.
"Digital Signal Processing", Oppenheim & Schafer, Prentice-Hall, 1975, pp. 512-519.
Primary Examiner-Tod R. Swann
Attorney, Agent, or Firm-W. K. Denson-Low

[57] ABSTRACT
A waveform characterizer apparatus is disclosed for extracting cepstrum pitch and spectral properties of a waveform signal such as the baseband audio output of a receiver. The apparatus employs Fourier processing, cepstral processing, magnitude detection, logarithmic processing, frequency selective filtering and time/frequency windowing to extract cepstrum pitch and spectral rolloff characteristics which can then be used to determine the signal type. One application of the invention is in a digital voice/squelch apparatus.
`
`27 Claims, 8 Drawing Sheets
`
`
[Drawing Sheets 1 to 8: FIGS. 1 through 10. Legible axis labels include time in seconds (FIG. 2A), frequency from 1000 to 4000 (FIG. 2B), value versus frequency (FIGS. 4 and 6) and amplitude versus time (FIGS. 5A and 5B).]

`DIGITAL VOICE DETECTION APPARATUS AND
`METHOD USING TRANSFORM DOMAIN
`PROCESSING
`
`BACKGROUND OF THE INVENTION
The present invention relates to voice communication systems, and more particularly to a technique for detecting characteristics of a received signal in the frequency or transform domain to detect received voice signals.
Present voice detection (squelch) techniques use one of the following approaches:
1. "Zero crossings" of the received signal in the time domain are counted to determine the mean frequency, and the mean frequency is compared against 1 KHz to determine the existence of voice. This technique does not take advantage of the entire audio spectrum and has a high false alarm rate.
2. The cross-correlation of the voice signal with a tone is calculated to determine the pitch period. This technique is corrupted heavily by noise and is also time-consuming.
3. An out-of-band CW tone is used to allow the receiver to detect transmission. A disadvantage of this technique is that energy is spent on the CW tone, thus reducing the amount of power available for voice transmission. In addition, this technique requires the transmitter to send the CW tone and therefore it cannot be implemented in existing radios without circuit modification.

`
`SUMMARY OF THE INVENTION
In accordance with the invention, a waveform characterizer apparatus is disclosed for determining cepstrum pitch and spectral rolloff properties of an input signal waveform. The apparatus comprises means for digitizing the audio signal waveform to provide a digital waveform signal, and means for providing the cepstrum of the audio signal waveform. The apparatus further includes cepstral processing means for isolating the pitch period of the audio signal waveform as a single peak in the cepstrum located at the period of the signal and determining the peak pitch magnitude value, and means for determining the spectral rolloff of the audio signal waveform from the cepstrum of the audio signal waveform.
In a preferred embodiment, the means for providing the cepstrum of the audio waveform comprises means for transforming the digitized audio signal waveform into the frequency domain, such as an FFT, and means for deconvolving the impulse response and periodicity of the frequency domain signal to provide a deconvolved digital signal. The deconvolving means may be implemented by means for squaring the magnitudes of the transformed spectral data and performing a logarithm function on the squared data. The means for providing the cepstrum further comprises means for transforming the deconvolved digital signal back into the time domain to provide the cepstrum of the audio signal waveform.

`
`BRIEF DESCRIPTION OF THE DRAWINGS
These and other features and advantages of the present invention will become more apparent from the following detailed description of an exemplary embodiment thereof, as illustrated in the accompanying drawings, in which:

FIG. 1 illustrates a simplified block diagram of a waveform characterizer apparatus in accordance with the invention.
FIGS. 2A and 2B show, in the time domain and the frequency domain respectively, an exemplary voice waveform serving as the input signal to the waveform characterizer of FIG. 1.
FIG. 3 illustrates the overlapping of frame processing utilized by the system of FIG. 1.
FIG. 4 illustrates the signal waveform of the logarithm of the squared spectral data, i.e., the output of element 78 of FIG. 1.
FIG. 5A illustrates the cepstrum of the input signal computed by the system of FIG. 1; FIG. 5B shows the zeroing of all cepstral samples of the cepstrum of FIG. 5A except those between zero and T'.
FIG. 6 illustrates the frequency domain transformation of the smoothed cepstrum signal.
FIG. 7 is a simplified hardware block diagram of a digital voice squelch system embodying the invention.
FIG. 8 is a schematic block diagram further illustrative of the digital signal processor employed in the system of FIG. 7.
FIG. 9 is a block diagram of the analog signal circuit of the system of FIG. 7.
FIG. 10 is a simplified flow diagram illustrative of the operation of the system of FIG. 7.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
The baseband audio bandwidth output of a receiver system contains information transmitted from some other location. If detailed knowledge is available about the type of information being transmitted, the method of transmission used, and the time at which that signal is transmitted, then detection and timely processing of that information is straightforward. If, however, this information is not available, then the correct processing of the received signal is more difficult.
This invention comprises a technique that can be used to extract characteristics from a baseband signal and use these characteristics to determine the type of signal present in the receiver output. The technique can be used to detect the presence of a number of different types of modulated signals even when these signals are corrupted by noise. The invention uses Fourier processing, cepstral processing, magnitude detection, logarithmic processing, frequency selective filtering and time/frequency windowing to separate the signal into characteristics which can then be used to determine the signal type.
Most transmissions can be modelled as an impulse train convolved with some impulse response characteristic. Voice, for example, is generally modelled as a vocal cord excitation (a periodic impulse train) convolved with the impulse response of the vocal tract. This periodic impulse train can be detected by the use of a deconvolution technique known as the cepstrum. The result of the cepstrum is the separation of the impulse train characteristic from the impulse response characteristic of the system. The impulse train transforms into a single peak located at the pitch period of the signal, while the response characteristic transforms into the time domain response of the system. See, e.g., Digital Signal Processing, Oppenheim & Schafer, Prentice-Hall, 1975, at paragraph 10.7.1, pages 512-519.
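The separation described in the preceding paragraph can be written out as a short derivation. The symbols $x$, $e$, $h$ (received signal, excitation impulse train, impulse response), their transforms and the cepstra $c$ are notation assumed here for illustration only; they do not appear in the specification.

```latex
\begin{aligned}
x(t) &= e(t) * h(t) \\
X(f) &= E(f)\,H(f) \\
\log |X(f)|^{2} &= \log |E(f)|^{2} + \log |H(f)|^{2} \\
c_x(\tau) &= \mathcal{F}^{-1}\!\left\{ \log |X(f)|^{2} \right\} = c_e(\tau) + c_h(\tau)
\end{aligned}
```

The logarithm turns the product of the two spectra into a sum, so the inverse transform yields two additive terms: the excitation term concentrates at the pitch period and its multiples, while the response term occupies the low end of the $\tau$ axis.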
The present detection technique uses digital signal processing in the transform domain to detect and characterize RF or baseband signals such as voice, M-ary FSK or PSK. The characterization can then be used for verification of reception, tracking or demodulation.

Waveform Characterizer Procedure and Algorithm
A simplified block diagram of a waveform characterizer apparatus is shown in FIG. 1. The characterizer apparatus 50 comprises:
(a) circuitry for generating in-phase and quadrature components of an incoming signal, e.g., a signal received at antenna 52; in this embodiment this circuitry includes downconverting mixers 54 and 56, 90° phase shift device 58 and bandpass filters 62 and 64;
(b) analog-to-digital converters 66 and 68 for digitizing the in-phase and quadrature signals;
(c) memory devices 70 and 72 to store the digitized signal during analysis; in a preferred embodiment, the memory devices are random access memories;
(d) a time window function 74 for performing, e.g., a Hamming window;
(e) a forward fast Fourier transformer (FFT) 76 to transform the time domain digital signals into the frequency domain;
(f) a log function 78 to deconvolve the impulse response and periodicity of the signal;
(g) an inverse FFT 80 to transform the frequency domain signal into the cepstral time domain;
(h) a time window function 82 to remove the pitch period from the cepstrum;
(i) a forward FFT 84 to transform the cepstrum back into the frequency domain;
(j) a pitch detector 86 for detecting the pitch of the signal;
(k) a rolloff detector 88 responsive to the frequency domain, smoothed spectrum for detecting the spectral rolloff;
(l) a pitch and rolloff threshold estimator 90; and
(m) combining logic 92 responsive to the pitch and rolloff to detect voice.
The input audio signal, with bandwidth W, is analyzed for two properties, cepstrum pitch and spectral rolloff. An exemplary input signal voice waveform is illustrated in FIGS. 2A and 2B in both the time domain and the frequency domain. In operation, the waveform characterizer 50 works as follows. The input signal (FIG. 2) is downconverted, and in-phase and quadrature components are digitized by analog-to-digital converters (ADC) 66 and 68 at a sample rate Rs (higher than twice W to avoid aliasing). The samples are stored in memory (RAMs 70 and 72). The data is read out of RAMs 70 and 72 in blocks of N points (corresponding to a frame duration of T=N/Rs) and, after application of a Hamming window 74, the data is processed by cepstrum processor 75. First the data is transformed into the frequency domain using an N point FFT 76. The memory pointer is then shifted by N/2 and another N point block is processed. This N/2 overlapping allows more voicing decisions per second to be made while maintaining length N. This process is shown in FIG. 3.
The output of the FFT 76 is a list of complex numbers. The magnitude of each number a+ib is obtained as $(a^2+b^2)^{1/2}$; taking the square root is not important, since after the logarithm it is only a scaling. Thus, after an N point FFT is performed on the input data by FFT 76, the magnitude spectrum is calculated and a logarithm function 78 is performed on the spectral data (FIG. 4). The log function 78 deconvolves the combination of the impulse train and the impulse response in the frequency domain. An N point inverse FFT 80 is then performed on the logarithm output data, the resulting output being the cepstrum of the original input signal (FIG. 5A).
Cepstral processing isolates the pitch period of the input signal as a single peak in the cepstrum located at the period of the signal. This peak is analogous to an autocorrelation function. A pitch detector 86 locates the pitch peak A_T within a range τ of t1 to t2 and stores the peak magnitude value in memory. The values t1 and t2 are predetermined pitch periods which correspond to the minimum and maximum expected values for the signal in question. The maximum peak is located and the peak value A_T recorded. The peak values of K consecutive frames are then combined and the sum compared against a threshold value T1. The value of T1 is determined by the pitch and rolloff threshold estimator 90.
The audio spectrum is smoothed in the following manner. All cepstral samples except those between 0 and T' are removed by writing zeroes in that area of the cepstrum (FIG. 5B). This operation, performed by the time window function 82, removes the repetitive impulse component of the signal. A forward FFT 84 is then performed on the cepstrum to transform it back into the frequency domain. The result is a smoothed spectrum of the original input signal (FIG. 6).
In the rolloff detector 88, spectral rolloff is measured by taking the energy in two frequency bins, F1±Δf/2 and F2±Δf/2, where Δf is the frequency bin size, and comparing their relative magnitudes A_Ti and B_Ti. This is done by summing a range of data points around both frequencies. The difference in energy in the two bins, E(F1±Δf/2)-E(F2±Δf/2), is calculated. The values of K consecutive frames are combined and the result compared against a threshold value, T2, determined by estimator 90. This is accomplished by the following relationship:

$$\sum_{i=1}^{K} A_{T_i} - \sum_{i=1}^{K} B_{T_i} = \Delta\,\text{Energy}$$

Voice detection is indicated by the combine logic 92 if A_T is greater than or equal to T1 or if Δ Energy is greater than or equal to T2.

Example Design - Voice Squelch
A waveform characterizer in accordance with the invention can be employed, for example, in a receiver voice/squelch system. Human voice contains several unique properties which can be used to distinguish it from background noise and interfering signals. A typical voice waveform is shown in FIGS. 2A and 2B. The human voice waveform has the following characteristics:
1. Pitch Period: Voice is a periodic waveform with a constant pitch created by impulses from the vocal cords. The periodicity of the vocal cord impulses can be detected by transforming the signal into its corresponding cepstrum. The periodicity of the impulse train creates a cepstral peak with a location corresponding to the period. This peak can be detected by cepstrum processing.
Noise is generally an uncorrelated process. It is therefore not periodic and no cepstral peak is expected at the output of a cepstral processor. Thus, cepstrum processing can be used to reliably detect voice transmission with a low false alarm rate.
2. Spectral Rolloff: The frequency response of human voice consists of several formants, the resonant frequencies of the vocal cavity. These formants are typically low frequencies (500 Hz-1400 Hz), and the spectral energy of these formants is considerably higher than the spectral energy at higher frequencies. The presence of voice can be detected by measuring the spectral rolloff (formant detection) of the voice spectrum.
RF noise, on the other hand, is generally a white process in a narrow bandwidth. The noise spectrum is roughly flat over the audio band. Thus, spectral rolloff measurement can reliably detect the presence of voice with a low probability of false alarm.
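The cepstral pitch detection described above (Hamming window, FFT, logarithm of the squared magnitude, inverse FFT, peak search between t1 and t2) can be sketched as follows. This is a minimal illustration and not the patented implementation: the pure-Python radix-2 `fft` helper, the function names, the small constant added before the logarithm, and the synthetic test frame (a 40-sample impulse train filtered by a short decaying response) are all assumptions made for the example. The search bounds 24 and 120 are cepstral indices for 3 msec and 15 msec at an assumed 8 KHz sample rate.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def cepstrum(frame):
    """Hamming window -> FFT -> log of squared magnitude -> inverse FFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = fft(windowed)
    # The 1e-12 floor (an assumption) avoids log(0) on empty bins.
    log_power = [math.log(abs(c) ** 2 + 1e-12) for c in spectrum]
    # Divide by n to normalise the inverse transform; keep magnitudes only.
    return [abs(c) / n for c in fft(log_power, inverse=True)]


def pitch_peak(ceps, lo, hi):
    """Return (index, value) of the largest cepstral sample in lo..hi."""
    idx = max(range(lo, hi + 1), key=lambda i: ceps[i])
    return idx, ceps[idx]


# Hypothetical test frame: impulse train with a 40-sample period
# (5 msec at 8 KHz) convolved with a short decaying response.
N, PERIOD = 512, 40
excitation = [1.0 if i % PERIOD == 0 else 0.0 for i in range(N)]
response = [0.8 ** k for k in range(20)]
frame = [sum(response[k] * excitation[i - k] for k in range(20) if i - k >= 0)
         for i in range(N)]

peak_index, peak_value = pitch_peak(cepstrum(frame), 24, 120)
```

For a periodic frame such as this one, the peak is expected near the period of the impulse train; for an uncorrelated noise frame no dominant peak is expected in the search range.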
The following is a list of requirements which would be desirable in a squelch design. The channel quality is assumed to be such that the received signal has a signal-to-noise ratio of at least 10 dB to ensure reliable communication. The audio bandwidth of the radio is assumed to be 300 Hz to 3000 Hz, a standard for SSB HF radios.
1. The probability of false alarm due to extraneous noise should be less than one every fifteen minutes.
2. The maximum processing delay should be 0.5 seconds. If a delay this long is necessary, some method of data buffering should be used so that no information is lost in transmission.
3. The probability of detection within the specified processing delay should be greater than 99%.
4. After completion of speech, the channel should stay open for approximately one second to allow for normal pauses in speech.
5. The probability that squelch will close during speech should be less than $10^{-3}$.
6. The performance of the squelch should not be language dependent.
7. Operation of the squelch should be invisible to the operator. No manual adjustments should be necessary for optimum performance.
8. The squelch design should be single-ended. In other words, no special transmission schemes should be used. This will ensure that any radio can be retrofitted with the squelch circuitry and will operate properly on any communication channel.
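Requirement 4 (holding the channel open for about one second after speech ends) could be met with a simple hang counter, sketched below. The specification does not describe how the hold is implemented; the function name `apply_hang`, the assumption of one voicing decision per 32 msec, and the example decision sequence are hypothetical.

```python
import math

HOP_SECONDS = 0.032   # assumed interval between voicing decisions
HANG_SECONDS = 1.0    # requirement 4: stay open about one second


def apply_hang(decisions, hang_frames):
    """Map per-frame voice decisions to gate states (True = channel open)."""
    gate = []
    remaining = 0
    for voiced in decisions:
        if voiced:
            remaining = hang_frames      # restart the hold on every detection
        gate.append(voiced or remaining > 0)
        if not voiced and remaining > 0:
            remaining -= 1               # count down while no voice is seen
    return gate


hang_frames = math.ceil(HANG_SECONDS / HOP_SECONDS)
# One voiced frame followed by silence: the gate stays open for the hold time.
gate = apply_hang([True] + [False] * 40, hang_frames)
```

A counter of this kind needs no operator adjustment, which is in keeping with requirement 7.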
The following design parameters have been considered in analysis of the waveform characterizer.
Sampling Rate (Rs)
The analog-to-digital (A/D) sampling rate (Rs) must be greater than twice the audio bandwidth of the radio to avoid aliasing. A standard audio bandwidth of 3.0 KHz dictates that sampling occur at more than 6.0 KHz. 8.0 KHz can be used in order to allow reconstruction of the voice with minimal distortion from filtering. An A/D resolution of 12 bits allows sufficient dynamic range (72 dB) of the input signal.
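The figures in this paragraph can be checked with a few lines of arithmetic. The variable names are illustrative; the 20·log10(2^bits) expression is the usual rule of thumb for the dynamic range of an ideal converter and is assumed here to be the basis of the 72 dB figure.

```python
import math

AUDIO_BANDWIDTH_HZ = 3000.0   # from the text
SAMPLE_RATE_HZ = 8000.0       # from the text
ADC_BITS = 12                 # from the text

# Minimum sample rate that avoids aliasing of a 3.0 KHz wide signal.
nyquist_rate_hz = 2.0 * AUDIO_BANDWIDTH_HZ

# Margin left for the reconstruction filter when sampling at 8.0 KHz.
guard_band_hz = SAMPLE_RATE_HZ - nyquist_rate_hz

# Ratio of full scale to one quantisation step, in decibels.
dynamic_range_db = 20.0 * math.log10(2 ** ADC_BITS)
```

Sampling at 8.0 KHz rather than just above 6.0 KHz leaves a 2 KHz transition region between the top of the audio band and the first spectral image.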
`Frame Length, FFT size (N)
In order for the cepstral peak to be constructed, the analysis frame must be of sufficient duration to contain enough impulses to define the period of the impulse train. Four impulses should be sufficient, and literature indicates a typical worst-case period of 15 milliseconds. A requirement of at least four impulses per frame leads to an analysis frame duration of at least 60 milliseconds.
In addition, to reduce complexity and increase speed in the FFT, the number of samples in the analysis frame should be a power of two. A frame length of 512 points results in a 64 msec frame. This number of points will give a frequency resolution of about 16 Hz.
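The frame length follows from the numbers above by rounding the minimum sample count up to a power of two. The sketch below assumes the 8 KHz sample rate discussed earlier; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000          # assumed sample rate
WORST_CASE_PERIOD_MS = 15      # longest pitch period, from the text
MIN_IMPULSES = 4               # impulses needed per frame, from the text

min_duration_ms = MIN_IMPULSES * WORST_CASE_PERIOD_MS

# Ceiling division in integer arithmetic: samples needed to span 60 msec.
min_samples = -(-min_duration_ms * SAMPLE_RATE_HZ // 1000)

# Smallest power of two that is at least min_samples.
frame_len = 1 << (min_samples - 1).bit_length()

frame_ms = 1000.0 * frame_len / SAMPLE_RATE_HZ
resolution_hz = SAMPLE_RATE_HZ / frame_len

# Half-frame shift used for the N/2 overlap.
hop = frame_len // 2
```

The computed resolution of 15.625 Hz corresponds to the "about 16 Hz" quoted in the text.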
`Cepstrum Pitch range (t1, t2)
Literature suggests that the pitch period of human voice typically falls between 3 msec and 15 msec. These values can be chosen to be the bounds of the cepstral pitch search. In a 512 point frame, these values correspond to points 24 and 120.
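The conversion from pitch period to cepstral sample index is a multiplication by the sample rate. The helper names below are hypothetical, and the 8 KHz rate is the value assumed in the sampling rate discussion.

```python
SAMPLE_RATE_HZ = 8000  # assumed sample rate


def period_to_index(period_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Cepstral sample index corresponding to a pitch period in msec."""
    return round(period_ms * sample_rate_hz / 1000)


def index_to_pitch_hz(index, sample_rate_hz=SAMPLE_RATE_HZ):
    """Pitch frequency corresponding to a cepstral sample index."""
    return sample_rate_hz / index


t1_index = period_to_index(3)     # lower bound of the pitch search
t2_index = period_to_index(15)    # upper bound of the pitch search
highest_pitch_hz = index_to_pitch_hz(t1_index)
lowest_pitch_hz = index_to_pitch_hz(t2_index)
```

With these bounds the search covers pitch frequencies from roughly 67 Hz to 333 Hz.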
`Spectral Rolloff frequencies (F1, F2)
Though the frequency response of different speakers varies, the general shape of the human vocal response is fairly predictable. The locations of the formants in voiced speech for males are approximately 500 Hz, 1400 Hz, and 2300 Hz, with the first formant having the highest amplitude. Formant locations for female speakers could be expected to be slightly higher, with the first formant located around 800 Hz.
The upper frequency must be chosen to be above the third formant (2300 Hz) and below the upper cutoff (e.g., 3000 Hz). A lower frequency (F1) of 800 Hz and an upper frequency (F2) of 2800 Hz can be chosen for one example. A frequency bin size (Δf) of 400 Hz can be used to measure the energy in each location.
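The rolloff measurement with these example values can be sketched as a difference of two band sums. The function name `band_energy`, the rounding of band edges to the nearest FFT point, and the synthetic two-level spectrum are assumptions for illustration; a real input would be the smoothed spectrum produced by the characterizer.

```python
SAMPLE_RATE_HZ = 8000.0   # assumed sample rate
FRAME_LEN = 512           # assumed FFT size
F1_HZ, F2_HZ, DELTA_F_HZ = 800.0, 2800.0, 400.0   # example values from the text


def band_energy(power_spectrum, center_hz, width_hz, sample_rate_hz):
    """Sum the spectrum points lying within center_hz +/- width_hz / 2."""
    bin_hz = sample_rate_hz / len(power_spectrum)
    lo = round((center_hz - width_hz / 2) / bin_hz)
    hi = round((center_hz + width_hz / 2) / bin_hz)
    return sum(power_spectrum[lo:hi + 1])


# Hypothetical smoothed spectrum: strong below 1500 Hz, weak above.
bin_hz = SAMPLE_RATE_HZ / FRAME_LEN
spectrum = [10.0 if k * bin_hz < 1500.0 else 1.0 for k in range(FRAME_LEN)]

low_band = band_energy(spectrum, F1_HZ, DELTA_F_HZ, SAMPLE_RATE_HZ)
high_band = band_energy(spectrum, F2_HZ, DELTA_F_HZ, SAMPLE_RATE_HZ)
delta_energy = low_band - high_band
```

A voice-like spectrum gives a large positive difference, while a flat noise spectrum gives a difference near zero.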
`Frame Combinations (K)
The number of frames combined before a threshold comparison is made will greatly affect the operation of the squelch. Increasing the number of frames increases the processed speech energy and thus increases the probability of detection. However, if the number of frames is too large, the dead space between syllables will be included in the measurement and the probability of detection will drop. Simulation data shows that the shortest expected syllable length is four to five analysis frames (160 to 192 msec); therefore a value of five frames can be used in an exemplary design.
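The combination of K frames and the final decision rule can be sketched as follows. The class name `FrameCombiner`, the sliding-window form of the combination, and the threshold values in the usage lines are hypothetical; in the specification the thresholds T1 and T2 are supplied by the threshold estimator 90, whose operation is not detailed here. The `span_ms` helper reproduces the 160 to 192 msec figure from 64 msec frames shifted by 32 msec.

```python
from collections import deque


def span_ms(k, frame_ms=64, hop_ms=32):
    """Duration covered by k half-overlapped analysis frames."""
    return frame_ms + (k - 1) * hop_ms


class FrameCombiner:
    """Sum pitch peaks and band-energy differences over the last K frames."""

    def __init__(self, k, pitch_threshold, rolloff_threshold):
        self.k = k
        self.pitch_threshold = pitch_threshold      # T1 (hypothetical value)
        self.rolloff_threshold = rolloff_threshold  # T2 (hypothetical value)
        self.peaks = deque(maxlen=k)
        self.deltas = deque(maxlen=k)

    def update(self, pitch_peak, low_energy, high_energy):
        """Add one frame; return True when voice is indicated."""
        self.peaks.append(pitch_peak)
        self.deltas.append(low_energy - high_energy)
        if len(self.peaks) < self.k:
            return False  # not enough frames combined yet
        # Either criterion alone is sufficient, as stated for combine logic 92.
        return (sum(self.peaks) >= self.pitch_threshold
                or sum(self.deltas) >= self.rolloff_threshold)


voice = FrameCombiner(k=5, pitch_threshold=1.0, rolloff_threshold=100.0)
voice_decisions = [voice.update(0.3, 50.0, 10.0) for _ in range(5)]

noise = FrameCombiner(k=5, pitch_threshold=1.0, rolloff_threshold=100.0)
noise_decisions = [noise.update(0.05, 10.0, 10.0) for _ in range(5)]
```

Using a fixed-length deque keeps only the most recent K frames, so a decision becomes available at every frame shift once the window has filled.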
`Exemplary Implementation of a Digital Voice Squelch
`System
Referring now to FIG. 7, a simplified hardware block diagram of a digital voice squelch system embodying the invention is shown. The system 100 processes the audio input signal from the receiver 102. The analog audio signal AUDIO IN is fed to an analog-to-digital converter (ADC) 104 which digitizes the signal. The digitized signal is then fed to a digital signal processor (DSP) 106 and to a digital delay circuit 108. The DSP 106 performs the processing described above to detect a voice signal on the audio input signal from the receiver 102. The delayed digitized signal from the digital delay circuit 108 is fed to a digital-to-analog converter (DAC) 110 to convert the delayed digitized signal back to analog form. The analog signal is then fed to a multiplexer circuit 112 as one selectable input signal. The other inputs to the multiplexer are the signal AUDIO IN and ground. The DSP 106 controls the particular input to the multiplexer 112 to be output to the volume control circuit 114 by a select signal SEL. Thus, the output of the multiplexer 112 can be selected to be the delayed version of the audio input signal, the undelayed signal AUDIO IN, or ground. If the audio signal does not contain voice information, the DSP 106 can squelch the audio output signal by selecting the ground input. The output of the volume control circuit 114, AUDIO OUT, is fed to an audio transducer 116, comprising a speaker or headphone, for example.
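The selection among the three multiplexer inputs can be sketched as a small function. The enumeration values, the function name `select_input`, and the rule that a bypass mode routes the undelayed AUDIO IN signal are assumptions for illustration; the text states only that the DSP chooses among the delayed audio, AUDIO IN and ground through the signal SEL.

```python
from enum import Enum


class MuxInput(Enum):
    DELAYED_AUDIO = 0   # output of DAC 110 (delayed copy of the input)
    DIRECT_AUDIO = 1    # undelayed AUDIO IN
    GROUND = 2          # squelched output


def select_input(voice_detected, bypass=False):
    """Choose the multiplexer input for the current decision."""
    if bypass:
        # Assumed behaviour: bypass passes the receiver audio unchanged.
        return MuxInput.DIRECT_AUDIO
    if voice_detected:
        # The delayed path compensates for the detection processing time.
        return MuxInput.DELAYED_AUDIO
    return MuxInput.GROUND


active_voice = select_input(voice_detected=True)
active_noise = select_input(voice_detected=False)
bypassed = select_input(voice_detected=False, bypass=True)
```

Routing the delayed copy when voice is detected means the beginning of an utterance, received while detection was still in progress, is not lost.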
`
FIG. 8 shows a block diagram of an exemplary implementation of the DSP 106. The DSP 106 shown here comprises a master processor 130 and a slave processor 132. A Motorola 68000 microcomputer is suitable for use as the master processor 130. A Zoran Vector Signal Processor device is suitable for use as the slave processor 132. The DSP 106 further comprises ROMs 134 and 136 which store codes for the master and slave controller devices, respectively. The ROM 138 is used as a lookup table to provide the logarithmic conversion function (block 60, FIG. 1).
Address decode logic circuits 140 and 142 are provided for the respective master and slave processors 130 and 132.
The digitized audio input data is provided to an input FIFO buffer 144. The DSP 106 employs address, data and control buses 146, 148 and 150 to exchange address, data and control signals among the respective components of the DSP 106. The input data is passed onto the data bus 150 in response to control signals.
The DSP 106 further comprises a random access memory 146, a parallel interface and timer device 148, which may comprise a type 68230 device, and a bus arbitration and interrupt logic circuit 150. The logic circuit 150 receives timing data from the interface and timer circuit 148, and controls the interrupt routines of the master and slave processors 130 and 132.
The system 100 further comprises a power supply 120 providing +5 V, +12 V, and -12 V.
The analog signal section of the system 100 is shown in further detail in FIG. 9. The ADC 104 comprises a scaling amplifier 104A, a sample and hold integrated circuit device 104B, and a 12 bit ADC device 104C. The maximum input signal is 2.0 V peak. The scaling amplifier 104A scales the input signal to the undistorted maximum allowed input of the ADC device 104C. The ADC device 104C is issued a convert pulse every 125 microseconds (8 KHz) by the analog control circuit 150. The DAC 110 consists of a D/A converter device 110A, a scaling amplifier 110B, and a fourth order Butterworth filter 110C. The output of the DAC 110 is fed to the multiplexer 112, whose output drives the output volume control circuit 114. The circuit 114 comprises another scaling amplifier 114B, and two output buffers 114A and 114D. The first output scaler 110B scales the output of the DAC device 110A back down to the level of the input signal AUDIO IN. The maximally flat filter 110C has a cutoff frequency of 3.5 KHz to filter out the sampling images (centered at multiples of 8 KHz). The analog multiplexer is controlled by the DSP 106, allowing the output audio to be transmitted only when voice is detected or allowing audio to be transmitted continually during bypass modes of operation. The output of the multiplexer 112 is buffered (114A), scaled (114D), and then output to an audio tapered potentiometer 114C. The output of the potentiometer 114C is then buffered and output to the transducer 116.
The DSP 106 receives 12 bits of sampled data from the ADC 104 at an 8 KHz clock rate. The data is sent to the 2 K input FIFO 144 and to a 4 K data storage FIFO buffer 154 which performs the function of the digital delay device 108 (FIG. 7).
As described above, the DSP 106 has two processors 130 and 132 on the data bus. Each processor 130 and 132 has its own code ROM (8K×16), devices 134 and 136, and together they share a common data RAM 146 (8K×16). The slave processor 132 alone can read data from the input FIFO 144. The processor 130 acts as the bus master and can pass bus control to the slave processor 132 by writing a start command to the processor 132. The slave processor 132 then takes control of the data bus 148 and, when finished, issues an interrupt to the master processor 130, indicating that the master processor 130 can resume processing.
The parallel interface and timer 147 (PI/T) provides an interrupt to the master processor 130 every 32 milliseconds to signal that it is time to start processing a new block of data. The PI/T 147 also generates the control to the audio output multiplexer 112, allowing voice to be transmitted or squelched, depending on the output of the cepstrum algorithm or the mode of operation (active or bypass). The PI/T 147 also controls when data is allowed to fill up in the input FIFO 144, storing the amount of audio data that is received during the cepstrum processing time.
All decoding, timing, and glue logic is performed by a total of five programmable array logic devices. One device 140 is used for master processor 130 address decoding, another device 142 for slave processor 132 address decoding. Another device 140 includes a state machine used by the master processor 130 to read and write to the control registers of the slave processor 132. Another device 150 is used for interrupt and bus arbitration logic; and another device 152 is used to generate the analog control and input FIFO control signals. The decoding requires all memory accesses to be word length, and requires that the 68000 microcomputer used as the master processor 130 be operated in the supervisor mode.
Three clocks are used for the DSP 106: 20 MHz for the slave processor 132, 10 MHz for the master processor 130, and 256 KHz for various timing functions.
FIG. 10 shows a simplified functional flow diagram of the processing of the analog audio data by the system of FIG. 7. At step 160 the analog data is digitized (ADC 104), and the digitized data is processed (step 162) to window, fast Fourier transform and perform the magnitude squared functions. The processing functions of step 162 are performed by the slave processor 132 in this embodiment.
At step 164 the logarithmic conversion function is performed, under control of the master processor 130, by use of the log lookup table stored in ROM 138. Step 166 represents the inverse FFT function and magnitude squared function performed by the slave processor 132. At step 168 peak detection and tracking functions are performed by the master processor 130. At step 170 another FFT function and magnitude squared function is performed by the slave processor 132. The spectral rolloff of the resultant signal is then processed by the master processor 130, and the voice detection decisions are made.
The following is a summary of important characteristics of the waveform characterizer and an application thereof for digital squelch.

Waveform Characterization
1. Waveform characterizer circuit processing performed in the transform domain with FFT and logarithmic processing is simple to implement.
2. The waveform characterization technique is applicable to a broad range of signal modulations including SSB voice, PSK, and teletype. Cepstrum processing is sensitive to interference signals such as FSK, PSK and CW transmission. This fact indicates that the cepstrum can be used to detect and possibly characterize radio
frequency transmission. The properties associated with voice that allow for cepstral detection are the presence of a cepstral peak and a unique spectral profile. The voice cepstral peak can be slowly moving from 3 msec to 15 msec, while the voice spectral content at 2500 Hz is much smaller than that at 800 Hz. Digital signals, such as FSK and PSK, also exhibit similar characteristics. The periodic cepstral peaks indicate the fixed baud rate of the transmission, and the spectral distribution identifies the modulation waveform used. Thus, the unique spectrum and cepstrum characteristics of PSK and FSK make the cepstral processor an excellent candidate for use as a waveform characterizer. Characterization ability would allow for automatic detection and routing of a signal to the proper receiver, such as a modem, teletype or speaker, for demodulation, thus freeing the operator to concentrate on other tasks. Another benefit is the ability to track and identify multiple signals simultaneously and automatically. The received waveform can be characterized in the following manner. The cepstral peak of a voice signal will be located

1. A waveform characterizer apparatus for determining cepstrum pitch and spectral rolloff properties of an input signal waveform, comprising:
means for digitizing the input signal waveform to provide a digital waveform signal;
means for providing the cepstrum of the input signal waveform including means for transforming the digitized input signal waveform into the frequency domain which includes memory means for storing said digital samples in memory, means for reading said digital samples out of said memory means in blocks of N samples corresponding to a frame duration of T=N/R, and means for transforming said respective blocks of digital data samples into the frequency domain by a fast Fourier transform algorithm;
means for deconvolving the impulse response and periodicity of the frequency domain signal to provide a deconvolved digital signal; and
means for transforming the deconvolved digital signal back into the time domain to provide the cepstrum
of the input s