`
SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENT

By

Abdul Wahab bin Abdul Rahman

School of Applied Science,
Nanyang Technological University,
Nanyang Avenue,
Singapore 639798

First Year Report for the degree of PhD in Applied Science
`
`/Abdul Wahab/PHD/wahab1.doc/Tuesday, July 22, 1997
`
`Page 1 of 29
`
`Petitioner Apple Inc.
`Ex. 1019, p. 1
`
`

`
`
`Abstract
`
The increasing demand for digital cellular telephony and other new services, including multimedia communications, has prompted numerous studies on implementing low-rate speech coding algorithms (below 4.8 kbit/s) on commercially available DSP processors. In addition, there is a need to enhance speech quality that is degraded both by road, engine and wind noise and by the echoes present on the near-end speaker side, all of which affect the car phone input. All of these tasks must be achieved with a single DSP chip if the system is to be cost-effective and power-efficient, and thus widely accepted. This dissertation research proposes:
`
1) To study both analytically and experimentally the above degradations.

2) To develop sound signal processing algorithms to combat these imperfections.

3) To address architectures for real-time implementation.

4) To implement them on a DSP platform using state-of-the-art devices and reconfigurable systems.
`
Firstly, a major impediment to speech quality in a vehicular chamber is the echo generated by leakage of the far-end speaker's signal, which is mixed with the speech of the near-end speaker and transmitted as a composite signal. The first task of the proposed speech enhancer is to cancel these echoes adaptively. This necessitates the inclusion of near-end speaker activity detection.
`
`
Secondly, in the vehicular hands-free cellular communication framework, it has been observed that the degradation in the intelligibility and general quality of cellular speech due to engine, road, and wind noise is as disturbing as the vehicular echoes. Hence, the second important task of the enhancer is to combat these imperfections in the cellular speech.
`
These requirements make a form of beamforming based on a microphone array, followed by an adaptive filtering process, a conceptually sound candidate for our speech enhancer. The simplest form of beamforming is delay-and-sum beamforming, which compensates for the delay of the target signal at each microphone and sums the channels, so that the target components add in phase while the interfering signals add with differing phases. The delay-and-sum beamforming technique will be used first to follow the genuine speaker. The enhancer will then adaptively cancel the noise coming from the interfering speakers, the engine, the wind (especially critical when the windows of the vehicle are down), and the road noise coming from other vehicles and the road itself. There are a few studies in the literature on this approach for speech recognition in a hands-free telephone set-up [1-7]. These imperfections will be looked into and a unified approach developed to combat all of these different types of degradation.
`
In addition, the research will develop architectures for real-time implementation on a single-chip DSP as part of the next generation of digital cellular phones operating at 4.8 kbit/s or less. At the time of this proposal, the preferred DSP platform is the TMS320C4x family from Texas Instruments, Inc. However, the algorithms developed will be reconfigurable.
`
`
`1. Introduction
`
It has been a matter of public debate for some time that vehicles of the future will need to detect, process, and communicate significantly more information. This communication will take place between the vehicle and the driver, among the people in the vehicle, and between the vehicle and the outside world, including other vehicles, the road itself, and the Advanced Traffic Management and Information Systems (ATMIS).
`
The driver and other passengers may want to communicate verbally with the outside world, or to hold a conference call. These activities have traditionally been handled by car phones or short-wave radios, where the underlying signal is the band-limited voice-grade waveform. These signals are transmitted over a communication channel that is severely corrupted by echoes, both in the transmission link and inside the chamber of the vehicle. In addition, there is natural and man-made noise from numerous sources, as well as interfering signals from other channels, passengers, and the audio information subsystem. It is commonly accepted that the next generation of car telephones will be totally digital cellular and that the volume of applications will increase. However, a number of ills will not go away, and a speech processing system will be required to tackle them. Some of the tasks for the research will be noise suppression, echo cancellation, source localisation, speaker identification, speech coding, compression and transmission by digital means.
`
This report discusses the spectral dissection of the various degradations in the vehicular environment, followed by a proposal for a cost-effective model of the speech enhancer system and an introduction to a re-configurable digital signal-processing concept. The detailed discussion of the proposed speech enhancer system covers:
`
1. Identifying the various man-made noises and categorising them into different optimal sub-bands so that noise cancellation and suppression can be accurately achieved.

2. Handling echoes from the near-end and far-end speakers so as to cancel them adaptively.

3. Identifying the genuine speaker and employing the beamformer to follow that speaker, and then adaptively cancelling the noise coming from the interfering speakers, the engine, the wind and the road.

4. Employing multi-rate signal processing, since these noises can be categorised into optimal sub-bands, to improve the computing performance of the speech enhancer.
`
Finally, the report presents the conclusions, a summary, and the schedule of the proposed speech enhancer system architecture for the PhD research.
`
`
`2. Spectra of Vehicular Disturbances
`
To justify the various components of the proposed system, it is appropriate to observe the ills mentioned above visually. To study the problem carefully and to gather road data, a field test was performed. A compact van was equipped with a DAT tape recorder and a low-cost low-pass microphone. There were two passengers to act as interference sources in addition to the driver. A database of 40 minutes of recordings under 16 different experimental conditions was collected by travelling along city streets and two expressways in Singapore for a number of hours. The data were captured onto a hard disk using the speech I/O unit of a digital signal processing development system. The sampling rate was set at 8,000 samples/s, which is the Nyquist rate once the signal has been properly band-limited to the 4 kHz voice-grade service bandwidth of the next generation of digital cellular phones.
`
Figure 1 shows a single plot for a stationary vehicle with the ignition just started. Each frame consists of 1,024 samples, and the peak at about the 50th frame shows the maximum revolution of the engine during the initial start-up. Since the windows are all up, the vehicle is not moving and no speakers are talking, the spectrum clearly reflects the engine and air-conditioner noise, which dominate the 0 to 150 Hz range.
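As a rough illustration of the frame arithmetic behind these plots, the sketch below works out the duration and frequency resolution of a 1,024-sample frame at 8,000 samples/s, and measures what fraction of a frame's energy lies in a given band using a direct DFT. The function names and the synthetic test tone are assumptions made for illustration; the report does not state how the spectra were actually computed.

```python
import cmath
import math

FS = 8000          # sampling rate in samples/s (from the report)
FRAME_LEN = 1024   # samples per frame (from the report)


def frame_duration_s(frame_len=FRAME_LEN, fs=FS):
    """Duration of one analysis frame in seconds."""
    return frame_len / fs


def bin_spacing_hz(frame_len=FRAME_LEN, fs=FS):
    """Spacing between adjacent DFT bins in Hz."""
    return fs / frame_len


def band_energy_fraction(frame, fs, f_lo, f_hi):
    """Fraction of the frame's energy (bins 0..N/2) lying in [f_lo, f_hi] Hz.

    Uses a direct O(N^2) DFT for clarity; an FFT would be used in practice.
    """
    n = len(frame)
    total = 0.0
    in_band = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        power = abs(coeff) ** 2
        total += power
        if f_lo <= k * fs / n <= f_hi:
            in_band += power
    return in_band / total if total > 0 else 0.0


# Hypothetical test signal: a tone placed exactly on DFT bin 13 (about 101.6 Hz),
# standing in for low-frequency engine noise.
tone_hz = 13 * bin_spacing_hz()
frame = [math.sin(2 * math.pi * tone_hz * i / FS) for i in range(FRAME_LEN)]
low_band_fraction = band_energy_fraction(frame, FS, 0.0, 150.0)
```

With these numbers each frame spans 128 ms and the bins are 7.8125 Hz apart, so the 0 to 150 Hz region discussed above covers roughly the first twenty bins of each frame.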
`
Figure 2 shows two plots for the vehicle moving along a minor road across speed-regulating strips spaced at varying distances. Notice the cyclic nature of the road noise occurring between 0 and 100 Hz. The spectrum also indicates that the road noise is confined to low frequencies, below 200 Hz.
`
`
Figure 1. Spectrum of the engine noise in the frequency range 0-1,000 Hz for a stationary vehicle during the initial ignition of the engine.

Figure 2. Spectrum of the road noise in the frequency ranges 0-1,000 Hz and 1,000-4,000 Hz for a vehicle moving across speed-regulating strips.
`
`
The spectrum of the engine noise while the vehicle is moving at a nominal speed of 60 km/h is presented in two plots in Figure 3. The windows were rolled up and the chamber was quiet. There was no other vehicle in the vicinity and it was not possible to detect wind noise inside the vehicle. As can be seen from these two plots, the engine noise does not have any effect above 200 Hz. It can be seen from Figures 1, 2 and 3 that the engine noise should be very easily tackled.
`
Figure 3. Spectrum of the engine noise in the frequency ranges 0-1,000 Hz and 200-1,000 Hz for a vehicle moving at 60 km/h (windows rolled up and quiet inside the chamber).

Figure 4 displays the spectra in three plots covering the frequency ranges 0-1,000 Hz, 200-1,000 Hz, and 1,000-4,000 Hz. In this case, the vehicle is stationary with the windows down.
`
`
Figure 4. The spectrum of the stationary engine noise, ambient wind noise, and interference from vehicles passing by. The windows are down and the speakers are silent. The frequency ranges are 0-1,000 Hz, 200-1,000 Hz and 1,000-4,000 Hz, respectively.
`
`
A heavy vehicle was passing at about 50 km/h, and the levels of ambient road noise and wind noise were rather significant. In addition to the very-low-frequency components of the previous case, which represent the engine noise, there are two additional spectral regions to consider. As can be seen from this figure, there is considerable spectral content in the frequency range 200-400 Hz. We believe this comes from the ambient wind noise, the wind generated by vehicles passing by, and the road noise from tyre friction on the pavement. Suppression of this degradation is not as simple as in the previous case, since it exhibits a slowly varying random behaviour. Nevertheless, a slowly adapting filtering process should be able to minimise its effects.
`
Noise components in the frequency range 1,000-4,000 Hz exhibit a widely spread coloured-noise spectrum. Since this spectrum covers the complete speech frequency range, it is very difficult to tackle. Source localisation based on adaptive beamforming, followed by a trainable and quickly adapting estimation and cancellation scheme, will be needed to suppress the contributions from these sources.
`
Finally, in Figure 5, we display similar spectra under more severe conditions. This time, the vehicle is travelling at a speed of 60 km/h with the windows rolled up; there are other vehicles passing by; the driver is trying to communicate and the two passengers keep talking. The first spectrum is very similar to those in Figures 1, 2 and 3. However, the noise in the low frequency range 200-400 Hz is drastically reduced in comparison to Figure 4.
`
`
Figure 5. The spectrum when the driver is trying to communicate and two passengers keep talking in a vehicle moving at 60 km/h with the windows rolled up. As before, the frequency ranges are 0-1,000 Hz, 200-1,000 Hz and 1,000-4,000 Hz, respectively.
`
`
In the last figure, it is possible to observe the formant structure of the speech. We believe this will be one of the most frequently encountered scenarios, and the speech enhancement task will be very demanding since all three speakers are talking and their acoustical echoes ride on top of all the other ills. It is impossible to eliminate all the degradations completely in this case. However, the advanced speech enhancement features of the proposed system are expected to improve the quality of speech enough to permit uninterrupted communication.
`
Lastly, we present the spectra of a female and a male voice in Figures 6a and 6b respectively, which clearly show the formants of the speech.

Figure 6a. The spectrum of a female voice in the frequency ranges 0-2,000 Hz and 2,000-4,000 Hz (using a CELP coder).
`
`
Figure 6b. The spectrum of a male voice in the frequency ranges 0-2,000 Hz and 2,000-4,000 Hz (using a CELP coder).
`
`
`3. The Enhanced Speech Processing and Communication System
`
The speech quality of the emerging totally digital cellular phones will, to a great extent, depend on the speech quality available at the near-end transmitter of the communication link. Despite this, most research efforts have been directed towards speech coding techniques, channel transmission issues of cellular telephony, and noise control and optimisation [1][2][3][4][16]. Recently there has been some research interest in the effects of ambient acoustical noise in the vehicular environment [17][18][19], but most of the work on echo cancellation has been carried out in a classroom or conference room environment. The longest dimension of a mid-size car is about 5 m, which corresponds to a delay of about 16 ms, or 128 samples at a sampling rate of 8,000 samples per second. At this distance the IS-54 industry standard would require a 128-tap FIR filter to cancel the echo. Throughout the world, a significant percentage of cellular phone users are in vehicular chambers (cars, trucks, buses, and public transportation systems) where degradations due to echoes, interference, and various types of noise are severe. Recently, some research results that address some of these problems have been reported [6][7][8][9][10][11][12].
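The tap count quoted above follows from multiplying the echo delay by the sampling rate. The sketch below reproduces that arithmetic; the speed of sound (343 m/s) and the helper names are assumptions introduced here, not values taken from the report.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed value for air at about 20 degrees C


def acoustic_delay_s(distance_m, speed=SPEED_OF_SOUND_M_S):
    """One-way acoustic travel time over distance_m metres."""
    return distance_m / speed


def taps_for_delay(delay_s, fs):
    """Number of FIR taps needed to span delay_s seconds at fs samples/s."""
    return math.ceil(round(delay_s * fs, 6))


one_way_delay = acoustic_delay_s(5.0)                        # about 14.6 ms
taps_from_distance = taps_for_delay(one_way_delay, 8000)     # 117
taps_from_report = taps_for_delay(0.016, 8000)               # 128
```

Under the assumed speed of sound, a 5 m path gives about 14.6 ms and 117 taps, which is close to the 16 ms and 128 taps stated in the text; the report does not say how its 16 ms figure was rounded.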
`
An ideal solution is an enhanced speech processing and communication system with a re-configurable, multi-tasking architecture. The system should be able to locate an intended speaker, cancel echoes generated inside the vehicle, combat various noise and jamming signals, and handle all the speech processing, compression, transmission, reception, and data and network communication tasks. In Figure 7, we present a block diagram of the proposed speech processing and communication system. Speech input to the system will be provided by a microphone array strategically positioned on the dashboard to capture the various signals: speech, different types of noise, echoes and other interference. The front-end CODEC will have a set of 16-bit analogue-to-digital (A/D) and digital-to-analogue (D/A) converters with a sampling rate of between 8,000 and 10,000 samples per second.
`
Before any processing task, the system should be able to locate and identify the primary speaker; that is, the system must focus on its primary user. Speech from other people in the vehicle and from the hi-fi system, echoes, engine noise, road noise, wind noise, and noise from vehicles standing nearby or passing by will all be considered unwanted input signals. Our objective is to eliminate them or, at least, to suppress them significantly. This, in turn, will improve the quality of the speech from the genuine user.
`
One of the most annoying impediments to speech quality in a vehicular chamber is the echo generated by leakage of the far-end speaker's signal. When the near-end speaker (i.e. the driver) or any of the passengers in the car speaks, this echo is mixed with his or her speech and transmitted as a composite signal. Thus, the first task of the proposed speech enhancement system is to cancel the echo adaptively, updating the canceller during non-speech periods only. In other words, no adaptation is to be performed while the near-end speaker talks. This necessitates the inclusion of a near-end speaker activity detection mechanism. In our literature survey [6][7][8][9][10][11][12], we noticed that some researchers have used a coefficient adaptation algorithm based on the least-mean-squared (LMS) error criterion for echo cancelling. Although very successful in echo cancellation, the basic LMS technique is not very effective in tackling the other degradations.
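A minimal sketch of the idea described above, assuming a basic LMS update and a per-sample near-end activity flag supplied from outside (the report does not specify the detector). The echo path, step size and signal lengths are made-up illustrative values, and the function name is hypothetical.

```python
import random


def lms_echo_canceller(far_end, mic, near_end_active, num_taps, mu):
    """Cancel far-end echo from the microphone signal with an LMS-adapted FIR.

    far_end         : loudspeaker (far-end) samples
    mic             : microphone samples (echo plus any near-end speech)
    near_end_active : per-sample flags; adaptation is frozen where True
    Returns (error_signal, final_weights).
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # recent far-end samples, newest first
    errors = []
    for x, d, talking in zip(far_end, mic, near_end_active):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate           # echo-cancelled output sample
        errors.append(e)
        if not talking:                 # adapt only when the near end is silent
            weights = [w + mu * e * h for w, h in zip(weights, history)]
    return errors, weights


# Synthetic demonstration with a made-up 4-tap echo path (illustrative values).
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
far = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo = [sum(echo_path[k] * far[n - k] for k in range(len(echo_path)) if n - k >= 0)
        for n in range(len(far))]
# The near-end speaker "talks" during the last 1000 samples
# (the flags stand in for an activity detector).
active = [n >= 3000 for n in range(len(far))]
near = [0.3 * random.uniform(-1.0, 1.0) if a else 0.0 for a in active]
mic = [e + s for e, s in zip(echo, near)]

residual, learned = lms_echo_canceller(far, mic, active, num_taps=4, mu=0.05)
```

In this toy setting the filter converges to the echo path while the near end is silent; once adaptation is frozen, the residual during double-talk is essentially the near-end speech alone. A real vehicle echo path would need far more taps and a practical activity detector.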
`
`
[Figure 7 block diagram. Block labels recovered from the figure: Microphones Array Beam former; CODEC (16-bit A/D and D/A converters, gain control for each microphone and speaker output, sampling rate 8 to 10 Ks/sec); CANCELLER/ENHANCER; SPEECH CODER (8.0 VCELP, 4.8 CELP, 2.4 MELP, 32 Kb/s LD-CELP, ADPCM, 16 Kb/s LD-CELP); TRANSMITTER/RECEIVER.]

Figure 7. The Block Diagram of the Proposed Speech Processing and Communication System
`
Secondly, in the vehicular hands-free cellular communication framework, the engine, road, and wind noise components need to be considered. It has been observed that the degradation in the intelligibility and general quality of cellular speech due to these imperfections is as disturbing as the echo discussed above. Hence, the second objective of the enhanced speech processing and communication system is to combat these imperfections of the cellular speech or data. Although there are some recent studies and analyses of the spectra of these noise sources [7][8][9], they are not directly applicable here, since these noise sources have statistically different spectral behaviour. For instance, the engine noise is significantly correlated with the engine RPM and is therefore rather deterministic. On the other hand, the road and wind noises are stochastic in nature and spread over a frequency range [17].
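To make the link between engine RPM and a deterministic noise spectrum concrete, the sketch below computes the fundamental firing frequency of a four-stroke engine and its harmonics. The engine type, cylinder count and RPM values are assumptions chosen for illustration; the report does not describe the test vehicle's engine.

```python
def firing_frequency_hz(rpm, cylinders):
    """Fundamental firing frequency of a four-stroke engine.

    Each cylinder fires once every two crankshaft revolutions.
    """
    return (rpm / 60.0) * (cylinders / 2.0)


def harmonics_below(rpm, cylinders, limit_hz):
    """Firing-frequency harmonics that do not exceed limit_hz."""
    f0 = firing_frequency_hz(rpm, cylinders)
    result = []
    k = 1
    while k * f0 <= limit_hz:
        result.append(k * f0)
        k += 1
    return result


# Hypothetical four-cylinder engine idling at 900 RPM and cruising at 3000 RPM.
idle_harmonics = harmonics_below(900, 4, 150.0)
cruise_fundamental = firing_frequency_hz(3000, 4)
```

For these assumed values the harmonics fall at multiples of 30 Hz at idle, and the fundamental is 100 Hz at 3000 RPM, which is at least consistent with engine noise concentrated below about 200 Hz.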
`
The worst class of degradation is interspeaker interference. In this case, the primary signal and the interfering signals have similar spectra [28], which makes the problem extremely difficult to tackle. This is the main reason why we propose the inclusion of speaker tracking and identification capabilities in this speech processing system.
`
This last point, in particular, suggests a beamforming structure based on a microphone array [20][23][25][26][27] followed by an adaptive filtering scheme. Beamforming techniques, which have found important applications in radar, sonar, radio astronomy, geophysics, and biomedical signal processing, appear to be a conceptually sound candidate for our speech enhancement task.

The simplest form of beamforming is delay-and-sum beamforming, which compensates for the delay of the target signal at each microphone and sums the channels, so that the target components add in phase while the interfering signals add with differing phases.
`
`
Here we propose to use the delay-and-sum beamforming technique. First, it follows the genuine speaker; the enhancer then adaptively cancels the noise coming from the interfering speakers, the engine, the wind (especially critical when the windows are down), and the road noise coming from other vehicles and from road-tyre friction. There are some studies on this method for speech recognition in a hands-free telephone set-up [6][7][8][9][10][11][12]. Figure 7 (The Speech Enhancement Circuit) shows the structure of the proposed enhancer, with the microphone array and the A/D converters (D1 to Dn+1) as the inputs. The output of the system is cleaned speech, to be transmitted after compression.
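The delay-and-sum principle can be sketched in a few lines. This is a minimal illustration assuming integer sample delays and a hypothetical two-microphone scene; the signals, delays and function name are invented for the example and are not taken from the report.

```python
import math


def delay_and_sum(channels, delays):
    """Delay-and-sum beamformer with integer sample delays.

    channels : list of sample lists, one per microphone
    delays   : delays[m] is how many samples later the target reaches mic m
    Each channel is advanced by its delay so the target lines up, then averaged.
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
            for n in range(usable)]


def target(n):
    return math.sin(2 * math.pi * n / 16)


def interferer(n):
    return math.sin(2 * math.pi * n / 8)    # period of 8 samples


N = 64
# The target reaches mic 2 three samples later than mic 1. The interferer comes
# from another direction and reaches mic 2 one sample earlier than mic 1.
mic1 = [target(n) + interferer(n) for n in range(N)]
mic2 = [target(n - 3) + interferer(n + 1) for n in range(N)]

output = delay_and_sum([mic1, mic2], [0, 3])
```

After steering, the two copies of the target coincide and add in phase, while the two copies of the interferer end up four samples (half a period) apart and cancel. Real interferers are broadband, so a fixed delay-and-sum stage only attenuates them, which is why the text follows it with adaptive cancellation.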
`
`
[Figure: speech enhancement circuit. Labels recovered from the figure: microphones M1, M2, M3, ..., Mn+1; A/D converters D1, D2, D3, ..., Dn+1; filters FIR1, FIR2, FIR3, ..., FIRn+1; Genuine Speaker Tracker; Gain and Phase update for Microphones array beamformer; Filter coefficient update; signals: Speech + noise, noise and other imperfection, Speech output.]

Figure 7. The Speech Enhancement Circuit.
`
`
`4. Re-configurable Digital Signal Processing
`
The above speech enhancement architecture requires a considerable amount of computation. Depending on the particulars of the actual speech/speaker detection circuitry, the beamformer, the adaptive filter banks and the digital speech compression algorithm, we anticipate the overall computational complexity to be on the order of 35-40 million operations per second (MOPS)¹. In particular, the 2,400 bits/s U.S. government standard MELP coder will require 22-25 MOPS [11-12]. The remaining 13-15 MOPS will be needed for all other tasks. This conservative figure should be sufficient, since all tasks other than speech compression will be performed in a re-configurable multi-tasking fashion². We believe the Texas Instruments, Inc. TMS320C4x DSP hardware platform operating at 40 MHz should be able to handle all the computational needs. To support a microphone array of six or more elements, we propose that the front-end audio input/output unit have eight multiplexed input channels with an aggregate A/D rate of 200,000 Hz and a minimum of two output channels.
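The budget figures in this paragraph can be checked with simple arithmetic. In the sketch below, pairing the low ends and the high ends of the quoted ranges is an assumption made here, as is splitting the aggregate A/D rate evenly across the eight channels.

```python
TOTAL_MOPS = (35, 40)    # anticipated overall load (from the report)
MELP_MOPS = (22, 25)     # MELP coder share (from the report)

# Assumption: low end pairs with low end, high end with high end.
remaining_mops = (TOTAL_MOPS[0] - MELP_MOPS[0], TOTAL_MOPS[1] - MELP_MOPS[1])

AGGREGATE_AD_RATE_HZ = 200_000   # aggregate multiplexed A/D rate (from the report)
CHANNELS = 8                     # input channels (from the report)

# Assumption: the aggregate rate is shared equally by all channels.
per_channel_rate_hz = AGGREGATE_AD_RATE_HZ / CHANNELS
```

Under these assumptions the remainder is 13 to 15 MOPS, matching the text, and each channel could be sampled at up to 25,000 Hz, comfortably above the 8,000 to 10,000 samples per second planned for the CODEC.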
`
Operation of the system will require a scheduler and a memo-passing facility so that information can be passed from one process to another. A memo in this case will consist of the type of processing required, the placement of the data in memory and, of course, the originating and destination units.
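One way to picture the memo-passing facility is a small record type plus a first-in, first-out dispatcher. Everything in the sketch below (the `Memo` fields, the `Scheduler` class, the unit names and the address value) is hypothetical and only mirrors the memo contents listed above; the report does not define a concrete design.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Memo:
    task: str          # type of processing requested
    data_address: int  # where the data sits in memory (illustrative integer)
    source: str        # originating unit
    destination: str   # destination unit


class Scheduler:
    """First-in, first-out dispatcher of memos to registered processing units."""

    def __init__(self):
        self._queue = deque()
        self._units = {}

    def register(self, name, handler):
        self._units[name] = handler

    def post(self, memo):
        self._queue.append(memo)

    def run(self):
        """Deliver queued memos in order; handlers may return a follow-up memo."""
        log = []
        while self._queue:
            memo = self._queue.popleft()
            follow_up = self._units[memo.destination](memo)
            log.append((memo.source, memo.destination, memo.task))
            if follow_up is not None:
                self.post(follow_up)
        return log


scheduler = Scheduler()
scheduler.register(
    "beamformer",
    lambda m: Memo("cancel_echo", m.data_address, "beamformer", "canceller"))
scheduler.register("canceller", lambda m: None)
scheduler.post(Memo("steer", 0x1000, "codec", "beamformer"))
trace = scheduler.run()
```

Here the beamformer unit hands the same data buffer on to the canceller by returning a new memo, so only units with pending memos consume processing time.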
`
`
¹ Here we use the term MOP in the framework of the Texas Instruments TMS320C4X DSP systems family.
² It should be easy to guess that the computational complexity would increase enormously if the architecture did not have re-configurability. That is, the overall computational load would be unacceptably high if the algorithms and circuits for all tasks were kept running at all times.
`
`
In addition, since the spectral analysis of the noises and echoes indicates that they can be subdivided into optimal sub-bands, the computation can ideally be carried out at a much lower sampling rate [29].
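The saving from processing a narrow band at a reduced rate can be sketched as follows. The block-averaging decimator is a deliberately crude stand-in for a proper anti-alias filter, and the 200 Hz band edge is taken from the engine and road noise observations earlier in the report; the function names are assumptions.

```python
def max_decimation_factor(fs, band_hi_hz):
    """Largest integer M such that fs / M still satisfies Nyquist for band_hi_hz."""
    return int(fs // (2 * band_hi_hz))


def decimate(samples, factor):
    """Average non-overlapping blocks of `factor` samples (a crude low-pass)
    and keep one output per block. A real design would use a proper
    anti-alias filter before downsampling."""
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


factor = max_decimation_factor(8000, 200)   # band below 200 Hz
low_rate_hz = 8000 / factor
one_second_slow = decimate([1.0] * 8000, factor)
```

For a band below 200 Hz the rate could in principle drop from 8,000 to 400 samples per second, so an adaptive filter running on that band would handle one twentieth of the samples.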
`
`
`5. Conclusions and the next step of the project
`
In this study, we propose a working model for future dashboards in intelligent vehicles. The system includes a totally digital speech processing and communication system. Since it is a digital system, it can easily be reconfigured to work as an advanced packet data communication system handling fax, electronic mail, voice mail and high-speed data transfer tasks. We have presented the enhanced speech communication sub-system and the source tracking and noise cancellation circuitry. However, we would like to emphasise that the proposed architecture and its components are at a very early stage of the project and need further deliberation and analysis. In other words, we need to study the various components in more detail and analyse their interaction. Although there are many research papers on noise and echo cancellation, they mainly focus on non-vehicular environments. In addition to handling noise and echo problems in a vehicle, the speech enhancer also needs to handle interspeaker interference.
`
The project should develop expertise in the following areas:

1. Speech coding and analysis

2. Adaptive signal processing

3. Noise and echo analysis and cancellation

4. Microphone array beamformer architecture

5. The DSP chip (TMS320Cxx) and reconfigurability
`
`
The initial studies of noise and echo in a vehicular environment have been completed, and the project is now at the stage of looking into adaptive cancellation of noise and echoes. Although there has been research in these areas, further research is required to perform the cancellation in a sub-band fashion. Sub-band coding techniques employing the wavelet concept, in which each sub-band is analysed and processed separately, are expected to reduce the computational effort.
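As a small illustration of wavelet-style sub-band splitting, the sketch below uses the Haar wavelet, the simplest case, to divide a signal into a low band and a high band at half the original rate and then reconstruct it. The choice of Haar and the function names are assumptions made here; the report does not say which wavelet would be used.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def haar_analysis(x):
    """Split an even-length signal into low and high sub-bands at half rate."""
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return low, high


def haar_synthesis(low, high):
    """Rebuild the full-rate signal from its two Haar sub-bands."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) * _S)
        x.append((a - d) * _S)
    return x


signal = [math.sin(2 * math.pi * n / 32) + 0.1 * (-1) ** n for n in range(64)]
low_band, high_band = haar_analysis(signal)
rebuilt = haar_synthesis(low_band, high_band)
```

Each sub-band holds half as many samples as the input, so a canceller applied per band runs at half the rate, and the split can be repeated on the low band to isolate narrower ranges such as the low-frequency engine and road noise.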
`
To analyse the interspeaker interference, we need to employ blind deconvolution or separation of the signals from the interfering speakers. The techniques currently available are fairly complicated and require a great deal of computing power, which makes them unacceptable for this project. A simplified form is required; in addition, the microphone array beamformer architecture with directive control should prove useful. This technique needs further investigation in the noisy vehicular environment.
`
Lastly, implementing the speech enhancer, including the MELP speech CODEC, on the TMS320Cxx DSP chip will be a great challenge. The reconfigurable architecture requires smart and fast algorithms that fit into the time slots allocated from the MOPS available on the DSP chip.
`
`
`6. Acknowledgements
`
I am gratefully indebted to Prof. Dr. Hüseyin Abut, my thesis advisor and professor, for his valuable guidance and support. Additionally, I would like to thank Dr. Tan Eng Chong for his advice as my thesis supervisor.
`
`
`7. References
`
[1] Thomas E. Miller and Jeffrey Barish, "Optimizing Sound for Listening in the Presence of Road Noise", The International Conference on Signal Processing Applications and Technology, ICSPAT '93, Santa Clara, Calif., USA, Sept. 28 - Oct. 1, 1993, Vol. 1, pp. 97-106.

[2] Carlos R. Martins, Moises S. Piedade, INESC and Ceautl Lisboa, "Fast Adaptive Noise Canceller using the LMS Algorithm", The International Conference on Signal Processing Applications and Technology, ICSPAT '93, Santa Clara, Calif., USA, Sept. 28 - Oct. 1, 1993, Vol. 1, pp. 121-127.

[3] Harrison, W. A., J. S. Lim and E. Singer, "A New Application of Adaptive Noise Cancellation", IEEE Trans. Acoust., Speech and Signal Processing, Vol. ASSP-34, No. 1, pp. 21-27, Feb. 1986.

[4] H. Olson, "Electronic Control of Noise, Vibration and Reverberation," J. Acoust. Soc. Am., Vol. 28, 1956, pp. 966-972.

[5] Juha Hakkinen and Mauri Vaananen, "Background Noise Suppressor for a Car Hands-free Microphone", The International Conference on Signal Processing Applications and Technology, ICSPAT '93, Santa Clara, Calif., USA, Sept. 28 - Oct. 1, 1993, Vol. 1, pp. 300-307.

[6] D. Messerschmitt, D. Hedberg, C. Cole, A. Haoui, and P. Winship, "Digital Voice Echo Canceller with a TMS320C20," in DSP Applications, K.-S. Lin, Ed., Prentice-Hall, 1987.
`
`
[7] S. Oh, V. Viswanathan, and P. Papamichalis, "Hands-Free Voice Communication in an Automobile With a Microphone Array," Proc. IEEE ICASSP-92, pp. I-281 - 284, San Francisco, CA.

[8] I. Claesson, S.E. Nordholm, B.A. Bengtsson, and P. Erickson, "A Multi-DSP Implementation of a Broad-Band Adaptive Beamformer for Use in a Hands-Free Mobile Radio Telephone," IEEE Trans. on Vehicular Technology, Vol. 40, pp. 194-201, Feb. 1991.

[9] L.J. Griffiths and C.W. Jim, "An Alternative Approach to Linearly Constrained Adaptive Beamforming," IEEE Trans. on Antennas Propag., Vol. AP-30, pp. 27-34, January 1982.

[10] E. Arkan, "Echo and Road Noise Cancellation in Digital Cellular Telephone," M.S. Thesis, San Diego State University, Spring 1994.

[11] E. Arkan, H. Abut, S. Pelling, fj. harris, and G.C. Marques, "Implementation of a 5.0 KB/s Coder for Vehicular Applications: Part II: Acoustic Echo and Noise Canceller," Proc. of ASILOMAR-1993 Conf. on Sig., Sys. & Computers, pp. 776-780, IEEE Computer Society Press, 1993.

[12] J. Tardelli, Chair, "US DoD Selection of 2400 BPS Standard," Special Session SPEC3, Proceedings of IEEE ICASSP-96, pp. 1137-1164, May 1996, Atlanta, GA.
`
`
[13] A. McCree, K. Truong, E.B. George, T.P. Barnwell, III and V. Viswanathan, "A 2.4 KBIT/S MELP Coder Candidate for the new U.S. Federal Standard," Proceedings of the IEEE ICASSP-96, May 1996, Atlanta, GA.

[14] B.S. Hong, "Adaptive Filtering for Automatic Noise Cancelling in an Interactive Classroom," Final Year Project Report, No: 74-96, School of Applied Science, Nanyang Technological University, Singapore, 1997.

[15] B. Widrow and S.D. Stearns, Adaptive Signal Processing, Prentice-Hall, Englewood Cliffs, N.J., 1985.

[16] Nestor Becerra Yoma, Fergus McInnes and Mervyn Jack, "Weighted Matching Algorithms and Reliability in Noise Cancelling by Spectral Subtraction", 1997 International Conference on Acoustics, Speech, and Signal Processing (ICASSP97), Munich, Germany, April 21-24, 1997, pp. 1171-1174.

[17] Joerg Meyer and Klaus Uwe Simmer, "Multi-channel Speech Enhancement in a Car Environment using Wiener Filtering and Spectral Subtraction", 1997 International Conference on Acoustics, Speech, and Signal Processing (ICASSP97), Munich, Germany, April 21-24, 1997, pp. 1167-1170.

[18] L. Arslan, A. McCree, and V. Viswanathan, "New Methods for Adaptive Noise Suppression," Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, ICASSP-95, Detroit, Michigan, pp. 812-815, May 1995.
`
`
`[19] S. Nordholm, I. Claesson and I. Bengtsson, “Adaptive Array Noise Suppression of
`
`Handsfree speaker input in Cars”, IEEE Trans. On Vehicular Technology, vol. 42,
`
`no. 4, pp. 514 – 518, Nov. 1993.
`
`[20] Walter Kellermann, “Strategies for Combining Acoustic Echo Cancellation and
`
`Adaptive Beamforming Microphone Arrays,” Proc. of 1997 International
`
`Conference on Acoustics, Speech, and Signal Processing (ICASSP97), Munich,
`
`Germany, April 21-24, 1997, pp. 219 – 222.
`
`[21] Shoji Makino, Klaus Strauss, Suehiro Shimauchi, Yoichi Haneda and Akira
`
`Nakagawa, “Subband Stereo Echo Canceller using the Projection Algorithm with
`
`Fast Convergence to the true Echo Path,” Proc. of the 1997 International
`
`Conference on Acoustics, Speech, and Signal Processing (ICASSP97), Munich,
`
`Germany, April 21-24, 1997, pp. 299 - 302.
`
`[22] H. Schütze, “Convergence of Acoustic Echo Cancellers for Hands-Free
`
`Telephones Operating Under Feedback Conditions,” IEEE Trans. Speech and
`
`Audio Processing, April 1993, Vol. 1 #2, pp. 257 – 260.
`
`[23] S. Gazor and Y. Grenier, “Criteria for Positioning of Sensors for a Microphone
`
`Array”, IEEE Trans. On Speech and Audio Processing, July 1995, Vol. 3 #4, pp.
`
`294 – 303.
`
`[24] R. Martin and S. Gustafsson, “The Echo shaping approach to Acoustic Echo
`
`Control”, Speech Communication, special issue on acoustic echo and noise
`
`control, 20(3-4), January 1997.
`
`[25] Jens Meyer and Carsten Sydow, “Noise Cancelling for Microphone Arrays”,
`
`Proc. of the 1997 International Conference on Acoustics, Speech, and Signal
`
`Processing (ICASSP97), Munich, Germany, April 21-24, 1997, pp. 211 - 214.
`
`[26] Gary W. Elko and Anh-Tho Nguyen Pong, “A Steerable and Variable First-Order
`
`Differential Microphone Array”, Proc. of the 1997 International Conference on
`
`Acoustics, Speech, and Signal Processing (
