(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2006/0120537 A1
(43) Pub. Date: Jun. 8, 2006
Burnett et al.

(54) NOISE SUPPRESSING MULTI-MICROPHONE HEADSET

(76) Inventors: Gregory C. Burnett, Dodge Center, MN (US); Jaques Gagne, Los Gatos, CA (US); Dore Mark, San Francisco, CA (US); Alexander M. Asseily, London (GB); Nicolas Petit, Burlingame, CA (US)

Correspondence Address:
COURTNEY STANFORD & GREGORY LLP
P.O. BOX 9686
SAN JOSE, CA 95157 (US)

(21) Appl. No.: 11/199,856

(22) Filed: Aug. 8, 2005

Related U.S. Application Data

(60) Provisional application No. 60/599,468, filed on Aug. 6, 2004. Provisional application No. 60/599,618, filed on Aug. 6, 2004.

Publication Classification

(51) Int. Cl.
A61F 11/06 (2006.01)
G10K 11/16 (2006.01)
H03B 29/00 (2006.01)

(52) U.S. Cl. 381/71.6; 381/72

(57) ABSTRACT

A new type of headset is described that employs adaptive noise suppression, multiple microphones, a voice activity detection (VAD) device, and unique mechanisms to position it correctly on either ear for use with phones, computers, and wired or wireless connections of any kind. In various embodiments, the headset employs combinations of new technologies and mechanisms to provide the user a unique communications experience.
[Representative figure: block diagram relating the VAD algorithm and noise suppression (elements 230, 240, 101); see FIG. 2]
Page 1 of 31
GOOGLE EXHIBIT 1015
[Sheet 2 of 21 — FIG. 1-A: side slice view of the SSM acoustic vibration sensor 100 (cross-hatched section drawing)]
[Sheet 3 of 21 — FIG. 1-B: perspective view of an assembled Jawbone earpiece. FIG. 2: block diagram relating the VAD algorithm and noise suppression (elements 230, 240, 101)]
[Sheet 6 of 21 — FIG. 3: flow chart of the SSM sensor VAD (reference numerals 300-316):
Receive SSM sensor data
Filter and digitize SSM sensor data
Segment and step digitized data
Remove spectral information corrupted by noise
Calculate energy in each window
Compare energy to threshold values
Energy above threshold indicates voiced speech
Energy below threshold indicates unvoiced speech]
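The FIG. 3 flow can be sketched in code. The following Python fragment is an illustrative sketch only, not the patent's implementation; the window length, step size, and energy threshold are hypothetical values, and the spectral-denoising step of the flow chart is omitted:

```python
def ssm_energy_vad(samples, win=160, step=80, threshold=1e-4):
    """Toy energy-threshold VAD over windowed SSM samples.

    Follows the FIG. 3 flow: segment the digitized data into
    overlapping windows, compute each window's energy, and compare
    it to a threshold (energy above threshold -> voiced).
    """
    flags = []
    for start in range(0, len(samples) - win + 1, step):
        frame = samples[start:start + win]
        energy = sum(s * s for s in frame)   # energy in the window
        flags.append(energy > threshold)     # True = voiced speech
    return flags

# Quiet samples followed by a burst of "speech":
x = [0.0] * 400 + [0.1] * 400
print(ssm_energy_vad(x))  # early windows False, later windows True
```

In practice the threshold would be adapted to the measured background level rather than fixed, but the segment/energy/compare structure is the same.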
[Sheet 7 of 21 — FIG. 3-A: schematic of the SSM coupler 110 with contact device 112; example dimensions Ø6.4 and Ø2.8 (mm); Section A-A]
[Sheet 9 of 21 — FIG. 4: noise suppression example; noisy acoustic signal 402 with SSM-derived VAD signal 404; x-axis: Time (samples at 8 kHz, ×10^4)]
[Sheet 10 of 21 — FIG. 4-A: exploded view of an SSM under an alternative embodiment; Section A-A]
[Sheet 12 of 21 — FIG. 5: microphone configuration, with directions labeled "Towards Speech" and "Away from Speech"]
[Sheet 13 of 21 — FIG. 5-A: areas of SSM sensitivity (504, 506, 508): inside ear canal (520), ear, and in front of ear (512)]
[Sheet 15 of 21 — FIG. 6-A: generic headset with SSM placed at many different locations]
[Sheet 16 of 21 — FIG. 7: simulated magnitude responses; labels: "Mic 1 response", "Mic 2 body" (704)]
[Sheet 19 of 21 — FIG. 8: magnitude (dB) versus frequency (Hz), axis range roughly 500-3500 Hz]
[Sheet 20 of 21 — FIG. 8-B: headset on ear; labels: "Body", "Earbud Barrel"]
[Sheet 21 of 21 — FIG. 10-B: side view of headset on ear]
NOISE SUPPRESSING MULTI-MICROPHONE HEADSET
RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Patent Application Ser. No. 60/599,468, titled "Jawbone Headset" and filed Aug. 6, 2004, which is hereby incorporated by reference herein in its entirety. This application further claims the benefit of U.S. Provisional Patent Application Ser. No. 60/599,618, titled "Wind and Noise Compensation in a Headset" and filed Aug. 6, 2004, which is hereby incorporated by reference herein in its entirety. This application is related to the following U.S. patent applications assigned to Aliph, of Brisbane, Calif. These include:
[0002] 1. A unique noise suppression algorithm (reference Method and Apparatus for Removing Noise from Electronic Signals, filed Nov. 21, 2002, and Voice Activity Detector (VAD)-Based Multiple Microphone Acoustic Noise Suppression, filed Sep. 18, 2003).

[0003] 2. A unique microphone arrangement and configuration (reference Microphone and Voice Activity Detection (VAD) Configurations for use with Communications Systems, filed Mar. 27, 2003).

[0004] 3. A unique voice activity detection (VAD) sensor, algorithm, and technique (reference Acoustic Vibration Sensor, filed Jan. 30, 2004, and Voice Activity Detection (VAD) Devices and Systems, filed Nov. 20, 2003).

[0005] 4. An incoming audio enhancement system named Dynamic Audio Enhancement (DAE) that filters and amplifies the incoming audio in order to make it simpler for the user to better hear the person on the other end of the conversation (i.e., the "far end").

[0006] 5. A unique headset configuration that uses several new techniques to ensure proper positioning of the loudspeaker, microphones, and VAD sensor as well as a comfortable and stable position.

All of the U.S. patents referenced herein are incorporated by reference herein in their entirety.
FIELD

[0007] The disclosed embodiments relate to systems and methods for detecting and processing a desired signal in the presence of acoustic noise.
BACKGROUND

[0008] Many noise suppression algorithms and techniques have been developed over the years. Most of the noise suppression systems in use today for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in "Suppression of Acoustic Noise in Speech using Spectral Subtraction," IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, U.S. Pat. No. 5,687,243 of McLaughlin, et al., and U.S. Pat. No. 4,811,404 of Vilmur, et al. Generally, these techniques make use of a microphone-based Voice Activity Detector (VAD) to determine the background noise characteristics, where "voice" is generally understood to include human voiced speech, unvoiced speech, or a combination of voiced and unvoiced speech.

[0009] The VAD has also been used in digital cellular systems. As an example of such a use, see U.S. Pat. No. 6,453,291 of Ashley, where a VAD configuration appropriate to the front-end of a digital cellular system is described. Further, some Code Division Multiple Access (CDMA) systems utilize a VAD to minimize the effective radio spectrum used, thereby allowing for more system capacity. Also, Global System for Mobile Communication (GSM) systems can include a VAD to reduce co-channel interference and to reduce battery consumption on the client or subscriber device.

[0010] These typical microphone-based VAD systems are significantly limited in capability as a result of the addition of environmental acoustic noise to the desired speech signal received by the single microphone, wherein the analysis is performed using typical signal processing techniques. In particular, limitations in performance of these microphone-based VAD systems are noted when processing signals having a low signal-to-noise ratio (SNR), and in settings where the background noise varies quickly. Thus, similar limitations are found in noise suppression systems using these microphone-based VADs.
BRIEF DESCRIPTION OF THE FIGURES

[0011] FIG. 1: Overview of the Pathfinder noise suppression system.

[0012] FIG. 2: Overview of the VAD device relationship with the VAD algorithm and the noise suppression algorithm.

[0013] FIG. 3: Flow chart of SSM sensor VAD embodiment.

[0014] FIG. 4: Example of noise suppression performance using the SSM VAD.

[0015] FIG. 5: A specific microphone configuration embodiment as used with the Jawbone headset.

[0016] FIG. 6: Simulated magnitude response of a cardioid microphone at a single frequency.

[0017] FIG. 7: Simulated magnitude responses for Mic1 and Mic2 of the Jawbone-type microphone configuration at a single frequency.

[0018] FIG. 1-A: Side slice view of an SSM (acoustic vibration sensor).

[0019] FIG. 2A-A: Exploded view of an SSM.

[0020] FIG. 2B-A: Perspective view of an SSM.

[0021] FIG. 3-A: Schematic diagram of an SSM coupler.

[0022] FIG. 4-A: Exploded view of an SSM under an alternative embodiment.

[0023] FIG. 5-A: Representative areas of SSM sensitivity on the human head.

[0024] FIG. 6-A: Generic headset with SSM placed at many different locations.
[0025] FIG. 7-A: Diagram of a manufacturing method that may be used to construct an SSM.

[0026] FIG. 8: Diagram of the magnitude response of the FIR highpass filter used in the DAE algorithm to increase intelligibility in high-noise acoustic environments.

[0027] FIG. 1-B: Perspective view of an assembled Jawbone earpiece.

[0028] FIG. 2-B: Perspective view of the other side of the Jawbone earpiece.

[0029] FIG. 3-B: Perspective view of the assembled Jawbone earpiece.

[0030] FIG. 4-B: Perspective exploded and assembled views of the Jawbone earpiece.

[0031] FIG. 5-B: Perspective exploded view of the torsional spring-loading mechanism of the Jawbone earpiece.

[0032] FIG. 6-B: Perspective view of the control module.

[0033] FIG. 7-B: Perspective view of the microphone and sensor booty of the Jawbone earpiece.

[0034] FIG. 8-B: Top view orthographic drawing of the headset on the ear illustrating the angle between the earloop and body of the Jawbone earpiece.

[0035] FIG. 9-B: Top view orthographic drawing of the headset on the ear illustrating forces on the earpiece and head of the user.

[0036] FIG. 10-B: Side view orthographic drawing of the headset on the ear illustrating the force applied by the earpiece to the pinna.
DETAILED DESCRIPTION

The Pathfinder Noise Suppression System

[0037] FIG. 1 is a block diagram of the Pathfinder noise suppression system 100 including the Pathfinder noise suppression algorithm 101 and a VAD system 102, under an embodiment. It also includes two microphones, MIC 1 (110) and MIC 2 (112), that receive signals or information from at least one speech source 120 and at least one noise source 122. The path s(n) from the speech source 120 to MIC 1 and the path n(n) from the noise source 122 to MIC 2 are considered to be unity. Further, H1(z) represents the path from the noise source 122 to MIC 1, and H2(z) represents the path from the signal source 120 to MIC 2.
[0038] A VAD signal 104, derived in some manner, is used to control the method of noise removal, and is related to the noise suppression technique discussed below as shown in FIG. 2. A preview of the VAD technique discussed below using an acoustic transducer (called the Skin Surface Microphone, or SSM) is shown in FIG. 3. Referring back to FIG. 1, the acoustic information coming into MIC 1 is denoted by m1(n). The information coming into MIC 2 is similarly labeled m2(n). In the z (digital frequency) domain, we can represent them as M1(z) and M2(z). Thus

M1(z) = S(z) + N(z)H1(z)
M2(z) = N(z) + S(z)H2(z)    (1)
[0039] This is the general case for all realistic two-microphone systems. There is always some leakage of noise into MIC 1, and some leakage of signal into MIC 2. Equation 1 has four unknowns and only two relationships and, therefore, cannot be solved explicitly. However, perhaps there is some way to solve for some of the unknowns in Equation 1 by other means. Examine the case where the signal is not being generated, that is, where the VAD indicates voicing is not occurring. In this case, s(n) = S(z) = 0, and Equation 1 reduces to

M1n(z) = N(z)H1(z)
M2n(z) = N(z)

where the n subscript on the M variables indicates that only noise is being received. This leads to

M1n(z) = M2n(z)H1(z)

H1(z) = M1n(z)/M2n(z)    (2)
[0040] Now, H1(z) can be calculated using any of the available system identification algorithms and the microphone outputs when only noise is being received. The calculation should be done adaptively in order to allow the system to track any changes in the noise.
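As a concrete illustration of this adaptive calculation, the sketch below identifies an FIR estimate of H1(z) with a normalized LMS (NLMS) update, adapting only while the VAD reports no voicing. This is a minimal sketch under assumed parameters (tap count, step size), not the Pathfinder implementation itself:

```python
def nlms_identify(mic1, mic2, vad, taps=4, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that mic1 ~= w * mic2 during
    noise-only periods (Equation 2: M1n(z) = M2n(z)H1(z))."""
    w = [0.0] * taps
    for n in range(taps - 1, len(mic1)):
        if vad[n]:                                  # voicing: freeze adaptation
            continue
        x = [mic2[n - k] for k in range(taps)]      # recent MIC 2 samples
        y = sum(wk * xk for wk, xk in zip(w, x))    # predicted MIC 1 sample
        e = mic1[n] - y                             # prediction error
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
    return w

# Noise-only test: the true H1 is a hypothetical simple gain of 0.5.
import random
random.seed(0)
noise = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic1 = [0.5 * v for v in noise]
w = nlms_identify(mic1, noise, vad=[False] * len(noise))
print([round(wk, 3) for wk in w])   # first tap converges near 0.5
```

Freezing the update whenever the VAD fires is what keeps the estimate tracking only the noise path, exactly as the paragraph above requires.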
[0041] After solving for one of the unknowns in Equation 1, H2(z) can be solved for by using the VAD to determine when voicing is occurring with little noise. When the VAD indicates voicing, but the recent (on the order of 1 second or so) history of the microphones indicates low levels of noise, assume that n(n) = N(z) = 0. Then Equation 1 reduces to

M1s(z) = S(z)
M2s(z) = S(z)H2(z)

which in turn leads to

M2s(z) = M1s(z)H2(z)

H2(z) = M2s(z)/M1s(z)

This calculation for H2(z) appears to be just the inverse of the H1(z) calculation, but remember that different inputs are being used. Note that H2(z) should be relatively constant, as there is always just a single source (the user) and the relative position between the user and the microphones should be relatively constant. Use of a small adaptive gain for the H2(z) calculation works well and makes the calculation more robust in the presence of noise.
[0042] Following the calculation of H1(z) and H2(z) above, they are used to remove the noise from the signal. Rewriting Equation 1 as

S(z) = M1(z) − N(z)H1(z)
N(z) = M2(z) − S(z)H2(z)

allows solving for S(z):

S(z) = [M1(z) − M2(z)H1(z)] / [1 − H1(z)H2(z)]    (3)
Generally, H2(z) is quite small, and H1(z) is less than unity, so for most situations at most frequencies

H1(z)H2(z) << 1,

and the signal can be estimated using

S(z) ≈ M1(z) − M2(z)H1(z)    (4)

Therefore the assumption is made that H2(z) is not needed, and H1(z) is the only transfer function to be calculated. While H2(z) can be calculated if desired, good microphone placement and orientation can obviate the need for the H2(z) calculation.
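A worked toy example of Equation 4: with H1 known (here a hypothetical single-tap gain of 0.5) and H2 = 0, subtracting the H1-filtered MIC 2 signal from MIC 1 recovers the speech. This is an illustrative sketch with made-up signals, not production code:

```python
def fir_filter(h, x):
    """y[n] = sum_k h[k] * x[n - k] (direct-form FIR)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def denoise(m1, m2, h1):
    """Equation 4 in the time domain: s_hat = m1 - h1 * m2."""
    ref = fir_filter(h1, m2)
    return [a - b for a, b in zip(m1, ref)]

speech = [0.0, 1.0, 0.0, -1.0] * 25                # toy "speech"
noise = [0.3, -0.3] * 50                           # toy noise, fully seen by MIC 2
m1 = [s + 0.5 * n for s, n in zip(speech, noise)]  # speech plus leaked noise
s_hat = denoise(m1, noise, h1=[0.5])
print(max(abs(a - b) for a, b in zip(s_hat, speech)))  # ~0 (rounding only)
```

With a real, imperfect H1 estimate the residual would not vanish; this is exactly why the text cares about how accurately H1(z) can be tracked.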
[0043] Significant noise suppression can best be achieved through the use of multiple subbands in the processing of acoustic signals. This is because most adaptive filters used to calculate transfer functions are of the FIR type, which use only zeros and not poles to model a system that contains both zeros and poles, as in

H1(z) (model) = B(z)/A(z)

Such a model can be sufficiently accurate given enough taps, but this can greatly increase computational cost and convergence time. What generally occurs in an energy-based adaptive filter system such as the least-mean squares (LMS) system is that the system matches the magnitude and phase well at a small range of frequencies that contain more energy than other frequencies. This allows the LMS to fulfill its requirement to minimize the energy of the error to the best of its ability, but this fit may cause the noise in areas outside of the matching frequencies to rise, reducing the effectiveness of the noise suppression.

[0044] The use of subbands alleviates this problem. The signals from both the primary and secondary microphones are filtered into multiple subbands, and the resulting data from each subband (which can be frequency shifted and decimated if desired, but it is not necessary) is sent to its own adaptive filter. This forces the adaptive filter to try to fit the data in its own subband, rather than just where the energy is highest in the signal. The noise-suppressed results from each subband can be added together to form the final denoised signal at the end. Keeping everything time-aligned and compensating for filter shifts is essential, and the result is a much better model of the system than the single-subband model, at the cost of increased memory and processing requirements.
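The subband idea can be sketched with a toy two-band split (a one-pole lowpass and its complement) and a per-band least-squares scalar standing in for a full adaptive filter. All parameters here are hypothetical; a real implementation would use proper filterbanks and per-band adaptive filters:

```python
def lowpass(x, a=0.9):
    """One-pole lowpass; the complementary highpass is x - lowpass(x)."""
    y, out = 0.0, []
    for v in x:
        y = a * y + (1 - a) * v
        out.append(y)
    return out

def split_bands(x):
    lo = lowpass(x)
    return lo, [v - l for v, l in zip(x, lo)]

def denoise_per_band(mic1, mic2):
    """Fit a scalar H1 per band by least squares (noise-only data),
    subtract, and recombine the band residuals."""
    result = None
    for b1, b2 in zip(split_bands(mic1), split_bands(mic2)):
        den = sum(q * q for q in b2) or 1.0
        h = sum(p * q for p, q in zip(b1, b2)) / den   # per-band H1 estimate
        res = [p - h * q for p, q in zip(b1, b2)]
        result = res if result is None else [r + v for r, v in zip(result, res)]
    return result

# Noise whose coupling into MIC 1 differs by band (0.8 low, 0.2 high):
import random
random.seed(1)
n = [random.uniform(-1.0, 1.0) for _ in range(4000)]
n_lo, n_hi = split_bands(n)
mic1 = [0.8 * lo + 0.2 * hi for lo, hi in zip(n_lo, n_hi)]
res_sub = denoise_per_band(mic1, n)
# Single full-band scalar for comparison:
h_fb = sum(p * q for p, q in zip(mic1, n)) / sum(q * q for q in n)
res_full = [p - h_fb * q for p, q in zip(mic1, n)]
e = lambda x: sum(v * v for v in x)
print(e(res_sub) < e(res_full))   # per-band fit leaves less residual noise
```

Because the true coupling here is frequency dependent, no single full-band gain can cancel both bands at once, which is precisely the failure mode the paragraph attributes to single-band LMS.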
[0045] An example of the noise suppression performance using this system with an SSM VAD device is shown in FIG. 4. The top plot shows the original noisy acoustic signal 402 and the SSM-derived VAD signal 404, the middle plot displays the SSM signal as taken on the cheek 412, and the bottom plot shows the cleaned signal after noise suppression 422 using the Pathfinder algorithm outlined above.

[0046] More information may be found in the applications referenced above in the Introduction, part 1.
Microphone Configuration

[0048] In an embodiment of the Pathfinder noise suppression system, unidirectional or omnidirectional microphones may be employed. A variety of microphone configurations that enable Pathfinder are shown in the references in the Introduction, part 2. We will examine only a single embodiment as implemented in the Jawbone headset, but many implementations are possible as described in the references cited in the Introduction, so we are not limited to this embodiment.
[0049] The use of directional microphones has been very successful and is used to ensure that the transfer functions H1(z) and H2(z) remain significantly different. If they are too similar, the desired speech of the user can be significantly distorted. Even when they are dissimilar, some speech signal is received by the noise microphone. If it is assumed that H2(z) = 0 then, as in Equation 4 above, even assuming a perfect VAD there will be some distortion. This can be seen by referring to Equation 3 and solving for the result when H2(z) is not included:

S_est(z) = S(z)[1 − H2(z)H1(z)]

This shows that the signal will be distorted by the factor 1 − H2(z)H1(z). Therefore, the type and amount of distortion will change depending on the noise environment. With very little noise, H1(z) is nearly zero and there is very little distortion. With noise present, the amount of distortion may change with the type, location, and intensity of the noise source(s). Good microphone configuration design minimizes these distortions.
[0050] An embodiment of an appropriate microphone configuration is one in which two directional microphones are used, as shown in configuration 500 in FIG. 5. The relative angle f between vectors normal to the faces of the microphones is in a range between 60 and 135 degrees. The distances d1 and d2 are each in the range of zero (0) to 15 centimeters, with best performance coming with distances between 0 and 2 cm. This configuration orients the speech microphone, termed MIC 1 above, toward the user's mouth, and the noise microphone, termed MIC 2 above, away from the user's mouth. Assuming that the two microphones are identical in terms of spatial and frequency response, changing the value of the angle f will change the overlap of the responses of the microphones. This is demonstrated in FIG. 6 and FIG. 7 for cardioid microphones. In FIG. 6, a simulated spatial response at a single frequency is shown for a cardioid microphone. The body of the microphone is denoted by 602, the response by 610, the null of the response by 612, and the maximum of the response by 614. In FIG. 7, the responses of two cardioid microphones are shown with f = 90 degrees. The responses overlap, and where the response of Mic1 is greater than that of Mic2 the gain

G = M1(z)/M2(z)

is greater than 1 (730), and where the response of Mic1 is less than that of Mic2, G is less than 1 (720). Clearly, as the angle f between the microphones is varied, the amount of overlap, and thus the areas where G is greater or less than one, varies as well. This variation affects the noise suppression performance both in terms of the amount of noise suppression and the amount of speech distortion, and a good compromise between the two must be found by adjusting f until satisfactory performance is realized.
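The overlap behavior described for FIG. 7 can be reproduced with the idealized cardioid pattern (1 + cos θ)/2. In this sketch the angle f is assumed to be 90 degrees; it shows G > 1 for a source on Mic1's axis and G < 1 on Mic2's axis:

```python
import math

def cardioid(theta):
    """Idealized cardioid magnitude response at angle theta (radians)
    from the microphone's axis; the response has a null at theta = pi."""
    return (1 + math.cos(theta)) / 2

def gain_ratio(theta, f):
    """G = Mic1 response / Mic2 response for a source at angle theta,
    with Mic1's axis at 0 and Mic2's axis rotated by f."""
    return cardioid(theta) / cardioid(theta - f)

f = math.radians(90)
print(gain_ratio(0.0, f))               # on Mic1's axis: ~2.0 (G > 1)
print(gain_ratio(math.radians(90), f))  # on Mic2's axis: ~0.5 (G < 1)
```

Sweeping f in this toy model shifts where the two responses cross G = 1, which is the overlap trade-off the paragraph describes tuning.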
[0051] In addition, the overlap of microphone responses can be induced or further changed by the addition of front and rear vents to the microphone mount. These vents change the response of the microphone by altering the delay between the front and rear faces of the diaphragm. Thus, vents can be used to alter the response overlap and thereby change the denoising performance of the system.
Design Tips:

[0052] A good microphone configuration can be difficult to construct. The foundation of the process is to use two microphones that have similar noise fields and different speech fields. Simply put, to the microphones the noise should appear to be about the same and the speech should be different. This similarity for noise and difference for speech allows the algorithm to remove noise efficiently and remove speech poorly, which is desired. Proximity effects can be used to further increase the noise/speech difference (NSD) when the microphones are located close to the mouth, but orientation is the primary difference vehicle when the microphones are more than about five to ten centimeters from the mouth. The NSD is defined as the difference in the speech energy detected by the microphones minus the difference in the noise energy, in dB. NSDs of 4-6 dB result in both good noise suppression and low speech distortion. NSDs of 0-4 dB result in excellent noise suppression but high speech distortion, and NSDs of 6+ dB result in good to poor noise suppression and very low speech distortion. Naturally, since the response of a directional microphone is directly related to frequency, the NSD will also be frequency dependent, and different frequencies of the same noise or speech may be denoised or devoiced by different amounts depending on the NSD for that frequency.
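The NSD as defined above can be computed directly from microphone captures. In this sketch the signals and levels are hypothetical, chosen so that speech is about 6 dB hotter in Mic1 while the noise energies match:

```python
import math

def energy_db(x):
    """Signal energy in dB (10 * log10 of the sum of squares)."""
    return 10 * math.log10(sum(v * v for v in x))

def nsd_db(speech_m1, speech_m2, noise_m1, noise_m2):
    """Noise/speech difference: inter-microphone speech-energy
    difference minus inter-microphone noise-energy difference, in dB."""
    speech_diff = energy_db(speech_m1) - energy_db(speech_m2)
    noise_diff = energy_db(noise_m1) - energy_db(noise_m2)
    return speech_diff - noise_diff

s1 = [2.0] * 100   # speech in Mic1: amplitude 2
s2 = [1.0] * 100   # speech in Mic2: amplitude 1 (~6 dB lower)
n1 = [0.5] * 100   # noise identical in both microphones
n2 = [0.5] * 100
print(round(nsd_db(s1, s2, n1, n2), 1))  # 6.0 dB: in the "good" range above
```

In a real measurement the speech-only and noise-only segments would be selected by the VAD, and the NSD evaluated per frequency band rather than broadband.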
[0053] Another very important stipulation is that there should be little or no noise in Mic1 that is not detected in some way by Mic2. In fact, generally, the closer the levels (energies) of the noise in Mic1 and Mic2, the better the noise suppression. However, if the speech levels are about the same in both microphones, then speech distortion due to de-voicing will also be high, and the overall increase in SNR may be low. Therefore it is crucial that the noise levels be as similar as possible while the speech levels are as different as possible. It is normally not possible to simultaneously minimize noise differences while maximizing speech differences, so a compromise must be made. Experimentation with a configuration can often yield one that works reasonably well for noise suppression with acceptable speech distortion.
In summary, the design process rules can be stated as follows:

[0054] 1. The noise energy should be about the same in both microphones.

[0055] 2. The speech energy has to be different in the microphones.

[0056] 3. Take advantage of proximity effect to maximize NSD.

[0057] 4. Keep the distance between the microphones as small as practical.

[0058] 5. Use venting effects on the directionality of the microphones to get the NSD to around 4-6 dB.
[0059] In the configuration above, the amount of response overlap, and therefore the angle f between the axes of the microphones, will depend on the responses of the microphones as well as the mounting and venting of the microphones. However, a useable configuration is readily found through experimentation.

[0060] The microphone configuration implementation described above is a specific implementation of one of many possible implementations, but the scope of this application is not so limited. There are many ways to specifically implement the ideas and techniques presented above, and the specified implementation is simply one of many that are possible. For example, the references cited in the Introduction contain many different variations on the configuration of the microphones.
VAD Device

[0062] The VAD device for the Jawbone headset is based upon the references given in the Introduction, part 3. It is an acoustic vibration sensor, also referred to as a speech sensing device or a Skin Surface Microphone (SSM), and is described below. The acoustic vibration sensor is similar to a microphone in that it captures speech information from the head area of a human talker or talkers in noisy environments. However, it is different than a conventional microphone in that it is designed to be more sensitive to speech frequencies detected on the skin of the user than to environmental acoustic noise. This technique is normally only successful for a limited range of frequencies (normally ~100 Hz to 1000 Hz, depending on the noise level), but this is normally sufficient for excellent VAD performance.
[0063] Previous solutions to this problem have either been vulnerable to noise, physically too large for certain applications, or cost prohibitive. In contrast, the acoustic vibration sensor described herein accurately detects and captures speech vibrations in the presence of substantial airborne acoustic noise, yet within a smaller and cheaper physical package. The noise-immune speech information provided by the acoustic vibration sensor can subsequently be used in downstream speech processing applications (speech enhancement and noise suppression, speech encoding, speech recognition, talker verification, etc.) to improve the performance of those applications.
[0064] The following description provides specific details for a thorough understanding of, and enabling description for, embodiments of a transducer. However, one skilled in the art will understand that the invention may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the invention.
[0065] FIG. 1-A is a cross section view of an acoustic vibration sensor 100, also referred to herein as the sensor 100, under an embodiment. FIG. 2A-A is an exploded view of an acoustic vibration sensor 100, under the embodiment of FIG. 1-A. FIG. 2B-A is a perspective view of an acoustic vibration sensor 100, under the embodiment of FIG. 1-A. The sensor 100 includes an enclosure 102 having a first port 104 on a first side and at least one second port 106 on a second side of the enclosure 102. A diaphragm 108, also referred to as a sensing diaphragm 108, is positioned between the first and second ports. A coupler 110, also referred to as the shroud 110 or cap 110, forms an acoustic seal around the enclosure 102 so that the first port 104 and the side of the diaphragm facing the first port 104 are isolated from the airborne acoustic environment of the human talker. The coupler 110 of an embodiment is contiguous, but is not so limited. The second port 106 couples a second side of the diaphragm to the external environment.
[0066] The sensor also includes electret material 120 and the associated components and electronics coupled to receive acoustic signals from the talker via the coupler 110 and the diaphragm 108 and convert the acoustic signals to electrical signals. Electrical contacts 130 provide the electrical signals as an output. Alternative embodiments can use any type/combination of materials and/or electronics to convert the acoustic signals to electrical signals and output the electrical signals.
[0067] The coupler 110 of an embodiment is formed using materials having acoustic impedances similar to the impedance of human skin (the characteristic acoustic impedance of skin is approximately 1.5×10^6 Pa·s/m). The coupler 110, therefore, is formed using a material that includes at least one of silicone gel, dielectric gel, thermoplastic elastomers (TPE), and rubber compounds, but is not so limited. As an example, the coupler 110 of an embodiment is formed using Kraiburg TPE products. As another example, the coupler 110 of an embodiment is formed using Sylgard® silicone products.
[0068] The coupler 110 of an embodiment includes a contact device 112 that includes, for example, a nipple or protrusion that protrudes from either or both sides of the coupler 110. In operation, a contact device 112 that protrudes from both sides of the coupler 110 includes one side of the contact device 112 that is in contact with the skin surface of the talker and another side of the contact device 112 that is in contact with the diaphragm, but the embodiment is not so limited. The coupler 110 and the contact device 112 can be formed from the same or different materials.
[0069] The coupler 110 transfers acoustic energy efficiently from the skin/flesh of a talker to the diaphragm, and seals the diaphragm from ambient airborne acoustic signals. Consequently, the coupler 110 with the contact device 112 efficiently transfers acoustic signals directly from the talker's body (speech vibrations) to the diaphragm while isolating the diaphragm from acoustic signals in the airborne environment of the talker (the characteristic acoustic impedance of air is approximately 415 Pa·s/m). The diaphragm is isolated from acoustic signals in the airborne environment of the talker by the coupler 110 because the coupler 110 prevents the signals from reaching the diaphragm, thereby reflecting and/or dissipating much of the energy of the acoustic signals in the airborne environment. Consequently, the sensor 100 responds primarily to acoustic energy transferred from the skin of the talker, not air. When placed against the head of the talker, the sensor 100 picks up speech-induced acoustic signals on the surface of the skin while airborne acoustic noise signals are largely rejected, thereby increasing the signal-to-noise ratio and providing a very reliable source of speech information.
[0070] Performance of the sensor 100 is enhanced through the use of the seal provided between the diaphragm and the airborne environment of the talker. The seal is provided by the coupler 110. A modified gradient microphone is used in an embodiment because it has pressure ports on both ends. Thus, when the first port 104 is sealed by the coupler 110, the second port 106 provides a vent for air movement through the sensor 100. The second port is not required for operation, but does increase the sensitivity of the device to tissue-borne acoustic signals. The second port also allows more environmental acoustic noise to be detected by the device, but the diaphragm's sensitivity to environmental acoustic noise is significantly decreased by the loading of the coupler 110, so the increase in sensitivity to the user's speech is greater than the increase in sensitivity to environmental noise.
[0071] FIG. 3-A is a schematic diagram of a coupler 110 of an acoustic vibration sensor, under the embodiment of FIG. 1-A. The dimensions shown are in millimeters and are only intended to serve as an example for one embodiment. Alternative embodiments of the coupler can have different configurations and/or dimensions. The dimensions of the coupler 110 show that the acoustic vibration sensor 100 is small (5-7 mm in diameter and 3-5 mm thick on average) in that the sensor 100 of an embodiment is approximately the same size as typical microphone capsules found in mobile communication devices. This small form factor allows for use of the sensor 100 in highly mobile miniaturized applications, where some example applications include at least one of cellular telephones, satellite telephones, portable telephones, wireline telephones, Internet telephones, wireless transceivers, wireless communication
