IEEE GLOBECOM 1996
`
`LONDON
`
`November 18 - 22, 1996
`
`
`
GLOBECOM 96 London
`
`Communications: The Key to Global Prosperity
`
GLOBAL INTERNET'96
`
`Conference Record
`
`Sponsored by the IEEE Communications Society, the IEE and
`
`the UKRI Communications Chapter
`
`RPX Exhibit 1033
`RPX v. DAE
`
`

`
`IEEE 1996 Global Telecommunications Conference
`
`IEEE catalog number:
`
`96CH35942
`
`ISBN numbers:
`
`softbound edition:
`
`0-7803-3336-5
`
`casebound edition:
`
`0-7803-3337-3
`
`microfiche edition:
`
`0-7803-3338-1
`
`Library of Congress number: 87-640337
`
`Copyrights and Reprint Permissions:
`
`Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits
`of U.S. copyright laws for private use of patrons those articles in this volume that carry a code at the
`bottom of the first page, provided the per-copy fee indicated in the code is paid through the
`
`Copyright Clearance Center
`20 Congress Street
`Salem, Massachusetts 01970, USA
`
`Instructors are permitted to photocopy isolated articles for non-commercial classroom use without fee.
`For other copying, reprint, or republications permission, write to
`
`Director, Publishing Services
`IEEE, 445 Hoes Lane, P.O. Box 1331
`
`Piscataway, NJ 08855-1331, USA
`
`All rights reserved. Copyright © 1996 by The Institute of Electrical and Electronics Engineers, Inc.
`
`Additional Proceedings may be ordered from:
`IEEE Service Center
`
`Publications Sales Department
`445 Hoes Lane, P.O. Box 1331
`
`Piscataway, NJ 08855-1331, USA
`
`

`
`This material may be protected by Copyright law (Title 17 U.S. Code)
`
`

`
`
2 Audio Packet Loss Recovery through Waveform Substitution

Previous proposals for receiver-only concealment of lost audio packets substitute the missing signal segment by repeating a prior segment. The “Pattern Matching” technique ([6], [...]) repeats a correctly received signal segment that is assumed to have maximum similarity with the lost segment. This is accomplished by matching a sample pattern immediately preceding the gap to a series of samples received earlier. As entire signal segments of at least one packet duration are completely repeated, this may cause echoing sounds.

Echoes can be avoided by “Pitch Waveform Replication” ([6], [7], [10]), where only one pitch period found in the most recently received packet is repeated throughout the missing packet. An extension of this technique, called “Phase Matching”, provides for synchronization on both edges of the substitute, reducing a clicking distortion caused by the other two methods. In the latter two cases the multiple repetition of the same small signal segment can lead to tinny sounds.

All of these techniques have in common that, with an increased length of the lost segment, the perceived quality deteriorates severely. We believe that this decrease in quality is to a large extent due to the specific distortions introduced by the different concealment techniques. Therefore, in the following sections a new method, based on time-scale modification, is proposed.

3 Description of the new method

3.1 Time-scale Modification with the WSOLA Algorithm

An appropriate algorithm must perform time-scale modification in real-time, and it must not change the pitch frequency, so that the sound remains natural. The “Waveform Similarity Overlap-Add” (WSOLA) algorithm presented in [9] meets these requirements.

[Figure 2 (diagram): input signal $x$ with the search region for $t_{x,1}$; segments $A'_1$ and $A'_2$ for comparison with $A_1$ and $A_2$, used to find the best positions $t_{x,1}$ and $t_{x,2}$; the pitch period of the audio signal is marked.]

Figure 2: Enhanced speed version of the WSOLA algorithm.

As shown in Fig. 2, time-scale expansion is achieved by extracting segments $A_\nu$ from the input signal at time instances $t_{x,\nu}$ and superimposing them in the output signal at larger spaced time instances $t_{y,\nu}$. The time instances $t_{x,\nu}$ are selected from a tolerance interval around $t_{x,\nu-1} + T_x$, where the input step size $T_x$ controls the time-scaling factor. To achieve a synchronized overlap of the $A_\nu$ in the output signal, the candidate for $A_\nu$ is chosen from the tolerance interval whose leftmost $L'$ samples resemble $A'_\nu$ as closely as possible, where $L'$ denotes the length of $A'_\nu$. A computationally efficient similarity measure, the cross-correlation coefficient

$$cc(\nu, \delta) = \sum_{k=0}^{L'-1} x(t_{x,\nu-1} + T_x + k + \delta) \cdot x(t_{x,\nu-1} + T_y + k)$$

is used. Then the desired time instance is

$$t_{x,\nu} = t_{x,\nu-1} + T_x + \delta^*, \qquad \delta^* = \arg\max_{\delta} \{ cc(\nu, \delta) \}.$$

The output step size is fixed to $T_y = \frac{1}{2} L$ ($L$ being the length of the $A_\nu$) and a fixed-length Hanning window is employed for overlap.
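The similarity search of Section 3.1 can be illustrated with a minimal sketch. It assumes an integer-indexed sample list and searches the offset over the interval $[-T_x, T_y - T_x]$; the function names, the tie-breaking rule (smallest offset wins) and the synthetic test signal are assumptions made for illustration and are not taken from the paper.

```python
def cross_correlation(x, candidate_start, template_start, length):
    """Unnormalised cross-correlation between two windows of the signal x."""
    return sum(x[candidate_start + k] * x[template_start + k] for k in range(length))


def best_offset(x, t_prev, T_x, T_y, L_prime):
    """Search delta in [-T_x, T_y - T_x] and return (delta_star, t_next).

    t_prev plays the role of t_{x,nu-1}; the template window starts at
    t_prev + T_y. Ties go to the smallest delta (an arbitrary assumption).
    """
    template_start = t_prev + T_y
    if template_start + L_prime > len(x):
        raise ValueError("input too short for the template window")
    best_delta, best_cc = None, None
    for delta in range(-T_x, T_y - T_x + 1):
        candidate_start = t_prev + T_x + delta
        if candidate_start < 0 or candidate_start + L_prime > len(x):
            continue  # candidate window would leave the signal
        cc = cross_correlation(x, candidate_start, template_start, L_prime)
        if best_cc is None or cc > best_cc:
            best_delta, best_cc = delta, cc
    return best_delta, t_prev + T_x + best_delta


if __name__ == "__main__":
    # Synthetic periodic signal with a pitch period of 8 samples (illustrative only).
    period = [0, 3, 5, 3, 0, -3, -5, -3]
    signal = [period[n % 8] for n in range(40)]
    print(best_offset(signal, t_prev=0, T_x=5, T_y=12, L_prime=16))
```

On a periodic signal the search locks onto a candidate that is in phase with the template window, which is the synchronization property the cross-correlation is meant to provide.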
`
3.2 Selection of Parameters

Fig. 1 shows the principle of error concealment using time-scale modification of correctly received packets. Distortions due to discontinuities at the boundary of the substituted and the next received packet are reduced by overlapping them in the range of 2 ms using Hanning windows.
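The boundary overlap described above can be sketched as a short cross-fade. This is a minimal illustration, assuming 8 kHz sampling (as used in the experiments) and complementary half-Hanning ramps that sum to one; the function name and the exact ramp formula are assumptions, since the text only states that Hanning windows are used over about 2 ms.

```python
import math

SAMPLE_RATE_HZ = 8000   # sampling rate used in the paper's experiments
OVERLAP_MS = 2          # overlap range stated in the text
OVERLAP_SAMPLES = SAMPLE_RATE_HZ * OVERLAP_MS // 1000  # 16 samples


def hanning_crossfade(tail, head):
    """Blend the end of the substituted signal (tail) into the start of the
    next received packet (head) with complementary half-Hanning ramps."""
    n = len(tail)
    if n == 0 or n != len(head):
        raise ValueError("tail and head must be non-empty and equally long")
    blended = []
    for k in range(n):
        # Rising half of a Hanning window; the falling half is 1 - fade_in.
        fade_in = 0.5 * (1.0 - math.cos(math.pi * (k + 0.5) / n))
        blended.append((1.0 - fade_in) * tail[k] + fade_in * head[k])
    return blended


if __name__ == "__main__":
    stretched_end = [1.0] * OVERLAP_SAMPLES   # placeholder samples
    next_start = [0.0] * OVERLAP_SAMPLES      # placeholder samples
    print(hanning_crossfade(stretched_end, next_start))
```

Because the two ramps add up to one at every sample, a signal that is already continuous across the boundary passes through the cross-fade unchanged.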
`
As the WSOLA algorithm was originally developed for time-scale modification of long speech sequences, the parameters $T_x$, $\delta$, $T_y$ and the lengths of $A_\nu$ and $A'_\nu$
`
`

`
`
must be chosen carefully to preserve the good sound quality when WSOLA is applied to short signal segments as shown in Fig. 1.

$A'_\nu$ must include at least one pitch period, so that the correct synchronization can be found ($L' \ge T_{p,max}$, where $T_{p,max}$ denotes the maximum pitch period of a speech signal). In contrast to [9], we chose the maximum possible interval $[-T_x, T_y - T_x]$ for the parameter $\delta$. The length of this interval must also include at least one pitch period ($T_y = L/2 \ge T_{p,max}$). We obtain $L' \ge L/2$ for the relation of the lengths of $A_\nu$ and $A'_\nu$. In our experiments we set $L' = L/2$ ([9]: $L' = L$), which results in an enhanced speed version (Fig. 2) with no perceivable deterioration of the output signal.

In addition, there is the following relation between the length of the input series of samples $l_{in}$ and the estimate $l'_{in}$ of the number of samples used by the algorithm,

$$l'_{in} \approx (N-1)\,T_x + L \overset{!}{\le} l_{in}$$

where $N$ is the number of extracted segments ($0 \le \nu < N$). Another constraint exists between the number of samples $l_{out}$ needed to replace the missing speech segment and the length of the output series of samples $l'_{out}$ received from the algorithm:

$$l'_{out} = (N-1)\,T_y + L = (N+1)\,T_y \overset{!}{\ge} l_{out}$$

For long speech sequences we have $L \ll l_{in}$, so the time-scale ratio $\beta = T_y / T_x$ is approximately equal to the expansion ratio $\alpha = l_{out} / l_{in}$. However, for very small input lengths the above condition doesn't hold, which results in $\beta \gg \alpha$. This implies extreme time expansion and poor output quality. Thus it is necessary to choose $\alpha$ as a compromise between output quality and the additional delay introduced. Additionally, an optimization algorithm for the parameter $l'_{out} - l_{out}$ might be employed to minimize loss of information. A further suggestion for the choice of these parameters can be found in [11].

3.3 Discussion of Delay and Computational Complexity

While other waveform substitution techniques repeat information from correct packets, our new scheme also modifies these packets themselves. Thus the computational effort to execute the algorithm is higher. Additionally, it is necessary not only to keep a copy of a certain number of packets, but to withhold these packets from the playout buffer.
`
Considering the parameter relations described in Section 3.2, we chose to conceal one lost packet (160 samples of 8 kHz PCM (G.711), i.e. 20 ms of audio) with three following packets. Four packets are needed to recover from the loss of two consecutive packets. In Fig. 1, packets 1 - 5 have to be queued for WSOLA and packets 1 - 3 for the other techniques, if overlap-add with correct neighbor packets is performed (for the other algorithms, copies of packets preceding packet 1 have to be stored, depending on the length of the search area). In communication systems that artificially introduce audio delay, such as video conferencing systems ([12]), the extra delay is acceptable. Furthermore, audio tools like vat and NeVoT include delay adaptation mechanisms ([13]), which add some delay to cope with packet delay jitter.
`
In the above example, twice the number of additions and four times the number of multiplications have to be performed compared to the other algorithms (see [14] for a detailed analysis). Taking into account the absolute number of operations ($\approx 8 \cdot 10^4$ multiplications and $\approx 4 \cdot 10^4$ additions for the example), the execution time usually available (because of delay adaptation), and the computing power of today's average hardware, the computational effort can still be considered low.
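The parameter relations of Section 3.2 can be worked through numerically for the single-loss case. The sketch below is illustrative only: it assumes that three 160-sample packets are stretched to fill four packet slots, that the segment length is $L = 320$ samples (so $T_y = 160$), and that the output constraint reads "output length at least $l_{out}$". None of these specific values is stated in the paper; the function name is hypothetical as well.

```python
import math


def wsola_parameters(l_in, l_out, L):
    """Derive N, T_x and the ratios alpha and beta from the two constraints
    (N - 1) * T_x + L <= l_in  and  (N + 1) * T_y >= l_out, with T_y = L / 2."""
    T_y = L // 2
    # Smallest number of segments whose overlap-added output covers l_out.
    N = max(2, math.ceil(l_out / T_y) - 1)
    # Largest integer input step that still fits into the available input.
    T_x = (l_in - L) // (N - 1)
    if T_x <= 0:
        raise ValueError("input too short for this segment length")
    return {
        "N": N,
        "T_x": T_x,
        "T_y": T_y,
        "alpha": l_out / l_in,   # expansion ratio
        "beta": T_y / T_x,       # time-scale ratio
    }


if __name__ == "__main__":
    PACKET = 160  # samples per packet (20 ms at 8 kHz, as in the text)
    # Assumed scenario: three received packets stretched over four slots.
    print(wsola_parameters(l_in=3 * PACKET, l_out=4 * PACKET, L=320))
```

Under these assumptions the time-scale ratio comes out noticeably larger than the expansion ratio, which is the effect the text describes for short inputs where $L$ is not small compared to $l_{in}$.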
`
`4 Experimental Results
`
`4.1 Test Environment
`
Four speech signals with different pitch frequencies (two male and two female speakers), sampled at 8 kHz, were used. The new Time-scale Modification technique (TM) was compared to Silence Substitution (S), Pattern Matching (PM) and Pitch Waveform Replication (PWR) using 40 test conditions of 10 seconds each. Thirteen non-expert listeners were asked to judge overall quality on a five-category (Mean Opinion Score) scale, comparable to test schemes used in [1], [2], [7] and [...]. Additionally, the presence of the disturbance components “tinny, metal”, “interrupted, clicking” and “echoing, reverberating” was judged for each condition.
`
We modelled single packet loss by suppressing one packet within five (“1 of 5”) and double packet loss by suppressing two packets within six (“2 of 6”). This deterministic order of dropping packets provides for equal comparison conditions. As the sound quality after error concealment heavily depends on which information has been lost, it is important to suppress the same audio segments over all comparisons.
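The deterministic loss patterns can be sketched as a simple mask generator. The position of the dropped packets inside each group is not given in the text, so the `first_lost_index` parameter and the function name are assumptions for illustration.

```python
def loss_pattern(num_packets, lost_per_group, group_size, first_lost_index=0):
    """Return a list of booleans (True = packet suppressed) in which
    lost_per_group consecutive packets are dropped in every group."""
    if first_lost_index < 0 or first_lost_index + lost_per_group > group_size:
        raise ValueError("lost packets must fit inside one group")
    return [
        first_lost_index <= (i % group_size) < first_lost_index + lost_per_group
        for i in range(num_packets)
    ]


if __name__ == "__main__":
    one_of_five = loss_pattern(10, lost_per_group=1, group_size=5)
    two_of_six = loss_pattern(12, lost_per_group=2, group_size=6)
    print(one_of_five)
    print(two_of_six)
```

Applying the same mask to every technique under test keeps the suppressed audio segments identical across comparisons, which is the point of using a deterministic pattern.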
`
“Anchoring” (introducing the quality scale) was done by presenting the original signal and a “worst case” signal (which contained all disturbance components) to the listeners. Immediately after that, the test signals for one speaker were presented to the listeners in a random sequence, with pauses of only a few seconds between two signals.
`
4.2 Results of Subjective Performance Tests
`
`Mean Opinion Scores (MOS) were calculated for each
`of the four test signals. Fig. 3 shows the average of
`these values for the different algorithms. A quality en-
`hancement of TM compared to all other techniques can
`be observed (especially for “2 of 6”).
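The averaging step can be sketched as follows. The ratings in the example are made-up placeholders on the five-category scale and are not the paper's data; the function names and the dictionary layout are assumptions for illustration.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Mean of listener ratings on a 1..5 scale for one test signal."""
    if not ratings or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be non-empty values between 1 and 5")
    return mean(ratings)


def average_mos(ratings_by_signal):
    """Compute the MOS per test signal, then average these values."""
    per_signal = {
        name: mean_opinion_score(ratings)
        for name, ratings in ratings_by_signal.items()
    }
    return per_signal, mean(per_signal.values())


if __name__ == "__main__":
    # Placeholder ratings for one algorithm and one loss pattern (not real data).
    placeholder = {"speaker_a": [3, 4, 4], "speaker_b": [2, 3, 4]}
    print(average_mos(placeholder))
```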
`
[Figure 3 (bar chart): Mean Opinion Score for the conditions worst case, PM, PWR, TM and original, with bars for the “1 of 5” and “2 of 6” loss patterns.]

Figure 3: Mean Opinion Score of overall quality.

Fig. 4 summarizes how often a specific disturbance component was noticed. It shows that the echoing sound produced by PM is eliminated completely and the tinny sound of PWR is reduced by the new TM technique. The component “interrupted/clicking” was rated about the same for all techniques and might be improved further for TM if phase matching ([8]) were incorporated into the TM technique. This can be done by controlling the time instances $t_{x,\nu}$ in the WSOLA algorithm such that phase matching between the right boundary of the stretched signal and the left boundary of the next correctly received packet is achieved.

[Figure 4 (bar chart): percentage of listeners who noticed the components tinny/metal, echoing/reverberating and interrupted/clicking.]

Figure 4: Disturbance components

In addition to the subjective performance tests, measurements in the Internet ([14]), comparable to those in [15], were carried out. These measurements showed a loss distribution which allows the application of the TM error concealment in the range of 55% to 75% of all “loss events” ($\ge 1$ consecutive packets are lost).

5 Conclusions

A new error concealment technique for lost audio packets based on time-scale modification has been proposed, and the parameters of the WSOLA time-scaling algorithm have been adapted to the context of short signal segments. Experiments show that typical disturbance components of other techniques are reduced and overall quality is improved.

References
`
[1] J. Suzuki and M. Taka. Missing packet recovery techniques for low-bit-rate coded speech. IEEE Journal on Selected Areas in Communications, SAC-7(5):707-717, June 1989.

[2] M. Yong. Study of voice packet reconstruction methods applied to CELP speech coding. In Proceedings ICASSP-92, pages II/125-128, March 1992.

[3] L.A. DaSilva, D.W. Petr, and V.S. Frost. A class-oriented replacement technique for lost speech packets. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-37(10):1597-1600, October 1989.

[4] N.S. Jayant and S.W. Christensen. Effects of packet losses in waveform coded speech and improvements due to an odd-even sample-interpolation procedure. IEEE Transactions on Communications, COM-29(2):101-109, February 1981.

[5] V. Hardman, M. Sasse, M. Handley and A. Watson. Reliable Audio for Use over the Internet. In Proceedings Inet 95, http://info.isoc.org/HMP/PAPER/070/abst.html

[6] D.J. Goodman, G.B. Lockhart, O.J. Wasem, and W. Wong. Waveform substitution techniques for recovering missing speech segments in packet voice communications. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-34(6):1449-1464, December 1986.

[7] D.J. Goodman, O.J. Wasem, C.A. Dvorak, and H.G. Page. The effect of waveform substitution on the quality of PCM packet communications. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-36(3):342-348, March 1988.

[8] R.A. Valenzuela and C.N. Animalu. A new voice packet reconstruction technique. In Proceedings ICASSP-89, pages 1334-1336, May 1989.

[9] W. Verhelst and M. Roelands. An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech. In Proceedings ICASSP-93, pages 554-557, April 1993.

[10] L.R. Rabiner and R.W. Schafer. Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, 1978.

[11] A. Stenger, K. Ben Younes, R. Reng and B. Girod. A new Error Concealment Technique for Audio Transmission with Packet Loss. To be published in EUSIPCO 96.

[12] International Telecommunication Union. Narrow-Band Visual Telephone Systems and Terminal Equipment. ITU-T Recommendation H.320, Helsinki, March 1993.

[13] H. Schulzrinne. Voice communication across the internet: A network voice terminal. Technical Report TR 92-50, Dept. of Computer Science, University of Massachusetts, Amherst, MA, July 1992.

[14] H. Sanneck. Fehlerverschleierungsverfahren für Sprachübertragung mit Paketverlust. Master's thesis, Telecommunications Department, University of Erlangen-Nuremberg, Erlangen, June 1995.

[15] J.-C. Bolot, H. Crépin, and A.V. Garcia. Analysis of audio packet loss in the Internet. In Proceedings of the 5th International Workshop on Network and Operating System Support for Digital Audio and Video, pages 163-174, Durham, NH, April 1995.
