`513 Emerson Street
`Pittsburgh, PA 15206
`Cell phone: (412) 916-7386
`Richard M. Stern, Jr.
`Department of Electrical and Computer Engineering
`Carnegie Mellon University
`Pittsburgh, PA 15213
`Phone: (412) 268-2535
`Email: rms@cmu.edu
`Citizenship: U.S.A.
`Automatic speech recognition, auditory perception, acoustics,
`signal processing, biomedical instrumentation
`Ph.D. (1977), Electrical Engineering and Computer Science,
`Massachusetts Institute of Technology, Cambridge, MA
`M.S. (1972), Electrical Engineering and Computer Sciences,
`University of California, Berkeley, CA
`S.B. (1970), Electrical Engineering,
`Massachusetts Institute of Technology, Cambridge, MA
`1995 - present: Professor of Electrical and Computer Engineering, Carnegie Mellon University
`1988 - present: Associate Professor and Professor by Courtesy, Language Technologies Institute,
`Computer Science Department, Biomedical Engineering Department
`2009 - present: Artist Lecturer, School of Music, Carnegie Mellon University
`1995 - 2003: Associate Director of the Information Networking Institute, Carnegie Mellon University
`1982 - 1995: Associate Professor of Electrical and Biomedical Engineering, Carnegie Mellon University
`Visiting Professor in Speech and Communication Sciences,
`Nippon Telegraph and Telephone Electrical Communications Laboratory, Tokyo, Japan
`1977 - 1982: Assistant Professor of Electrical and Biomedical Engineering, Carnegie Mellon University
`Amazon v. Jawbone
`U.S. Patent 8,280,072
`Amazon Ex. 1014
`1979 - 1981: Adjunct Assistant Professor of Otolaryngology, University of Pittsburgh School of Medicine
`1973 - 1976: Teaching and Research Assistant, Department of Electrical Engineering,
`Massachusetts Institute of Technology
`Distinguished Lecturer, International Speech Communication Association, 2008-2009.
`General Chair, INTERSPEECH International Conference on Spoken Language Processing, September,
`Technical Program Co-Chair, IEEE Workshop on Automatic Speech Recognition and Understanding,
`December 2005.
`Technical Program Chair, 141st meeting of the Acoustical Society of America, June 2002.
`General Chair, DARPA Spoken Language Technologies Workshop, March, 1994.
`Publications Chair, ARPA Spoken Language Technology and Applications Day, April, 1993.
`Publications Chair, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October,
`Chair, standing DARPA Speech and Natural Language Workshop Organizing Committee, 1991 - 1992.
`Secretary, ARPA Spoken Language Coordinating Committee, 1990 - 1995.
`General Chair, DARPA Speech and Natural Language Workshop, June, 1990.
`International Advisory Board, International Speech Communication Association, 2006 - present.
`International Advisory Board, Center for Speech and Language Technologies, Tsinghua University, Beijing,
`China, 2007 - 2010.
`Chair, Selection Committee for IEEE James L. Flanagan Speech & Audio Processing Award, 2006 - 2008.
`IEEE Signal Processing Society Technical Committee on Audio and Electroacoustics, 1991 - 1995.
`IEEE Signal Processing Society Technical Committee on Speech, 1993 - 1997.
`Editorial board, Journal of Computer Speech and Language, 1994 - 2010.
`Ongoing collaborative research in binaural hearing with the Department of Otolaryngology at the University
`of Connecticut Medical School, Farmington, CT.
`Member of Institute of Electrical and Electronics Engineers, Acoustical Society of America, International
`Speech Communication Association, Association for Research in Otolaryngology, Audio Engineering
`Reviewer for National Science Foundation, International Speech Communication Association, IEEE, J.
`Acoust. Soc. Amer., Hearing Research, IEEE Transactions on Signal Processing, IEEE Transactions on
`Speech and Language, IEEE Transactions on Systems, Man, and Cybernetics, and Communications of
`the Association of Computing Machinery.
`Fellow, Institute of Electrical and Electronics Engineers (IEEE)
`Fellow, International Speech Communication Association (ISCA)
`Fellow, Acoustical Society of America (ASA)
`Distinguished Lecturer of the International Speech Communication Association, 2008 to 2009
`Allen Newell Award for Research Excellence, Carnegie Mellon University Department of Computer Science,
`IEEE Student Branch Award for Teacher of the Year, Carnegie Mellon University Department of Electrical
`Engineering, 1979
`Joel and Ruth Spira Excellence in Teaching Award (with three colleagues), Carnegie Mellon University
`Department of Electrical and Computer Engineering, 2018
`Papers in Archival Journals
`STERN, R. M., COLBURN, H. S., BERNSTEIN, L. R., AND TRAHIOTIS, C. (2019). “The fMRI data of
`Thompson et al. (2006) do not constrain how the human midbrain represents interaural time delay,” Journal
`of the Association for Research in Otolaryngology 20:305-311.
`W. M., and GOODMAN, D. F. M. (2017). “A Framework for Testing and Comparing Binaural Models,”
`Hearing Research 360:92-106.
`DE LA CALLE SILOS, F., and STERN, R. M. (2017). “Synchrony-based feature extraction for robust
`automatic speech recognition,” IEEE Signal Processing Letters 24:1158-1162.
`FREDES, J., NOVOA, J., KING, S., STERN, R. M., and BECERRA YOMA, N. (2017). “Locally-normalized
`filter banks applied to deep neural network-based robust speech recognition,” IEEE Signal Processing
`Letters 24:377-381.
`KIM, C., and STERN, R. M. (2016). “Power-normalized cepstral coefficients (PNCC) for robust speech
`recognition,” IEEE Trans. on Audio, Speech, and Language Processing 24:1315-1329. [Received IEEE
`Signal Processing Society Best Paper Award, 2019]
`CHO, B. J., KWON, H., CHO, J.-W., KIM, C., STERN, R. M., and PARK, H.-M. (2016). “A subband-based
`stationary-component suppression method using harmonics and power ratio for reverberant speech
`recognition,” IEEE Signal Processing Letters, 23:780-784.
`ROMIGH, G. D., BRUNGART, D. S., STERN, R. M., and SIMPSON, B. D. (2015). “Efficient real spherical
`harmonic representation of head-related transfer functions,” IEEE Journal of Selected Topics in Signal
`Processing, 9: 921-930, August 2015.
`perceptually-motivated low-complexity channel normalization technique applied to speaker verification.”
`Computer Speech and Language, 31:1-27, 2015.
`POBLETE, V., BECERRA YOMA, N., and STERN, R. M. (2014). “Optimizing the parameters characterizing
`sigmoidal rate-level functions based on acoustic features,” Speech Communication, 56:19-34, January
`HERMANSKY, H., COHEN, J. R., and STERN, R. M. (2013). “Perceptual properties of current speech
`recognition technology,” Proc. IEEE 101:1968-1985, September 2013.
`STERN, R. M., and MORGAN, N. (2012). “Hearing is believing: biologically-inspired methods for robust
`speech recognition,” IEEE Signal Processing Magazine 29:34-43, November, 2012.
`CHIU, Y.-H. B., RAJ, B., and STERN, R. M. (2012). “Learning-based auditory encoding for robust speech
`recognition,” IEEE Trans. on Audio, Speech, and Language Processing 20:900-914, March 2012.
`KIM, W., and STERN, R. M. (2011). “Mask classification for missing-feature reconstruction for robust
`speech recognition,” Speech Communication, 53:1-11, January 2011.
`PARK, H.-M., and STERN, R. M. (2009). “Spatial Separation of Speech Signals using Continuously-Variable
`Weighting Factors Estimated from Comparisons of Zero Crossings,” Speech Communication Journal,
`51(1):15-25, January 2009.
`SELTZER, M. L., and STERN, R. M. (2006). “Subband Likelihood-Maximizing Beamforming for Speech
`Recognition in Reverberant Environments,” IEEE Transactions on Speech, Language, and Audio
`Processing 14(6): 2109-2121, November 2006.
`RAJ, B., and STERN, R. M. (2005). “Missing-Feature Methods for Robust Automatic Speech Recognition,”
`IEEE Signal Processing Magazine, September 2005.
`KIM, N. S., LIM, W., and STERN, R. M. (2005). “Feature compensation based on switching linear dynamic
`model,” IEEE Signal Processing Letters, 12(6): 473-476.
`SELTZER, M. L., RAJ, B., and STERN, R. M. (2004). “Likelihood-Maximizing Beamforming for Robust
`Hands-Free Speech Recognition,” IEEE Transactions on Speech and Audio Processing, 12(5): 489-498,
`September 2004. [Received IEEE Signal Processing Society Best Student Paper Award, 2007]
`OBUCHI, Y., HATAOKA, N., and STERN, R. M. (2004), “Normalization of Time-Derivative Parameters for
`Robust Speech Recognition in Small Devices,” IEICE Trans. on Information and Systems, 87-D(4):
`1004-1011, April 2004.
`RAJ, B., SELTZER, M. L., and STERN, R. M. (2004), “Reconstruction of Missing Features for Robust
`Speech Recognition,” Speech Communication Journal, 43(4): 275-296, September 2004.
`SELTZER, M. L., RAJ, B., and STERN, R. M. (2004). “A Bayesian Framework for Spectrographic Mask
`Estimation for Missing Feature Speech Recognition,” Speech Communication Journal, 43(4): 379-393,
`September 2004.
`SINGH, R., RAJ, B., and STERN, R. M. (2001), “Automatic Generation of Sub-Word Units for Speech
`Recognition Systems,” IEEE Trans. on Speech and Audio Proc. 10(2):89-99.
`HUERTA, J. M., and STERN, R. M. (2001). “Distortion-Class Modeling for Robust Speech Recognition
`under GSM RPE-LTP Coding,” Speech Communication Journal, 34:213-225 (invited paper).
`MORENO, P. J., RAJ, B., and STERN, R. M. (1998). “Data-Driven Environmental Compensation for Speech
`Recognition: A Unified Approach,” Speech Communication Journal, 24: 267-85.
`STERN, R. M., and SHEAR, G. D. (1996a) “Lateralization and Detection of Low-Frequency Binaural
`Stimuli: Effects of Distribution of Internal Delay,” J. Acoust. Soc. Amer. 100: 2278-2288.
`STERN, R. M., and SHEAR, G. D. (1996b) “Lateralization and Detection of Low-Frequency Binaural
`Stimuli: Specification of the Extended Position-Variable Model,” Physics Auxiliary Publication Service, AIP
`document E-JASMA-100-2278-0.175MB via http://www.aip.org/epaps/epaps.html.
`TRAHIOTIS, C., and STERN, R. M. (1994) “Across-Frequency Interaction in Lateralization of Complex
`Binaural Stimuli,” J. Acoust. Soc. Amer. 96: 3804-3806 (L).
`STERN, R. M., ZEPPENFELD, T., and SHEAR, G. D. (1991). “Lateralization of Rectangularly-Modulated
`Noise: An Explanation for Counterintuitive Reversals,” J. Acoust. Soc. Amer. 90: 1901-1907.
`COAST, D. A., STERN, R. M., CANO, G. G., and BRILLER, S. A. (1990). “An Approach to Cardiac
`Arrhythmia Analysis Using Hidden Markov Models,” IEEE Trans. Biomed. Eng. 37: 826-836.
`TRAHIOTIS, C., and STERN, R. M. (1989). “Lateralization of Bands of Noise: Effects of Bandwidth and
`Differences of Interaural Time and Phase,” J. Acoust. Soc. Amer. 86: 1285-1293.
`RUDNICKY, A. I., and STERN, R.M. (1989). “Spoken Language Research at Carnegie Mellon,” Speech
`Technology Magazine 4: 38-43.
`STERN, R. M., ZEIBERG, A. S., and TRAHIOTIS, C. (1988). “Lateralization of Complex Binaural Stimuli:
`A Weighted Image Model,” J. Acoust. Soc. Amer. 84, 156-165.
`STERN, R. M., and LASRY, M. J. (1987). “Dynamic Speaker Adaptation for Feature-Based Isolated Letter
`Recognition,” IEEE Trans. on Acoustics, Speech, and Signal Processing 35: 751-763.
`STERN, R. M., and COLBURN, H. S. (1985). “Lateral-Position Models of Interaural Discrimination,” J.
`Acoust. Soc. Amer. 77: 753-755.
`STERN, R. M., and COLBURN, H. S. (1985). “Subjective Lateral Position and Interaural Discrimination,”
`Physics Auxiliary Publication Service, AIP document no. PAPS JASMA-77-753-29.
`LASRY, M. J., and STERN, R. M. (1984). “A Posteriori Estimation of Correlated Jointly Gaussian Mean
`Vectors,” IEEE Trans. on Pattern Anal. and Mach. Intel. 6: 530-535.
`CROWLEY, J. L., and STERN, R. M., Jr. (1984). “Fast Computation of the Difference of Low Pass
`(DOLP) Transform,” IEEE Transactions on Pattern Analysis and Machine Intelligence 6: 212-222.
`STERN, R. M., Jr., SLOCUM, J. E., and PHILLIPS, M. S. (1983). “Interaural Time and Amplitude
`Discrimination in Noise”, J. Acoust. Soc. Amer. 73:1714-1722.
`YOST, W. A., GRANTHAM, D. W., LUTFI, R. A., and STERN, R. M., Jr. (1982). “The Phase Angle of
`Addition in Temporal Masking for Diotic and Dichotic Listening Conditions,” Hearing Res. 7: 247-259.
`MURTI, K. G., STERN, R. M., CANTEKIN, E. I. and BLUESTONE, C. D. (1982). “Classification of
`Spectral Patterns Obtained from Eustachian Tube Sonometry,” IEEE Trans. Biomed. Eng. 29: 473-477.
`MURTI, K. G., STERN, R. M., Jr., CANTEKIN, E. I. and BLUESTONE, C. D. (1980). “Sonometric Evaluation
`of Eustachian Tube Function Using Broadband Stimuli”, Annals of Otology, Rhinology, and Laryngology,
`(Suppl. 68) 89, 178-189.
`RUOTOLO, B. R., STERN, R. M., Jr., and COLBURN, H. S. (1979). “Discrimination of Symmetric, Time-
`Intensity Traded Binaural Stimuli,” J. Acoust. Soc. Amer., 66: 1733-1737.
`STERN, R. M., Jr. and COLBURN, H. S. (1978). “Theory of Binaural Interaction Based on Auditory-Nerve
`Data. IV. A Model for Subjective Lateral Position,” J. Acoust. Soc. Amer., 64: 127-140.
`Critically-Reviewed Books, Book Chapters, and Theses
`STERN, R. M. and MENON, A. (2020). “Binaural Technology for Machine Speech Recognition and
`Understanding,” Chapter in The Technology of Binaural Understanding, J. Blauert and J. Braasch, Eds.,
`Springer Nature.
`VERGYRI, D., ALWAN, A., and HANSEN, J.H.L. (2017). “Robust features in Deep Learning based Speech
`Recognition,” Chapter in New Era for Robust Speech Recognition: Exploiting Deep Learning, S. Watanabe,
`M. Delcroix, F. Metze, & J. R. Hershey (eds.), pp. 165-196, Springer International Publishing.
`STERN, R. M. and MORGAN, N. (2013). “Features Based on Auditory Physiology and Perception,” Chapter
`in Noise-Robust Techniques for Automatic Speech Recognition, T. Virtanen, R. Singh, and B. Raj, Eds.,
`Wiley Press.
`STERN, R. M., WANG, D., and BROWN, G. (2006). “Binaural Sound Localization,” Chapter in
`Computational Auditory Scene Analysis: Principles, Algorithms and Applications, D. Wang and G. Brown,
`Eds., Wiley and IEEE Press.
`STERN, R. M., TRAHIOTIS, C., and RIPEPI, A. M. (2006). “Fluctuations in Amplitude and Frequency
`Enable Interaural Delays to Foster the Identification of Speech-like Stimuli,” Chapter in Dynamics of Speech
`Production and Perception, P. Divenyi, Ed., IOS Press.
`TRAHIOTIS, C., BERNSTEIN, L. R., STERN, R. M., and BUELL, T. N. (2005). “Interaural Correlation as
`the Basis of a Working Model of Binaural Processing: An Introduction,” Chapter in Springer Handbook of
`Auditory Research: Sound Source Localization, R. Fay and T. Popper, Eds., Springer-Verlag.
`STERN, R. M. (2004). “Signal Separation Motivated by Human Auditory Perception: Applications to
`Automatic Speech Recognition,” Chapter in Speech Separation by Humans and Machines, P. Divenyi, Ed.,
`SINGH, R., STERN, R. M., and RAJ, B. (2002). “Signal and Feature Compensation Methods for Robust
`Speech Recognition,” Chapter in CRC Handbook on Noise Reduction in Speech Applications, Gillian Davis,
`Ed., Boca Raton: CRC Press.
`SINGH, R., RAJ, B., and STERN, R. M. (2002). “Model Compensation and Matched Condition Methods for
`Robust Speech Recognition,” Chapter in CRC Handbook on Noise Reduction in Speech Applications,
`Gillian Davis, Ed., Boca Raton: CRC Press.
`STERN, R. M., ACERO, A., LIU, F.-H., and OHSHIMA, Y. (1996). “Signal Processing for Robust Speech
`Recognition,” Invited chapter in Speech Recognition, pp. 351-378, C.-H. Lee and F. Soong, Eds., Boston:
`Kluwer Academic Publishers.
`STERN, R. M., and TRAHIOTIS, C. (1996). “Models of Binaural Perception,” Invited chapter in Binaural
`and Spatial Hearing in Real and Virtual Environments, pp. 499-531, R. Gilkey and T. R. Anderson, Eds.
`New York: Lawrence Erlbaum Associates
`STERN, R. M. (1995). “Robust Speech Recognition,” Invited chapter in Survey on the State of the Art in
`Speech and Natural Language Processing, R. A. Cole et al., Ed.
`STERN, R. M., and TRAHIOTIS, C. (1995). “Models of Binaural Interaction,” Invited chapter in Handbook
`of Perception and Cognition, Volume 6: Hearing, pp. 347-386, B. C. J. Moore., Ed. New York: Academic
`STERN, R. M., Jr. (1976b). Lateralization, Discrimination, and Detection of Binaural Pure Tones, Ph.D.
`Thesis, Electrical Engineering Department, MIT, December, 1976.
`Non-reviewed Book
`STERN, R. M. (2020). Selected Advanced Topics in Digital Signal Processing. Textbook for CMU course
`Invited Conference Presentations
`STERN, R. M. (2017). “Predicting Binaural Lateralization, Interaural Discrimination, and Binaural Detection
`Using the Position-Variable Model,” invited talk at the Macquarie University ARC Laureate Workshop:
`Creating a Sense of Auditory Space, October 2017, Sydney, Australia.
`STERN, R. M., KIM, C., MOGHIMI, A. R., and MENON, A. (2016). “Binaural technology and automatic
`speech recognition,” invited talk at the International Congress on Acoustics, September 2016, Buenos
`Aires, Argentina.
`STERN, R. M. (2016). “Applying Models of Auditory Processing to Automatic Speech Recognition: Progress
`and Promise,” invited keynote talk at the 2015 Meeting of the Information Processing Society of Japan:
`Special Interest Group in Spoken Language Processing, Toyama, Japan, February, 2016.
`STERN, R. M. (2014). “Applying Models of Auditory Processing to Automatic Speech Recognition: Progress
`and Promise,” invited talk at the Frederick Jelinek Memorial Workshop on Meaning Representations in
`Language and Speech Processing, Prague, Czech Republic, July, 2014.
`STERN, R. M. (2014). “Robust Automatic Speech Recognition in the 21st Century,” invited keynote talk at
`the 2014 AFEKA Conference for Speech Processing, Tel Aviv, Israel, July, 2014.
`STERN, R. M. (2014). “Applying Models of Auditory Processing to Automatic Speech Recognition: Promise
`and Progress,” International Symposium on Speech Recognition, University of Zaragoza, Spain, May,
`STERN, R. M. (2011). “Applying Physiologically-Motivated Models of Auditory Processing to Automatic
`Speech Recognition,” invited talk at the Third International Symposium on Auditory and Audiological
`Research, Nyborg, Denmark, August, 2011.
`STERN, R. M. (2010). “The impact of the distribution of internal delays in binaural models on predictions
`for psychoacoustical data,” invited talk at the 161st Meeting of the Acoustical Society of America, Cancun,
`Mexico, November, 2010.
`STERN, R. M. (2009). “New Directions in Robust Speech Recognition: What We Can Learn from Auditory
`Models,” invited keynote address at the Symposium on Frontiers of Research in Speech and Music,
`Gwalior, India, December, 2009.
`STERN, R. M. (2009). “New Directions in Robust Automatic Speech Recognition,” invited keynote address
`at the Workshop on Image and Speech Processing, Hyderabad, India, December, 2009.
`STERN, R. M. (2008). “Applying Physiologically-Motivated Models of Auditory Processing to Automatic
`Speech Recognition: Promises, Progress, and Problems,” Invited keynote address at the ISCA Tutorial and
`Research Workshop on Statistical and Perceptual Audition, Brisbane, Australia, September, 2008.
`STERN, R. M., GOUVEA, E., KIM, C., KUMAR, K., and PARK, H.-M. (2008). “Binaural and Multiple-
`Microphone Processing for Robust Automatic Speech Recognition,” Invited keynote address at the IEEE
`Workshop on Hands-free Speech Communication and Microphone Arrays, Trento, Italy, May, 2008.
`STERN, R. M. (2004). “Signal Processing for Sound Separation and Robust Representation,” Invited
`keynote address at AFOSR/NSF Symposium on Speech Separation and Comprehension in Complex
`Acoustic Environments, Montreal, Quebec, November 2004.
`STERN, R. M. (2003). “Signal Separation Motivated by Auditory Processing: Applications to Speech
`Recognition,” invited review talk at the NSF Symposium on Signal Separation, Montreal, Quebec,
`November, 2003.
`STERN, R. M. (2003). “Signal Processing for Robust Recognition,” invited talk at the NAIST International
`Center of Excellence Symposium, Nara, Japan, March, 2003.
`STERN, R. M. (2002). “Using Computational Models of Binaural Hearing to Improve Automatic Speech
`Recognition Accuracy: Promise, Progress, and Problems,” AFOSR Workshop on Computational Audition,
`Columbus, Ohio, August, 2002.
`STERN, R. M. (2000). “Robust Signal Representations for Automatic Speech Recognition,” Institute for
`Mathematics and Its Applications Workshop on the Mathematical Foundations of Speech Processing and
`Recognition, Minneapolis, Minnesota, September, 2000.
`STERN, R. M. (2000). “The Language of Music,” invited keynote talk presented at the Third International
`Symposium on Text, Speech, and Dialog, Brno, Czech Republic, September, 2000.
`STERN, R. M. (2000). “Tendencias Actuales en el Procesamiento del Lenguaje Hablado y Sistemas
`Conversacionales (Current Trends in Spoken Language Processing and Conversational Systems)”, invited
`keynote talk at the XV Simposium Internacional de Electrónica y Comunicación, Instituto Tecnológico de
`Estudios Superiores de Monterrey Mexico, February, 2000.
`STERN, R. M. (1999). “Tendencias Actuales en el Procesamiento del Lenguaje Hablado y Sistemas
`Conversacionales (Current Trends in Spoken Language Processing and Conversational Systems)”, invited
`keynote talk at the XXIV Simposium Internacional de Sistemas Computacionales, Instituto Tecnológico de
`Estudios Superiores de Monterrey, Monterrey, Mexico, March, 1999.
`STERN, R. M., and TRAHIOTIS, C. (1997). “Binaural Mechanisms that Emphasize Consistent Interaural
`Timing Information over Frequency,” invited keynote talk in Psychophysical and Physiological Advances in
`Hearing, Proceedings of the XI International Symposium on Hearing, August, 1997, Grantham, United
`Kingdom. A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis, Eds., Whurr Publishers, London, 1998.
`STERN, R. M., RAJ, B., and MORENO, P. J. (1997). “Compensation for Environmental Degradation in
`Automatic Speech Recognition,” invited keynote talk presented at the Proc. of the ESCA Tutorial and
`Research Workshop on Robust Speech Recognition for Unknown Communication Channels, April, 1997,
`Pont-à-Mousson, France, pp. 33-42.
`STERN, R. M. (1996). “The Current State of the Art in Speech Recognition (Estado-da-Arte em
`Reconhecimento de Voz),” invited keynote talk presented at VOICETECH’96, the First Brazilian Workshop
`in Automatic Speech Recognition, Campinas, Sao Paulo, Brazil, September, 1996.
`STERN, R. M. (1996). “New Directions in Spoken Language Processing,” invited talk at the Second Joint
`NSF/CONACyT Workshop on Bilateral Collaboration, Jalapa, Mexico, March, 1996.
`STERN, R. M. (1996). “Tendencias Actuales en el Procesamiento del Lenguaje Hablado (Current Trends
`in Spoken Language Processing)”, invited talk at the Universidad Veracruzana, Jalapa, Mexico, March,
`STERN, R. M., and SULLIVAN, T. M. (1996). “Robust Speech Recognition Using Signal Processing Based
`On Binaural Perception,” invited talk presented at the First Forum Acusticum, Antwerp, Belgium, April, 1996.
`STERN, R. M., MORENO, P.J., and RAJ, B. (1996). “Compensation for Speech Recognition in
`Degraded Acoustical Environments,” invited talk at the 132nd meeting of the Acoustical Society of America,
`Honolulu, Hawaii, December, 1996.
`STERN, R. M. (1995). “Nuevos Enfoques en Procesamiento de Lenguaje Hablado (New Directions in
`Spoken Language Processing),” invited talk at the Universitat Politecnica de Catalunya, Barcelona, Spain,
`September, 1995.
`STERN, R. M. (1995). “New Directions in Spoken Language Processing,” invited talk presented at the
`Telefónica Investigación y Desarrollo Laboratory Symposium on Spoken Language Processing, Madrid,
`Spain, September 1995.
`STERN, R. M. (1995). “Automatic Speech Recognition using Signal Processing based on Auditory
`Physiology and Perception,” invited paper presented at the 129th meeting of the Acoustical Society of
`America, Washington, D.C., June, 1995.
`MORENO, P. J., RAJ, B., and STERN, R. M. (1995). “Approaches to Environment Compensation in
`Automatic Speech Recognition,” invited paper presented at the 15th International Conference on Acoustics,
`Trondheim, Norway, Vol. III, pp. 109-112, June, 1995.
`STERN, R. M., and SULLIVAN, T. M. (1994). “Robust Speech Recognition Based on Human Binaural
`Perception,” invited paper presented at the ATR workshop on A Biological Framework for Speech
`Perception and Production, Kansai Science City, September, 1994. Reprinted in ATR technical report TR-
`H-121: Proceedings of the ATR workshop on A Biological Framework for Speech Perception and
`Production, 122 pages, (1995).
`STERN, R. M., LIU, F.-H., SULLIVAN, T. M., MORENO, P. J., and ACERO, A. (1994). “Multiple Approaches
`to Robust Speech Recognition,” invited keynote paper at the Fifth Western Pacific Regional Acoustical
`Conference, Seoul, Korea, August, 1994.
`STERN, R. M. (1993). “Models of Binaural Interaction,” invited keynote paper at the AFOSR Conference
`on Binaural and Spatial Hearing, Wright-Patterson Air Force Base, September, 1993.
`STERN, R. M. (1993). “Psychoacoustical Basis of Machine Speech Recognition,” invited talk at the Annual
`Meeting of the American Association for the Advancement of Science, February, 1993.
`STERN, R. M. (1989). “Recent Progress in Spoken-Language Systems,” invited lecture at the Second
`International Symposium on Artificial Intelligence, Monterrey, Mexico, October, 1989.
`STERN, R. M. (1988). “Overview of Models of Binaural Perception,” invited review paper at the 1988
`National Research Council CHABA Symposium, Washington, D.C., October, 1988.
`STERN, R. M. (1988). “Estado Actual de la Tecnología de Entradas/Salidas de Canales de Voz
`(Overview of Current Voice Input/Output Technologies),” invited keynote lecture at the XIII Simposium
`Internacional de Sistemas Computacionales, Monterrey, Mexico, March, 1988.
`COLE, R. A., STERN, R. M., and LASRY, M. J. (1986). “Performing Fine Phonetic Distinctions: Templates
`vs. Features,” invited talk, reprinted in Invariance and Variability of Features in Spoken English Letters, J.
`Perkell et al., eds., Lawrence Erlbaum, New York.
`Critically-Reviewed Conference Presentations
`VUONG, T., XIA, Y., and STERN, R. M. (2021). “The application of learnable STRF kernels to the 2021
`Fearless Steps Phase-03 SAS Challenge,” Interspeech 2021, September 2021, Brno, Czech Republic.
`XIA, Y., CHEN, L.-W., RUDNICKY, A., and STERN, R. M. (2021). “Temporal context in speech emotion
`recognition,” Interspeech 2021, September 2021, Brno, Czech Republic.
`VUONG, T., XIA, Y., and STERN, R. M. (2021). “A modulation-domain loss for neural-network-based
`real-time speech enhancement,” Proc. IEEE ICASSP 2021, June 2021, Toronto, Canada.
`VUONG, T., XIA, Y., and STERN, R. M. (2020). “Learnable Spectro-temporal Receptive Fields for Robust
`Voice Type Discrimination,” Interspeech 2020, September 2020, Shanghai, China.
`MENON, A., KIM, C., and STERN, R. M. (2019), “Robust recognition of reverberant and noisy speech using
`coherence-based processing,” IEEE International Conference on Acoustics, Speech, and Signal
`Processing, May 2019, Brighton, United Kingdom.
`XIA, Y. and STERN, R. M. (2018). “A Priori SNR Estimation Based on a Recurrent Neural Network for
`Robust Speech Enhancement,” Interspeech 2018, September 2018, Hyderabad, India.
`MICHELSON, J., STERN, R. M., and SULLIVAN, T. M. (2018), “Automatic guitar tablature transcription
`from audio using inharmonicity regression and Bayesian classification,” 145th Convention of the Audio
`Engineering Society, October 2018, New York City.
`KIM, C., MENON, A., BACCHIANI, M., and STERN, R. M. (2018), “Source separation using phase
`difference and reliable mask selection,” IEEE International Conference on Acoustics, Speech, and Signal
`Processing, April 2018, Calgary, Alberta, Canada.
`MENON, A., KIM, C., KUROKAWA, U., and STERN, R. M. (2017), “Binaural Processing for Robust
`Recognition of Degraded Speech,” IEEE Automatic Speech Recognition and Understanding Workshop,
`December 2017, Naha, Okinawa, Japan.
`MENON, A., KIM, C., and STERN, R. M. (2017). “Robust Speech Recognition Based on Binaural Auditory
`Processing,” Interspeech 2017, September 2017, Stockholm, Sweden.
`N., “Robustness over time-varying channels in DNN-HMM ASR-Based human-robot interaction,”
`Interspeech 2017, September 2017, Stockholm, Sweden.
`YOMA, N. (2016). The use of locally normalized cepstral coefficients (LNCC) to improve speaker
`recognition accuracy in highly reverberant rooms. Proceedings of Interspeech 2016, September 2016, San
`Francisco, California.
`FREDES, J., NOVOA, J., POBLETE, V., KING, S., STERN, R. M., and YOMA, N. B. (2015), “Robustness
`to additive noise of locally-normalized cepstral coefficients in speaker verification,” Interspeech 2015,
`September 2015, Dresden, Germany.
`HARVILLA, M. J., and STERN, R. M. (2015). “Efficient Audio Declipping Using Regularized Least Squares,”
`ICASSP: International Conference on Acoustics, Speech and Signal Processing, April 2015, Brisbane
`HARVILLA, M. J., and STERN, R. M. (2014). “Least squares declipping for robust speech recognition,”
`Interspeech 2014, September 2014, Singapore.
`MOGHIMI, A. R., RAJ, B., and STERN, R. M. (2014), “Post-masking: A hybrid approach to array processing
`for speech recognition,” Interspeech 2014, September 2014, Singapore.
`KIM, C., CHIN, K. K., BACCHIANI, M., and STERN, R. M. (2014), “Robust speech recognition using
`temporal masking and threshold algorithm,” Interspeech 2014, September 2014, Singapore.
`PARK, H.-M., MACIEJEWSKI, M., KIM, C., and STERN, R. M. (2014). “Robust speech recognition in
`reverberant environments using subband-based steady-state monaural and binaural suppression,”
`Interspeech 2014, September 2014, Singapore.
`MOGHIMI, A. R., and STERN, R. M. (2014), “An analysis of binaural spectro-temporal masking as nonlinear
`beamforming,” IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2014,
`Florence, Italy.
`HARVILLA, M., and STERN, R. M. (2012). “Histogram-based subband power warping and spectral
`averaging for robust speech recognition under matched and multistyle training,” IEEE International
`Conference on Acoustics, Speech, and Signal Processing, March 2012, Kyoto, Japan.
`KIM, C. and STERN, R. M. (2012). “Power-normalized cepstral coefficients (PNCC) for robust speech
`recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing, March 2012,
`Kyoto, Japan.
`KIM, C., KHAWAND, C., and STERN, R. M. (2012). “Two-microphone source separation algorithm based
`on statistical modeling of angle distributions,” IEEE International Conference on Acoustics, Speech, and
`Signal Processing, March 2012, Kyoto, Japan.
`KIM, C., KUMAR, K., and STERN, R. M. (2011). “Binaural sound source separation motivated by auditory
`processing”, IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2011,
`Prague, Czech Republic.
`KUMAR, K., KIM, C., and STERN, R. M. (2011). “Delta-spectral cepstral coefficients for robust speech
`recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2011,
`Prague, Czech Republic.
`KUMAR, K., RAJ, B., SINGH, R., and STERN, R. M. (2011). “An iterative least-squares technique for
`dereverberation,” IEEE International Conference on Acoustics, Speech, and Signal Processing, May
`2011, Prague, Czech Republic.
`KUMAR, K., SINGH, R., RAJ, B., and STERN, R. M. (2011). “Gammatone sub-band magnitude-domain
`dereverberation,” IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2011,