IEEE Transactions on Consumer Electronics, Vol. 42, No. 3, AUGUST 1996
`
AN ASIC IMPLEMENTATION OF THE MPEG-2 AUDIO DECODER
Sung-Chul Han,* Sun-Kook Yoo,** Sung-Wook Park,* Nam-Hun Jeong,*
Joon-Suk Kim,* Ki-Soo Kim,* Yong-Tae Han,*** and Dae-Hee Youn*
* ASP Lab., Dept. of Electronic Eng., Yonsei University
** Dept. of Medical Eng., Yonsei University
*** Research Center, Korea Telecom
`
Abstract - MPEG-2 audio is a subband coding technology used in various audio applications. This paper presents a semi-custom ASIC design for the MPEG-2 audio decoder. The decoder implemented in this paper meets the requirements of the MPEG-2 international standard, and is divided into three parts: the preprocessor, the multichannel processor, and the synthesis filter. The decoder system has been designed in VHDL (VHSIC Hardware Description Language) and developed as a single chip in a 0.6 µm CMOS semiconductor process.
`
I. Introduction

The ISO (International Standardization Organization) MPEG-2 audio standard is the result of many efforts dedicated to overcoming problems in the storage and transmission of digital audio. MPEG-2 audio is basically a subband analysis technique that exploits human auditory characteristics to achieve a low bitrate with minimum perceptual loss of signal quality. It also utilizes various multichannel data compression techniques to accommodate the extension to 5 channels [1][2][3][4].

In many digital signal processing applications, commercially available digital signal processors are used. But the use of a general-purpose DSP (Digital Signal Processor) is not always desirable, since additional hardware is required to realize a complicated algorithm and some parts of the DSP are not used at all. These wasted gates make it impossible to obtain an optimized system. Advances in ASIC technology have given us the ability to design a large system and realize it on a chip in a reasonably short time. To be an efficient signal processing system, it should be fully application-specific, with a processor core and additional hardwired logic designed to meet exactly the requirements of the algorithm to be implemented.

The goal of this paper is to present a real-time audio decoder implemented using ASIC technology, which is capable of decoding MPEG-2 standard multichannel audio bitstreams. This paper is also intended to show a system architecture for the MPEG-2 audio decoder that has been carefully designed to reduce the chip area and lessen the difficulty of test and verification.

First, the configuration and features of the implemented system are introduced. In this section, the basic concept embedded in the system architecture is also discussed. Then the main functional blocks comprising the system are described in detail.
`
II. System Configuration and Functionality

The MPEG-2 audio decoding process begins with receiving encoded bitstreams from transmission channels. The received bitstream is stored in a buffer, and then transferred to the decoder on request for analysis.

First, the decoder performs analysis on the received bitstream. This process consists of header interpretation, parameter extraction, and data extraction. Some important information for controlling the overall operation of the decoder is included in the header, and it is interpreted to gather system control signals and the information on the encoding mode. In the parameter extraction step, such parameters as bit allocation information, scalefactor select information, and scalefactor information are extracted. The data extracted from the bitstream are quantized samples which have been encoded based upon the psychoacoustic model.

The extracted data are multiplied by the corresponding scalefactors to become subband samples, which would be equivalent to the output of the analysis filterbank in the encoder if the encoding and decoding processes were neglected. Since the MPEG-2 encoder compresses multichannel data and scalefactors using the interchannel similarity among the signals and inserts the additive multichannel coding information into the header, the reverse process, called multichannel decoding, should be performed on the header and data acquired in the analysis step.

After the multichannel decoding process, the subband samples are transformed to time-domain samples by the synthesis filterbank. This process is the most time-consuming one in the MPEG audio decoder [2].

The MPEG-2 decoder contains three primary modules, called the preprocessor, the multichannel processor, and the synthesis filterbank, as shown in Figure 1. The architecture of the proposed audio decoder has been partitioned with efficiency in design and verification in mind, and each part has been designed using VHDL, synthesized, and then verified with post-synthesis simulation.

Figure 1. MPEG-2 Audio Decoder Architecture
[Block diagram; legible signal labels: bitstream, scf index, sb sample, sample]

Manuscript received June 10, 1996
0098 3063/96 $04.00 © 1996 IEEE

Avago Exhibit 2006 – Page 1
ASUS v. Avago
IPR2016-00646
`
III. System Descriptions

A. Preprocessor

The preprocessor extracts header information from an audio bitstream, which is used to obtain multichannel modes and other control information. Audio data contained in the bitstream are also extracted, and sent to the multichannel processor via a buffer. Using the sampling frequency information, the preprocessor also generates timing information such as Fs and d32 to synchronize system components, where Fs represents the sampling frequency and d32 is the time within which the decoder should process 32 samples. A frame contains 1152 samples, so the time duration of a frame is equal to that of 36 occurrences of the d32 signal.
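As a rough check on these timing relationships, the following sketch computes the number of d32 periods per frame, the d32 duration, and the clock cycles available within one d32 period. The three sampling rates are assumed example values, the 54 MHz figure is the processor core clock quoted later in this paper, and the function name `d32_timing` is illustrative only.

```python
from fractions import Fraction

SAMPLES_PER_FRAME = 1152   # frame length stated in the text
SAMPLES_PER_D32 = 32       # samples processed per d32 period
CLOCK_HZ = 54_000_000      # core clock quoted later in the paper


def d32_timing(fs_hz):
    """Return (d32 periods per frame, d32 duration in s, clock cycles per d32)."""
    d32_per_frame = SAMPLES_PER_FRAME // SAMPLES_PER_D32
    d32_seconds = Fraction(SAMPLES_PER_D32, fs_hz)
    cycles_per_d32 = CLOCK_HZ * d32_seconds
    return d32_per_frame, d32_seconds, cycles_per_d32


# Assumed example sampling rates; the paper does not list them.
for fs in (32_000, 44_100, 48_000):
    periods, duration, cycles = d32_timing(fs)
    print(f"Fs={fs} Hz: {periods} d32/frame, "
          f"d32={float(duration) * 1e3:.3f} ms, "
          f"{float(cycles):.0f} cycles per d32")
```

Under these assumptions, a higher sampling rate shortens the d32 period and therefore leaves fewer clock cycles for parsing each group of 32 samples.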
Since the decoding process varies according to the header information regarding coding modes, bit allocation, and scalefactor select information, the preprocessor should employ a data extraction algorithm that can accommodate various coding modes and changes in the bit allocation scheme. To meet these requirements, a specially designed microprocessor core is used and the extraction algorithm is microprogrammed in it. The microprocessor core operates at 54 MHz and, as shown in Figure 2, consists of an ALU, on-chip RAM, an overall controller, ROM for programs and data tables, and an external DRAM interface.
`
Figure 2. The Structure of the Processor Core
[Block diagram; legible label: FNDCLSS module]
`
To obtain an efficient preprocessor, the processor core has been designed to have the following features.

First, a separate logic circuit block which controls the output of the channel buffer has been fabricated, so that no additional programming effort for buffer control is necessary. When the processor controller reads one bit, a 1-bit request is applied to the INR register unit. The MSB of the INR register is shifted into the memory buffer register MB. After the LSB, the eighth bit, is shifted to the MB, a new byte is transferred from the FIFO to the INR, and then the shift operation is repeated. This operation is shown in Figure 3. If the FIFO is empty, the controller stops all the operations of the processor and waits for the new bitstream to come in from the channel.

Figure 3. Bit Extraction from a Bitstream
[Diagram; legible label: 1 Byte Request]
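The bit path described above can be modelled in software roughly as follows. This is a minimal sketch, not the authors' circuit: the class and method names are hypothetical, the 16-bit width of MB and the clearing of MB at the start of each multi-bit read are assumptions, and the refill of INR is done lazily on the next request rather than immediately after the eighth bit. An empty FIFO, which stalls the real processor, is modelled here by raising an exception.

```python
from collections import deque


class BitExtractor:
    """Software model of the FIFO -> INR -> MB bit path (names follow the text)."""

    def __init__(self, data: bytes):
        self.fifo = deque(data)  # channel buffer, one byte per entry
        self.inr = 0             # 8-bit input register
        self.bits_left = 0       # bits of INR not yet shifted out
        self.mb = 0              # memory buffer register (assumed 16 bits wide)

    def read_bit(self) -> int:
        if self.bits_left == 0:
            if not self.fifo:
                # The hardware would stall here until new data arrives.
                raise BufferError("FIFO empty")
            self.inr = self.fifo.popleft()  # 1-byte request to the FIFO
            self.bits_left = 8
        msb = (self.inr >> 7) & 1           # MSB leaves INR first
        self.inr = (self.inr << 1) & 0xFF
        self.bits_left -= 1
        self.mb = ((self.mb << 1) | msb) & 0xFFFF
        return msb

    def bread(self, n: int) -> int:
        """Read n bits into MB, loosely mirroring the 'bread N' instruction."""
        self.mb = 0  # assumption: MB is cleared before a multi-bit read
        for _ in range(n):
            self.read_bit()
        return self.mb


ex = BitExtractor(bytes([0xFF, 0xFD]))
print(hex(ex.bread(12)), hex(ex.bread(4)))
```

The 12-bit read in the usage lines corresponds to the all-ones syncword that starts an MPEG audio frame header.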
`
Secondly, an automatic address increment is possible for easy sequential access to the memory block without additional programming. Among the information extracted from the bitstream, parameters for parsing audio data such as bit-allocation, scfsi, tc-allocation, and dynamic-crosstalk are frequently used, so each parameter is assigned a separate memory block, and the MPEG-2 decoder sequentially scans these memory blocks with the automatic address increment feature.
Thirdly, indirect addressing is possible for the look-up table searches which frequently happen.
Fourthly, it has an 8-level hardware stack, which is used to store the contents of the program counter, loop counter, etc.
Fifthly, it has a 1k*16 bit internal RAM and 32 registers in total. Using these, the time spent in accessing external DRAMs is minimized.
Finally, it has a 16 by 16 bit multiplier and divider, and other arithmetic logic units. It also supports application-specific instructions for analysis of the MPEG-2 bitstream. For example, the "bread N" instruction in Table 1 enables the core to read N bits from the bitstream.
One instruction normally consumes 4 clock cycles as shown in Table 1, but division and memory read/write instructions require 20 and 7 cycles, respectively.

Table 1. Instruction Set

Instruction        Operation                             Cycles
read MB, indexR    (indexR) → MB                         7
write indexR, MB   MB → (indexR)                         7
push R             R → ST                                ?
pop R              ST → R                                ?
inc R              R + 1 → R                             4
add R1, R2         R1 + R2 → R1                          4
sub R1, R2         R1 - R2 → R1                          4
mult R1, R2        R1 x R2 → R1                          4
div R1, R2         R1 / R2 → R1 (mod → R2)               20
(name illegible)   R1 - R2, status F/F set               ?
shiftR N           sftRN(AC0) → AC0                      4
shiftL N           sftLN(AC0) → AC0                      4
findclass R        R → (fndclss module) → R              ?
Bread N            INR → MB (N bits)                     ?
Bread RDC          INR → MB ((RDC) bits)                 ?
Bacc               INR → MB (1 bit)                      ?
JIZ addr.          jump, inc indexR (MB is zero)         ?
jump addr.         jump to addr.                         ?
cjmp addr.         jump if cond. is true                 ?

[Entries marked ? are illegible in the source scan.]

B. Multichannel Processor [2]

While the preprocessor takes charge of the bitstream analysis under the time constraint for real-time operation, the multichannel processor reconstructs subband samples from the compressed data, and passes the subband samples to the synthesis filter.

The compressed data result from the normalization, channel matrixing, and composite encoding such as dynamic crosstalk and phantom coding. To reconstruct the original subband signals, the multichannel processor consists of a composite decoding unit, a dematrixing unit, a denormalization unit, and a control module to control all of these units. In addition, an IIR filter is included in the multichannel processor to support dematrixing procedure 2. Figure 4 shows a schematic diagram of the multichannel processor.

Since the multichannel processor requires fewer computations than the other modules, it operates at a 27 MHz clock speed, which is half of the 54 MHz system clock. This processor is activated by the d32 signal coming from the preprocessor. When the d32 signal goes high, this processor begins to construct 32 subband signals. The processor begins processing for the first subband. After finishing the processing for the 5 channel data belonging to the first subband, the same processing for the second subband starts. The processing continues until all 32 subbands are processed, and then the multichannel processor goes to the waiting state until the next d32 signal goes high. Figure 5 explains the behavior of the multichannel processor.

Figure 4. The Structure of the Multichannel Processor
[Block diagram; legible labels: Scale Factor, from Preprocessor, Register File, Composite Decoding Unit, Dematrixing Process, Denormalization Processing Unit, 2-Order (IIR filter), to Synthesis Filter Bank]
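To make the subband-by-subband schedule concrete, the sketch below enumerates the order in which (subband, channel) pairs would be visited after each d32 trigger, and estimates the cycle budget per subband at the 27 MHz clock. The function names are hypothetical, and the 48 kHz sampling rate used in the usage lines is an assumed example.

```python
NUM_SUBBANDS = 32
NUM_CHANNELS = 5
MC_CLOCK_HZ = 27_000_000  # multichannel processor clock stated in the text


def processing_order():
    """Yield (subband, channel) pairs: all 5 channels of subband 0, then subband 1, ..."""
    for subband in range(NUM_SUBBANDS):
        for channel in range(NUM_CHANNELS):
            yield subband, channel


def cycles_per_subband(fs_hz):
    """Clock cycles available for one subband within a single d32 period."""
    cycles_per_d32 = MC_CLOCK_HZ * 32 / fs_hz
    return cycles_per_d32 / NUM_SUBBANDS


order = list(processing_order())
print(order[:6])                   # first subband, then start of the second
print(cycles_per_subband(48_000))  # budget per subband at an assumed 48 kHz
```

After the last pair has been processed, the processor in the paper simply idles until the next d32 pulse, so any unused part of this budget is waiting time.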
The composite decoding unit reconstructs the matrixed five channel signals (L0, R0, T2, T3, T4) by multiplying the compressed data by the corresponding scalefactors. To do this, a 16 by 16 bit sequential multiplier and a scalefactor table containing 16 bit scalefactors have been included. When the multichannel processing for a subband begins, the multichannel processing information and the 5 channel data are read and stored in separate internal registers. After these data are loaded, the composite decoding unit reads the scalefactor index corresponding to each channel and reconstructs the 5 channel signals by multiplying the data by the scalefactors. The resultant 5 channel signals are stored in 5 separate data registers.
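A fixed-point version of this table look-up and multiplication might look like the sketch below. The scalefactor values follow the MPEG audio Layer II convention of 2^(1 - index/3) for indices 0 to 62. The unsigned Q2.14 storage format, the saturation of the result to 16 bits, and the function names are assumptions made for illustration and are not taken from the paper.

```python
SF_FRAC_BITS = 14  # assumption: scalefactors stored as unsigned Q2.14

# 63 scalefactors, 2.0 down to about 1.2e-6, each fitting in 16 bits.
SCALEFACTOR_TABLE = [
    round(2.0 ** (1 - index / 3) * (1 << SF_FRAC_BITS)) for index in range(63)
]


def saturate16(value):
    """Clamp to the signed 16-bit range (assumed overflow behaviour)."""
    return max(-32768, min(32767, value))


def composite_decode(samples, scf_indices):
    """Scale one subband's 5 channel samples by their indexed scalefactors."""
    decoded = []
    for sample, index in zip(samples, scf_indices):
        product = sample * SCALEFACTOR_TABLE[index]  # 16 x 16 bit multiply
        decoded.append(saturate16(product >> SF_FRAC_BITS))
    return decoded


# Five channel samples for one subband, in register order L0, R0, T2, T3, T4.
print(composite_decode([1000, -1000, 0, 20000, 4], [3, 3, 3, 0, 6]))
```

Index 3 corresponds to a scalefactor of 1.0 and index 0 to 2.0, so the fourth sample in the usage line overflows and is clamped under the saturation assumption.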
The dematrixing unit has an accumulator for addition and subtraction to reconstruct the weighted five channel data. For the case of dematrixing procedure 2, the filtered signal of (T3+T4)/2 is used. Regardless of the matrixing procedure, the IIR filtering is performed prior to the dematrixing procedure for convenience of design and consistency in timing.
Figure 5. Behavior of the Multichannel Processor

The IIR filter consists of a multiplier, an accumulator, and memory blocks to store the past data. For simplicity of the hardware, the multiplier is designed to compute the product of a 16 bit signed number and a 16 bit unsigned number. A negative coefficient means subtraction in the accumulator. Since a second order IIR filter is used, 4 memory blocks are provided to store 4 past input and output samples for 32 subbands.
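The structure just described, with unsigned coefficient magnitudes, a sign handled as add or subtract in the accumulator, and four state memories indexed by subband, can be sketched as follows. The class name, the Q15 coefficient format, and the coefficient values in the usage lines are hypothetical; the paper does not give the actual filter coefficients.

```python
class SubbandIIR:
    """Second-order IIR filter with separate state for each of the 32 subbands."""

    def __init__(self, coeffs, frac_bits=15, num_subbands=32):
        # coeffs: five (magnitude, subtract) pairs applied to
        # x[n], x[n-1], x[n-2], y[n-1], y[n-2], in that order.
        self.coeffs = coeffs
        self.frac_bits = frac_bits
        # Four memory blocks, each holding one past sample per subband.
        self.x1 = [0] * num_subbands
        self.x2 = [0] * num_subbands
        self.y1 = [0] * num_subbands
        self.y2 = [0] * num_subbands

    def step(self, sb, x):
        taps = (x, self.x1[sb], self.x2[sb], self.y1[sb], self.y2[sb])
        acc = 0
        for (magnitude, subtract), sample in zip(self.coeffs, taps):
            product = magnitude * sample  # unsigned coefficient x signed sample
            acc = acc - product if subtract else acc + product
        y = acc >> self.frac_bits
        # Shift the delay lines for this subband only.
        self.x2[sb], self.x1[sb] = self.x1[sb], x
        self.y2[sb], self.y1[sb] = self.y1[sb], y
        return y


# Hypothetical coefficients: y[n] = 0.5*x[n] + 0.5*y[n-1] in Q15.
lowpass = SubbandIIR([(16384, False), (0, False), (0, False),
                      (16384, False), (0, False)])
print(lowpass.step(0, 1000), lowpass.step(0, 1000), lowpass.step(1, 1000))
```

Keeping the coefficient magnitude unsigned lets a single signed-by-unsigned multiplier serve every tap, with the sign costing only an add/subtract selection.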
The five channel data processed by the dematrixing unit are transformed into the final subband signals by multiplying them by the denormalization factor defined by the dematrixing procedure. This process is accomplished by the denormalization unit, which employs a multiplier performing the multiplication of a 16 bit signed number and an 18 bit unsigned number.
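The accumulate-then-scale flow of the dematrixing and denormalization units can be illustrated with the sketch below. It assumes the simplest channel assignment, in which T2, T3 and T4 carry the weighted centre, left surround and right surround signals, and L0 and R0 are the sums of the corresponding weighted channels. The Q2.16 format of the 18 bit factor, the factor values in the usage lines, and the function names are hypothetical.

```python
DENORM_FRAC_BITS = 16  # assumption: 18-bit unsigned factor in Q2.16


def dematrix_tc0(l0, r0, t2, t3, t4):
    """Recover weighted L and R by subtraction in an accumulator.

    Assumes L0 = Lw + Cw + LSw and R0 = Rw + Cw + RSw with
    T2 = Cw, T3 = LSw, T4 = RSw (an assumed channel assignment).
    """
    lw = l0 - t2 - t3
    rw = r0 - t2 - t4
    return lw, rw, t2, t3, t4


def denormalize(sample, factor_q16):
    """Multiply a signed sample by an unsigned 18-bit factor."""
    if not 0 <= factor_q16 < (1 << 18):
        raise ValueError("factor must fit in 18 unsigned bits")
    return (sample * factor_q16) >> DENORM_FRAC_BITS


weighted = dematrix_tc0(1000, 900, 200, 100, 50)
# Hypothetical factor of 2.0 applied to every channel.
print([denormalize(w, 2 << DENORM_FRAC_BITS) for w in weighted])
```

An 18 bit unsigned factor with 16 fractional bits can represent gains just under 4, which is one plausible reason for a multiplier wider than the 16 bit units used elsewhere.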
At the end of the procedures described above, the five channel signals are stored in five registers (L0, R0, T2, T3, T4), respectively. The correspondence between each register and channel signal is determined from the channel switching information. The register contents, together with the corresponding channel and subband information, are passed to the synthesis filterbank.
`
C. Synthesis Subband Filter [5]

The synthesis filter reconstructs the time-domain signal from the subband samples transferred from the multichannel processor. Since synthesis filtering is the most time-consuming process in the MPEG audio decoder, it should be divided into smaller functions, and each part should work in parallel for real-time operation.

As specified in the MPEG audio international standard, 32 subband samples from the multichannel processor are processed in a few stages until 32 new audio samples are built. These processing steps can be simplified to two stages: the multiplication of the cosine matrix by the input subband samples, and windowing/overlap-add. Two identical MAC units perform the operations for each step, forming a two-stage pipelined structure. Each MAC unit consists of a 16 bit by 16 bit array multiplier and a 36 bit accumulator to allow up to 32 accumulations without loss in precision. Also contained in the synthesis filter are ROMs to store the cosine matrix and window coefficients, RAMs to store intermediate data, and controllers to generate addresses for internal memory access and provide control signals for the MAC units. Figure 6 shows the structure of the synthesis subband filter.
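The first of the two stages, the cosine matrix multiplication, can be sketched as below using the matrixing formula of the MPEG audio standard, in which each of 64 outputs is a sum of 32 products. Floating-point arithmetic is used for clarity. Whether the chip stores the full 64 by 32 matrix or a reduced form is not stated in the paper, so that layout is an assumption. The helper `accumulator_bits` estimates the accumulator width needed for 32 products, assuming 16 bit samples and coefficient magnitudes of at most 32767.

```python
import math


def cosine_matrix():
    """N[i][k] = cos((16 + i)(2k + 1) * pi / 64), for i in 0..63 and k in 0..31."""
    return [[math.cos((16 + i) * (2 * k + 1) * math.pi / 64) for k in range(32)]
            for i in range(64)]


def matrixing(subband_samples, matrix=None):
    """Stage 1: 64 intermediate values, each a sum of 32 products."""
    matrix = matrix or cosine_matrix()
    return [sum(row[k] * subband_samples[k] for k in range(32)) for row in matrix]


def accumulator_bits(num_terms=32, sample_bits=16, max_coeff=32767):
    """Two's-complement width needed to hold the worst-case sum of products."""
    worst = num_terms * (1 << (sample_bits - 1)) * max_coeff
    return worst.bit_length() + 1  # one extra bit for the sign


impulse = [1.0] + [0.0] * 31
v = matrixing(impulse)
print(len(v), round(v[0], 4), round(v[48], 4))
print(accumulator_bits())
```

Under these assumptions the width estimate agrees with the 36 bit accumulator quoted in the text for 32 accumulations.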
`
Figure 6. The Structure of the Synthesis Filter
`
The MPEG-2 subband filter uses 1024 samples of past intermediate data in the overlap-add process. This means that 1024 words for each channel should be available in memory at any time, which is too large an amount to be integrated into a single chip. Therefore, a memory management unit is included in the synthesis filter to control an external DRAM.
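The size of this storage requirement follows from simple arithmetic, sketched below. The 16 bit word width is an assumption based on the 16 bit datapaths described elsewhere in the paper, and the function name is illustrative.

```python
def overlap_add_storage(channels=5, words_per_channel=1024, bits_per_word=16):
    """Return (total words, total bits, total bytes) of overlap-add history."""
    total_words = channels * words_per_channel
    total_bits = total_words * bits_per_word
    return total_words, total_bits, total_bits // 8


words, bits, size_bytes = overlap_add_storage()
print(words, bits, size_bytes)

# For comparison: the preprocessor's internal RAM is 1k x 16 bits.
print(bits / (1024 * 16))
```

With these assumed widths, the five-channel history is five times the size of the preprocessor's entire internal RAM, which is consistent with the decision to keep it in external DRAM.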
Basically, 5 identical synthesis filters are required to transform the subband samples from as many channels into audio signals. In the implemented decoder, however, only one synthesis filter performs all the operations for the 5 channels by time sharing. The way the 5 channels share one synthesis filterbank is shown in Figure 7. In the first time slot, MAC-1 performs the cosine matrix multiplication using the data from the first channel, and when the intermediate data transfer is over, it repeats the same process for the second channel. The other unit, MAC-2, begins the overlap-add process for the first channel as soon as MAC-1 starts operations for the second channel. From this time on, the two MAC units always work together except for the last time slot. All the necessary processes for the remaining channels are performed in pipeline in a similar way.

Figure 7. The Timing Diagram of the Synthesis Subband Filter
[Timing diagram; legible row labels: D32, pipeline control, data transfer from MAC-1 to MAC-2, MAC-2 operation. The numbers in boxes indicate the channel being processed.]
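The time-sharing scheme described above amounts to a two-stage pipeline over the five channels, which the following sketch enumerates slot by slot. The function name and the use of `None` for an idle unit are illustrative choices, and the slots are treated as equal in length for simplicity.

```python
def pipeline_schedule(num_channels=5):
    """Return a list of (MAC-1 channel, MAC-2 channel) per time slot.

    Channels are numbered from 1. None means the unit is idle in that slot.
    """
    slots = []
    for t in range(num_channels + 1):
        mac1 = t + 1 if t < num_channels else None  # cosine matrix multiplication
        mac2 = t if t >= 1 else None                # windowing / overlap-add
        slots.append((mac1, mac2))
    return slots


for slot, (mac1, mac2) in enumerate(pipeline_schedule(), start=1):
    print(f"slot {slot}: MAC-1 -> {mac1}, MAC-2 -> {mac2}")
```

In this simplified model, five channels finish in six slots instead of the ten that a single MAC unit performing both stages in sequence would need.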
The time-domain samples, which are the final result of synthesis filtering, are temporarily stored in the external DRAM, read back later, and converted to serial data to provide a convenient interface with external DACs (Digital-to-Analog Converters).
`
IV. Conclusion

The MPEG-2 audio decoder implemented and presented in this paper was divided into a number of parts according to their functionality. Each module comprising the system has been designed rather independently of one another to achieve efficiency in design and verification. The proper operation of each module has been verified by comparing the result of computer simulation with that of post-synthesis simulation. All modules operate at the 54 MHz system clock. To reduce the chip area, external DRAM support logic was installed in the chip. Arithmetic units are designed differently in each module, with careful consideration of the trade-off between speed and area. Bitstreams for each channel are applied to the input of the system through FIFO buffers, and the signals of up to five channels are output in serial format together with timing information for synchronization. The proposed system works as a core of the Layer II multichannel decoder with limited accuracy. It supports mono, stereo, dual, and intensity stereo modes, phantom channel coding, dynamic crosstalk, and dynamic transmission channel switching. Dematrixing procedures 0, 1, 2, and 3, and the decoder configurations 1/0, 2/0, and 3/2 are also supported.
`
References

[1] ISO/IEC JTC1/SC29/WG11, "Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbps - CD 11172 (Part 3, MPEG-Audio)," Nov. 1991.
[2] ISO/IEC JTC1/SC29/WG11 N0803, "Coding of Moving Pictures and Associated Audio - IS 13818 (Part 3, MPEG-Audio)," Nov. 1994.
[3] Y. F. Dehery, et al., "A MUSICAM source codec for digital audio broadcasting and storage," Proc. ICASSP, pp. 3605-3608, 1991.
[4] K. Brandenburg, "ASPEC Coding," AES 10th Conference, pp. 81-89.
[5] J. H. Rothweiler, "Polyphase quadrature filters - A new subband coding technique," Proc. ICASSP, 1983, pp. 1280-1283.
`
Biographies

Sung-Chul Han received the B.S. and M.S. degrees, both in Electronic Engineering, from Yonsei University in 1994 and 1996, respectively. He now works for the Micro Division, Samsung Electronics Co., Ltd. He is interested in VLSI signal processing and chip design.
`
Sun-Kook Yoo received the B.S. degree in Electrical Engineering from Yonsei University in 1981. He received the M.S. and Ph.D. degrees in biomedical engineering from Yonsei University in 1985 and 1989, respectively. He was an assistant professor at Soonchunhyang University from 1989 to 1995, and he is an assistant professor in the Dept. of Biomedical Engineering, Yonsei University. He is interested in various topics such as bio-signal processing theory and implementation, high speed information systems, and VLSI signal processing and architecture.
`
Sung-Wook Park received the B.S. and M.S. degrees, both in Electronic Engineering, from Yonsei University in 1993 and 1995, respectively. He is now a Ph.D. student there. His research interests include VLSI signal processing/architecture and design rule.
`
`
Nam-Hun Jeong received the B.S. degree in Electronic Engineering from Yonsei University in 1995. He is now an M.S. student there. His research interests include VLSI signal processing/architecture and design rule.
`
Joon-Suk Kim received the B.S. degree in Electronic Engineering from Yonsei University in 1995. He is now an M.S. student there. His research interests include VLSI signal processing/architecture and especially microprocessor design.
`
Ki-Soo Kim received the B.S. and M.S. degrees, both in Electronic Engineering, from Yonsei University in 1991 and 1993, respectively. He is now a Ph.D. student there. His research interests include audio coding/enhancement and VLSI signal processing.
`
Yong-Tae Han received the B.S. degree in Electronic Engineering from Kyoungbook University in 1991 and the M.S. degree from Pohang Institute of Science and Technology, Pohang, Korea, in 1993. Since 1994, he has been working for Korea Telecom Transmission Technology Research Labs. His research interests include digital communications, adaptive digital filters, channel coding, and ASIC design for signal processing.
`
Dae-Hee Youn received the B.S. degree in Electrical Engineering from Yonsei University in 1977. He received the M.S. and Ph.D. degrees in the same field from Kansas State University in 1979 and 1982, respectively. He was an assistant professor at Kansas State University from 1982 to 1985, and he has been a faculty member in the Dept. of Electronic Engineering, Yonsei University, since 1985. He is interested in a broad range of signal processing topics such as adaptive filter theory and application, speech/audio coding, speech recognition, enhancement, and transformation, radar/sonar signal processing, and VLSI signal processing.
`