`L. Kiss, E. Hanssens, K. Adriaensen, M. Huysmans, C. Gendarme,
`E Van Beylen, H. Van De Weghe
`Alcatel Antwerp (Microelectronics department)
`Fr. Wellesplein 1
`2018 Antwerp (Belgium)
`(32)32407807
`hanssene@sh.bel.alcatel.be
`
`Abstract
`From the time domain to the ATM domain, the complete digital signal
`processing required by ADSL technology has been integrated onto a
`single device called SACHEM. High programmability along with a
`flexible architecture enables the device to serve for both network
`and line termination. Mapping on a 0.35 um standard digital CMOS
`technology makes SACHEM a cost-effective solution as well as a
`low-power device, consuming only 800 mW at 3.3 V.
`I. Introduction
`The most important feature of ADSL is that it can provide high-speed
`digital services on existing copper wire pairs, in overlay and
`without interfering with the traditional analogue telephone service
`(plain old telephone system: POTS), see Fig. 1. Due to its highly
`efficient line coding technique, ADSL can offer new services like
`high-speed Internet and on-line access, telecommuting and video on
`demand (VOD) to every residential telephone subscriber. The
`technology is largely independent of twisted pair characteristics,
`thereby enabling it to be applied universally, virtually regardless
`of the actual parameters of the local loop.
`
`Fig. 1: ADSL network.
`
`The modulation technique for ADSL, which has been standardized in
`T1.413 [1], is Discrete Multi-Tone (DMT), a special form of
`multicarrier modulation [2][3]. Fundamentally, DMT modulation
`superimposes several carrier-modulated waveforms to represent the
`input bit stream. The DMT transmit signal, see Fig. 2, is the sum of
`N independent sub-signals, each of equal bandwidth and equispaced
`with center frequency f_i, i = 1, ..., N. Each sub-channel can be
`considered as a Quadrature Amplitude Modulated (QAM) signal. In a DMT
`modulation scheme, the number of input data bits allocated to
`distinct sub-channels is variable. Obviously, sub-channels that
`encounter less attenuation and less noise will carry more bits of
`information.
`
`Fig. 2: DMT frequency division.
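The variable bit allocation can be illustrated with a minimal sketch. It assumes the widely used gap approximation b = log2(1 + SNR / gap); the 9.8 dB gap, the 15-bit cap, the example SNR values and all function names are illustrative assumptions, not parameters taken from the paper.

```python
import math

def bits_per_subchannel(snr_db, gap_db=9.8, max_bits=15):
    """Return an integer bit allocation for each sub-channel SNR (in dB).

    Gap approximation: b = floor(log2(1 + SNR / gap)).
    gap_db and max_bits are assumptions made for this sketch.
    """
    gap = 10 ** (gap_db / 10)
    allocation = []
    for value in snr_db:
        snr = 10 ** (value / 10)
        bits = int(math.floor(math.log2(1 + snr / gap)))
        allocation.append(max(0, min(max_bits, bits)))
    return allocation

# Hypothetical per-tone SNRs: clean tones first, heavily attenuated tones last.
example_snr_db = [50.0, 40.0, 30.0, 20.0, 10.0, 0.0]
example_bits = bits_per_subchannel(example_snr_db)
```

Tones with a higher SNR receive more bits, and a tone whose SNR falls below the gap carries none.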
`The chip we propose reaches the highest integration level. The
`complete digital signal processing for ADSL DMT-based modem
`functionality and transport convergence functions such as
`(de)interleaving, Reed-Solomon (de)coding, (de)scrambling and
`(de)framing is integrated in the single device called SACHEM.
`
`0-7803-4980-6/98/$10.00 © 1998 IEEE
`
`Authorized licensed use limited to: Alston & Bird LLP. Downloaded on January 20,2022 at 16:30:21
`
`CommScope, Inc.
`IPR2023-00066, Ex. 1027
`
`Fig. 3: SACHEM architecture.
`
`II. Architecture
`The SACHEM is used in both central office (line termination) and
`remote (network termination) applications, and is designed for
`sampling rates up to 8.8 MS/s with DMT symbols at 4 kHz. On the
`other side, it can interface with ATM devices through a UTOPIA
`interface (level 1 and level 2) or with synchronous devices through
`the SLAP interface (Alcatel proprietary). The following main
`functions can be distinguished in the SACHEM, see Fig. 3: up- and
`downsampling, time domain equalization, time-frequency conversion
`and vice versa, frequency domain equalization, symbol alignment,
`frequency deviation tracking, constellation (de)coding and tone
`ordering, channel (de)coding and ATM.
`
`A. DSP Front End
`The DSP Front End contains a transmit part, which performs filtering
`and upsampling, a receive part, which performs downsampling and time
`domain equalization, and some test functionality such as bypass and
`transmit-receive looping, see Fig. 4.
`
`Fig. 4: DSP Front End architecture.
`
`The receive path performs decimation and time domain equalization.
`The decimator receives 16-bit words at 8.8 MHz from the analog front
`end and reduces the rate to 552 kHz in central office application and
`2.2 MHz in remote application. Downsampling by a factor of 16 is
`performed by a cascade of halfband FIR filters: two 3-tap triangular
`filters reducing the rate by 4, followed by a 15-tap sin x/x
`Hamming-compensated FIR filter reducing the rate by a further factor
`of 2, and finally a 59-tap sin x/x Hamming-compensated FIR filter
`bringing the rate to 552 kHz. The factor-4 downsampling is obtained
`by dropping the up-front triangular filters, which gives an output
`rate of 2.2 MHz. The time equalizer is a FIR filter with programmable
`coefficients, mainly intended to reduce the effect of Inter-Symbol
`Interference (ISI) by shortening the channel impulse response. Its
`length is determined by the type of application: 64 taps in central
`office and 32 taps in remote configuration.
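The rate arithmetic of the decimation cascade can be checked with a short sketch. It assumes that "8.8 MHz" stands for 8.832 MHz (16 times 552 kHz) and uses a 3-tap triangular filter normalized to unity DC gain; the tap values, the zero initial filter history and the function names are assumptions, not details from the paper.

```python
def decimate_by_two(samples, taps):
    """Apply an FIR filter (zero initial history) and keep every second output."""
    out = []
    for n in range(0, len(samples), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out

# Assumed normalization of the 3-tap triangular filter (DC gain of 1).
TRIANGULAR_3TAP = [0.25, 0.5, 0.25]

def output_rate(input_rate_hz, halving_stages):
    """Rate after a cascade of decimate-by-2 stages."""
    return input_rate_hz / (2 ** halving_stages)

INPUT_RATE_HZ = 8.832e6                       # assumption: 16 x 552 kHz
co_rate = output_rate(INPUT_RATE_HZ, 4)       # central office: all four stages
remote_rate = output_rate(INPUT_RATE_HZ, 2)   # remote: triangular stages dropped

# A constant input passes through both triangular stages with unity gain
# once the filter history has filled.
dc_input = [1.0] * 16
after_two_stages = decimate_by_two(
    decimate_by_two(dc_input, TRIANGULAR_3TAP), TRIANGULAR_3TAP
)
```

With these assumptions, four halving stages give 552 kHz and two give 2.208 MHz, matching the rounded figures in the text.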
`The transmit direction includes sidelobe filtering, clipping, delay
`equalization and interpolation. The sidelobe filtering and delay
`equalization are implemented in a 3-stage and a 2-stage biquad
`(2nd-order FIR + IIR), thus reducing the effect of echo. Clipping
`limits the amplitude of the output signal by means of a FIR-type
`structure and as such optimizes the dynamic range of the analog
`front end. The interpolator performs an upsampling by 2, from 4.4 MHz
`to 8.8 MHz, in central office application by means of a 7-tap
`triangular FIR filter. An upsampling by 4, from 2.2 MHz to 8.8 MHz,
`is performed in remote application by a simple hold function. The
`noise shaper reduces the word size from 16 bits to 13 bits with a
`first-order IIR, thus minimizing the noise introduced by the word
`size reduction.
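The paper does not detail the noise shaper structure. As a hedged illustration, the sketch below uses a generic first-order error-feedback requantizer: the three dropped LSBs of each sample are carried over and added to the next sample, so the truncation error is pushed towards high frequencies instead of accumulating. The function name and the absence of saturation handling are assumptions of this sketch.

```python
def noise_shape(samples_16bit, dropped_bits=3):
    """First-order error-feedback requantizer: 16-bit words in, 13-bit words out.

    Generic structure assumed for illustration; saturation is not handled.
    """
    error = 0
    out = []
    for x in samples_16bit:
        v = x + error                      # add the residue left by the previous sample
        q = v >> dropped_bits              # arithmetic shift: floor division by 8
        error = v - (q << dropped_bits)    # residue in the range 0..7
        out.append(q)
    return out

# A small constant input: plain truncation would output only zeros,
# while the shaped output preserves the average value.
shaped = noise_shape([5] * 8)
truncated = [5 >> 3] * 8
```

In this example the eight shaped outputs sum to 5, that is 40 / 8, so the mean of the input survives the word size reduction.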
`
`B. FFT, Rotor, FEQ and FTG
`The Fast Fourier Transformer is instantiated twice in the SACHEM. It
`is used as a DMT carrier demodulator in the receive direction and as
`a modulator in the transmit direction. It is a programmable machine
`with an instruction set, which can do all processing for one DMT
`symbol in less than 250 usec, and is based on a dedicated pipelined
`multiplier-accumulator ALU. The ALU contains two 20x18 fixed-point
`multipliers and two busses: one for data (two times 20 bits) and one
`for coefficients (two times 18 bits). The ALU performs complex
`radix-2 and radix-4 decimation-in-time (I)FFT butterflies, special
`'resolve' butterflies to combine the results of real FFTs, complex
`times complex multiplications with 2^N scaling, and complex times
`real multiplications.
`In the receive direction the FFT, see Fig. 5, used as DMT carrier
`demodulator, performs the following functions:
`- A conversion from real time samples to positive frequencies, from
`512 (CO) or 128 (R) time samples to 256 complex positive
`frequencies, with a maximum computing delay of 92 usec.
`- Frequency equalization (FEQ), a rotation (360 degrees maximum) to
`align the received carriers on the X and Y axes, is performed to
`reduce the signal phase rotation caused by carrier-specific channel
`distortion. Signal amplitude attenuation caused by the same
`distortion can be compensated by applying a fine gain, between 0 and
`-6 dB, in the FEQ computation, thereby adjusting the received vector
`to the demapping grid. The FEQ calculation is performed within
`15 usec for remote and 5 usec for central office application.
`- ROTOR, performed on positive frequencies, applies a linear phase
`correction to compensate for a misalignment of the sampling clock.
`In effect it interpolates the sampling clock of the received data to
`any intermediate point, but in the frequency domain. The following
`formula applies:
`$f_i' = f_i \cdot e^{j \cdot i \cdot \Delta\phi}$
`where $i \cdot \Delta\phi$ is the content of an accumulator
`incremented by $\Delta\phi$ for each next frequency. This rotor
`process is performed in 2 steps, a coarse adjustment followed by a
`fine adjustment (6 LSBs of the rotor value). The computational delay
`is 30 usec in remote and 9 usec in central office application.
`
`Fig. 5: Receive FFT.
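The linear phase correction of the rotor can be sketched as follows: each frequency bin i is rotated by the angle held in an accumulator that grows by a fixed increment per bin. The sign convention, the floating-point arithmetic and the name `apply_rotor` are assumptions of this sketch; the hardware works in fixed point with a coarse and a fine step.

```python
import cmath

def apply_rotor(bins, delta_phi):
    """Rotate bin i by the angle i * delta_phi using a running accumulator."""
    rotated = []
    phase = 0.0  # accumulator content, equal to i * delta_phi
    for value in bins:
        rotated.append(value * cmath.exp(1j * phase))
        phase += delta_phi  # incremented for each next frequency
    return rotated

# Four unit-amplitude bins with a (hypothetical) increment of 90 degrees.
example = apply_rotor([1 + 0j] * 4, cmath.pi / 2)
```

The rotation leaves the magnitude of each bin unchanged and only shifts its phase, which is what a delay of the sampling instant does in the frequency domain.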
`In the transmit direction the IFFT performs the complementary
`operations:
`- Fine tune gain (FTG), meant to correct the gain of individual
`carriers, an operation taking 15 usec for 256 frequencies (R) and
`30 usec for 512 frequencies (CO).
`- ROTOR calculation to adjust the frequency error between the local
`X-tal and the desired transmit frequency. The computing delay of
`both operations, coarse and fine, is 30 usec for remote mode and
`60 usec for central office application.
`- The IFFT performs a conversion from positive frequencies to real
`time samples: in remote mode 256 positive frequencies are processed,
`yielding 512 time samples within 76 usec, while in central office
`mode 512 positive frequencies are converted to 1024 time samples
`within 178 usec.
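The relation between N positive frequencies and 2N real time samples follows from Hermitian symmetry. The sketch below shows it with a naive inverse DFT on a small example; it is not the radix-2/radix-4 machine described in the paper, and the treatment of the DC and Nyquist bins is an assumption.

```python
import cmath

def positive_freqs_to_real(pos):
    """Convert N positive-frequency bins to 2N real samples (naive inverse DFT).

    Assumptions of this sketch: the DC bin uses only the real part of pos[0]
    and the Nyquist bin is set to zero.
    """
    n = len(pos)
    size = 2 * n
    spectrum = [0j] * size
    spectrum[0] = complex(pos[0].real, 0.0)
    for k in range(1, n):
        spectrum[k] = pos[k]
        spectrum[size - k] = pos[k].conjugate()  # Hermitian mirror image
    samples = []
    for t in range(size):
        acc = sum(
            spectrum[k] * cmath.exp(2j * cmath.pi * k * t / size)
            for k in range(size)
        )
        samples.append((acc / size).real)  # imaginary part cancels by symmetry
    return samples

# A single active carrier in bin 1 of N = 4 yields 8 samples of a cosine.
time_samples = positive_freqs_to_real([0j, 1 + 0j, 0j, 0j])
```

The same doubling explains the figures in the text: 256 positive frequencies give 512 time samples and 512 give 1024.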
`
`C. Constellation (de)coding
`
`The receive part mainly contains the following blocks:
`- The demapper, which converts the constellation points computed by
`the FFT to a block of bits by use of a programming table (tone
`ordering). This essentially consists of identifying a point in the
`2D QAM constellation plane. It is also capable of demodulating 4D
`Trellis-encoded carriers [4][5] using the 2D subset information
`provided by the Trellis decoder.
`- The Viterbi decoder, see Fig. 6, collects data during a number of
`cycles and estimates the most likely 4D subset based on a long data
`sequence. It computes 64 branch metrics, performs 16
`add-compare-select operations, uses a backtrace length of 20 4D
`symbols and translates 4D to 2D subset information per 2 tones to
`provide information towards the demapping process.
`- The Monitor computes error parameters that will be used for
`software updates of adaptive filter coefficients (FEQ, TEQ), clock
`phase adjustment (DPLL) and error detection (loss of signal, loss of
`frame). Signal detection, also a part of the monitoring activities,
`is built around 8 configurable leaky integrators whose outputs are
`fed to a highly programmable level detector. Error parameters
`obtained from linear monitoring can be used for automatic hardware
`updates of FEQ coefficients. This adaptive process can be inhibited
`for the pilot tone or in case of a missing signal, to block
`incorrect coefficient updates.
`
`Fig. 6: Viterbi decoder.
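The signal detection of the Monitor is described only as leaky integrators feeding a level detector. A minimal sketch of that idea, using a generic first-order leaky integrator with an assumed leak factor and threshold (neither value comes from the paper), could look like this:

```python
def leaky_integrator(power_samples, leak=0.125, state=0.0):
    """y[n] = y[n-1] + leak * (x[n] - y[n-1]); returns every intermediate state."""
    history = []
    for x in power_samples:
        state += leak * (x - state)
        history.append(state)
    return history

def signal_present(level, threshold):
    """Level detector: declare a signal when the integrated level reaches the threshold."""
    return level >= threshold

# Hypothetical inputs: a steady tone power of 1.0 versus silence.
with_signal = leaky_integrator([1.0] * 64)
without_signal = leaky_integrator([0.0] * 64)
```

The integrator output rises smoothly towards the input level, so short glitches do not trip the detector while a sustained signal does.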
`The transmit block is less complex and only contains the following
`functions: 4D Trellis encoding and mapping of data by use of a
`programming table (tone ordering). The Trellis encoder fetches the
`information of a pair of tones and adds 1 redundant bit, thus
`creating an overhead of 1/2 bit per tone.
`
`D. Transmission Convergence layer
`The data received from the demapper is split into two paths, one
`dedicated to the interleaved or slow data flow and the other to the
`non-interleaved or fast data flow. Except for
`interleaving/deinterleaving, these 2 flows are identical, from
`demapper to SLAP or UTOPIA interface as well as from interface to
`mapper. For the purpose of clarity, only one flow is shown in
`Fig. 3. Interleaving/deinterleaving is used to increase the error
`correction capability of block codes for error bursts. Interleaving
`a block code with depth D increases the burst error correction
`capability from T bytes to D*T bytes. SACHEM uses rectangular
`interleaving with depths 1, 2, 4, 8, 16, 32 and 64.
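The D*T burst property can be made concrete with a small rectangular (block) interleaver sketch: D codewords are written row by row and transmitted column by column, so consecutive channel bytes belong to different codewords. The row/column orientation and the helper names are assumptions; the paper does not describe the memory organisation.

```python
def interleave(data, depth, width):
    """Write row by row into a depth x width array, read column by column."""
    assert len(data) == depth * width
    return [data[r * width + c] for c in range(width) for r in range(depth)]

def deinterleave(data, depth, width):
    """Inverse of interleave(): restore the row-by-row order."""
    assert len(data) == depth * width
    out = [None] * (depth * width)
    i = 0
    for c in range(width):
        for r in range(depth):
            out[r * width + c] = data[i]
            i += 1
    return out

def max_errors_per_codeword(depth, burst_start, burst_len):
    """Largest number of bytes a channel burst corrupts in any one codeword (row)."""
    hits = [0] * depth
    for pos in range(burst_start, burst_start + burst_len):
        hits[pos % depth] += 1  # channel byte at pos belongs to row pos % depth
    return max(hits)

# Depth 4 and a code correcting T = 2 bytes: a burst of D*T = 8 bytes
# touches each codeword at most twice.
worst = max_errors_per_codeword(depth=4, burst_start=5, burst_len=8)
```

One byte more than D*T breaks the guarantee: a 9-byte burst puts three errors into at least one codeword.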
`
`The Reed-Solomon decoder [6][7] is able to correct errored bytes by
`using the redundant bytes and erasure information. The error
`correcting capability of a Reed-Solomon (N,M) code is limited by the
`following equation:
`$(E_d + 2 \cdot E_u) \le (N - M)$
`where $E_d$ is the number of erased bytes, $E_u$ the number of
`undetected error bytes, N the number of RS codeword bytes and M the
`number of data bytes. The SACHEM, which can be configured with up to
`16 overhead bytes (N is even), is capable of decoding 3 codewords
`(N = 255) within one DMT symbol.
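As a worked example of the correction bound, the check below tests whether a given mix of erasures and undetected errors stays within N - M. The function name and the chosen code parameters (N = 255 with 16 overhead bytes, hence M = 239) are illustrative assumptions.

```python
def rs_correctable(n, m, erasures, undetected_errors):
    """True if the pattern satisfies erasures + 2 * undetected_errors <= n - m."""
    return erasures + 2 * undetected_errors <= n - m

# Hypothetical configuration: N = 255 codeword bytes, 16 overhead bytes.
N, M = 255, 239
errors_only = rs_correctable(N, M, erasures=0, undetected_errors=8)
erasures_only = rs_correctable(N, M, erasures=16, undetected_errors=0)
mixed = rs_correctable(N, M, erasures=4, undetected_errors=6)
too_many = rs_correctable(N, M, erasures=5, undetected_errors=6)
```

An erased byte costs one redundant byte and an undetected error costs two, which is why erasure information from the demapper side is worth passing to the decoder.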
`Two PDM descramblers are used, one for slow and one for fast data,
`performing $d_n' = d_n \oplus d_{n-18}' \oplus d_{n-23}'$ as
`specified in the ADSL standard T1.413 [1].
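A bit-level sketch of a scrambler and descrambler pair with taps at delays 18 and 23 is given below. It assumes the usual self-synchronizing arrangement, in which the scrambler feeds back its own output and the descrambler applies the same taps to the received bits, and it assumes an all-zero initial state; both points are assumptions, not statements from the paper.

```python
def scramble(bits):
    """Scrambler: out[n] = in[n] ^ out[n-18] ^ out[n-23] (zero initial state)."""
    out = []
    for n, b in enumerate(bits):
        d18 = out[n - 18] if n >= 18 else 0
        d23 = out[n - 23] if n >= 23 else 0
        out.append(b ^ d18 ^ d23)
    return out

def descramble(bits):
    """Descrambler: out[n] = in[n] ^ in[n-18] ^ in[n-23], delays on received bits."""
    out = []
    for n, b in enumerate(bits):
        d18 = bits[n - 18] if n >= 18 else 0
        d23 = bits[n - 23] if n >= 23 else 0
        out.append(b ^ d18 ^ d23)
    return out

# Hypothetical payload of 64 bits.
payload = [1, 0, 1, 1, 0, 0, 1, 0] * 8
recovered = descramble(scramble(payload))
```

Because the descrambler only looks at received bits, a single channel bit error affects at most three output bits and the pair resynchronizes without any extra signalling.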
`The deframer is a highly programmable synchronization machine with a
`variable synchronization delay according to (S * D) mod 272, with S
`being the number of DMT symbols per codeword (S = 0.5, 1, 2, 4, 8,
`16) and D being the interleaving depth. The extracted First, A and B
`bytes are sent towards CRC, EOC, AOC termination.
`Finally the two byte streams (slow and fast) are presented to the
`ATM byte-based processing unit, which provides basic cell functions
`like cell synchronization, payload descrambling, idle/unassigned
`cell filtering and cell header detection and correction, all
`according to the ITU-T 1.163 standard. Provision is also made for
`Bit Error Rate (BER) measurement.
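Cell header checking and idle cell filtering can be illustrated with the header error control (HEC) byte as it is commonly defined for ATM in ITU-T I.432: a CRC-8 over the first four header bytes with generator x^8 + x^2 + x + 1, XORed with 0x55. Whether this matches the standard cited in the text is an assumption, and the function names are hypothetical.

```python
def atm_hec(header4):
    """CRC-8 (polynomial x^8 + x^2 + x + 1) over four header bytes, XORed with 0x55."""
    crc = 0
    for byte in header4:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55

def is_idle_cell(header5):
    """Idle cell: header bytes 00 00 00 01 followed by a matching HEC byte."""
    return (
        bytes(header5[:4]) == b"\x00\x00\x00\x01"
        and header5[4] == atm_hec(header5[:4])
    )

idle_header = b"\x00\x00\x00\x01" + bytes([atm_hec(b"\x00\x00\x00\x01")])
```

A receiver can use the same check for cell synchronization, sliding over the byte stream until the fifth byte of a candidate header matches the HEC of the preceding four.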
`The transmit path is the dual of the receive path and contains ATM
`TC layer functions, framing, PDM scrambling, Reed-Solomon encoding
`and interleaving.
`
`III. Design Methodology
`As a result of design methodology improvement activities, a
`continuous process at Alcatel Antwerp, three new basic approaches
`were introduced in the design flow in order to meet stringent
`time-to-market requirements: modelling (C++ and behavioral VHDL),
`simulation flexibility, and design mapping on an emulator to speed
`up test and software development even before any silicon was
`available.
`Bit-true C-models of the PMD layer were used to verify system
`performance and to match against VHDL simulation results. The
`simulation environment of SACHEM was built in such a manner that
`designers were able to switch between behavioral, RTL, Verilog
`gate-level or mapped database (emulator) simulations without
`worrying about stimuli or input files. Mapping of the SACHEM on an
`emulator enabled us to speed up simulation significantly, by a
`factor of more than 100 when the emulator was driven directly by a
`workstation, or more than 100,000 when the emulator was driven by a
`test board (with simulation software being compiled on the target
`on-board controller). Moreover, software development could start as
`soon as the netlist was mapped, i.e. several weeks before the first
`samples were available.
`Altogether, these new design methodology approaches resulted in a 3
`to 6 month reduction in development time. As they are not specific
`to the SACHEM design, these approaches will be part of the design
`flow for every new ASIC designed at Alcatel Antwerp.
`
`IV. Results
`A photograph of the SACHEM is included, see Fig. 7. The device is
`designed in 0.35 um digital CMOS technology (5 metal layers) and
`packaged in a 144-pin PQFP. Thanks to special attention during
`development, appealing power consumption and silicon area figures
`were obtained, especially when compared to designs of such
`complexity: SACHEM only consumes 800 mW in operational conditions
`(3.3 V, ambient temperature) while its area is well below 100 mm².
`
`Fig. 7: Photograph of SACHEM.
`
`V. Conclusion
`This paper shows a high-complexity design, approaching the system
`on chip concept, which was developed within a very short lead time
`and introduced new design methodology concepts such as modelling and
`use of emulation for parallel engineering (hardware and software).
`
`VI. References
`[1] Network and Customer Installation Interfaces - Asymmetric
`Digital Subscriber Line (ADSL) Metallic Interface.
`ANSI T1.413, 1995
`[2] Frequency domain data transmission using reduced computational
`complexity algorithms. A. Peled and A. Ruiz
`International Conference on Acoustics, Speech and Signal
`Processing, Denver, pp. 964-967, April 1980
`[3] A frequency domain approach to combined spectral shaping and
`coding. A. Ruiz and J.M. Cioffi
`ICC 87, Seattle, pp. 1711-1715, June 1987
`[4] Trellis-Coded Modulation with Redundant Signal Sets Part I:
`Introduction. G. Ungerboeck
`IEEE Communications Magazine, Vol. 25, No. 2, Feb 1987
`[5] Trellis-Coded Modulation with Multidimensional Constellations.
`L.-F. Wei
`IEEE Trans. on Information Theory, Vol. 33, No. 4, July 1987
`[6] Time domain algorithms and architectures for Reed-Solomon
`decoding. S. Choomchuay, B. Arambepola
`IEE Proceedings-I, Vol. 140, No. 3, June 1993
`[7] High-speed Reed-Solomon decoder for correcting errors and
`erasures. C.-H. Wei, C.-C. Chen, G.-S. Liu
`IEE Proceedings-I, Vol. 140, No. 4, August 1993
`
`
`