`
K. CHEN, M. AFGHAHI, P. E. DANIELSSON, and C. SVENSSON,
“PASIC: A processor-A/D converter-sensor integrated circuit,”
ISCAS90, pp. 1705-1708, 1-3 May 1990
`
`
`
`
`
`TRW Automotive U.S. LLC: EXHIBIT 1028
`PETITION FOR INTER PARTES REVIEW
`OF U.S. PATENT NUMBER 8,599,001
`IPR2015-00436
`
`
`
PASIC: A Processor-A/D converter-Sensor Integrated Circuit

K. Chen*, M. Afghahi*, P. E. Danielsson** and C. Svensson*
* LSI Design Center, ** Department of Electrical Engineering
Linköping University
S-581 83 Linköping, Sweden
`
pixel-serial architecture limits the speed and flexible use of
the sensor information. To avoid the large number of pads in
line-by-line image processing, a linear array of parallel A/D
converters and parallel shift registers has to be integrated on
the sensor chip. A linear processor array is then added to
provide the "intelligence" of the sensor, which produces the
processed or possibly compressed digital image output.

Such an integrated smart sensor, PASIC, is under
development. PASIC stands for Processor, A/D converter,
Sensor Integrated Circuit. The chip also contains two 128x8-bit
bi-directional shift registers and a 128x128-bit dynamic
RAM. Fig. 1 illustrates the PASIC architecture, which is
also the chip floor plan.
`
Fig. 1. PASIC architecture (legible block label: 128 bit-serial PEs)
`
The 2-D sensor array is arranged in columns. Each column
of sensors has an analog read-out bus. Thus, one row of
sensors at a time is connected to the vertical parallel output.
Each column of sensors is equipped with one A/D converter,
two 8-bit shift registers, a bit-serial processor and a 128-bit
RAM. Each of the two 8-bit shift registers can be operated
independently in five modes: read, write, latch, shift right
and shift left. The A/D converter, the shift registers, the
processor and the memory communicate over a single-bit
bus. The two parallel shift registers provide the horizontal
communication link between adjacent processors. They also
serve as the input and output of the chip.
`
This paper describes the design of an integrated smart
sensor, PASIC. The basic idea is to integrate a 2-D image
sensor array with a linear A/D converter array and a linear
processor array in a single chip. The current version of
PASIC contains 128 parallel processors with a 128x128-bit
memory, 128 8-bit A/D converters and a 128x128 photo
sensor array. Two 128x8 bi-directional shift registers are used
for communication between processor elements and I/O. A
memory-bus organized architecture has been used, which has
been proven to be an efficient VLSI architecture for a SIMD
bit-serial processor array.
`
`Introduction
`
Visual sensors have great potential in many industrial
applications. Commercially available visual sensors are
developed mainly for television. Such sensors have excellent
sensitivity, high resolution, can handle colors and have
excellent reliability. However, their pixel read-out
architecture, fixed frame rate and fixed resolution have a
number of drawbacks for robot vision applications.
`
An elegant approach to incorporating vision in a robot
system is to integrate a special purpose camera with a
parallel processor array on the sensor chip. The continuous
progress of VLSI technology has provided the opportunity to
integrate both analog and digital techniques on a single chip.
For an image processing system, it is desirable to merge the
sensor array, the A/D converter array and the processor array
on the same chip. A basic goal of such a smart sensor is to
provide a digital image output and to perform certain
low-level image processing tasks. Thanks to the use of a
bit-serial strategy in the processors and the A/D converters,
the amplitude resolution is programmable. The frame rate is
limited by the A/D resolution and the integration time of the
photo diodes, which depends on the overall light intensity.
Thus, the temporal resolution (frame rate) can be traded for
A/D converter resolution. The spatial resolution can also be
altered by proper design of the sensor array and its
control/addressing circuitry. The final goal might be to
perform more complicated data reduction and feature
extraction on the sensor chip. Since all data communication is
within the chip, the high demand for I/O bandwidth between
chips in a conventional system is greatly reduced. Reliability
is improved because fewer chips are needed.
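
The trade between frame rate and A/D resolution can be made concrete with a rough upper-bound estimate. The sketch below is an illustration only, not a figure from the paper: it assumes a ramp-type converter that steps through 2^n counter values per row, one counter step per 20 MHz clock cycle, 128 rows converted one after another, and an integration time added once per frame. The function name and parameters are hypothetical, and readout and processing time are ignored.

```python
def max_frame_rate(n_bits, rows=128, clock_hz=20e6, integration_s=0.0):
    """Upper bound on frames per second for row-by-row ramp conversion.

    All parameters are illustrative assumptions, not measured values.
    """
    counts_per_row = 1 << n_bits              # ramp visits 2**n_bits counter values
    conversion_s = counts_per_row / clock_hz  # assumed: one count per clock cycle
    frame_s = rows * conversion_s + integration_s
    return 1.0 / frame_s


if __name__ == "__main__":
    for bits in (8, 6, 4):
        print(f"{bits}-bit conversion: at most {max_frame_rate(bits):.0f} frames/s")
```

Under these assumptions, each bit of resolution given up halves the conversion time per row, so the bound on frame rate doubles until the integration time dominates.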
`
`The PASIC Architecture
`
`Fundamental limits of the number of pads in available
`VLSI package technology lead to the pixel serial read-out
`architecture of today's digital image processing systems. The
`
`CH2868-8/90/0000-1705$1.00 © 1990 IEEE
`
`
`
`
`speed, we have developed a new bit-serial processor
`architecture which is well suited for the purpose.
`
`The proposed bit-serial processor is characterized by a
`single-bit bus connection between A/D converter, shift
`registers, processor and memory, see Fig. 4.
`
Fig. 4. Memory-bus organized bit-serial processor array
(legible block labels, all attached to the memory bus: A/D
converter; two 8-bit bi-directional shift registers; bit-serial
processor with 8 ALU functions; 128-bit RAM; memory bus
read/write)
`
The memory-bus organized architecture is a true bit-serial
processor. All units (memory cells, ALU, shift registers,
A/D converter etc.) appear as memory-like ports on the bus
and are controlled by simple memory-type decoders. At any
one time instant, only one bit of data is fed into or read out
from the ALU. Examples of bit-serial algorithms for image
processing [1] indicate that this communication strategy is
not slower than other bit-serial machines like DAP, GAPP or
MPP.
`
It can be noted from Fig. 4 that there are virtually no data
paths between the three ALU registers A, B and C except
over the bus. However, this seeming deficiency is
compensated by the fact that the ALU logic has no fewer than
8 different types of Boolean functions. Since the A and B
registers can receive data inverted or non-inverted from the
bus, the number of Boolean functions is further extended. For
instance, subtraction A-B can be implemented by using the A
register input. Table I summarizes the processor chip
performance.
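
As an illustration of how a bit-serial ALU with an invertible register input can add and subtract, the following sketch processes two words one bit per step, least significant bit first. It is a generic full-adder model, not the PASIC micro-program: the helper names are hypothetical, and the choice to invert the second operand and preset the carry to 1 (two's-complement subtraction) is an assumption, since the text does not spell out which input is inverted.

```python
def to_bits(value, width):
    """Split an unsigned integer into a list of bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Reassemble an unsigned integer from LSB-first bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(a_bits, b_bits, carry_in=0, invert_b=False):
    """Add two LSB-first bit lists one bit per step, as a bit-serial ALU would.

    invert_b models a register that loads inverted data from the bus
    (which register is inverted is an assumption made for illustration).
    """
    carry = carry_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        if invert_b:
            b ^= 1
        sum_bits.append(a ^ b ^ carry)               # sum output
        carry = (a & b) | (a & carry) | (b & carry)  # carry fed back
    return sum_bits, carry


def subtract(a, b, width=8):
    """a - b modulo 2**width, via a + (~b) + 1."""
    bits, _ = bit_serial_add(to_bits(a, width), to_bits(b, width),
                             carry_in=1, invert_b=True)
    return from_bits(bits)


if __name__ == "__main__":
    total, carry_out = bit_serial_add(to_bits(100, 8), to_bits(55, 8))
    print(from_bits(total), carry_out)
    print(subtract(100, 55))
```

Each loop iteration corresponds to one bit position; on the real chip every position costs several bus cycles, which is where the per-bit cycle counts in Table I come from.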
`
The control word (the micro-instruction) is 26 bits wide
and can be divided into five fields.
`
A/D converter array

In a conventional camera system, a single video-rate A/D
converter has to handle an entire frame of an image. In
PASIC, there is one A/D converter for each column of
sensors. The speed requirement on these A/D converters is
greatly reduced to a "line rate", which is two orders of
magnitude lower than the video pixel rate. A serial ramp-type
A/D converter can therefore be used. The parallel A/D
converter array consists of one counter, one ramp generator,
128 comparators and 128 8-bit RAM cells, as shown in Fig. 2.
`
Fig. 2. The A/D converter array (legible block labels: analog
inputs; 3-to-8 decoder; RAM, one per column; 8-bit counter;
memory buses)
`
Three-transistor memory cells are used in the RAM, as shown
in Fig. 3. The value of the 8-bit counter is broadcast and
written into the RAMs as long as the ramp signal is less
than the analog input. When the analog input signal in a
channel is equal to the ramp signal, the output of the
comparator prohibits further writing. The RAM outputs are
connected to single-bit buses, which allow the data to be
transferred bit-serially to the shift registers, processor or
memory.
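
The conversion scheme described above can be sketched as a small simulation: one shared counter and ramp serve all columns, and each column's RAM keeps overwriting its stored value until its comparator trips. This is a behavioural model under stated assumptions (an ideal linear ramp, normalised inputs between 0 and a hypothetical `full_scale`), not a description of the actual circuit.

```python
def ramp_convert(analog_inputs, n_bits=8, full_scale=1.0):
    """Convert one row of analog values with a shared counter and ramp.

    analog_inputs: one value per column, in the range 0..full_scale.
    Returns the counter value latched in each column's RAM.
    """
    levels = 1 << n_bits
    latched = [0] * len(analog_inputs)
    for count in range(levels):
        # Shared ramp advances one step per counter value (assumed linear).
        ramp = full_scale * count / levels
        for col, vin in enumerate(analog_inputs):
            if ramp < vin:
                # Comparator still allows writing: store the broadcast count.
                latched[col] = count
            # Otherwise the comparator blocks further writes for this column.
    return latched


if __name__ == "__main__":
    print(ramp_convert([0.0, 0.25, 0.5, 1.0]))
```

All columns finish in the same 2^n counter steps regardless of how many columns there are, which is why one slow ramp converter per column is sufficient at line rate.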
`
Fig. 3. A RAM cell used in the converter array (legible signal
labels: from comparator; from decoder; from counter; memory
bus)
`
`Processor Array
`
Because of the large number of processor elements (PEs)
involved, each PE has to be extremely simple. In particular,
one PE has to be squeezed into the pitch of the photo sensor
element, which is 60 µm in the first prototype with a 2 µm
technology, 40 µm in the second prototype with a 1.6 µm
technology, and is expected to decrease in later designs. We
found it difficult to implement the existing architectures in
our narrow slot. By observing that the connection between
ALU and memory is the bottleneck of the bit-serial PE in
`
`
`
AD:   4-bit A/D converter address/control
        AD=0-7       0-7th bit to Bus
        AD=8         data latch
        AD=9         count

SR1:  5-bit shift register 1 address/control
SR2:  5-bit shift register 2 address/control
        SR=0         shift right
        SR=1         shift left
        SR=2         data latch
        SR=3         circle right
        SR=4         circle left
        SR=5-12      Bus to 0-7th bit register
        SR=13-20     0-7th bit register to Bus

PE:   4-bit ALU input/output address/control
        PE=0         B+C to Bus
        PE=1         BC to Bus
        PE=2         carry to Bus
        PE=3         Bus to C
        PE=4         AC+BC to Bus
        PE=5         AC+BC to Bus
        PE=6         Bus to B
        PE=7         B to Bus
        PE=8         Bus to B
        PE=9         B⊕C to Bus
        PE=10        Bus to A
        PE=11        Bus to A
        PE=12        Bus to A & carry feedback
        PE=13        sum to Bus
        PE=14        sum to Bus & carry feedback
        PE=15        no operation
      (The pairs PE=4/5, PE=6/8 and PE=10/11 read alike in this
      copy; the marks that distinguished them, presumably
      inversion bars, are not legible.)

MA:   8-bit memory address/control
        MA=0         0 to Bus
        MA=1-127     1-127 bit to Bus
        MA=128       no operation
        MA=129-256   Bus to 1-127
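
The five fields listed above add up to the 26-bit control word (4 + 5 + 5 + 4 + 8). A minimal sketch of packing and unpacking such a word follows. The bit order (AD in the most significant bits, MA in the least significant) is an assumption made for illustration; the paper does not state the field layout, and the function names are hypothetical.

```python
# (field name, width in bits); the order within the word is assumed.
FIELDS = (("AD", 4), ("SR1", 5), ("SR2", 5), ("PE", 4), ("MA", 8))
WORD_BITS = sum(width for _, width in FIELDS)


def pack_control_word(ad, sr1, sr2, pe, ma):
    """Pack the five field values into one integer control word."""
    word = 0
    for (name, width), value in zip(FIELDS, (ad, sr1, sr2, pe, ma)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_control_word(word):
    """Split a control word back into a dict of field values."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields


if __name__ == "__main__":
    # Example: A/D "count", SR1 shift right, SR2 data latch,
    # ALU no operation, memory no operation.
    word = pack_control_word(ad=9, sr1=0, sr2=2, pe=15, ma=128)
    print(f"{word:0{WORD_BITS}b}", unpack_control_word(word))
```

Because every unit is addressed by its own field, one micro-instruction can make one port drive the bus while others read from it in the same cycle.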
`
Table I. Performance of the processor array

Number of processors               128
Parallel shift register            two 128x8 bits
On chip memory                     128x128 bits
Maximum clock rate                 20 MHz
Addition (b-bit words)             3b+2 cycles
Subtraction                        3b+2 cycles
One-way multiplex                  3b+1 cycles
Two-way multiplex                  4b+1 cycles
Multiplication                     5b²+7b-1 cycles
  (serial/parallel multiplier)     4b+1
Move b bits k steps east/west      2b+k cycles
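
The cycle formulas in Table I translate directly into execution times at the 20 MHz maximum clock rate. The short sketch below evaluates them for a chosen word length; the dictionary keys and function names are hypothetical labels, and the serial/parallel multiplier entry is assumed to be in cycles like the other rows.

```python
CLOCK_HZ = 20e6  # maximum clock rate from Table I

# Cycle counts as functions of the word length b (formulas from Table I).
CYCLES = {
    "addition": lambda b: 3 * b + 2,
    "subtraction": lambda b: 3 * b + 2,
    "one_way_multiplex": lambda b: 3 * b + 1,
    "two_way_multiplex": lambda b: 4 * b + 1,
    "multiplication": lambda b: 5 * b * b + 7 * b - 1,
    "multiplication_serial_parallel": lambda b: 4 * b + 1,
}


def move_cycles(b, k):
    """Cycles to move b bits k steps east or west."""
    return 2 * b + k


def op_time_us(op, b, clock_hz=CLOCK_HZ):
    """Execution time of one operation in microseconds."""
    return CYCLES[op](b) / clock_hz * 1e6


if __name__ == "__main__":
    for op in CYCLES:
        print(f"{op:32s} {CYCLES[op](8):4d} cycles  {op_time_us(op, 8):7.2f} us")
```

For 8-bit words this gives 26 cycles for an addition and 375 cycles for a multiplication without the serial/parallel multiplier, with all 128 processors executing the operation in parallel.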
`
It is worth noting that the processing power of the
proposed bus-organized architecture is expandable. In the
horizontal direction, the number of columns can be increased,
provided that the yield remains acceptable as the chip grows
larger. In the vertical direction, we can alter the number of
bits in the memory and the shift registers, or we can include
more ALU functions. A serial-parallel multiplier for each PE
has recently been designed as a potential augmentation for
the PASIC.
`
`Image Sensor Array
`
Most commercially available solid-state sensors use CCD
technology. As an alternative, CMOS photo-diodes can be
used with a similar image quality. A single-chip smart sensor
requires a technology in which both sensor and processor can
be made. Therefore, CMOS technology has been chosen for
the implementation. An image sensor consists of a matrix of
photo diodes. The signal from each sensing element is
transferred via a column conductor to the A/D converter
array. The readout of the 2-D sensor array is done
line-by-line, controlled by a vertical address decoder. The
sensor element is shown in Fig. 5. Each photo diode is
guarded against blooming.
`
Fig. 5. An image sensing element (legible signal labels: Vdd;
address; pre-charge; analog bus)
`
The readout of the sensor is non-destructive. The sensor
array may be randomly addressed, or it may be scanned in an
ordered manner. The size of the photo-diodes is 48x48 µm²,
while the pitch size is 60x60 µm² and the fill factor 64% in
the first prototype.
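
The stated fill factor follows directly from the ratio of the photo-diode area to the pixel pitch area:

```latex
\text{fill factor}
  = \frac{48\,\mu\text{m} \times 48\,\mu\text{m}}{60\,\mu\text{m} \times 60\,\mu\text{m}}
  = \frac{2304}{3600}
  = 0.64
```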
`
VLSI Implementation
`
The design of the PASIC chip is divided into three stages.
In the first stage, the purpose is to design a set of prototype
chips to test the 2-D photo sensor array, the A/D converter
array and the processor array separately. Three chips
containing a 16x16 sensor array, 8 A/D converters and 32
processors respectively have been fabricated and tested. After
the successful test of the first prototypes, we are now in
the second stage. A full-scale processor array with 128
processor elements, with the reduced pitch size of 40 µm in
1.6 µm CMOS technology, has been designed and sent for
fabrication. To accommodate more sensors and processor
elements, for example 256x256 sensor elements and 256
processor elements in the same PASIC architecture, it is
necessary to reduce the pitch size of a processor element.
We are now designing another prototype chip to test the
interface between the sensor array and the A/D converters. In
the last stage, we expect to assemble these parts into a
complete PASIC. Fig. 6 shows the first prototype of the
processor array chip, including 32x8 shift registers, a 32x128
bit RAM and 32 bit-serial processors.

Fig. 6. Photograph of the processor array chip
`
`
`
`
The final chips will be designed in a 1.6 µm double-metal
CMOS technology. The total chip will be about 6x8 mm²,
where the 2-D sensor array occupies most of the area, 5.1x5.1
mm², the memory occupies 5.1x1.1 mm², the processor array
5.1x0.5 mm², the A/D converter array 5.1x0.25 mm² and the
two shift registers 5.1x1 mm². The chip is designed for a
20 MHz clock rate, which was verified by simulation of data
extracted from the layout.
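
A quick consistency check of the floor plan can be made from the block dimensions quoted above. The sketch assumes the blocks are stacked along the 8 mm side of the die, as the column-wise floor plan suggests, and reads the shift-register entry as 5.1x1 mm²; both are assumptions, and the variable names are hypothetical.

```python
# Block sizes in mm as (width, height); shift-register height of 1.0 mm is
# an assumed reading of the garbled figure in the source.
BLOCKS_MM = {
    "sensor array": (5.1, 5.1),
    "memory": (5.1, 1.1),
    "processor array": (5.1, 0.5),
    "A/D converter array": (5.1, 0.25),
    "shift registers": (5.1, 1.0),
}
CHIP_MM = (6.0, 8.0)  # approximate total chip size

stack_height = sum(height for _, height in BLOCKS_MM.values())
block_area = sum(width * height for width, height in BLOCKS_MM.values())
chip_area = CHIP_MM[0] * CHIP_MM[1]
sensor_area = BLOCKS_MM["sensor array"][0] * BLOCKS_MM["sensor array"][1]

if __name__ == "__main__":
    print(f"stacked height: {stack_height:.2f} mm of {CHIP_MM[1]:.0f} mm")
    print(f"blocks cover {block_area / chip_area:.0%} of the die")
    print(f"sensor array covers {sensor_area / chip_area:.0%} of the die")
```

Under these assumptions the stacked blocks come to just under 8 mm, which agrees with the quoted chip size, and the sensor array takes a little over half of the die.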
`
`Efforts have been made in layout to minimize the total
`area and maintain high speed. Critical parts of the chip, such
`as the decoders and their drivers, have been properly sized in
`transistor dimensions. For simple design and high speed, a
`true single phase clock [2] has been used in the entire chip.
`This technique has the advantage of simple clock distribution,
`small area for clock lines, reduced clock skew problems and
`high speed.
`
`Comparison with other designs
`
Integration of photo-sensors and processing elements
provides a mechanism to concurrently perform computations
previously intractable in real time. The first attempt at
integrating a visual sensor array and a processor array on a
single chip, LAPP, was made several years ago at Linköping
University [3]. The commercially available version of
LAPP today consists of 128 photo sensors, threshold units
for digitizing and a processor array [4]. The integrated
camera-and-processor eliminates the bottleneck of sequential
image read-out that characterizes conventional systems. This
fully parallel architecture is equipped with a dedicated
processing unit for each pixel. Images are binarized and
processed line-by-line.
`
Another approach to sensor, A/D converter and processor
integration uses 3-D technology [5]. This device consists of a
5-by-5 array of photo sensors, 5-by-5 2-bit CMOS A/D
converters, 40 ALUs and shift registers arranged in a 3-layer
structure. Signals from the photo diodes are transferred to the
2-bit A/D converters. The quantized digital data is then
transferred to the third layer, the ALUs. Finally, the processed
signals are read out with shift registers. The image sensing,
quantization and data processing of all pixels are performed
simultaneously, frame-by-frame. This architecture allows
extremely high-speed fundamental image processing.
However, the spatial resolution and signal amplitude
resolution are severely limited by the architecture of the A/D
converter and ALU. The size of a pixel is 1.05x1.05 mm².
`
`Some excellent work on photo-sensor arrays with analog
`processing inspired by biological retinas has been performed
`by Carver Mead and his co-workers [6]. However, these
`analog implementations are too specialized and do not have
`the flexibility to be tailored for different applications.
`
PASIC can be considered as a second generation of the
LAPP concept. The sensor array is extended to two
dimensions, the A/D converter is extended from 1 bit to 8 bits,
and the on-chip memory has been extended from 14 bits to
128 bits for each processor element, which is now a fully
general-purpose ALU. A bus-organized processor architecture
has been introduced for the processor array. "GAPP exercises"
[7] have provided the background for the PASIC algorithm and
software developments. Table II summarizes the comparison
of the three designs mentioned above.
`
Table II. Comparison of Integrated Smart Sensors

                  LAPP           3D-IC            PASIC
Technology        3 µm CMOS      3D VLSI          1.6 µm CMOS
Num. of sensors   128x1          5x5              128x128
Pixel size        50x50 µm²      1.05x1.05 mm²    40x40 µm²
Num. of A/D       128            5x5              128
A/D resolution    1 bit          2 bit            8 bit
Num. of PEs       128            40               128
ALU type          binary logic   2-bit ALU        bit-serial ALU
On-chip RAM       128x14         0                128x128
Chip size         5x10 mm²       8x8 mm²          6x8 mm²
`
`Conclusions
`
A smart sensor, PASIC, with a 128x128 photo-sensor
array, 128 A/D converters, two 128x8-bit parallel shift
registers, 128 bit-serial processors and a 128x128-bit RAM
on the same chip is under development. Image processing
within the sensor chip greatly reduces the I/O bandwidth,
complexity and system cost. Parts of the PASIC have been
implemented and tested. The single-chip system will provide
high frame rate digital images and perform some low-level
image processing and data reduction. It offers a compact and
economical system for robot vision compared to conventional
systems where the sensor, A/D converter, processor array and
memory are separate chips.
`
`Acknowledgements:
`
The authors wish to thank Dr. R. Forchheimer and A.
Odmark for valuable discussions. The work has been
sponsored by the Center for Industrial Information
Technology at Linköping University.
`
`References
`
[1] K. Chen, A. Astrom, T. Ahl and P. E. Danielsson:
"PASIC. A Smart sensor for computer vision",
Submitted to 10th International Conference on Pattern
Recognition.
[2] J. R. Yuan and C. Svensson: "High speed CMOS circuit
technique", IEEE J. Solid State Circuits, Vol. 24, pp.
62-70, 1989.
[3] R. Forchheimer and A. Odmark: "Single chip linear array
processor", in Applications of Digital Image Processing,
SPIE, Vol. 397, 1983.
[4] LAPP1100, Product Description, IVP Integrated Vision
Product AB, Technikringen 3, S58330 Linköping,
Sweden.
[5] T. Nishimura, Y. Inoue, K. Sugahara, S. Kusunoki, T.
Kumamoto: "Three Dimensional IC for high performance
image signal processor", Proceedings of International
Electron Devices Meeting, 1987, pp. 111-114.
[6] C. A. Mead: "Analog VLSI and Neural Network
Systems", Addison Wesley, 1989.
[7] P. E. Danielsson: "SIMD-array, a GAPP exercise",
Lecture notes for postgraduate course, Linköping
University, Sweden, 1986.
`