IPR2018-01606, No. 1018 Exhibit - Mertzios, et al, Fast implementation of 3 D filters via systolic array processors, Multidimensional Systems and Signal Processing, vol 89, 1997, 335 349 (P.T.A.B. Sep

Multidimensional Systems and Signal Processing, 8, 335–349 (1997)
© 1997 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
`
`Fast Implementation of 3-D Digital Filters Via
`Systolic Array Processors
`
`mertzios@demokritos.cc.duth.gr
`B. G. MERTZIOS
`Department of Electrical and Computer Engineering, Democritus University of Thrace, 67 100 Xanthi, Greece
`
`A. N. VENETSANOPOULOS
`Department of Electrical and Computer Engineering, University of Toronto, Toronto M5S 1A4, Canada
`
`Abstract. In this paper a fast implementation architecture of three-dimensional (3-D) FIR or IIR digital ﬁlters via
`systolic VLSI array processors is described. The modular structure presented is comprised of similar processing
`elements in a linear cascade conﬁguration with local interconnections. High speed throughput rates are attained
`due to high concurrency, which is achieved by exploiting both pipelining and parallelism. The considered 3-D FIR
`and IIR ﬁlters may be used for the processing of reconstructed 3-D images and in medical imaging applications.
`
Key Words: 3-D digital filters, systolic arrays, pipelining, parallelism
`
`I. Introduction
`
During the last decade we have witnessed a growing interest in the design and implementation
of three-dimensional (3-D) and M-D digital filters [1]–[10], which find numerous
applications in medical imaging [11]–[14], computer vision [15], 3-D TV video signals
[16], image restoration and enhancement for geophysical and seismic data [17], and processing
of time-varying images [18]. In particular, digital image processing is needed in medical
imaging to provide higher quality images for interpretation and diagnosis. In medical imaging
applications there is usually a clinical need to examine sections of the human body along
directions in which direct image acquisition cannot be attained [12]. This problem of
representing 3-D images is solved by combining a number of two-dimensional (2-D) sections
and then filtering the reconstructed 3-D images. Reconstruction and processing of 3-D images
are used in magnetic resonance imaging (MRI), in medical imaging generally, and in the
reconstruction of the carotid vessel from echographic sections [14].
The need for fast processing of huge amounts of data has led to the use of special purpose
hardware architectures, since computer-based digital signal processing systems, which
are designed with a general purpose structure and data processing philosophy, have certain
features that prevent high throughput. The most prominent of the special purpose distributed
processing structures are the Array Processors (APs). APs are ideal for the fast real-time
implementation of complex digital signal processing algorithms. The aim of AP design
is to choose the best pipelined and parallel processing techniques and device
technology, in order to achieve satisfactory performance at low cost. In particular, the
VLSI APs are special purpose, locally interconnected computing networks that maximize
concurrency by combining pipelining and parallelism. They are fully implemented by VLSI
`
`Petitioner Microsoft Corporation - Ex. 1018, p. 335
`chips and are characterized by massive concurrency and regular data ﬂow [19], [20]. The two
`most popular special purpose VLSI APs are the systolic and wavefront arrays, which exploit
`both pipelining and parallelism by using the concept of computational wavefront [20]–[22].
`A systolic array is a network of elementary Processing Elements (PEs) that rhythmi-
`cally compute and pass data through the system. A wavefront array is a systolic array
`plus data ﬂow computing. Both kinds of arrays are algorithm oriented and they feature
`the desired properties of regularity, modularity, local communication, data and computa-
`tional pipelining and highly synchronized multiprocessing. However, the systolic arrays are
`globally synchronized since the data movements are controlled by global timing-reference
`“beats”, while the wavefront arrays are locally synchronized since the data movements
are controlled by correct sequencing using handshaking. In addition to the classical
implementation criteria of low sensitivity with respect to finite word length effects and
absence of overflow oscillations and limit cycles, the figures of merit in the VLSI
implementation of digital signal processors are: (i) concurrency, which is achieved by
pipelining (data and computational pipelining), parallelism and multiprocessing; (ii) repetitive,
regular and modular structure; (iii) local communication, which is the only kind permitted;
(iv) local synchronization; and (v) workload and flow distribution.
`Since the communication in VLSI is very expensive and remains restrictive, only short
`local communication paths are used, while no shared buses are needed. VLSI APs have
`been used for the fast implementation of many matrix based algorithms (matrix decompo-
`sitions, triangularization, matrix-vector multiplications) and signal processing algorithms
`(convolution and FFT techniques, estimation) [19], [20]. Moreover, recently fast implemen-
`tation structures were presented for one-dimensional (1-D) [22]–[28] and 2-D digital ﬁlters
[29]–[31]. Recursive algorithms, or more generally realizations that need feedback,
are usually considered inappropriate for high speed implementation because of the recursive
bottleneck. The existence of feedback loops imposes a bound on the achievable
throughput rate and usually results in nonlocalized communications and nonlocalized timing.
Specifically, the maximum latency of the feedback loops determines the maximum
allowed throughput rate. Fortunately, the recursive bottleneck may be overcome by recasting
the algorithm using the principle of look-ahead computation, in order to increase the
number of delays in the feedback loops, and by retiming to effectively pipeline the computation
within the loops [25], [33], [34]. Other techniques for fast processing of recursive algorithms
are bit-level pipelining [35], [36] and block processing [23], [29], [34], [37]
(and the references therein). The use of internal local feedback loops, whenever this is
possible, also increases the throughput rate [23], [26], [33].
`The present paper refers to the fast implementation of 3-D FIR and IIR digital ﬁlters via
`VLSI array processors. A systolic-like architecture is presented, which is comprised of
`similar PEs in a linear cascade conﬁguration, with local communications. The resulting
`structures are modular, regular, with local interconnections. Concurrency is achieved by
exploiting both pipelining and parallelism. The proposed VLSI implementation of the 3-D
digital filters attains high throughput rates, which meet the requirement of high sampling
rates in real-time processing applications. The FIR and IIR 3-D digital filters considered
may be used for the processing of reconstructed 3-D images, such as medical or geophysical
images.
`
`II. Direct Form Realization of 3-D Digital Filters
`
`A. 3-D FIR Digital Filters
`
`The 3-D FIR linear digital ﬁlters may be described by the 3-D nonrecursive difference
`equation
`
$$
y(n_1, n_2, n_3) = \sum_{i_1=0}^{N_1} \sum_{i_2=0}^{N_2} \sum_{i_3=0}^{N_3} a_{i_1,i_2,i_3}\, u(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{1}
$$
`
where u(n1, n2, n3) and y(n1, n2, n3) represent the input and output 3-D sequences respectively.
The direct form realization of the 3-D FIR filter (1) may be seen as an extension of the
direct form realization of the 2-D FIR filter [38]; specifically, each coefficient of the
2-D filter is now replaced by a 1-D polynomial in the third variable. The three independent
variables n1, n2, n3 are associated with three distinct Unit Delay (UD) elements UD1, UD2,
UD3, corresponding to the variables z1, z2, z3, which appear in the 3-D filter's transfer
function
`
$$
H(z_1, z_2, z_3) = \sum_{i_1=0}^{N_1} \sum_{i_2=0}^{N_2} \sum_{i_3=0}^{N_3} a_{i_1,i_2,i_3}\, z_1^{-i_1} z_2^{-i_2} z_3^{-i_3} \tag{2}
$$
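As a software point of reference for the hardware structures that follow, the difference equation (1) can be evaluated directly. The following is a minimal sketch, assuming that samples outside the frame are zero (a boundary convention the text does not specify); the function name `fir3d` and the nested-list layout are illustrative choices, not part of the paper.

```python
def fir3d(a, u):
    """Evaluate Eq. (1): y = sum a[i1][i2][i3] * u(n1-i1, n2-i2, n3-i3).

    a: coefficients indexed a[i1][i2][i3], shape (N1+1, N2+1, N3+1)
    u: input frame indexed u[n1][n2][n3], shape (J1, J2, J3)
    Samples outside the frame are taken as zero (assumed convention).
    """
    J1, J2, J3 = len(u), len(u[0]), len(u[0][0])
    K1, K2, K3 = len(a), len(a[0]), len(a[0][0])
    y = [[[0.0] * J3 for _ in range(J2)] for _ in range(J1)]
    for n1 in range(J1):
        for n2 in range(J2):
            for n3 in range(J3):
                acc = 0.0
                # Restrict each index so that n - i stays inside the frame.
                for i1 in range(min(K1, n1 + 1)):
                    for i2 in range(min(K2, n2 + 1)):
                        for i3 in range(min(K3, n3 + 1)):
                            acc += a[i1][i2][i3] * u[n1 - i1][n2 - i2][n3 - i3]
                y[n1][n2][n3] = acc
    return y
```

Feeding a unit impulse at (0, 0, 0) returns the coefficient array itself, which is the impulse response implied by the transfer function (2).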
`
`B. 3-D IIR Digital Filters
`
`The 3-D IIR linear digital ﬁlters may be described by the 3-D recursive difference equation
`
$$
y(n_1, n_2, n_3) = \sum_{i_1=0}^{N_1} \sum_{i_2=0}^{N_2} \sum_{i_3=0}^{N_3} a_{i_1,i_2,i_3}\, u(n_1 - i_1, n_2 - i_2, n_3 - i_3)
- \underset{(i_1, i_2, i_3) \neq (0, 0, 0)}{\sum_{i_1=0}^{M_1} \sum_{i_2=0}^{M_2} \sum_{i_3=0}^{M_3}} b_{i_1,i_2,i_3}\, y(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{3}
$$
`
in analogy to the 2-D quarter-plane model [38]. Eq. (3) should be computable according to
the selected scanning. There is no restriction relating the triples of indices (M1, M2, M3)
and (N1, N2, N3), since causality in space is not a necessary condition for computability.
The 3-D transfer function associated with (3) is given by the real rational function
`
$$
H(z_1, z_2, z_3) = \frac{a(z_1, z_2, z_3)}{b(z_1, z_2, z_3)}
= \frac{\displaystyle \sum_{i_1=0}^{N_1} \sum_{i_2=0}^{N_2} \sum_{i_3=0}^{N_3} a_{i_1,i_2,i_3}\, z_1^{-i_1} z_2^{-i_2} z_3^{-i_3}}
{\displaystyle 1 + \underset{(i_1, i_2, i_3) \neq (0, 0, 0)}{\sum_{i_1=0}^{M_1} \sum_{i_2=0}^{M_2} \sum_{i_3=0}^{M_3}} b_{i_1,i_2,i_3}\, z_1^{-i_1} z_2^{-i_2} z_3^{-i_3}} \tag{4}
$$

The two known forms of direct realization exist also here, in analogy to the 1-D and 2-D
ones. The direct form I realization results from the cascade configuration of the nonrecursive
`
`Figure 1. Block diagram of the direct form I realization of a 3-D IIR digital ﬁlter.
`
`Figure 2. Block diagram of the direct form II realization of a 3-D IIR digital ﬁlter.
`
3-D FIR filter

$$
H_F(z_1, z_2, z_3) = a(z_1, z_2, z_3) \tag{5}
$$

with the recursive 3-D all-pole IIR filter (Fig. 1)

$$
H_I(z_1, z_2, z_3) = 1 / b(z_1, z_2, z_3) \tag{6}
$$
`
In contrast, the direct form II realization of the 3-D filter (3) results from the cascade
configuration of the subfilters HF (z1, z2, z3), HI (z1, z2, z3) in reverse order (Fig. 2). The
space-invariance property [39] of the filter considered ensures that the transfer function
is the same in both realizations.
The direct form II realization of a 3-D IIR filter is described by the equations:
`
$$
w(n_1, n_2, n_3) = u(n_1, n_2, n_3) - \underset{(i_1, i_2, i_3) \neq (0, 0, 0)}{\sum_{i_1=0}^{M_1} \sum_{i_2=0}^{M_2} \sum_{i_3=0}^{M_3}} b_{i_1,i_2,i_3}\, w(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{7a}
$$

$$
y(n_1, n_2, n_3) = \sum_{i_1=0}^{N_1} \sum_{i_2=0}^{N_2} \sum_{i_3=0}^{N_3} a_{i_1,i_2,i_3}\, w(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{7b}
$$
`
`where w(n1, n2, n3) is an intermediate variable. Here the delays UD1, UD2 in the directions
`of n1 and n2, corresponding to the variables z1 and z2 of the transfer function, may be shared
`by the subﬁlters HI (z1, z2, z3) and HF (z1, z2, z3).
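The two-step structure of the direct form II realization, first the all-pole recursion for w and then the nonrecursive sum for y, can be sketched in software as follows. This is an illustrative sketch only: it assumes zero samples outside the frame, visits the samples with n1 varying fastest, then n2, then n3, and the names `iir3d_direct_form_ii` and `weighted_sum` are hypothetical.

```python
def iir3d_direct_form_ii(a, b, u):
    """Direct form II recursion: w from (7a), then y from (7b).

    a, b: coefficient arrays indexed [i1][i2][i3]; b[0][0][0] is never used,
          because the term (i1, i2, i3) = (0, 0, 0) is excluded in (7a).
    u:    input frame indexed [n1][n2][n3].
    Samples outside the frame are taken as zero (assumed convention).
    """
    J1, J2, J3 = len(u), len(u[0]), len(u[0][0])
    w = [[[0.0] * J3 for _ in range(J2)] for _ in range(J1)]
    y = [[[0.0] * J3 for _ in range(J2)] for _ in range(J1)]

    def weighted_sum(c, n1, n2, n3, skip_origin):
        # Sum of c[i1][i2][i3] * w(n1-i1, n2-i2, n3-i3) over in-frame terms.
        total = 0.0
        for i1 in range(min(len(c), n1 + 1)):
            for i2 in range(min(len(c[0]), n2 + 1)):
                for i3 in range(min(len(c[0][0]), n3 + 1)):
                    if skip_origin and (i1, i2, i3) == (0, 0, 0):
                        continue
                    total += c[i1][i2][i3] * w[n1 - i1][n2 - i2][n3 - i3]
        return total

    for n3 in range(J3):          # plane by plane
        for n2 in range(J2):      # row by row
            for n1 in range(J1):  # sample by sample along a row
                # (7a): every w term needed here was computed earlier in the scan.
                w[n1][n2][n3] = u[n1][n2][n3] - weighted_sum(b, n1, n2, n3, True)
                # (7b): nonrecursive part, reusing the same stored w samples.
                y[n1][n2][n3] = weighted_sum(a, n1, n2, n3, False)
    return y, w
```

Both sums read the same stored w samples, which is the software counterpart of sharing delay elements between the two subfilters.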
Consider the Row by Row, Plane by Plane (RRPP) scanning, in a 3-D frame of size
J1 × J2 × J3, where the inputs are processed sequentially along the three rectangular axes;
then the mapping of the spatial triple (n1, n2, n3) to the lexicographic index is determined by
the index mapping

$$
I(n_1, n_2, n_3) = n_1 + J_1 n_2 + J_1 J_2 n_3 \tag{8}
$$
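The index mapping (8) and its inverse are short enough to write out. The sketch below is illustrative; the function names `rrpp_index` and `rrpp_triple` are assumptions, and the inverse is simply what follows from (8) for 0 ≤ n1 < J1 and 0 ≤ n2 < J2.

```python
def rrpp_index(n1, n2, n3, J1, J2):
    """Lexicographic index of the sample (n1, n2, n3) under RRPP scanning, Eq. (8)."""
    return n1 + J1 * n2 + J1 * J2 * n3


def rrpp_triple(I, J1, J2):
    """Inverse of rrpp_index: recover (n1, n2, n3) from the scan position I."""
    n3, rem = divmod(I, J1 * J2)
    n2, n1 = divmod(rem, J1)
    return n1, n2, n3


if __name__ == "__main__":
    J1, J2 = 4, 3
    # A unit step in n2 moves the scan position by J1, a unit step in n3 by J1*J2.
    print(rrpp_index(0, 1, 0, J1, J2) - rrpp_index(0, 0, 0, J1, J2))
    print(rrpp_index(0, 0, 1, J1, J2) - rrpp_index(0, 0, 0, J1, J2))
```

In the scanned stream a unit step along n1 is one sample, while unit steps along n2 and n3 span J1 and J1 J2 samples respectively.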
`
The adoption of the RRPP scanning implies that the Z-Transform (ZT) operator is associated
with the unit delays UD1, UD2, UD3 (denoted as simple delays by 1 in the corresponding
equations), according to the following relations:

$$
z_1^{-1}\, \mathrm{ZT}[x(n_1, n_2, n_3)] = \mathrm{ZT}[x(n_1 - 1, n_2, n_3)] \tag{9a}
$$

$$
z_2^{-1}\, \mathrm{ZT}[x(n_1, n_2, n_3)] = \mathrm{ZT}[x(n_1, n_2 - 1, n_3)] \tag{9b}
$$

$$
z_3^{-1}\, \mathrm{ZT}[x(n_1, n_2, n_3)] = \mathrm{ZT}[x(n_1, n_2, n_3 - 1)] \tag{9c}
$$

which show the correspondence of the variables z1, z2, z3 with the unit delays UD1, UD2,
UD3.
`
`III. Fast Implementation of 3-D Filters Via Systolic Arrays
`
`In this section we describe implementation structures of the 3-D FIR and IIR ﬁlters via
`VLSI array processors, which are based on the nonrecursive direct form realization and on
`the recursive direct form II realization respectively.
`
`A. 3-D FIR Digital Filters
`
The systolic array implementation of a 3-D FIR digital filter consists of a 2-D array of
Processing Units (PUs), which are locally interconnected (Fig. 3a). Each PU is formed as a
cascade configuration of N1 + 1 elementary PEs (Fig. 3b). There are in total (N2 + 1)(N3 + 1)
PUs, denoted PU(i, j), i = 0, 1, . . . , N2, j = 0, 1, . . . , N3. Moreover, each PU(i, j)
operates on a separate input, the sample u(n1, n2 − i, n3 − j). Thus the delays D2, D3,
which under RRPP scanning correspond to large delays (D2 = J1 D1,
D3 = J1 J2 D1), do not have to be implemented.
The structure of each PE is shown in Fig. 3c. The output of the PE is given by

$$
v_{i_2,i_3} = \sum_{i_1=0}^{N_1} a_{i_1,i_2,i_3}\, w(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{10}
$$
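The sum over i1 in (10) is what a cascade of N1 + 1 multiply-accumulate elements produces when each element holds one coefficient and one unit delay. The sketch below is a behavioural model only, under stated assumptions: the classes and names (`PE`, `run_pu`) are hypothetical, the partial sum ripples through the cascade within one beat rather than being registered, and the actual cell of Fig. 3c is not reproduced.

```python
class PE:
    """One elementary processing element: a coefficient and one UD1 register."""

    def __init__(self, coeff):
        self.coeff = coeff
        self.x_reg = 0.0  # sample held from the previous beat (zero initially)

    def step(self, x_in, partial_in):
        # Multiply-accumulate on the incoming sample.
        partial_out = partial_in + self.coeff * x_in
        # Pass on the sample received one beat earlier, then store the new one.
        x_out = self.x_reg
        self.x_reg = x_in
        return x_out, partial_out


def run_pu(coeffs, samples):
    """Stream samples through a linear cascade of PEs, one output per beat."""
    pes = [PE(c) for c in coeffs]
    outputs = []
    for x in samples:
        partial = 0.0
        for pe in pes:
            x, partial = pe.step(x, partial)
        outputs.append(partial)
    return outputs


if __name__ == "__main__":
    print(run_pu([1.0, 10.0, 100.0], [1.0, 2.0, 3.0]))  # [1.0, 12.0, 123.0]
```

The k-th element in the cascade sees the input delayed by k beats, so the value leaving the last element at beat n is the sum of coeffs[k] * x(n − k), matching the form of (10) along the n1 direction.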
`
Considering that the delay of local communication is negligible, the delay UD1 is considered
to be equal to the time T needed to execute the operations in a PE, i.e.

$$
\mathrm{UD}_1 = T = \mathrm{Mu} + \mathrm{Ad} \tag{11}
$$

where Mu and Ad denote the time needed to execute one multiplication and one addition
respectively.
The whole structure operates with column block pipelining [40]. The maximum allowed
throughput rate is therefore determined by

$$
R \le \frac{1}{T} \tag{12}
$$

Thus considering Mu = 115 ns and Ad = 19 ns for the 16 bit multipliers and adders [41],
`
Figure 3. The systolic array implementation of the direct form realization of a 3-D FIR digital filter: (a) the
layout diagram; (b) the PU(i, j), i = 0, 1, . . . , N2, j = 0, 1, . . . , N3; (c) the structure of the processing element
(PE).
`
we obtain T = 134 ns and

$$
R \le \frac{1}{T} = 7.46 \times 10^{6}\ \text{OPS/sec} = 7.46\ \text{MOPS/sec}
$$
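The arithmetic behind this figure can be checked in a few lines. The multiplication and addition times are the ones quoted in the text; the variable and function names are illustrative.

```python
MU_NS = 115  # time for one 16-bit multiplication, in ns (value quoted from [41])
AD_NS = 19   # time for one 16-bit addition, in ns (value quoted from [41])


def fir_throughput_ops(mu_ns, ad_ns):
    """Upper bound R = 1/T of Eq. (12), with T = Mu + Ad from Eq. (11)."""
    t_ns = mu_ns + ad_ns
    return 1e9 / t_ns  # operations per second


if __name__ == "__main__":
    print(MU_NS + AD_NS)                                      # 134
    print(f"{fir_throughput_ops(MU_NS, AD_NS) / 1e6:.2f}")    # 7.46
```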
`
`The latency or throughput delay is deﬁned as “the time separating the appearance of an
`input sample at the input port from the appearance of the corresponding output sample at
`the output port”. However, the latency (or throughput delay) is not of great importance in
`most applications. Only the latency of the feedback loops, whenever they exist, is critical
`since it imposes an upper bound in the achieved throughput rate [23], [25].
In order to achieve global synchronization and pipelining of the proposed systolic architecture,
with clock period T, the outputs of the summers in the right column of Fig. 3a are
delayed by the cutsets of the array that are orthogonal to the schedule vector s = [1 1]^T.
Each cutset introduces delays equal to the throughput delay (N1 + 1)T of the PUs. Then
the latency of the whole array implementation is readily found to be

$$
L = [(N_1 + 1)(N_2 + N_3 + 2)]\, T \tag{13}
$$

since there are (N2 + N3 + 2) PUs along the schedule vector s.
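To give a feel for the size of the latency (13), the sketch below evaluates it for a small filter. The filter orders in the example are arbitrary illustrative values, not taken from the paper; T = 134 ns is the clock period obtained above.

```python
def fir_latency_ns(N1, N2, N3, T_ns):
    """Latency of the FIR array, Eq. (13): (N1 + 1)(N2 + N3 + 2) clock periods."""
    return (N1 + 1) * (N2 + N3 + 2) * T_ns


if __name__ == "__main__":
    # Hypothetical second-order filter in each direction, T = 134 ns.
    print(fir_latency_ns(2, 2, 2, 134))  # 2412 (ns)
```

The latency grows with the filter orders, whereas the throughput bound (12) depends only on T.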
`
`B. 3-D IIR Digital Filters
`
The systolic array implementation of a 3-D IIR digital filter, unlike the FIR one, involves
at least one feedback loop, due to its recursive nature. Since there are three independent
variables, represented by the three delays D1, D2, and D3, and only one type of
delay is involved in the PUs, a 2-D array of PUs is needed for the implementation of a
3-D IIR digital filter (Fig. 4a). Each PU is a bi-directional array of elementary PEs and
uses only the delays D1. The PU(0, 0) involves the feedback loop and is shown in Fig. 4b.
The PU(0, j), j = 1, 2, . . . , M3 and the PU(i, j), i = 1, 2, . . . , M2, j = 0, 1, . . . , M3 are
shown in Fig. 4c and Fig. 4d respectively. The structure of a typical PE is shown in Fig. 4e.
Some remarks on the proposed implementation are in order:
`
1. Only the upper left PU has a feedback loop in its first PE. However, note that the sum

$$
v(n_1, n_2, n_3) = \underset{(i_1, i_2, i_3) \neq (0, 0, 0)}{\sum_{i_1=0}^{M_1} \sum_{i_2=0}^{M_2} \sum_{i_3=0}^{M_3}} b_{i_1,i_2,i_3}\, w(n_1 - i_1, n_2 - i_2, n_3 - i_3) \tag{14}
$$

is fed back to the summer of the feedback loop.
`
`2. The delays UD1 in the PEs are used for both the forward and the feedback paths, as
`occurs in the 1-D and 2-D direct form II realization structures.
`
3. After rescaling of the time units [19], [25], the feedback loop involves two delays UD1,
i.e. the pipelining period (time scaling factor) is a = 2 and the input data have to be
interleaved with blank data. It follows that the latency of the feedback loop, which
`
Figure 4. The systolic array implementation of the direct form II realization of a 3-D IIR digital filter: (a) the
layout diagram; (b) the processing unit PU(0, 0) with the feedback loop; (c) the PU(0, j), j = 1, 2, . . . , M3.
`
Figure 4. Continued. The systolic array implementation of the direct form II realization of a 3-D IIR digital
filter: (d) the PU(i, j), i = 1, 2, . . . , M2, j = 0, 1, . . . , M3; (e) the structure of a processing element.
`
determines the iteration bound in the recursive structures, equals two rescaled delays,
i.e.

$$
L_f = 2\, \mathrm{UD}_1 \tag{15}
$$
`
4. The minimum possible rescaled delay UD1 equals the time needed to execute the
computations in the critical path of a PE, considering that the local communication
time is negligible. Due to the system's regularity and modularity, all the paths include
one multiplication and one addition, i.e.

$$
\mathrm{UD}_1 = T = \mathrm{Mu} + \mathrm{Ad} \tag{16}
$$
`
5. The auxiliary variable w(n1, n2, n3) is produced at the first PE of the PU(0, 0). This
variable needs to be delayed w.r.t. the three indices n1, n2, n3. The delay w.r.t. n1 is
implemented in the PEs of each PU. The delays w.r.t. n2 and n3 are implemented by using
two types of registers, R2i, i = 1, 2, . . . , N2 and R3j, j = 1, 2, . . . , N3 respectively.

6. The PU(i, j), i = 1, 2, . . . , N2, j = 0, 1, . . . , N3 are equipped with the registers R2i,
i = 1, 2, . . . , N2. Moreover, the PU(0, j), j = 1, 2, . . . , N3 are equipped with the
registers R3j, j = 1, 2, . . . , N3. All the registers are of first-in, first-out (FIFO)
type. Specifically, the register R2i stores the J1 samples (Fig. 5)

w(n1, n2 − i − 1, n3 − j), w(n1 + 1, n2 − i − 1, n3 − j), . . . ,
w(J1 − 1, n2 − i − 1, n3 − j), w(0, n2 − i, n3 − j), . . . , w(n1 − 2, n2 − i, n3 − j),
w(n1 − 1, n2 − i, n3 − j)

at the time instant before the introduction of w(n1, n2 − i, n3 − j); at the clock cycle
where w(n1, n2 − i, n3 − j) is stored, the oldest sample w(n1, n2 − i − 1, n3 − j) is
released and erased. Moreover, the register R3j stores the J1 J2 samples (Fig. 6)

w(n1, n2, n3 − j − 1), w(n1 + 1, n2, n3 − j − 1), . . . , w(J1 − 1, n2, n3 − j − 1),
w(0, n2 + 1, n3 − j − 1), . . . , w(J1 − 1, J2 − 1, n3 − j − 1),
w(0, 0, n3 − j), . . . , w(n1 − 1, n2, n3 − j)

at the time instant before the introduction of w(n1, n2, n3 − j); at the clock cycle where
w(n1, n2, n3 − j) is stored, the oldest sample w(n1, n2, n3 − j − 1) is released and
erased.
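The behaviour of the registers described in remark 6 can be mimicked with a fixed-length FIFO: storing a new sample releases the one stored a full row (J1 samples) or a full plane (J1 J2 samples) earlier. The sketch below is a software analogy under stated assumptions, not the circuit of Figs. 5 and 6: the class name `DelayFIFO` is hypothetical and the registers are assumed to start filled with zeros.

```python
from collections import deque


class DelayFIFO:
    """Fixed-length first-in, first-out register acting as a bulk delay."""

    def __init__(self, length):
        # Assumed initial state: all zeros.
        self.buf = deque([0.0] * length, maxlen=length)

    def push(self, sample):
        """Store a new sample and return the oldest one, which is then dropped."""
        oldest = self.buf[0]
        self.buf.append(sample)  # a full deque with maxlen discards its left end
        return oldest


if __name__ == "__main__":
    J1, J2 = 3, 2
    row_delay = DelayFIFO(J1)         # plays the role of a register R2i
    plane_delay = DelayFIFO(J1 * J2)  # plays the role of a register R3j
    for k in range(8):
        # k stands in for the sample at scan position k in RRPP order.
        print(k, row_delay.push(float(k)), plane_delay.push(float(k)))
```

With the stream fed in scan order, the row register returns the sample from position k − J1 (same n1, previous row) and the plane register the sample from position k − J1 J2 (same n1 and n2, previous plane).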
`
The maximum allowed throughput rate of the whole systolic implementation structure is
determined as a function of the latency of the feedback loop and is given by [23], [26]

$$
R \le \frac{1}{L_f} = \frac{1}{2T} = \frac{1}{2(\mathrm{Mu} + \mathrm{Ad})} \tag{17}
$$
`
The maximum sampling rate at the input port is limited by the throughput rate of the
implementation, i.e.

$$
F_S \le R \tag{18}
$$

The latency of the whole systolic implementation of the 3-D IIR digital filter is found to
be

$$
L = 2(M_1 + 1)(M_2 + M_3 + 1)\, T \tag{19}
$$

since there are (M2 + M3 + 1) PUs along the schedule vector s = [1 1]^T. The latency of
each PU equals (M1 + 1)T and the PEs are pipelined in both directions.
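The bound (17) and the latency (19) can be evaluated together to see how the feedback loop affects each. The sketch below is illustrative: the function names are assumptions, and the filter orders in the example are arbitrary values chosen only to exercise the formula.

```python
def iir_throughput_ops(mu_ns, ad_ns):
    """Upper bound of Eq. (17): R = 1 / (2 (Mu + Ad)), in operations per second."""
    return 1e9 / (2 * (mu_ns + ad_ns))


def iir_latency_ns(M1, M2, M3, T_ns):
    """Latency of the IIR array, Eq. (19): 2 (M1 + 1)(M2 + M3 + 1) T."""
    return 2 * (M1 + 1) * (M2 + M3 + 1) * T_ns


if __name__ == "__main__":
    print(f"{iir_throughput_ops(115, 19) / 1e6:.2f}")  # 3.73
    # Hypothetical second-order denominator in each direction, T = 134 ns.
    print(iir_latency_ns(2, 2, 2, 134))                # 4020 (ns)
```

The factor of two in both expressions comes from the two rescaled delays in the feedback loop, Eq. (15).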
`Considering again Mu = 115 ns and Ad = 19 ns for the 16 bit multipliers and adders
`
`Figure 5. The structure of the register R2i .
`
[41] we find that the throughput rate of the recursive 3-D IIR digital filter is

$$
R = \frac{1}{2T} = 3.73 \times 10^{6}\ \text{OPS/sec} = 3.73\ \text{MOPS/sec}
$$
`
For comparison of the speed requirements, we note that the real-time processing of a 2-D
256 × 256 image with a TV scan rate of 30 images per second and one operation per pixel
requires a sampling rate FS = 1.97 MOPS/sec.
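The comparison can be reproduced numerically. The image size, frame rate and operation times are those quoted in the text; the variable names are illustrative.

```python
ROWS, COLS = 256, 256
FRAMES_PER_SEC = 30
OPS_PER_PIXEL = 1

# Required sampling rate for real-time processing, in operations per second.
required_rate = ROWS * COLS * FRAMES_PER_SEC * OPS_PER_PIXEL

# Throughput bound of the recursive structure with Mu = 115 ns and Ad = 19 ns.
iir_rate = 1e9 / (2 * (115 + 19))

if __name__ == "__main__":
    print(f"{required_rate / 1e6:.2f}")  # 1.97
    print(required_rate <= iir_rate)     # True
```

Under these figures the required rate is below the 3.73 MOPS/sec bound of the recursive structure, so the condition FS ≤ R is met for this 2-D example.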
`
IV. Conclusions
`
`The systolic VLSI array processors’ implementation of the direct form realization of 3-D IIR
`digital ﬁlters is presented. Row by row, plane by plane scanning of the processed images
`is considered. Registers operating at the sample rate are used for fast implementation
`
`Figure 6. The structure of the register R3 j .
`
of the “slow” delay operator. The proposed structures are modular, regular, use only
local communications, and achieve high throughput rates using pipelining. The obtained
implementations of the 3-D digital filters are useful for the fast processing of medical
images, 3-D video signals, 3-D computer vision images and time-varying images.
The concurrency, and therefore the throughput rate of the proposed structures, may be
increased by exploiting parallelism in addition to pipelining. Parallel computation can be
achieved by adopting the idea of diagonal processing [43], [44], [6], which is based
on a modified sampling of the original 3-D signal, such that the sampling raster is distinct
in each one of any 3 consecutive hyperplanes. Then all the pixels belonging to the same
hyperplane are processed simultaneously and in parallel. In the 3-D case, the hyperplanes are
parallel to the diagonal planes at 45°. Then all the 3-D computations are organized as
1-D ones (corresponding to a time axis), at the expense of additional hardware. The set of
the pixels belonging to a diagonal hyperplane that can be computed simultaneously forms
a grid pattern, which is called a diagonal hyperstructure. The diagonal 3-D processing is
ideal for the processing of time dependent images. The systolic implementation of 3-D IIR
digital filters using the diagonal 3-D processing will greatly increase the attained throughput
rate, since it will produce all the pixels belonging to a plane in a time period equal to
the latency 2T of the feedback loop. Thus, real time processing of 3-D images will be
possible. This improvement in 3-D systolic implementation is currently
under consideration.
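The grouping of pixels into diagonal hyperplanes can be illustrated with a simplified model. The sketch below assumes the hyperplanes are the sets n1 + n2 + n3 = k on the ordinary rectangular raster; the modified sampling raster of [43], [44], [6] is not modelled, and the function name `diagonal_hyperplanes` is hypothetical.

```python
def diagonal_hyperplanes(J1, J2, J3):
    """Group the samples of a J1 x J2 x J3 frame by k = n1 + n2 + n3.

    Under a recursion with first-octant support such as (7a), a sample on
    plane k only needs samples from planes with smaller k, so the samples
    within one plane do not depend on each other.
    """
    planes = [[] for _ in range(J1 + J2 + J3 - 2)]
    for n1 in range(J1):
        for n2 in range(J2):
            for n3 in range(J3):
                planes[n1 + n2 + n3].append((n1, n2, n3))
    return planes


if __name__ == "__main__":
    for k, plane in enumerate(diagonal_hyperplanes(2, 2, 2)):
        print(k, plane)
```

In this simplified model a frame of J1 × J2 × J3 samples is covered in J1 + J2 + J3 − 2 parallel steps instead of J1 J2 J3 sequential ones, which indicates where the gain in throughput comes from.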
`
`References
`
`1. B. G. Mertzios and A. N. Venetsanopoulos, “Modular Realization of M-dimensional Filters,” Signal Pro-
`cessing, vol. 7, 1984, pp. 351–369.
`
`2. A. N. Venetsanopoulos and B. G. Mertzios, “Real-Time Image Processing with Decomposition Structures,”
`Time-Varying Image Processing and Moving Object Recognition (V. Cappellini, ed.), Amsterdam: Elsevier
`Science Publishers B.V. (North-Holland), 1987, pp. 3–18.
`
`3. R. L. Webber and R. N. Nagel, “Three Dimensional Enhancement of Two-Dimensional Images,” Journal
`of Clinical Engineering, 1980, pp. 41–50.
`
`4. A. Fettweis, “Multidimensional Circuits and Systems Theory,” Tutorial Lecture, Proc. IEEE Int. Symposium
`on Circuits Syst., Montreal, Canada, May 1984, pp. 951–957.
`
`5. A. Fettweis, “Multidimensional Digital Filters with Closed Loop Behavior Designed by Complex Network
`Theory Approach,” IEEE Trans. on Circuits Syst., vol. CAS-34, no. 4, 1987, pp. 338–344.
`
`6. X. Liu and A. Fettweis, “Multidimensional Digital Filtering by Using Parallel Algorithms Based on Diagonal
`Processing,” Multidimensional Systems and Signal Processing, vol. 1, 1990, pp. 51–66.
`
`7. A. Fettweis, “Discrete Modelling of Lossless Fluid Dynamic Systems,” Archiv fur Elektronik und Ubertra-
`gungstechnik (AEU), vol. 46, no. 4, 1992, pp. 209–218.
`
`8. L. B. Bruton and N. R. Bartley, “The Design of Highly Selective Adaptive Three Dimensional Recursive
`Cone Filters,” IEEE Trans. on Circuits Syst., vol. CAS-34, no. 7, 1987, pp. 775–781.
`
9. M. Zervakis and A. N. Venetsanopoulos, “Three-Dimensional Rotated Digital Filters: Design, Stability and
Applications,” Circuits, Systems and Signal Processing, vol. 9, no. 4, 1990, pp. 383–408.
`
`10. V. Cappellini, L. Alparone, G. Galli, P. Lange, A. Mecocci and L. Menichetti, “Digital Processing of Stereo
`Images and 3-D Reconstruction Techniques,” Int. J. Remote Sensing, vol. 12, no. 3, 1991, pp. 477–490.
`
`11. C. C. Jaffe, “Medical Imaging,” American Scientist, vol. 70, 1982, pp. 576–585.
`
`12. V. Cappellini, R. Carla and M. Melani, “3-D Digital Filtering of Biomedical Images,” Proc. 1986 European
`Signal Processing Conference, The Hague, The Netherlands, Sept. 1986, pp. 1383–1386.
`
13. G. Garibotto, S. Garozzo, M. Micca, G. Piretta and C. Giorgi, “Three Dimensional Digital Signal Processing
in Neurosurgical Applications,” Proc. of the Int. Conf. on Digital Signal Processing, Firenze, Italy, 1981,
pp. 434–444.

14. R. Carla et al., “3-D Reconstruction of Carotid Vessel by Echographic Sections,” Time-Varying Image
Processing and Moving Object Recognition (V. Cappellini, ed.), Amsterdam: Elsevier Science Publishers
B.V. (North-Holland), 1987, pp. 153–157.
`
`15. D. H. Ballard and C. M. Brown, Computer Vision, Englewood Cliffs, NJ: Prentice Hall, 1982.
`
`16. J-Y. Quellet and E. Dubois, “Sampling and Reconstruction of NTSC Video Signal at Twice the Color
`Subcarrier Frequency,” IEEE Trans. on Commun., vol. COM-29, 1981, pp. 1823–1832.
`
`17. N. Keskes, A. Boulanovar and O. Faugeras, “Application of Image Analysis Techniques to Seismic Data,”
`Proc. IEEE Int. Conf. on Acoust. Speech, Signal Processing, ICASSP-82, Paris, May 1982.
`
`18. G. Tascini, “Intrinsic Three-Dimensional Representation of Digital Images,” Proc. of Mediterranean Elec-
`trotechnical Conf., MELECON 83, Paper A8.07, Athens, Greece, May 1983.
`
`19. S. Y. Kung, H. J. Whitehouse and T. Kailath, Eds., VLSI and Modern Signal Processing, Englewood Cliffs,
`NJ: Prentice-Hall, 1985.
`
`20. S. Y. Kung, VLSI Array Processors, Englewood Cliffs, NJ: Prentice-Hall, 1987.
`
`21. H. T. Kung, “Why Systolic Architectures,” Computer, vol. C-15, 1982, pp. 37–46.
`
`22. S. Y. Kung, “On Supercomputing with Systolic/Wavefront Array Processors,” Proc. IEEE, vol. 72, 1984,
`pp. 867–884.
`
`23. H. H. Lu, E. A. Lee, and D. G. Messerschmitt, “Fast Recursive Filtering with Multiple Slow Processing
`Elements,” IEEE Trans. on Circuits Syst., vol. CAS-32, no. 11, 1985, pp. 1119–1129.
`
`24. S.K. Rao and Th. Kailath, “VLSI arrays for digital signal processing: Part I - A model identiﬁcation approach
`to digital ﬁlter realizations,” IEEE Trans. on Circuits Syst., vol. CAS-32, no. 11, 1985, pp. 1105–1118.
`
`25. K. K. Parhi and D. G. Messerschmitt, “Concurrent Cellular VLSI Adaptive Filter Architectures,” IEEE
`Trans. on Circuits Syst., vol. CAS-34, no. 10, 1987, pp. 1141–1151.
`
`26. B. G. Mertzios, “Fast Implementation of Multivariable Linear Systems Via VLSI Array Processors,”
`COMPEL—The International Journal for Computation and Mathematics in Electrical and Electronic En-
`gineering, vol. 10, no. 1, 1991, pp. 1–10.
`
27. B. G. Mertzios and A. N. Venetsanopoulos, “Implementation of Quadratic Digital Filters Via VLSI Array
Processors,” Archiv fur Elektronik und Ubertragungstechnik (AEU), vol. 43, no. 3, 1989, pp. 153–157.
`
`28. B. G. Mertzios and St. Scarlatos, “On the Systolic Implementation of Wave Digital Filters,” Archiv fur
`Elektronik und Ubertragungstechnik (AEU), vol. 45, no. 6, 1991, pp. 335–343.
`
`29. B. G. Mertzios, “Fast Block Implementation of Two-Dimensional Recursive Digital Filters via VLSI Array
`Processors,” Archiv fur Elektronik und Ubertragungstechnik (AEU), vol. 44, no. 1, 1990, pp. 55–58.
`
`30. T. Aboulnasr and W. Steenart, “Real-Time Systolic Array Processor for 2-D Spatial Filtering,” Proc. Third
`European Signal Processing Conf., EUSIPCO-86, The Hague, The Netherlands, Sept. 1986, pp. 687–690.
`
31. N. R. Shanbhag, “An Improved Systolic Architecture for 2-D Digital Filters,” IEEE Trans. on Circuits Syst.,
vol. CAS-39, no. 5, 1991, pp. 1195–1202.
`
`32. B. G. Mertzios and A. N. Venetsanopoulos, “Fast Direct Implementations of Two-Dimensional IIR Digital
`Filters via Systolic and Wavefront Arrays,” Int. J. of Circuit Theory and Applications, vol. 21, 1993, pp. 275–
`285.
`
`33. K. K. Parhi and D. G. Messerschmitt, “Pipeline Interleaving and Parallelism in Recursive Digital Filters,
`Part I: Pipelining Using Scattered Look-Ahead Decomposition,” IEEE Trans. on Acoust., Speech, Signal
`Processing, vol. ASSP-37, 1989, pp. 1099–1117.
`
`34. K. K. Parhi and D. G. Messerschmitt, “Pipeline Interleaving and Parallelism in Recursive Digital Filters, Part
`II: Pipelined Incremental Block Filtering,” IEEE Trans. on Acoust., Speech, Signal Processing, vol. ASSP-
`37, 1989, pp. 1118–1134.
`
`35. K. K. Parhi and D. G. Messerschmitt, “A Bit-Parallel Bit Level Recursive Filter Architecture,” Proc. IEEE
`Int. Conf. Comput. Design, New York, 1986.
`
36. B. G. Mertzios, “Pipelining the Three-Port Wave Filter Adaptor at the Bit Level,” Circuits, Systems and Signal
Processing, vol. 14, no. 3, 1995, pp. 285–298.
`
`37. B. G. Mertzios, “Block Realization of 2-D IIR Digital Filters,” Signal Processing, vol. 7, no. 2, 1984,
`pp. 135–149.
`
`38. D. E. Dudgeon and R. M. Mersereau, Two-Dimensional Digital Signal Processing, Englewood Cliffs, NJ:
`Prentice-Hall Inc., 1985.
`
`39. B. G. Mertzios, “Block-Space Invariance of Two-Dimensional Digital Signals,” Signal Processing, vol. 13,
`1987, pp. 141–153.
`
`40. J. R. Jump and S. R. Ahuja, “Effective Pipelining of Digital Systems,” IEEE Trans. on Computers, vol. C-27,
`1978, pp. 855–865.
`
`41. A. N. Venetsanopoulos and V. Cappellini, “Real-Time Image Processing,” in Multidimensional Systems:
`Techniques and Applications, Marcel Dekker Inc., ch. 8, pp. 345–399, 1986.
`
42. L. B. Bruton and N. R. Bartley, “A General Purpose Computer Program for the Design of Two-Dimensional
Recursive Filters,” Circuits, Systems and Signal Processing, vol. 3, no. 2, 1984, pp. 243–264.
`
`43. A. Fettweis, “Principles of Multidimensional Wave Digital Filtering,” Digital Signal Processing (J. K. Ag-
`garwal, ed.), Point Lobas Press, 1979.
`
`44. X. Liu and A. Fettweis, “Multidimensional Digital Filtering by Using Parallel Algorithms Based on Diagonal
`Processing,” Multidimensional Systems and Signal Processing, vol. 1, 1990, pp. 51–66.
`