`IPR2018-01476
`Apple Inc. EX1013 Page 1
`
`
`
`IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 1, NO. 8, AUGUST 2002
`
`
`Diversity and Multiplexing:
`A Fundamental Tradeoff in Multiple Antenna
`Channels
`
Lizhong Zheng, Member, IEEE, and David N.C. Tse, Member, IEEE
`
`Abstract—Multiple antennas can be used for increasing the
`amount of diversity or the number of degrees of freedom in
`wireless communication systems. In this paper, we propose the
`point of view that both types of gains can be simultaneously
`obtained for a given multiple antenna channel, but there is a
`fundamental tradeoff between how much of each any coding
`scheme can get. For the richly scattered Rayleigh fading channel,
`we give a simple characterization of the optimal tradeoff curve
`and use it to evaluate the performance of existing multiple
`antenna schemes.
`Index Terms—diversity, multiple antennas, MIMO, spatial
multiplexing, space-time codes.
`
`I. INTRODUCTION
`Multiple antennas are an important means to improve the
performance of wireless systems. It is widely understood
`that in a system with multiple transmit and receive antennas
`(MIMO channel), the spectral efficiency is much higher than
`that of the conventional single antenna channels. Recent re-
`search on multiple antenna channels, including the study of
`channel capacity [1], [2] and the design of communication
`schemes [3], [4], [5], demonstrates a great improvement of
`performance.
`Traditionally, multiple antennas have been used to increase
`diversity to combat channel fading. Each pair of transmit and
`receive antennas provides a signal path from the transmitter
`to the receiver. By sending signals that carry the same infor-
`mation through different paths, multiple independently faded
`replicas of the data symbol can be obtained at the receiver
`end; hence more reliable reception is achieved. For example,
in a slow Rayleigh fading environment with 1 transmit and
n receive antennas, the transmitted signal is passed through
n different paths. It is well known that if the fading is
independent across antenna pairs, a maximal diversity gain
(advantage) of n can be achieved: the average error probability
can be made to decay like $\mathrm{SNR}^{-n}$ at high SNR, in contrast to
the $\mathrm{SNR}^{-1}$ for the single antenna fading channel. More recent
work has concentrated on using multiple transmit antennas
to get diversity (some examples are trellis-based space-time
codes [6], [7] and orthogonal designs [8], [3]). However, the
underlying idea is still averaging over multiple path gains
(fading coefficients) to increase the reliability. In a system with
m transmit and n receive antennas, assuming the path gains
between individual antenna pairs are i.i.d. Rayleigh faded, the
maximal diversity gain is mn, which is the total number of
fading gains that one can average over.

Both authors are with the Department of Electrical Engineering and Computer
Sciences, University of California, Berkeley, CA 94720.
This research is supported by a National Science Foundation Early Faculty
CAREER Award, with matching grants from A.T.&T., Lucent Technologies
and Qualcomm Inc., and by the National Science Foundation under grant
CCR-01-18784.

`Transmit or receive diversity is a means to combat fad-
`ing. A different line of thought suggests that in a MIMO
`channel, fading can in fact be beneficial through increasing
`the degrees of freedom available for communication [2], [1].
Essentially, if the path gains between individual transmit-receive
antenna pairs fade independently, the channel matrix is
`well-conditioned with high probability, in which case multiple
`parallel spatial channels are created. By transmitting inde-
`pendent information streams in parallel through the spatial
`channels, the data rate can be increased. This effect is also
`called spatial multiplexing [5], and is particularly important in
`the high signal-to-noise ratio (SNR) regime where the system
`is degree-of-freedom-limited (as opposed to power-limited).
`Foschini [2] has shown that in the high SNR regime, the
`capacity of a channel with m transmit, n receive antennas
`and i.i.d. Rayleigh faded gains between each antenna pair is
`given by:
`
`C(SNR) = min{m, n} log SNR + O(1).
`The number of degrees of freedom is thus the minimum of m
`and n. In recent years, several schemes have been proposed
to exploit the spatial multiplexing phenomenon (for example,
BLAST [2]).
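The high SNR scaling above can be checked numerically. The following Python sketch is an illustration only and is not part of the original analysis. It assumes the standard ergodic capacity expression $E[\log_2 \det(I + (\mathrm{SNR}/m) HH^\dagger)]$ for an i.i.d. Rayleigh channel known only at the receiver, estimates it by Monte Carlo, and measures its slope against $\log_2 \mathrm{SNR}$ between two SNR values. All function names, the trial count, the seed and the two SNR points are hypothetical choices made for the illustration, and base-2 logarithms are assumed.

```python
import math
import random


def complex_gaussian(rng):
    """One CN(0, 1) sample: real and imaginary parts are each N(0, 1/2)."""
    sigma = math.sqrt(0.5)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def log2_det_capacity(h, snr):
    """Return log2 det(I + (snr/m) H H^dagger) for an n x m matrix h (list of rows)."""
    n, m = len(h), len(h[0])
    a = [[(snr / m) * sum(h[i][k] * h[j][k].conjugate() for k in range(m))
          + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    log_det = 0.0
    # Gaussian elimination without pivoting: the matrix is Hermitian positive
    # definite, so every pivot is real and positive (up to rounding).
    for c in range(n):
        pivot = a[c][c]
        log_det += math.log2(pivot.real)
        for r in range(c + 1, n):
            factor = a[r][c] / pivot
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return log_det


def ergodic_capacity(m, n, snr, trials=1000, seed=0):
    """Monte Carlo estimate of the ergodic capacity in bits per channel use."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h = [[complex_gaussian(rng) for _ in range(m)] for _ in range(n)]
        total += log2_det_capacity(h, snr)
    return total / trials


def capacity_slope(m, n, snr_lo=1e3, snr_hi=1e4):
    """Slope of capacity versus log2(SNR); the same seed reuses the same channels."""
    c_lo = ergodic_capacity(m, n, snr_lo)
    c_hi = ergodic_capacity(m, n, snr_hi)
    return (c_hi - c_lo) / (math.log2(snr_hi) - math.log2(snr_lo))
```

Under these assumptions, `capacity_slope(2, 2)` should come out close to min{m, n} = 2, and `capacity_slope(3, 1)` close to 1; the estimate stays slightly below the limit because the O(1) term has not fully settled at finite SNR.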
`In summary, a MIMO system can provide two types of
`gains: diversity gain and spatial multiplexing gain. Most of
`current research focuses on designing schemes to extract
`either maximal diversity gain or maximal spatial multiplexing
`gain. (There are also schemes which switch between the two
`modes, depending on the instantaneous channel condition [5].)
`However, maximizing one type of gain may not necessarily
maximize the other. For example, it was observed in [9]
`that the coding structure from the orthogonal designs [3],
`while achieving the full diversity gain, reduces the achievable
`spatial multiplexing gain. In fact, each of the two design
`goals addresses only one aspect of the problem. This makes it
`difficult to compare the performance between diversity-based
and multiplexing-based schemes.
`In this paper, we put forth a different viewpoint: given a
`MIMO channel, both gains can in fact be simultaneously ob-
`tained, but there is a fundamental tradeoff between how much
`of each type of gain any coding scheme can extract: higher
`spatial multiplexing gain comes at the price of sacrificing
`diversity. Our main result is a simple characterization of the
`optimal tradeoff curve achievable by any scheme. To be more
`specific, we focus on the high SNR regime, and think of a
`scheme as a family of codes, one for each SNR level. A scheme
is said to have a spatial multiplexing gain r and a diversity
advantage d if the rate of the scheme scales like r log SNR and
the average error probability decays like $\mathrm{SNR}^{-d}$. The optimal
tradeoff curve yields for each multiplexing gain r the optimal
diversity advantage $d^*(r)$ achievable by any scheme. Clearly,
r cannot exceed the total number of degrees of freedom
min{m, n} provided by the channel; and $d^*(r)$ cannot exceed
the maximal diversity gain mn of the channel. The tradeoff
curve bridges between these two extremes. By studying the
optimal tradeoff, we reveal the relation between the two
types of gains, and obtain insights to understand the overall
resources provided by multiple antenna channels.
For the i.i.d. Rayleigh flat fading channel, the optimal
tradeoff turns out to be very simple for most system parameters
of interest. Consider a slow fading environment in which the
channel gain is random but remains constant for a duration
of l symbols. We show that as long as the block length
$l \ge m + n - 1$, the optimal diversity gain $d^*(r)$ achievable
by any coding scheme of block length l and multiplexing
gain r (r integer) is precisely $(m - r)(n - r)$. This suggests
an appealing interpretation: out of the total resource of m
transmit and n receive antennas, it is as though r transmit
and r receive antennas were used for multiplexing and the
remaining $m - r$ transmit and $n - r$ receive antennas provided
the diversity. Thus, by adding one transmit and one receive
antenna, the spatial multiplexing gain can be increased by one
while maintaining the same diversity level. It should also be
observed that this optimal tradeoff does not depend on l as
long as $l \ge m + n - 1$; hence, no more diversity gain can be
extracted by coding over block lengths greater than $m + n - 1$
than using a block length equal to $m + n - 1$.
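The antenna-splitting interpretation above can be made concrete with a few lines of Python. This is a minimal sketch under the stated assumption $l \ge m + n - 1$ and for integer multiplexing gains only; the function name is hypothetical and chosen for the illustration.

```python
def optimal_diversity_at_integer(m, n, r):
    """Optimal diversity gain (m - r)(n - r) at an integer multiplexing gain r.

    Assumes the block length satisfies l >= m + n - 1, as in the text.
    """
    if not isinstance(r, int) or r < 0 or r > min(m, n):
        raise ValueError("r must be an integer with 0 <= r <= min(m, n)")
    return (m - r) * (n - r)


def diversity_table(m, n):
    """List of (r, d) pairs for r = 0, 1, ..., min(m, n)."""
    return [(r, optimal_diversity_at_integer(m, n, r))
            for r in range(min(m, n) + 1)]
```

For example, `diversity_table(3, 2)` gives the pairs (0, 6), (1, 2), (2, 0). Adding one transmit and one receive antenna moves each diversity level one unit to the right: the value at r for an m by n system equals the value at r + 1 for an (m + 1) by (n + 1) system.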
`The tradeoff curve can be used as a unified framework to
`compare the performance of many existing diversity-based and
`multiplexing-based schemes. For several well-known schemes,
we compute the achieved tradeoff curves d(r) and compare
them to the optimal tradeoff curve. That is, the performance of
a scheme is evaluated by the tradeoff it achieves. By doing
this, we take into consideration not only the capability of
the scheme to combat fading, but also its ability to
accommodate a higher data rate as SNR increases, and therefore
provide a more complete view.
`The diversity-multiplexing tradeoff is essentially the trade-
`off between the error probability and the data rate of a
`system. A common way to study this tradeoff is to compute
`the reliability function from the theory of error exponents
`[10]. However, there is a basic difference between the two
`formulations: while the traditional reliability function ap-
`proach focuses on the asymptotics of large block lengths, our
`formulation is based on the asymptotics of high SNR (but
`fixed block length). Thus, instead of using the machinery of
`the error exponent theory, we exploit the special properties
`of fading channels and develop a simple approach, based on
the outage capacity formulation [11], to analyze the diversity-multiplexing
tradeoff in the high SNR regime. On the other
`hand, even though the asymptotic regime is different, we do
`conjecture an intimate connection between our results and the
`theory of error exponents.
`The rest of the paper is outlined as follows. Section II
`presents the system model and the precise problem formu-
`lation. The main result on the optimal diversity-multiplexing
tradeoff curve is given in Section III, for block length
$l \ge m + n - 1$. In Section IV, we derive bounds on the tradeoff
curve when the block length is less than $m + n - 1$. While the
`analysis in this section is more technical in nature, it provides
`more insights to the problem. Section V studies the case when
`spatial diversity is combined with other forms of diversity.
`Section VI discusses the connection between our results and
`the theory of error exponents. We compare the performance
`of several schemes with the optimal tradeoff curve in Section
`VII. Section VIII contains the conclusions.
`
`II. SYSTEM MODEL AND PROBLEM FORMULATION
`A. Channel Model
We consider the wireless link with m transmit and n receive
antennas. The fading coefficient $h_{ij}$ is the complex path gain
from transmit antenna j to receive antenna i. We assume that
the coefficients are independently Rayleigh distributed with
unit variance, and write $H = [h_{ij}] \in \mathbb{C}^{n \times m}$. H is assumed to
be known at the receiver, but not at the transmitter. We also
assume that the channel matrix H remains constant within a
block of l symbols, i.e., the block length is much smaller than the
channel coherence time. Under these assumptions, the channel,
within one block, can be written as:

$$Y = \sqrt{\frac{\mathrm{SNR}}{m}}\, H X + W \tag{1}$$

where $X \in \mathbb{C}^{m \times l}$ has entries $x_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, l$,
being the signals transmitted from antenna i at time j; $Y \in \mathbb{C}^{n \times l}$
has entries $y_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, l$, being the
signals received from antenna i at time j; the additive noise W
has i.i.d. entries $w_{ij} \sim \mathcal{CN}(0, 1)$; SNR is the average signal
to noise ratio at each receive antenna.
We will first focus on studying the channel within this
single block of l symbol times. In Section V, our results are
generalized to the case in which there are multiple such blocks,
each of which experiences independent fading.
A rate R bps/Hz codebook $\mathcal{C}$ has $|\mathcal{C}| = \lfloor 2^{Rl} \rfloor$ codewords
$\{X(1), \ldots, X(|\mathcal{C}|)\}$, each of which is an $m \times l$ matrix. The
transmitted signal X is normalized to have the average transmit
power at each antenna in each symbol period to be 1. We
interpret this as an overall power constraint on the codebook
$\mathcal{C}$:

$$\frac{1}{|\mathcal{C}|} \sum_{i=1}^{|\mathcal{C}|} \|X(i)\|_F^2 \le ml, \tag{2}$$

where $\|\cdot\|_F$ is the Frobenius norm of a matrix: $\|R\|_F^2 \triangleq \sum_{ij} \|R_{ij}\|^2 = \mathrm{trace}(RR^\dagger)$.
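The channel model (1) and the power constraint (2) can be sketched in plain Python as follows. This is only an illustration under the assumptions of this section (i.i.d. CN(0, 1) noise, H fixed over the block); the function names and the list-of-rows matrix representation are hypothetical choices, not part of the original formulation.

```python
import math
import random


def complex_gaussian(rng):
    """One CN(0, 1) sample: real and imaginary parts are each N(0, 1/2)."""
    sigma = math.sqrt(0.5)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def random_channel(n, m, rng):
    """n x m channel matrix H with i.i.d. CN(0, 1) entries (Rayleigh fading)."""
    return [[complex_gaussian(rng) for _ in range(m)] for _ in range(n)]


def channel_output(h, x, snr, rng):
    """Y = sqrt(snr/m) * H X + W for an n x m matrix h and an m x l codeword x."""
    n, m, l = len(h), len(h[0]), len(x[0])
    gain = math.sqrt(snr / m)
    return [[gain * sum(h[i][k] * x[k][j] for k in range(m)) + complex_gaussian(rng)
             for j in range(l)] for i in range(n)]


def frobenius_norm_sq(x):
    """Squared Frobenius norm: sum of squared magnitudes of all entries."""
    return sum(abs(v) ** 2 for row in x for v in row)


def satisfies_power_constraint(codebook, m, l, tol=1e-9):
    """Check (1/|C|) * sum_i ||X(i)||_F^2 <= m*l for a list of m x l codewords."""
    average = sum(frobenius_norm_sq(x) for x in codebook) / len(codebook)
    return average <= m * l + tol
```

A codebook whose entries all have unit magnitude meets the constraint with equality, since each codeword then has squared Frobenius norm exactly ml.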
`one can transmit independent information symbols in parallel
`through the spatial channels. This idea is also called spatial
`multiplexing.
`Reliable communication at rates arbitrarily close to the
`ergodic capacity requires averaging across many independent
`realizations of the channel gains over time. Since we are
`considering coding over only a single block, we must lower
`the data rate and step back from the ergodic capacity to
`cater for the randomness of the channel H. Since the channel
`capacity increases linearly with log SNR, in order to achieve
`a certain fraction of the capacity at high SNR, we should
`consider schemes that support a data rate which also increases
`with SNR. Here, we think of a scheme as a family of codes
`{C(SNR)} of block length l, one at each SNR level. Let
`R(SNR) (bits/symbol) be the rate of the code C(SNR). We
say that a scheme achieves a spatial multiplexing gain of r if
the supported data rate

$$R(\mathrm{SNR}) \approx r \log \mathrm{SNR} \quad \text{(bps/Hz)}.$$
`One can think of spatial multiplexing as achieving a non-
`vanishing fraction of the degrees of freedom in the channel.
`According to this definition, any fixed-rate scheme has a zero
`multiplexing gain, since eventually at high SNR, any fixed
`data rate is only a vanishing fraction of the capacity.
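To make the notion of a scheme as a family of codes concrete, the short Python sketch below shows how the target rate and the codebook size grow with SNR for a given multiplexing gain, and how the ratio R / log SNR of a fixed-rate code shrinks. The function names are hypothetical, and base-2 logarithms are an assumption (the text does not fix the base).

```python
import math


def target_rate(r, snr):
    """Rate R = r * log2(SNR) supported by a scheme with multiplexing gain r."""
    return r * math.log2(snr)


def codebook_size(r, snr, block_length):
    """Number of codewords floor(2^(R * l)) at the given SNR."""
    return math.floor(2 ** (target_rate(r, snr) * block_length))


def rate_to_log_snr_ratio(rate, snr):
    """R / log2(SNR): tends to r for a scheme, and to 0 for any fixed rate."""
    return rate / math.log2(snr)
```

With r = 0.5 and block length 2, the codebook holds 1024 codewords at SNR = 2^10 and grows as SNR increases. A code with a fixed rate of 4 bits has ratio 0.4 at SNR = 2^10 but only 0.1 at SNR = 2^40, which is the sense in which a fixed-rate scheme has zero multiplexing gain.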
`Now to formalize, we have the following definition.
Definition 1: A scheme {C(SNR)} is said to achieve spatial
multiplexing gain r and diversity gain d if the data rate satisfies

$$\lim_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} = r$$

and the average error probability satisfies

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log P_e(\mathrm{SNR})}{\log \mathrm{SNR}} = -d. \tag{3}$$

For each r, define $d^*(r)$ to be the supremum of the diversity
advantage achieved over all schemes. We also define

$$d^*_{\max} \triangleq d^*(0), \qquad r^*_{\max} \triangleq \sup\{r : d^*(r) > 0\},$$

which are respectively the maximal diversity gain and the
maximal spatial multiplexing gain in the channel.
Throughout the rest of the paper, we will use the special
symbol $\doteq$ to denote exponential equality, i.e., we write
$f(\mathrm{SNR}) \doteq \mathrm{SNR}^b$ to denote

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log f(\mathrm{SNR})}{\log \mathrm{SNR}} = b,$$

and $\dot{\le}$, $\dot{\ge}$ are similarly defined. (3) can thus be written as

$$P_e(\mathrm{SNR}) \doteq \mathrm{SNR}^{-d}.$$

The error probability $P_e(\mathrm{SNR})$ is averaged over the additive
noise W, the channel matrix H and the transmitted codewords
(assumed equally likely). The definition of diversity gain here
differs from the standard definition in the space-time coding
literature (see for example [7]) in two important ways:
• This is the actual error probability of a code, and not
the pairwise error probability between two codewords as
`B. Diversity and Multiplexing
`Multiple antenna channels provide spatial diversity, which
`can be used to improve the reliability of the link. The basic
`idea is to supply to the receiver multiple independently faded
`replicas of the same information symbol, so that the proba-
`bility that all the signal components fade simultaneously is
`reduced.
As an example, consider uncoded binary PSK signals over
a single antenna fading channel (m = n = l = 1 in the above
model). It is well known [12] that the probability of error at
high SNR (averaged over the fading gain H as well as the
additive noise) is

$$P_e(\mathrm{SNR}) \approx \frac{1}{4}\, \mathrm{SNR}^{-1}.$$

In contrast, transmitting the same signal to a receiver equipped
with 2 antennas, the error probability is

$$P_e(\mathrm{SNR}) \approx \frac{3}{16}\, \mathrm{SNR}^{-2}.$$

Here we observe that by having the extra receive antenna,
the error probability decreases with SNR at a faster speed of
$\mathrm{SNR}^{-2}$. This phenomenon implies that at high SNR, the error
probability is much smaller. Similar results can be obtained
if we change the binary PSK signals to other constellations.
Since the performance gain at high SNR is dictated by the
SNR exponent of the error probability, this exponent is called
the diversity gain. Intuitively, it corresponds to the number
of independently faded paths that a symbol passes through;
in other words, the number of independent fading coefficients
that can be averaged over to detect the symbol. In a general
system with m transmit and n receive antennas, there are
in total $m \times n$ random fading coefficients to be averaged
over; hence the maximal (full) diversity gain provided by
the channel is mn.
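The two binary PSK approximations above can be compared with an exact expression. The Python sketch below is an illustration and not taken from the original text: it assumes coherent detection with maximal ratio combining over n independent Rayleigh branches and uses the standard closed form for the average bit error probability in that setting. The function names are hypothetical.

```python
import math


def bpsk_error_prob(snr, n_rx):
    """Exact average BPSK error probability with n_rx-branch maximal ratio combining.

    Assumes i.i.d. Rayleigh fading with unit average power gain per branch
    and average SNR `snr` per receive antenna.
    """
    mu = math.sqrt(snr / (1.0 + snr))
    return ((1.0 - mu) / 2.0) ** n_rx * sum(
        math.comb(n_rx - 1 + k, k) * ((1.0 + mu) / 2.0) ** k for k in range(n_rx))


def high_snr_approx(snr, n_rx):
    """High-SNR approximation C(2n-1, n) / (4 SNR)^n; 1/(4 SNR) and 3/(16 SNR^2) for n = 1, 2."""
    return math.comb(2 * n_rx - 1, n_rx) / (4.0 * snr) ** n_rx


def diversity_slope(n_rx, snr_lo=1e4, snr_hi=1e5):
    """Empirical SNR exponent of the exact error probability between two SNR values."""
    p_lo, p_hi = bpsk_error_prob(snr_lo, n_rx), bpsk_error_prob(snr_hi, n_rx)
    return -(math.log(p_hi) - math.log(p_lo)) / (math.log(snr_hi) - math.log(snr_lo))
```

Under these assumptions the exact values approach the approximations as SNR grows, and the measured exponent is close to 1 for one receive antenna and close to 2 for two, matching the diversity gains discussed above.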
Besides providing diversity to improve reliability, multiple
antenna channels can also support a higher data rate than single
antenna channels. As evidence of this, consider an ergodic
block fading channel in which each block is as in (1) and
the channel matrix is independent and identically distributed
across blocks. The ergodic capacity (bps/Hz) of this channel
is well-known [1], [2]:

$$C(\mathrm{SNR}) = E\left[\log\det\left(I + \frac{\mathrm{SNR}}{m} HH^\dagger\right)\right].$$

At high SNR,

$$C(\mathrm{SNR}) = \min\{m, n\} \log\frac{\mathrm{SNR}}{m} + \sum_{i=|m-n|+1}^{\max\{m,n\}} E\left[\log \chi^2_{2i}\right] + o(1),$$

where $\chi^2_{2i}$ is Chi-square distributed with 2i degrees of freedom.
We observe that at high SNR, the channel capacity increases
with SNR as min{m, n} log SNR (bps/Hz), in contrast
to log SNR for single antenna channels. This result suggests
that the multiple antenna channel can be viewed as min{m, n}
parallel spatial channels; hence the number min{m, n} is the
total number of degrees of freedom to communicate. Now
`A. Optimal Tradeoff Curve
`The main result is given in the following theorem.
Theorem 2: Assume $l \ge m + n - 1$. The optimal tradeoff
curve $d^*(r)$ is given by the piecewise linear function connecting
the points $(k, d^*(k))$, $k = 0, 1, \ldots, \min\{m, n\}$, where

$$d^*(k) = (m - k)(n - k). \tag{4}$$

In particular, $d^*_{\max} = mn$, and $r^*_{\max} = \min\{m, n\}$.
The function $d^*(r)$ is plotted in Figure 1.

[Figure 1: piecewise linear curve through the points (0, mn), (1, (m−1)(n−1)), (2, (m−2)(n−2)), ..., (r, (m−r)(n−r)), ..., (min{m,n}, 0). Horizontal axis: spatial multiplexing gain r = R/log SNR; vertical axis: diversity gain d*(r).]
Fig. 1. Diversity-multiplexing tradeoff, $d^*(r)$, for general m, n and $l \ge m + n - 1$.
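For non-integer multiplexing gains, Theorem 2 amounts to linear interpolation between consecutive integer points. A minimal Python sketch of the curve, assuming $l \ge m + n - 1$ and using a hypothetical function name, is:

```python
import math


def optimal_tradeoff(m, n, r):
    """Optimal diversity gain d*(r) for an m x n channel, assuming l >= m + n - 1.

    The curve is piecewise linear through the points (k, (m - k)(n - k)),
    k = 0, 1, ..., min(m, n).
    """
    r_max = min(m, n)
    if r < 0 or r > r_max:
        raise ValueError("r must satisfy 0 <= r <= min(m, n)")
    # Left end of the linear segment containing r (the last segment when r = r_max).
    k = min(int(math.floor(r)), r_max - 1)
    d_left = (m - k) * (n - k)
    d_right = (m - k - 1) * (n - k - 1)
    return d_left + (r - k) * (d_right - d_left)
```

For m = n = 2 this gives 4 at r = 0, 2.5 at r = 0.5, 1 at r = 1 and 0 at r = 2.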
The optimal tradeoff curve intersects the r axis at
min{m, n}. This means that the maximum achievable spatial
multiplexing gain $r^*_{\max}$ is the total number of degrees of
freedom provided by the channel as suggested by the ergodic
capacity result in (3). Theorem 2 says that at this point,
however, no positive diversity gain can be achieved. Intuitively,
as $r \to r^*_{\max}$, the data rate approaches the ergodic capacity
and there is no protection against the randomness in the fading
channel.
On the other hand, the curve intersects the d axis at the
maximal diversity gain $d^*_{\max} = mn$, corresponding to the
total number of random fading coefficients that a scheme
can average over. There are known designs that achieve the
maximal diversity gain at a fixed data rate [8]. Theorem 2 says
that in order to achieve the maximal diversity gain, no positive
spatial multiplexing gain can be obtained at the same time.
The optimal tradeoff curve $d^*(r)$ bridges the gap between
the above two design criteria, by connecting the two extreme
points: $(0, d^*_{\max})$ and $(r^*_{\max}, 0)$. This result says that positive
`diversity gain and spatial multiplexing gain can be achieved
`simultaneously. However, increasing the diversity advantage
`comes at a price of decreasing the spatial multiplexing gain,
`and vice versa. The tradeoff curve provides a more complete
`picture of the achievable performance over multiple antenna
`channels than the two extreme points corresponding to the
`maximum diversity gain and multiplexing gain. For example,
`the ergodic capacity result suggests that by increasing the
`minimum of the number of transmit and receive antennas,
`min{m, n}, by one, the channel gains one more degree of
`
`is commonly used as a diversity criterion in space-time
`code design.
• In the standard formulation, diversity gain is an asymptotic
performance metric of one fixed code. To be specific,
the input of the fading channel is fixed to be a particular
code, while SNR increases. The speed at which the error
probability (of an ML detector) decays as SNR increases
is called the diversity gain. In our formulation, we notice
that the channel capacity increases linearly with log SNR.
Hence, in order to achieve a non-trivial fraction of the capacity
at high SNR, the input data rate must also increase
with SNR, which requires a sequence of codebooks with
increasing size. The diversity gain here is used as a
performance metric of such a sequence of codes, which
is formulated as a "scheme". Under this formulation, any
fixed code has 0 spatial multiplexing gain. Allowing both
the data rate and the error probability to scale with the
SNR is the crucial element of our formulation and, as
we will see, allows us to talk about their tradeoff in a
meaningful way.
The spatial multiplexing gain can also be thought of as the data
rate normalized with respect to the SNR level. A common way
to characterize the performance of a communication scheme
is to compute the error probability as a function of SNR
for a fixed data rate. However, different designs may support
different data rates. In order to compare these schemes fairly,
Forney [13] proposed to plot the error probability against the
normalized SNR:

$$\mathrm{SNR}_{\mathrm{norm}} \triangleq \frac{\mathrm{SNR}}{C^{-1}(R)},$$

where C(SNR) is the capacity of the channel as a function of
SNR. That is, $\mathrm{SNR}_{\mathrm{norm}}$ measures how far the SNR is above
the minimum required to support the target data rate.
A dual way to characterize the performance is to plot
the error probability as a function of the data rate, for a
fixed SNR level. Analogous to Forney’s formulation, to take
into consideration the effect of the SNR, one should use the
normalized data rate $R_{\mathrm{norm}}$ instead of R:

$$R_{\mathrm{norm}} \triangleq \frac{R}{C(\mathrm{SNR})},$$

which indicates how far a system is operating from the
Shannon limit. Notice that at high SNR, the capacity of the
multiple antenna channel is $C(\mathrm{SNR}) \approx \min\{m, n\} \log \mathrm{SNR}$;
hence the spatial multiplexing gain

$$r = \frac{R}{\log \mathrm{SNR}} \approx \min\{m, n\}\, R_{\mathrm{norm}}$$

is just a constant multiple of $R_{\mathrm{norm}}$.
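The two normalizations can be illustrated with a short Python sketch. The capacity function used here, $C(\mathrm{SNR}) = \log_2(1 + \mathrm{SNR})$ for a single antenna additive Gaussian noise channel, is an assumption made purely for the illustration, since the text leaves C(SNR) general; the function names are likewise hypothetical.

```python
import math


def capacity(snr):
    """Illustrative capacity function: C(SNR) = log2(1 + SNR) (assumed, not from the text)."""
    return math.log2(1.0 + snr)


def capacity_inverse(rate):
    """Minimal SNR needed to support `rate` under the assumed capacity function."""
    return 2.0 ** rate - 1.0


def normalized_snr(snr, rate):
    """SNR_norm = SNR / C^{-1}(R): how far the SNR is above the minimum required."""
    return snr / capacity_inverse(rate)


def normalized_rate(rate, snr):
    """R_norm = R / C(SNR): the fraction of capacity at which the system operates."""
    return rate / capacity(snr)
```

At SNR = 15 the assumed capacity is 4 bits, so a rate of 4 bits gives a normalized SNR of 1, while a rate of 2 bits gives a normalized rate of 0.5 and a normalized SNR of 5.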
III. OPTIMAL TRADEOFF: $l \ge m + n - 1$ CASE
In this section, we will derive the optimal tradeoff between
the diversity gain and the spatial multiplexing gain that any
scheme can achieve in the Rayleigh fading multiple antenna
channel. We will first focus on the case that the block length
$l \ge m + n - 1$, and discuss the other cases in Section IV.
above example, $d(0) = d^*_{\max}$ but for some other schemes
$d(0) < d^*_{\max}$ strictly. Similarly, the maximal spatial multiplexing
gain achieved by a scheme is in general different from the
degrees of freedom $r^*_{\max}$ in the channel.
Consider now the Alamouti scheme as an alternative to
the repetition scheme in (5). Here, two data symbols are
transmitted in every block of length 2 in the form:

$$X = \begin{bmatrix} x_1 & -x_2^\dagger \\ x_2 & x_1^\dagger \end{bmatrix}. \tag{6}$$

It is well known that the Alamouti scheme can also achieve
the full diversity gain, $d^*_{\max}$, just like the repetition scheme.
However, in terms of the tradeoff achieved by the two schemes,
as plotted in Figure 3-(b), the Alamouti scheme is strictly
better than the repetition scheme, since it yields a strictly
higher diversity gain for any positive spatial multiplexing gain.
The maximal multiplexing gain achieved by the Alamouti
scheme is 1, since one symbol is transmitted per symbol
time. This is twice as much as that for the repetition scheme.
However, the tradeoff curve achieved by the Alamouti scheme is
still below the optimal for any r > 0.
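The comparison between the repetition scheme, the Alamouti scheme and the optimal curve for the 2 × 2 channel can be sketched in Python. The sketch rests on an assumption that is not derived in this section: each scheme's curve is taken to be the straight line joining the endpoints stated in the text, namely (0, 4) to (1/2, 0) for repetition and (0, 4) to (1, 0) for Alamouti. The optimal curve joins (0, 4), (1, 1) and (2, 0) as in Theorem 2. All function names are hypothetical.

```python
def optimal_2x2(r):
    """Optimal tradeoff for m = n = 2: piecewise linear through (0, 4), (1, 1), (2, 0)."""
    if r < 0 or r > 2:
        raise ValueError("r must satisfy 0 <= r <= 2")
    return 4.0 - 3.0 * r if r <= 1 else 2.0 - r


def repetition_2x2(r):
    """Assumed straight line from (0, 4) to (1/2, 0); zero beyond r = 1/2."""
    return max(0.0, 4.0 * (1.0 - 2.0 * r))


def alamouti_2x2(r):
    """Assumed straight line from (0, 4) to (1, 0); zero beyond r = 1."""
    return max(0.0, 4.0 * (1.0 - r))
```

Under this assumption all three curves start at d(0) = 4, so the maximal diversity gain alone does not separate the schemes, while for 0 < r < 1 the ordering is optimal > Alamouti > repetition (for instance, at r = 0.25 the values are 3.25, 3 and 2).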
`In the literature on space-time codes, the diversity gain of a
`scheme is usually discussed for a fixed data rate, corresponding
`to a multiplexing gain r = 0. This is, in fact, the maximal
`diversity gain d(0) achieved by the given scheme. We observe
`that if the performance of a scheme is only evaluated by
`the maximal diversity gain d(0), one cannot distinguish the
`performance of the repetition scheme in (5) and the Alamouti
`scheme. More generally, the problem of finding a code with the
`highest (fixed) rate that achieves a given diversity gain is not
`a well-posed one: any code satisfying a mild non-degenerate
`condition (essentially a full-rank condition like the one in [7])
`will have full diversity gain, no matter how dense the symbol
`constellation is. This is because diversity gain is an asymptotic
`concept, while for any fixed code the minimum distance is
`fixed and does not depend on the SNR. (Of course, the higher
`the rate, the higher the SNR needs to be for the asymptotics to
`be meaningful.) In the space-time coding literature, a common
`way to get around this problem is to put further constraints on
the class of codes. In [7], for example, each codeword symbol
$x_{ij}$ is constrained to come from the same fixed constellation
(cf. Theorem 3.31 there). These constraints are, however, not
fundamental. In contrast, by defining the multiplexing gain
as the data rate normalized by the capacity, the question of
finding schemes that achieve the maximal multiplexing gain
for a given diversity gain becomes meaningful.
`B. Outage Formulation
`As a step to prove Theorem 2, we will first discuss another
`commonly used concept for multiple antenna channels: the
`outage capacity formulation, proposed in [11] for fading
`channels and applied to multi-antenna channels in [1].
Channel outage is usually discussed for non-ergodic fading
channels, i.e., the channel matrix H is chosen randomly but
is held fixed for all time. This non-ergodic channel can be
written as:

$$y_t = \sqrt{\frac{\mathrm{SNR}}{m}}\, H x_t + w_t, \quad \text{for } t = 1, 2, \ldots, \infty \tag{7}$$
`
[Figure 2. Horizontal axis: spatial multiplexing gain r = R/log SNR; vertical axis: diversity advantage d*(r).]
Fig. 2. Adding one transmit and one receive antenna increases spatial
multiplexing gain by 1 at each diversity level.
`
freedom; this corresponds to $r^*_{\max}$ being increased by 1.
Theorem 2 makes a more informative statement: if we increase
both m and n by 1, the entire tradeoff curve is shifted to the
right by 1, as shown in Figure 2; i.e., for any given diversity
gain requirement d, the supported spatial multiplexing gain is
increased by 1.
`To understand the operational meaning of the tradeoff curve,
`we will first use the following example to study the tradeoff
`performance achieved by some simple schemes.
Example: 2 × 2 system
Consider the multiple antenna channel with 2 transmit and
2 receive antennas. Assume $l \ge m + n - 1 = 3$. The
optimal tradeoff for this channel is plotted in Figure 3-(a).
The maximum diversity gain for this channel is $d^*_{\max} = 4$,
and the total number of degrees of freedom in the channel is
$r^*_{\max} = 2$.
In order to get the maximal diversity gain, $d^*_{\max}$, each
information bit needs to pass through all the 4 paths from
the transmitter to the receiver. The simplest way of achieving
this is to repeat the same symbol on the two transmit antennas
in two consecutive symbol times:

$$X = \begin{bmatrix} x_1 & 0 \\ 0 & x_1 \end{bmatrix}. \tag{5}$$

$d^*_{\max}$ can only be achieved with a multiplexing gain r = 0. If
we increase the size of the constellation for the symbol $x_1$ as
SNR increases to support a data rate R = r log SNR (bps/Hz)
for some r > 0, the distance between constellation points
shrinks with the SNR and the achievable diversity gain is
decreased. The tradeoff achieved by this repetition scheme is
plotted in Figure 3-(b).¹ Notice the maximal spatial multiplexing
gain achieved by this scheme is 1/2, corresponding to the
point (1/2, 0), since only one symbol is transmitted in two
symbol times.
The reader should distinguish between the notion of the
maximal diversity gain achieved by a scheme, d(0), and the
maximal diversity provided by the channel, $d^*_{\max}$. For the

¹How these curves are computed will become evident in Section VII.
[Figure 3: two panels, (a) and (b). Legend: Optimal Tradeoff, Repetition Scheme, Alamouti Scheme. Horizontal axes: spatial multiplexing gain r = R/log SNR; vertical axes: diversity gain d*(r) and diversity advantage d.]
Fig. 3. Diversity-multiplexing tradeoff for (a): m = n = 2, $l \ge 3$; (b): Comparison between two schemes.

where $x_t \in \mathbb{C}^m$, $y_t \in \mathbb{C}^n$ are the transmitted and received
signals at time t, and $w_t \in \mathbb{C}^n$ is the additive Gaussian noise.
An outage is defined as the event that the mutual information
of this channel does not support a target data rate R:

$$\{H : I(x_t; y_t \mid H = H) < R\}.$$

The mutual information is a function of the input distribution
$P(x_t)$ and the channel realization. Without loss of
optimality, the input distribution can be taken to be Gaussian
with a covariance matrix Q, in which case:

$$I(x_t; y_t \mid H = H) = \log\det\left(I + \frac{\mathrm{SNR}}{m} HQH^\dagger\right).$$

Optimizing over all input distributions, the outage probability is

$$P_{\mathrm{out}}(R) = \inf_{Q \ge 0,\ \mathrm{trace}(Q) \le m} P\left[\log\det\left(I + \frac{\mathrm{SNR}}{m} HQH^\dagger\right) < R\right],$$

where the probability is taken over the random channel matrix
H. We can simply pick Q = I to get an upper bound on the
outage probability.
On the other hand, Q satisfies the power constraint,
$\mathrm{trace}(Q) \le m$, and hence $mI - Q$ is a positive-semidefinite
matrix. Notice that $\log\det(\cdot)$ is an increasing function on the
cone of positive-definite Hermitian matrices, i.e., if A and B
are both positive-semidefinite Hermitian matrices, written as
$A \ge 0$ and $B \ge 0$, then

$$A - B \ge 0 \implies \log\det A \ge \log\det B.$$

Therefore, if we replace Q by $mI_m$, the mutual information is
increased:

$$\log\det\left(I + \frac{\mathrm{SNR}}{m} HQH^\dagger\right) \le \log\det\left(I + \mathrm{SNR}\, HH^\dagger\right);$$

hence the outage probability satisfies

$$P\left[\log\det\left(I + \frac{\mathrm{SNR}}{m} HH^\dagger\right) < R\right] \ge P_{\mathrm{out}}(R) \ge P\left[\log\det\left(I + \mathrm{SNR}\, HH^\dagger\right) < R\right]. \tag{8}$$

At high SNR,

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log P\left[\log\det(I + \mathrm{SNR}\, HH^\dagger) < R\right]}{\log \mathrm{SNR}} = \lim_{\mathrm{SNR} \to \infty} \frac{\log P\left[\log\det\left(I + \frac{\mathrm{SNR}}{m} HH^\dagger\right) < R\right]}{\log \frac{\mathrm{SNR}}{m}} = \lim_{\mathrm{SNR} \to \infty} \frac{\log P\left[\log\det\left(I + \frac{\mathrm{SNR}}{m} HH^\dagger\right) < R\right]}{\log \mathrm{SNR}}.$$

Therefore on the scale of interest, the bounds are tight, and
we have

$$P_{\mathrm{out}}(R) \doteq P\left[\log\det\left(I + \mathrm{SNR}\, HH^\dagger\right) < R\right], \tag{9}$$

and we can without loss of generality assume the input
(Gaussian) distribution to have covariance matrix Q = I.
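The two bounds in (8) can be estimated by simulation. The Python sketch below is an illustration only: it draws i.i.d. CN(0, 1) channel matrices and counts, for the same draws, how often $\log_2\det(I + (\mathrm{SNR}/m)HH^\dagger)$ and $\log_2\det(I + \mathrm{SNR}\,HH^\dagger)$ fall below the target rate. The function names, trial count and seed are hypothetical, and base-2 logarithms are assumed.

```python
import math
import random


def complex_gaussian(rng):
    """One CN(0, 1) sample: real and imaginary parts are each N(0, 1/2)."""
    sigma = math.sqrt(0.5)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def log2_det_identity_plus(h, scale):
    """Return log2 det(I + scale * H H^dagger) for an n x m matrix h (list of rows)."""
    n, m = len(h), len(h[0])
    a = [[scale * sum(h[i][k] * h[j][k].conjugate() for k in range(m))
          + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    log_det = 0.0
    # Elimination without pivoting is safe: the matrix is Hermitian positive definite.
    for c in range(n):
        pivot = a[c][c]
        log_det += math.log2(pivot.real)
        for r in range(c + 1, n):
            factor = a[r][c] / pivot
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return log_det


def outage_bounds(m, n, snr, rate, trials=4000, seed=1):
    """Monte Carlo estimates (lower, upper) of the two probabilities bounding P_out in (8)."""
    rng = random.Random(seed)
    lower_count = upper_count = 0
    for _ in range(trials):
        h = [[complex_gaussian(rng) for _ in range(m)] for _ in range(n)]
        if log2_det_identity_plus(h, snr) < rate:        # Q replaced by m * I
            lower_count += 1
        if log2_det_identity_plus(h, snr / m) < rate:    # Q = I
            upper_count += 1
    return lower_count / trials, upper_count / trials
```

Because the same channel draws are used for both events, the estimated lower bound can never exceed the estimated upper bound, mirroring the monotonicity argument in the text.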
In the outage capacity formulation, we can ask an analogous
question as in our diversity-tradeoff formulation: given a target
rate R which scales with SNR as r log SNR, how does the
outage probability decrease with the SNR? To perform this
analysis, we can assume without loss of generality that $m \ge n$.
This is because

$$\log\det\left(I + \frac{\mathrm{SNR}}{m} HH^\dagger\right) = \log\det\left(I + \frac{\mathrm{SNR}}{m} H^\dagger H\right),$$

hence swapping m and n has no effect on the mutual information,
except a scaling factor of m/n on the SNR, which
can be ignored on the scale of interest.
We start with the following example.
Example: Single Antenna Channel
Consider the single antenna fading channel

$$y = \sqrt{\mathrm{SNR}}\, h x + w$$

where $h \in \mathbb{C}$ is Rayleigh distributed, and $y, x, w \in \mathbb{C}$. To
achieve a spatial multiplexing gain of r, we set the input data
rate to be R = r log SNR for $0 \le r \le 1$. The outage probability
for this target rate is

$$\begin{aligned} P_{\mathrm{out}}(r \log \mathrm{SNR}) &= P\left(\log(1 + \mathrm{SNR}\|h\|^2) \le r \log \mathrm{SNR}\right) \\ &= P\left(1 + \mathrm{SNR}\|h\|^2 \le \mathrm{SNR}^r\right) \\ &\approx P\left(\|h\|^2 \le \mathrm{SNR}^{-(1-r)}\right). \end{aligned}$$
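Notice $\|h\|^2$ is exponentially distributed, with density
$p_{\|h\|^2}(t) = e^{-t}$; hence

$$\begin{aligned} P_{\mathrm{out}}(r \log \mathrm{SNR}) &\approx P\left(\|h\|^2 \le \mathrm{SNR}^{-(1-r)}\right) \\ &= 1 - \exp\left(-\mathrm{SNR}^{-(1-r)}\right) \\ &\doteq \mathrm{SNR}^{-(1-r)}. \end{aligned}$$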
This simple example shows the relation between the data
rate and the SNR exponent of the outage probability. The result
depends on the Rayleigh distribution of h only through the
near zero behavior: $P(\|h\|^2 \le \epsilon) \sim \epsilon$; hence is applicable to
any fading distribution with a non-zero finite density near 0.
We can also generalize to the case that the fading distribution
has $P(\|h\|^2 \le \epsilon) \sim \epsilon^k$, in which case the resulting SNR
exponent is $k(1 - r)$ instead of $1 - r$.
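The single antenna example can be evaluated exactly, which gives a quick check of the exponent 1 − r. The Python sketch below assumes $\|h\|^2$ is exponentially distributed with unit mean, as in the example, and uses hypothetical function names.

```python
import math


def outage_probability(snr, r):
    """Exact P(log(1 + SNR*|h|^2) <= r*log(SNR)) when |h|^2 ~ Exp(1), for 0 < r <= 1."""
    threshold = (snr ** r - 1.0) / snr
    # 1 - exp(-threshold), computed with expm1 for accuracy at small thresholds.
    return -math.expm1(-threshold)


def outage_exponent(snr, r):
    """Empirical SNR exponent: -log P_out / log SNR, which approaches 1 - r at high SNR."""
    return -math.log(outage_probability(snr, r)) / math.log(snr)
```

At SNR = 10^12 the computed exponent is close to 0.5 for r = 0.5 and close to 0.75 for r = 0.25, in line with the result above.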
In a general $m \times n$ system, an outage occurs when the channel
matrix H is “near singular”. The key step in computing
the outage probability is to explicitly quantify how singular
H needs to be for outage to occur, in terms of the target
data rate and the SNR. In the above example with a data rate
R = r log SNR, outage occurs when $\|h\|^2 \le \mathrm{SNR}^{-(1-r)}$, with
a probability $\mathrm{SNR}^{-(1-r)}$. To generalize this idea to multiple
antenna systems, we need to study the probability that the
singular values of H are close to zero. We quote the joint
probability density function (pdf) of these singular values
[14].
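Lemma 3: Let R be an $m \times n$ random matrix with i.i.d.
$\mathcal{CN}(0, 1)$ entries. Suppose $m \ge n$, and let $\mu_1 \le \mu_2 \le \ldots \le \mu_n$ be
the ordered non-zero eigenvalues of $R^\dagger R$; then the joint pdf
of the $\mu_i$'s is

$$p(\mu_1, \ldots, \mu_n) = K_{m,n}^{-1} \prod_{i=1}^{n} \mu_i^{m-n} \prod_{i<j} (\mu_i - \mu_j)^2\, e^{-\sum_i \mu_i} \tag{10}$$

where $K_{m,n}$ is a normalizing constant. Define
$\alpha_i := -\log \mu_i / \log \mathrm{SNR}$ for all i. The joint pdf of the random vector
$\alpha = [\alpha_1, \ldots, \alpha_n]$ is

$$p(\alpha) = K_{m,n}^{-1} (\log \mathrm{SNR})^n \prod_{i=1}^{n} \mathrm{SNR}^{-(m-n+1)\alpha_i} \prod_{i<j} \left(\mathrm{SNR}^{-\alpha_i} - \mathrm{SNR}^{-\alpha_j}\right)^2 \exp\left[-\sum_{i=1}^{n} \mathrm{SNR}^{-\alpha_i}\right].$$

This can be obtained from (10) by the change of variables
$\mu_i = \mathrm{SNR}^{-\alpha_i}$.
Now consider (9) with R = r log SNR, and let $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$
be the non-zero eigenvalues of $HH^\dagger$; we have

$$P_{\mathrm{out}}(R) \doteq P\left[\log\det(I + \mathrm{SNR}\, HH^\dagger) < R\right] = P\left[\prod_{i=1}^{n} (1 + \mathrm{SNR}\,\lambda_i) < \mathrm{SNR}^r\right].$$

Let $\lambda_i = \mathrm{SNR}^{-\alpha_i}$. At high SNR, we have $(1 + \mathrm{SNR}\,\lambda_i) \doteq \mathrm{SNR}^{(1-\alpha_i)^+}$,
where $(x)^+$ denotes max{0, x}. The above can
thus be written as

$$P_{\mathrm{out}}(R) \doteq P\left[\prod_i \mathrm{SNR}^{(1-\alpha_i)^+} < \mathrm{SNR}^r\right] = P\left[\sum_i (1 - \alpha_i)^+ < r\right].$$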
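The high SNR approximation used in the last step, $(1 + \mathrm{SNR}\,\lambda_i) \doteq \mathrm{SNR}^{(1-\alpha_i)^+}$ with $\lambda_i = \mathrm{SNR}^{-\alpha_i}$, can be checked numerically. The Python sketch below is illustrative only; the function names are hypothetical.

```python
import math


def exact_exponent(alpha, snr):
    """log(1 + SNR * SNR^(-alpha)) / log(SNR): exponent of one factor at finite SNR."""
    return math.log1p(snr * snr ** (-alpha)) / math.log(snr)


def high_snr_exponent(alpha):
    """Limiting exponent (1 - alpha)^+ = max(0, 1 - alpha)."""
    return max(0.0, 1.0 - alpha)


def in_outage_high_snr(alphas, r):
    """High-SNR outage condition: sum over i of (1 - alpha_i)^+ is less than r."""
    return sum(high_snr_exponent(a) for a in alphas) < r
```

At SNR = 10^30 the exact exponent agrees with $(1 - \alpha)^+$ to within about 0.01 for values of $\alpha$ on either side of 1. In the limiting description, a channel with $\alpha = (0.2, 1.5)$ is in outage at r = 1, since only the first eigenvalue contributes and 0.8 < 1, whereas $\alpha = (0, 0)$ is not.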
`
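$$K_{m,n}^{-1} (\log \mathrm{SNR})^n \prod_{i=1}^{n} \mathrm{SNR}^{-(m-n+1)\alpha_i} \prod_{i<j} \left(\mathrm{SNR}^{-\alpha_i} - \mathrm{SNR}^{-\alpha_j}\right)^2 \exp\left[-\sum_i \mathrm{SNR}^{-\alpha_i}\right] d\alpha$$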
`
Here, the random vector $\alpha$ indicates the level of singularity
`of the channel matrix H. The lar