throbber
A White-Box DES Implementation
`for DRM Applications(cid:63)
`
`S. Chow1, P. Eisen1, H. Johnson1, P.C. van Oorschot2
`
`1 Cloakware Corporation, Ottawa, Canada
`2 Carleton University, Ottawa, Canada
`(This research was carried out at Cloakware Corp.)
`{stanley.chow, phil.eisen, harold.johnson}@cloakware.com,
`vanoorschot@scs.carleton.ca
`
`Abstract. For applications such as digital rights management (drm)
`solutions employing cryptographic implementations in software, white-
`box cryptography (or more formally: a cryptographic implementation
`designed to withstand the white-box attack context) is more appropriate
`than traditional black-box cryptography. In the white-box context, the
`attacker has total visibility into software implementation and execution,
`and our objective is to prevent the extraction of secret keys from the
`program. We present methods to make key extraction difficult in the
`white-box context, with focus on symmetric block ciphers implemented
`by substitution boxes and linear transformations. A des implementation
`(useful also for triple-des) is presented as a concrete example.
`
`1 Introduction
`
`In typical software digital rights management (drm) implementations, crypto-
`graphic algorithms are part of the security solution. However, the traditional
`cryptographic model — employing a strong known algorithm, and relying on
`the secrecy of the cryptographic key — is inappropriate surprisingly often, since
`the platforms on which many drm applications execute are subject to the control
`of a potentially hostile end-user. This is the challenge we seek to address.
`A traditional threat model used in black-box symmetric-key cryptography is
`the adaptive chosen plaintext attack model. It assumes the attacker does not
`know the encryption key, but knows the algorithm, controls the plaintexts en-
`crypted (their number and content), and has access to the resulting ciphertexts.
`However, the dynamic encryption operation is hidden — the attacker has no
`visibility into its execution.
`We make steps towards providing software cryptographic solutions suitable in
`the more realistic (for drm applications) white-box attack context: the attacker
`is assumed to have all the advantages of an adaptive chosen-text attack, plus full
`access to the encrypting software and control of the execution environment. This
`includes arbitrary trace execution, examining sub-results and keys in memory,
`
`(cid:63) Version: Oct 15, 2002; Pre-proceedings for ACM DRM-2002 workshop.
`
`APPLE 1013
`
`1
`
`

`

`2
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`performing arbitrary static analyses on the software, and altering results of sub-
`computation (e.g. via breakpoints) for perturbation analysis.
`Our main goal is to make key extraction difficult. While an attacker con-
`trolling the execution environment can clearly make use of the software itself
`(e.g. for decryption) without explicitly extracting the key, forcing an attacker to
`use the installed instance at hand is often of value to drm systems providers. How
`strong an implementation can be made against white-box threats is unknown.
`We presently have no security proofs for our methods. Nonetheless, regardless
`of the security of our particular proposal, we believe the general approach of-
`fers useful levels of security in the form of additional protection suitable in the
`commercial world, forcing an attacker to expend additional effort (compared to
`conventional black-box implementations). Our goal is similar to Aucsmith and
`Graunke’s split encryption/decryption [1]; the solutions differ.
`White-box solutions are inherently (and currently, quite significantly) bulkier
`and slower than black-box cryptography. These drawbacks are offset by ad-
`vantages justifying white-box solutions in certain applications. Software-only
`white-box key-hiding components may be cost-effectively installed and updated
`periodically (cf. Jakobsson and Reiter [8]), whereas smart cards and hardware
`alternatives can’t be transmitted electronically. Hardware solutions also cannot
`protect encryption within mobile code. While white-box implementations are
`clearly not appropriate for all cryptographic applications (see [4]), over time, we
`expect increases in processing power, memory capacity and transmission band-
`width, along with decreasing costs, to ameliorate the efficiency concerns.
`In black-box cryptography, differences in implementation details among func-
`tionally equivalent instances are generally irrelevant with respect to security
`implications. In contrast, for white-box cryptography, changing implementation
`details becomes a primary means for providing security. (This is also true, to
`a lesser extent, for cryptographic solutions implemented on smart cards and
`environments subject to so-called side-channel attacks.)
`In this paper, we focus on general techniques that are useful in producing
`White-Box implementations of Feistel ciphers. We use des (e.g. see [11]) to
`provide a detailed example of hiding a key in the software. des-like ciphers
`are challenging in the White-Box context since each round leaves half the bits
`unchanged and the expansions and permutations boxes are very simple (and
`known). We propose techniques to handle these problems.
`We largely ignore space and time requirements in the present paper, noting
`only that white-box implementations have been successfully used in commercial
`practice. In the present paper we restrict attention to the embedded (fixed) key
`case; dynamic-key white-box cryptography is the subject of ongoing research.
`The motivation for using des is twofold: (1) des needs only linear transforma-
`tions and substitution boxes, simplifying our discussion; and (2) our technique
`readily extends to triple-des which remains popular. We outline a white-box im-
`plementation for aes [5] elsewhere — see Chow et al. [4], to which we also refer
`for further discussion of the goals of white-box cryptography, related literature,
`
`2
`
`

`

`A White-Box DES Implementation
`
`3
`
`and why theoretical results such as that of Barak et al. [2] are not roadblocks to
`practical solutions.
`Following terminology and notation in §2, §3 outlines basic white-box con-
`struction techniques. §4 presents a blocking method for building encoded net-
`works. §5 provides an example white-box des implementation, with a recom-
`mended variant discussed in §5.3. Concluding remarks are found in §6.
`
`2 Terminology and Notation
`
`We follow the terminology of [4]. The main concept is the encoding of a transfor-
`mation, we use this extensively in this paper — we consider a substitution-box
`lookup to be a transformation; we also consider the whole des to be a transfor-
`mation. We use input/output encodings to protect all these transforms.
`
`Definition 1 (encoding) Let X be a transformation from m to n bits. Choose
`an m-bit bijection F and an n-bit bijection G. Call X(cid:48) = G◦X◦F −1 an encoded
`version of X. F is an input encoding and G is an output encoding.
`
`In many places, the transformations have wide inputs and/or outputs (in
`the des construction, some are 96 bits input and output), which means the
`implementation can get very large. To avoid huge tables, we can construct an
`encoding as the concatenation of smaller bijections. Consider bijections Fi of
`size ni, where n1 + n2 + . . . + nk = n. Let (cid:107) denote vector concatenation.
`Definition 2 (concatenated encoding) The function concatenation F1(cid:107)F2(cid:107)
`. . . (cid:107)Fk is the bijection F such that, for any n-bit vector b = (b1, b2, . . . , bn),
`F (b) = F1(b1,. . ., bn1)(cid:107)F2(bn1+1,. . ., bn1+n2)(cid:107) . . . (cid:107)Fk(bn1+...+nk−1+1,. . ., bn). For
`2 (cid:107) . . . (cid:107)F −1
`1 (cid:107)F −1
`such a bijection F , plainly F −1 = F −1
`k .
`Generally, output of a transformation will become the input to another subse-
`quent transformation, which means the output encoding of the first must match
`the input encoding of the second.
`
`Definition 3 (networked encoding) A networked encoding for computing Y
`◦ X (i.e. transformation X followed by transformation Y ) is an encoding of the
`form
`Y (cid:48) ◦ X(cid:48) = (H ◦ Y ◦ G−1) ◦ (G ◦ X ◦ F −1) = H ◦ (Y ◦ X) ◦ F −1 .
`
`In the subsequent sections, it is often useful to keep track of the sizes of the
`inputs and outputs. We use the following notations.
`P (cid:48) denotes an encoded implementation derived from the function P . n
`mP de-
`notes P , emphasizing that it maps m-vectors to n-vectors. For a matrix M, n
`mM
`denotes explicitly that M has m columns and n rows. (Interpreting application
`of M to a vector as function application, this notation is the same as above,
`though there is not necessarily a matrix representation M for every arbitrary
`function P .)
`
`3
`
`

`

`4
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`I is the identity function on k-vectors. n
`mE (a mnemonic for an entropy-
`transference function) is any function from m-vectors to n-vectors such that, if
`m ≤ n, the mapping loses no bits of information, and if m > n, it loses at most
`n − m bits.
`(cid:104)e1, e2, e3, . . . , ek(cid:105) is a k-vector with elements ei; it will be evident from
`context whether the elements are bits. x(cid:107)y is the concatenation of vectors x
`and y. vi is the ith element. vi··j is the sub-vector containing elements i through
`j. kv denotes explicitly that v has k elements. ke is any vector with k elements
`(mnemonically: an entropy-vector). ⊕ denotes bitwise xor.
`at (mnemonic: affine transformation) denotes a vector-to-vector transforma-
`
`tion function P which can be defined for all me by nmP (me) = n
`mMme + nd, or
`P (e) = Me + d, where M is a constant matrix and d is a constant displacement
`vector. We consider ats over gf(2). Note that if A and B are ats, then so are
`A(cid:107)B and A ◦ B (where defined).
`
`kk
`
`3 Producing Encoded Implementations
`
`As is well known, des consists of “permutations”, S-Box lookups and xor oper-
`ations. We will follow our approach of applying encodings to each of these steps.
`For S-Box lookups and xor operations, encoding each operation (along with its
`input and output) seem to produce good security. For the various permutations,
`the problem is more difficult.
`These permutations are, by nature, very simple; and it is difficult to hide the
`information being manipulated. We address this by changing the domain from
`permutations to linear algebra so that we have more tools at our disposal. We
`start by expressing each of the des permutations and the bitwise xor operations
`as ats. Clearly, the resultant ats are still very simple and do not hide information
`well, but subsequent use of non-linear encoding (see 4) will make such hiding
`much more effective.
`
`3.1 Some techniques
`
`In general, we are working towards an implementation consisting entirely of
`substitution boxes, none of which implement ats. Several intermediate methods
`are needed to transform a normal cipher implementation to such a form. We
`describe such intermediate methods below. We will use a subset of these in our
`des example.
`
`Partial Evaluation If part of our input to P is known at implementation
`creation time, we can simply put in the known values to P (cid:48) and pre-evaluate all
`constant expressions. For example, in the present case where the key is known
`in advance, we can pre-evaluate all the operations involving the key. In the case
`of des this essentially means we replace the standard S-Boxes with key-specific
`S-Boxes.
`
`4
`
`

`

`A White-Box DES Implementation
`
`5
`
`Mixing Bijection A mixing bijection is a bijective at which attempts to maxi-
`mize the dependency of each output bit on all input bits. (Clearly, it is invertible
`and the inverse is also an at.)
`In des, for example, the permutations, represented as ats, have very sparse
`matrices (one or two 1-bits per row or column). In order to diffuse information
`over more bits, we can represent such a permutation P by J ◦ K, where K is
`a mixing bijection and J = P K−1, thereby replacing a sparse matrix with two
`non-sparse ones (which is advantageous in subsequent de-linearizing encoding
`steps).
`
`mP , where m and n are large, cannot
`I/O-Blocked Encoding An arbitrary n
`simply be encoded using two arbitrary bijective encodings as P (cid:48) = G ◦ P ◦ F −1
`using a substitution-box representation, since the size of a substitution-box varies
`exponentially with the number of input bits.
`We can produce practically implementable encodings for such a P by dividing
`its input into j blocks of a bits each, and its output into k blocks of b bits each, so
`
`mJ and nnK be two mixing bijections (see above).
`that m = ja and n = kb. Let m
`We choose (arbitrary) coding bijections for each block of the input and out-
`bGk. We then define FP = (F1(cid:107) · · · (cid:107)Fj) ◦ J and
`put: a
`aFj and b
`aF1, . . . , a
`bG1, . . . , b
`GP = (G1(cid:107) · · · (cid:107)Gk) ◦ K. Then P (cid:48) = GP ◦ P ◦ F −1
`as usual.
`P
`(We still have the problem of how wide-input ats are to be represented by
`networks of substitution boxes. Methods for this are described in §4.)
`This permits us to use networked encoding (def. 3) with a ‘wide I/O’ linear
`function in encoded form, since, prior to encoding, as a preliminary step, we
`only need to deal with J and K (i.e., we replace P with K ◦ P ◦ J−1), which
`
`i s and G(cid:48)is which we add
`can be done using the smaller blocking factors of the F (cid:48)
`during encoding. That is, if the input to P is provided by an at X, and the
`output from P is used by an at Y , we would use J ◦ X and Y ◦ K−1 instead.
`Then the input and output coding of the parts can ignore J and K — they have
`already been handled — and deal only with the concatenated non-linear partial
`I/O encodings F1(cid:107) · · · (cid:107)Fj and G1(cid:107) · · · (cid:107)Gk, which conform to smaller blocking
`factors easily handled by substitution boxes. This easily extends to non-uniform
`I/O blocked encoding (where blocks vary in size).
`
`Combined Function Encoding For functions P and Q that happen to be
`evaluated together, we could choose an encoding of P(cid:107)Q such as G◦(P(cid:107)Q)◦F −1.
`Essentially, we combine P and Q into a single function, then encode the combined
`input and output. The encoding mixes P ’s input and output entropy with Q’s,
`making it harder for an attacker to separate and determine the components P
`and Q. Note that this differs from concatenated encoding (def. 2) in how the
`encoding is applied. Here, the encoding applies to all components as a single
`unit.
`
`By-Pass Encoding In general, we want the implementation of each transform
`to have extra entropy at both the input and output, so that it is more difficult to
`
`5
`
`

`

`6
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`identify the transform. For example, for a transform n
`mP , we carry a extra bits
`of entropy at the input, and b extra bits of entropy at the output, where a ≥ b,
`aE the by-pass componentm+aP (cid:48) as G ◦ (P(cid:107) baE) ◦ F −1. We call b
`
`
`we can encode n+b
`of P (cid:48). (If b
`
`aE has the specific form aaI, so that a = b, we call it identity by-pass.)
`
`Split-Path Encoding For a function nmP , we may use an encoding that is
`
`
`really a concatanation of two separate encodings. That is, we define n+km Q(me) =
`P (me)(cid:107) k
`mR(me) for all me, for some fixed function R. The effect is that, if P is
`lossy, Q may lose less information (or no information). We sometimes use this
`method to achieve local security (see §3.2.)
`
`Output Splitting This technique is useful for disguising outputs where input
`information can be well hidden. This does not appear to be the case for des:
`for implementations of des, output splitting is not recommended since it cannot
`provide much security.
`Where the technique is appropriate, we may encode a function P as k ≥ 2
`functions, P1, P2, . . . , Pk, where the encoded implementation for each part can
`mix in additional entropy as described above, and where the output of all of the
`encoded Pi’s is needed to determine the original output of P .
`mP1 to be a
`mP , we can choose k = 2, define n
`For example, given a function n
`mP2(me) = P (me) ⊕ P1(me) for all me.
`mE, and define n
`randomly selected fixed n
`At this point, we can compute the P output from the xor of the outputs of
`the two Pi’s. However, after we then independently encode the Pi’s, the output
`1 and P (cid:48)
`of P (cid:48)
`2 is not combinable via an at into information about P ’s output.
`
`3.2 Substitution Boxes and Local Security
`
`We can represent any function n
`mP by a substitution box (S-Box): an array of
`2m entries, each of n bits. To compute P (x), find the array entry indexed by the
`binary magnitude x. The exponential growth in S-Box size with its input width
`limits S-Boxes to the representation of narrow input functions.
`When the underlying P is bijective, the encoded S-Box for P (cid:48) is locally secure
`in the sense that it is not possible to extract any useful information by examining
`the encoded S-Box. The reason is that given a S-Box for P (cid:48), every possible
`bijective P fits (just as in an One-Time Pad, every possible plaintext can result
`in every possible ciphertext). Of course, this just means that an attack must be
`non-local.
`The lossy case is not locally secure. When a slightly lossy encoded function is
`represented as an S-Box, some information about the function beyond its input
`and output widths can be found by examining its S-Box. While it still requires
`non-local attacks to completely unravel such a lossy box, it often leaks enough
`information to allow attacks such as the statistical bucketing attack (see §5.4).
`
`6
`
`

`

`4 Wide-Input Encoded ATs: Building Encoded Networks
`
`A White-Box DES Implementation
`
`7
`
`(cid:80)n#
`
`(cid:80)m#
`
`Clearly, if we want to construct a S-Box with wide-input, say 32 bits or even
`96 bits, we will quickly use immense amounts of storage. This means we cannot
`represent a wide-input encoded at by an S-Box. We can, however, construct
`networks of S-Boxes to implement a wide-input encoded at. The following con-
`struction handles ats in considerable generality, including compositions of ats,
`
`mA encoded as nmA(cid:48). The form of the
`and for a wide variety of ats of the form n
`network can remain invariant except for variations in the bit patterns within its
`S-Boxes.
`For an at A, we simply partition the matrix and vectors into blocks, giving
`us well-known formulas using the blocks from the partition which subdivide the
`computation of A. We can then use (smaller) S-Boxes to encode the functions
`defined by the blocks, and combine the result into a network, using the methods
`in §3.1 above, so that the resulting network is an encoding of A.
`mM me + nd for all me. We choose
`
`Consider an at A, defined by nmA(me) = n
`partition counts m# and n# and sequences (cid:104)m1, . . . , mm#(cid:105) and (cid:104)n1, . . . , nn#(cid:105),
`such that
`1 ni = n. That is, the former sequence (the m-
`1 mi = m and
`partition) is an additive partition of m, and the latter sequence (the n-partition)
`is an additive partition of n.
`The m-partition partitions the inputs and the columns of M; the n-partition
`partitions d and the outputs. Hence the i, jth block in partitioned M contains
`mi columns and nj rows, the ith partition of the input contains mi elements,
`and the jth partition of d or the output contains nj elements.
`At this point, it is straightforward to encode the components (of the network
`forming A) to obtain an encoded network, by the methods of §3.1, and then
`representing it as a network of S-Boxes (see §3.2.) In such a network,none of
`the subcomputations is linear: each is encoded and represented as a non-linear
`S-Box.
`A naive version of this consists of a forest of n# trees of binary ‘vector add’
`S-Boxes, with m#(m# − 1) ‘vector add’ nodes per tree. At the leaves are m#
`unary ‘constant vector multiply’ nodes, and at the root is either a binary ‘vector
`add’ node (if there is no displacement) or a unary ‘constant vector add’ node (if
`there is a displacement).
`However, we can eliminate the unary ‘constant vector add’ and ‘constant
`vector multiply’ nodes entirely. We simply compose them into their adjacent
`binary ‘vector add’ nodes, thereby saving some space by eliminating their S-
`Boxes.
`A potential weakness of this entire approach is that the blocking of A may
`produce blocks, such as zero blocks, which convert to S-Boxes whose output
`contains none, or little, of their input information. This narrows the search space
`for an attacker seeking to determine the underlying at from the content and
`behavior of the network. However, so far as we have yet determined, such blocked
`implementations remain combinatorially quite difficult to crack, especially if we
`apply the following proposal.
`
`7
`
`

`

`8
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`To address the above potential weakness, instead of encoding n
`mA, find trans-
`
`forms nmA1 and mmA2, such that A2 is a mixing bijection (see §3.1), and A1 =
`
`A◦ A−1
`2 . Encode the two functions separately into networks of S-Boxes, and con-
`2 representation to the inputs of the A(cid:48)
`nect the outputs of the A(cid:48)
`1 representation,
`
`1 ◦ A(cid:48)2 = A(cid:48).
`thus creating a representation of A(cid:48)
`While the above proposal helps, it is not easy, in general, to eliminate m × n
`blocks which lose more bits of input information than the minimum indicated
`knM into k × k blocks, we
`by m and n. For example, if we partition a matrix kn
`cannot guarantee that all of the k × k blocks are non-singular, even if the rank
`of M is greater than k. Hence if M is non-singular, a partition of M into square
`blocks may contain some singular (lossy) blocks.
`Therefore, some information about an encoded at may leak in its represen-
`tation as a blocked and de-linearized network of S-Boxes when this blocking
`method is used.
`
`5 A White-Box DES Implementation Example
`
`We now construct an embedded, fixed-key des implementation. We will start
`with a simple construction with weaknesses, both in security and efficiency. We
`discuss how to address these in §5.3.
`
`5.1 Replacing the DES SBs
`
`des is performed in 16 rounds, each employing the same eight des S-Boxes
`(dsbs), S1, . . . S8, and the same ats, sandwiched between initial and final ats
`(the initial and final permutations). Each dsb is an instance of 4
`6E (see e.g. [11]).
`
`In Figure 1(a), we see an unrolling of two typical des rounds. The round
`structure implements a Feistel network with a by-pass left-side data-path (Lr−1,
`Lr, Lr+1 in the figure) and an active right-side data-path (everything else in the
`figure). Kr is the round subkey of round r.
`???Check change fig 1b to be unrolled to match fig 1a??? The encoded im-
`plementation will look like Figure 1(b), where each round is represented by 12
`T boxes. Between rounds, the left and right sides are combined into a single 96
`bit representation and we uses a single rM2 transform to to subsume the P-Box,
`xor, side flip, and E-box. (See The Transfer Functions in §5.2 for a more
`detailed description.)
`We also need M1 to do an initial expansion of the input from 64 bits to the
`internal 96 bits form and M3 for the final shrink of the output.
`
`Eliminating the Overt Key by Partial Evaluation In each round, a dsb’s
`input is the xor of ‘unpredictable’ information (i.e. data), and ‘predictable’ infor-
`mation, determined by the algorithm and the key. We can merge the ‘predictable’
`information and the dsbs into new S-Boxes that are dependent on the key and
`round.
`
`8
`
`

`

`A White-Box DES Implementation
`
`9
`
`Fig. 1. Original and Modified DES
`
`96
`
`8
`
`8
`
`8
`
`r−1
`K T1
`
`r−1
`K T2
`
`r−1
`K T3
`
`. . .
`
`8
`
`8
`
`8
`
`8
`
`r−1
`K T12
`
`8
`
`96
`
`r−1
`M
`
`2
`
`2
`
`8
`
`T12
`
`r K
`
`8
`
`. . .
`
`96
`
`96
`
`8
`
`8
`
`8
`
`Tr
`K 1
`
`Tr
`K 2
`
`Tr
`K 3
`
`8
`
`8
`
`8
`
`L r−1
`
`rR −1
`
`Expansion
`
`r−1K
`
`...
`
`S1
`
`S8
`
`P−Box
`
`L r
`
`rR
`
`Expansion
`
`rK
`
`...
`
`S1
`
`S8
`
`P−Box
`
`L r+1
`
`rR +1
`
`} }}
`
`−1r
`TK
`
`r−1
`M
`
`Tr
`
`K
`
`(a) Two Rounds of DES
`
`(b) Modified DES Before
`De−Linearization and Encoding
`
`9
`
`

`

`10
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`Let us therefore produce new S-Boxes identified as r
`KSi, where K is the
`encryption key, r is the round number, and i is the corresponding dsb number,
`KSi yields the same result as Si would produce in
`such that, for any given input, r
`round r if the des key were K, but the xors of the inputs of the original dsbs have
`been eliminated (see partial evaluation in §3.1). Each of the 16 × 8 = 128
`
`Si’s is still in 46E form of 6 input bits and 4 output bits.
`At this point, the overt key K has disappeared from the algorithm: it is
`
`represented in the contents of the rKSi’s. This permits us to remove the xors
`(“⊕”) with the inputs to S1, . . . , S8 shown in Figure 1(a).
`
`rK
`
`Preparing the Modified DSBs for Local Security In grey-box (smart card)
`implementations of des the dsbs have proven to be effective sites for statistical
`attacks. To make such attacks more difficult for our white-box implementation,
`we would like to end up with locally secure (see §3.2) S-Boxes. This means we
`need to replace the lossy S-Boxes with something that is bijective. We convert
`8E form by means of split-path encoding (see §3.1):
`
`the lossy rKSi’s into 8
`KSi(8e1··6)(cid:107) R(8e)
`Ti(8e) = r
`for all 8e, for the fixed key K, for round r = 1, . . . , 16, for S-Box number
`i = 1, . . . , 8, where we define R(8e) = (cid:104) 8e1, 8e6, 8e7, 8e8 (cid:105) for all 8e. This is an
`application of split-path encoding
`The plan is that the first six bits of the input of a rKTi will be the 6-bit input
`
`to dsb i in round r. We then add two extra input bits. The left 4-bit half of
`
`the output of a rKTi is the output of dsb i in round r, and the right 4-bit half
`contains the first and last input bits of dsb i in round r followed by the two
`extra input bits. That is, the right half of the output contains copies of four of
`the input bits.
`To see that each rKTi is a bijection, we note that the function Fa,b,c,d defined
`
`KTi((cid:104)a(cid:105)(cid:107)4e(cid:107)(cid:104)b, c, d(cid:105)) is a bijec-
`for any constant bits a, b, c, d by Fa,b,c,d(4e) = r
`tion. (Every row of every dsb contains a permutation of (cid:104)0, . . . , 15(cid:105), with the
`row selected by the bits corresponding to a, b above. The xor with the relevant
`bits of the key K effectively re-orders this permutation into a new permutation.
`The output of Fa,b,c,d is therefore a bijection mapping the 4e according to a 1-to-
`
`1 mapping of the input space determined by a permutation. Since rKTi simply
`KTi preserves all of its
`copies the bits corresponding to a, b, c, d to the output, r
`input entropy; i.e., it is a bijection.)
`
`rK
`
`Providing 64 bits of By-Pass Capacity In our construction, we want to
`hide the difference between the left and right sides of the Feistel data-path, so
`each rM2 expects more than just the 32 bits of S-Box outputs, both the left and
`(unchanged) right sides are needed. We refer to this as needing 64 bit of by-pass.
`As converted above, each r
`KTi carries eight bits to the next rM2: 4 bits of
`S-Box output, 2 bits from the right side and 2 bits that we can choose to be
`from the left. This means eight T boxes will only carry 16 bits from the left and
`
`10
`
`

`

`A White-Box DES Implementation
`
`11
`
`16 bits from the right. This means the by-pass capacity of the r
`KTi’s is too small
`by 32 bits.
`So we add four more S-Boxes for each round, designated as r
`KT9, . . . ,rK T12.
`
`Each is a bijective at of 8 bits to 8 bits. These extra S-Boxes are at’s to make it
`easier to access the bypassed bits for subsequent processing. (Subsequent steps
`will de-linearize every S-Box, so use of ats for these by-pass paths need not
`compromise security.) These new S-Boxes provide the remaining 32 bits: 16 bits
`of right-side by-pass capacity, and 16 bits of left-side by-pass capacity.
`
`5.2 Connecting and Encoding the New SBs to Implement DES
`
`The overall data-flow structure of our des implementation immediately prior
`to de-linearization of ats and encoding of S-Boxes (see §3.1, §3.2), is shown in
`Figure 1(b). It would look just the same after de-linearization and encoding,
`except that each Mi would be replaced by a corresponding M(cid:48)
`i and each r
`KTi
`KT(cid:48)
`i. Except for the addition of these “ (cid:48) ”
`would be replaced by a corresponding r
`characters, the figure would be identical.
`
`Data-Flow and Algorithm Before de-linearization and encoding, each Mi
`96M3, respectively.
`96M2, and 64
`64M1, 96
`is representable as a matrix, with forms 96
`(We briefly discuss the role of each one in §5.1 and in more detail below in The
`Transfer Functions.)
`In Figure 1(b), italic numbers such as 8 and 64 denote the length of the
`vectors traversing the data path to their left. Arrows represent data-paths and
`indicate their direction of data-flow.
`KTi’s in order by i in Figure 1(b) does not indicate
`The appearance of rows of r
`any ordering of their appearance in the implementation: the intervening M2
`transformations can handle any such re-ordering.
`
`The Transfer Functions In constructing M1, M2’s, and M3, we must deal
`with the sparseness of the matrices for the ats used in standard des. The bit-
`reorganizations, such as the Expansion and P-box transforms appearing in
`Figure 1(a), are all 0-bits except for one or two 1-bits in each row and column.
`The xor operations (“⊕” in Figure 1(a)) are similarly sparse.
`Therefore, we use the second method proposed for handling sparseness in §4:
`doubling the implementations into two blocked implementations, with the initial
`portion of each pair being a mixing bijection. We will regard this as part of
`the encoding process, and discuss the nature of the Mi’s prior to this ‘anti-
`sparseness’ treatment.
`The following constructions all involve only various combinations, compo-
`sitions, simple reorganizations, and concatenations of ats, and are therefore
`straightforward.
`M1 combines the following: (1) the initial permutation of des, (2) the Ex-
`pansion (see Figure 1(a)), modified to deliver its output bits to the first six in-
`KTi, combined with (3) the delivery of the 32 left-side data-path bits
`puts of each r
`
`11
`
`

`

`12
`
`S. Chow, P. Eisen, H. Johnson, P.C. van Oorschot
`
`to be passed through the by-pass provided by inputs 7 and 8 of r
`KT1, . . . ,r
`K T8
`and 16 bits of by-pass provided at randomly chosen positions in the four ‘dum-
`mies’, r
`K T12, all in randomly chosen order.
`KT9, . . . ,r
`M2 for each round combines the following: (1) the P-box transform (see
`Figure 1(a)), (2) the xor of the left-side data with the P-box output, (3) ex-
`traction of the original input of the right-side data-path, (4) the Expansion, as
`in M1, (5) the left-side by-pass, as in M1.
`M3 combines the following: (1) ignoring the inputs provided for simultaneous
`by-pass, (2) the left-side by-pass, as in M1, (3) inversion of the Expansion,
`ignoring half of each redundant bit pair, (4) swapping the left-side and right-
`side data (des effectively swaps the left and right halves after the last round),
`and (5) the final permutation.
`
`Blocking and Encoding Details We recommend using 4× 4 blocking for the
`Mi’s. As a result of the optimization noted in §4, this means that the entire
`implementation consists entirely of networked 8 × 4 (‘vector add’) and 8 × 8
`KT(cid:48)
`(r
`i) S-Boxes.
`Aside from M1’s input coding and M3’s output coding, both of which are
`64I (appropriately blocked), all S-Boxes are input- and output-coded
`simply 64
`using the method of §3.1 in order to match the 4-bit blocking factor required for
`each input by the binary ‘vector add’ S-Boxes.
`
`5.3 Recommended Variants
`
`The above section completes a naked variant of white-box des. The recommended
`variant applies input and output encodings to the whole des operations. That is,
`we modify the scheme shown in Figure 1(b), so that M1 is replaced by M1 ◦ M0
`and M3 is replaced by M4◦M3, where the M0 and M4 ats are mixing bijections.
`Each of M1 ◦ M0 and M4 ◦ M3 is, of course, a single at. When it is encoded in
`4-bit blocks, it becomes non-linear.
`One issue that arises is whether this recommended variant of des (or other ci-
`phers) is still an implementation of the standard algorithm. Although it employs
`an encoded input and output, we can pre- and post-process the input to this
`computation by the inverses of the pre- and post-encodings, to effectively cancel
`both. One might refer to this as operating on de-encoded intext and outtext. The
`de-encoding process can be done in any one or a combination of several places,
`for example: the software immediately surrounding the cryptographic computa-
`tion; more distant surrounding software; or even software executing on a separate
`node (with obvious coordination required). T

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket