`
`Rakesh Agrawal
`
`Jerry Kiernan
`
`Ramakrishnan Srikant
`
`Yirong Xu
`
`IBM Almaden Research Center
`650 Harry Road, San Jose, CA 95120
`
`ABSTRACT
`Encryption is a well established technology for protecting sensi-
`tive data. However, once encrypted, data can no longer be easily
`queried aside from exact matches. We present an order-preserving
`encryption scheme for numeric data that allows any comparison op-
`eration to be directly applied on encrypted data. Query results pro-
`duced are sound (no false hits) and complete (no false drops). Our
`scheme handles updates gracefully and new values can be added
`without requiring changes in the encryption of other values. It al-
`lows standard database indexes to be built over encrypted tables
`and can easily be integrated with existing database systems. The
`proposed scheme has been designed to be deployed in application
`environments in which the intruder can get access to the encrypted
`database, but does not have prior domain information such as the
`distribution of values and cannot encrypt or decrypt arbitrary val-
`ues of his choice. The encryption is robust against estimation of the
`true value in such environments.
`
1. INTRODUCTION
`Database systems typically offer access control as the means to
`restrict access to sensitive data. This mechanism protects the pri-
`vacy of sensitive information provided data is accessed using the in-
`tended database system interfaces. However, access control, while
`important and necessary, is often insufficient. Attacks upon com-
`puter systems have shown that information can be compromised
`if an unauthorized user simply gains access to the raw database
`files, bypassing the database access control mechanism altogether.
`For instance, a recent article published in the Toronto Star [14] de-
`scribes an incident where a disk containing the records of several
`hundred bank customers was being auctioned on eBay. The bank
`had inadvertently sold the disk to the eBay re-seller as used equip-
`ment without deleting its contents. Drawing upon privacy legisla-
`tions and guidelines worldwide, Hippocratic databases also identify
`the protection of personal data from unauthorized acquisition as a
`vital requirement [1].
`
`
`Encryption is a well established technology for protecting sensi-
`tive data [7] [22] [24]. Unfortunately, the integration of existing
`encryption techniques with database systems causes undesirable
`performance degradation. For example, if a column of a table con-
`taining sensitive information is encrypted, and is used in a query
`predicate with a comparison operator, an entire table scan would
`be needed to evaluate the query. The reason is that the current en-
`cryption techniques do not preserve order and therefore database
`indices such as B-tree can no longer be used. Thus the query exe-
`cution over encrypted databases can become unacceptably slow.
`We present an encryption technique called OPES (Order Preserv-
`ing Encryption Scheme) that allows comparison operations to be
`directly applied on encrypted data, without decrypting the operands.
`Thus, equality and range queries as well as the MAX, MIN, and
`COUNT queries can be directly processed over encrypted data.
`Similarly, GROUP BY and ORDER BY operations can also be ap-
`plied. Only when applying SUM or AVG to a group do the values
`need to be decrypted. OPES is also endowed with the following
`properties:
• The results of query processing over data encrypted using
`OPES are exact. They neither contain any false positives nor
`miss any answer tuple. This feature of OPES sharply differ-
`entiates it from schemes such as [13] that produce a superset
`of answer, necessitating filtering of extraneous tuples in a
`rather expensive and complex post-processing step.
`
• OPES handles updates gracefully. A value in a column can
`be modified or a new value can be inserted in a column with-
`out requiring changes in the encryption of other values.
`
• OPES can easily be integrated with existing database sys-
`tems as it has been designed to work with the existing index-
`ing structures such as B-trees. The fact that the database is
`encrypted can be made transparent to the applications.
`
`Measurements from an implementation of OPES in DB2 show that
`the time and space overhead of OPES are reasonable for it to be
`deployed in real systems.
`1.1 Estimation Exposure
`The security of an encryption scheme is conventionally assessed
`by analyzing whether an adversary can find the key used for en-
`cryption. See [22] [24] for a categorization of different levels of
`attacks against a cryptosystem.
`When dealing with sensitive numeric data, an adversary does
`not have to determine the exact data value p corresponding to an
`
encrypted value c; a breach may occur if the adversary succeeds in obtaining a tight estimate of p. For a numeric domain P, if an adversary can estimate with c% confidence that a data value p lies within the interval $[p_1, p_2]$, then the interval width $(p_2 - p_1)/\text{domain-width}(P)$ defines the amount of estimation exposure at c% confidence level.

Clearly, any order-preserving encryption scheme is vulnerable to tight estimation exposure if the adversary can choose any number of unencrypted (encrypted) values of his liking and encrypt (decrypt) them into their corresponding encrypted (unencrypted) values. Similarly, any order-preserving encryption is not secure against tight estimation exposure if the adversary can guess the domain and knows the distribution of values in that domain.

We consider an application environment where the goal is safety from an adversary who has access to all (but only) encrypted values (the so called ciphertext-only attack [22] [24]), and does not have any special information about the domain. We will particularly focus on robustness against estimation exposure.

[Figure 1: Transparent encryption in the “trusted database software with vulnerable storage” setting.]

1.2 Threat Model
We assume (see Figure 1):

• The storage system used by the database software is vulnerable to compromise. While current database systems typically perform their own storage management, the storage system remains part of the operating system. Attacks against storage could be performed by accessing database files following a path other than through the database software, or in the extreme, by physical removal of the storage media.

• The database software is trusted. We trust the database software to transform query constants into their encrypted values and decrypt the query results. Similarly, we assume that an adversary does not have access to the values in the memory of the database software.

• All disk-resident data is encrypted. In addition to the data values, the database software also encrypts schema information such as table and column names, metadata such as column statistics, as well as values written to recovery logs. Otherwise, an adversary may be able to use this information to guess data distributions.
`
`1.3 Pedagogical Assumptions and Notations
`The focus of this paper is on developing order-preserving en-
`cryption techniques for numeric values and assumes conventional
`encryption [22] [24] for other data types as well as for encrypting
information such as schema names and metadata. We will sometimes refer to unencrypted data values as plaintext. Similarly, encrypted values will also be referred to as ciphertext.

We will assume that the database consists of a single table, which in turn consists of a single column. The domain of the column will be initially assumed to be a subset of integer values $[p_{min}, p_{max}]$. The extension for real values is given later in the paper.

Assume the database $\bar{P}$ consists of a total of $|\bar{P}|$ plaintext values. Out of these, $|P|$ values are unique, which will be represented as $P = \{p_1, p_2, \ldots, p_{|P|}\}$, $p_i < p_{i+1}$. The corresponding encrypted values will be represented as $C = \{c_1, c_2, \ldots, c_{|P|}\}$, $c_i < c_{i+1}$.
`Duplicates can sometimes be used to guess the distribution of a
`domain, particularly if the distribution is highly skewed. A closely
`related problem is that if the number of distinct values is small (e.g.,
`day of the month), it is easy to guess the domain. We will ini-
`tially assume that the domain to be encrypted either does not con-
`tain many duplicates or contains a distribution that can withstand a
`duplicate attack, and discuss the handling of duplicates later in the
`paper.
`1.4 Paper Layout
`The rest of the paper is organized as follows. We first discuss re-
`lated work in Section 2. We give an overview of OPES in Section 3.
`The next three sections give details of the three main phases of
`OPES. We describe extensions to handle real values and duplicates
`in Section 7. In Section 8, we study the quality of the encryption
`produced by OPES and present performance measurements from a
`DB2 implementation. We conclude with a summary and directions
`for future work in Section 9.
`
`2. RELATED WORK
`
Summation of Random Numbers
A simple scheme has been proposed in [3] that computes the encrypted value c of integer p as

$c = \sum_{j=1}^{p} R_j,$

where $R_j$ is the jth value generated by a secure pseudo-random number generator R. Unfortunately, the cost of making p calls to R for encrypting or decrypting c can be prohibitive for large values of p.
`A more serious problem is the vulnerability to estimation ex-
`posure. Since the expected gap between two encrypted values is
`proportional to the gap between the corresponding plaintext val-
`ues, the nature of the plaintext distribution can be inferred from the
`encrypted values. Figure 2 shows the distributions of encrypted val-
`ues obtained using this scheme for data values sampled from two
`different distributions: Uniform and Gaussian. In each case, once
`both the input and encrypted distributions are scaled to be between
`0 and 1, the number of points in each bucket is almost identical for
`the plaintext and encrypted distributions. Thus the percentile of a
`point in the encrypted distribution is also identical to its percentile
`in the plaintext distribution.
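To make the encryption and decryption cost concrete, here is a minimal sketch of the summation scheme, with Python's seeded random.Random standing in for the secure generator R (the function names and the increment range are our illustration, not taken from [3]):

```python
import random

def encrypt_sum(p: int, key: int) -> int:
    """c = R_1 + R_2 + ... + R_p; order-preserving because every R_j > 0."""
    R = random.Random(key)  # stand-in for a secure pseudo-random generator
    return sum(R.randint(1, 100) for _ in range(p))

def decrypt_sum(c: int, key: int) -> int:
    """Replay the generator, counting draws until the running sum reaches c."""
    R = random.Random(key)
    total, p = 0, 0
    while total < c:
        total += R.randint(1, 100)
        p += 1
    return p
```

Both directions make p calls to the generator, which is exactly the prohibitive cost noted above.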
`
[Figure 2: Summation of random numbers: Distribution of encrypted values tracks the input distribution. Panels: (a) Input: Uniform; (b) Input: Gaussian. Axes: Scaled Domain vs. Number of points; each panel shows the Original and Encrypted distributions.]

[Figure 3: Polynomial functions: Encryption of different input distributions look different. Panels: (a) Input: Uniform; (b) Input: Gaussian. Axes: Scaled Domain vs. Number of points; each panel shows the Original and Encrypted distributions.]
`
`Polynomial Functions
`In [12], a sequence of strictly increasing
`polynomial functions is used for encrypting integer values while
`preserving their order. These polynomial functions can simply be
`of the first or second order, with coefficients generated from the en-
`cryption key. An integer value is encrypted by applying the func-
`tions in such a way that the output of a function becomes the input
`of the next function. Correspondingly, an encrypted value is de-
`crypted by solving these functions in reverse order. However, this
`encryption method does not take the input distribution into account.
`Therefore the shape of the distribution of encrypted values depends
`on the shape of the input distribution, as shown in Figure 3 for the
`encryption function given in Example 10 in [12]. This illustration
`suggests that this scheme may reveal information about the input
`distribution, which can be exploited.
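The composition of increasing polynomials can be sketched as follows (a toy illustration of the idea, not the actual functions of [12]; the coefficient pairs stand in for key material):

```python
def poly_encrypt(p: int, key: list[tuple[int, int]]) -> int:
    """Chain strictly increasing polynomials: the output of one becomes
    the input of the next. a*x^2 + b*x is increasing for x >= 0 when
    a >= 0 and b > 0, so order is preserved end to end."""
    x = p
    for a, b in key:
        x = a * x * x + b * x
    return x

# Example: two stages derived from the key; order is preserved.
assert poly_encrypt(3, [(1, 2), (0, 5)]) < poly_encrypt(4, [(1, 2), (0, 5)])
```

Decryption solves the same polynomials in reverse order, as described above.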
`
`Bucketing
`In [13], tuples are encrypted using conventional en-
`cryption, but an additional bucket id is created for each attribute
`value. This bucket id, which represents the partition to which the
`unencrypted value belongs, can be indexed. The constants ap-
`pearing in a query are replaced by their corresponding bucket ids.
`Clearly, the result of a query will contain false hits that must be
`removed in a post-processing step after decrypting the tuples re-
`turned by the query. This filtering can be quite complex since the
`bucket ids may have been used in joins, subqueries, etc. The num-
`ber of false hits depends on the width of the partitions involved. It is
`shown in [13] that the post-processing overhead can become exces-
`sive if a coarse partitioning is used for bucketization. On the other
`hand, a fine partitioning makes the scheme vulnerable to estimation
`exposure, particularly if an equi-width partitioning is used.
`It has been pointed out in [6] that the indexes proposed in [13]
can open the door to inference and linking attacks. Instead, they
`
`build a B-tree over plaintext values, but then encrypt every tuple
`and the B-tree at the node level using conventional encryption. The
`advantage of this approach is that the content of B-tree is not visible
to an untrusted database server. The disadvantage is that the B-tree traversal can now be performed by the front end only by executing a sequence of queries that retrieve tree nodes at progressively deeper levels.
`
`Other Relevant Work Rivest et al. [21] suggest that the limit on
`manipulating encrypted data arises from the choice of encryption
`functions used, and there exist encryption functions that permit en-
`crypted data to be operated on directly for many sets of interesting
`operations. They call these functions “privacy homomorphisms”.
`The focus of [21] and the subsequent follow-up work [2] [8] [9]
`has been on designing privacy homomorphisms to enable arith-
`metic on encrypted data, but the comparison operations were not
`investigated in this line of research.
`In [10], a simple but effective scheme has been proposed to en-
`crypt a look-up directory consisting of (key, value) pairs. The goal
`is to allow the corresponding value to be retrieved if and only if
`a valid key is provided. The essential idea is to encrypt complete
`tuples, but associate with every tuple the one-way hash value of its
`key. Thus, no tuple will be retrieved if an invalid key is presented.
`Answering range queries was not a goal of this system.
`In [23], interesting schemes are proposed to support keyword
`searches over an encrypted text repository. The driving application
`for this work is the efficient retrieval of encrypted email messages.
`Naturally, they do not discuss relational queries and it is not clear
`how their techniques can be adapted for relational databases.
`In [4], a smart card with encryption and query processing ca-
`pabilities is used to ensure the authorized and secure retrieval of
`encrypted data stored on untrusted servers. Encryption keys are
`
maintained on the smart card. The smart card can translate exact match queries into equivalent queries over encrypted data. However, range queries require creating a disjunction for every possible value in the range, which is infeasible for real data values. The smart card implementation could benefit from our encryption scheme in that range queries could be translated into equivalent queries over encrypted data.

In [25], the security and tamper resistance of a database stored on a smart card is explored. They consider snooping attacks for secrecy, and spoofing, splicing, and replay attacks for tamper resistance. Retrieval performance is not the focus of their work and it is not clear how much of their techniques apply to general purpose databases not stored in specialized devices.

Amongst commercial database products, Oracle 8i allows values in any of the columns of a table to be encrypted [18]. However, the encrypted column can no longer participate in indexing as the encryption is not order-preserving.

Related work also includes research on order-preserving hashing [5] [11]. However, protecting the hash values from cryptanalysis is not the concern of this body of work. Similarly, the construction of original values from the hash values is not required.

3. PROPOSED ORDER-PRESERVING ENCRYPTION SCHEME

The basic idea of OPES is to take as input a user-provided target distribution and transform the plaintext values in such a way that the transformation preserves the order while the transformed values follow the target distribution. Figure 4 shows the result of
`running OPES with different input distributions and the same target
`distribution. Notice that the distribution of encrypted values looks
`identical in both 4(a) and 4(b), even though the input distributions
`were very different.
`3.1 Intuition
To understand the intuition behind the OPES algorithm, consider the
`following encryption scheme:
Generate $|P|$ unique values from a user-specified target distribution and sort them into a table T. The encrypted value $c_i$ of $p_i$ is then given by $c_i = T[i]$. That is, the ith plaintext value in the sorted list of $|P|$ plaintext values is encrypted into the ith value in the sorted list of $|P|$ values obtained from the target distribution. The decryption of $c_i$ requires a lookup into a reverse map. Here T is the encryption key that must be kept secret.
Clearly, this scheme does not reveal any information about the original values apart from the order, since the encrypted values were generated solely from the user-specified target distribution, without using any information from the original distribution. Even if an adversary has all of the encrypted values, he cannot infer T from those values. By appropriately choosing the target distribution, the adversary can be forced to make large estimation errors.
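A minimal sketch of this lookup-table scheme, under the assumption that only values present in P are ever encrypted (the helper names are ours):

```python
import bisect

def build_key(plaintexts, target_samples):
    """The secret key: sorted unique plaintexts P and a sorted table T of
    |P| unique values drawn from the target distribution."""
    P = sorted(set(plaintexts))
    T = sorted(set(target_samples))
    assert len(T) >= len(P), "need at least |P| unique target values"
    return P, T[:len(P)]

def encrypt(p, key):
    P, T = key
    return T[bisect.bisect_left(P, p)]  # c_i = T[i]

def decrypt(c, key):
    P, T = key
    return P[bisect.bisect_left(T, c)]  # lookup into the reverse map
```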
`This simple scheme, while instructive, has the following short-
`comings for it to be used for encrypting large databases:
`
• The size of the encryption key is twice as large as the number of unique values in the database.
`
• Updates are problematic. When adding a new value p, where $p_i < p < p_{i+1}$, we will need to re-encrypt all $p_j$, $j > i$.1
`
`OPES has been designed such that the result of encryption is sta-
`tistically indistinguishable from the one obtained using the above
`scheme, thereby providing the same level of security, while remov-
`ing its shortcomings.
`3.2 Overview of OPES
When encrypting a given database P, OPES makes use of all the plaintext values currently present in P,2 and also uses a database
`of sampled values from the target distribution. Only the encrypted
`database C is stored on disk. At the same time, OPES also creates
`some auxiliary information K, which the database system uses to
`decrypt encoded values or encrypt new values. Thus K serves the
`function of the encryption key. This auxiliary information is kept
`encrypted using conventional encryption techniques.
OPES works in three stages:
1. Model: The input and target distributions are modeled as piece-wise linear splines.
2. Flatten: The plaintext database P is transformed into a “flat” database F such that the values in F are uniformly distributed.
3. Transform: The flat database F is transformed into the cipher database C such that the values in C are distributed according to the target distribution.

Note that

$p_i < p_j \Leftrightarrow f_i < f_j \Leftrightarrow c_i < c_j.$

We give details of the three stages in Sections 4, 5 and 6 respectively.

1It is possible to avoid immediate re-encryption by choosing an encrypted value for p in the interval $(c_i, c_{i+1})$, but T would still need updating. Moreover, there might be cases where $c_{i+1} = c_i + 1$ and therefore inserting a new value will require re-encryption of existing values. Note that the encryption scheme $c_i = T[p_i]$ circumvents the update problem. But now the size of the key becomes the size of the domain. It is also vulnerable to percentile exposure, as discussed earlier in Section 2.
2If an installation is creating a new database, the database administrator can provide a sample of expected values.
`
[Figure 4: Illustrating OPES. Panels: (a) Input: Uniform, Target: Zipf; (b) Input: Gaussian, Target: Zipf. Axes: Scaled Domain vs. Number of points; each panel shows the Original, Target, and Encrypted distributions.]
`
`4. MODELING THE DISTRIBUTIONS
`Techniques for modeling data distributions have been studied ex-
`tensively in the database literature in the context of estimating the
`costs of different query execution plans. As stated in [16], there are
`two broad categories of techniques: histogram-based that capture
`statistical information about a distribution by means of counters
`for a specified number of buckets, and parametric that approximate
`a distribution by fitting the parameters of a given type of function.
`We experimented with several histogram-based techniques [15], in-
`cluding equi-depth, equi-width, and wavelet-based methods, but
`found that the flattened values obtained were not uniformly dis-
`tributed unless the number of buckets was selected to be unreason-
`ably large. The main source of the problem was the assumption that
the distribution is uniform within each bucket. Parametric techniques are
`suitable for closed-form distributions but lead to poor estimations
`for irregular distributions [16], which we expect to be the norm in
`our application.
`We, therefore, resorted to a combination of histogram-based and
`parametric techniques. As in [16], we first partition the data values
`into buckets and then model the distribution within each bucket as
a linear spline. The spline for a bucket $[p_l, p_h)$ is simply the line
`connecting the densities at the two end-points of the bucket.3
`We also allow the width of value ranges to vary across buck-
`ets. However, unlike [16], we do not have a given fixed number
`of buckets. Rather, we use the minimum description length (MDL)
`principle [20] to determine the number of buckets.
`
`4.1 Bucket Boundaries
`The bucket boundaries are determined in two phases:4
`
`1. Growth phase. The space is recursively split into finer par-
`titions. Each partitioning of a bucket reduces the maximum
`deviation from the density function within the newly formed
`buckets when compared to their parent bucket.
`
`2. Prune phase. Some buckets are pruned (merged into big-
`ger buckets). The idea is to minimize the number of buckets
`and yet have the values within buckets after mapping be uni-
`formly distributed. We use the MDL principle to obtain this
`balance.
`
`The details of these two phases are discussed next.
`
`3In [16], the splines are not continuous across buckets; they use lin-
`ear regression over data values present in a bucket for determining
`the spline. However, such discontinuities may cause undesirable
`breaks in the uniformity when we flatten plaintext values.
`4This procedure is reminiscent of the procedure for building deci-
`sion tree classifiers, and in particular SLIQ [17], but the details are
`quite different.
`
`4.2 Growth Phase
We are given a bucket $[p_l, p_h)$, with $h - l - 1$ (sorted) points: $\{p_{l+1}, p_{l+2}, \ldots, p_{h-1}\}$. We first find the linear spline for this bucket. Next, for each point $p_s$ in the bucket, we compute its expected value if the points were distributed according to the density distribution modeled by the linear spline (i.e., the expected value of the $(s-l)$th smallest value in a set of $h - l - 1$ random values drawn from the distribution). We then split the bucket at the
`point that has the largest deviation from its expected value (break-
`ing ties arbitrarily). We stop splitting when the number of points in
`a bucket is below some threshold, say, 10.
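A sketch of the growth phase is given below. For brevity it computes expected positions under a uniform density inside the bucket, where the algorithm above uses the fitted linear spline; the recursion and the split-at-largest-deviation rule are as described.

```python
def grow(points, threshold=10):
    """Recursively split a sorted bucket at the interior point with the
    largest deviation from its expected position, stopping when a bucket
    has fewer than `threshold` points."""
    n = len(points)
    if n < threshold:
        return [points]
    lo, hi = points[0], points[-1]
    # expected value of the k-th interior point under a uniform density
    def deviation(k):
        return abs(points[k] - (lo + k * (hi - lo) / (n - 1)))
    s = max(range(1, n - 1), key=deviation)  # split index, ties arbitrary
    return grow(points[:s + 1], threshold) + grow(points[s:], threshold)
```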
`
`4.3 Prune Phase
`The MDL principle [20] states that the best model for encoding
`data is the one that minimizes the sum of the cost of describing the
`model and the cost of describing the data in terms of that model.
For a given bucket $[p_l, p_h)$, the local benefit LB of splitting this bucket at a point $p_s$ is given by

$LB(p_l, p_h) = DataCost(p_l, p_h) - DataCost(p_l, p_s) - DataCost(p_s, p_h) - IncrModelCost$

where $DataCost(p_1, p_2)$ gives the cost of describing the data in the interval $[p_1, p_2)$ and IncrModelCost is the increase in modeling cost due to the partitioning of a bucket into two buckets.
The global benefit GB of splitting this bucket at $p_s$ takes into account the benefit of further recursive splits:

$GB(p_l, p_h) = LB(p_l, p_h) + GB(p_l, p_s) + GB(p_s, p_h).$

If $GB > 0$, the split is retained; otherwise, the split at $p_s$ and all recursive splits within $[p_l, p_h)$ are pruned. Note that we do this computation bottom up, and therefore the cost is linear in the number of splits.5
`We now provide the functions for the computation of DataCost
`and IncrModelCost. Assume momentarily the existence of a map-
`ping M that transforms values sampled from a linear density func-
`tion into a set of uniformly distributed values. We specify M in
`the next section. As we shall see, M will have two parameters: a
`quadratic coefficient and a scale factor.
`
`4.3.1 DataCost
`We want to flatten a given data distribution into a uniform distri-
`bution. So, given a bucket, we first flatten the values present in the
`bucket using the mapping M , and then compute the cost of encod-
`ing the deviations from uniformity for the mapped values.6
Let the set of data values $\{p_l, p_{l+1}, \ldots, p_{h-1}\}$ be mapped into $\{f_l, f_{l+1}, \ldots, f_{h-1}\}$ using M. The encoding of a value $p_i$
in $[p_l, p_h)$ would cost

$Cost(p_i) = \log |f_i - E_i|,$

where $E_i$, the expected value of the ith number assuming uniformity, is given by

$E_i = f_l + \frac{i - l}{h - l} (f_h - f_l).$

The cost of encoding all the values in the interval $[p_l, p_h)$ is then given by

$DataCost(p_l, p_h) = \sum_{i=l+1}^{h-1} Cost(p_i).$

5One might wonder why we did not combine pruning with the growth phase and stop splitting a bucket as soon as the local benefit became zero or negative. The reason is that the benefit of partitioning may start showing only at a finer granularity, and it will often be the case that the local benefit is less than zero even though the global benefit is much greater than zero.
6Note that our implementation encodes only the statistically significant deviations to avoid overfitting, i.e., rather than a single expected value, we consider the range of values that would occur with a uniform distribution, and only encode values that are outside this range. We omit this detail for brevity.

4.3.2 IncrModelCost
If we have m buckets, we need to store m + 1 boundaries, m quadratic coefficients, and m scale factors. Thus the model cost will be 32(3m + 1) bits, assuming 32 bits for each of these values. More importantly, the cost of an additional bucket is

$IncrModelCost = 3 \times 32 = 96.$
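A sketch of these cost computations (the helper names, the base-2 logarithm, and the +1 guard against log 0 are our choices):

```python
import math

INCR_MODEL_COST = 96  # one extra bucket: boundary, coefficient, scale factor

def data_cost(f):
    """Bits to encode deviations of the flattened values f[1..n-2] from
    uniformity between f[0] = f_l and f[-1] = f_h."""
    f_l, f_h, n = f[0], f[-1], len(f)
    cost = 0.0
    for i in range(1, n - 1):
        expected = f_l + i * (f_h - f_l) / (n - 1)   # E_i under uniformity
        cost += math.log2(abs(f[i] - expected) + 1)  # Cost(p_i)
    return cost

def local_benefit(f, s):
    """LB of splitting the flattened bucket f at interior index s."""
    return (data_cost(f) - data_cost(f[:s + 1])
            - data_cost(f[s:]) - INCR_MODEL_COST)
```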
`
`5. FLATTEN
`The overall idea of the flatten stage is to map a plaintext bucket B
into a bucket $B^f$ in the flattened space in such a way that the length of $B^f$ is proportional to the number of values present in B. Thus,
`the dense plaintext buckets will be stretched and the sparse buckets
`will be compressed. The values within a bucket are mapped in
`such a way that the density will be uniform in the flattened bucket.
`Since the densities are uniform both inter-bucket and intra-bucket,
`the values in the flattened database will be uniformly distributed.
`We specify next a mapping function that accomplishes these goals.
`5.1 Mapping Function
`
OBSERVATION 1. If a distribution over $[0, p_h]$ has the density function $qp + r$, where $p \in [0, p_h]$, then for any constant $z > 0$, the mapping function

$M(p) = z\left(\frac{q}{2r} p^2 + p\right)$

will yield a uniformly distributed set of values.

This follows from the fact that the slope of the mapping function at any point p is proportional to the density at p:7

$\frac{dM}{dp} = \frac{z}{r}(qp + r) \propto qp + r.$

We will refer to $s := q/2r$ as the quadratic coefficient. Thus

$M(p) = z(sp^2 + p).$

A different scale factor z is used for different buckets, to make the inter-bucket density uniform as well. We describe next how the scale factors are computed.

7An equivalent way to think about this is that the space around p, say from $p - \delta$ to $p + \delta$, is mapped to a length of

$M(p + \delta) - M(p - \delta) = \frac{2\delta z}{r}(qp + r) \propto qp + r.$
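A direct transcription of the bucket mapping and its inverse (the inverse simply solves the quadratic $z(sp^2 + p) = f$ for the non-negative root; the function names are ours):

```python
import math

def M(p: float, s: float, z: float) -> float:
    """Flatten a within-bucket offset p: M(p) = z*(s*p**2 + p)."""
    return z * (s * p * p + p)

def M_inv(f: float, s: float, z: float) -> float:
    """Unflatten: solve z*(s*p**2 + p) = f for the non-negative root."""
    if s == 0:
        return f / z  # linear case: density is constant in the bucket
    return (math.sqrt(1 + 4 * s * f / z) - 1) / (2 * s)
```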
`
5.2 Scale Factor
We need to find the scale factor z, one for each bucket B, such that:
1. Two distinct values in the plaintext will always map to two distinct values in the flattened space, thereby ensuring incremental updatability.
2. Each bucket is mapped to a space proportional to the number of points n in that bucket, i.e., if w is the width of the bucket and $w^f = M(w)$ is the width after flattening, then $w^f \propto n$.

The first constraint can be written as:

$\forall p \in [0, w): \; M(p + 1) - M(p) \geq 2.$

The 2 in the RHS (instead of 1) ensures two adjacent plaintext values will be at least 2 apart in the flattened space. As we will explain in Section 5.5, this extra separation makes encryption tolerant to rounding errors in floating point calculations. Expanding M, we get

$\forall p \in [0, w): \; z \geq \frac{2}{s(2p + 1) + 1}.$

The largest value of $2/(s(2p + 1) + 1)$ will be at $p = 0$ if $s \geq 0$, and at $p = w - 1$ otherwise. Therefore we get

$\hat{z} = \frac{2}{1 + s}$ if $s \geq 0$; $\quad \hat{z} = \frac{2}{1 + s(2w - 1)}$ otherwise,

where $\hat{z}$ denotes the minimum value of z that will satisfy the first constraint.
To satisfy the second constraint, we want

$w^f = Kn$

for all the buckets. Define

$\hat{w}^f = \hat{z}(sw^2 + w)$

as the minimum width for each bucket, and define

$K = \max_i \lceil \hat{w}^f_i / n_i \rceil, \quad i = 1, \ldots, m.$

Then the scale factors

$z = \frac{Kn}{sw^2 + w}$

will satisfy both the desired constraints, since $z \geq \hat{z}$, and $w^f = z(sw^2 + w) = Kn$.

5.3 Encryption Key
Let us briefly review what we have at this stage. The modeling phase has yielded a set of buckets $\{B_1, \ldots, B_m\}$. For each bucket, we also have a mapping function M, characterized by two parameters: the quadratic coefficient s and the scale factor z. We save the m + 1 bucket boundaries, the m quadratic coefficients, and the m scale factors in a data structure $K_f$. The database system uses $K_f$ to flatten (encrypt) new plaintext values, and also to unflatten (decrypt) a flattened value. Thus $K_f$ serves the function of the encryption key.
Note that $K_f$ is computed once at the time of initial encryption of the database. As the database obtains new values, $K_f$ is used to encrypt them, but it is not updated, which endows OPES with the incremental updatability property.8
8In order to encrypt stray values outside of the current
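Line by line, the scale-factor computation might look as follows (representing each bucket as a dict with width w, quadratic coefficient s, and point count n is our assumption):

```python
import math

def scale_factors(buckets):
    """buckets: list of dicts with width 'w', quadratic coefficient 's',
    and number of points 'n'. Adds the scale factor 'z' to each bucket."""
    for b in buckets:
        s, w = b["s"], b["w"]
        # minimum z ensuring M(p+1) - M(p) >= 2 for all p in [0, w)
        z_hat = 2 / (1 + s) if s >= 0 else 2 / (1 + s * (2 * w - 1))
        b["w_hat"] = z_hat * (s * w * w + w)   # minimum flattened width
    K = max(math.ceil(b["w_hat"] / b["n"]) for b in buckets)
    for b in buckets:
        b["z"] = K * b["n"] / (b["s"] * b["w"] ** 2 + b["w"])  # w^f = K*n
    return buckets
```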
`
`5.4 Mapping a Plaintext Value into a Flat Value
Represent the domains of the input database P and the flat database F as $[p_{min}, p_{max}]$ and $[f_{min}, f_{max}]$ respectively. Note that

$f_{max} = f_{min} + \sum_{i=1}^{m} w^f_i,$

where $w^f_i = M_i(w_i)$. Recall that $w_i$ is the length of plaintext bucket $B_i$, and $w^f_i$ the length of the corresponding flat bucket.
To flatten a plaintext value p, we first determine the bucket $B_i$ into which p falls, using the information about the bucket boundaries saved in $K_f$. Now p is mapped into the flat value f using the equation:

$f = f_{min} + \sum_{j=1}^{i-1} w^f_j + M_i(p - p^l_i),$

where $p^l_i$ denotes the left boundary of bucket $B_i$.

[Figure 5: Scaling the target distribution. The figure depicts three steps: 1. Compute the buckets used to transform the plaintext distribution into the flattened distribution. 2. From the target distribution, compute the buckets for the flattened distribution $\hat{B}^f$. 3. Scale the buckets of the flattened target distribution to equal the range of the flattened distribution computed in Step 1, and scale the target distribution proportionately.]
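Putting the pieces together, a sketch of flattening a single value; the layout of $K_f$ as parallel arrays of bucket boundaries, quadratic coefficients, scale factors, and flat widths is our assumption:

```python
import bisect

def flatten(p, Kf, f_min=0.0):
    """Map plaintext p to its flat value: sum the flat widths of all
    earlier buckets, then apply this bucket's mapping M_i to the offset
    of p from the bucket's left boundary."""
    i = bisect.bisect_right(Kf["boundaries"], p) - 1  # bucket containing p
    offset = p - Kf["boundaries"][i]                  # p minus left boundary
    s, z = Kf["s"][i], Kf["z"][i]
    return f_min + sum(Kf["wf"][:i]) + z * (s * offset * offset + offset)
```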
`
`6. TRANSFORM
`The transform stage is almost a mirror image of the flatten stage.
`Given a uniformly distributed set of flattened values, we want to
`map them into the target distribution. An equivalent way of think-
`ing about the problem is that we want to flatten the target distribu-
`tion into a uniform distribution, while ensuring that the distribution
`so obtained “lines-up” with the uniform distribution yielded by flat-
`tening the plaintext distribution.
`Figure 5 portrays this process. We already have on hand the
`buckets for the plain text distribution (Step 1). We bucketize the
`target distribution, independent of the bucketization of the plain-
`text distribution (Step 2). We then scale the target distribution (and
`the flattened target distribution) in such a way that the width of th