throbber
Relational Languages for Metadata
`Integration
`
`CATHARINE M. WYSS and EDWARD L. ROBERTSON
`Indiana University
`
`In this article, we develop a relational algebra for metadata integration, Federated Interoperable
`Relational Algebra (FIRA). FIRA has many desirable properties such as compositionality, closure, a
`deterministic semantics, a modest complexity, support for nested queries, a subalgebra equivalent
`to canonical Relational Algebra (RA), and robustness under certain classes of schema evolution.
`Beyond this, FIRA queries are capable of producing fully dynamic output schemas, where the
`number of relations and/or the number of columns in relations of the output varies dynamically
`with the input instance. Among existing query languages for relational metadata integration, only
`FIRA provides generalized dynamic output schemas, where the values in any (fixed) number of
`input columns can determine output schemas.
`Further contributions of this article include development of an extended relational model for
`metadata integration, the Federated Relational Data Model, which is strictly downward compatible
`with the relational model. Additionally, we define the notion of Transformational Completeness
`for relational query languages and postulate FIRA as a canonical transformationally complete
`language. We also give a declarative, SQL-like query language that is equivalent to FIRA, called
`Federated Interoperable Structured Query Language (FISQL).
`While our main contributions are conceptual, the federated model, FISQL/FIRA, and the notion
`of transformational completeness nevertheless have important applications to data integration and
`OLAP. In addition to summarizing these applications, we illustrate the use of FIRA to optimize
`FISQL queries using rule-based transformations that directly parallel their canonical relational
`counterparts. We conclude the article with an extended discussion of related work as well as an
`indication of current and future work on FISQL/FIRA.
`Categories and Subject Descriptors: H.2.1 [Database Management]: Logical Design—Data mod-
`els; H.2.3 [Database Management]: Languages—Query languages
`General Terms: Languages
`Additional Key Words and Phrases: Data integration, federated databases, federated data model,
`interoperability, metadata integration, metadata querying, multidatabases, relational query alge-
`bra, schema integration, transformational completeness
`
`The work of E. L. Robertson was supported in part by National Science foundation (NSF) grant
`IIS-82407.
`Authors’ address: Indiana University, Computer Science Department, Bloomington, IN 47405;
`email: {cmw,edrbtsn}@cs.indiana.edu.
`Permission to make digital or hard copies of part or all of this work for personal or classroom use is
`granted without fee provided that copies are not made or distributed for profit or direct commercial
`advantage and that copies show this notice on the first page or initial screen of a display along
`with the full citation. Copyrights for components of this work owned by others than ACM must be
`honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
`to redistribute to lists, or to use any component of this work in other works requires prior specific
`permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515
`Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or permissions@acm.org.
`C(cid:1) 2005 ACM 0362-5915/05/0600-0624 $5.00
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005, Pages 624–660.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 1 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`
`625
`
`1. INTRODUCTION
`In this article, we present a formal paradigm for transformationally com-
`plete relational metadata integration. Our paradigm consists of an extended
`relational algebra, Federated Interoperable Relational Algebra (FIRA). FIRA
`augments canonical Relational Algebra (RA) with the ability to query and re-
`structure metadata along with data directly within the relational model. Fur-
`thermore, FIRA has an equivalent SQL-like counterpart, called Federated In-
`teroperable Structured Query Language (FISQL). While our main contributions
`are conceptual, the FISQL/FIRA framework nevertheless has important appli-
`cations to data integration and OLAP, which we now summarize.
`The field of data integration has interested database researchers for decades.
`The dominant architectural model today is federated, where sources export
`(import) views to (from) a mediated schema. This requires both wrappers en-
`capsulating source data repositories as well as mapping functions giving the
`translation between source data and/or schemas and the mediated schema.
`Wrappers, mapping functions, and mediated schemas are crucial concepts in
`data integration, whether the overall network architecture is centralized, peer-
`to-peer, or hybrid [Halevy 2004].
`Our language, FIRA, can assist with the creation and maintenance of wrap-
`pers, mediating functions, and mediated schemas, especially in the case of
`relational data sources. FIRA is more robust under schema evolution than tra-
`ditional relational languages, as several examples throughout the paper illus-
`trate. Furthermore, one open problem that FIRA addresses in data integration
`is that of dynamically varying schemas, where target schemas may depend
`dynamically on source data. For instance, relational “pivot”-type operations re-
`quire certain data values to become attribute names. An example of this is the
`translation between Indianapolis and Chicago stores in our sales federation
`(Figure 1). FIRA extends contemporary solutions to this problem with more
`generality. For example, dynamic transformations are no longer restricted to
`a single column per query as in SchemaSQL [Gingras and Lakshmanan 1998;
`Lakshmanan et al. 2001]. Also, the federated data model underpinning FIRA
`provides for a clean extension of RA to federations without recourse to a two-
`tiered language as in MD-SQL [Rood et al. 1999] or MQL/MA [Wyss and Van
`Gucht 2001; Wyss et al. 2001]. A deeper discussion of related work appears in
`Section 6.
`Another important application area for FIRA is OnLine Analytical Process-
`ing (OLAP). Many relational transformations required in OLAP contain a “rela-
`tional transpose” similar to our τ operator (Section 3.8). However, the semantics
`of this operation within the constraints of the relational model have not been
`well understood. FIRA leads to important theoretical results concerning rela-
`tional transposition, which we report elsewhere [Robertson and Wyss 2004].
`Furthermore, we are working to extend FISQL with aggregation capabilities
`to better facilitate OLAP applications within the FISQL/FIRA framework.
`
`1.1 Motivating Example
`As an example motivating our languages, consider the relational tables shown
`in Figure 1. In Figure 1, sales data for Amart stores in three midwestern
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 2 of 37
`
`

`

`626
`
`•
`
`C. M. Wyss and E. L. Robertson
`
`Fig. 1. Average sales for Amart stores in three midwestern cities.
`
`Fig. 2. Using SQL to compare Amart data.
`
`cities is shown. Each city uses a different relational representation of their
`sales data. The goal is to facilitate posing queries comparing data in the three
`cities. An SQL query comparing the Indianapolis and Chicago data is shown
`in Figure 2. This query returns pairs of Indianapolis departments and stores
`that make less than at least one comparable department in some Chicago
`store.
`The SQL for this comparison has several problems, centering around the fact
`that the compared departments are “hard-coded” into the query. Thus, for ex-
`ample, if any Chicago store begins offering clothing for toddlers, this query will
`no longer perform the correct comparisons. Similarly, if any Indianapolis stores
`change their department structure, the query is broken. Furthermore, this ap-
`proach does not scale well as the number of departments increases. For 50 or
`100 departments, the query would be daunting to write. In contrast, compare
`the concise representation in FISQL which is independent of the specific de-
`partments (Figure 10, Section 4.1). Previous work has coined the term Schema
`Independence for queries that are robust under this type of schema evolution
`[Lakshmanan et al. 2001; Masermann and Vossen 2000].
`Another class of problems with the SQL query centers around the problem
`of missing or incomplete information. If Chicago stores decide to discontinue
`men’s clothing, the query in Figure 2 will break because “C.Men” nolonger refers
`to a legitimate column in Chicago.AvgSales. In contrast, FISQL assumes a
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 3 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`
`627
`
`federated data model which differs from the relational model in its approach to
`missing information, and behaves gracefully under such losses of information.
`At the same time, the federated model is strictly downward compatible with
`the relational model, in that every relation has a unique federated counterpart
`(see Definition 3.5, Section 3.2). Furthermore, this counterpart can be generated
`purely syntactically without recourse to semantic information about the data
`(unlike the transformation to the tabular model in Lakshmanan et al. [1999],
`for example).
`
`1.2 Main Contributions
`In summary, the main contributions of the current article are as follows:
`
`—We introduce the Federated Relational Model in Section 2.1 as a straightfor-
`ward syntactic extension of the relational model. The federated model treats
`relation and attribute names as first class domain elements, and behaves
`well under incomplete information.
`—We introduce the notion of Transformational Completeness for query lan-
`guages in Section 2.2. This notion captures our intuition concerning
`the completeness of a query language for relational data ↔ metadata
`transformations.
`—Our foremost contribution in this article is to introduce the algebraic query
`language FIRA, which is our paradigm for transformationally complete re-
`lational metadata integration. FIRA has several advantageous properties
`beyond current relational metadata integration languages, including:
`—Compositionality—A FIRA query maps a set of federated databases to a
`single federated database. This closure is a natural parallel of RA closure,
`since RA accepts a finite number of relations and returns a single relation
`as output.
`—Downward Compatibility—FIRA contains a subalgebra isomorphic to the
`canonical RA. This entails that RA equivalences translate directly to FIRA
`equivalences. Thus (for example), FIRA Cartesian Product is commutative
`and associative, unlike previous algebras [Grant et al. 1993].
`—Generality—FIRA allows dynamic schema transformations in more gen-
`erality than previous languages such as SchemaSQL [Lakshmanan et al.
`2001] where only a single column of data could be promoted to metadata
`per query.
`—Nested Queries—FIRA includes support for nested queries, and FIRA
`queries can be arbitrarily chained together in finite sequences. This is in
`contrast to previous languages that have used view mechanisms to inte-
`grate data and metadata, such as SchemaSQL [Lakshmanan et al. 2001].
`—In Section 4, we present FISQL and show subsequently that it is equivalent
`to FIRA (Section 4.4). The FISQL-FIRA equivalence parallels the SQL-RA
`equivalence, which means rule-based query optimization techniques devel-
`oped for SQL immediately carry over to FISQL, as we show in Section 5.
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 4 of 37
`
`

`

`628
`
`•
`
`C. M. Wyss and E. L. Robertson
`
`—Finally, in Section 6, we give a targeted overview of related work, com-
`paring properties of previous relational metadata integration languages to
`FISQL/FIRA. A tabular summary of the survey is given in Table I.
`In the next section, we begin by detailing formal notation and concepts as-
`sumed throughout the presentation.
`
`2. FORMAL PRELIMINARIES
`For the remainder of the article, we assume familiarity with the standard re-
`lational model and Relational Algebra (RA). To fix these notions, we briefly
`recapitulate what we understand to be accepted definitions of these concepts.
`We assume a domain of atomic elements, denoted dom, which contains al-
`phanumeric strings, such as 123, abc or SomeElement. Weuse teletype font to
`denote specific elements of dom. In addition, we use a special element, ⊥, called
`the null element. We assume ⊥ (cid:4)∈ dom, sothat ⊥ can be used to distinguish
`noninformation within the model.
`For metadata, we also assume a domain of atomic elements, denoted M0,
`which is usually considered to contain alphanumeric strings beginning with a
`letter. We assume M0 ⊂ dom.1
`Definition 2.1
`(1) A (Canonical) Tuple, t is a mapping from a finite set S ⊂ M0 to dom ∪{⊥}.
`Elements of S are termed attributes. The square-bracket notation t[A] is
`used to signify the element t(A) for A ∈ S.
`(2) A (Canonical) Relation has a name N ∈ M0 and a finite schema S ⊂ M0. The
`relation body consists of a finite set of (canonical) tuples t : S → dom ∪{⊥}.
`(3) A (Canonical) Database consists of a finite set of (canonical) relations.
`Definition 2.2
`(1) The Relational Algebra (RA) is understood to be the query language gener-
`ated by applying the six relational operations ρ (Renaming), σ (Selection),
`π (Projection), × (Cartesian Product), ∪ (Set Union), and—(Set Difference).
`A query in the RA maps a set of input relations (i.e., a database) to a single
`output relation.
`(2) A query language that can express all queries of the RA is said to be Rela-
`tionally Complete [Codd 1970].
`The notion of Relational Completeness is often used to gauge expressivity
`within the relational model. Beyond this, the data complexity of the RA is
`LOGSPACE. This is often used as a second gauge for relational query languages,
`since efficient query processing is of paramount importance in large data sets.
`The LOGSPACE class is considered to contain enough power for most common
`query tasks, while remaining efficient in terms of execution times. The RA itself
`is particularly suited to optimized query execution, since several nice algebraic
`
`1Classically, relational metadata and data are not compared within the model. However, relational
`metadata are represented as alphanumeric strings beginning with a letter, so in principle there is
`no problem with comparing metadata and data literals (as strings).
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 5 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`
`629
`
`rules give rise to simple but effective rule-based query optimization strategies
`[Garcia-Molina et al. 2000].
`
`2.1 The Federated Relational Data Model
`One of the main contributions of our work is to extend the relational model to
`incorporate metadata. The resulting data model is termed the Federated Re-
`lational Data Model, orsimply the Federated Model. Inthe federated model,
`relational metadata can be compared and transformed along with relational
`data. However, the meta-ness of metadata in the ordinary relational model
`must be recapitulated in the federated model, only at the “next-higher” level.
`Thus, we assume a countable domain of meta-metadata, denoted M1, contain-
`ing elements distinct from any in dom and M0; that is, M1 ∩ dom = ∅ and
`M1 ∩ M0 = ∅. In the federated model, relational data and metadata are al-
`lowed to occupy both data and schema positions.2 However, federated metadata
`(elements of M1) are not allowed as data.
`Definition 2.3
`(1) A (Federated) Tuple is a mapping from a finite set S ⊂ dom∪ M1 to dom
`∪{⊥}. S is known as the Schema of the tuple, that is, S = schema(t).
`(2) A (Federated) Relation has a name N ∈ dom. The relation body consists of
`a finite set of (federated) tuples.
`(3) A (Federated) Database has a name D ∈ M0. The database body consists of
`a finite set of (federated) relations.
`(4) A Federation consists of a finite set of (federated) databases.
`
`Definition 2.4
`(1) A federated relation can be seen as a pair (cid:11)N, B(cid:12), where N ∈ dom and
`B is a finite set of federated tuples. Given federated relation R = (cid:11)N, B(cid:12),
`we use the notation name(R) and body(R) torefer abstractly to N and B
`(respectively).
`(2) Given a federated relation R, wedefine its Schema to be
`∪
`schema(R) =
`schema(t).
`t∈body(R)
`Note that a federated relation is allowed to contain tuples with (possibly)
`differing schemas. The schema of the relation is the union of the schemas of
`tuples it contains, so that the schema of a federated relation depends critically
`on the relation instance. This is an important design decision in the creation
`of the federated model, and facilitates dynamic schema transformations, such
`as relational transposition (Section 3.8) or attribute dereference (Section 3.5).
`One consequence of this decision is that algebraic operations that remove or
`change tuples may modify the relation schema. As per Definition 2.4, we denote
`
`2It may seem odd to allow numbers as attributes and relation names. However, there is no reason
`in principle to rule this out; in fact, dates or ages (for example) may be quite useful column or
`relation names. In any case, the formalism for disallowing these cases is unwieldy, so for clarity we
`defer such considerations to implementation.
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 6 of 37
`
`

`

`630
`
`•
`
`C. M. Wyss and E. L. Robertson
`
`Fig. 3. Data-metadata transformations.
`the schema of a federated relation abstractly as schema(R), even though the
`extension of this set varies with the relation instance. This view is notationally
`convenient, and we use it throughout the remainder of this article.
`
`2.2 Transformational Completeness
`So far, we have seen federated analogs of the relational notions of tuple, relation,
`database, and schema. Another analogous concept is that of Transformational
`Completeness. We say that a query language is Transformationally Complete if
`(i) it is relationally complete and (ii) it can express schema-independent queries
`encompassing all six data-metadata transformations depicted in Figure 3. Note
`that canonical RA is not transformationally complete in this sense, as previous
`work has indicated [Lakshmanan et al. 2001].
`Several languages have been proposed that capture some or all of the trans-
`formations depicted in Figure 3. These languages are discussed in more detail
`in Section 6 after we present our transformationally complete query languages
`FIRA and FISQL (Sections 3 and 4, respectively). In fact, one of our contri-
`butions is to formalize the notion of “transformational completeness” by pos-
`tulating FIRA as a paradigmatic transformationally complete language. This
`is not to say FIRA is the final word in relational metadata integration; rather
`we use FIRA as a formal archetype for what it means to be transformationally
`complete (in the same way as RA is a formal archetype for what it means to be
`relationally complete). Thus, transformational completeness and FIRA are the
`federated analog of relational completeness and RA.
`Note that the transformations of Figure 3 must be carried out in a man-
`ner that preserves the semantics of the underlying data. For the relational
`model, one formalization of this type of symmetry is encoded in the notion of
`BP-Completeness [Abiteboul et al. 1995]. To fully capture the notion of sym-
`metry inherent in federated transformations, ongoing work is extending BP-
`Completeness and related notions to the federated data model.
`
`2.3 Higher-Order Federated Data Models
`In principle, there is no reason to stop with federations and M1. For k ≥ 2
`we can assume a “higher-order” metadata domain Mk that is disjoint from
`dom and from M j where j < k. We can then define the concepts of tuple,
`relation, database, federation, set of federations, set of sets of federations, etc.,
`and associated notions. The query languages for these “higher-order” relational
`models will have affinities with higher-order logics, just as FIRA and FISQL
`(and similar languages) have affinities with second-order logic. Ongoing work is
`characterizing the resulting relational hierarchy and languages. In particular,
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 7 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`
`631
`
`extended federated models provide a segue to tree-structured databases and
`XML.
`In the present federated model, elements of M1 cannot be demoted to data
`within the federated model; thus the “chain of removal” ends with M1.
`
`3. FEDERATED INTEROPERABLE RA
`In this section, we present the query algebra called Federated Interoperable Re-
`lational Algebra. Each operator in FIRA accepts a fixed number of databases as
`input (i.e., a federation) and returns a single database. This closure is a natural
`parallel of RA closure, since RA accepts a fixed number of relations and returns
`a single relation as output. As with RA, some operators are parameterized by
`elements of the input schema.
`
`3.1 Basic Terms
`Basic terms in FIRA are database names or database variables. For example,
`“Indy” is abasic term in the FIRA. We use variables of the form D1, D2, . . . to
`denote databases.
`
`3.2 The Relational Core of FIRA
`Each of the six RA operators (ρ, σ, π, ×, ∪, −) has a federated counterpart within
`FIRA. We use hatted notation for these FIRA counterparts ( ˆρ, ˆσ , etc.). The
`definitions of the FIRA relational algebra operators include applications of
`canonical RA operators to federated relation bodies. Given federated relation R,
`body(R) isnot, strictly speaking, a canonical relation. However, these notions
`are so close that we ignore the difference to facilitate concise definitions.3
`One twist in the definition of the federated counterparts for unary RA op-
`erators is that we need to allow relation renaming in addition to attribute re-
`naming; this is done in Definition 3.1(1). Selection and Projection are straight-
`forward (Definitions, 3.1(2) and 3.1(3), respectively).
`Definition 3.1 (Federated Unary Relational Operators)
`(1) (Renaming) Let D be a federated database. There are two cases.
`(a) (General Renaming) Let Ai, Bi ∈ dom ∪ M1 for 1 ≤ i ≤ n. Then
`ˆρ A1→B1,..., An→Bn(D) = {(cid:11)name(R), ρA1→B1,..., An→Bn(body(R))(cid:12)|R ∈ D}.
`(b) (Relation Specific Renaming) Let Ai, Bi ∈ dom ∪ M1 for 1 ≤ i ≤ n and
`N, M ∈ dom. Then
`A1→B1,..., An→Bn(D) =
`ˆρ N→M
`{(cid:11)M, ρA1→B1,..., An→Bn(body(R))(cid:12)|R ∈ D, name(R) = N}
`∪ {R ∈ D|name(R) (cid:4)= N}.
`3Formally, for federated relationR, weconsider schema(R) asthe set “ S” inDefinition 2.1, extending
`tuples in body(R) with null values where necessary. Since RA operators do not demote relational
`metadata to data, we can view elements of dom and M1 in schema(R) ascanonical metadata
`constants so that the action of RA operators on columns headed by these elements is the same as
`for columns headed by elements of M0.
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 8 of 37
`
`

`

`•
`632
`C. M. Wyss and E. L. Robertson
`(2) (Selection) Let D be a federated database and C be a well-formed Boolean
`selection condition. Then
`ˆσ C(D) = {(cid:11)name(R), σC(body(R))(cid:12)|R ∈ D}.
`(3) (Projection) Let D be a federated database and Ak ∈ dom ∪ M1 for 1 ≤ k ≤ n.
`Then
`ˆπ A1,..., An(D) = {(cid:11)name(R), πA1,..., An(body(R))|R ∈ D}.
`For the binary RA operators, the fact that relation names are now a part
`of the formalism means that we must be careful when defining the federated
`counterparts of these operations. When operating on federated databases, the
`FIRA counterparts of the binary RA operators behave consistently with the
`RA by acting non-trivially only on like named relations. Furthermore, since
`schemas vary dynamically in FIRA, we assume outer versions of the canonical
`RA operators×,∪, and−, sothat mismatched schemas will still provide sensible
`answers.
`Definition 3.2 (Federated Cartesian Product). Let D1 and D2 denote feder-
`ated databases. Their Federated Cartesian Product is defined as
`D1 ˆ× D2 = {(cid:11)name(R1), body(R1) × body(R2)(cid:12)
`| R1 ∈ D1, R2 ∈ D2 and name(R1) = name(R2)}.
`Note that unmatched relations in either database are dropped in the product,
`so the cardinality of D1 ˆ× D2 is equal to or less than that of D1. This decision
`may seem odd, but has two important consequences. First, ˆ× is commutative
`and associative, unlike in previous algebras that define the Cartesian Product
`of databases as all pairwise combinations of products [Grant et al. 1993]. Also,
`there exists an embedding of canonical relations into federated relations that
`preserves RA as a subalgebra of FIRA (Theorem 3.6, below). For similar reasons,
`federated set union and set difference are defined analogously, as follows.
`Definition 3.3 (Federated Set Union). Let D1 and D2 denote federated
`databases. Their Federated Set Union is defined as
`D1 ˆ∪D2 = {(cid:11)name(R1), body(R1) ∪ body(R2)(cid:12)
`(cid:1)
`| R1 ∈ D1, R2 ∈ D2 and name(R1) = nameR2}
`(cid:1)
`{R1 ∈ D1| there is no R2 ∈ D2 such that name(R1) = nameR2}
`{R2 ∈ D2| there is no R1 ∈ D1 such that name(R2) = nameR1}.
`Definition 3.4 (Federated Set Difference). Let D1 and D2 denote federated
`databases. The Federated Set Difference of D1 and D2 is defined as
`D1 ˆ−D2 = {(cid:11)name(R1), body(R1) − body(R2)(cid:12)
`(cid:1)
`| R1 ∈ D1, R2 ∈ D2 and name(R1) = nameR2}
`{R1 ∈ D1| there is no R2 ∈ D2 such that name(R1) = nameR2}.
`The six FIRA operators ˆρ, ˆσ , ˆπ, ˆ×, ˆ∪, and ˆ− are in and of themselves a
`compositionally closed algebra on federated databases. This algebra is thus
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 9 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`
`633
`
`a subalgebra of full FIRA (which we have yet to define), and furthermore is
`equivalent to canonical RA. To see this, we require a well-defined embedding of
`canonical relations into federated databases. Note that since dom ⊂ dom ∪ M1,
`we can view a canonical relation as the body of a federated relation without
`issue. Let ε be a distinguished symbol in dom, called the “empty name”. We
`federate canonical relations simply by assigning them the name ε.
`
`Definition 3.5. Let R be a canonical relation. The Federated Counterpart
`of R, denoted federate(R), is the federated database {(cid:11)ε, R(cid:12)}.
`THEOREM 3.6. RA is isomorphic to a subalgebra of FIRA.
`
`PROOF. Definition 3.5 defines a mapping, federate, from canonical relations
`into federated databases given by R (cid:15)→ federate(R). It is easy to see that federate
`is injective. Furthermore, federate is a homomorphism with respect to the RA
`operators and their FIRA counterparts so that, for example, federate(σC(R)) =
`ˆσC(federate(R)), federate(R ∪ S) = federate(R) ˆ∪ federate(S), etc.
`On the other hand, consider the class D that consists of exactly the federated
`databases that are the image of some canonical relation under federate. Clearly,
`federate is surjective for this class. It is relatively easy to see that we have
`closure under the FIRA operations ˆρ, ˆσ , ˆπ, ˆ×, ˆ∪, and ˆ− within D.
`Thus, federate induces a subalgebra of FIRA that is isomorphic to RA. We
`term this subalgebra the Relational Core of FIRA. (cid:16)(cid:17)
`Theorem 3.6 is important because it precisely characterizes how FIRA se-
`mantics properly extends RA semantics. Previous algebras have not shown such
`a result.
`The next subsections introduce the operators of full FIRA that move beyond
`the relational core.
`
`3.3 Drop Projection
`FIRA includes a projection operator that allows one to remove elements from
`relation schemas, denoted π.
`Definition 3.7
`(1) (Drop Projection for Federated Relations) Let R be a federated relation and
`A ∈ dom ∪ M1. Then
`πA(R) = {(cid:11)name(R), πschema(R)−A(body(R)(cid:12)}.
`(2) (Drop Projection for Federated Databases) Let D be a federated database
`and A ∈ dom ∪ M1. Then πA(D) = { πA(R)|R ∈ D}. In addition, we use the
`shorthand notation ˆπA1,... , An(D) for Ak ∈ dom ∪ M1 (1 ≤ k ≤ n) tomean
`ˆπA1(···( ˆπAn(D))···).
`Note that since federated schemas vary dynamically with instances, the drop
`projection cannot uniformly be imitated with canonical projection. In fact, re-
`cent work shows that drop projection is not superfluous for polymorphic versions
`of the canonical RA [Van den Bussche and Waller 2002].
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 10 of 37
`
`

`

`634
`
`•
`
`C. M. Wyss and E. L. Robertson
`
`Fig. 4. The relation ↓3 (Milw.FreePkSales).
`
`3.4 The Down Operator
`FIRA contains a family of operators which explicitly demote relational meta-
`data to federated data. Notationally, we use fixed constants from M1 to signify
`columns created to hold relational metadata. We will assume two families of
`constants in M1 which we represent using lowercase, subscripted versions of the
`“Fraktur” font, such as r3, and a21 (for relation, and attribute names, respec-
`tively). Note that these constants can appear in federated relation schemas,
`however they cannot be demoted to federated data. This is a consequence of
`“stopping” the relational hierarchy at federations (as discussed in Section 2.3).
`Definition 3.8 (Down Operators)
`(1) (Down Operators for Federated Relations) Let R be a federated relation and
`i ∈ N be a fixed natural number.
`(a) Let name(R) = N ∈ dom and schema(R)∩ dom = {A1, . . ., An}. We
`define metadatai(R) to bethe following set of federated tuples:
`ri
`ai
`N A1
`N A2
`...
`...
`N An
`(b) The down of R with respect to i, denoted ↓i (R) isthe federated relation
`↓i (R) = (cid:11)name(R), metadatai(R) × πri,ai(body(R))(cid:12).
`(2) (Down Operators for Federated Databases) Let D be a federated database.
`Then
`↓i (D) = {↓i (R)|R ∈ D}.
`Example 3.9. The federated relation ↓3(Milw.FreePkSales) isshown in
`Figure 4.
`Note that i only affects the choice of constants from M1 labeling the out-
`put columns of metadatai(R). The drop projection in the body of ↓i (R) en-
`sures that we only add meta-metadata subscripted by i once. Inother words,
`
`ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005.
`
`Enfish, LLC; IPR2014-00574
`Exhibit 2212
`Page 11 of 37
`
`

`

`Relational Languages for Metadata Integration
`
`•
`↓i(↓i(R)) = ↓i(R). This plus the fact that we do not include meta-metadata as
`data in the relation metadatai(R) means that ↓i (↓ j (R)) = ↓i (R) 1↓ j (R) for
`i, j ∈ N.
`
`635
`
`3.5 Attribute Dereference
`FIRA includes an operator for projecting a relation R on attributes determined
`by cell values in another attribute. For each tuple t ∈ R, instead of projecting
`that tuple on field A, weproject on the field named by t[A]. This operation is
`termed Attribute Dereference and is denoted .
`Definition 3.10 (Attribute Dereference)
`(1) (Attribute Dereference for Federated Relations) Let R be a federated relation
`A(R) = (cid:11)name(R), R(cid:19)(cid:12) where R(cid:19) is obtained
`and A, B ∈ dom ∪ M1. Then B
`from body(R) tuple-by-tuple as follows: For t ∈ body(R), we obtain s ∈ R(cid:19)
`(cid:2)
`as:
`t[t[A]] iff X = B;
`s[X ] =
`t[X ] otherwise.
`(2) (Attribute Dereference for Federated Databases) Let D be a database and
`A, B ∈ dom ∪ M1. Then B
`
`A(D) = { BA(R)|R ∈ D}.
`Example 3.11. Consider dereferencing Chi.AvgSales by the Indy depart-
`ments. We will first augment Chi.AvgSales with the column of Indy department
`names (Figure 5(a)); then we can dereference the resulting relation to obtain
`the desired information (Figure 5(b)).
`
`Dereferencing is common when comparing data from relations having “trans-
`posed” structure, such as the Chi.AvgSales and the Indy.Sales relations. This
`type of structural mismatch is commonly found when comparing spreadsheet
`data to native relational data. In these cases, dereference terms will most often
`be included within selections and projections. For these situations, we provide
`convenient notational shorthands.
`Definition 3.12
`(1) Let R be a federated relation and A, B ∈ dom ∪ M1. Let op ∈ {<, >, =, (cid:4)=,
`≤, ≥}. Then σA op B(R) isshorthand notation for
`A (R))).
`πX (σX op B( X
`where X ∈ (dom ∪ M1) − schema(R) isarbitrary (but fixed). Similarly, we
`allow selection terms of the form σA op B(R) orσ A op B(R).
`(2) Let R be a federated relation and A, B ∈ dom ∪ M1. Then π B
`A (R) isshort-
`A(R)). When such projection terms are included in a list, it is
`hand for πB( B
`understood that the dereferencing will take place first, then the projections.
`
`In general, the

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket