`
`Semistructured Data
`
`Serge Abitebouly Jason McHughz Michael Rysz Vasilis Vassalosz
`
`Janet L. Wienerz
`
`y INRIA-Rocquencourt
`F- Le Chesnay, France
`Serge.Abiteboul@inria.fr
`
`z Stanford University
`Stanford, CA , USA
`Firstname.Lastname@cs.stanford.edu
`http:www-db.stanford.edulore
`
`Abstract
`Semistructured data is not strictly typed like relational
`or object-oriented data and may be irregular or incom-
`plete.
`It often arises in practice, e.g., when heteroge-
`neous data sources are integrated or data is taken from
`the World Wide Web. Views over semistructured data
`can be used to lter the data and to restructure or pro-
`vide structure to it. To achieve fast query response time,
`these views are often materialized. This paper proposes
`an incremental maintenance algorithm for materialized
`views over semistructured data. We use the graph-based
`data model OEM and the query language Lorel, devel-
`oped at Stanford, as the framework for our work. Our
`algorithm produces a set of queries that compute the up-
`dates to the view based upon an update of the source.
`We develop an analytic cost model and compare the cost
`of executing our incremental maintenance algorithm to
`that of recomputing the view. We show that for nearly
`all types of database updates, it is more e cient to ap-
`ply our incremental maintenance algorithm to the view
`than to recompute the view from the database, even when
`there are thousands of updates.
`
` Introduction
`
`Database views increase the exibility of a database
`system by adapting the data to user or application
`needs , . Views are frequently materialized to
`speed up querying when the underlying data is remote
`or response time is critical , . Once a view is ma-
`terialized, however, its contents must be maintained in
`order to preserve its consistency with the base data.
`Maintenance can be performed either by recomputing
`the view contents from the database or by comput-
`
` Research partially supported by NSF grant IRI ,
`Air Force contract F , the Swiss National Sci-
`ence Foundation, and the Lilian Voudouri Foundation.
`
`Permission to copy without fee all or part of this material is
`granted provided that the copies are not made or distributed for
`direct commercial advantage, the VLDB copyright notice and
`the title of the publication and its date appear, and notice is
`given that copying is by permission of the Very Large Data Base
`Endowment. To copy otherwise, or to republish, requires a fee
`andor special permission from the Endowment.
`Proceedings of the th VLDB Conference
`New York, USA,
`
`ing the incremental updates to the view based on the
`updates to the database. In this paper, we study the
`maintenance of materialized views for semistructured
`data. We propose a simple view speci cation mecha-
`nism and an algorithm for incremental maintenance.
`We then demonstrate the algorithm’s strengths and
`weaknesses with a maintenance cost analysis.
`Unlike relational or object-oriented data, semistruc-
`tured data need not conform to a xed schema. The
`data may be irregular or incomplete, and often arises
`in practice, e.g., when heterogeneous data sources are
`integrated or data is extracted from the World Wide
`Web , , , . Views over semistructured data
`can be used to lter the data and to restructure or
`provide structure to it . Filtering is crucial since
`semistructured data is often encountered by applica-
`tions interested in a very small portion of the available
`data e.g., some speci c data from the Web. Further-
`more, a view is the only way in which we can restruc-
`ture semistructured data that is outside of our control.
`For performance reasons, views over semistructu-
`red data often need to be materialized. Queries over
`semistructured data possibly traversing long paths
`are expensive to evaluate, as Mike Carey argued re-
`cently . A materialized view can be used to isolate
`the data of interest, allowing subsequent queries to
`run over a smaller, often more structured, data set.
`Materialized views can also be used to rewrite queries
`over the base data and improve the query performance
` . Furthermore, queries over the materialized view
`may be able to take advantage of standard query opti-
`mization techniques and access methods for structured
`data, even though the underlying base data of the view
`is semistructured.
`View mechanisms and algorithms for materialized
`view maintenance have been studied extensively in
`the context of the relational model , , , , .
`Incremental maintenance has been shown to dra-
`matically improve performance for relational views
`. Views are much richer in the object world
`and, subsequently, languages for specifying and query-
`ing materialized views are signi cantly more intricate
`, , , , .
`Previous results on incremental view maintenance
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-1
`
`
`
`Restaurant
`
`&2
`
`Guide
`
`&1
`
`Restaurant
`
`&3
`
`Name Rating
`
`Name Rating
`
`Entree
`
`Entree
`
`Restaurant
`
`Entree
`
`&4
`
`Name Rating
`
`Entree
`
`&10
`
`&11
`"Eats"
`Ingredient
`
`&12
`"Four Stars"
`Name
`
`&13
`
`Ingredient
`Ingredient
`
`&9
`
`Name
`
`Ingredient
`
`&7
`"Baghdad
`Cafe"
`
`&8
`9
`
`5&
`
`6
`
`&5
`"Thai City"
`
`&14
`"Beef Stew"
`
`&15
`"Mushroom"
`
`&16
`"Tomato"
`
`Figure : A Simple OEM Database
`
`&17
`"Cheeseburger
`Club"
`
`&18
`"Cheese"
`
`&19
`"Beef"
`
`for object databases , and nested data are
`based on the extensive use of type information. Semi-
`structured data provides no type information, so the
`same techniques do not apply. In particular, subobject
`sharing along with the absence of a schema make it dif-
` cult to detect if a particular update a ects a view.
`Gluche and colleagues use a view maintenance
`scheme that is limited to linear OQL view de nitions.
`Because of subobject sharing, most nontrivial semi-
`structured view de nitions are not linear, making their
`approach inapplicable in our context.
`Suciu also considers incremental view mainte-
`nance for semistructured data. The view speci ca-
`tion language is limited to select-project queries and
`only considers database insertions. Our approach al-
`lows joins in the view query and handles database in-
`sertions, deletions, and updates. Zhuge and Garcia-
`Molina also investigate graph structured views
`and their incremental maintenance. However, their
`views consist of object collections only, while we in-
`clude edges structure between objects. Also, their
`maintenance algorithms only work for select-project
`views over tree-structured databases, while our ap-
`proach handles joins and arbitrary graph-structured
`databases.
`Our work is based on the Object Exchange Model
`OEM for semistructured data.
`In OEM, a
`database is a directed, labeled graph. OEM has strong
`similarities to XML , a proposed standard for a uni-
`versal format for data on the Web. Our view speci ca-
`tion language is based on the Lorel query language for
`OEM . We propose a view speci cation extension to
`Lorel that introduces two sets of objects in the view:
` the select-from-where part speci es the primary ob-
`jects imported to the view and the new with part
`speci es paths from the primary objects to adjunct
`objects. Both the paths and the adjunct objects ap-
`pear in the view. The distinction between the two sets
`of objects is invisible to the user it is only used to
`simplify the discussion of the incremental maintenance
`algorithm. Given a view and a database update, the
`algorithm produces a set of maintenance statements,
`
`evaluates them on the database to yield a set of view
`updates, and installs the updates in the view.
`We demonstrate the advantages of our algorithm
`with a cost model and a performance evaluation. We
`compare the cost of recomputation to the cost of incre-
`mentally computing the new view. Our results show
`that the incremental maintenance algorithm is several
`orders of magnitude faster than recomputing the view
`for insertion and deletion of edges between objects.
`In addition, incremental maintenance is cheaper for
`small numbers of atomic value changes. However, in
`some cases, such as when a substantial portion of the
`database is updated, it may be cost e ective to recom-
`pute the view.
`The presented maintenance algorithm can be used
`both for immediate maintenance and for deferred
`maintenance , of the views. The techniques
`presented here are also applicable to other query
`languages for semistructured data , for the Web
`, , and to some extent to query languages for
`hypertext documents , .
`
` View Speci cation
`
`We use the Lore system to investigate material-
`ized view maintenance over semistructured data. We
`now introduce OEM, the data model used by Lore; the
`Lorel query language; the view speci cation language;
`and the update operations. and provide further
`details on Lorel and the view speci cation language,
`respectively.
`
`. The OEM Data Model
`
`An OEM database is a labeled, directed graph
`such as the small example database given in Fig-
`ure .
`Each vertex in the graph represents an
`object; each object has a unique object identi er
`oid such as &. Atomic objects contain a value
`from one of the atomic types, e.g., integer, real,
`string, gif, java, audio. All other objects are
`complex objects and in the Lore system have a set
`of hlabel; subobjectoidi pairs as their value.
`In Fig-
`ure , object & is atomic and has the value Thai
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-2
`
`
`
`City". Object & is complex and has as its value
`fhEntree; & i; hName; & i; hRating; & i; hEntree;
`& ig. Names are special labels that each serve as an
`alias for a single object, and are used as entry points
`into the database. In Figure , Guide is a name that
`denotes object & .
`There is no notion of a schema in an OEM database.
`Semantic information is included in the labels, which
`are part of the data and can change dynamically.
`In this respect, an OEM database is self-describing.
`OEM has been designed to handle incompleteness of
`data, as well as the structural and type heterogeneity
`as exhibited in Figure . For example, observe that
`the Restaurant object & has no Entree subobjects,
`while Restaurants & and & each have two.
`
`. The Lorel Query Language
`
`Lorel, for Lore Language, uses the familiar select-from-
`where syntax of SQL, and can be considered an exten-
`sion to OQL that provides powerful path expres-
`sions for traversing the data and extensive coercion
`rules for a more forgiving type system. Both features
`are useful when operating in a semistructured environ-
`ment. Consider the Lorel query in Example .
`
`Example Lorel Query
`select e
`from Guide.Restaurant r, r.Entree e
`where r.Name = Baghdad Cafe"
`and
`e.Ingredient = Mushroom";
`
`
`
`The query asks for all Entree subobjects of a Restau-
`rant object where the restaurant’s name is Baghdad
`Cafe" and one of the ingredients of the entree has the
`value Mushroom". The result of this query over the
`database in Figure is the set f& g.
`The expression Guide.Restaurant r, r.Entree e is
`a path expression describing a traversal through the
`database.
`In this paper, a path expression is com-
`posed of one-step paths of the form x:L y, where x is
`bound to a set of objects, L is the label for some out-
`going edge, and y designates the set of objects that
`are reached by starting from an object in the set x
`and traversing an edge labeled L. Each one-step path
`describes a single step traversal through the data and
`can be written hx; L; yi.
`While Lorel supports many ways for specifying
`paths for example, by combining one-step path ex-
`pressions, eliminating variables, or using wild cards,
`in this paper, we use one-step paths for clarity. Path
`expressions appearing in the where clause that are not
`quanti ed by the from clause are implicitly existen-
`tially quanti ed according to Lorel semantics.
`
`. View Speci cation in Lorel
`
`A view speci cation statement in Lorel imports ob-
`jects and edges from a source database into a view.
`In addition, new objects and edges can be created in
`the view. Our view speci cation language can:
`
`identify objects within a graph; import arbitrary
`subgraphs; add or remove objects appearing in
`the view. To specify views, we use Lorel’s query and
`update operations and extend the select-from-where
`statement with a with clause.
`The with clause is composed of path expressions
`where each path begins from a variable appearing in
`the select clause. Each object and edge along a path in
`the with clause is included in the view. Intuitively, the
`select-from-where statement returns a at set of ob-
`jects. The with clause imports some of the structure
`of the database in the view. It is a compromise be-
`tween returning everything or nothing reachable from
`selected objects.
`We call the objects included in the view by the
`select-from-where part of the view speci cation the
`primary objects and the objects included in the view
`by the with clause the adjunct objects. An object can
`be both a primary and an adjunct object in a view.
`Although a view de nition may consist of several view
`speci cation statements, in this paper, we concentrate
`on views de ned by a single statement.
`The view speci cation in Example de nes a view
`for the result of the query in Example now written
`in an OQL-like syntax along with all Name and
`Ingredient subobjects of each Entree.
`
`Example Canonical View Speci cation
`de ne view FavoriteEntrees as Entrees =
`select e
`from Guide.Restaurant r, r.Entree e
`where exists x in r.Name: x = Baghdad Cafe"
`and
`exists y in e.Ingredient: y = Mushroom"
`with e.Name n, e.Ingredient i;
`
`
`The objects bound to e are primary objects, while
`all the subobjects discovered by the with clause are
`adjunct objects. Without the with clause, a view is
`a simple collection of objects that satisfy the query,
`without edges or subobjects present.
`
`. Materialized Views
`
`We now explain how views are materialized in Lore,
`using a simple top-down query evaluation strategy .
`First, the from then where clauses are evaluated to
`obtain bindings for variables that appear in the from
`clause and satisfy the where clause. The select clause
`is evaluated for these bindings. Each primary object
`identi ed by the select clause is then augmented with
`the subobjects and edges in the with clause.
`In the
`view, each imported database object is represented by
`a new delegate object.
`Figure shows the materialized view for Example
`applied to the database in Figure . The objects & ,
`& , and & in Figure provide bindings for e, n and
`i; the sole primary object & ’ and the adjunct objects
`& ’ and & ’ are the corresponding delegate objects
`in the view.
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-3
`
`
`
`Entrees
`
`&99
`
`Entree
`
`&9’
`
`Name
`
`Ingredient
`
`&15’
`&14’
`"Mushroom"
`"Beef Stew"
`Figure : The materialized view for Example
`
`. Update Operations
`
`The Lorel update statements contain three elemen-
`tary update operations that can a ect a materialized
`view:
`
` Insertion and deletion of the edge with label L from
`the object with oid o to the object with oid o,
`denoted hIns; o ; L; oi and hDel; o ; L; oi.
` Change of value of the atomic object with oid o
`from OldVal to NewVal, denoted hChg; o ; OldVal;
`NewVali.
`
` View Maintenance
`
`When an update operation a ects a materialized view,
`the view must be maintained to keep it consistent with
`the database. A view V is considered consistent with
`the database DB if the evaluation of the view speci-
` cation S over the database yields the view instance
`V = SDB. Therefore, when the database DB is
`updated to DB , we need to update the view V to
`V = SDB in order to preserve its consistency.
`Our incremental maintenance algorithm computes
`the new state of the materialized view from the cur-
`rent state of the database, the view, and the database
`updates. Similar to relational view maintenance al-
`gorithms, the incremental maintenance algorithm uses
`the database updates to minimize the portion of the
`database examined when computing the view updates
` .
`The algorithm applies to an important subset of
`Lorel . More speci cally, it handles every view spec-
`i cation statement without wild cards, subqueries, or
`negation except on atomic objects, e.g., x = is per-
`mitted. To simplify the presentation, in our examples
`the select clause is of the form select y" generalizing
`for any select clause is straightforward.
`
` . Overview of the Maintenance Algorithm
`
`We treat the primary and adjunct objects Vprim and
`Vadj separately during maintenance. The algorithm’s
`input is shown in Figure .
`The view speci cation S, the database update U ,
`and the database state DB after the update are
`used to compute the view maintenance statements in
`Lorel syntax. These statements generate the sets
`
` . View speci cation statement S:
`select vi
`from v :L v , . . . , vj:Lk vk, . . . , vn(cid:0) :Ln vn
` vj can be any variable that
` already appeared in the sequence
`where conditionsv ; : : : ; vn
`with vi:L w , w :L w , . . . , w p(cid:0) :L p w p,
`uj:Lj wj , . . . , wjk(cid:0) :Ljk wjk, . . . ,
`wjq(cid:0) :Ljq wjq
` where uj is vi or wkl
` j, k j (cid:0) , l
`. Update U :
`hIns; o ; L; oi,
`hDel; o ; L; oi,
`hChg; o ; OldV al; N ewV ali
` . New database state DB
`. View instance V
`
`or
`
`Figure : Incremental maintenance algorithm input
`ADDprim, DELprim, ADDadj , and DELadj of ob-
`jects and edges to add to and remove from the view.
`In Figure , we abbreviate the where clause with
`conditionsv ; : : : ; vn." Conditions are written in
`disjunctive normal form using boolean expressions,
`such as y = Mushroom", as in SQL.
`
` . Check for relevance of update U to the view instance
`V de ned by the view speci cation S. Generate a set
`of relevant variables R. If R is empty, stop.
`. Generate maintenance
`statements
`and
`ADDprim and DELprim using U , S, and R.
`create
` . Generate maintenance
`statements
`and
`ADDadj and DELadj using U , S, R, and ADDprim
`or DELprim.
`. Install ADDprim, DELprim, ADDadj, and DELadj
`in V .
`
`create
`
`Figure : Basic structure of the incremental mainte-
`nance algorithm
`Figure outlines the steps of the view maintenance
`algorithm. We describe the algorithm as it operates on
`a single update. First, it checks whether the update is
`relevant to the view, that is, if update U could cause
`a change to the view instance V . If so, the algorithm
`creates the Lorel statements that generate ADDprim
`and DELprim. The statements identify the primary
`objects to add and remove by explicitly binding the
`objects in the update to the view speci cation. The
`algorithm then creates the sets of maintenance state-
`ments that generate ADDx
`adj and DELx
`adj . ADDx
`adj
`and DELx
`adj contain the adjunct objects and edges to
`add and remove for each with clause variable x. Ad-
`junct objects may be a ected in three ways: by
`newly inserted or deleted primary objects; by cur-
`rent adjunct objects that are the source of an inserted
`or deleted edge; and by atomic value changes.
`
` . Relevance of an Update
`
` We extend Lorel to allow the use of explicit object identi ers
`wherever names are allowed within a statement.
`
`To avoid generating and evaluating! unnecessary
`maintenance statements, we rst perform some sim-
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-4
`
`
`
`function RelevantVarsUpdate U , View speci cation S
` If updated object is not in RelevantOids, then it’s not
` relevant. RelevantOids is fhoid; queryvariableig.
`if ho U ; i = RelevantOids then return ;;
` Find out which variables are relevant to the update
`vars ;; relvars ;;
`foreach v variablesS do
` If updated object is not in RelevantOids, then it’s
` not relevant
`if ho U ; vi RelevantOids then vars vars fvg;
` If update is atomic change, do simple syntactic check
`if typeU = Chg then
`foreach v vars do
` Let constantsS, v be the constants appearing
` in S compared to v, e.g., using = here
`foreach c constantsS, v do
` See if there’s a predicate in the view spec
` whose value may have changed
`if OldValU = cand NewValU = c or
`OldValU = c and NewValU = c then
`relvarsrelvarsfvg;
`elserelvars vars;
`return relvars;
`
`Figure : RelevantVars returns the view speci cation
`variables for which the update U is relevant.
`ple relevance checks. We use an auxiliary data struc-
`ture, RelevantOids, to keep information that would be
`available from the schema in a structured database.
`RelevantOids contains the object identi er of every ob-
`ject touched during the evaluation of a view speci ca-
`tion, paired with the variable to which it was bound,
`whether or not the object eventually appears in the
`view. It is used to check quickly whether a database
`update could possibly a ect the view. For example, if
`object o in a Chg update does not appear in Relevan-
`tOids, then it was not examined during view evaluation
`and the update can be ignored.
`We also use syntactic checks that indicate whether
`speci c atomic value changes could a ect the view.
`For each comparison in the view speci cation where
`clause that involves a constant value, we compare the
`constant to the update’s OldVal and NewVal. If both
`or neither of OldVal and NewVal satisfy the compari-
`son, then the change cannot a ect the view.
`Figure presents the function RelevantVars, which
`determines the set of variables appearing in the query
`that the update could be bound to given a view spec-
`i cation.
`For example, suppose that the value of object &
`in Figure is changed from Thai City" to Hunan
`Wok". We can infer that this update does not a ect
`the view in Example , because the view speci cation
`mentions neither Thai City" nor Hunan Wok". On
`the other hand, if the value of & is changed to Bagh-
`dad Cafe", which is the constant used in the compari-
`son x.Name = Baghdad Cafe", then the update may
`be relevant.
`We do not attempt to quantify the savings achieved
`by using RelevantOids in this paper. However, we
`
`note that for views de ned over a small portion of the
`database, most updates are irrelevant.
`
` . Generating Maintenance Statements
`
`We now describe how to generate the maintenance
`statements for each type of update: edge insertion,
`edge deletion, or atomic value change. Consider rst
`the edge insertion and edge deletion cases. For each
`one-step path in the view speci cation, we generate
`a maintenance statement that checks whether the up-
`dated edge binds to it. If so, the statement produces
`updates to the view. We use auxiliary data structures
`to represent the one-step paths appearing in the view
`speci cation. OneStepP athf rom, OneStepP athprim,
`and OneStepP athadj contain all the one-step paths
`that appear in the from clause,
`from and where
`clauses, and with clause, respectively. For example,
`OneStepP athprim for the view speci cation in Exam-
`ple is fGuide.Restaurant r, r.Entree e, r.Name x,
`e.Ingredient yg. Note that each OneStepPath set is
`small since it depends on the query and not on the
`database.
`
` . . Edge Insertion
`
`For edge insertion, let the update be hIns; o ; L; oi.
`We generate a primary object maintenance statement
`for every possible pair of bindings of o and o using
`the procedure GenAddPrim in Figure .
`
`Example Generating ADDprim
`Suppose that update hIns; & ; Ingredient; & i is
`performed on the database in Figure . The Bagh-
`dad Cafe restaurant now has two entrees with the in-
`gredient Mushroom". Given the view speci cation,
`RelevantVars returns the set feg. GenAddPrim then
`generates one statement.
`ADDprim +=
`select e
`from Guide.Restaurant r, r.Entree e
`where exists x in r.Name: x = Baghdad Cafe"
`and
`exists & in & .Ingredient:
`& = Mushroom"
`e =& ;
`
`and
`
`This maintenance statement can be evaluated more
`e ciently than the original view speci cation, as we
`show in Section .
`
`We then generate the maintenance statements for
`the adjunct objects. There are two cases to consider:
` adjunct objects attached to the new primary ob-
`jects in ADDprim and adjunct objects that are
`newly connected to the view by the inserted edge from
`o to o when o is an adjunct object.
`For the rst case, we generate maintenance state-
`ments starting from the set ADDprim. For the second
`case, we rst test whether the inserted edge matches
`a relevant adjunct object variable and has a match-
`ing label. If so, then we generate a set of maintenance
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-5
`
`
`
`procedure GenAddPrimUpdate U = hIns; o ; L; oi, View speci cation S, RelevantVars R
` For each relevant variable
`foreach a R do
` For each place where the update can be substituted in the view speci cation
`foreach ha,L,bi OneStepP athprim do
` Write a maintenance statement based on the view speci cation
`ADDprim+= copy S except for the with clause and
`Li in from clause replace a:Li" with o :Li"
`Lj in from clause replace b:Lj " with o:Lj "
`replace a" with o " and b" with o" in where clause
`add and a = o " to each disjunct in where clause
`if ha,L,bi OneStepP athf rom add and b = o" to each disjunct in where clause
`
`Figure : GenAddPrim generates the ADDprim maintenance statements.
`
`procedure GenAddAdjUpdate U = hIns; o ; L; oi, View speci cation S, RelevantVars R
` If primary objects were added, need to add adjunct objects from them
`if ADDprim = ; then
` For each one-step path in the with clause
`foreach hwjk(cid:0) ,Ljk,wjki OneStepP athadj do
` Write a maintenance statement based on the view speci cation no where or with clause
`wjk
`ADD
`adj += select hwjk(cid:0) ,Ljk ,wjki
`from ADDprim vi, vi:Lj wj , . . . , wjk(cid:0) :Ljk wjk;
` For each place that edge could be adjunct edge
`foreach v R do
`foreach hv,L,wjki OneStepP athadj do
` Write a set of maintenance statements starting from added edge:
` Add inserted edge to the view
`wjk
`ADD
`adj += select ho ,L,oi;
` From o, add any necessary edges
`wjk+
`ADD
`+= select ho,Ljk+ ,wjk+ i from o:Ljk+ wjk+ ;
`adj
` In a similar fashion, include all paths
`foreach hwjk+n(cid:0) ,Ljk+n,wjk+ni O do
`wjk+n
`ADD
`+= select hwjk+n(cid:0) ,Ljk+n,wjk+ni
`adj
`from o:Ljk+ wjk+ , . . . , wjk+n(cid:0) :Ljk+n wjk+n;
`
`Figure : GenAddAdj generates the ADDadj maintenance statements.
` . . Edge Deletion
`statements that add the inserted edge and all subse-
`quent paths in OneStepP athadj. Both cases are han-
`dled by procedure GenAddAdj in Figure .
`
`Example Generating ADDadj
`GenAddAdj generates the following maintenance state-
`ments for the update hIns; & ; Ingredient; & i.
`
`ADDn
`adj +=
`select he,Name,ni
`from ADDprim e, e.Name n;
`
`ADDi
`adj +=
`select he,Ingredient,ii
`from ADDprim e, e.Ingredient i;
`
`Since the inserted edge is not connected to an ex-
`isting adjunct object & is not currently an adjunct
`object in the view, no statements are generated by
`the second half of GenAddAdj.
`
`
`Because the addition of an edge in the absence of
`negation cannot cause a deletion, we do not have to
`generate DELprim or DELadj .
`
`Let the update be hDel; o ; L; oi. A deleted edge may:
` be irrelevant and not a ect the view; cause a
`primary object or objects to be deleted; appear
`directly in the adjunct view and need to be removed.
`Either or could cause additional adjunct edges
`to be removed from the view.
`In principle, a delete
`edge update generates maintenance statements sim-
`ilar to an insert edge update. However, the delete
`edge statements must simulate the existence of the
`now deleted edge in the view to determine whether
`it originally contributed to the appearance of objects
`or edges in the view. Also, the delete edge state-
`ments must check using a subquery whether a po-
`tentially deleted object or edge should remain in the
`view due to paths not involving the deleted edge. For
`example, if the Entree object & in Figure had two
`Mushroom" ingredients, then applying the update
`hDel; & ; Ingredient; & i should not remove the En-
`tree object & from the view.
`Figure shows the procedure GenDelPrim, used to
`generate the maintenance statements for the primary
`objects.
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-6
`
`
`
`procedure GenDelPrimUpdate U = hDel; o ; L; oi, View speci cation S, RelevantVars R
` For each relevant variable
`foreach a R do
` For each place where the update can be substituted in the view spec
`foreach ha,L,bi OneStepP ath prim do
` Write a maintenance statement based on the view speci cation:
`DELprim+= copy S except for the with clause and
` The rst two replacements reconstruct the before state
`replace a:L b" with o :L fog b" in from clause.
`replace exists b in a:L" with exists b in o :L fog" in where clause.
` The remaining replacements handle normal appearance of bound variables
`Li in from clause replace a:Li" with o :Li"
`Lj in from clause replace b:Lj " with o:Lj "
`replace a" with o " in where clause
`replace b" with o" in where clause
`add and a = o " to each disjunct in where clause
`if ha,L,bi OneStepP athf rom add and b = o" to each disjunct in where clause
` The duplicate test is a subquery that ensures that the object bound
` to vi is not in the view for another reason
`add to where clause and not exists S " where S is
`S without the with clause and with new variables v
`j for each vj
`and an additional where condition: v
`i = vi" vi is the selected variable in S
`
`Figure : Generating maintenance statements for DELprim.
`
`Clause Original
`From
`Guide.Restaurant r
`
`Incremental Statement
`Guide.Restaurant r
`
`From
`Where
`Where
`
`r.Entree e
` x in r.Name: x = Baghdad Cafe"
` y in e.Ingredient: y = Mushroom"
`
`& .Entree f& ge
` x in & .Name: x = Baghdad Cafe"
` y in & .Ingredient: y = Mushroom"
`
`General Rule
`vj:Lk vk s.t.
`vj and vk = a or b ! No Change
`a:L b ! o :L fog b
`a:Lj vj s.t. vj = b ! o :Lj vj
`b:Lj vj ! o:Lj vj
`
`Figure : Transformations for incremental maintenance statements for Example
`
`Example Generating DELprim
`Suppose the update U = hDel; & ; Entree; & i is ap-
`plied to the database of Figure . The object & must
`be removed from the view. GenDelPrim generates one
`statement.
`DELprim +=
`select e
`from Guide.Restaurant r, & .Entree f& g e
`where exists x in & .Name: x = Baghdad Cafe"
`and
`exists y in & .Ingredient: y = Mushroom"
`and
`r = & and e = &
`and
`not exists
`select e
`from Guide.Restaurant r , r .Entree e
`where exists x in r .Name:
`x = Baghdad Cafe"
`exists y in e .Ingredient:
`y = Mushroom"
`e = e;
`
`and
`
`and
`
`This statement adds bindings for r and r.Entree to
`the view speci cation S and reconstructs the deleted
`edge by binding e to & . The transformations to the
`original query are summarized in the table shown in
`Figure .
`
`We then generate the maintenance statements for
`the adjunct objects and edges. However, like the ob-
`jects in the primary zone, an adjunct object or edge
`
`can be included in the view due to multiple paths.
`Reachability via a deleted edge is not a su cient con-
`dition for deleting an adjunct object or edge, as we
`explain in . Instead, a subquery of the where clause
`looks for other variable bindings for the edge to be
`removed. If another binding is found, then the edge
`is not deleted. Due to space restrictions, we omit the
`procedure GenDelAdj here; see .
`Example Generating DELadj
`For the update hDel; & ; Entree; & i, procedure Gen-
`DelAdj creates one maintenance statement for each
`path in OneStepP athadj .
`DELn
`adj +=
`select he,Name,ni
`from DELprim e, e.Name n;
`
`DELi
`adj +=
`select he,Ingredient,ii
`from DELprim e, e.Ingredient i;
`
`Neither statement in our simple example includes a
`where subclause. More complex cases, however, do.
`
`
` . . Atomic Value Change
`
`Let the update U be hChg; o ; OldVal; NewVal i. This
`value change may cause deletions, insertions or both
`
`AMERICAN EXPRESS v. METASEARCH
`CBM2014-00001 EXHIBIT 2020-7
`
`
`
`to the view, or there might be no update necessary
`because the change is irrelevant to the view. Due
`to object sharing, an object may have many incom-
`ing edges with di erent labels. Therefore, the original
`edge traversed to nd an object is not the only possi-
`bly relevant edge. Consequently, we bind the changed
`object to each variable identi ed by the procedure Rel-
`evantVars, using a separate maintenance statement for
`each variable. A single maintenance statement handles
`each case. An atomic value change that could cause
`the addition of objects within the view is treated sim-
`ilarly to an edge insertion, where the inserted edge is
`the edge followed to get to the atomic object during
`view evaluation. The deletion case is similar. Note
`t