throbber
USOO8352456B2
`
`(12) United States Patent
`Duffy et al.
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 8,352.456 B2
`Jan. 8, 2013
`
`(*) Notice:
`
`(54) PRODUCER/CONSUMER OPTIMIZATION
`(75) Inventors: John J. Duffy, Renton, WA (US);
`Henricus Johannes Maria Meijer,
`Mercer Island, WA (US)
`(73) Assignee: Microsoft Corporation, Redmond, WA
`(US)
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 590 days.
`(21) Appl. No.: 11/747,772
`(22) Filed:
`May 11, 2007
`(65)
`Prior Publication Data
`US 2008/O281786 A1
`Nov. 13, 2008
`
`(51) Int. Cl.
`(2006.01)
`G06F 7700
`(2006.01)
`G06F 7/30
`(52) U.S. Cl. ........................................ 707/713; 707/719
`(58) Field of Classification Search .................. 707/713,
`707/719,999.002
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`5,590,319 A 12/1996 Cohen et al.
`5,857, 180 A
`1/1999 Hallmarket al.
`6,009,265 A 12/1999 Huang et al.
`6,625,593 B1
`9/2003 Leung et al.
`6,968.335 B2 11/2005 Bayliss et al.
`7,051,034 B1
`5, 2006 Ghosh et al.
`7,146,365 B2 12/2006 Allen et al.
`2005/O165798 A1
`7/2005 Cherkauer et al.
`
`OTHER PUBLICATIONS
`Ganguly et al., “Query Optimization for Parallel Execution'. Pro
`ceedings of the 1992 ACM SIGMOD international conference on
`Management of data, p. 9-18, Jun. 2-5, 1992, San Diego, California,
`United States.
`
`Graefe, “Volcano—An Extensible and Parallel Query Evaluation
`System'. IEEE Transactions on Knowledge and Data Engineering.
`vol. 6, No. I. pp. 120-135, Feb. 1994, IEEE.*
`Bouganim et al., “A Dynamic Query Processing Architecture for
`Data Integration Systems'. Bulletin of the IEEE Computer Society
`Technical Committee on Data Engineering, 2000, IEEE.*
`Bala et al., “Dynamo: a transparent dynamic optimization system'.
`PLDI 00 Proceedings of the ACM SIGPLAN 2000 conference on
`Programming language design and implementation, pp. 1-12, 2000,
`ACM.
`Girbal et al., “Semi-Automatic Composition of Loop Transforma
`tions for Deep Parallelism and Memory Hierarchies”. International
`Journal of Parallel Programming, vol. 34, No. 3, pp. 261-317, 2006,
`Spreinger Verlag Berlin Heidelberg.*
`Qasem et al., “Profitable loop fusion and tiling using model-driven
`empirical search”, ICS '06 Proceedings of the 20th annual interna
`tional conference on Supercomputing, pp. 249-256, 2006, ACM.*
`Pouchetaet al., “Combined Iterative and Model-driven Optimization
`in an Automatic Parallelization Framework”. SC 10 Proceedings of
`the 2010 ACM/IEEE International Conference for High Performance
`Computing, Networking, Storage and Analysis, pp. 1-11, 2010,
`ACM.
`Ganguly, et al. "Query Optimization for Parallel Execution.”
`Hewlett-Packard Laboratories, http://delivery.acm.org/10.1145/
`140000/130291/p9-ganguly.pdf?key1=130291&key2=3748847311
`&col=GUIDE&d=GUIDE&CFID=65859593
`&CFTOKEN=60779184, Jun. 1992, last accessed Feb. 14, 2007, 10
`pageS.
`
`(Continued)
`Primary Examiner — Michael Hicks
`(57)
`ABSTRACT
`Systems and methods facilitate efficient data processing in a
`computer environment. Data producers and consumers are
`considered in aggregate rather than in isolation. In one
`instance, interaction between data producers and consumers
`is improved by integrating producers and consumers. Opti
`mization can Subsequently be performed over the combina
`tion to produce Synergistic results.
`
`12 Claims, 13 Drawing Sheets
`
`
`
`
`
`PARTITION
`DATA
`
`
`
`
`
`PRODUCER
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`IDENTIFY ONE ORMORE CONSUMERS
`ASSOCATED WITHA PRODUCER
`
`1110
`
`IDENTIFY CONSUMERACTIONS
`
`MODIFY THE PRODUCER ASA
`FUNCTION OF CONSUMERACTIONS
`
`1120
`
`1130
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 1
`
`

`

`US 8,352.456 B2
`Page 2
`
`OTHER PUBLICATIONS
`
`Lu, et al. "Optimization of Multi-Way Join Queries for Parallel
`Execution.” Proceedings of the 17th International Conference on
`Very Large Data Bases, http://www.sigmod.org/vldb/conf 1991/
`P549. PDF, Sep.1991, Barcelona, Spain, last accessed Feb. 14, 2007,
`12 pages.
`
`Ayres, et al. "An Extensible Query Execution Engine for Supporting
`New Query Execution Models.” http://icwww.epfl.ch/publications/
`documents/IC TECH REPORT 2005034.pdf, last accessed Feb.
`14, 2007, 16 pages.
`
`* cited by examiner
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 2
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 1 of 13
`
`US 8,352.456 B2
`
`100
`
`Y
`
`
`
`PRODUCER
`COMPONENT
`
`CONSUMER
`COMPONENT
`
`PROGRAM CODE
`
`INTERFACE
`COMPONENT
`
`OPTIMIZATION
`COMPONENT
`
`Fig. 1
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 3
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 2 of 13
`
`US 8,352.456 B2
`
`
`
`130
`
`INTEGRATION
`COMPONENT
`
`EXECUTION
`STRATEGY
`COMPONENT
`
`OPTIMIZATION
`COMPONENT
`
`COST
`COMPONENT
`
`Fig. 2
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 4
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 3 of 13
`
`US 8,352.456 B2
`
`
`
`130
`
`INTEGRATION
`COMPONENT
`
`QUERY PLAN
`COMPONENT
`
`OPTIMIZATION
`COMPONENT
`
`COST
`COMPONENT
`
`Fig. 3
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 5
`
`

`

`US. Patent
`
`Jan.8,2013
`
`Sheet4of13
`
`US 8,352,456 B2
`
`
`
`MNEDmZOU
`
`v.5
`
`mmUDQOMm
`
`mammg
`
`HMSC5
`
`ZOHEHm‘fim
`
`<F<Q
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 6
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 6
`
`
`
`

`

`US. Patent
`
`Jan 8,2013
`
`mf05a
`
`CuU
`
`00
`
`4,253a
`
`mMMEDmZOU
`
`mmEDmZOQ
`
`MmEDmZOO
`
`MmEDmZOO
`
`
`
`2
`
`B.xm”a
`
`MMUDDOMQ
`
`vaqmm
`
`Emma
`
`PUmAMm
`
`Mimi?»
`
`Homw—m—m
`
`mam—:3
`
`HQmAm—m
`
`Emma
`
`ZOEEUm/wm
`
`<H<Q
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 7
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 7
`
`
`

`

`U.S. Patent
`
`US 8,352.456 B2
`
`
`
`YHTEIVNÍCI SNOO
`
`LOEITIGIS
`
`JLOHTEIS NA
`
`
`
`
`
`
`
`
`
`GIOVLS EINITIHd|Id]
`
`|
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 8
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 7 of 13
`
`US 8,352.456 B2
`
`
`
`130
`
`CONSUMER
`ANALYSIS
`COMPONENT
`
`PRODUCER
`MODIFICATION
`COMPONENT
`
`OPTIMIZATION COMPONENT
`
`Fig. 7
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 9
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 8 of 13
`
`US 8,352.456 B2
`
`-800
`
`QUERY
`
`
`
`810
`
`DATABASE
`INTERFACE
`COMPONENT |
`
`RESULT
`- RESULT
`:
`
`RESULT
`
`QUERY
`PROCESSOR
`COMPONENT
`
`DATABASE
`
`Fig. 8
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 10
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 9 of 13
`
`US 8,352.456 B2
`
`
`
`START
`
`900
`
`Y
`
`IDENTIFY A PROGRAMMATIC PRODUCER
`AND ASSOCIATED CONSUMER
`
`OPTIMIZE INTERACTION BETWEEN THE
`CONSUMER AND THE PRODUCER
`
`910
`
`920
`
`Fig. 9
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 11
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 10 of 13
`
`US 8,352.456 B2
`
`
`
`START
`
`y
`
`1000
`
`ANALYZE A CODE OF A
`PRODUCER WITH INTEGRATED
`CONSUMER
`
`1010
`
`DETERMINING AN EXECUTION
`STRATEGY BASED ON THE ANALYSIS
`
`1020
`
`EXECUTE THE CODE
`UTILIZING THE STRATEGY
`
`1030
`
`Fig. 10
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 12
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 11 of 13
`
`US 8,352.456 B2
`
`
`
`1100
`
`Y
`
`IDENTIFY ONE OR MORE CONSUMERS
`ASSOCIATED WITH A PRODUCER
`
`1110
`
`IDENTIFY CONSUMERACTIONS
`
`1120
`
`MODIFY THE PRODUCER AS A
`FUNCTION OF CONSUMERACTIONS
`
`1130
`
`Fig.11
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 13
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 12 of 13
`
`US 8,352.456 B2
`
`4. 1228
`- - - - -
`APPLICATION(S)
`I
`- - - - - - - -
`
`-
`
`-
`
`-
`
`-
`
`-
`
`
`
`^ 1210
`
`1212
`
`PROCESSING
`UNIT(S)
`
`SYSTEM
`MEMORY
`
`MASS
`STORAGE
`
`INTERFACE
`COMPONENT(S)
`
`INPUT OUTPUT
`
`Fig. 12
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 14
`
`

`

`U.S. Patent
`
`Jan. 8, 2013
`
`Sheet 13 of 13
`
`US 8,352.456 B2
`
`
`
`1310
`
`CLIENT(S)
`
`
`
`
`
`
`
`CLIENT
`DATA
`STORE(S)
`
`
`
`1330
`
`SERVER(S)
`
`1340
`
`SERVER
`DATA
`STORE(S)
`
`
`
`
`
`COMMUNICATION
`FRAMEWORK
`
`1350
`
`Fig. 13
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 15
`
`

`

`1.
`PRODUCERFCONSUMER OPTIMIZATION
`
`BACKGROUND
`
`Computer programs are groups of instructions that 5
`describe actions to be performed by a computer or other
`processor-based device. When a computer program is loaded
`and executed on computer hardware, the computer will
`behave in a predetermined manner by following the instruc
`tions of the computer program. Accordingly, the computer 10
`becomes a specialized machine that performs the tasks pre
`scribed by the instructions.
`A programmer using one or more programming languages
`creates the instructions comprising a computer program.
`Typically, source code is specified or edited by a programmer 15
`manually and/or with help of an integrated development envi
`ronment (IDE). Subsequently, the source code can be com
`piled or otherwise transformed by another program into com
`puter instructions executable by a computer or like device.
`In Software engineering, a plurality of design patterns are 20
`conventionally utilized in program development. A design
`pattern provides a framework for describing a particular issue
`and Solutions thereto. More specifically, a design pattern is a
`general, repeatable solution for common issues that occur in
`Software design. Among other things, use of design patterns 25
`speeds up development, helps prevent Subtle issues and
`improves program readability and comprehension by those
`familiar with the pattern.
`One basic design pattern is producer/consumer. A pro
`ducer/consumer relationship is one in which a producer gen- 30
`erates data and the consumer uses the data. This pattern is
`utilized in a myriad of different environments for a number of
`processes including, at a higher level, data warehousing for
`cleansing and transforming data and image processing for
`iterative refinement. In fact, the pattern can apply to any 35
`situation in which data is produced and consumed. One par
`ticularly prevalent use case pertains to queries.
`Query execution can be seen as a traditional client/server or
`consumer/producer model where an entity A requests a ser
`vice from another entity B, in this case the retrieval of some 40
`data that satisfies criteria and is in the shape requested. Some
`bi-directional communication mechanism is required Such
`that A can instruct B about its desire and so that B may
`respond to A with the results. The entire result set is returned
`in Some form and thereafter consumed for Some purpose. 45
`Conventional relational database management Software
`(RDBMS) employs cursors for the output interface with
`which to stream query output to the consumer. A cursor is a
`single stream of data that facilitates Supply of data in some
`definite, sequential ordering.
`
`50
`
`SUMMARY
`
`The following presents a simplified Summary in order to
`provide a basic understanding of some aspects of the claimed 55
`Subject matter. This Summary is not an extensive overview. It
`is not intended to identify key/critical elements or to delineate
`the scope of the claimed Subject matter. Its sole purpose is to
`present some concepts in a simplified form as a prelude to the
`more detailed description that is presented later.
`Briefly described, the subject disclosure pertains to opti
`mization of producer/consumer code. Instead of treating pro
`ducers and consumers as blackboxes, activities of a producer
`and one or more associated consumers can be analyzed and
`employed to facilitate efficient execution. In other words, 65
`optimization can be performed on a producer/consumer
`aggregate rather than on the producer and consumer in isola
`
`60
`
`US 8,352,456 B2
`
`2
`tion. Furthermore, the producer and/or consumer can be an
`aggregate of some other producer and/or consumer Such that
`the optimization can be recursive.
`In accordance with one aspect of the disclosure, consumer
`activity can be merged or integrated with producer activity. In
`this case, consumer activity can simply be considered an
`extension of producer activity. As a result, an execution strat
`egy can be generated over the integration or aggregate to
`implement parallelism, among other things. By way of
`example, if the producer corresponds to a query, novel and/or
`conventional query optimization approaches can be utilized
`that cross the producer/consumer boundary.
`According to another aspect of the disclosure, consumer
`activity can be analyzed and employed to streamline producer
`execution alone or in conjunction with other optimization
`techniques. More particularly, a producer can be specialized
`to produce data solely of interest to one or more associated
`COSU.S.
`To the accomplishment of the foregoing and related ends,
`certain illustrative aspects of the claimed Subject matter are
`described herein in connection with the following description
`and the annexed drawings. These aspects are indicative of
`various ways in which the Subject matter may be practiced, all
`of which are intended to be within the scope of the claimed
`Subject matter. Other advantages and novel features may
`become apparent from the following detailed description
`when considered in conjunction with the drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block diagram of a code optimization system in
`accordance with an aspect of the claimed subject matter.
`FIG. 2 is a block diagram of a representative optimization
`component.
`FIG. 3 is a block diagram of a representative optimization
`component for query processing.
`FIG. 4 is a graphical illustration of a consumer/producer
`processing approach.
`FIG. 5 is a graphical illustration of a consumer/producer
`processing approach utilizing partition parallelism.
`FIG. 6 is a graphical illustration of a consumer/producer
`processing approach utilizing pipeline parallelism.
`FIG. 7 is a block diagram of a representative optimization
`component.
`FIG. 8 is a block diagram of a database management sys
`tem operable to return data for parallel consumption.
`FIG. 9 is a flow chart diagram of a method of optimizing
`producer/consumer interaction.
`FIG. 10 is a flow chart diagram of a method of data pro
`cessing.
`FIG. 11 is a flow chart diagram of a method of code opti
`mization.
`FIG. 12 is a schematic block diagram illustrating a suitable
`operating environment for aspects of the Subject disclosure.
`FIG. 13 is a schematic block diagram of a sample-comput
`ing environment.
`
`DETAILED DESCRIPTION
`
`Systems and methods are described hereinafter concerning
`data processing. Processing efficiency can be improved by
`optimizing interactions between producers and consumers.
`More particularly, producers and consumers can be consid
`ered en masse rather than as isolated, atomic processes. Opti
`mizations can then be performed with respect to both data
`production and consumption resulting in a synergistic result.
`Such optimization can be accomplished via direct program
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 16
`
`

`

`15
`
`3
`specification, an application programming interface, a com
`piler and/or a query processor, amongst other mechanisms.
`Various aspects of the subject disclosure are now described
`with reference to the annexed drawings, wherein like numer
`als refer to like or corresponding elements throughout. It
`should be understood, however, that the drawings and
`detailed description relating thereto are not intended to limit
`the claimed subject matter to the particular form disclosed.
`Rather, the intention is to coverall modifications, equivalents
`and alternatives falling within the spirit and scope of the
`claimed Subject matter.
`Referring initially to FIG. 1, a code optimization system
`100 is illustrated in accordance with an aspect of the claimed
`Subject matter. The system includes programmatic code 110
`comprising one or more producer components 112 and con
`Sumer components 114. The producer component 112 pro
`duces or generates data that is consumed or utilized by one or
`more consumer components 114. As will be described further
`infra, consumer components 114 can also act as or include
`functionality of producer components 112 or vice versa, for
`example where there is a sequence of components that receive
`data, perform some action and pass the data to the next com
`ponent. Further yet, the producer component 112 and con
`Sumer component 14 can operate with respect to arbitrary
`data types including, streams, graphs and trees, among others.
`Interface component 120 is operable to receive, retrieve or
`otherwise obtain or identify related producer components 112
`and consumer components 114 within programmatic code
`110. These components or identities thereof can be provided
`to or made accessible by optimization component 130. The
`interface component 120 therefore acts as a conduit between
`the programmatic code 110 and the optimization component
`130. Accordingly, it is also to be noted that the interface
`component 120 and/or portions thereof can facilitate modifi
`cation of the code 110 by optimization component 130.
`The optimization component 130 facilitates improving
`data processing by optimizing producer components 112,
`consumer components 114 and/or interaction between the
`components. Conventionally, producer components 112 and
`consumer components 114 are treated as isolated atomic pro
`cesses. Such treatment is likely a function of traditional
`modular program design and/or presumed communication
`latency or low throughput. However, the conventional treat
`ment can result in Sub-optimal execution in many cases. For
`instance, this is quite limiting for Scenarios with Small dis
`tances or high throughput including in memory, in the same
`process and/or on the same machine. Further, with an increase
`in availability of parallel hardware and decrease in commu
`nication latency, it is important that Software evolve to do
`more things in parallel. That includes consuming items from
`a producer.
`Rather than treating consumer components 114 as black
`boxes, the optimization component 130 is operable to analyze
`consumer activity to facilitate optimization of the producer
`component 112, the consumer component 114 or both. In one
`instance, consumeractivity associated with a producer can be
`utilized to refine data produced by a producer. Additionally or
`alternatively, at least a portion of consumer activity can be
`merged within a producer or at least treated as Such for opti
`mization purposes.
`In essence, structure of a producer component 112 and at
`least one associated consumer component 114 is known or
`can be determined. Based thereon, the producer component
`112 and the consumer component 114 are matched up in a
`way to optimize performance, parallelism and/or data com
`65
`munication. Furthermore, it is to be appreciated that once
`consumeractivity is exposed optimizations can be performed
`
`40
`
`45
`
`50
`
`55
`
`60
`
`US 8,352,456 B2
`
`10
`
`25
`
`30
`
`35
`
`4
`recursively thereon providing deeper optimizations than
`would otherwise have been possible. By way of example,
`where there is a sequence of producers and consumers, the
`consumers can also be producers that can be optimized.
`The interface component 120 and/or optimization compo
`nent 130 can be embodied in any number of systems or
`mechanisms. By way of example, the interface component
`120 and optimization component 130 can be embodied in an
`application programming interface (API) associated with
`producers and consumers. Additionally or alternatively, the
`components 120 and 130 can form part of a compiler such that
`generated code including producers and consumers is opti
`mized for execution. Further yet, the components can form
`part of a query processor and/or associated components.
`However, the claimed subject matter is not limited to these
`exemplary embodiments. Other embodiments are also pos
`sible and are to be deemed with in the scope of innovation.
`Turning attention to FIG. 2, a representative optimization
`component 130 is depicted in accordance with an aspect of
`the disclosure. The optimization component 130 includes an
`integration component 210 to integrate or merge consumer
`activity with producer activity. By integrating producers and
`consumers, a bottleneck can be removed from the process
`wherein generated data is merged or packaged into a data
`structure and transmitted or otherwise identified to the con
`Sumer for processing.
`Integration component 210 can operate over a plurality of
`consumers and producers. By way of example, a sequence of
`producers and consumers can exist where data is transformed
`and passed to the next element that transforms that data and
`passes it on to the next, etc. Essentially, there can be a pipeline
`of consumers that are producers themselves. The integration
`component 210 can integrate or aggregate all these consum
`ers and producers together into one big group Such that opti
`mization can be applied across all consumers and producers.
`For instance, Suppose there is a process that takes an image
`and performs successive transformations thereon. These
`transformations can be glued together into one big process
`such that optimization can be performed with respect to the
`image as a whole instead of separate parts where intermediate
`values are sent to the next black box. More specifically, the
`integration component 210 can Support intra and/or inter
`component optimizations.
`Execution strategy component 220 is operable to optimize
`execution of an integrated producer and consumer. In particu
`lar, the execution strategy component 220 can identify an
`optimized execution strategy and/or modify the code in
`accordance therewith. For instance, the execution strategy
`component 220 can employ conventional optimization
`approaches across a producer/consumer boundary including
`without limitation common Sub-expression elimination and
`operation reordering. While these optimizations are similar to
`what compilers do everyday, they are different in that the
`execution strategy component 220 can take advantage of rich
`semantic knowledge to rewrite higher level operations in
`addition to lower level program Statements.
`Various parallelism approaches can also be employed by
`the execution strategy component 230 including partitioning
`and pipelining. In partitioning, code is replicated, and the
`replicas execute simultaneously over disjoint Subsets of data.
`In one particular instance, multistreaming can be employed
`between producers and consumers to permit parallelism to
`extend from core production operations into consumption
`operations without any Superfluous merge operation or other
`bottleneck, for example. Pipelining or pipeline parallelism
`divides processing tasks into a set of tasks connected in series.
`Tasks in each pipeline are often parallelized for example
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 17
`
`

`

`US 8,352,456 B2
`
`10
`
`15
`
`25
`
`30
`
`35
`
`5
`utilizing partitioning. This approach can be employed
`instances where partitioning alone is not efficient because of
`varying costs associated with task execution.
`Cost component 230 identifies execution cost associated
`with producer and/or consumer operations. This information
`can be determined or inferred and supplied to the execution
`strategy component 220 to facilitate identification of an
`appropriate parallel execution strategy or degree of parallel
`ism given the costs and available resources, for instance.
`FIG. 3 depicts a representative optimization component
`130 with respect to a query-processing embodiment.
`Although not limited thereto, produces/consumer data pro
`cessing can apply to query processes, namely generation of
`query results and processing thereof. Such queries can be
`traditional database queries (e.g., relational database query)
`or language integrated queries, among others. Similar to FIG.
`2, FIG.3 includes the integration component 210 for integrat
`ing, in this case, queries and consumers of query results and
`cost component 220 for determining or inferring execution
`cost associated with query and/or consumption operations.
`Further, optimization component 130 can include a query
`plan component 320 that can generate and/or optimize a
`query plan. A query plan defines a query execution strategy
`for retrieving data. In accordance with an aspect of the
`claimed subject matter the query plan component 320 can
`generate and/or optimize plans as a function of consumer
`activity. In other words, consumption action can be incorpo
`rated into query planning optimization itself. Particular plans
`can be generated or modified as a function of execution cost
`associated with producer and/or consumer costs provided by
`the cost component 230.
`In one implementation, the query plan component 320 can
`generate an abstract syntax tree or graph that represents the
`query itself wherein nodes identify query operations and
`edges identify flow between operators. Consumeractivity can
`be incorporated therein and represented as another node in the
`graph or tree. Conventional and/or novel optimization tech
`niques can be applied to the integrated representation and
`subsequently employed during execution via a query proces
`40
`sor, engine, component or the like (not shown). By way of
`example and not limitation, such optimization techniques can
`include standard relational algebra-based rewrite rules, as
`most relational databases are employed against tree-based
`representations of queries.
`FIGS. 4-6 provide exemplary query scenarios to facilitate
`clarity and understanding with respect to aspects of the
`claimed subject matter. It is to be appreciated that the
`examples and associated description are provided solely for
`clarity and understanding and are not meant to limit the scope
`of the claimed subject matter.
`The following exemplary code pertains to a consumer pro
`cessing situation in the context of a language integrated
`query:
`
`45
`
`50
`
`void ProcessCustomer(var c) { /*...* }
`void ProcessXyzCustomers(List<Customers customers) {
`DateTime orderDate = DateTime.Now. Subtract(-30):
`var filtered = from c in customers
`where c. Active &&.
`(c. State == "AK ||c. State == “HI”) &&.
`c.Orders.Count(o => o.OrderDate > orderDate) > 0
`select c;
`foreach (varc in filtered)
`ProcessCustomer(c);
`
`55
`
`60
`
`65
`
`6
`More specifically, the query var filtered from c in customers
`retrieves customers filtered by or including particular prop
`erties (e.g., customer activation, state and orders). Subse
`quently, a loop for each (varc in filtered) performs an action
`on each customer returned by the query, namely ProcessCus
`tomer(c) (implementation omitted).
`Referring to FIG. 4, a conventional approach to executing
`the above exemplary query is depicted. More specifically,
`FIG. 4 illustrates what might occur in a traditional approach
`executing on a four central processing unit (CPU) machine
`with a partition-parallelism approach. As shown, data is
`retrieved from a store and split or partitioned across the four
`CPUs for parallel query processing (Where, Select). Results
`are captured from the parallelized query, merged and Subse
`quently provided to the consumer. Note the loss of parallelism
`as soon as the “end” of the producer is reached via the explicit
`merge. While the query processing is executing across four
`processors all produced data needs to be merged prior to
`delivery to the consumer, which is then executed without
`parallelism. The merge is a bottleneck in this process.
`FIG.5 graphically illustrates an optimized execution plan
`for the exemplary code utilizing partitioning. The limitations
`imposed by the previous approach are unnecessary and can
`hamper any query engine's ability to generate efficient plans
`and decisions during execution. This can be solved by mod
`eling consumer action as another query operation. It is to be
`noted that this can apply to scenarios in which a consumerand
`producer are in the same process, in separate processes or
`even on separate machines. In other words, the distance
`between consumer and producer can be arbitrary. Accord
`ingly, the merge action can be removed and the consumer
`ProcessCustomer() function run in parallel with other query
`operations. Further, the consumer could itself invoke other
`sub-queries or operations that could become part of an overall
`query execution plan.
`In the code, the semicolon between select c and for each
`identifies the split between producer and consumer. In that
`case, a filtered object needs to be created and passed into the
`consumer loop, as previously described. What is desired is to
`fuse the producer and the consumer so that the loop is inte
`grated into the query. This can result be accomplished be
`removing the loop and altering the query such that select c
`becomes select Produceconsumer(c). Now the consumer
`activity can simply appear as another query operation. During
`query plan generation and/or optimization, consumers can be
`partitioned across the four CPUs and operate on a subset of
`data provided by the producers.
`Turning to FIG. 6, another optimization approach is illus
`trated utilizing pipelining. Just as a query plan component
`and/or related components can peek inside of query opera
`tions such as Where and Select to determine and adaptively
`gather selectivity and cost, using this data to make better
`judgments about query execution now and in the future, it
`may also do so with a consumer. Therefore, not only can
`consumer activities be run in parallel, a query planner can
`make decisions based on this. For example, pipelining is often
`used to balance a query with imbalanced costs. This assumes
`that the planner is aware of consumer costs as traditional
`approaches are limited to producer activity. With this
`approach, a planner may choose to introduce new pipeline
`stages including running the consumer as its own pipeline,
`simply because it knows about this extension of the query.
`For example, if the aforementioned exemplary query is to
`be executed on a multiple CPU machine and the consumer
`costs four times the cost of the Where which costs two times
`the cost of the Select portion of the query, a reasonable expec
`tation since production is often an expensive operation, then
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2174, p. 18
`
`

`

`US 8,352,456 B2
`
`10
`
`15
`
`25
`
`7
`pipelining can be used to balance the query as shown in FIG.
`6. This style of optimization is simply not possible with
`conventional techniques and/or approaches.
`Referring to FIG. 7, a representative optimization compo
`nent 130 is illustrated in accordance with an aspect of the
`claimed Subject matter. As illustrated, the optimization com
`ponent 130 includes a consumer analysis component 710.
`The consumer analysis component 710 is a mechanism for
`analyzing activity across one or more consumers. Producer
`modification component 720 is communicatively coupled to
`the consumer analysis component 710 and operable to
`receive, retrieve or otherwise obtain consumer information.
`The producer modification component 720 can then modify
`producer activity as a function of consumer activity. By way
`of example, consider a situation where the producer yields
`customers and the consumer processes customers. If the con
`Sumer analysis component 710 can determine that a set of one
`or more consumers only ever utilizes a customer name, the
`producer modification component 720 can alter the query to
`produce only customer names rather than all customer infor
`mation. Accordingly, large portions of computation may be
`able to be eliminated as a function of consumer activity Such
`as where the producer produces that which the consumer is
`not interested, among other things.
`It should be appreciated that by removing a black box
`distinction between producers and consumers various opti
`mizations can occur. Here, optimization component 130
`facilitates producer modification in light of consumer activ
`ity. This functionality can be provided alone or in combina
`tion with other optimizations including parallelism. For
`example, the functionality captured by the consumer analysis
`component 710 and producer modification component 720
`can be captured and/or performed by execution strategy com
`ponent 220 of FIG. 2 or query plan component 320 of FIG. 3.
`Referring to FIG. 8, a database management system 800 is
`illustrated in accordance with an aspect of the claimed subject
`matter. The database management system 800 includes a
`database interface component 810, query processor compo
`nent 820 and database 830 for processing queries. The data
`base interface component 810 receives, retrieves or otherwise
`obtains a database query and transmits it to the query proces
`sor component 820. The query processor component 820
`process the query against the database 830 and returns results.
`In accordance with one aspect, the query processor compo
`nent 820 can produce results in accordance with a particular
`45
`query plan transmitted with the query or otherwise o

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket