throbber
111111
`
`1111111111111111111111111111111111111111111111111111111111111
`US007185003B2
`
`c12) United States Patent
`Bayliss et al.
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 7,185,003 B2
`Feb.27,2007
`
`(54) QUERY SCHEDULING IN A
`PARALLEL-PROCESSING DATABASE
`SYSTEM
`
`(75)
`
`Inventors: David Bayliss, Delray Beach, FL (US);
`Richard Chapman, Boca Raton, FL
`(US); Jake Smith, London (GB); Ole
`Poulsen, Bend, OR (US); Gavin
`Halliday, Royston (GB); Nigel Hicks,
`London (GB)
`
`(73) Assignee: Seisint, Inc., Boca Raton, FL (US)
`
`5,471,622 A
`5,495,606 A
`5,551,027 A
`5,555,404 A
`5,655,080 A
`5,732,400 A
`
`1111995 Eadline
`2/1996 Borden eta!.
`8/1996 Choy eta!.
`9/1996 Torbj0rnsen et a!.
`8/1997 Dias eta!.
`3/1998 Mandler et a!.
`
`(Continued)
`
`OTHER PUBLICATIONS
`
`( *) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 774 days.
`
`Eike Schallehn et a!., "Advanced Grouping and Aggregation for
`Data Integration," Department of Computer Science, Paper ID: 222,
`pp. 1-16.
`
`(21) Appl. No.: 10/293,489
`
`(22) Filed:
`
`Nov. 14, 2002
`
`(65)
`
`Prior Publication Data
`
`US 2004/0098374 Al May 20, 2004
`
`(51)
`
`Int. Cl.
`G06F 17130
`(2006.01)
`G06F 15116
`(2006.01)
`G06F 9146
`(2006.01)
`(52) U.S. Cl. ............................ 707/3; 707/10; 718/102;
`709/201
`(58) Field of Classification Search .................... 707/3,
`707/10; 718/102; 709/201-203
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,543,630 A
`4,860,201 A
`4,870,568 A
`4,925,311 A
`5,006,978 A
`5,276,899 A
`5,303,383 A
`5,423,037 A
`
`9/1985 Neches
`8/1989 Stolfo et al.
`9/1989 Kahle et al.
`5/1990 Neches eta!.
`4/1991 Neches
`111994 Neches
`4/1994 Neches eta!.
`6/1995 Hvasshovd
`
`(Continued)
`
`Primary Examiner-loon Hwan Hwang
`(74) Attorney, Agent, or Firm-Hunton & Williams LLP
`
`(57)
`
`ABSTRACT
`
`A system and method for scheduling database operations to
`one or more databases in a parallel-processing database
`system are described herein. After a query server generates
`a dynamic-link library (DLL) or other executable represen(cid:173)
`tative of one or more database operations to a database, the
`query server notifies a scheduling services module of the
`generation of the DLL and submits the DLL to a query agent.
`The query agent notifies the scheduling services module of
`its receipt of the DLL. Based on any of a variety of
`considerations, the scheduling services module schedules a
`time of execution for the DLL by one or more processing
`matrices that store the database. At the scheduled time, the
`scheduling services module directs the query agent to submit
`the DLL to the indicated processing matrices. The schedul(cid:173)
`ing services module also can be adapted to monitor the
`execution of previously submitted DLLs by one or more
`processing matrices and adjust the scheduled times of execu(cid:173)
`tion for subsequent DLLs accordingly.
`
`25 Claims, 21 Drawing Sheets
`
`Blue Coat Systems - Exhibit 1011 Page 1
`
`

`
`US 7,185,003 B2
`Page 2
`
`U.S. PATENT DOCUMENTS
`
`707/2
`
`4/1998 Jhingran et a!.
`5,745,746 A
`5,857,180 A *
`111999 Hallmark et a!.
`3/1999 Van Ruben et al.
`5,878,408 A
`3/1999 Ramesh eta!.
`5,884,299 A
`4/1999 Lasser eta!.
`5,897,638 A
`1111999 Kobayashi et al.
`5,983,228 A
`12/1999 Leong
`6,006,249 A
`212000 Tsuchida et a!.
`6,026,394 A
`6/2000 Cochrane et al.
`6,081,801 A
`6,266,804 B1
`7/2001 Isman
`6,311,169 B2
`10/2001 Duhon
`6,427,148 B1
`7/2002 Cos sock
`6,990,503 B1 *
`112006 Luo eta!. ................... 707/200
`2003/0037048 A1 * 2/2003 Kabra et al .................... 707/4
`OTHER PUBLICATIONS
`
`Vincent Coppola, "Killer APP," Men's Journal, vol. 12, No. 3, Apr.
`2003, pp. 86-90.
`Eike Schallehn et a!., "Extensible and Similarity-based Grouping
`for Data Integration," Department of Computer Science, pp. 1-17.
`Rohit Ananthakrishna et a!., "Eliminating Fuzzy Duplicates in Data
`Warehouses," 12 pages.
`Peter Christen et a!., "Parallel Computing Techniques for High(cid:173)
`Performance Probabilistic Record Linkage," Data Mining Group,
`Australian National University, Epidemiology and Surveillance
`Branch, Project web page: http://datamining.anu.edu.au/linkage.
`htrnl, 2002, pp. 1-11.
`Peter Christen et a!., "Parallel Techniques for High-Performance
`Record Linkage (Data Matching)," Data Mining Group, Australian
`National University, Epidemiology and Surveillance Branch,
`Project web page: http://datamining.anu.edu.au/linkage.htrnl, 2002,
`pp. 1-27.
`Peter Christen eta!., "High-Performance Computing Techniques for
`Record Linkage," Data Mining Group, Australian National Univer(cid:173)
`sity, Epidemiology and Surveillance Branch, Project web page:
`http://datamining.anu.edu.au/linkage.htrnl, 2002, pp. 1-14.
`William E. Winkler, "Matching And Record Linkage," U.S. Bureau
`of the Census, pp. 1-38.
`Peter Christen eta!., "High-Performance Computing Techniques for
`Record Linkage," ANU Data Mining Group, Australian National
`University, Epidemiology and Surveillance Branch, Project web
`page: http:/ /datamining.anu.edu.au/linkage.htrnl, pp. 1-11.
`William E. Winkler, "The State of Record Linkage and Current
`Research Problems," U.S. Bureau of the Census, 15 pages.
`William E. Winkler, "Advanced Methods For Record Linkage,"
`Bureau of the Census, pp. 1-21.
`William E. Winkler, Frequency-Based Matching in Fellegi-Sunter
`Model of Record Linkage, Bureau Of The Census Statistical
`Research Division, Oct. 4, 2000, 14 pages.
`
`William E. Winkler, "State of Statistical Data Editing And Current
`Research Problems," Bureau Of The Census Statistical Research
`Division, 10 pages.
`The First Open ETLIEAI Software For The Real-Time Enterprise,
`Sunopsis, A New Generation ETL Tool, "Sunopsis™ v3 expedites
`integration between heterogeneous systems for Data WARE(cid:173)
`HOUSE, Data Mining, Business Intelligence, and OLAP projects,"
`<www.suopsis.com>, 6 pages.
`Alan Dumas, "The ETL Market and Sunopsis™ v3 Business
`Intelligence, Data Warehouse & Datamart Projects," 2002,
`Sunopsis, pp. 1-7.
`Teradata Warehouse Solutions, "Teradata Database Technical Over(cid:173)
`view," 2002, pp. 1-7.
`WhiteCross White Paper, May 25, 2000, ''wx/des-Technical Infor(cid:173)
`mation," pp. 1-36.
`Teradata Alliance Solutions, "Teradata and Ab Initio," pp. 1-2.
`Peter Christen et a!., The Australian National University,
`"Febrl-Freely extensible biomedical record linkage," Oct. 2002,
`pp. 1-67.
`William E. Winkler, "Using the EM Algorithim for Weight Com(cid:173)
`putation in the Fellegi-Sunter Model of Record Linkage," Bureau
`Of The Census Statistical Research Division, Oct. 4, 2000, 12 pages.
`William E. Winkler et a!., "An Application Of The Fellegi-Sunter
`Model Of Record Linkage To The 1990 U.S. Decennial Census,"
`U.S. Bureau of the Census, pp. 1-22.
`William E. Winkler, "Improved Decision Rules In The Fellegi(cid:173)
`Sunter Model Of Record Linkage," Bureau of the Census, pp. 1-13.
`Fritz Scheuren eta!., "Recursive Merging and Analysis of Admin(cid:173)
`istrative Lists and Data," U.S. Bureau of the Census, 9 pages.
`William E. Winkler, "Record Linkage Software and Methods for
`Merging Administrative Lists," U.S. Bureau of the Census, Jul. 7,
`2001, 11 pages.
`Enterprises, Publishing and Broadcasting Limited, Acxiom(cid:173)
`Abilitec, pp. 44-45.
`TransUnion, Credit Reporting System, Oct. 9, 2002, 4 pages,
`<http://www. transunion. corn/ content/page .j sp? id ~/transunion/ gen(cid:173)
`eral/data/business/BusCre ... >.
`TransUnion, ID Verification & Fraud Detection, Account Acquisi(cid:173)
`tion, Account Management, Collection & Location Services,
`Employment Screening, Risk Management, Automotive, Banking(cid:173)
`Savings & Loan, Credit Card Providers, Credit Unions, Energy &
`Utilities, Healthcare, Insurance, Investment, Real Estate, Telecom(cid:173)
`munications, Oct. 9, 2002,46 pages, <http://www.transunion.com>.
`White Paper An Introduction to OLAP Multidimensional Terminol(cid:173)
`ogy and Technology, 20 pages.
`* cited by examiner
`
`Blue Coat Systems - Exhibit 1011 Page 2
`
`

`
`D
`
`108
`
`132
`
`Repository
`.1.1Q.
`
`Attribute
`
`Attribute
`
`Naming Services
`.112.
`
`Query Builder
`1Q2
`
`Query Server
`1Ql
`
`Scheduling
`Services
`l.H
`
`Results·
`
`Fig. 1
`
`Query Phase
`
`Global Results
`Processing Matrix
`ill
`
`14-----Results
`
`General Purpose Query
`Processing Matrix
`120
`
`Index-Based Query
`Processing Matrix
`122
`
`ill
`
`~---------------------------------------------------------------------------------------------------------4
`
`-
`
`Results-----
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`
`rFJ =-('D
`.....
`....
`0 .....
`N ....
`
`d
`rJl
`"'--...1
`
`""""' 00 u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 3
`
`

`
`Fig. 2
`
`200
`
`Query Server
`.1Q2.
`
`Query Builder
`1Qa
`
`Query Agent
`1M
`
`Global Results
`Processing Matrix
`ill
`
`Scheduling Services
`m
`
`General Purpose
`Query Processing
`Matrix
`120
`
`Index-Based Query
`Processing Matrix
`122
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`N
`0 .....
`N ....
`
`d
`rJl
`
`"'--...1
`
`""""' 00 u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 4
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 3 of 21
`
`US 7,185,003 B2
`
`Creation and
`Submission of Query
`~
`
`Fig. 3
`
`Compile DLL from
`Query
`~
`
`Process DLL at
`Index-Based Query
`Processing Matrix
`.llQ.
`
`Process DLL at
`General Purpose
`Query Processing
`Matrix
`ill
`
`Process DLL at
`Global Results
`Processing Matrix
`m
`
`Process DLL at
`Query Agent
`~
`
`Provide Results to
`Global Results
`Processing Matrix
`~
`
`Processing Matrix
`~
`
`Provide Results to
`Query Agent
`ill
`
`Provide Results to
`Work-Unit Reporting
`~
`
`Provide Results to
`Work-Unit Reporting
`~
`
`Provide Results to
`Work-Unit Reporting
`~
`
`Blue Coat Systems - Exhibit 1011 Page 5
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 4 of 21
`
`US 7,185,003 B2
`
`Fig. 4
`
`Submitted ECL Query
`
`Submitted XML Query
`1
`I
`I
`
`_______ y ______ _
`
`I
`I
`:
`~ Generate ECL
`: Query from XML :
`:
`Template
`:
`4n~
`I
`I
`'-------.. .,-------·
`.:::r.:&=.
`I
`I
`'Y
`
`... ,..
`
`Syntax Check
`~
`
`~,
`
`Incorporate
`Attributes
`406
`
`lF
`
`Generate C++
`code from ECL
`Query Syntax
`iQa
`
`,,
`
`Optimize C++
`code
`lli
`
`~,
`
`Compile C++ code
`into DLL(s)
`lli
`
`Blue Coat Systems - Exhibit 1011 Page 6
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 5 of 21
`
`US 7,185,003 B2
`
`It)
`•
`
`C) ·-u.
`
`..
`
`~
`
`+a-
`+
`rl>
`UUl'::::'
`c
`«>E(I)~
`-OE
`co
`......
`(1)
`'- -
`Q)Q)-
`ai-ol!!
`C98(J)
`
`+
`+
`.. Q]d>~
`u
`.. ·- 0
`N"C..-
`Eo
`-..::
`c.
`0
`
`(I)
`~
`0
`u -
`+~
`• +j~
`Uo-
`.~ 0
`a.c:
`E ·-
`0
`()
`
`~
`
`...J-
`0~
`.... _
`Ul ~col
`oEO
`COQJI.O
`x <U
`w(j)
`
`~~
`
`~
`
`"''
`r-- ,__.
`~ §1 ~
`-1
`....J
`...J 0
`a
`en
`- -
`'
`
`...J
`0
`U)
`
`..1
`
`Blue Coat Systems - Exhibit 1011 Page 7
`
`

`
`Fig. 6
`
`602
`
`I
`
`(
`(
`1~11~1
`Query
`
`Query List
`Quecy_1
`Query_2
`Query_3
`
`QQ.!
`
`OUTPUT(Per~, {Person.per_full_city, Person.per_first_name});
`
`640
`
`ECL List
`Datasets
`- Dataset_1
`- Dataset_2
`- Dataset_3
`-Dataset_ 4
`
`2.Q2
`
`2M
`
`Results
`
`626
`
`628
`
`630
`
`(
`(
`(
`I Send II Syntax II Clear
`
`Lang. References
`Built-in Attributes
`- Attrib_1
`- Attrib_2
`User Attributes
`- Uattrib_1
`- Uattrib_2
`Actions
`- Action_1
`- Action_2
`- Action_3
`
`City
`Dallas
`Seatle
`Memphis
`Albuquerque
`Urbana
`
`First Name
`Doug
`Andrew
`Carla
`Zachary
`Mary
`
`650
`
`I
`
`Q.1Q
`
`1~11~11~1
`I
`\
`\
`
`\
`
`~
`
`632
`
`634
`
`636
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`0\
`0 .....
`N ....
`
`d
`rJl
`
`"'-....1
`
`'"""" 00 u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 8
`
`

`
`Master
`ng
`
`Collator
`1Q6.
`
`.12Q
`
`Fig. 7A
`
`Collator
`~
`
`Collator
`1.Q!
`
`Node
`I1Q
`
`Node
`ill
`
`Node
`ill
`
`Node
`lli.
`
`Node
`ill.
`
`Node
`720
`
`MemUQ
`
`Memm
`
`Mem~
`
`Memm
`
`Mem.u,a
`
`Mem1..1Q
`
`Processing Matrix
`ill
`
`Database
`742
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`-....l
`0 .....
`N ....
`
`d
`rJl
`
`"'--...1
`
`""""' 00 u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 9
`
`

`
`From Query Agent 104
`
`Fig. 78
`
`120
`
`To PM
`
`1\
`
`Collator
`1.!M
`
`Collator
`IQ2
`
`Collator
`~
`
`To PM
`
`/18
`
`Resuits 70~
`
`•
`
`a
`
`•
`
`•
`
`•
`
`II
`
`. . . . . .
`
`. . . . . . . . . . . . . . .
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`rFJ =(cid:173)
`.....
`
`('D
`('D
`
`QO
`
`0 .....
`N ....
`
`d
`rJl
`
`"'--...1
`
`""""' 00 u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 10
`
`

`
`DLL from Query Agent
`
`Execute Portion A of
`DLL at Master
`w
`
`Execute Entry
`Point A of DLL
`
`at Master w.
`
`Execute Entry
`Point B of DLL
`.eQ2.
`
`Fig. 8
`
`Execute Entry
`Point B of DLL
`~
`
`-- ....
`
`/
`
`.aoo
`
`Execute Entry
`Point B of DLL
`~
`
`I
`\
`
`Execute Entry
`Point C of DLL
`a1Q
`
`Execute Entry
`Point C of DLL
`ill
`
`Execute Entry
`Point C of DLL
`lli
`
`Execute Entry
`Point C of DLL
`lli
`
`Execute Entry
`Point C of DLL
`.6.1.6.
`
`Execute Entry
`Point C of DLL
`am
`
`Send Results to
`Global Results
`Proc. Matrix
`~
`
`Send Results to
`Global Results
`Proc. Matrix
`842
`
`Send Results to
`Global Results
`Proc. Matrix
`~
`
`Send Results to
`Global Results
`Proc. Matrix
`~
`
`Send Results to
`Global Results
`Proc. Matrix
`§§
`
`Send Results to
`Global Results
`Proc. Matrix
`§§Q
`
`e •
`
`7J).
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`\0
`0 .....
`N ....
`
`d
`rJl
`
`"'-....1
`""""' 00
`
`U"l = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 11
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 10 of 21
`
`US 7,185,003 B2
`
`<(
`en
`•
`·-
`C)
`LL
`
`~~
`
`:lE
`
`Ql
`'C
`0
`
`~
`;~ ~
`>
`i5
`ll!
`(/)
`
`Cl)
`
`Ql
`'C
`0
`
`~~ ~
`>
`i5
`JJl
`(/)
`
`(/)
`
`Ql
`IJ)
`
`m "ll
`.0"1"
`.l!!m
`m a
`
`Blue Coat Systems - Exhibit 1011 Page 12
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 11 of 21
`
`US 7,185,003 B2
`
`Fig. 98
`
`90
`
`Master
`.9.02
`
`Results
`
`Results
`
`Slave Node
`.9..12
`
`Slave Node
`aM
`
`Slave Node
`.a12
`
`~
`
`Slave Node
`ill
`
`I Disk 9281
`
`Blue Coat Systems - Exhibit 1011 Page 13
`
`

`
`DLL from Query Agent
`
`1000A
`
`Execute DLL
`
`Fig. 10A
`
`----------
`---~
`,........ ----~~ Results from
`General
`,.,,..
`Execute DLL
`----, PurposeQuery
`atSiaveNode
`Processing
`.1Q1Q
`.........
`,....,,.,.
`Matrix
`, ...
`
`, ..
`
`_,
`
`,.. ..
`
`DLL
`
`DLL
`
`Execute DLL
`....
`at Slave Node
`.ti2Q§
`', L'---,(cid:173)
`'
`"',. .... ....
`.. ...,
`
`Execute DLL
`atSiaveNode
`~
`
`... ______ _
`
`Results
`
`Results Results Results
`
`Perform operations
`on results at
`Master Node
`1Q12.
`
`Provide results
`to Work-Unit
`Module
`.tilli
`
`Provide results
`to client
`1Q.1Q
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`....
`N
`0 .....
`N ....
`
`d
`rJl
`
`"'--...1
`""""' 00
`
`"'u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 14
`
`

`
`Fig. 108
`
`Oll from Query Agent
`
`10008
`
`Execute DLL
`at Master
`~
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`DLL
`
`DLL
`
`DLL
`DLL
`__ ...... -~---
`..
`.....
`... --
`,,.. ... ' f Exec~te DLL I
`
`at Slave Node
`.1.QQa
`
`Execute DLL
`at Slave Node
`~
`
`Save Results
`to Disk
`.w2
`
`Execute DLL
`at Slave Node
`.ll2Q§
`
`.... ........ -..
`
`,..
`
`Save Results
`to Disk
`.tim
`
`-.. .. .. ~ ..
`
`~~
`------
`
`-- _____ .,. __
`~-'
`J
`-~~~~
`
`Save Results
`to Disk
`102!
`
`Save Results
`to Disk
`1022.
`
`Results from
`General
`Purpose Query
`Processing
`Matrix
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`....
`0 .....
`N ....
`
`(.H
`
`d
`rJl
`
`"'--...1
`""""' 00
`
`"'u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 15
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 14 of 21
`
`US 7,185,003 B2
`
`Fig. 11A
`
`1100A
`
`Sort Data
`at Each Node
`.1.lll2
`
`For Each Node,
`Determine Suggested
`Partition from
`Sorted Data
`..1..liM
`
`Calculate Tentative
`Partition at
`Master Node
`..11D2
`
`Determine Effect of
`Tentative Partition
`at Each Node
`.1.1M
`
`Transfer Data
`Based on Agreed
`Partition
`11.1Z
`
`Merge Sort
`Incoming Data at
`Each Node
`1.1ll
`
`Blue Coat Systems - Exhibit 1011 Page 16
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 15 of 21
`
`US 7,185,003 B2
`
`Fig. 118
`
`Sort Data
`at Each Node
`.11QZ
`
`~
`
`For Each Node,
`Determine Suggested
`Partition from
`Sorted Data
`.1.1JM
`
`~
`
`Calculate Tentative
`Parent Partition at
`Master Node
`11.1§
`
`~
`
`Partition Slave
`Nodes into Subsets
`1.1.1a
`
`~
`
`Calculate
`Sub-Partition for
`Each Subset
`.1.12Q
`
`r4ll
`
`~
`
`Repeat for Each
`Recursive Subset
`.112!
`
`Partition Slave
`Nodes of Subset
`.1122
`
`1-
`
`J
`
`Transfer Data
`Based on Partitions
`1.11Z
`
`t
`
`Merge Sort
`Incoming Data at
`Each Node
`~
`
`Blue Coat Systems - Exhibit 1011 Page 17
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 16 of 21
`
`US 7,185,003 B2
`
`N
`
`~
`•
`C)
`
`·-LL
`
`~~ ON
`z...-
`
`'E
`
`f~
`
`0
`:::c
`
`(..)
`c..
`a::
`...._
`!!
`.
`11)
`Cl
`-----------
`__..
`(..)
`a.
`-
`a::
`--
`
`11)
`11)
`Cl
`
`F
`
`'E
`~'<t,
`<( .......
`E~
`0
`:::c
`
`ON
`~~
`z...-
`
`.....
`Q)
`
`~~
`
`0
`(..)
`
`...
`
`... , , , , , , , ,
`
`C>,
`1::
`... (5
`,
`... c..
`, ,
`
`...
`...
`,
`,
`
`C>
`.5:
`----- '8 ------
`c..
`'
`'
`' '
`'
`'
`'
`'
`'0>
`:E
`(5,
`
`c.. ' ' ' ' ' ' ' ' ' '
`
`'E
`~NI
`<( .......
`E~
`0 :::c
`
`Q) Nl ON
`
`-oo
`z.,......
`
`Blue Coat Systems - Exhibit 1011 Page 18
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 17 of 21
`
`US 7,185,003 B2
`
`Fig. 13A
`
`Controller
`.12.m
`
`Polling
`
`Polling
`
`'-...Results/
`State
`From
`Prev.
`Node
`
`Results/
`State
`
`Results/
`State
`To
`Next
`Node ~
`
`Fig. 138
`
`Controller
`.ru..a
`
`'-... Resultsf
`State
`From
`Prev.
`Node
`
`Results/
`State
`To
`
`Next "'».
`
`Node
`
`Blue Coat Systems - Exhibit 1011 Page 19
`
`

`
`Repository
`112
`
`Naming Services
`112.
`
`Administrative
`Module
`.HQ2
`
`Directive
`
`Query Server
`1QZ
`
`Scheduling
`Services
`ill
`
`~
`
`Results
`
`Query Agent
`1M
`
`Fig. 14
`
`Production Phase
`
`~-----------------------------1
`I
`I
`I
`I
`I
`I
`I
`I
`
`1406
`
`Data Factory Processing
`Matrix
`ill.Q
`ill.i
`114161
`
`ill2
`
`General Purpose Query
`Processing Matrix
`.1£Q
`
`L M~~~rY~~
`
`Index-Based Query
`Processing Matrix
`m
`
`e •
`
`00
`•
`~
`~
`~
`
`~ = ~
`
`""f'j
`('D
`?'
`N
`~-....l
`N
`0
`0
`-....l
`
`('D
`('D
`
`rFJ =(cid:173)
`.....
`....
`0 .....
`N ....
`
`QO
`
`d
`rJl
`
`"'--...1
`""""' 00
`
`"'u. = = w = N
`
`Blue Coat Systems - Exhibit 1011 Page 20
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 19 of 21
`
`US 7,185,003 B2
`
`Fig. 15
`
`Receive Data
`~
`
`Load Data to
`Staging Zone
`~
`
`Distribute Data
`to Data Factory
`~
`
`Generate DLL(s) i.&....__
`~ .....-------
`
`Procel>s Data to
`Generate
`lntennediate File(s)
`.w..a
`
`Quality
`Assurance
`~
`
`Combine lntennediate
`File(s) into Master File ~
`.lli2
`
`Build Index File
`.rna
`
`Error Check
`~
`
`Distribute Master File
`and Index File to Disk
`152Q
`
`Distribute Master File
`and/or Index File to GP
`Query PM Nodes
`1522.
`
`Distribute Master File
`and/or Index File to
`Index Query PM Nodes
`1.5Zi
`
`Blue Coat Systems - Exhibit 1011 Page 21
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 20 of 21
`
`US 7,185,003 B2
`
`I
`
`•
`•
`•
`
`"<t
`0
`CD
`
`....
`
`~
`
`~ 0 .....
`
`(()
`......
`
`•
`•
`•
`
`N .....
`(() ...
`
`•
`•
`•
`
`•
`
`C) ·-u.
`
`(
`
`<0
`......
`
`~-~
`i~ ~~
`
`-
`
`Blue Coat Systems - Exhibit 1011 Page 22
`
`

`
`U.S. Patent
`
`Feb.27,2007
`
`Sheet 21 of 21
`
`US 7,185,003 B2
`
`Fig. 17
`
`Build
`System
`Architecture
`1702
`
`~,
`
`Optimize System
`Architecture
`1704
`
`,,
`
`Distribute
`Software to
`Components
`11.Q§.
`
`,
`
`Distribute
`Configuration
`Files
`.1ZOO.
`
`~,
`
`Activate System
`and Test
`1l.1Q.
`
`Blue Coat Systems - Exhibit 1011 Page 23
`
`

`
`US 7,185,003 B2
`
`1
`QUERY SCHEDULING IN A
`PARALLEL-PROCESSING DATABASE
`SYSTEM
`
`FIELD OF THE INVENTION
`
`The present invention relates generally to database man(cid:173)
`agement and more particularly to parallel processing of
`database queries in a parallel processing system.
`
`BACKGROUND OF THE INVENTION
`
`The rapid increase in the amount of data generated by
`companies, agencies, and other organizations has taxed the
`capabilities of current relational database management sys(cid:173)
`tems (RDMSs ). To illustrate, some organizations have
`access to databases having hundreds of millions, and even
`billions, of records available through a RDMS. In such
`RDMSs, certain database operations (e.g., database joins,
`complex searches, extract-transform-load (ETL) operations, 20
`etc.) can take minutes, hours, and even days to process using
`current techniques. This processing lag often prevents
`access to the data in a timely manner, thereby inhibiting the
`client in its use of the requested information.
`In response to the increasing lag time resulting from
`increased database sizes, software manufacturers and data
`mining/storage companies have strived to create more effi(cid:173)
`cient RDMSs and data query techniques. In particular, a
`number of database management systems have been devel(cid:173)
`oped to implement parallel processing for performing data(cid:173)
`base management and database operations.
`A typical parallel-processing RDMS implementation
`includes using a symmetric multiprocessing (SMP) system
`for database operations. In general, SMP systems incorpo(cid:173)
`rate a number of processors sharing one or more system
`resources, such as memory or disk storage. The data repre(cid:173)
`senting the database(s) is stored in the memory and/or disk
`storage shared by the processors. Each processor is provided
`a copy of the database operation to be performed and
`executes the database operation on the data in parallel with
`the other processors.
`While SMP systems have the potential to improve the
`efficiency of database operations on large databases by
`removing the processor as the bottleneck, current implemen(cid:173)
`tations have a number of limitations. For one, the shared
`memory/disk storage often becomes the limiting factor as a
`number of processors attempt to access the shared memory/
`disk storage at the same time. Simultaneous memory/disk
`storage accesses in such systems typically result in the
`placement of one or more of the processors in a wait state
`until the memory/disk storage is available. This delay often
`reduces or eliminates the benefit achieved through the
`parallelization of the database operation. Further, the shared
`memory/disk storage can limit the scalability of the SMP
`system, where many such systems are limited to eight 55
`processors or less.
`Another limitation common to SMP database systems is
`the cost of implementation. SMP systems, as a result the
`underlying architecture needed to connect multiple proces(cid:173)
`sors to shared resources, are difficult to develop and manu- 60
`facture, and are, therefore, often prohibitively expensive. In
`many cases, the SMP database systems implement a propri(cid:173)
`etary SMP design, requiring the client of the SMP database
`system to contract with an expensive specialist to repair and
`maintain the system. The development of operating system 65
`software and other software for use in the SMP database
`system is also often complex and expensive to develop.
`
`2
`The performance of parallel processing database systems,
`SMP or otherwise, is often limited by the underlying soft(cid:173)
`ware process used to perform the database operation. In
`general, current parallel-processing database systems imple(cid:173)
`ment one or more interpreted database-enabled program(cid:173)
`ming languages, such as Simple Query Language (SQL),
`Perl, Python and the like. In these systems, the database
`operation is constructed as one or more instructions in the
`interpreted programming language and the set of instruc-
`10 tions are submitted to the SMP system. The SMP system, in
`turn, typically provides one or more of the instructions to
`each of the processors. Each processor implements an inter(cid:173)
`preter to interpret each instruction and generate the corre(cid:173)
`sponding machine-level code. Instruction sets constructed
`15 using an interpreted language typically are transformed into
`a parse tree. The interpreter (executed by the processor) then
`"walks-down" the parse tree and, at each node, instructs the
`processor to execute a predefined library code segment
`associated with the syntax at the node.
`It will be appreciated by those skilled in the art that the use
`of an interpreted language is inherently inefficient from a
`processing standpoint. For one, the step of interpreting and
`then executing a predefined library code segment at run-time
`often requires considerable processing effort and, therefore,
`25 reduces overall efficiency. Secondly, interpreters often use a
`predetermined machine-level code sequence for each
`instruction, thereby limiting the ability to optimize the code
`on an instruction-by-instruction basis. Thirdly, because
`interpreters consider only one node (and its related child
`30 nodes) at a time, interpreters typically are unable to globally
`optimize the database operation by evaluating the instruc(cid:173)
`tions of the database operation as a whole.
`Current techniques for data storage in conventional par(cid:173)
`allel-processing database systems also exhibit a number of
`35 limitations. As noted above, current parallel-processing
`database systems often implement shared storage resources,
`such as memory or disk storage, which result in bottlenecks
`when processors attempt to access the shared storage
`resources simultaneously. To limit the effects of shared
`40 storage, some current parallel-processing systems distribute
`the data of the database to multiple storage devices, which
`then may be associated with one or more processing nodes
`of the database system. These implementations, however,
`often have an inefficient or ineffective mechanism for failure
`45 protection when one or more of the storage devices fail.
`When a failure occurs, the storage device would have to be
`reinitialized and then repopulated with data, delaying the
`completion of the database operation. Additionally, the data
`may be inefficiently distributed among the storage devices,
`50 resulting in data spillover or a lack of proper load-balancing
`among the processing nodes.
`Accordingly, improved systems and techniques for data(cid:173)
`base management and access would be advantageous.
`
`SUMMARY OF THE INVENTION
`
`The present invention mitigates or solves the above(cid:173)
`identified limitations in known solutions, as well as other
`unspecified deficiencies in known solutions. A number of
`advantages associated with the present invention are readily
`evident to those skilled in the art, including economy of
`design and resources, transparent operation, cost savings,
`etc.
`The present invention provides a number of systems and
`methods for efficiently processing database operations on a
`relatively large database. In at least one embodiment, a
`database management system including one or more query
`
`Blue Coat Systems - Exhibit 1011 Page 24
`
`

`
`US 7,185,003 B2
`
`3
`servers, one or more query agents, and a computing matrix
`are used to process one or more queries submitted by a
`client. The computing matrix may comprise one or more of
`a global-results processing matrix, a general-purpose query
`processing matrix, and an index-based query processing
`matrix. Each processing matrix may comprise a plurality of
`interconnected processing nodes, at least a portion of which
`are adapted to process in parallel. In at least one embodi(cid:173)
`ment, each of the processing nodes is a "shared nothing"
`processing node having a separate processor, memory, disc 10
`storage(s), and network interface. Further, in one embodi(cid:173)
`ment, the hardware for each processing node includes
`widely-available general-purpose, single-user microcom(cid:173)
`puter components, such as a personal computer (PC) moth(cid:173)
`erboard, processor, random access memory (RAM), hard 15
`drive, network interface card (NIC), and the like.
`The client preferably provides a set of query-based pro(cid:173)
`gramming instructions representative of the desired query.
`The query server then may be adapted to convert the
`query-based programming instructions to source code in a 20
`high-level progrming language (e.g., C++), which the
`query server may then optimize for more efficient execution.
`The query server then compiles the source code to generate
`one or more executables in machine-level code, such as a
`dynamic link library (DLL) or a fully-linked "program."
`After generating the executable, the query server can
`provide the executable(s) to the query agent. In the event
`that the database operation(s) represented by the executable
`are not relatively processor-intensive, the query agent can be
`adapted to execute the executable(s) itself. Alternatively, or 30
`in addition, the query agent can provide the executable to
`one or more of the processing matrices of the computing
`matrix for processing. Upon receipt of the executable at a
`processing matrix, a subset of the processing nodes of the
`processing matrix execute one or more portions of the 35
`executable in parallel on the portion of the database at each
`processing node. The results of the execution may then be
`returned to the client, stored, or provided to another pro(cid:173)
`cessing matrix for additional processing.
`Also described herein are a system and method for 40
`scheduling database operations to one or more databases in
`a parallel-processing database system in accordance with at
`least one embodiment of the present invention. After a query
`server generates a dynamic-link library (DLL) or other
`executable representative of one or more database opera- 45
`tions to a database, the query server notifies a scheduling
`services module of the generation of the DLL and submits
`the DLL to a query agent. The query agent notifies the
`scheduling services module of its receipt of the DLL. Based
`on any of a variety of considerations, the scheduling services 50
`module schedules a time of execution for the DLL by one or
`more processing matrices that store the database. At the
`scheduled time, the scheduling services module directs the
`query agent to submit the DLL to the indicated processing
`matrices. The scheduling services module also can be 55
`adapted to monitor the execution of previously submitted
`DLLs by one or more processing matrices and adjust the
`scheduled times of execution for subsequent DLLs accord(cid:173)
`ingly.
`In accordance with one embodiment of the present inven- 60
`tion, a system for scheduling database operations on at least
`one database is provided. The system comprises a first
`global-results processing matrix having a plurality of inter(cid:173)
`connected processing nodes operating in parallel and being
`adapted to execute an executable on the at least one data- 65
`base, the executable being representative of a query having
`at least one database operation and a first query agent
`
`4
`operably connected to the first global-results processing
`matrix and being adapted to manage the execution of the
`executable by the first global-results processing matrix. The
`system further comprises a scheduling services module
`operably connected to the query agent and the first global(cid:173)
`results processing matrix and being adapted to schedule a
`time for execution of the executable by the first global(cid:173)
`results processing matrix and direct the first query agent to
`submit the executable to the first global-results processing
`matrix for execution at the scheduled time of execution.
`In a parallel processing based database management sys(cid:173)
`tem, a method is provided for scheduling execution of
`compiled executables representing queries having at least
`one database operation in accordance with at least one
`embodiment of the present invention. The method comprises
`the steps of scheduling a time for execution of an executable
`by a first global-results processing matrix and submitting, at
`the scheduled time, the executable to the first global-results
`processing matrix for execution.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The purpose and advantages of the present invention will
`be apparent to those of ordinary skill in the art from the
`25 following detailed description in conjunction with the
`appended drawings in which like reference characters are
`used to indicate like elements, and in which:
`FIG. 1 is a schematic diagram illustrating an exemplary
`parallel-processing database management system in accor(cid:173)
`dance with at least one embodiment of the present invention.
`FIG. 2 is a schematic diagram illustrating an exemplary
`system for monitoring a work state of the system of FIG. 1
`in accordance with at least one embodiment of the present
`invention.
`FIG. 3 is a fl

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket