throbber
®
`United States Patent 5
`Frenchetal.
`
`— US005794229A
`[11] Patent Number:
`[45] Date of Patent:
`
`|
`5,794,229
`Aug. 11, 1998
`
`[54] DATABASE SYSTEM WITH
`METHODOLOGYFOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`{75]
`
`Inventors: Clark French. Pepperell; Peter White.
`Andover, both of Mass.
`
`[73] Assignee: Sybase, Inc., Emeryville, Calif.
`
`[21] Appl. No.: 570,183
`:
`.
`11,
`[22] Filed:
`Dec. 11, 1995
`Related U.S. Application Data
`[63] Continuation-in-part of Ser. No. 48,637, Apr. 16, 1993,
`abandoned.
`GO6F 17/30
`Int. CLS cooscccccccsscsssssrsssesesesees
`[51]
`[52] US. CL...707/2:7107/3: TOT/10:
`:
`.
`.
`.
`707/205; 711/100; 711/171; 705/35
`[58] Field of Search sneneteesaaeeecenssnscasecensaenseer 395/601-612.
`395/603; 707/2. 3, 10, 205; 705/35; 711/100.
`71
`
`[56]
`
`.
`References Cited
`U.S. PATENT DOCUMENTS
`8/1986 Waisman et al. 0...sees 395/603
`4,606,002
`6/1987 Ferguson........
`wores 395/600
`4,677,550
`4,776,026 LO/1988 Ueyarma on
`eenccceeeceessesseeeeses 382/46
`5,153,590
`LOSL992 Clark wccesssscscsescceseeccesssensessnane 341/51
`5,293,616
`3/1994 Pint «0.asssessssessssssonsssneersessesvees 395/601
`w+. 395/603
`5,377,348
`12/1994 Lan etal....
`
`...
`- 395/601
`3,404,510
`4/1995 Smith et al.
`2/1996 Antoshenkov .
`395/601
`5,495,608
`.....cscscssscsseoseorene 395/603
`5,649,181
`7/1997 Frenchet ab.
`OTHER PUBLICATIONS
`
`
`
`“A Transaction—Based Approach to Vertical Partitioning for
`Relational Database Systems”, Chu et al.. IEEE, v19, n8,
`IEEE Transaction of Software Engineering, Aug. 1993.
`“RDBMSMaturity”, Philip A. Naecker, DEC Professional,
`v10, 012, p. 44(6) [Available—Online; Dialog File 275], Nov.
`1991.
`
`“Ingres Table Sructures”, David Snellen, DBMS, v5. n8. p.
`60(3) |Available: On-Line; Dialog File 275], Jul. 1992.
`Reinartz, K., “Aspects of vertical mode in multiprocessor
`systems, unconventional computation on conventional pro-
`cessors,” Second International Specalist Seminar on the
`Design and Application of Parallel Digital Processors, IEEE,
`1994, pp. 48-54.
`Brodie, M. and Manola, F.. “Database Management: A
`Survey,” 1987, pp. 1-24.
`Hanson-Smith, Ltd.. “Advantage Series System Overview,
`Ver. 2.0.” 1990, pp. 1-132.
`
`Primary Examiner—Thomas G. Black
`Assistant Examiner—Hosain T. Alam
`Attomey, Agent, or Firm—John A. Smart
`57
`ABSTRACT
`57]
`A Client/Server Database System with improved methods
`for performing database queries, particularly DSS-type
`queries,
`is described. The system includes one or more
`Clients (e.g.. Terminals or PCs) connected via a Network to
`a Server. In general operation, Clients store data in and
`retrieve data from one or more database tables resident on
`the Server by submitting SQL commands. some of which
`specify “queries”—criteria for selecting particular records
`of a table. The system implements methods for storing data
`vertically (i.e., by column), instead of horizontally (i.e. by
`row) as is traditionally done. Each column comprises a
`plurality of “cells” (i.e.. column value for a record), which
`are arranged on a data page in a contiguous fashion. By
`storing data in a column-wise basis, the system can process
`a DSS query by bringing in only those columns of data
`which are of interest. Instead of retrieving row-based data
`pages consisting of information which is largely not of
`interest to a query, column-based pages can be retrieved
`consisting of information which is mostly,if not completely,
`of interest to the query. Theretrieval itself can be done using
`more-efficient large block I/O transfers. The system includes
`data compression which is provided at the level of Cache or
`Buffer Managers, thus providing on-the-fly data compres-
`sion in a manner whichis transparent to each object. Since
`—rverrtical storage of data leads to high repetition on a given
`data page.
`the system provides improved compression/
`decompression.
`
`24 Claims, 11 Drawing Sheets
`
`CUSTOMERSTABLE
`
`(LOGICAL VIEW)
`
`FIELDS:
`
`PMC FLY] 22 WORTH LN,| KENT
`
`
`
`
`
`COLUMN-BASED (VERTICAL)
`
`DATA STORAGE
`
`360
`[PGHOR|1007]1002[1003 [1604] 1008[..._|
`PG
`|ETC}
`BG
`a370
` «0 PAGECHAIN,
`355
`
`
`
`380
`
`Commvault Ex. 1030
`Commvault v. Realtime
`
`US Patent No. 9,054,728
`
`Page1
`
`Page 1
`
`Commvault Ex. 1030
`Commvault v. Realtime
`US Patent No. 9,054,728
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 1 of 11
`
`5,794,229
`
`100
`
`104
`
`KEYBOARD
`
`POINTING
`DEVICE
`
`
`
`SCREEN
`DISPLAY
`
`107
`
`108
`
`MASS
`STORAGE
`
`OUTPUT
`DEVICE
`
`102
`
`103
`
`MAIN
`MEMORY
`
`PROCESSOR
`
`0
`CONTROLLER
`
`101
`
`C
`
`mzsjawp>=
`
`110
`
`FIG. 1A
`
`CACHE
`MEMORY
`
`109
`
`Page 2
`
`Page 2
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 2 of 11
`
`5,794,229
`
`Osh
`
`GSOLL
`
`NOLLWONIdd¥
`
`(S)NVYDOUd
`
`SWadu
`
`LNSIT9
`
`JOVIYSLNI
`
`Y43asn—SMOQNIM
`
`TISHS
`
`GtSid
`
`Yasn
`
`
`
`SrlWALSASONILVYadOOrb
`
`Page 3
`
`Page 3
`
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 3 of 11
`
`5,794,229
`
`L1z
`
`0S2
`
`SS¢
`
`YAAYAS
`
`
`
`YaANASASvavIVa
`
`WALSAS
`
`0v2
`
`0&2
`
`YOLVYSNSS3009
`
`YAZINILdOW]
`NATIdWOON
`
`002
`
`'''tt''tl''''tt‘t'‘'‘''''’''''
`
`111I1!1'{‘t141'''tt44'ii11!'!''
`
`MYOMLIN
`
`O¢¢
`
`(S)IN3IT9
`
`O1Z
`
`an
`
`ANaNO
`
`(S)LINSIY
`
`QT
`
`
`(SINLS10S
`
`bbg
`
`yO(S)Od
`
`(S)IVNINYSL
`
`Page 4
`
`Page 4
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 4 of 11
`
`5,794,229
`
`‘T]L001|YaH9a]OeLe
`
`
`
`[~Sa7al4SHOW0S)Sanstataztoamstolaszus|awwwLoueno‘sqTald
` Oe
`00€
`fou]STUAISa
`aTIAAwaNA[NIasvAAS1]TINSi[vo0)||W]ye]pi2ze|XL]_STIAg3G|
`
`
`JeaesofywiNmo.aNvfONINMMOdoO:[HSI[2001|[212WSINIOHHI]NOATx|e001|]fee[zzoselwol__NaTIV|'LSNIVOE[aura“0
`
`
`
`
`9dOd fo13]
`
`
`
`
`FTGVLSYUSWOLSNI
`
`(MIATV9ID07)
`
`VAN]NMOLANV
`
`ONINMOT01
`
`NATIV
`‘LSNIVWOE}
`
`YALNAObbb
`
`
`
`IIVYOLSVLIVG
`
`(TWLNOZIHOH)G3SV8-MOw
`
`
`
`NIVHO3DVd
`
`OSE
`
`VeOld
`
`Page 5
`
`Page 5
`
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 5 of 11
`
`5,794,229
`
`00€
`
`
`
`FNvVYOLSVLVG
`
`(WOUYaAG4SVE-NWNTOO
`
`O08
`
`O9€
`
`
`
`FIGVLSYIWOLSND
`
`(MIA1V¥91907)
`
`LN3Y
`
`NMOLANV
`
`ATMASAA
`YALN3Obbb
`
`NSTI
`‘LSNIVWOE
`
`
`
`
`
` ONINMOG01[STal4JUOW05]SametGovtaztamstuptrazuistaava[esno|‘Sanald
`TT_NIVHOFO
`
`
`
`
`
`|"[soot|roor[e001|zoor[root|GHS|
`
`ole
`
`GSE
`
`géOld
`
`ldod[auail
`
`Page 6
`
`Page 6
`
`
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 6 of 11
`
`5,794,229
`
`
`
`LLe
`
`JEOs
`
`FOVYOLSWIV
`
`
`(“Sa7aldSOW0s]|¥aGNa9svtaAVISighAoiaguis}awn|aisno|:sa7ai4
`00€(WOLLYIN)GISVE-NAN109
`
`
`9GE<NIVHO49DVd
`
`b8eL9e
`|]volvolxufvolww]Yah94]
`odrouai|
`
`
`
`
`
`JIGWLSYIWOLSND
`
`
`
`(M3IA1¥91907)
`
`NMOLANY
`ONINMOGOF
`
`N3TW
`‘LSNIVWOE}
`
`ANNA
`YALNGObbb
`
`Page 7
`
`Page 7
`
`
`
`
`

`

`U.S. Patent
`
`Aug.11, 1998
`
`Sheet 7 of 11
`
`5,794,229
`
`OP
`
`YIDIAINILIG-ce
`
`VbOl
`CH9d|
`
`NOILONGIYVLVG
`TVYNLYN
`
`G3SNNNdOYd
`
`||H
`
`[913]|YTLLOLOLLLEL[OLOLOLEbE|LOObOLLL}
`
`OLOLObLLLOO0000-8000000008000000.TTT
`
`
`
`ODL}OLLLLLO00000-8000000.0800LOLLOLFELL0048Q00000000000000D260;0000;
`
`
`LOOLOLLLL:0e1000,0000000090000680,<< —-
`
`
`
`
`
`LLOLObLELL0000000000880000000000;
`
`SLld
`
`
`d73i4gl48nd
`OLOLOLLIEbb
`
`OOLLOLLLbb
`LOOLOLLIbb
`LOVEOLLLLE
`LLOLOLLIbb
`
`|aasno|(MAIA1¥OID07)
`
`Page 8
`
`Page 8
`
`
`
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 8 of 11
`
`5,794,229
`
`aQ
`
`o
`NI
`~~
`
`30¢P
`
`AIMONIGWIT! av“Old
`[NOISS3udNOO(XId3ud)
`
`[NOISSTYdNOD(LMYZ)$Z71
`
`
`[NOISSTYdWODMZ7]
`[NOISS3YdWOOD374]
` |UGHOd)
`
`NOISSTYdWOOVIVG
`[913]||{LLOLOLELL]OLOLOLLELE|LOO}OLLLHL
`
`0b
`
` |UGH9|
`
`Page 9
`
`Page 9
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 9 of 11
`
`5,794,229
`
`01000010/00000060-00000000-86000000:
`
`
`
`
`00110100/000000000008-800000000000:
`
`L100+100;0080.00000000000000000000:
`
`
`
`
`
`
`
`
`
`
`
`NOILONGAYVV
`
`TVENLVN
`
`
`
`YFDILNILIG-c&
`
`
`
`d13l4JOV
`
`(MIA1791907)
`
`ObOld
`
`Page 10
`
`Page 10
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 10 of 11
`
`5,794,229
`
`ALITYNIGUYO
`
`CFONVHNI
`
`NOISSIYdWOO
`
`
`
`
`IRONAML,|XXXXXAAA|“0000000070880:000000000000:
`
`
`
`
`
`ONIMALSNTDXXAAAAAA+==0000-806000000008-00000000:
`aFONGIYNOLLONGsYviva
`
`
`XAAAXXAA=<900'0000660060000000-6000
`
`
`
`
`
`
`AAAAAAAA.‘#0000000.00000900,000070000!
`[SY
`
`
`
`|MAMAXXAA
`
`TWuNLN
`
`
`
`LVOTd119-79
`
`INIAALSNTO
`
`SANTA40
`
`
`
`Q7al4JOWd
`
`
`
`(M3IAT¥D1907)
`
`Page 11
`
`Page 11
`
`
`
`

`

`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 11 of 11
`
`5,794,229
`
`PER PAGE
`COMPRESSION
`
`PAGE HEADER:
`BTYPE: BITMAP
`
`o\RLE
`
`ii | | | _
`
`
`PAGE HEADER:
`BTYPE: DATA
`
`511
`
`PAGE HEADER:
`
`-,/ BTYPE: DATA
`
`‘CTYPE: LZW
`
`501
`
`503
`
`Page 12
`
`Page 12
`
`

`

`5,794,229
`
`1
`DATABASE SYSTEM WITH
`METHODOLOGYFOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`The present application is a continuation-in-part appli-
`cation of commonly-ownedapplication Ser. No. 08/048.637,
`filed Apr. 16, 1993, now abandoned,the disclosure of which
`is hereby incorporated by reference.
`
`COPYRIGHT NOTICE
`
`A portion of the disclosure of this patent document
`contains material which is subject to copyright protection.
`The copyright owner has no objection to the facsimile
`reproduction by anyoneof the patent documentorthe patent
`disclosure as it appears in the Patent and Trademark Office
`patent file or records, but otherwise reserves all copyright
`rights whatsoever.
`
`BACKGROUND OF THE INVENTION
`
`The present invention relates generally to information
`processing environments and. more particularly, to process-
`ing of queries against informationstored in a data processing
`system. such as an SQL Relational Database Management
`System (RDBMS).
`Computers are very powerful tools for storing and pro-
`viding access to vast amounts of information. Computer
`databases are a common mechanism for storing information
`on computer systems while providing easy access to users.
`A typical database is an organized collection of related
`information stored. as “records” having “fields” of informa-
`tion. As an example, a database of employees may have a
`record for each employee. Here, each record contains fields
`designating specifics about the employee, such as name,
`home address, salary, and the like.
`Between the actual physical database itself (i... the
`records contained in data pages stored on a storage device)
`and the users of the system, a database management system
`or DBMSis typically provided as a software cushion or
`layer. In essence, the DBMSshields the database user from
`knowing or even caring about underlying hardware-level
`details. Typically, all requests from users for access to the
`data are processed by the DBMS. For example, information
`may be added or removed from data files.
`information
`retrieved from or updated in such files, and so forth, all
`without user knowledge of underlying system implementa-
`tion. In this manner,
`the DBMS provides users with a
`conceptual view of the database that is removed from the
`hardware level. The general construction and operation of a
`database management system is knownin theart. See e.g..
`Date. C., An Introduction to Database Systems, Volume I
`and II. Addison Wesley, 1990; the disclosure of which is
`hereby incorporated by reference.
`DBMSsystems have long since moved from a centralized
`mainframe environment to a de-centralized or distributed
`environment. One or more PC “‘client” systems, for instance.
`may be connected via a network to one or more server-based
`database systems (SQL database server). Commercial
`examples of these “client/server” systems include Power-
`soft™ clients connected to one or more Sybase SQL
`Server™ database servers. Both Powersoft™ and Sybase
`SQLServer™are available from Sybase, Inc. of Emeryville,
`Calif. As the migration to client/server continues, each day
`more and more businesses are run from mission-critical
`systems which store information on server-based SQL data-
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`65
`
`2
`base systems, such as Sybase SQL Server™. As a result,
`increasingly higher demands are being placed on server-
`based SQL database systems to provide enterprise-wide
`decision support—providing timely on-line accesstocritical
`business information (e.g.. through “queries”). Accordingly.
`there is much interest in improving the performance of such
`systems, particularly in the area of database query perfor-
`mance.
`
`the
`Traditionally, database management systems (e.g..
`above-described client/server database systems) have been
`employed for on-line transaction processing (OLTP)—
`posting data (from transactions) to a database table. As part
`of this process ofputting data “in” a database, OLTP systems
`typically process queries which find a single record. or just
`a few records. A typical query in an OLTP system. for
`instance, might be to retrieve data records for a particular
`customer, such as retrieving records in an airline reservation
`system for Account No. 35. Thus, use of OLTP systems for
`retrieving data has been largely limited to moment-to-
`moment operational needs, where the queries are relatively
`simple and the number of rowsretrieved (relative to table
`size) is few. In other words, OLTP systems have been
`optimized for this task of finding a “needle in a haystack”—
`that is, finding one or few records which meet a given query
`condition.
`Morerecently, however, there has been great interest in
`“data warehousing.” For these Decision Support Systems
`(DSS) applications, database systems are designed not so
`much for putting information “in” (i.e..
`transaction
`processing) but more for getting information “out.” The
`general approach to providing DSS has been to take an
`SQL-based, OLTP database engine (e.g., Sybase or Oracle)
`which wasreally architected for the OLTP environment(and
`the model that it represents) and attempt to extend that
`engine to handle DSS applications. As a result of the effort
`to build DSS functionality on top of OLTP database engines.
`the performance of present-day DSS products is generally
`poor. Quite simply, OLTP database engines have beenarchi-
`tected and optimized for transaction processing to such an
`extentthat they are not well-suited for analytical processing.
`Alternatively, some vendors have implemented non-SQL-
`based systems providing DSS capability. These systems
`exist today, in large part, due to the failure of SQL-based
`database systems at processing DSS tasks. Examples of
`these non-SQL or “on-line analytical processing” (OLAP)
`systems include Pilot™ and Comshare™, These systems
`employ a non-relational model, one which generally allows
`the user to view the data in a cube-like form. Typically. data
`records are imported from SQL databases and then stored in
`various proprietary formats. Once transposed from SQL
`format into a particular proprietary format, the information
`is usually maintained on one or more servers for access by
`clients, through a proprietary interface. These systems typi-
`cally employ proprietary clients or front ends as well.
`This approachalsoentails significant disadvantages, how-
`ever. Users prefer to not have to import and/or transpose
`information into various proprietary formats. Instead, user
`would rather simply keep the information in SQL database
`tables. There are at least two reasons for this. First. the
`information is generally “already there.” having been accu-
`mulated by the posting of transactions over a period oftime.
`Second, users are already familiar with working with one of
`the main SQL database providers, such as Sybase or Oracle.
`Even if a separate tool is required to implement DSS, users
`prefer to work with tools which are in the same family as the
`tools which they are already using on a day-to-day basis.
`Expectedly, the non-SQL approach has met with limited
`success.
`
`Page 13
`
`Page 13
`
`

`

`5,794,229
`
`3
`Whatis needed are system and methods with better DSS
`performance. yet within the traditional SQL/relational
`model—a model which users demand. From the perspective
`of users, such a system should appear to be essentially an
`SQL-based relational system. At the same time. however.
`such a system should yield much-improved DSS perfor-
`mance. The present invention fulfills this and other needs.
`
`SUMMARY OF THE INVENTION
`
`Thepresent invention comprises a Client/Server Database
`System with improved methods for performing database
`queries, particularly DSS-type queries. In an exemplary
`embodiment, the system includes one or more Clients (e.g..
`Terminals or PCs) connected via a Network to a Server. The
`Server, operating under a server operating system (e.g..
`UNIX).
`includes a Database Server System.
`In general
`operation, Clients store data in and retrieve data from one or
`more database tables resident on the Server by submitting
`SQL commands. some of which specify “queries”—criteria
`for selecting particular records of a table.
`According to the present invention, data is not stored in a
`row or a horizontal orientation. as is traditionally done, but
`is instead stored vertically (i.e., by column). By storing data
`in a column-wise basis, the system of the present invention
`can process a DSS query by bringing in only those columns
`of data which are of interest. A DSS query typically entails
`looking at 50% or greater of the data and often includes
`GROUP BYclauses(e.g., on State or Gender) and includes
`SUM, COUNT, and AVG clauses (e.g., average on
`Revenue). Processing such a query using traditional storage
`methodology is highly inefficient, as row-based (horizontal)
`data pages bring in from disk all columns of data, whether
`or not particular columns are of interest to the query. Thus
`for DSS queries, storing data on a column basis by cell is far
`more preferable than storing data in the traditional row or
`tecord format.
`Since the vast majority of information in real-world DSS
`applications is typically low cardinality data (e.g.. State field
`has only 50 unique values, the majority of the columnsof a
`typical DSS table will store low cardinality data on each
`vertical data page. As a result, repetition within a particular
`column is quite high, thus leading to far better compression
`of data pages than can be realized than with row-based
`storage. The nature of the data encountered,
`therefore,
`further enhances the advantages of column-wise storage or
`vertical partitioning.
`In a typical DSS query, a database system needs to bring
`in large amounts of information, these queries typically look
`at a large percentage of the database. If the data pages
`actually brought in for the DSS query are populated largely
`or completely with information necessary for the query, then
`VO block size may be increased to a level which is optimal
`for the environment/platform. suchas to a level of 64K data
`pages. More particularly, by storing information in a
`column-based format.
`in accordance with the present
`invention, a high saturation level of information (such as
`required by a DSS query) can be efficiently met. Instead of
`retrieving row-based data pages consisting of information
`which is largely not of interest to a query, column-based
`pages can be retrieved consisting of information whichis
`mostly, if not completely, of interest to the query. The
`retrieval itself can be done using more-efficient large block
`VO transfers.
`In the system of the present invention, from an SQL
`system catalog, the system can determine a chain of columns
`which represent a particular table. Specifically, from the
`
`15
`
`20
`
`4
`chain of columns, the system can represent a logical table to
`the user. Here, the concept of a “table”in the system of the
`present
`invention is purely a catalog logical entry, as
`opposed to a physical entity in which it traditionally exists.
`The columns are “tied together” logically to represent a
`table. From the perspective of the user, however, the vertical
`partitioning is transparent.
`Each column comprisesa plurality of “cells”(.e., column
`value for a record), which are arranged on a data page in a
`contiguous fashion. For fixed-length fields (e.g., two char-
`acter State field), the offset of any particular cell on a page
`can be quickly computed as a modulusofthat cell (adjusted
`for any header information to the page). Between the indi-
`vidual cells, there is no overhead information, in contrast to
`several row-based implementations. Since the cells,
`in
`essence, exist as a solid stream of data, column scans are
`particularly efficient.
`The pages are further optimized for compression by
`storingin the page header a status flag indicating whether the
`data page is a candidate for compression and (optionally)
`what type of compression is best suited for the data on that
`. page. Since this is settable on a page-by-page basis. one
`particular compression technique can be applied on one
`column yetat the same time another compression technique
`applied on another column (or a different part of the same
`column), all within the same (logical) table.
`Data compression is added to the system at the level of the
`Cache or Buffer Managers. It is preferable to isolate com-
`pression here so that each object need not be concerned
`about compressingitself (or even being aware that compres-
`sion is occurring). As a result, compression is transparently
`addedto all data objects managed by Buffer Managers. The
`data pages of an object are compressed whensentout to disk
`and decompressed whenretrieved from disk, yet the object
`itself is unaware of this action.
`
`30
`
`35
`
`Most objects within the system, such as tables, indexes,
`logs. and the like. exist as pages. As these objects are
`streamed to disk, each simply requests its Buffer Manager to
`store the object on disk. The Manager in turn stores the
`object on disk using the best compression methodology
`known to it, for the given object. In this manner, data
`compression is transparent to the data objects above the
`level of the Buffer Manager.
`To address the potential problem of a modified block no
`longer compressing back downto its original size, a “block
`map” is employed in the system of the present invention.
`Within a Buffer Manager, there can be as many instancesof
`a block map as needed. Typically. one exists per object (or
`portion thereof). For instance. one block map can exist per
`B-Tree, or one per portion of a B-Tree (e.g., non-leaf level
`pages). Each block map. in turn, may be thought of as a
`logical-to-physical translator for the object (or subobject)
`which is its focus. In this manner, the system can concentrate
`on bringing in the object (or portion of the object) which is
`needed. Each page number provided to a client serves as an
`index into the corresponding block map for determining the
`actual physical page number. For implementations without
`compression, this translator may be eliminated.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG.1Ais a block diagram of a computer system in which
`the present invention may be embodied.
`FIG. 1B is a block diagram of a software subsystem for
`controlling the operation of the computer system of FIG. 1A.
`FIG. 2 is a block diagram of a client/server system in
`which the present invention is preferably embodied.
`
`45
`
`55
`
`65
`
`Page 14
`
`Page 14
`
`

`

`5,794,229
`
`5
`FIGS. 3A-C are diagrams illustrating column-based data
`storage or “vertical partitioning.” which is employed for
`storing data tables in the system of the present invention.
`FIGS. 4A-Dare diagrams illustrating “natural data reduc-
`tion” compression of data types which would otherwise
`include unusedbits; natural data reduction may be combined
`with other compression methodologies, as shown particu-
`larly in FIG. 4B.
`FIG. 5 is a diagram illustrating per page compression
`methodology of the present invention, which allows differ-
`ent compression methodology to be applied to different data
`pages (of either different page chains or even within the
`same page chain).
`DETAILED DESCRIPTION OF A PREFERRED
`EMBODIMENT
`
`The following description will focus on the presently
`preferred embodiment of the present invention, which is
`operative in a network environment executing client/server
`database applications. The present invention, however, is not
`limited to any particular application or environment. Instead,
`those skilled in the art will find that the present invention
`may be advantageously applied to any application or envi-
`ronment where optimization of query performance is
`desirable, including non-SQL database management sys-
`tems and the like. The description of the exemplary embodi-
`ments which follows is, therefore, for the purpose of illus-
`tration and not limitation.
`Standalone System Hardware
`The invention may be embodied on a computer system
`such as the system 100 of FIG. 1A, which comprises a
`central processor 101, a main memory 102, an input/output
`controller 103, a keyboard 104, a pointing device 105 (e.g..
`mouse, track ball, pen device, or the like), a screen display
`device 106, and a mass storage 107 (e.g., hard or fixed disk,
`removable disk, optical disk. magneto-optical disk, or flash
`memory). Processor 101 includes or is coupled to a cache
`memory 109 for storing frequently accessed information;
`memory 109 may be an on-chip cache or external cache (as
`shown). Additional output device(s) 108, such as a printing
`device, may be included in the system 100 as desired. As
`shown, the various components of the system 100 commu-
`nicate through a system bus 110 or similar architecture. In a
`preferred embodiment, the system 10@ includes an IBM-
`compatible personal computer system, available from a
`variety of vendors (including IBM of Armonk, N.Y.).
`Standalone System Software
`Iilustrated in FIG. 1B. a computer software system 150 is
`provided for directing the operation of the computer system
`100. Software system 150, which is stored in system
`memory 102 and on mass storage or disk memory 107,
`includes a kernel or operating system (OS) 140 and a
`windowsshell 145. One or more application programs, such
`as application software programs 155, may be “loaded” (Le.,
`transferred from storage 107 into memory 102) for execu-
`tion by the system 100. The system also includes a user
`interface 160 for receiving user commands and data as input
`and displaying result data as output.
`Also shown, the software system 150 includes a Rela-
`tional Database Management System (RDBMS) front-end
`or “client” 170. The RDBMSclient 170 may be any one of
`a number of database front-ends. including PowerBuilder™,
`dBASE®, Paradox®, Microsoft® Access. or the like. In an
`exemplary embodiment,
`the front-end will include SQL
`access drivers (e.g., Borland SQL Links, Microsoft ODBC
`drivers, Intersolv ODBC drivers, and the like) for accessing
`database tables from an SQL database server operating in a
`Client/Server environment.
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`6
`Client/Server Database Management System
`While the present invention may operate within a single
`(standalone) computer (e.g.. system 100 of FIG. 1A). the
`present invention is preferably embodied in a multi-user
`computer system, such as a Client/Server system. FIG. 2
`illustrates the general structure of a Client/Server Database
`System 200 suitable for implementing the present invention.
`As shown, the system 200 comprises one or more Client(s)
`210 connected to a Server 230 via a Network 220.
`Specifically, the Client(s) 210 comprise one or more stan-
`dalone Terminals 211 connected to a Database Server Sys-
`tem 240 using a conventional network. In an exemplary
`embodiment. the Terminals 211 may themselves comprise a
`plurality of standalone workstations, dumb terminals, or the
`like, or comprise personal computers (PCs) such as the
`above-described system 160. Typically, such units would
`operate under a client operating system. such as Microsoft
`Windows/MS-DOSfor PC clients.
`The Database Server System 240. which comprises
`Sybase SQL Server™ (available from Sybase, Inc. of
`Emeryville. Calif.) in an exemplary embodiment, generally
`operates as an independent process (i.c.. independently of
`the Clients), running under a server operating system such as
`Microsoft Windows NT (Microsoft Corp. of Redmond,
`Wash.), NetWare (Novell of Provo, UT), UNIX (Novell), or
`OS/2 (IBM). The Network 220 may be any one of a number
`of conventional network systems. including a Local Area
`Network (LAN)or Wide Area Network (WAN), as is known
`in the art (e.g.. using Ethernet. IBM TokenRing. or the like).
`The Network includes functionality for packaging client
`calls in the well-known SQL (Stcuctured Query Language)
`together with any parameter information into a format (of
`one or more packets) suitable for transmission across a cable
`or wire, for delivery to the Database Server System 240.
`Client/server environments, database servers, and net-
`works are well documented in the technical. trade, and
`patent literature. For a discussion of database servers and
`client/server environments generally, and SQL Server™
`particularly, see, e.g., Nath, A.. The Guide to SQL Server,
`Second Edition, Addison-Wesley Publishing Company.
`1995. Additional documentation of SQL Server™ is avail-
`able from Sybase. Inc. as SQL Server Documentation Set
`(Catalog No. 49600). For a discussion of a computer net-
`work employing Microsoft Networks/OpenNet File Sharing
`Protocol, see METHOD AND SYSTEM FOR OPPORTUNISTIC
`LOCKING IN A NETWORKED COMPUTER SYSTEM,Intl. Appli-
`cation No. PCT/US90/04570, Intl. Publication No. WO
`91/03024, Indl. Publication Date Mar. 7, 1991. For a general
`introduction to a Local Area Network operating under
`NetWare, see Freed, L. et al., PC Magazine Guide to Using
`NetWare, Ziff-Davis Press, 1991. A more detailed discussion
`is available in NetWare 3.x and 4.x and accompanying
`documentation, which is available from Novell of Provo,
`UT. The disclosures of each of the foregoing are hereby
`incorporated by reference.
`In operation, the Client(s) 21 store data in or retrieve
`data from one or more database tables 250, shown in FIG.
`2. Typically resident on the Server 230, each table itself
`comprises one or more horizontal rows or “records” (tuples)
`together with vertical columnsor “fields.” A database record
`includes information which is most conveniently repre-
`sented as a single unit. A record for an employee, for
`example, may include information about the employee’s ID
`Number, Last NameandFirst Initial, Position, Date Hired,
`Social Security Number, and Salary. Thus. a typical record
`includes several categories of information about an indi-
`vidual person, place. or thing. Each of these categories, in
`
`Page 15
`
`Page 15
`
`

`

`5.794.229
`
`7
`turn, represents a database field. In the foregoing employce
`table, for example. Position is one field, Date Hired is
`another, and so on. With this format. tables are easy for users
`to understand and use. Moreover, the flexibility of tables
`permits a user to define relationships between various items
`of data, as needed.
`In operation, the Client(s) issue one or more SQL com-
`mands to the Server. SQL commands may specify, for
`instance. a query for retrieving particular data (ie., data
`records meeting the query condition) from the table 250. The
`syntax of SQL (Structured Query Language) is well docu-
`mented; see. e.g., the above-mentioned An Introduction to
`Database Systems. In addition to retrieving the data from
`Database Servertables. the Client(s) also include the ability
`to insert new rowsof data records into the table; Client(s)
`can also modify and/or delete existing records in the table.
`For enhancing the speed in which the Database Server
`stores. retrieves. and presents particular data records, the
`Server maintains one or more database indexes 271 on the
`table, under control of an Index Manager. A database index,
`typically maintained as a B-Tree data structure, allows the
`records of a table to be organized in many different ways,
`depending on a particular user’s needs. An index may be
`constructed as a single disk file storing index key values
`together with unique record numbers. The formeris a data
`quantity composed of one or more fields from a record;the
`values are used to arrange (logically) the database file
`records by somedesired order (index expression). The latter
`are unique pointers or identifiers to the actual storage
`location of each record in the databasefile. Both are referred
`to internally by the system for locating and displaying
`records in a databasefile. Alternatively, instead of storing
`unique record numbers. a “clustered” index may be
`employed. This is an index which stores the data pagesof the
`records themselveson the terminal orleaf-level nodes ofthe
`index.
`In operation. the SQL statements received from the one or
`more Clients 210 (via Network 220) are processed by
`Engine 260 of the Database Server System 2460. The Engine
`260 itself comprises a Parser 261, Nornalizer 263, Compiler
`265, Execution Unit 269, Access Methods 270, and Buffer
`Manager(s) 272. Specifically, the SQL statements are passed
`to the Parser 261 which converts the statements into a query
`ttee—a binary tree data structure which represents the
`components of the query in a format selected for the
`convenience of the system. In this regard, the Parser 261
`employs conventional parsing methodology (e.g., recursive
`descent parsing).
`The query tree is normalized by the Normalizer 263.
`Normalization includes, for example,
`the elimination of
`redundant data. Additionally, the Normalizer 263 performs
`error checking, such as confirming that table names and
`column names which appear in the query are valid (e.g., are
`available and belong together). Finally, the Nornalizer can
`also look up anyreferential integrity constraints which exist
`and add those to the query.
`After normalizat

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket