`United States Patent 5
`Frenchetal.
`
`— US005794229A
`[11] Patent Number:
`[45] Date of Patent:
`
`|
`5,794,229
`Aug. 11, 1998
`
`[54] DATABASE SYSTEM WITH
`METHODOLOGYFOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`{75]
`
`Inventors: Clark French. Pepperell; Peter White.
`Andover, both of Mass.
`
`[73] Assignee: Sybase, Inc., Emeryville, Calif.
`
`[21] Appl. No.: 570,183
`:
`.
`11,
`[22] Filed:
`Dec. 11, 1995
`Related U.S. Application Data
`[63] Continuation-in-part of Ser. No. 48,637, Apr. 16, 1993,
`abandoned.
`GO6F 17/30
`Int. CLS cooscccccccsscsssssrsssesesesees
`[51]
`[52] US. CL...707/2:7107/3: TOT/10:
`:
`.
`.
`.
`707/205; 711/100; 711/171; 705/35
`[58] Field of Search sneneteesaaeeecenssnscasecensaenseer 395/601-612.
`395/603; 707/2. 3, 10, 205; 705/35; 711/100.
`71
`
`[56]
`
`.
`References Cited
`U.S. PATENT DOCUMENTS
`8/1986 Waisman et al. 0...sees 395/603
`4,606,002
`6/1987 Ferguson........
`wores 395/600
`4,677,550
`4,776,026 LO/1988 Ueyarma on
`eenccceeeceessesseeeeses 382/46
`5,153,590
`LOSL992 Clark wccesssscscsescceseeccesssensessnane 341/51
`5,293,616
`3/1994 Pint «0.asssessssessssssonsssneersessesvees 395/601
`w+. 395/603
`5,377,348
`12/1994 Lan etal....
`
`...
`- 395/601
`3,404,510
`4/1995 Smith et al.
`2/1996 Antoshenkov .
`395/601
`5,495,608
`.....cscscssscsseoseorene 395/603
`5,649,181
`7/1997 Frenchet ab.
`OTHER PUBLICATIONS
`
`
`
`“A Transaction—Based Approach to Vertical Partitioning for
`Relational Database Systems”, Chu et al.. IEEE, v19, n8,
`IEEE Transaction of Software Engineering, Aug. 1993.
`“RDBMSMaturity”, Philip A. Naecker, DEC Professional,
`v10, 012, p. 44(6) [Available—Online; Dialog File 275], Nov.
`1991.
`
`“Ingres Table Sructures”, David Snellen, DBMS, v5. n8. p.
`60(3) |Available: On-Line; Dialog File 275], Jul. 1992.
`Reinartz, K., “Aspects of vertical mode in multiprocessor
`systems, unconventional computation on conventional pro-
`cessors,” Second International Specalist Seminar on the
`Design and Application of Parallel Digital Processors, IEEE,
`1994, pp. 48-54.
`Brodie, M. and Manola, F.. “Database Management: A
`Survey,” 1987, pp. 1-24.
`Hanson-Smith, Ltd.. “Advantage Series System Overview,
`Ver. 2.0.” 1990, pp. 1-132.
`
`Primary Examiner—Thomas G. Black
`Assistant Examiner—Hosain T. Alam
`Attomey, Agent, or Firm—John A. Smart
`57
`ABSTRACT
`57]
`A Client/Server Database System with improved methods
`for performing database queries, particularly DSS-type
`queries,
`is described. The system includes one or more
`Clients (e.g.. Terminals or PCs) connected via a Network to
`a Server. In general operation, Clients store data in and
`retrieve data from one or more database tables resident on
`the Server by submitting SQL commands. some of which
`specify “queries”—criteria for selecting particular records
`of a table. The system implements methods for storing data
`vertically (i.e., by column), instead of horizontally (i.e. by
`row) as is traditionally done. Each column comprises a
`plurality of “cells” (i.e.. column value for a record), which
`are arranged on a data page in a contiguous fashion. By
`storing data in a column-wise basis, the system can process
`a DSS query by bringing in only those columns of data
`which are of interest. Instead of retrieving row-based data
`pages consisting of information which is largely not of
`interest to a query, column-based pages can be retrieved
`consisting of information which is mostly,if not completely,
`of interest to the query. Theretrieval itself can be done using
`more-efficient large block I/O transfers. The system includes
`data compression which is provided at the level of Cache or
`Buffer Managers, thus providing on-the-fly data compres-
`sion in a manner whichis transparent to each object. Since
`—rverrtical storage of data leads to high repetition on a given
`data page.
`the system provides improved compression/
`decompression.
`
`24 Claims, 11 Drawing Sheets
`
`CUSTOMERSTABLE
`
`(LOGICAL VIEW)
`
`FIELDS:
`
`PMC FLY] 22 WORTH LN,| KENT
`
`
`
`
`
`COLUMN-BASED (VERTICAL)
`
`DATA STORAGE
`
`360
`[PGHOR|1007]1002[1003 [1604] 1008[..._|
`PG
`|ETC}
`BG
`a370
` «0 PAGECHAIN,
`355
`
`
`
`380
`
`Commvault Ex. 1030
`Commvault v. Realtime
`
`US Patent No. 9,054,728
`
`Page1
`
`Page 1
`
`Commvault Ex. 1030
`Commvault v. Realtime
`US Patent No. 9,054,728
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 1 of 11
`
`5,794,229
`
`100
`
`104
`
`KEYBOARD
`
`POINTING
`DEVICE
`
`
`
`SCREEN
`DISPLAY
`
`107
`
`108
`
`MASS
`STORAGE
`
`OUTPUT
`DEVICE
`
`102
`
`103
`
`MAIN
`MEMORY
`
`PROCESSOR
`
`0
`CONTROLLER
`
`101
`
`C
`
`mzsjawp>=
`
`110
`
`FIG. 1A
`
`CACHE
`MEMORY
`
`109
`
`Page 2
`
`Page 2
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 2 of 11
`
`5,794,229
`
`Osh
`
`GSOLL
`
`NOLLWONIdd¥
`
`(S)NVYDOUd
`
`SWadu
`
`LNSIT9
`
`JOVIYSLNI
`
`Y43asn—SMOQNIM
`
`TISHS
`
`GtSid
`
`Yasn
`
`
`
`SrlWALSASONILVYadOOrb
`
`Page 3
`
`Page 3
`
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 3 of 11
`
`5,794,229
`
`L1z
`
`0S2
`
`SS¢
`
`YAAYAS
`
`
`
`YaANASASvavIVa
`
`WALSAS
`
`0v2
`
`0&2
`
`YOLVYSNSS3009
`
`YAZINILdOW]
`NATIdWOON
`
`002
`
`'''tt''tl''''tt‘t'‘'‘''''’''''
`
`111I1!1'{‘t141'''tt44'ii11!'!''
`
`MYOMLIN
`
`O¢¢
`
`(S)IN3IT9
`
`O1Z
`
`an
`
`ANaNO
`
`(S)LINSIY
`
`QT
`
`
`(SINLS10S
`
`bbg
`
`yO(S)Od
`
`(S)IVNINYSL
`
`Page 4
`
`Page 4
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 4 of 11
`
`5,794,229
`
`‘T]L001|YaH9a]OeLe
`
`
`
`[~Sa7al4SHOW0S)Sanstataztoamstolaszus|awwwLoueno‘sqTald
` Oe
`00€
`fou]STUAISa
`aTIAAwaNA[NIasvAAS1]TINSi[vo0)||W]ye]pi2ze|XL]_STIAg3G|
`
`
`JeaesofywiNmo.aNvfONINMMOdoO:[HSI[2001|[212WSINIOHHI]NOATx|e001|]fee[zzoselwol__NaTIV|'LSNIVOE[aura“0
`
`
`
`
`9dOd fo13]
`
`
`
`
`FTGVLSYUSWOLSNI
`
`(MIATV9ID07)
`
`VAN]NMOLANV
`
`ONINMOT01
`
`NATIV
`‘LSNIVWOE}
`
`YALNAObbb
`
`
`
`IIVYOLSVLIVG
`
`(TWLNOZIHOH)G3SV8-MOw
`
`
`
`NIVHO3DVd
`
`OSE
`
`VeOld
`
`Page 5
`
`Page 5
`
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 5 of 11
`
`5,794,229
`
`00€
`
`
`
`FNvVYOLSVLVG
`
`(WOUYaAG4SVE-NWNTOO
`
`O08
`
`O9€
`
`
`
`FIGVLSYIWOLSND
`
`(MIA1V¥91907)
`
`LN3Y
`
`NMOLANV
`
`ATMASAA
`YALN3Obbb
`
`NSTI
`‘LSNIVWOE
`
`
`
`
`
` ONINMOG01[STal4JUOW05]SametGovtaztamstuptrazuistaava[esno|‘Sanald
`TT_NIVHOFO
`
`
`
`
`
`|"[soot|roor[e001|zoor[root|GHS|
`
`ole
`
`GSE
`
`géOld
`
`ldod[auail
`
`Page 6
`
`Page 6
`
`
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 6 of 11
`
`5,794,229
`
`
`
`LLe
`
`JEOs
`
`FOVYOLSWIV
`
`
`(“Sa7aldSOW0s]|¥aGNa9svtaAVISighAoiaguis}awn|aisno|:sa7ai4
`00€(WOLLYIN)GISVE-NAN109
`
`
`9GE<NIVHO49DVd
`
`b8eL9e
`|]volvolxufvolww]Yah94]
`odrouai|
`
`
`
`
`
`JIGWLSYIWOLSND
`
`
`
`(M3IA1¥91907)
`
`NMOLANY
`ONINMOGOF
`
`N3TW
`‘LSNIVWOE}
`
`ANNA
`YALNGObbb
`
`Page 7
`
`Page 7
`
`
`
`
`
`
`U.S. Patent
`
`Aug.11, 1998
`
`Sheet 7 of 11
`
`5,794,229
`
`OP
`
`YIDIAINILIG-ce
`
`VbOl
`CH9d|
`
`NOILONGIYVLVG
`TVYNLYN
`
`G3SNNNdOYd
`
`||H
`
`[913]|YTLLOLOLLLEL[OLOLOLEbE|LOObOLLL}
`
`OLOLObLLLOO0000-8000000008000000.TTT
`
`
`
`ODL}OLLLLLO00000-8000000.0800LOLLOLFELL0048Q00000000000000D260;0000;
`
`
`LOOLOLLLL:0e1000,0000000090000680,<< —-
`
`
`
`
`
`LLOLObLELL0000000000880000000000;
`
`SLld
`
`
`d73i4gl48nd
`OLOLOLLIEbb
`
`OOLLOLLLbb
`LOOLOLLIbb
`LOVEOLLLLE
`LLOLOLLIbb
`
`|aasno|(MAIA1¥OID07)
`
`Page 8
`
`Page 8
`
`
`
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 8 of 11
`
`5,794,229
`
`aQ
`
`o
`NI
`~~
`
`30¢P
`
`AIMONIGWIT! av“Old
`[NOISS3udNOO(XId3ud)
`
`[NOISSTYdNOD(LMYZ)$Z71
`
`
`[NOISSTYdWODMZ7]
`[NOISS3YdWOOD374]
` |UGHOd)
`
`NOISSTYdWOOVIVG
`[913]||{LLOLOLELL]OLOLOLLELE|LOO}OLLLHL
`
`0b
`
` |UGH9|
`
`Page 9
`
`Page 9
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 9 of 11
`
`5,794,229
`
`01000010/00000060-00000000-86000000:
`
`
`
`
`00110100/000000000008-800000000000:
`
`L100+100;0080.00000000000000000000:
`
`
`
`
`
`
`
`
`
`
`
`NOILONGAYVV
`
`TVENLVN
`
`
`
`YFDILNILIG-c&
`
`
`
`d13l4JOV
`
`(MIA1791907)
`
`ObOld
`
`Page 10
`
`Page 10
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 10 of 11
`
`5,794,229
`
`ALITYNIGUYO
`
`CFONVHNI
`
`NOISSIYdWOO
`
`
`
`
`IRONAML,|XXXXXAAA|“0000000070880:000000000000:
`
`
`
`
`
`ONIMALSNTDXXAAAAAA+==0000-806000000008-00000000:
`aFONGIYNOLLONGsYviva
`
`
`XAAAXXAA=<900'0000660060000000-6000
`
`
`
`
`
`
`AAAAAAAA.‘#0000000.00000900,000070000!
`[SY
`
`
`
`|MAMAXXAA
`
`TWuNLN
`
`
`
`LVOTd119-79
`
`INIAALSNTO
`
`SANTA40
`
`
`
`Q7al4JOWd
`
`
`
`(M3IAT¥D1907)
`
`Page 11
`
`Page 11
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1998
`
`Sheet 11 of 11
`
`5,794,229
`
`PER PAGE
`COMPRESSION
`
`PAGE HEADER:
`BTYPE: BITMAP
`
`o\RLE
`
`ii | | | _
`
`
`PAGE HEADER:
`BTYPE: DATA
`
`511
`
`PAGE HEADER:
`
`-,/ BTYPE: DATA
`
`‘CTYPE: LZW
`
`501
`
`503
`
`Page 12
`
`Page 12
`
`
`
`5,794,229
`
`1
`DATABASE SYSTEM WITH
`METHODOLOGYFOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`The present application is a continuation-in-part appli-
`cation of commonly-ownedapplication Ser. No. 08/048.637,
`filed Apr. 16, 1993, now abandoned,the disclosure of which
`is hereby incorporated by reference.
`
`COPYRIGHT NOTICE
`
`A portion of the disclosure of this patent document
`contains material which is subject to copyright protection.
`The copyright owner has no objection to the facsimile
`reproduction by anyoneof the patent documentorthe patent
`disclosure as it appears in the Patent and Trademark Office
`patent file or records, but otherwise reserves all copyright
`rights whatsoever.
`
`BACKGROUND OF THE INVENTION
`
`The present invention relates generally to information
`processing environments and. more particularly, to process-
`ing of queries against informationstored in a data processing
`system. such as an SQL Relational Database Management
`System (RDBMS).
`Computers are very powerful tools for storing and pro-
`viding access to vast amounts of information. Computer
`databases are a common mechanism for storing information
`on computer systems while providing easy access to users.
`A typical database is an organized collection of related
`information stored. as “records” having “fields” of informa-
`tion. As an example, a database of employees may have a
`record for each employee. Here, each record contains fields
`designating specifics about the employee, such as name,
`home address, salary, and the like.
`Between the actual physical database itself (i... the
`records contained in data pages stored on a storage device)
`and the users of the system, a database management system
`or DBMSis typically provided as a software cushion or
`layer. In essence, the DBMSshields the database user from
`knowing or even caring about underlying hardware-level
`details. Typically, all requests from users for access to the
`data are processed by the DBMS. For example, information
`may be added or removed from data files.
`information
`retrieved from or updated in such files, and so forth, all
`without user knowledge of underlying system implementa-
`tion. In this manner,
`the DBMS provides users with a
`conceptual view of the database that is removed from the
`hardware level. The general construction and operation of a
`database management system is knownin theart. See e.g..
`Date. C., An Introduction to Database Systems, Volume I
`and II. Addison Wesley, 1990; the disclosure of which is
`hereby incorporated by reference.
`DBMSsystems have long since moved from a centralized
`mainframe environment to a de-centralized or distributed
`environment. One or more PC “‘client” systems, for instance.
`may be connected via a network to one or more server-based
`database systems (SQL database server). Commercial
`examples of these “client/server” systems include Power-
`soft™ clients connected to one or more Sybase SQL
`Server™ database servers. Both Powersoft™ and Sybase
`SQLServer™are available from Sybase, Inc. of Emeryville,
`Calif. As the migration to client/server continues, each day
`more and more businesses are run from mission-critical
`systems which store information on server-based SQL data-
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`65
`
`2
`base systems, such as Sybase SQL Server™. As a result,
`increasingly higher demands are being placed on server-
`based SQL database systems to provide enterprise-wide
`decision support—providing timely on-line accesstocritical
`business information (e.g.. through “queries”). Accordingly.
`there is much interest in improving the performance of such
`systems, particularly in the area of database query perfor-
`mance.
`
`the
`Traditionally, database management systems (e.g..
`above-described client/server database systems) have been
`employed for on-line transaction processing (OLTP)—
`posting data (from transactions) to a database table. As part
`of this process ofputting data “in” a database, OLTP systems
`typically process queries which find a single record. or just
`a few records. A typical query in an OLTP system. for
`instance, might be to retrieve data records for a particular
`customer, such as retrieving records in an airline reservation
`system for Account No. 35. Thus, use of OLTP systems for
`retrieving data has been largely limited to moment-to-
`moment operational needs, where the queries are relatively
`simple and the number of rowsretrieved (relative to table
`size) is few. In other words, OLTP systems have been
`optimized for this task of finding a “needle in a haystack”—
`that is, finding one or few records which meet a given query
`condition.
`Morerecently, however, there has been great interest in
`“data warehousing.” For these Decision Support Systems
`(DSS) applications, database systems are designed not so
`much for putting information “in” (i.e..
`transaction
`processing) but more for getting information “out.” The
`general approach to providing DSS has been to take an
`SQL-based, OLTP database engine (e.g., Sybase or Oracle)
`which wasreally architected for the OLTP environment(and
`the model that it represents) and attempt to extend that
`engine to handle DSS applications. As a result of the effort
`to build DSS functionality on top of OLTP database engines.
`the performance of present-day DSS products is generally
`poor. Quite simply, OLTP database engines have beenarchi-
`tected and optimized for transaction processing to such an
`extentthat they are not well-suited for analytical processing.
`Alternatively, some vendors have implemented non-SQL-
`based systems providing DSS capability. These systems
`exist today, in large part, due to the failure of SQL-based
`database systems at processing DSS tasks. Examples of
`these non-SQL or “on-line analytical processing” (OLAP)
`systems include Pilot™ and Comshare™, These systems
`employ a non-relational model, one which generally allows
`the user to view the data in a cube-like form. Typically. data
`records are imported from SQL databases and then stored in
`various proprietary formats. Once transposed from SQL
`format into a particular proprietary format, the information
`is usually maintained on one or more servers for access by
`clients, through a proprietary interface. These systems typi-
`cally employ proprietary clients or front ends as well.
`This approachalsoentails significant disadvantages, how-
`ever. Users prefer to not have to import and/or transpose
`information into various proprietary formats. Instead, user
`would rather simply keep the information in SQL database
`tables. There are at least two reasons for this. First. the
`information is generally “already there.” having been accu-
`mulated by the posting of transactions over a period oftime.
`Second, users are already familiar with working with one of
`the main SQL database providers, such as Sybase or Oracle.
`Even if a separate tool is required to implement DSS, users
`prefer to work with tools which are in the same family as the
`tools which they are already using on a day-to-day basis.
`Expectedly, the non-SQL approach has met with limited
`success.
`
`Page 13
`
`Page 13
`
`
`
`5,794,229
`
`3
`Whatis needed are system and methods with better DSS
`performance. yet within the traditional SQL/relational
`model—a model which users demand. From the perspective
`of users, such a system should appear to be essentially an
`SQL-based relational system. At the same time. however.
`such a system should yield much-improved DSS perfor-
`mance. The present invention fulfills this and other needs.
`
`SUMMARY OF THE INVENTION
`
`Thepresent invention comprises a Client/Server Database
`System with improved methods for performing database
`queries, particularly DSS-type queries. In an exemplary
`embodiment, the system includes one or more Clients (e.g..
`Terminals or PCs) connected via a Network to a Server. The
`Server, operating under a server operating system (e.g..
`UNIX).
`includes a Database Server System.
`In general
`operation, Clients store data in and retrieve data from one or
`more database tables resident on the Server by submitting
`SQL commands. some of which specify “queries”—criteria
`for selecting particular records of a table.
`According to the present invention, data is not stored in a
`row or a horizontal orientation. as is traditionally done, but
`is instead stored vertically (i.e., by column). By storing data
`in a column-wise basis, the system of the present invention
`can process a DSS query by bringing in only those columns
`of data which are of interest. A DSS query typically entails
`looking at 50% or greater of the data and often includes
`GROUP BYclauses(e.g., on State or Gender) and includes
`SUM, COUNT, and AVG clauses (e.g., average on
`Revenue). Processing such a query using traditional storage
`methodology is highly inefficient, as row-based (horizontal)
`data pages bring in from disk all columns of data, whether
`or not particular columns are of interest to the query. Thus
`for DSS queries, storing data on a column basis by cell is far
`more preferable than storing data in the traditional row or
`tecord format.
`Since the vast majority of information in real-world DSS
`applications is typically low cardinality data (e.g.. State field
`has only 50 unique values, the majority of the columnsof a
`typical DSS table will store low cardinality data on each
`vertical data page. As a result, repetition within a particular
`column is quite high, thus leading to far better compression
`of data pages than can be realized than with row-based
`storage. The nature of the data encountered,
`therefore,
`further enhances the advantages of column-wise storage or
`vertical partitioning.
`In a typical DSS query, a database system needs to bring
`in large amounts of information, these queries typically look
`at a large percentage of the database. If the data pages
`actually brought in for the DSS query are populated largely
`or completely with information necessary for the query, then
`VO block size may be increased to a level which is optimal
`for the environment/platform. suchas to a level of 64K data
`pages. More particularly, by storing information in a
`column-based format.
`in accordance with the present
`invention, a high saturation level of information (such as
`required by a DSS query) can be efficiently met. Instead of
`retrieving row-based data pages consisting of information
`which is largely not of interest to a query, column-based
`pages can be retrieved consisting of information whichis
`mostly, if not completely, of interest to the query. The
`retrieval itself can be done using more-efficient large block
`VO transfers.
`In the system of the present invention, from an SQL
`system catalog, the system can determine a chain of columns
`which represent a particular table. Specifically, from the
`
`15
`
`20
`
`4
`chain of columns, the system can represent a logical table to
`the user. Here, the concept of a “table”in the system of the
`present
`invention is purely a catalog logical entry, as
`opposed to a physical entity in which it traditionally exists.
`The columns are “tied together” logically to represent a
`table. From the perspective of the user, however, the vertical
`partitioning is transparent.
`Each column comprisesa plurality of “cells”(.e., column
`value for a record), which are arranged on a data page in a
`contiguous fashion. For fixed-length fields (e.g., two char-
`acter State field), the offset of any particular cell on a page
`can be quickly computed as a modulusofthat cell (adjusted
`for any header information to the page). Between the indi-
`vidual cells, there is no overhead information, in contrast to
`several row-based implementations. Since the cells,
`in
`essence, exist as a solid stream of data, column scans are
`particularly efficient.
`The pages are further optimized for compression by
`storingin the page header a status flag indicating whether the
`data page is a candidate for compression and (optionally)
`what type of compression is best suited for the data on that
`. page. Since this is settable on a page-by-page basis. one
`particular compression technique can be applied on one
`column yetat the same time another compression technique
`applied on another column (or a different part of the same
`column), all within the same (logical) table.
`Data compression is added to the system at the level of the
`Cache or Buffer Managers. It is preferable to isolate com-
`pression here so that each object need not be concerned
`about compressingitself (or even being aware that compres-
`sion is occurring). As a result, compression is transparently
`addedto all data objects managed by Buffer Managers. The
`data pages of an object are compressed whensentout to disk
`and decompressed whenretrieved from disk, yet the object
`itself is unaware of this action.
`
`30
`
`35
`
`Most objects within the system, such as tables, indexes,
`logs. and the like. exist as pages. As these objects are
`streamed to disk, each simply requests its Buffer Manager to
`store the object on disk. The Manager in turn stores the
`object on disk using the best compression methodology
`known to it, for the given object. In this manner, data
`compression is transparent to the data objects above the
`level of the Buffer Manager.
`To address the potential problem of a modified block no
`longer compressing back downto its original size, a “block
`map” is employed in the system of the present invention.
`Within a Buffer Manager, there can be as many instancesof
`a block map as needed. Typically. one exists per object (or
`portion thereof). For instance. one block map can exist per
`B-Tree, or one per portion of a B-Tree (e.g., non-leaf level
`pages). Each block map. in turn, may be thought of as a
`logical-to-physical translator for the object (or subobject)
`which is its focus. In this manner, the system can concentrate
`on bringing in the object (or portion of the object) which is
`needed. Each page number provided to a client serves as an
`index into the corresponding block map for determining the
`actual physical page number. For implementations without
`compression, this translator may be eliminated.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG.1Ais a block diagram of a computer system in which
`the present invention may be embodied.
`FIG. 1B is a block diagram of a software subsystem for
`controlling the operation of the computer system of FIG. 1A.
`FIG. 2 is a block diagram of a client/server system in
`which the present invention is preferably embodied.
`
`45
`
`55
`
`65
`
`Page 14
`
`Page 14
`
`
`
`5,794,229
`
`5
`FIGS. 3A-C are diagrams illustrating column-based data
`storage or “vertical partitioning.” which is employed for
`storing data tables in the system of the present invention.
`FIGS. 4A-Dare diagrams illustrating “natural data reduc-
`tion” compression of data types which would otherwise
`include unusedbits; natural data reduction may be combined
`with other compression methodologies, as shown particu-
`larly in FIG. 4B.
`FIG. 5 is a diagram illustrating per page compression
`methodology of the present invention, which allows differ-
`ent compression methodology to be applied to different data
`pages (of either different page chains or even within the
`same page chain).
`DETAILED DESCRIPTION OF A PREFERRED
`EMBODIMENT
`
`The following description will focus on the presently
`preferred embodiment of the present invention, which is
`operative in a network environment executing client/server
`database applications. The present invention, however, is not
`limited to any particular application or environment. Instead,
`those skilled in the art will find that the present invention
`may be advantageously applied to any application or envi-
`ronment where optimization of query performance is
`desirable, including non-SQL database management sys-
`tems and the like. The description of the exemplary embodi-
`ments which follows is, therefore, for the purpose of illus-
`tration and not limitation.
`Standalone System Hardware
`The invention may be embodied on a computer system
`such as the system 100 of FIG. 1A, which comprises a
`central processor 101, a main memory 102, an input/output
`controller 103, a keyboard 104, a pointing device 105 (e.g..
`mouse, track ball, pen device, or the like), a screen display
`device 106, and a mass storage 107 (e.g., hard or fixed disk,
`removable disk, optical disk. magneto-optical disk, or flash
`memory). Processor 101 includes or is coupled to a cache
`memory 109 for storing frequently accessed information;
`memory 109 may be an on-chip cache or external cache (as
`shown). Additional output device(s) 108, such as a printing
`device, may be included in the system 100 as desired. As
`shown, the various components of the system 100 commu-
`nicate through a system bus 110 or similar architecture. In a
`preferred embodiment, the system 10@ includes an IBM-
`compatible personal computer system, available from a
`variety of vendors (including IBM of Armonk, N.Y.).
`Standalone System Software
`Iilustrated in FIG. 1B. a computer software system 150 is
`provided for directing the operation of the computer system
`100. Software system 150, which is stored in system
`memory 102 and on mass storage or disk memory 107,
`includes a kernel or operating system (OS) 140 and a
`windowsshell 145. One or more application programs, such
`as application software programs 155, may be “loaded” (Le.,
`transferred from storage 107 into memory 102) for execu-
`tion by the system 100. The system also includes a user
`interface 160 for receiving user commands and data as input
`and displaying result data as output.
`Also shown, the software system 150 includes a Rela-
`tional Database Management System (RDBMS) front-end
`or “client” 170. The RDBMSclient 170 may be any one of
`a number of database front-ends. including PowerBuilder™,
`dBASE®, Paradox®, Microsoft® Access. or the like. In an
`exemplary embodiment,
`the front-end will include SQL
`access drivers (e.g., Borland SQL Links, Microsoft ODBC
`drivers, Intersolv ODBC drivers, and the like) for accessing
`database tables from an SQL database server operating in a
`Client/Server environment.
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`6
`Client/Server Database Management System
`While the present invention may operate within a single
`(standalone) computer (e.g.. system 100 of FIG. 1A). the
`present invention is preferably embodied in a multi-user
`computer system, such as a Client/Server system. FIG. 2
`illustrates the general structure of a Client/Server Database
`System 200 suitable for implementing the present invention.
`As shown, the system 200 comprises one or more Client(s)
`210 connected to a Server 230 via a Network 220.
`Specifically, the Client(s) 210 comprise one or more stan-
`dalone Terminals 211 connected to a Database Server Sys-
`tem 240 using a conventional network. In an exemplary
`embodiment. the Terminals 211 may themselves comprise a
`plurality of standalone workstations, dumb terminals, or the
`like, or comprise personal computers (PCs) such as the
`above-described system 160. Typically, such units would
`operate under a client operating system. such as Microsoft
`Windows/MS-DOSfor PC clients.
`The Database Server System 240. which comprises
`Sybase SQL Server™ (available from Sybase, Inc. of
`Emeryville. Calif.) in an exemplary embodiment, generally
`operates as an independent process (i.c.. independently of
`the Clients), running under a server operating system such as
`Microsoft Windows NT (Microsoft Corp. of Redmond,
`Wash.), NetWare (Novell of Provo, UT), UNIX (Novell), or
`OS/2 (IBM). The Network 220 may be any one of a number
`of conventional network systems. including a Local Area
`Network (LAN)or Wide Area Network (WAN), as is known
`in the art (e.g.. using Ethernet. IBM TokenRing. or the like).
`The Network includes functionality for packaging client
`calls in the well-known SQL (Stcuctured Query Language)
`together with any parameter information into a format (of
`one or more packets) suitable for transmission across a cable
`or wire, for delivery to the Database Server System 240.
`Client/server environments, database servers, and net-
`works are well documented in the technical. trade, and
`patent literature. For a discussion of database servers and
`client/server environments generally, and SQL Server™
`particularly, see, e.g., Nath, A.. The Guide to SQL Server,
`Second Edition, Addison-Wesley Publishing Company.
`1995. Additional documentation of SQL Server™ is avail-
`able from Sybase. Inc. as SQL Server Documentation Set
`(Catalog No. 49600). For a discussion of a computer net-
`work employing Microsoft Networks/OpenNet File Sharing
`Protocol, see METHOD AND SYSTEM FOR OPPORTUNISTIC
`LOCKING IN A NETWORKED COMPUTER SYSTEM,Intl. Appli-
`cation No. PCT/US90/04570, Intl. Publication No. WO
`91/03024, Indl. Publication Date Mar. 7, 1991. For a general
`introduction to a Local Area Network operating under
`NetWare, see Freed, L. et al., PC Magazine Guide to Using
`NetWare, Ziff-Davis Press, 1991. A more detailed discussion
`is available in NetWare 3.x and 4.x and accompanying
`documentation, which is available from Novell of Provo,
`UT. The disclosures of each of the foregoing are hereby
`incorporated by reference.
`In operation, the Client(s) 21 store data in or retrieve
`data from one or more database tables 250, shown in FIG.
`2. Typically resident on the Server 230, each table itself
`comprises one or more horizontal rows or “records” (tuples)
`together with vertical columnsor “fields.” A database record
`includes information which is most conveniently repre-
`sented as a single unit. A record for an employee, for
`example, may include information about the employee’s ID
`Number, Last NameandFirst Initial, Position, Date Hired,
`Social Security Number, and Salary. Thus. a typical record
`includes several categories of information about an indi-
`vidual person, place. or thing. Each of these categories, in
`
`Page 15
`
`Page 15
`
`
`
`5.794.229
`
`7
`turn, represents a database field. In the foregoing employce
`table, for example. Position is one field, Date Hired is
`another, and so on. With this format. tables are easy for users
`to understand and use. Moreover, the flexibility of tables
`permits a user to define relationships between various items
`of data, as needed.
`In operation, the Client(s) issue one or more SQL com-
`mands to the Server. SQL commands may specify, for
`instance. a query for retrieving particular data (ie., data
`records meeting the query condition) from the table 250. The
`syntax of SQL (Structured Query Language) is well docu-
`mented; see. e.g., the above-mentioned An Introduction to
`Database Systems. In addition to retrieving the data from
`Database Servertables. the Client(s) also include the ability
`to insert new rowsof data records into the table; Client(s)
`can also modify and/or delete existing records in the table.
`For enhancing the speed in which the Database Server
`stores. retrieves. and presents particular data records, the
`Server maintains one or more database indexes 271 on the
`table, under control of an Index Manager. A database index,
`typically maintained as a B-Tree data structure, allows the
`records of a table to be organized in many different ways,
`depending on a particular user’s needs. An index may be
`constructed as a single disk file storing index key values
`together with unique record numbers. The formeris a data
`quantity composed of one or more fields from a record;the
`values are used to arrange (logically) the database file
`records by somedesired order (index expression). The latter
`are unique pointers or identifiers to the actual storage
`location of each record in the databasefile. Both are referred
`to internally by the system for locating and displaying
`records in a databasefile. Alternatively, instead of storing
`unique record numbers. a “clustered” index may be
`employed. This is an index which stores the data pagesof the
`records themselveson the terminal orleaf-level nodes ofthe
`index.
`In operation. the SQL statements received from the one or
`more Clients 210 (via Network 220) are processed by
`Engine 260 of the Database Server System 2460. The Engine
`260 itself comprises a Parser 261, Nornalizer 263, Compiler
`265, Execution Unit 269, Access Methods 270, and Buffer
`Manager(s) 272. Specifically, the SQL statements are passed
`to the Parser 261 which converts the statements into a query
`ttee—a binary tree data structure which represents the
`components of the query in a format selected for the
`convenience of the system. In this regard, the Parser 261
`employs conventional parsing methodology (e.g., recursive
`descent parsing).
`The query tree is normalized by the Normalizer 263.
`Normalization includes, for example,
`the elimination of
`redundant data. Additionally, the Normalizer 263 performs
`error checking, such as confirming that table names and
`column names which appear in the query are valid (e.g., are
`available and belong together). Finally, the Nornalizer can
`also look up anyreferential integrity constraints which exist
`and add those to the query.
`After normalizat