`French et al.
`
`[19]
`
`[111 Patent Number:
`
`5,794,229
`
`[451 Date of Patent:
`
`Aug. 11, 1998
`
`LlSD05794229A
`
`[54]
`
`DATABASE SYSTEM WITH
`METHODOLOGY FOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`[75]
`
`Inventors: Clark Frt-nth. Pepperell: Peter White.
`Andover. both of Mass.
`
`[73] Assignoe: Syhase, Inc.. Emeryville. Calif.
`
`[21]
`
`[22]
`
`Appl No; 570,183
`
`Filed:
`
`Dec. 11, 1995
`
`Related US. Application Data
`
`[63] Continuation-i:n-part of set. No 43.537, Apr. 16. 1993,
`ahnndotzted.
`
`I51!
`I52!
`
`153]
`
`[S6]
`
`US. Cl.
`
`'T0’h'2; 70'?f3: 7'07/I0:
`707.005; 7111100; 7lU17l; 705335
`Field ofsvelrch ..................................... 395r'60l—6I2.
`393603: 70719. 3. 10. 205: ':’05f35: 711! IOD.
`171
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`811986
`31987
`1011988
`1011991’.
`
`
`
`l?Jl99-I Lon et al.
`411995
`‘
`ZII996
`7!l9'97 Frenchet al.
`
`. .
`
`.
`........................... 395K603
`
`4.606.002
`4,6'l'}'.5S0
`4.'}"I-'6.026
`5.15359]
`5.293.616
`5,311,348
`5,404,310
`5.495.608
`5,649.18]
`
`“Ingres Table Sructurcs“. David Snellen. DBMS. V5. n8. p.
`60(3) |Available: On—Line: Dialog File 2’?3J. Jul. 1992.
`Retnartz. l(.. “Aspects of vertical‘ mode in multiprocessor
`systems, unconventional computation: on contentions! pm-
`cessors." Second International Specalist Seminar on the
`Design and Application of Parallel Digital Processors. IEEE.
`1994. pp. 48-54.
`Brodie. M. and Manola. F.. "Database Management: A
`Survey" 1987. pp. 1-24.
`I-Ianson—Srn.ith. L1d.. “Adwmtage Series System Overview:
`Ver: 2.0." 1990. pp. 1-132.
`
`Prr'mdr_v Elrarrtirter-—TlI0ma5 G. Black
`Assistant .Examirrer—I'iI:I6aJ'J'1 '1'‘. Alma
`Attorney, Agent. or Firm--John A. Smart
`
`[571
`
`ABSl'RAC'T
`
`A Clienllsewer Database System with improved methods
`for performing database queries. particularly DSS-type
`queries. is described. The system includes one or more
`Clients (e.g.. Terminals or PCs) connected via a Network to
`a Server. In general operation. Clients store data in and
`rt.'.'tl'it:ve data from one or more database tables resident on
`the Server by submitting SQL commands. some of which
`specify “queries“—crlteria for selecting partictllar records
`of a table. The system implements methods for storing data
`vertically (i.e.. by colutnujl. instead of horizontally (i.c.. by
`row) as is uaditionally done. Each column comprises a
`plurality of "cells" {i.e.. column value for it record). which
`are arranged on a data page in a contiguous fashion. By
`storing data in a oolnmn-wise basis. the system can process
`a DSS query by Izringing in only those columns of data
`which are of interest. Instead of retrieving row-based data
`pages consisting of information which is largely not of
`interest to a query. column-based pages can be retrieved
`consisting of information which is mostly. if not oornpletely.
`of interest to the query. The retrieval itself can be done using
`rn(re»eflicient large block U0 transfers. The system includes
`data compression which is provided at the level of Cache or
`Bufia Managers. thus providing on-the-fly data compres-
`sion in II manner which is transparent to each object. Since
`vertical storage of data leads to high repetition on a given
`data page.
`the system provides improved compression!
`decompression.
`
`24 Claims, 11 Drawing Sheets
`
`OTHER PUBLICATIONS
`
`"A‘]3ransa:tion—Bascd Approach to Vertical Partitioning fcr
`Relational Database Systems“. Chu ct al.. IEEE. v19. n8.
`IEEE Tntnsaction of Software Engineering. Aug. 19%.
`“RDBMS hvlaturity". Philip A. Naeckcr. DEC Professional.
`vlD,n12. p. 44-(6) [Availablc—Online; Dialog File 2'75 1. Nov.
`1991.
`
`Oracle 1014
`Oracle 1014
`
`cus mess mate
`uonrctt tam
`
`
`
`tCi.u|0hBJlSED HTICAI.
`Dal!!! SVCYKEE
`
`l _
`
`j..._:._ P11-GF-CH-IIJPJ T ._—_.Z L.
`355
`
`
`
`U.S. Patent
`
`Aug. .11, 1993
`
`Sheet 1 of 11
`
`5,794,229
`
`E
`
`104
`
`KEYBOARD
`
`
`
`105
`
`107
`
`103
`
`SCREEN
`DISPLAY
`
`MASS
`STORAGE
`
`OUTPUT
`DEVICE
`
`"32
`
`MAIN
`MEMORY
`
`101
`
`CENTRAL
`PROCESSOR
`
`CACHE
`MEMORY
`
`109
`
`
`
`no
`CONTROLLER
`
`110
`
`FIG. 1A
`
`
`
`P3U
`
`UA
`
`M”
`
`9
`
`2
`
`5
`
`0;23,4
`
`m|mE
`
`wezo_Eo:&<
`
`@:§__oomn_
`
`admzmemo&~m=,z_mm$8Imzoozi
`
`9.7.,mfimm
`
`0mi.
`
`
`
` mmm:m:m5>mozcémmo9:
`
`
`
`
`«MU
`
`.m
`
`9
`
`9
`
`P§
`
`A555HHmm>mm_m
`
`
`..m.mmzmmmxmefimzatzmmo
`wm<m<:5mm2.“WmemumamNam
`
`
`
` I
`4",NGE
`
`
`
`7”,._5MM
`
`
`
`
`
`mamH.M3Eo<z§‘WmmE..$xmaz_mm_~:<§_oz‘mmH.M
`
`~_w.__n__28.
`
`359: mmm8<EuWMmmaI‘w-:2:zo_Sumxm_mmMfimmwmmWB_H_m8“ $_.u_o<z§$n_n_:m‘EW8o:ms_mosemzmo38‘mmmm_‘__H@._<z_.$_m»‘_.MOn_
`
`
`
`
`
`
`
`
`US. Patent
`
`Aug. 11, 1993
`
`Sheet 4 of 11
`
`922.,497’5
`
`Lg
`
`EE
`
`ammSm
`
`Egflnmaagmgaaug
`
`
`
`Ezmo2:295.xEHEEEEzm._._<.522:as
`
`~om~
`
`mm
`
`re.83..
`
`
`
`dom_...__>_Emsmz._mm<m>$._._mzm.»
`
`.52.52of
`
`
`
`mflzmo:._._.29:.x
`
`
`
`
`
`F29..z.__._._.mO..5N>.E05..n_
`
`
`
`mos:mERG8m:g_2o...$_ox8.m<m.§o~_
`
`
`
`Emam$:_Em8
`
`§m::§ooa
`
`_...8::mac...3.52%Egg.353%25.“.
`
`
`
`KRwas,223:24wz__,§oE:2.
`
`323
`
`89.
`
`39
`
`32
`
`En.
`
`Om
`
`a9m:
`
`am2:30Hui
`
`5.GE
`
`
`
`m_:_>§msmzdm<m>m_.._.mzmHaEuflEfluEEmjzmmm
`
`
`
`
`
` an
`
`
`
`
`
`
`
`
`
`
`
`«MU
`
`..lHB4...aD:
`
`Aug. 11, 1998
`
`Sheet 5 of 11
`
`922.,497.,5
`
`com
`
`
`moqO....m.E<Q
`
`aqonmms8m<m.z§3oo
`
`sax
`
`<0m_._.__>>mms_m
`
`.z._Ema;mm>._u_0.2mz._mm<m»m_..:m_zm.._.
`
`3”
`
`T|.1|||I23:0mes.
`
`
`
`mmGE
`
`gm
`
`Em
`
`On.E
`
`~98;,lg823259Exam.
`
`can
`
`miE
`
`
`
`EmamEs_Em8
`
`
`
`Ems.3960.:
`
`
`
`_...8::$023Egfiggn__53mama
`
`zm._._<3<222.5.;z<
`.52:58.9%.0
`
`m._.__>m_m_m
`
`
`
`
`
`mm_:,m_o_._.:.20.5.x
`
`
`
`
`
`
`
`
`6:.HemP«MU
`
`Aug. 11, 1998
`
`Sheet 6 of 11
`
`Sm
`
`
`m3EmEa
`
`3<o...Em>am2m.z=s§
`
`anan
`
`IEEEE
`
`E
`
`onN.83
`
`0n_E
`
`
`
`En:3.
`
`I.|.I.|..Eu_2
`
`an
`
`53
`
`
`
`3<22302,:oz_z;on_252
`
`«.0m.:_>._,%_s_m
`2._mm<m>w_jwzmH
`
`Em:
`
`.z._Emoz.an>._n_0.2.n_
`
`
`
`m._.__>mm_mmmnzmo2:.292_x
`
`SE.
`
`so_.
`
`as_.
`
`zm._._<.52.52on...E5.0SE
`
`
`
`Emsmmmzfimno
`
`
`
`§m.s._§oo.:
`
`
`
`_...mamaE0:3Eggéée58mam:
`
`
`
`
`
`
`
`
`
`U.S. Patent
`
`Aug. 11, 1993
`
`Sheet 7 of 11
`
`922,497.,5
`
`IEEE
`
`8.95
`
`o3.
`
`S‘GI
`
`229::889:888.2.88_
`\8:8::889:888.2888.
`522::_.m.8§8P8888....:_
`:2E:E8888.E:58888"
`
`
`
`ommnpsname
`
`9%.\\\\\x.
`
`8:.2.:S8:2::.
`
`._<wS.$..2
`
`zononammE3
`
`
`
`mmombsb.m.Nm
`
`mam9E8
`
`98.38.063.
`
`
`
`
`
`
`
`9
`
`.4.,ma.mm93%J,
`Ma.
`
`U
`
`P.
`
`8m
`
`.m
`
`m8
`
`.3
`
`
`
`%~zoammmazooEsmfl.3.:
`
`;._o_&.m~.§_ooBE
`
`m:,..o_,£mE:ou2.3mg2..
`
`&zoammmnioo53.
`
` ohm;m;.o.§§_s§..sE§a§ImIEEu—__EEEH§;=:.
`
`5?.
`
`
`
`
`.1.He...l3PSU
`
`Aug. 11, 1998
`
`Sheet 9 of ll
`
`92J,M7.,5
`
`
`
`
`
`E2.88.--8.8.888.8.88_..8..
`
`S52888o888..:588_
`
`2.88.9888.:8888.:5882
`8:2888888....:88888"
`:8:8:.:88888888:a._
`
`8GE
`
`.~<m:.:u2
`
`zocoaommEwe
`
`T
`
`
`
`mmoflz.:88
`
`mm:mow
`
`
`
`Ems89.80.:
`
`
`
`
`
`mt..._<2am<o
`
`omuzqxzm
`
`zo..wmm.m%oo
`
`N5
`
`9
`
` b4.,9.mi
`
`
`
`
`
`25$mxfigm182mm«.33%xx?kc?_LozamzandH%_._A5:,mSoozé
`
`
`temu:amm20:uaamm395m452:2
`S__xi»xx;H-3.3393.9.83§..-_.a.
`
`
`rec,_-..oBBo.o.%E.§o.Be..m._
`
`
`
`U.
`
`4|.mm
`
`asM
`
`
`
`Pmm:.33moS.ozaflmad
`
`
`TllilllEms#960,:
`
`
`
`._.<O.EtméQqmimoE1
`
`
`
`
`
`
`US. Patent
`
`Aug. 11, 1993
`
`Sheet 11 of 11
`
`5,794,229
`
`PER PAGE
`COMPRESSJON
`
`PAGE HEADER:
`BTYPE: BiTMAP
`GTYPE: RLE
`
`\
`
` 521
`JI I} E . CTYPE:LZRW1
`
`511
`
`501
`
`PAGE HEADER:
`BTYPE: DATA
`
`PAGE HEADER:
`
`503
`
`1/
`
`
`
`1
`1
`1
`BUD
`U D DD
`
`FIG. 5
`
`
`
`5394.229
`
`1
`DATABASE SYSTEM WITH
`METHODOLOGY FOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARHTIONING ALL COLUMNS OF THE
`TABLE
`
`The present application is a conlinuau'on—in—part appli-
`cation of comrnortly—owned application Ser. No. 03/043.637.
`filed Apr. 16. 1993. now abandoned. the disclosure of which
`is hereby incorporated by reference.
`
`COPYRIGHT NOTICE
`
`A portion of the disclosure of this patent document
`contains material which is subject to copyright protection.
`The copyright owner has no objection to the facsimile
`reproduction by anyone of the patent document or the patent
`disclosure as it appears in the Patent and Trademark Ofice
`patent ftle or records. but otherwise reserves all copyright
`rights whatsoever.
`
`BACKGROUND OF THE INVENTION
`
`'1‘hc. present invention relates generally to information
`processing environments and. more particularly. to process-
`ing of queries against information stored in a data processing
`system. such as an SQL Relational Database Management
`System (RDBMS).
`Computers are very powerful tools for storing and pro-
`viding access to vast amounts of information. Computer
`databases are a common mechanism for storing information
`on computer systems while providing easy access to users.
`A typical database is an organized collection of related
`information stored as “records" having “fields” of infot.'Ine-
`tion. As an example. a database of employees may have a
`record for each employee. Here. each record contains fields
`designating specifics about the employee. such as name.
`home address. salary. and the like.
`the
`Between the actual physical database itself (i.e..
`records contained in data pages stcred on a storage device)
`and the users of the system. a database management system
`or DBMS is typically provided as a software cushion or
`layer. In essence. the DBMS shields the database user from
`knowing or even caring about underlying hardware-rlevel
`details. Typically. all requests from users for access to the
`data are processed by the DBMS. For example. in.fcrruati.on
`may be added or removed from data files.
`information
`retrieved from or updated in such files. and so forth. all
`without user knowledge of underlying system implementa-
`tion.
`In this manner.
`the DBMS provides users with a
`conceptual view of the database that is removed from the
`hardware level. The general construction and operation of a
`database management system is lmown in the art. See e.g..
`Date. (2.. An Introduction to Database Systems. Volume I
`and II. Addison Wesley. 1990; the disclosure of which is
`hereby incorporated by reference.
`DBMS systems have long since moved from a cennalized
`mainframe environment to a decentralized or distributed
`environment. One or more PC "clicltt" systems. for instance.
`may be connected via a network to one or more server-based
`database systems (SQL database server). Commercial
`examples of these “cliertt!sewer" systems include Power-
`soflT" clients connected to one or more Sybase SQL
`Scrver“‘ database servers. Both Powersoffi" and Sybase
`SQL Server?“ are available from Sybase. Inc. of Etneryville.
`Calif. As the migration to clientrserver continues. each day
`more and more businesses are run from mission-critical
`systems which store information on server-based SQL data-
`
`25
`
`3D
`
`45
`
`55
`
`2
`base systems. such as Sybase SQI. Senrerl“. As a result
`increasingly higher demands are beig placed on server-
`bascd SQL database systems to provide enterprise-wide
`decision support—~providing timely on-line access to critical
`business infonnation (e..g.. through "queries“). Accordingly.
`there is much interest in improving the performance of such
`systems. particularly in the area of database query perfor-
`mance.
`
`the
`'1l'ad.itionally. database management systems {e.g..
`above-described clientlserver database systems) have been
`employed for on—Line
`transaction processing (Ol..TP)—
`posting data (from transactions) to a database table. As part
`ofthis process of putting data “in” a database. OLTP systems
`typically process queries which find a single record. or just
`a few records. A typical quay in an Dl'.'l'P system. for
`instance. might be to retrieve data records for a particular
`customer. such as retrieving records in an airline reservation
`system for Account No. 35. Thus. use of OLTP systems for
`retrieving data has been largely limited to moment-to-
`moment operational needs. where the queries are relatively
`simple and the number of rows rctriaeved (relative to table
`size) is few. In other words. OLTP systems have been
`optimized for this task of finding a “needle in a haystacl-:"—
`that is. finding one or few records which meet a given query
`condition.
`More recently. however. there has been great interest in
`“data warehousing." For these Decision Support Systems
`(DSS) applications. database systems are designed not so
`much for putting information “in” {i.e-.
`transaction
`processing} but more for getting information “out." The
`general approach to providing DSS has been to take an
`SQL-based. OLTP database engine (e.g.. Sybase or Oracle)
`which was really architocted for the OLTP environment {and
`the model that it represents] and attempt to extend that
`engine to handle DSS applications. As a result of the etlort
`to build DSS functionality on top of OLTP database engines.
`the performance of present-day DSS products is generally
`poor. Quite simply. OLTP database engines have been archi-
`tected and optimized for transaction processing to such an
`extent that they are not well-suited for analytical processing.
`Alternatively. some vendors have implemented non-SQL-
`based systems providing DSS capability. These systems
`exist today. in large part. due to the failure of SQI.-based
`database systems at processing DSS tasks. Examples of
`these non-SQL or "on-line analytical processing" (OLAP)
`systems include Pilot” and Comsharel“. These systems
`employ a non-relational model. one which generally allows
`the user to view the data in a cube-lilo: form. Typically. data
`records are imported from SQL databases and then stored. in
`various proprietary formats. Once transposed from SQL
`format into a particular proprietary format. the information
`is usually maintained on one or more servers for access by
`clients. through a proprietary interface. These systems typi-
`cally employ proprietary clients or front ends as well.
`This ttp]:l'OllCl‘l also entails significant disadvantages. how-
`ever. Users prefer to not have to import andlor transpose
`information into various proprietary formats. Instead. user
`would rather simply keep the information in SQL database
`tables. There are at least two reasons for tltis. First.
`the
`information is generally “already there." having been accu-
`mulated by the posting of transactions over a period of time.
`Second. users are already familiar with working with one of
`the main SQL database providas. such as Sybase or Oracle.
`Even if a separate tool is required to implement DSS. users
`prefer to work with tools which are in the same fa.m.ily as the
`tools which they are already using on a day-to-day basis.
`Expectedly. the non-SQI. approach has met with limited
`success.
`
`
`
`5.794
`
`-1-...
`999
`
`3
`What is needed are system and methods with better DSS
`performance. yet within the traditional SQlJre1ationa1
`rrtodel—a model which users demand. From the perspective
`of users. such a system should appear to be essentially an-
`SQL-based relational system. At the same time. however.
`such a system should yield much-irnproved DSS perfor-
`mance. The present invention fulfills this and other needs.
`
`SUMMARY OF THE INVENTION
`
`IE-
`
`The present invention comprises a Clientiserver Database
`System with improved methods for performing database
`queries. particularly DSS-type queries. In an exemplary
`embodiment. the system includes one or more Clients (e.g..
`Terminals or PCs) connected via a Network to a Server. The
`Server. operating under a server operating system (e.g..
`UNIX).
`includes at Database Server System. In general
`operation. Clients store data in and retrieve data from one or
`more database tables resident on the Server by submitting
`SQL commands. some of which specify "queries"—critet'ia
`for selecting particular records of a table.
`According to the present invention. data is not stored in a
`row or a horizontal orientation. as is traditionally done. but
`is instead stored vertically (i.e.. by column). By storing data
`in a column-wise basis. the system of the present invention
`can process a DSS query by bringing in only those columns
`of data which are of interest. A DSS query typically entails
`Ioolcing at 50% or greater of the data and often includes
`GROUP BY clauses (e.g.. on State or Gendtx) and includes
`SUM. COUNT. and AVG clauses (e.g.. average on
`Revenue). Processing such a query using traditional storage
`methodology is highly ineflicient. as row-based (horizontal)
`data pages bring in from disk all columns of data. whether
`or not particular columns are of interest to the query. Thus
`for DSS queries. storing data on a column basis by cell is far
`more preferable than storing data in the traditional row or
`record format.
`
`Since the vast majority of information in real-world DSS
`applications is typically low cardinality data (e.g.. State field
`has only 50 unique values. the majority of the columns of a
`typical DSS table will store low cardinality data on each
`vertical data page. As a result. repetition within a particular
`column is quite high. thus leading to far better compression
`of data pages than can be realized than with row-based
`stcnge. The nature of the data encountered.
`therefore.
`fimher enhances the advantages of column-wise storage or
`vertical partitioning.
`In a typical DSS quay. a database system needs to bring
`in large amounts of information: these queries typically look
`at a large percentage of the database. If the data pages
`actually brought in for the DSS query are populated largely
`or completely with information necessary for the query. then
`H0 block size may be int:-eased to a level which is optimal
`for the envirorunentlplatfcrrn. such as to a level of 64K data
`pages. More particularly. by storing information in a
`colum.n—based format.
`in accordance with the present
`invention. a high saturation level of iuforrnation [sud1 as
`required by a DSS query) can be elficiently met. Instead of
`retrieving row-based data pages consisting of information
`which is largely not of interest to a query. column-based
`pages can be retrieved consisting of infonnation which is
`mostly.
`if not completely. of interest to the query. The
`retrieval itself can be done using more-elficient large block
`HO transfers.
`
`In the system of the present invention. from an SQL
`system catalog. the system can determine a chain of columns
`which represent a particular table. Specifically. from the
`
`45
`
`55
`
`65
`
`4
`chain of columns. the system can represent a logical table to
`the user. Here. the concept of a "table" in the system of the
`present
`invention is purely a catalog logical entry. as
`opposed to a physical entity in which it traditionally exists.
`The columns are “tied together" logically to represent a
`table. From the perspective of the user. however. the vertical
`partitioning is transparent.
`Each column comprises a plurality of “cells” (i.e.. column
`value for a record). which are arranged on a data page in a
`contiguous fashion. For fixed-length fields (e..g.. two char-
`acter Slate field). the ofiset of any particular cell on a page
`can be quickly computed as a modulus of that cell (adjusted
`for any header information to the page). Between the indi-
`vidual cells. there is no overhead information. in contrast to
`sevaal row—based implementations. Since the cells.
`in
`essence. exist as a solid stream. of data. column scans are
`particularly eflicicnt.
`The pages are l"ut1l'tcr optimized for compression by
`storing in the page header‘ 3. status flag indicating whether the
`data page is a candidate for compression and (optionally)
`what type of OOl'lJ.|IC5SiOl1 is best suited for the data on that
`page. Since this is senable on a page-by-page basis. one
`particular compression technique can be applied on one
`column yet at the same time another‘ compression technique
`applied on another’ column (or a diflercnt part of the same
`column}. all within the same (logical) table.
`Data compression is added to the system at the level ofdte
`Cache or Bufier Managers. It is preferable to isolate con1—
`pression here so that each object need not be concerned
`about compressing itself (or even being aware that compres-
`sion is occurring]. As a result. compression is tranwently
`added to all data objects managed by Bufler Managers. The
`data pages of an object are compressed when sent out to disk
`and decompressed when retrieved Erorn disk. yet the object
`itself is unaware of this action.
`
`Most objects within the system. such as tables. indexes.
`logs. and die lilte. exist as pages. As these objects are
`streamed to disk. each simply requests its Buffer Manager to
`store the object on disk. The Manager in tttrn stores the
`object on disk using the best cornpresnon methodology
`known to it. for the given object. In this manner. data
`compression is transparent to the data objects above the
`level of the Buficr Manager.
`To address the potential problem of a modified block no
`longer compressing back down to its original size. a “block
`map" is employed in the system of the present invention.
`Within a Bufier Manager. there can be as many instances of
`a block map as needed. Typically. one exists per object (or
`portion thereof}. For instance. one block map can exist pa’
`B—'ll'ee. or one per portion of a B-Tree (e.g.. non-leaf level
`pages). Each block map. in turn. may be thought of as a
`logical-to-physical translator for the object (or subobject)
`which is its focus. In this manner. the system can concentrate
`on bringing in the object (or portion of the object) which is
`needed. Each page number proviclodto a client serves as an
`index into the couesponding block map for determining the
`actual physical page number. For implementations without
`compression. this translator may be eliminated.
`BRIEF DESCRIPTION OF 'I'l'IE DRAWINGS
`
`FIG. IA is a block diagram of a computer system in which
`the present invention may be embodied.
`FIG. 1B is a block diagram of a software subsystem for
`controlling the operation of the computer system of FIG- 1A.
`FIG. 2 is a block diagram of it clientfservcr system in
`which the present invention is preferably embodied.
`
`
`
`5 .79-‘L329
`
`5
`FIGS. 3A-C are diagrams illustrating column-based data
`storage or “vertical partitioning." which is employed for
`storing data tables in the system of the present invention.
`FIGS. £|A—D are diagrams illustrating "natural data reduc-
`tion” compression of data. types which would otherwise
`include unused bits: natural data reduction may be combined
`with other compression methodologies. as shown particu-
`larly in FIG. 415.
`FIG. 5 is a diagram illustrating per page compression
`rnethodology of die present invention. which allows di.lfer-
`ent compression methodology to be applied to different data
`pages {of either dillerent page chains or even within the
`same page chain].
`DETAJLED DE.SCR.lPTIOI\T OF A PREFERRED
`EMBODIMENT
`
`The following description will focus on the presently
`preferred embodiment of the present invention. which is
`operative in a network environment executing clientfserver
`database applications. The present invention. however. is not
`limited to any particular application or environment. Instead.
`those skilled in the art will find that the present invention
`may be advantageously applied to any application or envi-
`ronment where optimization of query performance is
`desirable. including non-SQL database management sys-
`tems and the lilre. The desuiption of the exemplary embodi-
`ments which follows is. therefore. for the purpose of illus-
`tration and not limitation.
`Standalone System Hardware
`The invention may be embodied on a computer system
`such as the system 100 of FIG. 1A. which comprises a
`central processor 101. a main memory 102. an inputfoutpuit
`controller 103. a keyboard 104. a pointing device 105 (e.g..
`mouse. track ball. pen device. or the like). a screen display
`device 106. and a mass storage 107 (e.g.. hard or listed disk.
`removable disk. optical disk. magneto-optical disk. cr flash
`memory). Processor 101 includes or is coupled to a cache
`memory 109 for storing frequently accessed information;
`memory 109 may be an on-drip cache or external cache (as
`shown). Additional output devicetsl 108. such as a. printing
`device. may be included in the system 100 as desired. As
`shown. the various components of the system 100 commu-
`nicate through a system bus 110 or similar architecture. In a
`1Ieferred embodiment. the system 100 includes an IBM-
`cornpatible personal computer system. available from at
`variety of vendors (including IBM ofArmon|r.. N.Y.).
`Standalone System Software
`Illustrated in FIG. LB. 21 computer software system 150 is
`provided for directing the operation of the computer system
`100. Software system 150. which is stored in system
`memory 102 and on mass storage or disk memory 107.
`includes a kernel or operating system (05) 140 and a
`windows shell 145. One or more application programs. such
`as application software programs 155. may be "loaded" {i.e..
`transferred from storage Ill? into memory 102) for execu-
`tion by the system I00. The system also Includes a user
`interface 160 for receiving user commands and data as input
`and displaying result data as output.
`Also shown. the software system 150 includes a Rela-
`tional Database Man-agernent System (RDBMS) from-end
`or "client" 170. The RDB-MS client 17!] may be any one of
`a numba of database front-ends. including PowerBuilde.r"'“.
`dBA.SE®. Paradox®. lVlic1'osofi® Access. or the like. In an
`exemplary embodiment. the front-end will include SQL
`access drivers {e.g.. Borland SQL Links. Microsoft ODBC
`drivers. Intersolv ODBC drivers. and the like) for accessing
`database tables from an SQL database server operating in a
`Clicntfserver environment.
`
`25
`
`35
`
`55
`
`65
`
`6
`ClienU'Server Database Management System
`While the present invention may operate within a single
`[standalone] computer (e.g.. system 100 of FIG. IA]. the
`present invention is preferably embodied in a multi-user
`computer system. such as :1 Clientfserver system. FIG. 2
`illustrates the general structure of a Clienttserver Database
`System 200 suitable for implementing the present invention.
`As shown. the system 200 comprises one or more Client(sJ
`210 connected to a Server 230 via a Network 220.
`Specifically. the Client(s) 210 comprise one or more stan-
`dalone Terminals 211 connected to a Database Server Sys-
`tem 240 using a conventional network. In an exemplary
`embodiment. the Terminals 211 may themselves comprise a
`plurality of standalone workstations. dumb tennina.ls. or the
`like. or comprise personal computers (PCs) such as the
`above-described system 100. Typically. such units would
`operate under a client operating system. such as Microsoft
`v\FuidowslMS-DOS for PC clients.
`The Database Server System 240. which comprises
`Sybase SQI. Serverfi‘ (available from Sybase. Inc. of
`Erneryville. Calif.) in an exemplary embodiment. generally
`operates as an independent process (is... independently oi
`the Clients). running under a server operating system such as
`Microsoft Windows NT (Microsoft Corp. of Redmond.
`Was]-1.). Netware {Ncvell of Provo. UT). UNIX {Nove.ll). or
`08.32 (IBM). The Network 220 may be any one of a number
`of conventional network systems. including a Local Area
`Network (LAN; or Wide Area Network (WAN,1. as is lcnown
`in the art te.g.. using Ethernet. IBM Token Ring. or the lilre}.
`The Network includes functionality for packaging client
`calls in the well-known SQL {Structured Query Language)
`together with any parameter information into a format (of
`one or more packets} suitable for transmission across at cable
`or wire. for delivery to the Database Server System 240.
`Clientlserver environments. database servers. and net-
`works are well documented in the technical. trade. and
`patent literature. For a discussion of database servers and
`clientlserver environrnents generally. and SQL Server"
`particularly. see. e.g.. Nani. A.. The Guide to SQI. Server.
`Second Edition. Addison-Wesley Publishing Company.
`l995. Additional documentation of SQL Server“! is avail-
`able from Sybase. Inc. as SQL Server Documentation Set
`(Catalog No. 49600). For a discussion of a computer net-
`work employing Microsoft Networltsl0penNet File Sharing
`Protocol. see MEI!-[OD AND SYSTEM FOR O1-Porrrtn~ns11c
`Loctcmo us it NETWORKED COMPUTER S‘rS'I'EJul. IntL Appli-
`cation No. PCl':'US90!ll-I|5'i'0.
`lnll. Publication No. W0
`9l!O3ll24. Intl. Publication Date Mar. 7. I991. For a general
`introduction to a Local Area Network operating under
`Netware. see Freed. L. et al.. PC‘ Magazine Guide to Using
`Ncrlihre. Zifi-Davis Press. l99I. A more detailed discussion
`is available in Neiware 3.x and 4.x and accompanying
`docttrnentation. which is available from Novcll of Provo.
`UT. The disclosures of each of the foregoing are hereby
`incorporated by reference.
`In operation. the Client{s) 210 store data in or retrieve
`data from one or more database tables 250. shown in FIG.
`2. 'l')'picaI!y resident on the Serve: 230. each table itself
`comprises one or more horizontal rows or “records” (triples)
`together with vertical columns or “fit-.lds." Adatabase record
`includes information which is most conveniently repre-
`sented as a single unit. A l't'.t.'.‘.Cl't‘.l for an employee. for
`example. may include information about the employee's ID
`Number. Last Name and First Initial. Position. Date Hired.
`Social Security Number. and Salary. Thus. a typical record
`includes several categories of information about an indi-
`vidual person. plaoe. or thing. Each of these categories. in
`
`
`
`5.794.229
`
`7
`turn. represents a database field. In the foregoing employee
`table. for example. Position is one field. Date Hired is
`another. and so on. With this format. tables are easy for users
`to understand and use. Moreover. the flexibility of tables
`permits a user to define relationships between various items
`of data. as needed
`In operation. the Clientts) issue one or more SQL com-
`mands to the Server. SQI. commands may specify. for
`instance. a query for retrieving particular data (i.e.. data
`records meeting the query condition) from the table 250. The
`syntax of SQL (Structured Query Language) is well docu»
`meutcd; sec. 13.3.. the abovomenticncd An Introduction to
`Database Systems. In addition to retrieving the data from
`Database Server tables. the Clicnt(s) also include the alility
`to insert new rows of data records into the table: Clientts)
`can also modify andlor delete existing records in the table.
`For enhancing the speed in which the Database Server
`stores. retrieves. and presents particular data records. the
`Server maintains one or more database indexes 271 on the
`table. unda control of an Index Manager. Adatabase index.
`typically maintained as a B-Tree data structtlre. allows the
`records of a table to be organized in many dilferent ways.
`depending on a particular user's needs. An index may be
`constructed as a single disk file stering index key values
`together with unique record numbers. The former is a data
`quantity composed of one or more fields from a record: the
`values are used to arrange (logically) the database file
`records by some desired order (index expression)- The latter
`are unique pointers or identifiers to the actual storage
`location ct each record in the database tile. Both tuereferred
`to internally by the system for locating and displaying
`records in a database file. Alternatively. instead of stcring
`unique record numbers. a “clustered” index may be
`employed. This is an index which stores the data pages ofthe
`records themselves on til: terminal or leaf-level nodes of the
`index.
`In operation. the SQL statements received from the one or
`more Clients 2]! (via Network 220) are processed by
`Engine 160 of the Database Saver System 240. The Engine
`260 itself comprises a Parser 261. Nornalizcr 263. Compiler
`265. Execution Unit 269. Access Methods 270. and Bufier
`Manageus) 272.. Specifically. the SQL statements are passed
`to the Pwser 261 which converts the statements into a query
`trce—a binary tree data structnn: which represents the
`components of the query in a format selected for the
`convenience of the system. In this regard. the Parser 261
`employs conventional parsing methodology (e.g.. recursive
`descent parsing).
`The query tree is ncrnsalized by the Normalize: 263.
`Nonualization includes. for example. the elimination of
`redundant data. Additionally. the Normalize: 263 performs
`error checlcing. such as confirming that table names and
`column names which appear in the query are valid (e.g.. are
`available and belong together). Finally. the Normalize: can
`also look up any referential integrity constraints which exist
`and add those to the query.
`After normalization.
`the query tree is passed to the
`Compiler 265. which includes an Optimizer 266 and a Code
`Generator 267. The Optimizer is responsible for optimizing
`the query tree. The Optimizer performs a cost-based analysis
`for formulating a query execution plan. The Optimizer will.
`for instance. select the join order of tables (e.g.. when
`working with more than one table): it will select relevant
`indexes {c.g.. when indexes are available}. The Optimizer.
`thaeforc. performs an analysis of the query and picks the
`best execution plan. which in mm results in particular ones
`of the Access Methods 270 being invoked during query
`execution.
`
`I6
`
`35
`
`45
`
`SD
`
`65
`
`8
`The Code Generator 267. on the other hand. converts the
`quay tree into a set of instructions suitable for satisfying the
`query. These instructions are passed to the Execution Unit
`269. Operating undo" the control of these instructions. the
`Execution Unit 269 generates call