throbber
United States Patent
`French et al.
`
`[19]
`
`[111 Patent Number:
`
`5,794,229
`
`[451 Date of Patent:
`
`Aug. 11, 1998
`
`LlSD05794229A
`
`[54]
`
`DATABASE SYSTEM WITH
`METHODOLOGY FOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARTITIONING ALL COLUMNS OF THE
`TABLE
`
`[75]
`
`Inventors: Clark Frt-nth. Pepperell: Peter White.
`Andover. both of Mass.
`
`[73] Assignoe: Syhase, Inc.. Emeryville. Calif.
`
`[21]
`
`[22]
`
`Appl No; 570,183
`
`Filed:
`
`Dec. 11, 1995
`
`Related US. Application Data
`
`[63] Continuation-i:n-part of set. No 43.537, Apr. 16. 1993,
`ahnndotzted.
`
`I51!
`I52!
`
`[53]
`
`[S6]
`
`US. Cl.
`
`'T0’h'2; 70'?f3: 7'07/I0:
`707.005; 7111100; 7lUl7l; 705335
`Field ofsvelrch ..................................... 395r'60l—6I2.
`393603: 70719. 3. 10. 205: ':’05f35: 711! IOD.
`171
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`811986
`31987
`1011988
`1011991’.
`
`
`
`l?Jl99-I Lon et al.
`411995
`‘
`ZII996
`7!l9'97 Frenchet al.
`
`. .
`
`.
`........................... 395K603
`
`4.606.002
`4,6'l'}'.5S0
`4.'}"I-'6.026
`5.15359]
`5.293.616
`5,311,348
`5,404,310
`5.495.608
`5,649.18]
`
`“Ingres Table Sructurcs“. David Snellen. DBMS. V5. n8. p.
`60(3) |Available: On—Line: Dialog File 2’?3J. Jul. 1992.
`Retnartz. l(.. “Aspects of vertical‘ mode in multiprocessor
`systems, unconventional computation: on contentions! pm-
`cessors." Second International Specalist Seminar on the
`Design and Application of Parallel Digital Processors. IEEE.
`1994. pp. 48-54.
`Brodie. M. and Manola. F.. "Database Management: A
`Survey" 1987. pp. 1-24.
`I-Ianson—Srn.ith. L1d.. “Adwmtage Series System Overview:
`Ver: 2.0." 1990. pp. 1-132.
`
`Prr'mdr_v Elrarrtirter-—TlI0ma5 G. Black
`Assistant .Examirrer—I'iI:I6aJ'J'1 '1'‘. Alma
`Attorney, Agent. or Firm--John A. Smart
`
`[571
`
`ABSl'RAC'T
`
`A Clienllsewer Database System with improved methods
`for performing database queries. particularly DSS-type
`queries. is described. The system includes one or more
`Clients (e.g.. Terminals or PCs) connected via a Network to
`a Server. In general operation. Clients store data in and
`rt.'.'tl'it:ve data from one or more database tables resident on
`the Server by submitting SQL commands. some of which
`specify “queries“—crlteria for selecting partictllar records
`of a table. The system implements methods for storing data
`vertically (i.e.. by colutnujl. instead of horizontally (i.c.. by
`row) as is uaditionally done. Each column comprises a
`plurality of "cells" {i.e.. column value for it record). which
`are arranged on a data page in a contiguous fashion. By
`storing data in a oolnmn-wise basis. the system can process
`a DSS query by Izringing in only those columns of data
`which are of interest. Instead of retrieving row-based data
`pages consisting of information which is largely not of
`interest to a query. column-based pages can be retrieved
`consisting of information which is mostly. if not oornpletely.
`of interest to the query. The retrieval itself can be done using
`rn(re»eflicient large block U0 transfers. The system includes
`data compression which is provided at the level of Cache or
`Bufia Managers. thus providing on-the-fly data compres-
`sion in II manner which is transparent to each object. Since
`vertical storage of data leads to high repetition on a given
`data page.
`the system provides improved compression!
`decompression.
`
`24 Claims, I) Drawing Sheets
`
`OTHER PUBLICATIONS
`
`"A‘]3ransa:tion—Bascd Approach to Vertical Partitioning fcr
`Relational Database Systems“. Chu ct al.. IEEE. v19. n8.
`IEEE Tntnsaction of Software Engineering. Aug. 19%.
`“RDBMS hvlaturity". Philip A. Naeckcr. DEC Professional.
`vlD,nl2. p. 44-(6) [Availablc—Online; Dialog File 2'75 1. Nov.
`1991.
`
`Oracle 1010
`Oracle 1010
`
`cus mess mate
`uonrctt tam
`
`
`
`tCi.u|0hBJlSED HTICAI.
`Dal!!! SVCYKEE
`
`l _
`
`j..._:._ P11-GF-CH-IIJPJ T ._—_.Z L.
`355
`
`

`
`U.S. Patent
`
`Aug. .11, 1993
`
`Sheet 1 of 11
`
`5,794,229
`
`E
`
`104
`
`KEYBOARD
`
`
`
`105
`
`107
`
`103
`
`SCREEN
`DISPLAY
`
`MASS
`STORAGE
`
`OUTPUT
`DEVICE
`
`"32
`
`MAIN
`MEMORY
`
`101
`
`CENTRAL
`PROCESSOR
`
`CACHE
`MEMORY
`
`109
`
`
`
`no
`CONTROLLER
`
`110
`
`FIG. 1A
`
`

`
`P3U
`
`UA
`
`M”
`
`9
`
`2
`
`5
`
`0;23,4
`
`m|mE
`
`wezo_Eo:&<
`
`@:§__oomn_
`
`admzmemo&~m=,z_mm$8Imzoozi
`
`9.7.,mfimm
`
`0mi.
`
`
`
` mmm:m:m5>mozcémmo9:
`
`
`

`
`«MU
`
`.m
`
`9
`
`9
`
`P§
`
`A555HHmm>mm_m
`
`
`..m.mmzmmmxmefimzatzmmo
`wm<m<:5mm2.“WmemumamNam
`
`
`
` I
`4",NGE
`
`
`
`7”,._5MM
`
`
`
`
`
`mamH.M3Eo<z§‘WmmE..$xmaz_mm_~:<§_oz‘mmH.M
`
`~_w.__n__28.
`
`359: mmm8<EuWMmmaI‘w-:2:zo_Sumxm_mmMfimmwmmWB_H_m8“ $_.u_o<z§$n_n_:m‘EW8o:ms_mosemzmo38‘mmmm_‘__H@._<z_.$_m»‘_.MOn_
`
`
`
`
`
`
`

`
`US. Patent
`
`Aug. 11, 1993
`
`Sheet 4 of 11
`
`922.,497’5
`
`Lg
`
`EE
`
`ammSm
`
`Egflnmaagmgaaug
`
`
`
`Ezmo2:295.xEHEEEEzm._._<.522:as
`
`~om~
`
`mm
`
`re.83..
`
`
`
`dom_...__>_Emsmz._mm<m>$._._mzm.»
`
`.52.52of
`
`
`
`mflzmo:._._.29:.x
`
`
`
`
`
`F29..z.__._._.mO..5N>.E05..n_
`
`
`
`mos:mERG8m:g_2o...$_ox8.m<m.§o~_
`
`
`
`Emam$:_Em8
`
`§m::§ooa
`
`_...8::mac...3.52%Egg.353%25.“.
`
`
`
`KRwas,223:24wz__,§oE:2.
`
`323
`
`89.
`
`39
`
`32
`
`En.
`
`Om
`
`a9m:
`
`am2:30Hui
`
`5.GE
`
`
`
`m_:_>§msmzdm<m>m_.._.mzmHaEuflEfluEEmjzmmm
`
`
`
`
`
` an
`
`
`
`
`
`
`
`
`
`

`
`«MU
`
`..lHB4...aD:
`
`Aug. 11, 1998
`
`Sheet 5 of 11
`
`922.,497.,5
`
`com
`
`
`moqO....m.E<Q
`
`aqonmms8m<m.z§3oo
`
`sax
`
`<0m_._.__>>mms_m
`
`.z._Ema;mm>._u_0.2mz._mm<m»m_..:m_zm.._.
`
`3”
`
`T|.1|||I23:0mes.
`
`
`
`mmGE
`
`gm
`
`Em
`
`On.E
`
`~98;,lg823259Exam.
`
`can
`
`miE
`
`
`
`EmamEs_Em8
`
`
`
`Ems.3960.:
`
`
`
`_...8::$023Egfiggn__53mama
`
`zm._._<3<222.5.;z<
`.52:58.9%.0
`
`m._.__>m_m_m
`
`
`
`
`
`mm_:,m_o_._.:.20.5.x
`
`
`
`
`
`
`

`
`6:.HemP«MU
`
`Aug. 11, 1998
`
`Sheet 6 of 11
`
`Sm
`
`
`m3EmEa
`
`3<o...Em>am2m.z=s§
`
`anan
`
`IEEEE
`
`E
`
`onN.83
`
`0n_E
`
`
`
`En:3.
`
`I.|.I.|..Eu_2
`
`an
`
`53
`
`
`
`3<22302,:oz_z;on_252
`
`«.0m.:_>._,%_s_m
`2._mm<m>w_jwzmH
`
`Em:
`
`.z._Emoz.an>._n_0.2.n_
`
`
`
`m._.__>mm_mmmnzmo2:.292_x
`
`SE.
`
`so_.
`
`as_.
`
`zm._._<.52.52on...E5.0SE
`
`
`
`Emsmmmzfimno
`
`
`
`§m.s._§oo.:
`
`
`
`_...mamaE0:3Eggéée58mam:
`
`
`
`
`
`
`
`

`
`U.S. Patent
`
`Aug. 11, 1993
`
`Sheet 7 of 11
`
`922,497.,5
`
`IEEE
`
`8.95
`
`o3.
`
`S‘GI
`
`229::889:888.2.88_
`\8:8::889:888.2888.
`522::_.m.8§8P8888....:_
`:2E:E8888.E:58888"
`
`
`
`ommnpsname
`
`9%.\\\\\x.
`
`8:.2.:S8:2::.
`
`._<wS.$..2
`
`zononammE3
`
`
`
`mmombsb.m.Nm
`
`mam9E8
`
`98.38.063.
`
`
`
`
`
`

`
`9
`
`.4.,ma.mm93%J,
`Ma.
`
`U
`
`P.
`
`8m
`
`.m
`
`m8
`
`.3
`
`
`
`%~zoammmazooEsmfl.3.:
`
`;._o_&.m~.§_ooBE
`
`m:,..o_,£mE:ou2.3mg2..
`
`&zoammmnioo53.
`
` ohm;m;.o.§§_s§..sE§a§ImIEEu—__EEEH§;=:.
`
`5?.
`
`
`

`
`.1.He...l3PSU
`
`Aug. 11, 1998
`
`Sheet 9 of ll
`
`92J,M7.,5
`
`
`
`
`
`E2.88.--8.8.888.8.88_..8..
`
`S52888o888..:588_
`
`2.88.9888.:8888.:5882
`8:2888888....:88888"
`:8:8:.:88888888:a._
`
`8GE
`
`.~<m:.:u2
`
`zocoaommEwe
`
`T
`
`
`
`mmoflz.:88
`
`mm:mow
`
`
`
`Ems89.80.:
`
`
`
`

`
`mt..._<2am<o
`
`omuzqxzm
`
`zo..wmm.m%oo
`
`N5
`
`9
`
` b4.,9.mi
`
`
`
`
`
`25$mxfigm182mm«.33%xx?kc?_LozamzandH%_._A5:,mSoozé
`
`
`temu:amm20:uaamm395m452:2
`S__xi»xx;H-3.3393.9.83§..-_.a.
`
`
`rec,_-..oBBo.o.%E.§o.Be..m._
`
`
`
`U.
`
`4|.mm
`
`asM
`
`
`
`Pmm:.33moS.ozaflmad
`
`
`TllilllEms#960,:
`
`
`
`._.<O.EtméQqmimoE1
`
`
`
`
`

`
`US. Patent
`
`Aug. 11, 1993
`
`Sheet 11 of 11
`
`5,794,229
`
`PER PAGE
`COMPRESSJON
`
`PAGE HEADER:
`BTYPE: BiTMAP
`GTYPE: RLE
`
`\
`
` 521
`JI I} E . CTYPE:LZRW1
`
`511
`
`501
`
`PAGE HEADER:
`BTYPE: DATA
`
`PAGE HEADER:
`
`503
`
`1/
`
`
`
`1
`1
`1
`BUD
`U D DD
`
`FIG. 5
`
`

`
`5394.229
`
`1
`DATABASE SYSTEM WITH
`METHODOLOGY FOR STORING A
`DATABASE TABLE BY VERTICALLY
`PARHTIONING ALL COLUMNS OF THE
`TABLE
`
`The present application is a conlinuau'on—in—part appli-
`cation of comrnortly—owned application Ser. No. 03/043.637.
`filed Apr. 16. 1993. now abandoned. the disclosure of which
`is hereby incorporated by reference.
`
`COPYRIGHT NOTICE
`
`A portion of the disclosure of this patent document
`contains material which is subject to copyright protection.
`The copyright owner has no objection to the facsimile
`reproduction by anyone of the patent document or the patent
`disclosure as it appears in the Patent and Trademark Ofice
`patent ftle or records. but otherwise reserves all copyright
`rights whatsoever.
`
`BACKGROUND OF THE INVENTION
`
`'1‘hc. present invention relates generally to information
`processing environments and. more particularly. to process-
`ing of queries against information stored in a data processing
`system. such as an SQL Relational Database Management
`System (RDBMS).
`Computers are very powerful tools for storing and pro-
`viding access to vast amounts of information. Computer
`databases are a common mechanism for storing information
`on computer systems while providing easy access to users.
`A typical database is an organized collection of related
`information stored as “records" having “fields” of infot.'Ine-
`tion. As an example. a database of employees may have a
`record for each employee. Here. each record contains fields
`designating specifics about the employee. such as name.
`home address. salary. and the like.
`the
`Between the actual physical database itself (i.e..
`records contained in data pages stcred on a storage device)
`and the users of the system. a database management system
`or DBMS is typically provided as a software cushion or
`layer. In essence. the DBMS shields the database user from
`knowing or even caring about underlying hardware-rlevel
`details. Typically. all requests from users for access to the
`data are processed by the DBMS. For example. in.fcrruati.on
`may be added or removed from data files.
`information
`retrieved from or updated in such files. and so forth. all
`without user knowledge of underlying system implementa-
`tion.
`In this manner.
`the DBMS provides users with a
`conceptual view of the database that is removed from the
`hardware level. The general construction and operation of a
`database management system is lmown in the art. See e.g..
`Date. (2.. An Introduction to Database Systems. Volume I
`and II. Addison Wesley. 1990; the disclosure of which is
`hereby incorporated by reference.
`DBMS systems have long since moved from a cennalized
`mainframe environment to a decentralized or distributed
`environment. One or more PC "clicltt" systems. for instance.
`may be connected via a network to one or more server-based
`database systems (SQL database server). Commercial
`examples of these “cliertt!sewer" systems include Power-
`soflT" clients connected to one or more Sybase SQL
`Scrver“‘ database servers. Both Powersoffi" and Sybase
`SQL Server?“ are available from Sybase. Inc. of Etneryville.
`Calif. As the migration to clientrserver continues. each day
`more and more businesses are run from mission-critical
`systems which store information on server-based SQL data-
`
`25
`
`3D
`
`45
`
`55
`
`2
`base systems. such as Sybase SQI. Senrerl“. As a result
`increasingly higher demands are beig placed on server-
`bascd SQL database systems to provide enterprise-wide
`decision support—~providing timely on-line access to critical
`business infonnation (e..g.. through "queries“). Accordingly.
`there is much interest in improving the performance of such
`systems. particularly in the area of database query perfor-
`mance.
`
`the
`'1l'ad.itionally. database management systems {e.g..
`above-described clientlserver database systems) have been
`employed for on—Line
`transaction processing (Ol..TP)—
`posting data (from transactions) to a database table. As part
`ofthis process of putting data “in” a database. OLTP systems
`typically process queries which find a single record. or just
`a few records. A typical quay in an Dl'.'l'P system. for
`instance. might be to retrieve data records for a particular
`customer. such as retrieving records in an airline reservation
`system for Account No. 35. Thus. use of OLTP systems for
`retrieving data has been largely limited to moment-to-
`moment operational needs. where the queries are relatively
`simple and the number of rows rctriaeved (relative to table
`size) is few. In other words. OLTP systems have been
`optimized for this task of finding a “needle in a haystacl-:"—
`that is. finding one or few records which meet a given query
`condition.
`More recently. however. there has been great interest in
`“data warehousing." For these Decision Support Systems
`(DSS) applications. database systems are designed not so
`much for putting information “in” {i.e-.
`transaction
`processing} but more for getting information “out." The
`general approach to providing DSS has been to take an
`SQL-based. OLTP database engine (e.g.. Sybase or Oracle)
`which was really architocted for the OLTP environment {and
`the model that it represents] and attempt to extend that
`engine to handle DSS applications. As a result of the etlort
`to build DSS functionality on top of OLTP database engines.
`the performance of present-day DSS products is generally
`poor. Quite simply. OLTP database engines have been archi-
`tected and optimized for transaction processing to such an
`extent that they are not well-suited for analytical processing.
`Alternatively. some vendors have implemented non-SQL-
`based systems providing DSS capability. These systems
`exist today. in large part. due to the failure of SQI.-based
`database systems at processing DSS tasks. Examples of
`these non-SQL or "on-line analytical processing" (OLAP)
`systems include Pilot” and Comsharel“. These systems
`employ a non-relational model. one which generally allows
`the user to view the data in a cube-lilo: form. Typically. data
`records are imported from SQL databases and then stored. in
`various proprietary formats. Once transposed from SQL
`format into a particular proprietary format. the information
`is usually maintained on one or more servers for access by
`clients. through a proprietary interface. These systems typi-
`cally employ proprietary clients or front ends as well.
`This ttp]:l'OllCl‘l also entails significant disadvantages. how-
`ever. Users prefer to not have to import andlor transpose
`information into various proprietary formats. Instead. user
`would rather simply keep the information in SQL database
`tables. There are at least two reasons for tltis. First.
`the
`information is generally “already there." having been accu-
`mulated by the posting of transactions over a period of time.
`Second. users are already familiar with working with one of
`the main SQL database providas. such as Sybase or Oracle.
`Even if a separate tool is required to implement DSS. users
`prefer to work with tools which are in the same fa.m.ily as the
`tools which they are already using on a day-to-day basis.
`Expectedly. the non-SQI. approach has met with limited
`success.
`
`

`
`5.794
`
`-1-...
`999
`
`3
`What is needed are system and methods with better DSS
`performance. yet within the traditional SQlJre1ationa1
`rrtodel—a model which users demand. From the perspective
`of users. such a system should appear to be essentially an-
`SQL-based relational system. At the same time. however.
`such a system should yield much-irnproved DSS perfor-
`mance. The present invention fulfills this and other needs.
`
`SUMMARY OF THE INVENTION
`
`IE-
`
`The present invention comprises a Clientiserver Database
`System with improved methods for performing database
`queries. particularly DSS-type queries. In an exemplary
`embodiment. the system includes one or more Clients (e.g..
`Terminals or PCs) connected via a Network to a Server. The
`Server. operating under a server operating system (e.g..
`UNIX).
`includes at Database Server System. In general
`operation. Clients store data in and retrieve data from one or
`more database tables resident on the Server by submitting
`SQL commands. some of which specify "queries"—critet'ia
`for selecting particular records of a table.
`According to the present invention. data is not stored in a
`row or a horizontal orientation. as is traditionally done. but
`is instead stored vertically (i.e.. by column). By storing data
`in a column-wise basis. the system of the present invention
`can process a DSS query by bringing in only those columns
`of data which are of interest. A DSS query typically entails
`Ioolcing at 50% or greater of the data and often includes
`GROUP BY clauses (e.g.. on State or Gendtx) and includes
`SUM. COUNT. and AVG clauses (e.g.. average on
`Revenue). Processing such a query using traditional storage
`methodology is highly ineflicient. as row-based (horizontal)
`data pages bring in from disk all columns of data. whether
`or not particular columns are of interest to the query. Thus
`for DSS queries. storing data on a column basis by cell is far
`more preferable than storing data in the traditional row or
`record format.
`
`Since the vast majority of information in real-world DSS
`applications is typically low cardinality data (e.g.. State field
`has only 50 unique values. the majority of the columns of a
`typical DSS table will store low cardinality data on each
`vertical data page. As a result. repetition within a particular
`column is quite high. thus leading to far better compression
`of data pages than can be realized than with row-based
`stcnge. The nature of the data encountered.
`therefore.
`fimher enhances the advantages of column-wise storage or
`vertical partitioning.
`In a typical DSS quay. a database system needs to bring
`in large amounts of information: these queries typically look
`at a large percentage of the database. If the data pages
`actually brought in for the DSS query are populated largely
`or completely with information necessary for the query. then
`H0 block size may be int:-eased to a level which is optimal
`for the envirorunentlplatfcrrn. such as to a level of 64K data
`pages. More particularly. by storing information in a
`colum.n—based format.
`in accordance with the present
`invention. a high saturation level of iuforrnation [sud1 as
`required by a DSS query) can be elficiently met. Instead of
`retrieving row-based data pages consisting of information
`which is largely not of interest to a query. column-based
`pages can be retrieved consisting of infonnation which is
`mostly.
`if not completely. of interest to the query. The
`retrieval itself can be done using more-elficient large block
`HO transfers.
`
`In the system of the present invention. from an SQL
`system catalog. the system can determine a chain of columns
`which represent a particular table. Specifically. from the
`
`45
`
`55
`
`65
`
`4
`chain of columns. the system can represent a logical table to
`the user. Here. the concept of a "table" in the system of the
`present
`invention is purely a catalog logical entry. as
`opposed to a physical entity in which it traditionally exists.
`The columns are “tied together" logically to represent a
`table. From the perspective of the user. however. the vertical
`partitioning is transparent.
`Each column comprises a plurality of “cells” (i.e.. column
`value for a record). which are arranged on a data page in a
`contiguous fashion. For fixed-length fields (e..g.. two char-
`acter Slate field). the ofiset of any particular cell on a page
`can be quickly computed as a modulus of that cell (adjusted
`for any header information to the page). Between the indi-
`vidual cells. there is no overhead information. in contrast to
`sevaal row—based implementations. Since the cells.
`in
`essence. exist as a solid stream. of data. column scans are
`particularly eflicicnt.
`The pages are l"ut1l'tcr optimized for compression by
`storing in the page header‘ 3. status flag indicating whether the
`data page is a candidate for compression and (optionally)
`what type of OOl'lJ.|IC5SiOl1 is best suited for the data on that
`page. Since this is senable on a page-by-page basis. one
`particular compression technique can be applied on one
`column yet at the same time another‘ compression technique
`applied on another’ column (or a diflercnt part of the same
`column}. all within the same (logical) table.
`Data compression is added to the system at the level ofdte
`Cache or Bufier Managers. It is preferable to isolate con1—
`pression here so that each object need not be concerned
`about compressing itself (or even being aware that compres-
`sion is occurring]. As a result. compression is tranwently
`added to all data objects managed by Bufler Managers. The
`data pages of an object are compressed when sent out to disk
`and decompressed when retrieved Erorn disk. yet the object
`itself is unaware of this action.
`
`Most objects within the system. such as tables. indexes.
`logs. and die lilte. exist as pages. As these objects are
`streamed to disk. each simply requests its Buffer Manager to
`store the object on disk. The Manager in tttrn stores the
`object on disk using the best cornpresnon methodology
`known to it. for the given object. In this manner. data
`compression is transparent to the data objects above the
`level of the Buficr Manager.
`To address the potential problem of a modified block no
`longer compressing back down to its original size. a “block
`map" is employed in the system of the present invention.
`Within a Bufier Manager. there can be as many instances of
`a block map as needed. Typically. one exists per object (or
`portion thereof}. For instance. one block map can exist pa’
`B—'ll'ee. or one per portion of a B-Tree (e.g.. non-leaf level
`pages). Each block map. in turn. may be thought of as a
`logical-to-physical translator for the object (or subobject)
`which is its focus. In this manner. the system can concentrate
`on bringing in the object (or portion of the object) which is
`needed. Each page number proviclodto a client serves as an
`index into the couesponding block map for determining the
`actual physical page number. For implementations without
`compression. this translator may be eliminated.
`BRIEF DESCRIPTION OF 'I'l'IE DRAWINGS
`
`FIG. IA is a block diagram of a computer system in which
`the present invention may be embodied.
`FIG. 1B is a block diagram of a software subsystem for
`controlling the operation of the computer system of FIG- 1A.
`FIG. 2 is a block diagram of it clientfservcr system in
`which the present invention is preferably embodied.
`
`

`
`5 .79-‘L329
`
`5
`FIGS. 3A-C are diagrams illustrating column-based data
`storage or “vertical partitioning." which is employed for
`storing data tables in the system of the present invention.
`FIGS. £|A—D are diagrams illustrating "natural data reduc-
`tion” compression of data. types which would otherwise
`include unused bits: natural data reduction may be combined
`with other compression methodologies. as shown particu-
`larly in FIG. 415.
`FIG. 5 is a diagram illustrating per page compression
`rnethodology of die present invention. which allows di.lfer-
`ent compression methodology to be applied to different data
`pages {of either dillerent page chains or even within the
`same page chain].
`DETAJLED DE.SCR.lPTIOI\T OF A PREFERRED
`EMBODIMENT
`
`The following description will focus on the presently
`preferred embodiment of the present invention. which is
`operative in a network environment executing clientfserver
`database applications. The present invention. however. is not
`limited to any particular application or environment. Instead.
`those skilled in the art will find that the present invention
`may be advantageously applied to any application or envi-
`ronment where optimization of query performance is
`desirable. including non-SQL database management sys-
`tems and the lilre. The desuiption of the exemplary embodi-
`ments which follows is. therefore. for the purpose of illus-
`tration and not limitation.
`Standalone System Hardware
`The invention may be embodied on a computer system
`such as the system 100 of FIG. 1A. which comprises a
`central processor 101. a main memory 102. an inputfoutpuit
`controller 103. a keyboard 104. a pointing device 105 (e.g..
`mouse. track ball. pen device. or the like). a screen display
`device 106. and a mass storage 107 (e.g.. hard or listed disk.
`removable disk. optical disk. magneto-optical disk. cr flash
`memory). Processor 101 includes or is coupled to a cache
`memory 109 for storing frequently accessed information;
`memory 109 may be an on-drip cache or external cache (as
`shown). Additional output devicetsl 108. such as a. printing
`device. may be included in the system 100 as desired. As
`shown. the various components of the system 100 commu-
`nicate through a system bus 110 or similar architecture. In a
`1Ieferred embodiment. the system 100 includes an IBM-
`cornpatible personal computer system. available from at
`variety of vendors (including IBM ofArmon|r.. N.Y.).
`Standalone System Software
`Illustrated in FIG. LB. 21 computer software system 150 is
`provided for directing the operation of the computer system
`100. Software system 150. which is stored in system
`memory 102 and on mass storage or disk memory 107.
`includes a kernel or operating system (05) 140 and a
`windows shell 145. One or more application programs. such
`as application software programs 155. may be "loaded" {i.e..
`transferred from storage Ill? into memory 102) for execu-
`tion by the system I00. The system also Includes a user
`interface 160 for receiving user commands and data as input
`and displaying result data as output.
`Also shown. the software system 150 includes a Rela-
`tional Database Man-agernent System (RDBMS) from-end
`or "client" 170. The RDB-MS client 17!] may be any one of
`a numba of database front-ends. including PowerBuilde.r"'“.
`dBA.SE®. Paradox®. lVlic1'osofi® Access. or the like. In an
`exemplary embodiment. the front-end will include SQL
`access drivers {e.g.. Borland SQL Links. Microsoft ODBC
`drivers. Intersolv ODBC drivers. and the like) for accessing
`database tables from an SQL database server operating in a
`Clicntfserver environment.
`
`25
`
`35
`
`55
`
`65
`
`6
`ClienU'Server Database Management System
`While the present invention may operate within a single
`[standalone] computer (e.g.. system 100 of FIG. IA]. the
`present invention is preferably embodied in a multi-user
`computer system. such as :1 Clientfserver system. FIG. 2
`illustrates the general structure of a Clienttserver Database
`System 200 suitable for implementing the present invention.
`As shown. the system 200 comprises one or more Client(sJ
`210 connected to a Server 230 via a Network 220.
`Specifically. the Client(s) 210 comprise one or more stan-
`dalone Terminals 211 connected to a Database Server Sys-
`tem 240 using a conventional network. In an exemplary
`embodiment. the Terminals 211 may themselves comprise a
`plurality of standalone workstations. dumb tennina.ls. or the
`like. or comprise personal computers (PCs) such as the
`above-described system 100. Typically. such units would
`operate under a client operating system. such as Microsoft
`v\FuidowslMS-DOS for PC clients.
`The Database Server System 240. which comprises
`Sybase SQI. Serverfi‘ (available from Sybase. Inc. of
`Erneryville. Calif.) in an exemplary embodiment. generally
`operates as an independent process (is... independently oi
`the Clients). running under a server operating system such as
`Microsoft Windows NT (Microsoft Corp. of Redmond.
`Was]-1.). Netware {Ncvell of Provo. UT). UNIX {Nove.ll). or
`08.32 (IBM). The Network 220 may be any one of a number
`of conventional network systems. including a Local Area
`Network (LAN; or Wide Area Network (WAN,1. as is lcnown
`in the art te.g.. using Ethernet. IBM Token Ring. or the lilre}.
`The Network includes functionality for packaging client
`calls in the well-known SQL {Structured Query Language)
`together with any parameter information into a format (of
`one or more packets} suitable for transmission across at cable
`or wire. for delivery to the Database Server System 240.
`Clientlserver environments. database servers. and net-
`works are well documented in the technical. trade. and
`patent literature. For a discussion of database servers and
`clientlserver environrnents generally. and SQL Server"
`particularly. see. e.g.. Nani. A.. The Guide to SQI. Server.
`Second Edition. Addison-Wesley Publishing Company.
`l995. Additional documentation of SQL Server“! is avail-
`able from Sybase. Inc. as SQL Server Documentation Set
`(Catalog No. 49600). For a discussion of a computer net-
`work employing Microsoft Networltsl0penNet File Sharing
`Protocol. see MEI!-[OD AND SYSTEM FOR O1-Porrrtn~ns11c
`Loctcmo us it NETWORKED COMPUTER S‘rS'I'EJul. IntL Appli-
`cation No. PCl':'US90!ll-I|5'i'0.
`lnll. Publication No. W0
`9l!O3ll24. Intl. Publication Date Mar. 7. I991. For a general
`introduction to a Local Area Network operating under
`Netware. see Freed. L. et al.. PC‘ Magazine Guide to Using
`Ncrlihre. Zifi-Davis Press. l99I. A more detailed discussion
`is available in Neiware 3.x and 4.x and accompanying
`docttrnentation. which is available from Novcll of Provo.
`UT. The disclosures of each of the foregoing are hereby
`incorporated by reference.
`In operation. the Client{s) 210 store data in or retrieve
`data from one or more database tables 250. shown in FIG.
`2. 'l')'picaI!y resident on the Serve: 230. each table itself
`comprises one or more horizontal rows or “records” (triples)
`together with vertical columns or “fit-.lds." Adatabase record
`includes information which is most conveniently repre-
`sented as a single unit. A l't'.t.'.‘.Cl't‘.l for an employee. for
`example. may include information about the employee's ID
`Number. Last Name and First Initial. Position. Date Hired.
`Social Security Number. and Salary. Thus. a typical record
`includes several categories of information about an indi-
`vidual person. plaoe. or thing. Each of these categories. in
`
`

`
`5.794.229
`
`7
`turn. represents a database field. In the foregoing employee
`table. for example. Position is one field. Date Hired is
`another. and so on. With this format. tables are easy for users
`to understand and use. Moreover. the flexibility of tables
`permits a user to define relationships between various items
`of data. as needed
`In operation. the Clientts) issue one or more SQL com-
`mands to the Server. SQI. commands may specify. for
`instance. a query for retrieving particular data (i.e.. data
`records meeting the query condition) from the table 250. The
`syntax of SQL (Structured Query Language) is well docu»
`meutcd; sec. 13.3.. the abovomenticncd An Introduction to
`Database Systems. In addition to retrieving the data from
`Database Server tables. the Clicnt(s) also include the alility
`to insert new rows of data records into the table: Clientts)
`can also modify andlor delete existing records in the table.
`For enhancing the speed in which the Database Server
`stores. retrieves. and presents particular data records. the
`Server maintains one or more database indexes 271 on the
`table. unda control of an Index Manager. Adatabase index.
`typically maintained as a B-Tree data structtlre. allows the
`records of a table to be organized in many dilferent ways.
`depending on a particular user's needs. An index may be
`constructed as a single disk file stering index key values
`together with unique record numbers. The former is a data
`quantity composed of one or more fields from a record: the
`values are used to arrange (logically) the database file
`records by some desired order (index expression)- The latter
`are unique pointers or identifiers to the actual storage
`location ct each record in the database tile. Both tuereferred
`to internally by the system for locating and displaying
`records in a database file. Alternatively. instead of stcring
`unique record numbers. a “clustered” index may be
`employed. This is an index which stores the data pages ofthe
`records themselves on til: terminal or leaf-level nodes of the
`index.
`In operation. the SQL statements received from the one or
`more Clients 2]! (via Network 220) are processed by
`Engine 160 of the Database Saver System 240. The Engine
`260 itself comprises a Parser 261. Nornalizcr 263. Compiler
`265. Execution Unit 269. Access Methods 270. and Bufier
`Manageus) 272.. Specifically. the SQL statements are passed
`to the Pwser 261 which converts the statements into a query
`trce—a binary tree data structnn: which represents the
`components of the query in a format selected for the
`convenience of the system. In this regard. the Parser 261
`employs conventional parsing methodology (e.g.. recursive
`descent parsing).
`The query tree is ncrnsalized by the Normalize: 263.
`Nonualization includes. for example. the elimination of
`redundant data. Additionally. the Normalize: 263 performs
`error checlcing. such as confirming that table names and
`column names which appear in the query are valid (e.g.. are
`available and belong together). Finally. the Normalize: can
`also look up any referential integrity constraints which exist
`and add those to the query.
`After normalization.
`the query tree is passed to the
`Compiler 265. which includes an Optimizer 266 and a Code
`Generator 267. The Optimizer is responsible for optimizing
`the query tree. The Optimizer performs a cost-based analysis
`for formulating a query execution plan. The Optimizer will.
`for instance. select the join order of tables (e.g.. when
`working with more than one table): it will select relevant
`indexes {c.g.. when indexes are available}. The Optimizer.
`thaeforc. performs an analysis of the query and picks the
`best execution plan. which in mm results in particular ones
`of the Access Methods 270 being invoked during query
`execution.
`
`I6
`
`35
`
`45
`
`SD
`
`65
`
`8
`The Code Generator 267. on the other hand. converts the
`quay tree into a set of instructions suitable for satisfying the
`query. These instructions are passed to the Execution Unit
`269. Operating undo" the control of these instructions. the
`Execution Unit 269 generates call

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket