throbber
(12) United States Patent
`Mattis et ai.
`
`111111
`
`1111111111111111111111111111111111111111111111111111111111111
`US006292880Bl
`US 6,292,880 BI
`Sep.18,2001
`
`(10) Patent No.:
`(45) Date of Patent:
`
`(54) ALIAS-FREE CONTENT-INDEXED OBJECT
`CACHE
`
`(75)
`
`Inventors: Peter Mattis, Belmont; John Plevyak,
`San Francisco, both of CA (US);
`Matthew Haines, Laramie, WY (US);
`Adam Beguelin, San Mateo, CA (US);
`Brian Totty, Foster City, CA (US);
`David Gourley, Palo Alto, CA (US)
`
`(73) Assignee: Inktomi Corporation, Foster City, CA
`(US)
`
`( *) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.c. 154(b) by 0 days.
`
`(21) Appl. No.: 09/060,886
`
`(22) Filed:
`
`Apr. 15, 1998
`
`(51)
`Int. CI? ...................................................... G06F 12/00
`(52) U.S. CI. ................................ 711/216; 7111118; 707/7
`(58) Field of Search ................................ 7111118, 3, 123,
`7111206,216,221; 707/3, 10,203,7
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`4,918,646 *
`5,542,087 *
`5,611,049
`5,748,954
`5,850,443
`5,870,763
`
`Hirose ...................................... 707/3
`Neimat et al. ......................... 707/10
`Pitts ... ... ... ... .... ... ... ... ... ... .... .... 707/10
`Mauldin ................................... 707/8
`Oorschot et al. ...................... 380/21
`Lomet ... ... ... .... ... ... ... ... ... ...... 707/202
`
`4/1990
`7/1996
`3/1997
`5/1998
`* 12/1998
`2/1999
`
`................... 709/101
`2/1999 Copeland et al.
`5,872,969
`5,978,791 * 11/1999 Farber et al.
`............................ 707/2
`* cited by examiner
`Primary Examiner-Matthew Kim
`Assistant Examiner-Matt Anderson
`(74) Attorney, Agent, or Firm-Hickman Palermo Truong
`& Becker LLP; Christopher 1. Palermo; Marcel K. Bingham
`
`(57)
`
`ABSTRACT
`
`A method for caching information objects is provided.
`Information objects are stored in portions of a non-volatile
`storage device called arenas, which are contiguous regions
`from which space is allocated in parallel. Objects are con(cid:173)
`tiguously allocated within an arena and are mapped to
`directory tables that provide an efficient search mechanism.
`Each object is identified by a name key and a content key.
`The name key is constructed by applying a hash function to
`the composition of the name or URL of the object along with
`implicit or explicit context about the request. The content
`key is constructed by applying a hash function to the entire
`contents of the object data. Buckets and blocks in the
`directory tables store tags and subkeys derived from the
`keys. Since duplicate objects that have different names will
`hash to the same content key, the cache can detect duplicate
`objects even though they have different names, and store
`only one copy of the object. As a result, cache storage usage
`is dramatically reduced, and tracking object aliases is not
`required. The disclosure also encompasses a computer
`apparatus, computer program product, and computer data
`signal embodied in a carrier wave that are configured
`similarly.
`
`26 Claims, 26 Drawing Sheets
`
`c~agTable102
`1Q4a
`
`1Q4b
`
`DirectoryTable 110
`
`112a
`
`110
`
`106
`
`104n
`
`132a-n
`
`121
`
`112n
`
`Size 12.0.
`
`Vector of Alternates
`
`Vector of Alternates
`
`Vector of Alternates
`
`m.a
`Request Info
`Response Info
`Object Key
`
`Page 1
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 1 of 26
`
`US 6,292,880 BI
`
`FIGURE 1
`
`Network 20
`
`3.Q
`Proxy
`
`4Q
`Server
`
`1O.a
`Client
`
`1Qb
`Client
`
`Page 2
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 2 of 26
`
`US 6,292,880 BI
`
`FIGURE 2
`
`au Cache
`
`-.
`
`I Ta~s I
`
`Proxy 3D
`
`IQ
`Protocol
`Engine
`
`~
`
`,.
`I/O Core
`
`6.Q
`
`2.Q
`Intemet
`
`'-
`
`'-
`
`",...
`~
`
`---
`
`~
`
`'-
`
`9.O.a
`
`,
`,
`
`9.Ob
`
`I
`
`90n
`
`-...
`
`-..
`
`~
`
`.......
`....
`
`.-'
`
`5.Q
`Request for object
`
`/
`
`/
`~ J
`
`.1Qa
`Client
`
`Page 3
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 3 of 26
`
`US 6,292,880 BI
`
`FIGURE 3A
`
`52
`Object
`
`5.4
`Hash function
`
`."
`
`(Object Key 56
`
`sa Set Subkey
`
`5..9. Tag
`Subkey
`
`5..[ Remainder Subkey
`
`FIGURE 38
`
`5.3.
`
`Hash function
`
`Object name • SA
`\... •
`
`6.4
`fiB.
`Subkey 1 Subkey 2
`
`62.
`Name Key
`
`Page 4
`
`

`

`...------.
`~ - " -
`131a-- I--
`-- ---
`.----
`131 a.-n-~c"~-----~~ --- - :::--
`.'-
`
`;.
`
`- -'
`
`__ __ •
`
`__ 0-
`
`-
`
`- - - - - - - -
`-----.-
`
`-'---.- - -
`
`-
`
`-f-
`
`FIGURE4A
`
`'-'-,-- --
`
`'"
`
`•
`
`liMa
`
`1Mb
`
`I
`
`--....
`
`(
`104n
`
`106
`
`~
`11Qa
`
`"I
`
`.11Q.b.
`
`11Qc
`
`I
`
`112a
`
`110~
`
`I
`11Qn
`
`e- j-----.
`
`132a ,
`
`-
`
`. -
`
`.- -
`
`.~
`
`. -
`
`-
`
`131n
`
`132n
`
`.-
`
`--
`
`-.- -
`
`- --
`
`~~132a-n
`
`21
`
`r'
`
`d •
`rJl
`•
`~
`~ .....
`~ = .....
`
`'JJ.
`~
`'?
`'""'"
`~CIO
`N
`C
`C
`'""'"
`
`'JJ. =(cid:173)~
`~ .....
`o ....,
`N
`0'1
`
`~
`
`e
`
`rJ'l
`0'1
`N
`\0
`N
`00
`00
`Q
`~
`I--"
`
`112n
`~
`Subkey .116 _
`
`Disk location 118
`
`---~
`
`.--~
`
`Size 120
`
`-L
`
`•
`
`112b
`
`Vector of Alternates 122a J
`Vector of Alternates 122b J
`Vector of Alternates 122n J
`.1 ~ ~
`123a
`Request Info
`Response Info
`I Object Key
`
`I
`
`I
`
`124
`Stored Object
`
`Page 5
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 5 of 26
`
`US 6,292,880 BI
`
`FIGURE 48
`
`Storage device illl.a
`
`.122a
`12.3.g
`"English"J "English"J
`Object Key
`
`12.3..b.
`"IE4"J "IE4"J
`Object Key
`
`123c
`"Navigator/' ..
`Object Key
`
`r-- Stored Object 124a
`
`r 1L-__ 12-::-2_b _----IH Stored Object 1ill. H~,'--__ 1.2_2_n_----'
`
`/
`
`Vectors of Alternates122a-n
`
`Page 6
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 6 of 26
`
`US 6,292,880 BI
`
`DIRECTORY TABLE 110
`
`/
`
`110a
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`SEGMENT 111 a
`
`SEGMENT 111 b
`
`'"'C!":!M~NTA 11
`~I...\..:l c.
`I. C
`
`- -
`
`•
`•
`.!. - -
`•
`•
`•
`•
`•
`.. - - -
`•
`•
`•
`•
`•
`•
`•
`•
`•
`liOn
`
`-
`
`FIGURE 4C
`
`Page 7
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 7 of 26
`
`US 6,292,880 BI
`
`FIGURE 5
`
`Directory Table 110
`r - - - - - - - - - - - - - - - - - - - - - l
`
`Tail
`141e
`
`Head
`141a
`
`141d
`
`141b
`
`Subkey 132 I~
`\
`
`A---+-I -~~ Fragment 208c
`
`I
`I
`
`Fragment 208a
`
`Fragment 208b
`
`Page 8
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 8 of 26
`
`US 6,292,880 BI
`
`FIGURE 6
`
`Pool 2QQa
`
`I
`I
`
`Pool 2llOh
`
`I
`I
`I
`I
`
`]
`
`I
`
`- - - -
`
`~ ~
`
`Arena
`Arena
`.\
`\
`~~---~~-~~~~~204n
`204b
`
`Storage Device lliJB
`
`I
`I
`
`200 n
`'--
`;:
`202
`
`~---:
`/
`
`/
`
`Header
`
`I
`
`I
`
`I
`
`I
`
`I
`
`204a
`
`Arena
`\
`~204a
`
`--~~-
`
`Fragment
`
`208b
`
`I
`
`/
`
`,
`
`2.Q8.d Fragment
`Header
`
`2.O.B..e Fragment Data
`-
`
`/
`
`,
`
`/
`
`/
`
`/
`
`12Qfu; Magic 112flfia Key I [2DJlli Length I
`
`\
`
`~
`
`2.Q2 Pool Header
`Magic
`Version No.
`No. of Arenas
`
`2O.6..a Arena Header
`Empty
`Corrupt
`
`I ,
`2ll6n Arena Header
`Empty
`Corrupt
`
`Page 9
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 9 of 26
`
`US 6,292,880 BI
`
`FIGURE 7
`
`210
`204a
`~~ (
`
`2O.8d Fragment
`Header
`
`2.O.ae Fragment
`Data
`
`Page 10
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 10 of 26
`
`US 6,292,880 BI
`
`FIGURE 8A
`
`to)
`
`"Low water mark"
`"High water mark"
`
`Last arena accessed
`Number of accesses
`
`Garbage
`Collection
`
`8D2
`Select pool
`
`.8.0.4
`Select arena
`
`8.0.6.
`Select fragment
`
`Selection factors
`
`BQ8.
`Fragment to be
`deleted?
`
`B.12
`Mark fragment as
`deleted
`
`81Q
`Write fragment to new
`arena
`
`8.14
`Update Directory Table
`
`( B1fi Don~
`
`Page 11
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 11 of 26
`
`US 6,292,880 BI
`
`FIGURE 88
`
`Write fragment
`to new arena
`
`578
`
`90
`
`Update
`Open
`Index
`
`Yes
`
`Step 806
`
`,..-----!..--....., 582
`Write object header
`in available Arena
`
`Move top
`pointer of Arena
`
`Reset Top
`Pointer
`
`(ma Don~
`
`88
`
`Write object data 1----.1
`in selected Arena
`
`Page 12
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 12 of 26
`
`US 6,292,880 BI
`
`820
`
`822
`
`824
`
`Synchronization
`
`Write object to
`cache
`
`Create metadata
`in Open Directory
`
`Write and synch
`object data to disk
`
`1---826
`
`Get next metadata
`from Open Directory
`
`1---828
`
`FIGURE Be
`
`B21
`Garbage collection
`
`m
`For each fragment in Arena, delete
`directory metadata that points to the
`fragment and write it to disk
`
`B.3.1
`Modify pool header in
`memory to show Arena as
`empty
`
`~
`Write pool header and
`sync to disk
`
`B2.3.
`Copy metadata from Open
`Directory to Directory Table
`
`825
`Sync changes
`
`Page 13
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 13 of 26
`
`US 6,292,880 BI
`
`FIGURE 80
`
`Open Directory:
`"checkout_read"
`
`Key
`
`NO
`
`B32
`Get lock for key
`YES
`
`B..3.4.
`Compute bucket for
`key
`
`NOT FOUND
`
`8.3fi
`Scan blocks in
`bucket's list for key
`FOUND
`
`8.3..8
`Is block being created
`or destroyed?
`
`.8A.Q
`Return "not available"
`error
`
`B..42
`Increment block
`reader count
`
`8..4A
`Return copy of open
`directory block
`
`Ma
`Retum "does not exist"
`error
`
`NOT FOUND
`
`B..4fi
`Cache Directory
`lookup
`FOUND
`
`B.5.Q
`Create Open Directory
`block; initialize block
`from Directory Index;
`add block to bucket's
`block list
`
`Page 14
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 14 of 26
`
`US 6,292,880 BI
`
`FIGURE 8E
`
`Open Directory:
`"checkout_write"
`
`Key
`
`NO
`
`8M
`Get lock for key
`YES
`
`B..5..6
`Compute bucket for
`key
`
`B5.8
`Scan blocks in
`bucket's list for key
`
`.8.6A.
`Is block being created
`or destroyed?
`
`BA.Q
`Return "not available"
`error
`
`a4.2
`Increment block
`reader count
`
`.8..4..4
`Return copy of open
`directory block
`
`B.62
`Return "does not exist"
`error
`
`NOT FOUND
`
`.8.6D
`Cache Directory
`lookup
`FOUND
`
`85D.
`Create Open Directory
`block; initialize block
`from Directory Index;
`add block to bucket's
`block list
`
`Page 15
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 15 of 26
`
`US 6,292,880 BI
`
`FIGURE 8F
`
`Open Directory:
`"checkout_create"
`
`Key
`
`NO
`
`NOT FOUND
`
`Get lock for key
`YES
`BIB.
`Compute bucket for
`key
`
`B..8..Q
`Scan blocks in
`bucket's list for key
`
`FOUND
`
`aB.6
`Return "already exists"
`error
`
`FOUND
`
`.aM
`Cache Directory
`lookup
`
`NOT FOUND
`
`.aM.
`NO
`>----~ Return "not available"
`error
`
`BB.B.
`Create Open Directory
`block; initialize block
`from Directory Index;
`add block to bucket's
`block list
`
`B.9.2
`Return copy of open directory
`block and mark block as
`being modified
`
`BBfi
`Increment block
`reader count
`
`8..8.B.
`Initialize block from
`key
`
`aao
`Increment block writer count
`and mark copy as created
`
`Page 16
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 16 of 26
`
`US 6,292,880 BI
`
`FIGURE 9A
`
`91B.
`Look up key value in
`Directory Table
`
`9Q2
`LOOKUP
`
`9.DA
`Convert object name
`in request to key value
`
`9ili3.
`Look up request key
`value in Open Directory
`
`YES
`
`9.1ll
`Obtain object and
`deliver to client
`
`.922
`Update Open
`Directory if needed
`
`9.12
`Look up request key
`value in Tag Table
`
`YES
`
`am
`Store Tag Table miss
`
`924.
`Cache Miss
`
`92fi
`Obtain object from
`source server
`
`92B
`Done
`
`Page 17
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 17 of 26
`
`US 6,292,880 BI
`
`FIGURE 98
`
`Open Directory
`block checkin
`
`. /
`
`......
`
`I
`I
`
`Block
`
`.....
`./' .
`\1
`
`-
`
`NO
`
`93[l
`Get lock for key
`YES
`
`\I
`
`.9..32
`Is block being
`modified?
`NO
`\/
`9.4.6.
`Decrement reader
`count
`
`YES '-
`B34
`.,. Decrement writer count
`
`\I
`
`.9.3..6
`Is checkin successful?
`
`NO
`
`9..42
`YES .....
`,. Copy block info back to
`original block
`I
`
`NO
`
`9..3.8
`Is delete checkin flag
`set?
`YES
`\V
`
`9.4A
`Mark block as deleted
`
`-
`
`9A.O. Done
`
`Page 18
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 18 of 26
`
`US 6,292,880 BI
`
`FIGURE 9C
`
`Key
`
`95D.
`Return not found
`
`9A.8.
`Check out directory
`blocks
`
`S5A
`Check in directory
`blocks
`
`95fi
`Return found
`
`Page 19
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 19 of 26
`
`US 6,292,880 BI
`
`FIGURE 90
`
`Key
`
`FAILURE
`
`9.5.a
`Check out directory
`blocks
`
`SUCCESS
`
`9.5A.
`Check in directory blocks
`with deletion flag set
`
`Page 20
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 20 of 26
`
`US 6,292,880 BI
`
`I Cache open read
`
`,-....
`
`FIGURE 9E
`
`-
`
`Object name
`Request info
`
`FAILURE
`
`\v
`- - -
`912
`FAILURE ""-
`." Return no document
`error
`
`9M
`Check out vector for
`reading
`SUCCESS
`\1
`
`9.6.6.
`Use request info to select
`alternate from vector
`
`SUCCESS
`,V
`.9.6.8
`Check in vector
`
`9lQ
`Read object pointed to
`by alternate
`
`Page 21
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 21 of 26
`
`US 6,292,880 BI
`
`Cache open write
`
`)IIIi<~--f
`
`FIGURE 9F
`
`Object name
`Request info
`Response info
`
`\V
`
`Check out vector for
`writing
`SUCCESS
`
`-
`
`FAILURE
`
`9.8A
`Create vector
`
`9.B2
`FAILURE
`r--I...----=~~ Return no document
`error
`
`9lfi
`Use request info to define
`alternate for vector; add
`alternate to vector
`
`SUCCESS
`'V
`9lB.
`Check in vector
`
`" 9.8.0.
`
`Write object pointed to
`by alternate
`
`Page 22
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 22 of 26
`
`US 6,292,880 BI
`
`FIGURE 9G
`
`Object name
`Old identifier
`Request info
`Response info
`
`""\
`
`./
`
`-
`
`9Ba
`..... Return not available
`error
`
`/'
`
`\...
`
`FAILURE
`
`Cache update
`
`.,.
`--
`
`\V
`9.8.6
`Check out vector for
`writing
`
`SUCCESS
`\ V
`aaa
`Construct new clone
`vector
`
`\V
`
`9ili1
`Write new clone vector
`
`t
`
`9.92.
`Check in clone vector
`
`\ I
`
`9.9A Done
`
`Page 23
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 23 of 26
`
`US 6,292,880 BI
`
`FIGURE 10A
`
`65
`
`r----I~ Request Arena with
`available space
`
`fi5fi
`Store contents of 'Mite
`aggregation buffer in Arena
`
`660
`
`Update index
`
`642
`
`Allocation
`and Write Object
`
`Look up and retrieve
`object from origin
`server
`
`Get object length
`from HITP request, if
`
`present ,
`
`Store object in write
`aggregation buffer
`
`648
`
`fi5.D.
`Write
`aggregation
`buffer full?
`
`Page 24
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 24 of 26
`
`US 6,292,880 BI
`
`FIGURE 108
`
`620
`
`622
`
`624
`
`Scaled Counter
`Updating
`
`all counters
`
`Sum all
`counter values
`
`626
`
`Compute Maximum
`Value that can be
`represented by a counter
`
`Decrement all
`counter values by 1
`
`632
`
`Page 25
`
`

`

`FIG. 11
`
`DISPLAY
`1112
`
`INPUT DEVICE
`111A
`
`CURSOR
`CONTROL
`111.fi
`
`MAIN I:l
`
`MEMORY
`.11Ofi
`
`STORAGE
`DEVICE
`111.Q
`
`BUS
`
`1102
`
`PROCESSOR
`.11.0.4
`
`COMMUN ICA TION
`INTERFACE
`
`.1118
`
`d •
`rJl
`•
`~
`~ .....
`~ = .....
`
`'JJ.
`~ '?
`'""'"
`~CIO
`N
`C
`C
`'""'"
`
`'JJ. =(cid:173)~
`~ .....
`N
`Ul
`o ....,
`N
`0'1
`
`e
`
`rJ'l
`0'1
`N
`\0
`N
`00
`00
`Q
`~
`I-"
`
`I I SERV~~30
`
`1128
`
`HOST
`.1124
`
`Page 26
`
`

`

`u.s. Patent
`
`Sep.18,2001
`
`Sheet 26 of 26
`
`US 6,292,880 BI
`
`FIGURE 12
`
`1212
`(information object has
`not been loaded recently)
`
`12.1A
`Tell process "no'
`
`121fi
`Update expiration date value
`of information object to the
`current date or time
`
`12il2
`External process asks cache
`whether an object has been
`loaded by a client recently
`
`12M.
`Cache locates the information
`object in the cache
`
`12illi
`Read a Read Counter value
`associated in directory tables
`with the information object
`
`NO
`
`121Q
`Tell process 'yes"
`
`Page 27
`
`

`

`US 6,292,880 Bl
`
`1
`ALIAS-FREE CONTENT-INDEXED OBJECT
`CACHE
`
`FIELD OF THE INVENTION
`
`The present invention relates to information delivery, and
`relates more specifically to a cache for information objects
`that are to be delivered efficiently and at high speed over a
`network to a client.
`
`BACKGROUND OF THE INVENTION
`
`Several important computer technologies rely, to a great
`extent, upon rapid delivery of information from a central
`storage location to remote devices. For example, in the
`client/server model of computing, one or more servers are 15
`used to store information. Client computers or processes are
`separated from the servers and are connected to the servers
`using a network. The clients request information from one of
`the servers by providing a network address of the informa(cid:173)
`tion. The server locates the information based on the pro- 20
`vided network address and transmits it over the network to
`the client, completing the transaction.
`The World Wide Web is a popular application of the
`client/server computing model. FIG. 1 is a simplified block
`diagram of the relationship between elements used in a Web 25
`system. One or more web clients lOa, lOb, each of which is
`a computer or a software process such as a browser program,
`are connected to a global information network 20 called the
`Internet, either directly or through an intermediary such as
`an Internet Service Provider, or an online information ser- 30
`vIce.
`A web server 40 is likewise connected to the Internet 20
`by a network link 42. The web server 40 has one or more
`internet network addresses and textual host names, associ(cid:173)
`ated in an agreed-upon format that is indexed at a central
`Domain Name Server (DNS). The server contains multime(cid:173)
`dia information resources, such as documents and images, to
`be provided to clients upon demand. The server 40 may
`additionally or alternatively contain software for dynami(cid:173)
`cally generating such resources in response to requests.
`The clients lOa, lOb and server 40 communicate using
`one or more agreed-upon protocols that specify the format of
`the information that is communicated. A client lOa looks up
`network address of a particular server using DNS and
`establishes a connection to the server using a communica(cid:173)
`tion protocol called the Hypertext Transfer Protocol
`(HTTP). A Uniform Resource Locator (URL) uniquely
`identifies each information object stored on or dynamically
`generated by the server 40. A URL is a form of network
`address that identifies the location of information stored in
`a network.
`A key factor that limits the performance of the World
`Wide Web is the speed with which the server 40 can supply
`information to a client via the Internet 20. Performance is
`limited by the speed, reliability, and congestion level of the
`network route through the Internet, by geographical distance
`delays, and by server load level. Accordingly, client trans(cid:173)
`action time can be reduced by storing replicas of popular
`information objects in repositories geographically dispersed
`from the server. Each local repository for object replicas is
`generally referred to as a cache. A client may be able to
`access replicas from a topologically proximate cache faster
`than possible from the original web server, while at the same
`time reducing Internet server traffic.
`In one arrangement, as shown in FIG. 1, the cache is
`located in a proxy server 30 that is logically interposed
`
`2
`between the clients lOa, lOb and the server 40. The proxy
`server provides a "middleman" gateway service, acting as a
`server to the client, and a client to the server. A proxy server
`equipped with a cache is called a caching proxy server, or
`5 commonly, a "proxy cache".
`The proxy cache 30 intercepts requests for resources that
`are directed from the clients lOa, lOb to the server 40. When
`the cache in the proxy 30 has a replica of the requested
`resource that meets certain freshness constraints, the proxy
`10 responds to the clients lOa, lOb and serves the resource
`directly. In this arrangement, the number and volume of data
`transfers along the link 42 are greatly reduced. As a result,
`network resources or objects are provided more rapidly to
`the clients lOa, lOb.
`A key problem in such caching is the efficient storage,
`location, and retrieval of objects in the cache. This document
`concerns technology related to the storage, location, and
`retrieval of multimedia objects within a cache. The object
`storage facility within a cache is called a "cache object
`store" or "object store".
`To effectively handle heavy traffic environments, such as
`the World Wide Web, a cache object store needs to be able
`to handle tens or hundreds of millions of different objects,
`while storing, deleting, and fetching the objects simulta(cid:173)
`neously. Accordingly, cache performance must not degrade
`significantly with object count. Performance is the driving
`goal of cache object stores.
`Finding an object in the cache is the most common
`operation and therefore the cache must be extremely fast in
`carrying out searches. The key factor that limits cache
`performance is lookup time. It is desirable to have a cache
`that can determine whether an object is in the cache (a "hit")
`or not (a "miss") as fast as possible. In past approaches,
`35 caches capable of storing millions of objects have been
`stored in traditional file system storage structures. Tradi(cid:173)
`tional file systems are poorly suited for multimedia object
`caches because they are tuned for particular object sizes and
`require multiple disk head movements to examine file sys-
`40 tem metadata. Object stores can obtain higher lookup per(cid:173)
`formance by dedicating DRAM memory to the task of object
`lookup, but because there are tens or hundreds of millions of
`objects, the memory lookup tables must be very compact.
`Once an object is located, it must be transferred to the
`45 client efficiently. Modern disk drives offer high performance
`when reading and writing sequential data, but suffer signifi(cid:173)
`cant performance delays when incurring disk head move(cid:173)
`ments to other parts of the disk. These disk head movements
`are called "seeks". Disk performance is typically con-
`50 strained by the drive's rated seeks per second. To optimize
`performance of a cache, it is desirable to minimize disk
`seeks, by reading and writing contiguous blocks of data.
`Eventually, the object store will become full, and particu(cid:173)
`lar objects must be expunged to make room for new content.
`55 This process is called "garbage collection". Garbage collec(cid:173)
`tion must be efficient enough that it can run continually
`without providing a significant decrease in system
`performance, while removing objects that have the least
`impact on future cache performance.
`60 PAST APPROACHES
`In the past, four approaches have been used to structure
`cache object stores: using the native file system, using a
`memory-blocked "page" cache, using a database, and using
`a "cyclone" circular storage structure. Each of these prior
`65 approaches has significant disadvantages.
`The native file system approach uses the file system of an
`operating system running on the server to create and manage
`
`Page 28
`
`

`

`US 6,292,880 Bl
`
`3
`a cache. File systems are designed for a particular applica(cid:173)
`tion in mind: storing and retrieving user and system data
`files. File systems are designed and optimized for file
`management applications. They are optimized for typical
`data file sizes and for a relatively small number of files (both 5
`total and within one folder/directory). Traditional file sys(cid:173)
`tems are not optimized to minimize the number of seeks to
`open, read/write, and close files. Many file systems incur
`significant performance penalties to locate and open files
`when there are large numbers of files present. Typical file 10
`systems suffer fragmentation, with small disk blocks scat(cid:173)
`tered around the drive surface, increasing the number of disk
`seeks required to access data, and wasting storage space.
`Also, file systems, being designed for user data file
`management, include facilities irrelevant to cache object
`stores, and indeed counter-productive to this application. 15
`Examples include: support for random access and selective
`modification, file permissions, support for moving files,
`support for renaming files, and support for appending to files
`over time. File systems are also invest significant energy to
`minimize any data loss, at the expense of performance, both 20
`at write time, and to reconstruct the file system after failure.
`The result is that file systems are relatively poorly for
`handling the millions of files that can be present in a cache
`of Web objects. File systems don't efficiently support the
`large variation in Internet multimedia object size-in par- 25
`ticular they typically do not support very small objects or
`very large objects efficiently. File systems require a large
`number of disk seeks for metadata traversal and block
`chaining, poorly support garbage collection, and take time to
`ensure data integrity and to repair file systems on restart.
`The page cache extends file systems with a set of fixed
`sized memory buffers. Data is staged in and out of these
`buffers before transmission across the network. This
`approach wastes significant memory for large objects being
`sent across slow connections.
`The database system approach uses a database system as
`a cache. Generally, databases are structured to achieve goals
`that make them inappropriate for use as an object cache. For
`example, they are structured to optimize transaction pro(cid:173)
`cessing. To preserve the integrity of each transaction, they 40
`use extensive locking. As a result, as a design goal they favor
`data integrity over performance factors such as speed. In
`contrast, it is acceptable for an object cache to lose data
`occasionally, provided that the cache does not corrupt
`objects, because the data always can be retrieved from the
`server that is original source of the data. Databases are often
`optimized for fast write performance, since write speed
`limits transaction processing speed. However, in an object
`cache, read speed is equally important. Further, databases
`are not naturally good at storing a vast variety of object sizes 50
`while supporting streaming, pipe lined I/O in a virtual
`memory efficient manner. Databases commonly optimized
`for fixed record size sizes. Where databases support variable
`record sizes, they contain support for maintaining object
`relationships that are redundant, and typically employ slow, 55
`virtual memory paging techniques to support streaming,
`pipelined I/O.
`In a cyclonic file system, data is allocated around a
`circular storage structure. When space becomes full, the
`oldest data is simply removed. This approach allows for fast 60
`allocation of data, but makes it difficult to support large
`objects without first staging them in memory, suffers prob(cid:173)
`lems with fragmentation of data, and typically entails naIve
`garbage collection that throws out the oldest object, regard(cid:173)
`less of its popularity. For a modest, active cache with a 65
`diverse working set, such first-in-first-out garbage collection
`can throw objects out before they get to be reused.
`
`4
`The fundamental problem with the above approaches for
`the design of cache object stores is that the solution isn't
`optimized for the constraints of the problem. These
`approaches all represent reapplication of existing technolo(cid:173)
`gies to a new application. None of the applications above are
`ideally suited for the unique constraints of multimedia,
`streaming, object caches. Not only do the above solutions
`inherently encumber object caches with inefficiencies due to
`their imperfect reapplication, but they also are unable to
`effectively support the more unique requirements of multi(cid:173)
`media object caches. These unique requirements include the
`ability to disambiguate and share redundant content that is
`identical, but has different names, and the opposite ability to
`store multiple variants of content with the same name,
`targeted for particular clients, languages, data types, etc.
`Based on the foregoing, there is a clear need to provide an
`object cache that overcomes the disadvantages of these prior
`approaches, and is more ideally suited for the unique
`requirements of multimedia object caches. In particular:
`1. there is a need for an object store that can store
`hundreds of millions of objects of disparate sizes, and
`a terabyte of content size in a memory efficient manner;
`2. there is a need for an object store that can determine if
`a document is a "hit" or a "miss" quickly, without
`time-consuming file directory lookups;
`3. there is a need for a cache that minimizes the number
`of disk seeks to read and write objects;
`4. there is a need for an object store that permits efficient
`streaming of data to and from the cache;
`5. there is a need for an object store that supports multiple
`different versions of targeted alternates for the same
`name;
`6. there is a need for an object store that efficiently stores
`large numbers of objects without content duplication;
`7. there is a need for an object store that can be rapidly and
`efficiently garbage collected in real-time, insightfully
`selecting the documents to be replaced to improve user
`response speed, and traffic reduction;
`8. there is a need for an object store that that can restart
`to full operational capacity within seconds after soft(cid:173)
`ware or hardware failure without data corruption and
`with minimal data loss.
`This document concerns technology directed to accom-
`45 plishing the foregoing goals. In particular, this document
`describes methods and structures related to the time-efficient
`and space-efficient storage, retrieval, and maintenance of
`objects in a large object store. The technology described
`herein provides for a cache object store for a high(cid:173)
`performance, high-load application having the following
`general characteristics:
`1. High performance, measured in low latency and high
`throughput for object store operations, and large num(cid:173)
`bers of concurrent operations;
`2. Large cache support, supporting terabyte caches and
`billions of objects, to handle the Internet's exponential
`content growth rate;
`3. Memory storage space efficiency, so expensive semi(cid:173)
`conductor memory is used sparingly and effectively;
`4. Disk storage space efficiency, so large numbers of
`Internet object replicas can be stored within the finite
`disk capacity of the object store;
`5. Alias free, so that multiple objects or object variants,
`with different names, but with the same content iden(cid:173)
`tical object content, will have the object content cached
`only once, shared among the different names;
`
`30
`
`35
`
`Page 29
`
`

`

`US 6,292,880 Bl
`
`6
`(E) storing the content key referencing the content object in
`each alternate vector element; and (F) establishing in the
`cache table, mappings from content keys to object content
`data.
`The invention also encompasses an apparatus, computer
`system, computer program product, and a computer data
`signal embodied in a carrier wave configured according to
`the foregoing aspects, and other aspects.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`5
`6. Support for multimedia heterogeneity, efficiently sup(cid:173)
`porting diverse multimedia objects of a multitude of
`types with size ranging over six orders of magnitude
`from a few hundred bytes to hundreds of megabytes;
`7. Fast, usage-aware garbage collection, so less useful 5
`objects can be efficiently removed from the object store
`to make room for new objects;
`8. Data consistency, so programmatic errors and hardware
`failures do not lead to corrupted data;
`9. Fast restartability, so an object cache can begin servic-
`ing requests within seconds of restart, without requiring
`a time-consuming database or file system check opera(cid:173)
`tion;
`10. Streaming, so large objects can be efficiently pipelined
`from the object store to slow clients, without staging
`the entire object into memory;
`11. Support for content negotiation, so proxy caches can
`efficiently and flexibly store variants of objects for the
`same URL, targeted on client browser, language, or 20
`other attribute of the client request; and
`12. General-purpose applicability, so that the object store
`interface is sufficiently flexible to meet the needs of
`future media types and protocols.
`
`10
`
`SUMMARY OF THE INVENTION
`
`The present invention is illustrated by way of example,
`and not by way of limitation, in the figures of the accom(cid:173)
`panying drawings and in which like reference numerals refer
`15 to similar elements and in which:
`FIG. 1 is a block diagram of a client/server relationship;
`FIG. 2 is a block diagram of a traffic server;
`FIG. 3Ais a block diagram of transformation of an object
`into a key;
`FIG. 3B is a block diagram of transformation of an object
`name into a key;
`FIG. 4A is a block diagram of a cache;
`FIG. 4B is a block diagram of a storage mechanism for
`25 Vectors of Alternates;
`FIG. 4C is a block diagram of multi-segment directory
`table;
`FIG. 5 is a block diagram of pointers relating to data
`fragments;
`FIG. 6 is a block diagram of a storage device and its
`contents;
`FIG. 7 is a block diagram showing the structure of a pool;
`FIG. 8A IS a flow diagram of a process of garbage
`35 collection;
`FIG. 8B is a flow diagram of a process of writing
`information in a storage device;
`FIG. 8C is a flow diagram of a process of synchronization;
`FIG. 8D is a flow diagram of a "checkout_read" process;
`FIG. 8E is a flow diagram of a "checkout_write" process;
`FIG. 8F is a flow diagram of a "checkout_create" pro-
`cess;
`FIG. 9A is a flow diagram of a cache lookup process;
`FIG. 9B is a flow diagram of a "checkin" process;
`FIG. 9C is a flow diagram of a cache lookup process;
`FIG. 9D is a flow diagram of a cache remove process;
`FI

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket