throbber
i'r3‘-ifiaflii'i3.""E~nti'.'.'.i”’TENii-'.’i‘.1'".'EH.'I'ti‘.".'1".4iiI]HfIf1:IELII
`
`
`
`
`
`
`
`50269-Ul5(3401-003)
`
`Ikuenf
`
`UNITED STATES PATENT APPLICATION
`
`FOR
`
`ALIAS-FREE CONTENT-INDEXED OBJECT CACHE
`
`INVENTORS I
`
`Q
`PETER MATHS I ‘gt;
`
`‘aw
`DAVID URLEY [,1
`
`
`
`PREPARED BY:
`
`MCDERMOTT.
`99 CA
`
`0
`
`O
`
`RY
`AZA
`
`703) 518-
`
`EXPRESS MAIL CERTIFICATE OF MAILING
`
`7 £
`"Express Mail" mailing label number
`/J 5, Z 3
`DateofDepos1't
`I hereby certify that this paper orfee is being epustt
`wi
`the United States Postal Service "Express Mail Post
`Office to Addressee" service under 37 CFR 1.10 an the date indicated above and is addressed to the Assistant
`Commissioner fur Patents. Washington. DC. 20231.
`
`6
`
`43x5%:vs4.¢A=/uzw
`€xs*:7¢}’
`(Typed or printed name of person rnaiiing paper or Fee)
`5
`or fee)
`
`(Signature of person l'|'|3i]ll'lg p
`
`50259-o1s(34o1-003)
`
`MICROSOFT
`
`EXHIBIT 1005
`
`Page1of119
`
`Page 1 of 119
`
`

`
`
`
`ABSTRACT OF THE DISCLOSURE
`
`A method for caching information objects is provided. Information objects
`
`are stored in portions of a non-volatile storage device called arenas, which are
`
`contiguous regions fiom which space is allocated in parallel. Objects are
`
`S
`
`eontiguously allocated within an arena and are mapped to directory tables that
`
`provide an efficient search mechanism. Each object is identified by a name key and a
`
`content key. The name key is constructed by applying a hash firnction to the
`
`composition of the name or URL of the object along with implicit or explicit context
`
`about the request. The content key is constructed by applying a hash function to the
`
`10
`
`15
`
`entire contents of the object data. Buckets and blocks in the directory tables store
`tags and subkeys derived from the keys. Since duplicate objects that have difierent
`names will hash to the same content key, the cache can detect duplicate objects even
`though they have different names, and store only one copy of the object. As a result.
`cache storage usage is dramatically reduced, andtracking object aliases is not
`required. The disclosure also encompasses a computer apparatus, computer program
`
`product, and computer data signal embodied in a carrier wave that are configured
`
`similarly.
`
`03401-002
`
`Page 2 of 119
`
`Page 2 of 119
`
`

`
`FIELD OF THE INVENTION
`
`The present invention relates to information delivery, and relates more
`
`specifically to a cache for information objects that are to he delivered efficiently and
`
`at high speed over a network to a client.
`
`BACKGROUND OF THE INVENTION
`
`Several important computer technologies rely. to a great extent, upon rapid
`
`delivery of information from a central storage location to remote devices. For
`
`example, in the clientlserver model of computing, one or more servers are used to
`
`10
`
`store information. Client computers or processes are separated from the servers and
`
`are connected to the servers using a network. The clients request information from
`
`one of the servers by providing a network address of the information. The server
`
`locates the information based on the provided network address and transrnits it over
`
`
`
`.:::
`
`.1"-1!-'utE.‘t
`
` IS
`
`the network to the client, completing the transaction.
`
`The World Wide Web is a popular application of the clientlserver computing
`
`model. Figure 1 is a simplified block diagram ofthe relationship between elements
`
`used in a Web system. One or more web clients 10a, 10b, each of which is a
`
`computer or a software process such as a browser program, are connected to a global
`
`information network 20 called the Internet, either directly or through an intermediary
`
`20
`
`such as an Internet Service Provider, or an online information service.
`
`A web server 40 is likewise connected to the Internet 20 by a network link
`
`42. The web server 40 has one or more intemet network addresses and textual host
`
`names, associated in an agreed-upon format that is indexed at a central Domain
`
`50269-015 (340l—003)
`
`Page3of119
`
`Page 3 of 119
`
`

`
`:1‘-"t‘ti
`.5.
`.-'==.
`
`.
`
`
`
`
`
`IE?»Il.':':n{E5"ll-"'|I'1IJlill’”“I-.'||
`
`Name Server (DNS). The server contains multimedia information resources, such as
`
`documents and images, to be provided to clients upon demand. The server 40 may
`
`additionally or alternatively contain software for dynamically generating such
`
`resources in response to requests.
`
`5
`
`The clients 10a, 101:: and server 40 communicate using one or more agreed-
`
`upon protocols that specify the format of the information that is communicated. A
`
`client 10a looks up network address of a particular server using DNS and establishes
`
`a connection to the server using a communication protocol called the Hypertext
`
`Transfer Protocol {I-ITTP}. A Uniform Resource Locator (URL) uniquely identifies
`
`10
`
`each information object stored on or dynamically generated by the server 40. A
`
`URL is a form of network address that identifies the location of information stored
`
`in a network.
`
`A key factor that limits the performance of the World Wide Web is the speed
`
`with which the server 40 can supply information to a client via the Internet 20.
`
`15
`
`Performance is limited by the speed, reliability, and congestion level of the network
`
`route through the Internet, by geographical distance delays, and by server load level.
`
`Accordingly, client transaction time can be reduced by storing replicas of popular
`
`information objects in repositories geographically dispersed from the server. Each
`
`local repository for object replicas is generally referred to as a cache. A client may
`
`20
`
`be able to access replicas from a topologically proximate cache faster than possible
`
`from the original web server, while at the same time reducing Internet server traffic.
`
`In one arrangement, as shown in Figure I, the cache is located in a proxy
`
`server 30 that is logically interposed between the clients Illa, 10b and the server 40.
`
`50269-015 (3401-003}
`
`Page4of119
`
`Page 4 of 119
`
`

`
`
`
`5ill!HITJI".3.-llJllfliSE14HUI
`
`
`
`
`
`
`
`
`
`
`
`
`iifiiilti]:5;"-iiii"-'5:"'ulift"
`
`
`
`
`
`The proxy server provides a “ rniddlernan” gateway service, acting as a server to the
`
`client, and a client to the server. A proxy server equipped with a cache is called a
`
`caching proxy server, or commonly, a “proxy cache”.
`
`The proxy cache 30 intercepts requests for resources that are directed from
`
`5
`
`the clients 10a, 10b to the server 40. When the cache in the proxy 30 has a replica of
`
`the requested resource that meets certain freshness constraints, the proxy responds to
`
`the clients 10a, 10b and serves the resource directly. In this arrangement, the number
`
`and volume of data transfers along the link 42 are greatly reduced. As a result,
`
`network resources or objects are provided more rapidly to the clients 10a, 10b.
`
`ID
`
`A key problem in such caching is the efiicient storage, location, and retrieval
`
`of objects in the cache. This document concerns technology related to the storage,
`
`location, and retrieval of multimedia objects within a cache. The object storage
`
`facility within a cache is called a “cache object store” or “object store".
`
`To effectively handle heavy traflic environments, such as the World Wide
`
`15 Web, a cache object store needs to be able to handle tens or hundreds of millions of
`
`different objects, while storing, deleting, and fetching the objects simultaneously.
`
`Accordingly, cache performance must not degrade significantly with object count.
`
`Performance is the driving goal of cache object stores.
`
`Finding an object in the cache is the most common operation and therefore
`
`20
`
`the cache must be extremely fast in carrying out searches. The key factor that limits
`
`cache performance is lookup time. It is desirable to have a cache that can determine
`
`whether an object is in the cache (a "hit") or not (a “rniss") as fast as possible. In
`
`past approaches, caches capable of storing millions of objects have been stored in
`
`50269-015 (3401-003)
`
`L-{
`
`Page5of119
`
`Page 5 of 119
`
`

`
`traditional file system storage structures. Traditional file systems are poorly suited
`
`for multimedia object caches because they are tuned for particular object sizes and
`
`require multiple disk head movements to examine file system metadata. Object
`
`stores can obtain higher lookup performance by dedicating DRAM memory to the
`
`5
`
`task of object lookup, but because there are tens or hundreds of millions of objects,
`
`the memory lookup tables must be very compact.
`
`Once an object is located, it must be transferred to the client efliciently.
`Modern disk drives offer high performance when reading and writing sequential
`data, but suffer significant performance delays when incurring disk bead movements
`to other parts ofthe disk. These disk head movements are called “seeks”. Disk
`performance is typically constrained by the drive’s rated seeks per second. To
`optimize performance of a cache, it is desirable to
`disk seeks, by reading
`and writing contiguous blocks ofdata.
`Eventually, the object store will become fiill, and particular objects must be
`expunged to make room for new content. This process is called “garbage
`
`10
`
`15
`
`collection" . Garbage collection must be efiicient enough that it can run continually
`
`without providing a significant decrease in system performance, while removing
`
`objects that have the least impact on future cache perfonnance.
`
`20
`
`PAST APPROACHES
`
`In the past, four approaches have been used to structure cache object stores:
`
`using the native file system, using a memory-blocked “ page" cache, using a
`
`S0269-D15 (3401-003)
`
`R
`
`Page6of119
`
`Page 6 of 119
`
`

`
`
`
`
`
`‘}.":_ii"ll-"."l1rlllfii“"‘Jill.'ii:‘SElitill‘Flll'.'.HEn[Ii
`
`
`
`
`
`
`
`
`
`database, and using a “ cyclone" circular storage structure. Each of these prior
`
`approaches has significant disadvantages.
`
`The native file system approach uses the file system of an operating system
`
`running on the server to create and manage a cache. File systems are designed for a
`
`‘J!
`
`particular application in mind: storing and retrieving user and system data files. File
`
`systems are designed and optimized for file management applications. They are
`
`optimized for typical data file sizes and for a relatively small number of files (both
`
`total and within one folderldirectory). Traditional file systems are not optimized to
`
`minimize the number of seeks to open, readfwrite, and close files. Many file
`
`10
`
`systems incur significant performance penalties to locate and open files when there
`
`are large numbers of files present. Typical file systems suffer fragmentation, with
`
`small disk blocks scattered around the drive surface, increasing the number of disk
`
`seeks required to access data, and wasting storage space. Also, file systems, being
`
`designed for user data file management, include facilities irrelevant to cache object
`
`15
`
`stores, and indeed counterproductive to this application. Examples include: support
`
`for random access and selective modification, file permissions, support for moving
`
`files, support for renaming files, and support for appending to files over time. File
`
`systems are also invest significant energy to
`
`any data loss, at the expense
`
`of perforrnance, both at write time, and to reconstruct the file system alter failure.
`
`20
`
`The result is that file systems are relatively poorly for handling the millions of files
`
`that can be present in a cache of Web objects. File systems don't efficiently support
`
`the large variation in Internet multimedia object size —— in particular they typically
`
`do not support very small objects or very large objects efiiciently. File systems
`
`50259-015 (3401-003)
`
`Page? of119
`
`Page 7 of 119
`
`

`
`E
`
`require a large number of disk seeks for rnetadata traversal and block chaining,
`
`poorly support garbage collection, and take time to ensure data integrity and to
`
`repair file systems on restart.
`
`The page cache extends file systems with a set of fixed sized memory
`
`5
`
`bufiers. Data is staged in and out of these buffers before transmission across the
`
`network. This approach wastes significant memory for large objects being sent
`
`across slow connections.
`
`The database system approach uses a database system as a cache. Generally,
`databases are structured to achieve goals that make them inappropriate for use as an
`object cache. For example, they are structured to optimize transaction processing.
`"To preserve the integrity ofeach transaction, they use extensive locking. As a result,
`as a design goal they favor data integrity over performance factors such as speed.
`In
`contrast, it is acceptable for an object cache to lose data occasionally, provided that
`the cache does not corrupt objects, because the data always can be retrieved from the
`server that is original source ofthe data. Databases are often optimized for fast write
`
`10
`
`15
`
`performance, since write speed limits transaction processing speed. However, in an
`
`object cache, read speed is equally important. Further, databases are not naturally
`
`good at storing a vast variety of object sizes while supporting streaming, pipelined
`
`V0 in a virtual memory efficient manner. Databases commonly optimized for fixed
`
`20
`
`record size sizes. Where databases support variable record sizes, they contain
`
`support for maintaining object relationships that are redundant, and typically employ
`
`slow, virtual memory paging techniques to support streaming, pipelined U0.
`
`50269-015 (3401—003)
`
`Page8of119
`
`Page 8 of 119
`
`

`
`
`
`
`
`tiltEin"dT"hlift“'"Ed’I':lIiiiif]!"321!EftliE'.tIU
`
`
`
`
`
`In a cyclonic file system, data is allocated around a circular storage structure.
`
`When space becomes full, the oldest data is simply removed. This approach allows
`
`for Fast allocation of data, but makes it difficult to support large objects without first
`
`staging them in memory, suffers problems with fragmentation of data, and typically
`
`5
`
`entails na'1've garbage collection that throws out the oldest object, regardless of its
`
`popularity. For a modest, active cache with a diverse working set, such first-in-first-
`
`out garbage collection can throw objects out before they get to be reused.
`
`The fundamental problem with the above approaches for the design of cache
`
`object stores is that the solution isn’t optimized for the constraints of the problem.
`
`10
`
`These approaches all represent reapplication of existing technologies to a new
`
`application. None of the applications above are ideally suited for the unique
`
`constraints of multimedia, streaming, object caches. Not only do the above solutions
`
`inherently encumber object caches with inefficierrcies due to their imperfect
`
`reapplication, but they also are unable to effectively support the more unique
`
`15
`
`requirements of multimedia object caches. These unique requirements include the
`
`ability to disambiguate and share redundant content that is identical, but has
`
`different names, and the opposite ability to store multiple variants of content with
`
`the same name, targeted for particular clients, languages, data types, etc.
`
`Based on the foregoing, there is a clear need to provide an object cache that
`
`20
`
`overcomes the disadvantages of these prior approaches, and is more ideally suited
`
`for the unique requirements of multimedia object caches. In particular:
`
`S0269-015 (3401-003}
`
`I‘
`
`Page9of119
`
`Page 9 of 119
`
`

`
`[ii
`
`"'‘El1‘.1':EL‘'-HEP;lifT.'L|1[E11153-:
`
`
`
`
`
`
`
`
`‘tilltfiit.'!5.'ii"ill""5u'nE[1
`
`1.
`
`there is a need for an object store that can store hundreds of millions of
`
`objects of disparate sizes, and a terabyte ofcontent size in a memory
`
`efiicient manner;
`
`2.
`
`there is a need for an object store that can determine if a document is a
`
`S
`
`“ hit” or a “ miss" quickly, without time-constuning file directory
`
`lookups;
`
`3.
`
`there is a need for a cache that minimizes the number of disk seeks to
`
`read and write objects;
`
`4.
`
`there is a need for an object store that pen-nits efficient streaming of data
`
`10
`
`to and fionr the cache;
`
`5.
`
`there is a need for an object store that supports multiple different versions
`
`of targeted alternates for the same name;
`
`6.
`
`there is a need for an object store that efficiently stores large numbers of
`
`objects without content duplication;
`
`l 5
`
`7.
`
`there is a need for an object store that can be rapidly and efiiciently
`
`garbage collected in real-time, insightfully selecting the documents to be
`
`replaced to improve user response speed, and trafiic reduction;
`
`8.
`
`there is a need for an object store that that can restart to full operational
`
`capacity within seconds after software or hardware failure without data
`
`20
`
`corruption and with minimal data loss.
`
`This document concerns technology directed to accomplishing the foregoing
`
`goals. In particular, this document describes methods and structures related to the
`
`time—efficient and space-efficient storage, retrieval, and maintenance of objects in a
`
`50269-015 (3401-003)
`
`Page 10 of119
`
`Page 10 of 119
`
`

`
`-10-
`
`F’;
`‘.—_d
`=
`I.='
`-—-V
`
`
`
`1I‘.'Ii
`
`
`
`large object store. The technology described herein provides for a. cache object store
`
`for a high-performance, high-load application having the following general
`
`characteristics:
`
`1. High performance, measured in low latency and high throughput for
`
`5
`
`object store operations, and large numbers of concurrent operations;
`
`2. Large cache support, supporting terabyte caches and billions of objects,
`
`to handle the Internet’s exponential content growth rate;
`
`3. Memory storage space efiiciency, so expensive semiconductor memory is
`
`used sparingly and effectively;
`
`10
`
`4. Disk storage space efficiency, so large numbers of Lnternet object replicas
`
`can be stored within the finite disk capacity of tlie object store;
`
`5. Alias free, so that multiple objects or object variants, with difierent
`
`names, but with the same content identical object content, will have the
`
`object content cached only once, shared among the different names;
`
`15
`
`6. Support for multimedia heterogeneity, efficiently supporting diverse
`
`multimedia objects of a multitude of types with size ranging over six
`
`orders of magnitude from a few hundred bytes to hundreds of megabytes;
`
`'3’. Fast, usage-aware garbage collection, so less useful objects can be
`
`efficiently removed from the object store to make room for new objects;
`
`20
`
`8. Data consistency, so programmatic errors and hardware failures do not
`
`lead to corrupted data;
`
`50259-015 (3401-0033
`
`X
`
`1/’
`
`Page11of119
`
`Page 11 of 119
`
`

`
`.11-
`
`9. Fast restartability, so an object cache can begin servicing requests within
`
`seconds of restart, without requiring a time-consuming database or file
`
`system check operation;
`
`10. Streaming, so large objects can be efficiently pipelined ii-om the object
`
`5
`
`store to slow clients, without staging the entire object into memory;
`
`1 I. Support for content negotiation, so proxy caches can efficiently and
`
`flexibly store variants of objects for the same URL, targeted on client
`browser, language, or other attribute ofthe client request; and
`12. General-purpose applicability, so that the object store interface is
`sufficiently flexible to meet the needs of future media types and
`
`ll)
`
`protocols.
`
`E i
`
`s
`
`
`
`50259-015 (3401-003)
`
`X /
`
`Page 12 of119
`
`Page 12 of 119
`
`

`
`
`
`
`
`3.|'-flit'.i3|JET!H53:EC!
`
`-12-
`
`SUMMARY OF THE INVENTION
`
`The foregoing needs and other needs are addressed hy the present invention,
`
`which provides, in one aspect, a method of delivering an information object to a
`
`client firom a cache at a server, comprising the steps of (A) establishing a cache table
`
`5
`
`in a memory of the server, the cache table comprising a name key that references a
`
`vector of alternates; (B) computing a content key that uniquely identifies the
`
`information object by applying a hash function to the information object; and (C)
`
`storing the content key in the cache table in one of the alternates. One feature of this
`
`aspect is storing, in the cache table, a reference to the location of the information
`
`10
`
`object on a mass storage device, wherein the reference is associated with the content
`
`key.
`
`Another feature is computing a content key that uniquely identifies the
`information object by applying a one-way hash function to the information object.
`Still another feature is (B) computing a content key that uniquely identifies the
`information object by applying an MD5 one-way hash function to the information
`
`15
`
`object. Yet another feature is receiving a request to retrieve the information object
`
`using the name of the information object; looking up the information object in said
`
`cache using the name key; retrieving a vector of alternates associated with the name
`
`key; selecting a content key from among the alternates based upon the name key;
`
`20
`
`looking up a location of the information object in the mass storage device based on
`
`the content key; and delivering the information object to the client.
`
`According to another feature, the method includes (Al) computing the name
`
`key based upon the name of the information object and a context of the request. In
`
`50269-015 (3401-003)
`
`) A
`
`Page 13 of119
`
`Page 13 of 119
`
`

`
`
`
`
`
`'*.'::.»“'~"-3‘ifisU"'"E59‘viiiIIi1‘E.~‘JFifi‘«!:"'.ii1lIT.l!
`
`
`
`
`
`-13-
`
`yet another feature, step (Al) comprises the steps of computing the name key by
`
`applying a one-way hash function to the name ofthe information object combined
`
`with information describing a context oftire request.
`
`In still another feature, step (A1) comprises the steps ofcomputing the name
`
`5
`
`key by applying the MD5 one-way hash function to the name of the information
`
`object combined with information describing a context of the request. Still another
`
`feature involves (D) establishing in the cache table, mappings from name keys to
`
`vectors of alternates; and (E) storing the content key referencing the content object
`
`in each alternate vector element; and (F) establishing in the cache table, mappings
`
`10
`
`from content keys to object content data.
`
`The invention also encompasses an apparatus, computer system, computer
`
`program product, and a computer data signal embodied in a carrier wave configured
`
`according to the foregoing aspects, and other aspects.
`
`502259-015 (3401-003)
`
`Page 14 of119
`
`Page 14 of 119
`
`

`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The present invention is illustrated by way of example, and not by way of
`
`limitation, in the figures of the accompanying drawings and in which like reference
`
`numerals refer to similar elements and in which:
`
`Figure l is a block diagram of a clientfserver relationshi .
`
`is a block diagram of a traffic server;
`
`Figure
`
`Alternates;
`
`Figure 3A is a block diagram oftransforrnati In 1 fan object into a key;
`Figure 3B is a block diagram oftransfo u,_: n ofan object name into a key;
`Figure 4A is a block diagram of a cac 5'
`
`
`
`Figure 4B is a block diagram of a :- "a rage mechanism for Vectors of
`
`- of rnulti-segment directory table;
`
`Figure 4C is a block dia .
`
`
`
`Figure 5 is a block diagr
`
`w of pointers relating to data fragments;
`
`Figure 6 is a block dia
`
`u of a storage device and its contents;
`
`5
`
`10
`
`
`
`
`
`-'1%'£|Hi'3'a"ll"."ll":-‘ill!"‘E§Ii1lc'5l{:iiEEilLL'l"ii|liI]|fiirdCl
`
`15
`
`Figure 7 is a block d .2
`
`showing the structure of a pool;
`
`Figure 8A is a flo diagram ofa process of garbage collection;
`
`Figure 813 is a -- diagram of a process of writing information in a storage
`
`
`
`Figure 8C is =
`
`' ow diagram of a process of synchronization;
`
`device;
`
`20
`
`Figure 3D is = flow diagram of a "cl1eckoul_read" process:
`
`Figure 8E i a flow diagram of a "checI<out_write" process;
`
`Figure 8F . a flow diagram of a "checkout_create” process;
`37!"E[1 we
`*. is a flow diagram of a cache lookup process;
`
`
`
`S0269-015 9401-003)
`
`Page 15 of119
`
`Page 15 of 119
`
`

`
`-15-
`
`Figure 9B is a flow diagram of a "checkin" process;
`
`Figure 9C is a flow diagram of a cache lockup process;
`
`Figure 9D is a flow diagram of a cache remove process;
`
`Figure 9B is a flow diagram of a cache read process;
`
`5
`
`Figure 9F is a flow diagram of a. cache write process;
`
`Figure 9G is a flow diagram of a cache update process;
`
`IJ-
`
`Figure 10A is a How diagram of a process of allocating and writing objects in
`
`
`
`
`
`
`
`a storage device;
`
`Figure 1013 is a flow diagram of a process of scaled counter updating;
`
`10
`
`Figure 1 lais a block diagram of a computer system that can be used to
`
`implement the present invention;
`
`Figure 12 is a flow diagram of a process of object re-validation.
`
`‘iii‘EH1F§I|‘-H273:ii.""
`
`*
`Eifiiiii
`
`-fa
`:1’.
`in-_5
`:7‘
`L5"’
`‘R
`
`¥L
`
`
`
`‘E15!l':'LuE"
`
`50269-015 (3401-003)
`
`/
`
`Page 16 of119
`
`Page 16 of 119
`
`

`
`iii!".I~'.!iHITIILtiiiELI
`
`
`3;?
`
`
`
`‘Jill{I53‘El:'15"E111I['.Ii
`
`
`
`
`
`DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
`
`A method and apparatus for caching information objects is described. In the
`
`following description, for the purposes of explanation, numerous specific details are
`
`set forth in order to provide a thorough understanding of the present invention.
`
`It
`
`5
`
`will be apparent, however, to one skilled in the art that the present invention may be
`
`practiced without these specific details. In other instances, well-known structures
`
`and devices are shown in block diagram form in order to avoid unnecessarily
`
`obscuring the present invention.
`
`ii]
`
`TRAFFIC SERVER
`
`Figure 2 is a block diagram of the general structure of certain elements of a
`
`proxy 30. In one embodiment, the proxy 30 is called a traflic server and comprises
`
`one or more computer programs or processes that operate on a computer workstation
`
`of the type described fiu'ther below. A client 10a directs a request 50 for an object to
`
`15
`
`the proxy 30 via the Internet 20. In this context, the term "object” means a network
`
`resource or any discrete element of information that is delivered from a server.
`
`Examples of objects include Web pages or documents, graphic images, files, text
`
`documents, and objects created by Web application programs during execution of
`
`the programs, or other elements stored on a server that is accessible through the
`
`20
`
`Internet 20. Alternatively, the client Illa is connected to the proxy 30 through a
`
`network other than the Internet.
`
`The incoming request 50 arrives at an inputfoutput (N0) core 60 of the proxy
`
`30. The IIO core 60 fimctions to adjust the rate of data received or delivered by the
`
`50269-015 (3401-003)
`
`} Q ;.
`
`Page 17 of 119
`
`Page 17 of 119
`
`

`
`.17.
`
`proxy to match the data transmission speed of the link between the client 10a and the
`
`Internet 20. In a preferred embodiment, the U0 core 60 is implemented in the form
`
`of a circularly arranged set ofbuckets that are disposed between input buffers and
`
`output buffers that are coupled to the proxy 30 and the Internet 20. Connections
`
`5
`
`among the proxy 30 and one or more clients 10a are stored in the buckets. Each
`
`bucket in the set is successively examined, and each connection in the bucket is
`
`polled. During polling, the amount of information that has accumulated in a buffer
`
`associated with the connection since the last poll is determined. Based on the
`
`amount, a period value associated with the connection is adjusted. The connection is
`
`It)
`
`then stored in a difierent bucket that is generally identified by the sum of the current
`
`bucket number and the period value. Polling continues with the next connection and
`
`the next bucket. In this way, the elapsed time between successive polls of a
`
`connection automatically adjusts to the actual operating bandwidth or data
`
`communication speed of the connection.
`
`
`
`
`
`
`
`iii17''i|'uIL”!"'“E31IE1‘11111‘ElEllIIEI:lfj?!
`
`
`
`
`
`
`
`iE’éElfin
`
`15
`
`The 110 core 60 passes the request 50 to a protocol engine 70 that is coupled
`
`to the U0 core 60 and to a cache 80. The protocol engine 70 functions to parse the
`
`request 50 and determine what type of substantive action is embodied in the request
`
`50. Based on information in the request 50, the protocol engine 70 provides a
`
`command to the cache 80 to carry out a particular operation. In an embodiment, the
`
`20
`
`cache 80 is implemented in one or more computer programs that are accessible to
`
`the protocol engine 70 using an application programming interface (API). In this
`
`embodiment, the protocol engine decodes the request 50 and performs a function call
`
`5o2s9—o1s(34o1-cos)
`
`Page 18 of 119
`
`Page 18 of 119
`
`

`
`
`
`
`
`
`
`
`
`lliif1L"IiuE35'T:'I"'ll*'ni|T.|iEEEI1EFIJGEEH{El"2";-1%!‘."}iIfiinHICIH
`
`
`
`
`
`
`
`
`
`to the AP] ofthe cache 30. The fimction call includes, as parameter values,
`
`information derived from the request 50.
`
`The cache 30 is coupled to send and receive infonnation to and from the
`
`protocol engine 70 and to interact with one or more non—volatile mass storage
`
`5
`
`devices 90a-9Dn. In an embodiment, the storage devices 90a-90n are high-capacity,
`
`fast disk drives. The cache 80 also interacts with data tables 82 that are described in
`
`more detail herein.
`
`50269-015 (3401-003)
`
`/
`
`Page 19 of119
`
`Page 19 of 119
`
`

`
`OBJECT CACHE INDEXING
`
`CONTENT INDEXING
`
`In the preferred embodiment, the cache 80 stores objects on the storage
`
`devices 90a-9Dn. Popular objects are also replicated into a cache. In the preferred
`
`5
`
`embodiment, the cache has finite size, and is stored in main memory or RAM of tire
`
`proxy 30.
`
`Objects on disk are indexed by fixed sized locators, called keys. Keys are
`
`used to index into directories that point to the location of objects on disk, and to
`
`metadata about the objects. There are two types of keys, called “name keys" and
`
`10
`
`“object keys” . Name keys are used to index metadata about a named object, and
`
`object keys are used to index true object content. Name keys are used to convert
`
`URLs and other information resource names into a metadata structure that contains
`
`object keys for the object data. As will be discussed subsequently, this two-level
`
`indenting structure facilitates the ability to associate multiple alternate objects with a
`
`
`
`
`
`"ll-"ll":llli‘"'‘FE-l!tiltIITII‘FilllitforFLT!
`
`
`
`
`
`
`
`15
`
`single name, while at the same time maintaining a single copy of any object content
`
`on disk. shared between multiple different names or alternates.
`
`Unlike other cache systems that use the name or URL of an object as the key
`
`by which the object is referenced, embodiments of the invention use a “fingerprint”
`
`of the content that makes up the object itself, to locate the object. Keys generated
`
`20
`
`from the content of the indexed object are referred to herein as object keys.
`
`Specifically, the object key 56 is a unique fingerprint or compressed representation
`
`of the contents of the object 52. Preferably, a copy of the object 52 is provided as
`
`input to a hash function 54, and its output is the object key 56. For example, a file or
`
`50269~|.'J15(340l-003)
`
`L{”
`
`Page 20 of 119
`
`Page 20 of 119
`
`

`
`
`
`
`
`tiir"!I"“1J'r:Ell"‘ET!.“|E5i:E:"l'il|.‘lI.“.EP:tTt1['1|H'_'.'f
`
`.20-
`
`other representation of the object 52 is provided as input to the hash function, which
`
`reads each byte of the file and generates a portion of the object key 56, until the
`
`entire file has been read. In this way, an object key 56 is generated based upon the
`
`entire contents of the object 52 rather than its name. Since the keys are content-
`
`5
`
`based, and serve as indexes into tables of the cache 80, the cache is referred to as a
`
`content-indexed cache. Given a content fingerprint key, the content can easily be
`
`found.
`
`In this embodiment, content indexing enables the cache 80 to detect duplicate
`
`objects that have difierent names but the same content. Such duplicates will be
`
`10
`
`detected because objects having identical content will hash to the same key value
`
`even if the objects have different names.
`
`For example, assume that the server 40 is storing, in one subdirectory, a
`
`software program comprising an executable file that is 10 megabytes in size, named
`
`“IE4.exe”. Assume further that the server 40 is storing, in a different subdirectory.
`
`15
`
`a copy of the same file. named "lntentet Explorenexe" . The server 40 is an
`
`anonymous FTP server that can deliver copies of the files over an HTTP connection
`
`using the FTP protocol. In past approaches, when one or more clients request the
`
`two files, the cache stores a copy of each of the files in cache storage, and indexes
`
`each of the files under its name in the cache. As a result, the cache must use 20
`
`20 megabytes of storage for two objects that are identical except for the name.
`
`In embodiments of the invention, as discussed in more detail herein, for each
`
`of the objects, the cache creates a name key and an object key. The name keys are
`
`created by applying a hash function to the name of the object. The object keys are
`
`50259-015 (3401-003)
`
`Page 21 of 119
`
`Page 21 of 119
`
`

`
`
`
`..JET)!ilE";:.iii!
`
`
`
`
`
`
`
`
`
`
`
`1E3IE3:FEET"l|l'"E1:Itfil“.533i":}EE‘:
`
`
`
`
`
`
`
`created by applying a hash function to the content of the object. As a result, for the
`
`two exemplary objects described above, two different name keys are created, but the
`
`object key is the same. W

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket