Distributed Parallel Data Storage Systems:
A Scalable Approach to High Speed Image Servers

Brian Tierney (bltierney@lbl.gov),
William E. Johnston (wejohnston@lbl.gov)¹,
Hanan Herzog, Gary Hoo, Guojun Jin, Jason Lee,
Ling Tony Chen*, Doron Rotem*

Imaging and Distributed Computing Group and *Data Management Research Group
Lawrence Berkeley Laboratory²
Berkeley, CA 94720

Abstract

We have designed, built, and analyzed a distributed parallel storage system that will supply image streams fast enough to permit multi-user, "real-time", video-like applications in a wide-area ATM network-based Internet environment. We have based the implementation on user-level code in order to secure portability; we have characterized the performance bottlenecks arising from operating system and hardware issues, and based on this have optimized our design to make the best use of the available performance. Although at this time we have only operated with a few classes of data, the approach appears to be capable of providing a scalable, high-performance, and economical mechanism to provide a data storage system for several classes of data (including mixed multimedia streams), and for applications (clients) that operate in a high-speed network environment.

1. Correspondence should be directed to W. Johnston, Lawrence Berkeley Laboratory, MS: 50B-2239, Berkeley, CA 94720. Tel: 510-486-5014, fax: 510-486-6363; or Brian Tierney, Tel: 510-486-7381.

2. This work is jointly supported by ARPA - CSTO, and by the U.S. Dept. of Energy, Energy Research Division, Office of Scientific Computing, under contract DE-AC03-76SF00098 with the University of California. This document is LBL report LBL-35408. Reference herein to any specific commercial product, process, or service by tradename, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement or recommendation by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California, and shall not be used for advertising or product endorsement purposes. The following terms are acknowledged as trademarks: UNIX (Novell, Inc.), Sun and SPARCstation (Sun Microsystems, Inc.), DEC and Alpha (Digital Equipment Corp.), SGI and Indigo (Silicon Graphics, Inc.).

1.0 Introduction

In recent years, many technological advances have made possible distributed multimedia servers that will allow bringing "on-line" large amounts of information, including images, audio and video, and hypermedia databases. Increasingly, there also are applications that demand high-bandwidth access to this data, either in single user streams (e.g., large image browsing, incompressible scientific and medical video, and multiple coordinated multimedia streams) or, more commonly, in aggregate for multiple users. Our work focuses on two examples of high-bandwidth, single-user applications. First, the terrain visualization application described below requires 300-400 Mbits/s of data to provide a realistic dynamic visualization. Second, there are applications in the scientific and medical imaging fields where uncompressed video (e.g. from typical laboratory monochrome video cameras that produce 115 Mbits/s data streams) needs to be stored and played back at real-time rates. In these example applications compression is not practical: in the case of terrain visualization, the computational cost of decompression is prohibitive; in the case of medical and scientific images, data loss, coupled with the possible introduction of artifacts during decompression, frequently precludes the use of current compression techniques. (See, for example, [7].)

Although one of the future uses of the system described here is for multimedia digital libraries containing multiple audio and compressed video streams, the primary design goal for this system is to be able to deliver high data rates: initially for uncompressed images, later for other types of data. Based on the performance that we have observed, we believe, but have not yet verified, that the approach described below will also be useful for the video server problem of delivering many compressed streams to many users simultaneously.

Background

Current disk technology delivers about 4 Mbytes/s (32 Mbits/s), a rate that has improved at about 7% each year since 1980 [8], and there is reason to believe that it will be some time before a single disk is capable of delivering streams at the rates needed for the applications mentioned. While RAID [8] and other parallel disk array technologies can deliver higher throughput, they are still relatively expensive, and do not scale well economically, especially in an environment of multiple network distributed users, where we assume that the sources of data, as well as the multiple users, will be widely distributed. Asynchronous Transfer Mode (ATM) networking technology, due to the architecture of the SONET infrastructure that will underlie large scale ATM networks of the future, will provide the bandwidth that will enable the approach of using ATM network-based distributed, parallel data servers to provide high-speed, scalable storage systems.

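As a rough illustration of this gap, the following back-of-envelope sketch uses only the figures above plus the roughly 50 Mbytes/s aggregate rate targeted later in this paper; it shows why waiting for single-disk rates to catch up is not a practical option.

```python
# Back-of-envelope projection using the figures above: a single disk at about
# 4 Mbytes/s, improving ~7% per year, versus the ~50 Mbytes/s (400 Mbits/s)
# aggregate rate the terrain visualization application needs.
import math

single_disk = 4.0      # Mbytes/s today (per the text)
growth = 1.07          # ~7% improvement per year
target = 50.0          # Mbytes/s needed in aggregate

years = math.log(target / single_disk) / math.log(growth)
print(round(years, 1))  # roughly 37 years at this trend
```
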
The approach described here differs in many ways from RAID, and should not be confused with it. RAID is a particular data strategy used to secure reliable data storage and parallel disk operation. Our approach, while using parallel disks and servers, deliberately imposes no particular layout strategy, and is implemented entirely in software (though the data redundancy idea of RAID might be usefully applied across servers to provide reliability in the face of network problems).

Overview

The Image Server System (ISS) is an implementation of a distributed parallel data storage architecture. It is essentially a "block" server that is distributed across a wide area network to supply data to applications located anywhere in the network. (See Figure 1: Parallel Data and Server Architecture Approach to the Image Server System.) There is no inherent organization to the blocks, and in particular, they would never be organized sequentially on a server. The data organization is determined by the application as a function of data type and access patterns, and is implemented during the data load process. The usual goal of the data organization is that data is declustered (dispersed in such a way that as many system elements as possible can operate simultaneously to satisfy a given request) across both disks and servers. This strategy allows a large collection of disks to seek in parallel, and all servers to send the resulting data to the application in parallel, enabling the ISS to perform as a high-speed image server.

The functional design strategy is to provide a high-speed "block" server, where a block is a unit of data request and storage. The ISS essentially provides only one function - it responds to requests for blocks. However, for greater efficiency and increased usability, we have attempted to identify a limited set of functions that extend the core ISS functionality while allowing support for a range of applications. First, the blocks are "named." In other words, the view from an application is that of a logical block server. Second, block requests are in the form of lists that are taken by the ISS to be in priority order. Therefore the ISS attempts (but does not guarantee) to return the higher priority blocks first. Third, the application interface provides the ability to ascertain certain configuration parameters (e.g., disk server names, performance, disk configuration, etc.) in order to permit parametrization of block placement-strategy algorithms (for example, see [1]). Fourth, the ISS is instrumented to permit monitoring of almost every aspect of its functioning during operation. This monitoring functionality is designed to facilitate performance tuning and network performance research; however, a data layout algorithm might use this facility to determine performance parameters.
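To make this interface concrete, the sketch below shows the shape of these functions. The names and types are hypothetical illustrations, not the actual ISS library: blocks are requested by name in priority order, and a configuration query exposes server and disk parameters to placement algorithms.

```python
# Illustrative sketch of the logical block server view described above.
# All names here are hypothetical; this is not the actual ISS interface.
from typing import Dict, List

def request_blocks(server, block_names: List[str]) -> None:
    """Send a request list.  Blocks are identified only by name, and list
    order is priority order: the ISS attempts (but does not guarantee) to
    return earlier entries first."""
    server.send_request_list(block_names)

def configuration(iss_servers) -> Dict[str, dict]:
    """Sketch of the configuration query used to parametrize block placement
    algorithms: server names, rough performance figures, disk configuration."""
    return {s.name: {"disks": s.disk_count, "mbits_per_sec": s.rated_speed}
            for s in iss_servers}
```
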
`
At the present state of development and experience, the ISS that we describe here is used primarily as a large, fast "cache". Reliability with respect to data corruption is provided only by the usual OS and disk mechanisms, and data delivery reliability of the overall system is a function of user-level strategies of data replication. The data of interest (tens to hundreds of GBytes) is typically loaded onto the ISS from archival tertiary storage, or written into the system from live video sources. In the latter case, the data is also archived to bulk storage in real-time.

Client Use

The client-side (application) use of the ISS is provided through a library that handles initialization (for example, an "open" of a data set requires discovering all of the disk servers with which the application will have to communicate), and the basic block request / receive interface. It is the responsibility of the client (or its agent) to maintain information about any higher-level organization of the data blocks, to maintain sufficient local buffering so that "smooth playout" requirements may be met locally, and to run predictor algorithms that will pre-request blocks so that application response time requirements can be met.
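The resulting client-side pattern can be sketched as follows. The library and method names are hypothetical; only the division of labor is taken from the text: discovery at "open" time, prediction-driven pre-requests, and local buffering so that playout stays smooth even when blocks arrive slightly out of order.

```python
# Sketch of the client-side pattern described above (hypothetical names).
class ClientAgent:
    def __init__(self, iss_library, predictor, buffer_depth=32):
        self.iss = iss_library        # assumed block request / receive interface
        self.predictor = predictor    # application-specific predictor
        self.buffer = {}              # block name -> data, filled as blocks arrive
        self.buffer_depth = buffer_depth

    def open(self, dataset_name):
        # discovering the disk servers that hold this data set happens at "open"
        self.servers = self.iss.open(dataset_name)

    def play(self, first_block):
        needed = first_block
        while needed is not None:
            # pre-request the predicted next blocks; asking again for a block
            # that has not yet arrived is harmless
            upcoming = self.predictor.next_blocks(needed, self.buffer_depth)
            self.iss.request([b for b in upcoming if b not in self.buffer])
            for name, data in self.iss.receive_ready():
                self.buffer[name] = data
            if needed in self.buffer:            # smooth playout from the buffer
                self.consume(self.buffer.pop(needed))
                needed = self.predictor.successor(needed)

    def consume(self, data):
        pass  # hand the block to the application (display, decode, etc.)
```
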
`
Figure 1: Parallel Data and Server Architecture Approach to the Image Server System. (Several ISS disk servers, each with its own ATM network interface, feed an ATM network that carries the interleaved cell streams of multiple virtual circuits to a single high-bandwidth sink or source.)

None of this has to be explicitly visible to the user-level application, but some agent in the client environment must deal with these issues, because the ISS always operates on a best-effort basis: if it did not deliver a requested block in the expected time or order, it was because it was not possible to do so.

Implementation

In our prototype implementations, the typical ISS consists of several (four - five) UNIX workstations (e.g. Sun SPARCstation, DEC Alpha, SGI Indigo, etc.), each with several (four - six) fast-SCSI disks on multiple (two - three) SCSI host adaptors. Each workstation is also equipped with an ATM network interface. An ISS configuration such as this can deliver an aggregated data stream to an application at about 400 Mbits/s (50 Mbytes/s) using these relatively low-cost, "off the shelf" components by exploiting the parallelism provided by approximately five servers, twenty disks, ten SCSI host adaptors, and five network interfaces.
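The 400 Mbits/s figure is essentially the product of the parallel components. The back-of-envelope check below uses an assumed effective per-disk rate (not a measurement from this paper) to show how the numbers combine.

```python
# Back-of-envelope check of the aggregate rate quoted above.  The per-disk
# figure is an assumed effective rate (seek and SCSI overhead included),
# not a measured value from this paper.
servers = 5
disks_per_server = 4
effective_mbytes_per_disk = 2.5                              # assumption

per_server = disks_per_server * effective_mbytes_per_disk    # ~10 Mbytes/s
aggregate_mbytes = servers * per_server                      # ~50 Mbytes/s
print(per_server, aggregate_mbytes, aggregate_mbytes * 8)    # 10.0 50.0 400.0
```
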
`
Prototypes of the ISS have been built and operated in the MAGIC³ network testbed. In this paper we describe mainly architecture and approach, as well as optimization strategies. A previous paper [11] describes the major implementation issues, and a paper to be published [12] will describe other ISS applications and ISS performance issues.

3. MAGIC (Multidimensional Applications and Gigabit Internetwork Consortium) is a gigabit network testbed that was established in June 1992 by the U.S. Government's Advanced Research Projects Agency (ARPA) [9]. MAGIC's charter is to develop a high-speed, wide-area networking testbed that will demonstrate interactive exchange of data at gigabit-per-second rates among multiple distributed servers and clients using a terrain visualization application. More information about MAGIC may be found on the WWW home page at: http://www.magic.net/

2.0 Related Work

There are other research groups working on solving problems related to distributed storage and fast multimedia data retrieval. For example, Ghandeharizadeh, Ramos, et al., at USC are working on declustering methods for multimedia data [2], and Rowe, et al., at UCB are working on a continuous media player based on the MPEG standard [10].

In some respects, the ISS resembles the Zebra network file system, developed by John H. Hartman and John K. Ousterhout at the University of California, Berkeley [3]. Both the ISS and Zebra can separate their data access and management activities across several hosts on a network. Both try to maintain the availability of the system as a whole by building in some redundancy, allowing for the possibility that a disk or host might be unavailable at a critical time. The goal of both is to increase data throughput despite the current limits on both disk and host throughput.

However, the ISS and the Zebra network file system differ in the fundamental nature of the tasks they perform. Zebra is intended to provide traditional file system functionality, ensuring the consistency and correctness of a file system whose contents are changing from moment to moment. The ISS, on the other hand, tries to provide very high-speed, high-throughput access to a relatively static set of data. It is optimized to retrieve data, requiring only minimum overhead to verify data correctness and no overhead to compensate for corrupted data.

3.0 Applications

There are several target applications for the initial implementation of the ISS. These applications fall into two categories: image servers and multimedia / video file servers.

3.1 Image Server

The initial use of the ISS is to provide data to a terrain visualization application in the MAGIC testbed. This application, known as TerraVision [5], allows a user to navigate through and over a high resolution landscape represented by digital aerial images and elevation models. TerraVision is of interest to the U.S. Army because of its ability to let a commander "see" a battlefield environment. TerraVision is very different from a typical "flight simulator"-like program in that it uses high resolution aerial imagery for the visualization instead of simulated terrain.

Figure 2: ISS Parallel Data Access Strategy as Illustrated by the TerraVision Application. (Tiled ortho images of the landscape; the tiles intersected by the path of travel - 74, 64, 63, 53, 52, 42, 32, 33 - are mapped by the data placement algorithm onto several servers and disks, e.g. S1D1, S2D2, S1D2, S2D1, which operate in parallel to supply the tiles.)

TerraVision requires huge amounts of data, transferred at both bursty and steady rates. The ISS is used to supply image data at hundreds of Mbits/s rates to TerraVision. No data compression is used with this application because the bandwidth requirements are such that real-time decompression is not possible without using special purpose hardware.

In the case of a large-image browsing application like TerraVision, the strategy for using the ISS is straightforward: the image is tiled (broken into smaller, equal-sized pieces), and the tiles are scattered across the disks and servers of the ISS. The order of tiles delivered to the application is determined by the application predicting a "path" through the image (landscape), and requesting the tiles needed to supply a view along the path. The actual delivery order is a function of how quickly a given server can read the tiles from disk and send them over the network. Tiles will be delivered in roughly the requested order, but small variations from the requested order will occur. These variations must be accommodated by buffering, or other strategies, in the client application.

Figure 2: ISS Parallel Data Access Strategy as Illustrated by the TerraVision Application shows how image tiles needed by the TerraVision application are declustered across several disks and servers. More detail on this declustering is provided below.
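A minimal sketch of the path-to-request-list step appears below. The tile size and grid numbering are invented for illustration and do not correspond to the TerraVision tiling or to the tile numbers shown in Figure 2.

```python
# Sketch of turning a predicted path into a tile request list, as described
# above.  Tile size and row/column numbering are illustrative assumptions.
def tiles_along_path(path_points, tile_size=1024, grid_cols=10):
    """path_points: (x, y) image coordinates along the predicted path of
    travel.  Returns tile ids in first-needed-first order (i.e., priority
    order for the ISS request list), without duplicates."""
    seen, request_list = set(), []
    for x, y in path_points:
        col, row = int(x // tile_size), int(y // tile_size)
        tile_id = row * grid_cols + col
        if tile_id not in seen:
            seen.add(tile_id)
            request_list.append(tile_id)
    return request_list

# Example: a roughly diagonal path crosses a handful of tiles.
print(tiles_along_path([(500, 7400), (1500, 6400), (2500, 5300), (3500, 4200)]))
```
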
`
Each ISS server is independently connected to the network, and each supplies an independent data stream into and through the network. These streams are formed into a single network flow by using ATM switches to combine the streams from multiple medium-speed links onto a single high-speed link. This high-speed link is ultimately connected to a high-speed interface on the visualization platform (client). On the client, data is gathered from buffers and processed into the form needed to produce the user view of the landscape.

This approach could supply data to any sort of large-image browsing application, including applications for displaying large aerial-photo landscapes, satellite images, X-ray images, scanning microscope images, and so forth.

Figure 3: Use of the ISS for Single High-Bandwidth App. shows how the network is used to aggregate several medium-speed streams into one high-speed stream for the image browsing application. For the MAGIC TerraVision application, the application host (an SGI Onyx) is using multiple OC-3 (155 Mbit/s) interfaces to achieve the necessary bandwidth. These multiple interfaces will be replaced by a single OC-12 (622 Mbit/s) interface when it becomes available.

Figure 3: Use of the ISS for Single High-Bandwidth App. (Large image browsing scenario: the MAGIC TerraVision application receives aggregated streams from multiple ISS servers.)

In the MAGIC testbed, the ISS has been run in several ATM WAN configurations to drive several different applications, including TerraVision. The configurations include placing ISS servers in Sioux Falls, South Dakota (EROS Data Center), Kansas City, Kansas (Sprint), and Lawrence, Kansas (University of Kansas), and running the TerraVision client at Fort Leavenworth, Kansas (U.S. Army's Battle Command Battle Lab). The ISS disk server and the TerraVision application are separated by several hundred kilometers, the longest link being about 700 kilometers.

3.2 Video Server

Examples of video server applications include video players, video editors, and multimedia document browsers. A video server might contain several types of stream-like data, including conventional video, compressed video, variable time base video, multimedia hypertext, interactive video, and others. Several users would typically be accessing the same video data at the same time, but would be viewing different streams, and different frames in the same stream. In this case the ISS and the network are effectively being used to "reorder" segments (see Figure 4: Use of the ISS to Supply Many Low-Bandwidth Streams). This reordering affects many factors in an image server system, including the layout of the data on disks. Commercial concerns such as Time Warner and U.S. West are building large-scale commercial video servers such as the Time Warner / Silicon Graphics video server [4]. Because of the relatively low cost and ease of scalability of our approach, it may address a wider scale, as well as a greater diversity, of data organization strategies so as to serve the diverse needs of schools, research institutions, and hospitals for video-image servers in support of various educational and research-oriented digital libraries.

Figure 4: Use of the ISS to Supply Many Low-Bandwidth Streams. (Video file server scenario.)

4.0 Design

4.1 Goals

The following are some of our goals in designing the ISS:

- The ISS should be capable of being geographically distributed. In a future environment of large scale, high-speed, mesh-connected national networks, network distributed storage should be capable of providing an uninterruptible stream of data, in much the same way that a power grid is resilient in the face of source failures, and tolerant of peak demands, because of the possibility of multiple sources multiply interconnected.

- The ISS approach should be scalable in all dimensions, including data set size, number of users, number of server sites, and aggregate data delivery speed.

- The ISS should deliver coherent image streams to an application, given that the individual images that make up the stream are scattered (by design) all over the network. In this case, "coherent" means "in the order needed by the application". No one disk server will ever be capable of delivering the entire stream. The network is the server.

- The ISS should be affordable. While something like a HIPPI-based RAID device might be able to provide functionality similar to the ISS, this sort of device is very expensive, is not scalable, and is a single point of failure.

4.2 Approach

A Distributed, Parallel Server

The ISS design is based on the use of multiple low-cost, medium-speed disk servers which use the network to aggregate server output. To achieve high performance we exploit all possible levels of parallelism, including that available at the level of the disks, controllers, processors / memory banks, servers, and the network. Proper data placement strategy is also key to exploiting system parallelism.

At the server level, the approach is that of a collection of disk managers that move requested data from disk to memory cache. Depending on the nature of the data and its organization, the disk managers may have a strategy for moving other closely located and related data from disk to memory. However, in general, we have tried to keep the implementation of data prediction (determining what data will be needed in the near future) separate from the basic data-moving function of the server. Prediction might be done by the application (as it is in TerraVision), or it might be done by a third party that understands the data usage patterns. In any event, the server sees only lists of requested blocks.

As explained in [12], the dominant bottlenecks for this type of application in a typical UNIX workstation are first memory copy speed, and second, network access speed. For these reasons, an important design criterion is to use as few memory copies as possible, and to keep the network interface operating at full bandwidth all the time. Our implementation uses only three copies to get data from disk to network, so maximum server throughput is about (memory_copy_speed / 3).

Another important aspect of the design is that all components are instrumented for timing and data flow monitoring in order to characterize ISS and network performance. To do this, all communications between ISS components are timestamped. In the MAGIC testbed, we are using GPS (Global Positioning System) receivers and NTP (Network Time Protocol) [6] to synchronize the clocks of all ISS servers and of the client application in order to accurately measure network throughput and latency.

Data Placement Issues

A limiting factor in handling large data sets is the long delay in managing and accessing subsets of these data sets. Slow I/O rates, rather than processor speed, are chiefly the cause of this delay. One way to address this problem is to use data reorganization techniques based on the application's view of the structure of the data, analysis of data access patterns, and storage device characteristics. By matching the data set organization with the intended use of the data, substantial improvements can be achieved for common patterns of data access [1]. This technique has been applied to large climate-modeling data sets, and we are applying it to TerraVision data stored in the ISS. For image tile data, the placement algorithm declusters tiles so that all disks are evenly accessed by tile requests, but then clusters tiles that are on the same disk based on the tiles' relative nearness to one another in the image. This strategy is a function of both the data structure (tiled images) and the geometry of the access (e.g., paths through the landscape).

The declustering method used for tiles of large images is a lattice-based (i.e., vector-based) declustering scheme, the goal of which is to ensure that tiles assigned to the same server are as far apart as possible on the image plane. This minimizes the chance that the same server will be accessed many times by a single tile request list. Tiles are distributed among K disks by first determining a pair of integer component vectors which span a parallelogram of area K. Tiles assigned to the same disk are separated by integer multiples of these vectors. Mathematical analysis shows that for common visualization queries this declustering method performs within seven percent of optimal for a wide range of practical multiple disk configurations.

Within a disk, however, it is necessary to cluster the tiles such that tiles near each other in 2-D space are close to each other on disk, thus minimizing disk seek time. The clustering method used here is based on the Hilbert Curve, because it has been shown to be the best curve that preserves the 2-D locality of points in a 1-D traversal.
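The sketch below illustrates the intent of these two placement steps: spread nearby tiles across different disks, then order each disk's tiles along a Hilbert curve so that 2-D neighbors stay close on disk. The simple skewed-modulus rule and the curve order used here are illustrative stand-ins, not the exact lattice scheme analyzed in [1].

```python
# Illustrative placement sketch: decluster tiles across disks with a simple
# lattice-style rule, then cluster each disk's tiles in Hilbert-curve order.
# The skew constant and curve order are assumptions, not the analyzed scheme.

def hilbert_index(order: int, x: int, y: int) -> int:
    """Position of (x, y) along a Hilbert curve covering a 2**order square
    (standard bit-manipulation construction)."""
    d = 0
    s = 1 << (order - 1)
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s >>= 1
    return d

def place_tile(x: int, y: int, num_disks: int, skew: int = 3, order: int = 10):
    disk = (x + skew * y) % num_disks    # lattice-style declustering across disks
    key = hilbert_index(order, x, y)     # on-disk order: 2-D neighbors stay close
    return disk, key

# Neighboring tiles land on different disks but keep locality within a disk:
for tile in [(4, 2), (5, 2), (4, 3), (5, 3)]:
    print(tile, place_tile(*tile, num_disks=8))
```
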
`
Path Prediction

Path prediction is important to ensure that the ISS is utilized as efficiently as possible. By using a strategy that always requests more tiles than the ISS can actually deliver before the next tile request, we can ensure that no component of the ISS is ever idle. For example, if most of a request list's tiles were on one server, the other servers could still be reading and sending or caching tiles that may be needed in the future, instead of idly waiting. The goal of path prediction is to provide a rational basis for pre-requesting tiles. See [1] for more details on data placement methods.

As a simple example of path prediction, consider an interactive video database with a finite number of distinct paths (video clips), and therefore a finite number of possible branch points. (A "branch point" occurs where a user might select one of several possible play clips; see Figure 5: Image Stream Management / Prediction Strategy.) As a branch point is approached by the player, the predictor (without knowledge of which branch will be taken) will start requesting images (frames) along both branches. These images are cached first at the disk servers, then at the receiving application. As soon as a branch is chosen, the predictor ceases to send requests for images from the other branches. Any "images" (i.e., frames or compressed segments) cached on the ISS, but unsent, are flushed as better predictions fill the cache.

Figure 5: Image Stream Management / Prediction Strategy. (Database structure: a multimedia program consists of multiple threads - M, A, B, and C - whose play order is not known in advance. The client (multimedia player) issues request lists and re-request lists, and buffers the images as they arrive, re-requesting any that are missing.)

This is an example where a relatively independent third party might do the prediction.

The client will keep asking for an image until it shows up, or until it is no longer needed (e.g., in TerraVision, the application may have "passed" the region of landscape that involves the image that was requested, but never received). Applications will have different strategies to deal with images that do not arrive in time. For example, TerraVision keeps a local, low-resolution, data set to fill in for missing tiles.

Prediction is transparent to the ISS, and is manifested only in the order and priority of images in the request list. The prediction algorithm is a function of the client application, and typically runs on the client.
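From the client's point of view, the branch-point behavior described in the previous section might look like the following sketch. The clip objects and their methods are hypothetical; only the request-list policy is taken from the text.

```python
# Sketch of a client-side predictor near a branch point (hypothetical clip
# API).  Frames of the current clip are requested at high priority; frames of
# every possible next clip are pre-requested at low priority.  Once the user
# chooses, the untaken branches simply stop appearing in later request lists,
# and anything the ISS cached but never sent is flushed as newer predictions
# fill the cache.
HIGH, MEDIUM, LOW = 0, 1, 2

def next_request_list(current_clip, branch_clips, chosen_branch=None, ahead=30):
    requests = [(frame, HIGH) for frame in current_clip.next_frames(ahead)]
    branches = [chosen_branch] if chosen_branch is not None else branch_clips
    for clip in branches:
        requests += [(frame, LOW) for frame in clip.first_frames(ahead)]
    return requests
```
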
The Significance of ATM Networks

The design of the ISS depends in part on the ability of ATM switches and networks to aggregate multiple data streams from the disk servers into a single high-bandwidth stream to the application. This is feasible because most wide area ATM networks aggregate bandwidth upward - that is, the link speeds tend to increase from LANs to WANs, and even within WANs the "backbone" is the highest bandwidth. (This is actually a characteristic of the architecture of the SONET networks that underlie ATM networks.) Aggregation of stream bandwidth occurs at switch output ports. For example, three incoming streams of 50 Mbits/s that are all destined for the same client will aggregate to a 150 Mbit/s stream at the switch output port. The client has data stream connections open to each of the ISS disk servers, and the incoming data from all of these streams typically put data into the same buffer.

5.0 Implementation

In a typical example of ISS operation the application sends requests for data (images, video, sound, etc.) to the name server process, which does a lookup to determine the location (server/disk/offset) of the requested data. Requests are sorted on a per-server basis, and the resulting lists are sent to the individual servers. Each server then checks to see if the data is already in its cache, and if not, fetches the data from disk and transfers it to the cache. Once the data is in the cache, it is sent to the requesting application. Figure 6: ISS Architecture shows how the components of the ISS are used to handle requests for data blocks.

Figure 6: ISS Architecture. (The application's block requests flow through the name server to the individual ISS disk servers, each of which moves tiles (images) from disk into its cache and sends them over the network; the other ISS servers operate in parallel.)
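This request path can be sketched as follows. The object and method names are hypothetical, and the real ISS components are separate processes communicating over the network rather than function calls; the sketch only shows the lookup, the per-server grouping, and the cache-or-disk decision.

```python
# Sketch of the request path described above (hypothetical data structures).
from collections import defaultdict

def dispatch(request_list, name_server):
    per_server = defaultdict(list)
    for block_name, priority in request_list:
        server, disk, offset = name_server.lookup(block_name)
        per_server[server].append((priority, disk, offset, block_name))
    for server, reqs in per_server.items():
        reqs.sort()                      # keep each server's list in priority order
        server.send_request_list(reqs)

def serve(reqs, cache, disks, network):
    for priority, disk, offset, name in reqs:
        block = cache.get(name)
        if block is None:                # not cached: read from disk into cache
            block = disks[disk].read(offset)
            cache.put(name, block)
        network.send(name, block)        # then send to the requesting application
```
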
`
The disk server handles three image request priority levels:

- high: send first, with an implicit priority given by order within the list.
- medium: send if there is time.
- low: fetch into the cache if there is time, but don't send.

The priority of a particular request is set by the requesting application. The application's prediction algorithm can use these priority levels to keep the ISS fully utilized at all times without requesting more data than the application can process. For example, the application could send low priority requests to pull data into the ISS cache, knowing that the ISS would not send the data on to the application until the application was ready. Another example is an application that plays back a movie with a sound track, where audio might be high priority requests, and video medium priority requests.
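Interpreted on the server side, the three levels amount to the simple service policy sketched below (an illustration of the stated policy, not the actual ISS scheduler).

```python
# Illustration of the stated three-level policy (not the actual ISS scheduler).
HIGH, MEDIUM, LOW = 0, 1, 2

def service_cycle(request_list, cache, read_block, send_block, time_left):
    by_level = {HIGH: [], MEDIUM: [], LOW: []}
    for req in request_list:                      # preserve list order per level
        by_level[req.priority].append(req)

    for req in by_level[HIGH]:                    # high: always sent, in order
        if cache.get(req.name) is None:
            cache.put(req.name, read_block(req))  # disk -> cache
        send_block(cache.get(req.name))           # cache -> network

    for req in by_level[MEDIUM]:                  # medium: sent if time remains
        if time_left() <= 0:
            break
        if cache.get(req.name) is None:
            cache.put(req.name, read_block(req))
        send_block(cache.get(req.name))

    for req in by_level[LOW]:                     # low: prefetch only, never sent
        if time_left() <= 0:
            break
        if cache.get(req.name) is None:
            cache.put(req.name, read_block(req))
```
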
`
5.1 Performance Limits

Using a Sun SPARCstation 10-41 host with two Fast-SCSI adaptors and four disks, and reading into memory random 48 Kbyte tiles from all disks simultaneously, we have measured a single server disk-to-memory throughput of 9 Mbytes/s. When we add a process which sends UDP packets to the ATM interface, this reduces the disk-to-memory throughput to 8 Mbytes/s (64 Mbits/s). The network throughput under these conditions is 7.5 Mbytes/s (60 Mbits/s). This number is an upper limit on performance for this platform; it does not include the ISS overhead of buffer management, semaphore locks, and context switching. The SCSI host adaptor and Sbus are not yet saturated, but adding more disks will not help the overall throughput without faster access to memory and to the network (e.g., multiple interfaces and multiple independent data paths as are used in systems like a SPARCserver 1000 or SGI Challenge).

6.0 Current Status

All ISS software is currently tested and running on Sun workstations (SPARCstations and SPARCserver 1000's) running SunOS 4.1.3 and Solaris 2.3, DEC Alphas running OSF/1, and SGIs running IRIX 5.x. Demonstrations of the ISS with the MAGIC terrain visualization application TerraVision have been done using several WAN configurations in the MAGIC testbed [9]. Using enough disks (4-8, depending on the disk and system type), the ISS software has no difficulty saturating current ATM interface cards. We have worked with 100 Mbit and 140 Mbit TAXI S-Bus and VME cards from Fore Systems, and OC-3 (155 Mbit/s) cards from DEC, and in all cases ISS throughput is only slightly less than ttcp⁴ speeds.

Table 1 below shows various system ttcp speeds and ISS speeds. The first column is the maximum ttcp speed using TCP over the LAN.

TABLE 1.

System                           LAN ttcp        Max ATM ttcp w/ disk read   Max ISS speed
Sun SS10-51                      70 Mbits/sec    60 Mbits/sec                55 Mbits/sec
Sun SS1000 (2 processors)        75 Mbits/sec    65 Mbits/sec                60 Mbits/sec
SGI Challenge L (2 processors)   82 Mbits/sec    72 Mbits/sec                65 Mbits/sec
DEC Alpha                        127 Mbits/sec   95 Mbits/sec                88 Mbits/sec

4. ttcp is a utility that times the transmission and reception of data between two systems using the UDP or TCP protocols.

- Modifying name server design to accommodate data on server performance and availability, and to provide a mechanism to request tiles from the "best" server (fastest or least loaded);

- Investigating the issues involved in dealing with data other than image- or video-like data.

Many of these enhancements will involve extensions to the data placement algorithm and the cache management methods. Also we plan to explore some optimization techniques, including using larger disk reads, and conversion of all buffer and device management
