`
`by
`
`Prashant Jagdish Shenoy
`
`1998
`
`LENOVO ET AL. EXHIBIT 1028
`Lenovo et al. v. Intellectual Ventures I, LLC
`IPR2017-00477
`
`Page 1 of 137
`
`
`
`Symphony: An Integrated Multimedia File System
`
`by
`
`Prashant Jagdish Shenoy, B.Tech., M.S.
`
`Dissertation
`
`Presented to the Faculty of the Graduate School of
`
`The University of Texas at Austin
`
`in Partial Fulfillment
`
`of the Requirements
`
`for the Degree of
`
`Doctor of Philosophy
`
`The University of Texas at Austin
`
`August 1998
`
`Page 2 of 137
`
`
`
`Symphony: An Integrated Multimedia File System
`
`Approved by
`Dissertation Committee:
`
`Page 3 of 137
`
`
`
`To Mom and Dad
`To Mom and Dad
`
`Page 4 of 137
`
`Page 4 of 137
`
`
`
`Acknowledgments
`
`The long-winding road towards a doctoral degree can be fraught with frustration and pain. Instead, the company and
`support that I received during my five year stint in Austin have made this journey rewarding and downright fun. I
`will try and acknowledge the people who have shared these memorable years. My apologies to those I may forget.
`I would first like to thank my dissertation supervisor, Prof. Harrick Vin, for his constant encouragement, mentor-
`
`ing and guidance. His perseverance and strong emphasis on excellence—qualities that I pledge to emulate—have
`been inspiring.
`I am grateful to Prof. Lorenzo Alvisi, Prof. Mike Dahlin, Prof. Don Fussell, Prof. Kevin Jeffay, and Dr. Dan
`Swinehart for agreeing to serve on my dissertation committee. Their constructive comments and suggestions have
`contributed greatly to my dissertation research. I would especially like to thank Prof. Don Fussell for his help and
`
`advice during the final stages of my dissertation and my interview trail. My sincere thanks to Patti Spencer, our
`Facilities and Equipment Coordinator, for providing hardware and software support at a moments notice; and to
`Gloria Ramirez, our Graduate Coordinator, for helping me through the bureaucratic maze at UT.
`Several exceptional collaborators have helped shape the ideas in this dissertation. The kernel support for Sym-
`phony was jointly developed with Balaji Balasubramanayan. Portions of the work on failure recovery techniques
`
`and the initial implementation of the Symphony prototype was joint with Sriram Rao. The design and implementa-
`tion of the buffer subsystem of Symphony was jointly done with Renu Tewari. Pawan Goyal has contributed, both
`directly and indirectly, to many of the ideas contained in this dissertation. I thank him for his critical comments and
`insightful suggestions.
`Life in Austin has been enriched by several people I have gotten to know over the past few years. I have immensely
`
`benefited from many stimulating discussions, technical and otherwise, with my colleagues at the Distributed Multi-
`media Computing Laboratory (DMCL): Balaji Balasubramanyan, Xingang Guo, Amir Husain, Jon Kay, Scott Page,
`Ed Posnak, Sriram Rao, Jasleen Sahni and T R. Vishwanath. I will fondly remember the endless discussions and
`arguments during the lunch hour and the 4 p.m. coffee break. The company of Sundeep Abraham, Rajesh Govin-
`dan, Srinivas Guggilla, Raja Krishnaswamy, Anand Kudari, Dev Putchala, Rajmohan Rajaraman, Divas Sanwal and
`
`Premal Shah provided many memorable moments. I owe special thanks to Bhagat Nainani for his help and support
`during my early years in Austin.
`I am especially indebted to two fabulous colleagues and close friends, Pawan Goyal and Renu Tewari. I thank
`them for patiently listening to my numerous grouses, their help on innumerable occasions, their advice on technical
`matters and their perspectives on life in general—I have learnt a lot from them.
`
`Lastly, but most importantly, I would like to thank my family for sharing the ups and downs of graduate student life
`and for putting up with long years of separation. My sister, Preeti, has been a endless source of enthusiasm through
`these years. My parents, Shakuntala and Jagdish Shenoy, have provided inspiration and guidance throughout the
`
`v
`
`Page 5 of 137
`
`
`
`course of my graduate studies and my entire life. I am what they taught me to be. I dedicate this dissertation to
`
`them.
`
`The University of Texas at Austin
`August 1998
`
`PRASHANT JAGDISH SHENOY
`
`vi
`
`Page 6 of 137
`
`
`
`Symphony: An Integrated Multimedia File System
`
`Publication No.
`
`Prashant Jagdish Shenoy, Ph.D.
`The University of Texas at Austin, 1998
`
`Supervisor: Harrick M. Vin
`
`Advances in computing technologies coupled with the growing popularity of the World Wide Web have led to a
`proliferation of applications that have diverse performance requirements and access heterogeneous data. Existing
`general purpose file systems have been developed for a single class of applications. Since these file systems treat
`all applications alike regardless of their requirements, they are ineffective at simultaneously supporting multiple
`application classes. A simple approach that addresses this limitation is to design a file system that employs multiple
`
`servers, each optimized for a particular class of applications. However, we demonstrate that static partitioning of
`resources among component servers inherent in this approach results in a manifold degradation in performance over
`an approach that employs an integrated server for all application classes. The design of such an integrated multimedia
`file system poses three challenges: (i) the file system must support multiple application classes and align the service
`provided within a class with application needs, (ii) it must enable the coexistence of multiple data type specific
`
`policies, and (iii) it must employ an extensible architecture to facilitate easy integration of new application classes
`and data types. The design, implementation and evaluation of Symphony, an integrated multimedia file system that
`meets these requirements is the focus of this dissertation.
`In this dissertation, we first design mechanisms for disk scheduling, placement, caching and failure recovery that
`implement core file system functionality. Whereas the mechanism for disk scheduling supports multiple application
`
`classes and aligns the service provided within a class with application needs, those for placement, caching and
`failure recovery enable the coexistence of multiple data type specific policies. Moreover, these mechanisms prevent
`interference between policies with conflicting requirements and dynamically allocate resources to applications on
`demand. Next we develop a number of data type specific policies for placement, failure recovery and meta data
`management that exploit the characteristics of the data to optimize file server performance. We then develop a novel
`
`two layer architecture for Symphony, consisting of a data type independent layer and a data type specific layer. The
`two layers of Symphony separate data type independent mechanisms from data type specific policies, and thereby
`facilitate easy extensions to the file system.
`We implement the policies and mechanisms that we develop in the two layer architecture of Symphony and then
`experimentally evaluate the Symphony prototype to demonstrate its effectiveness in supporting diverse applications
`
`vii
`
`Page 7 of 137
`
`
`
`and managing heterogeneous data. Our results show that, even at moderate utilization levels, a four disk Symphony
`
`prototype can yield a factor of 1.9 improvement in response times of interactive text requests over existing disk
`scheduling techniques, while meeting the deadlines of all real-time continuous media requests. Moreover, tailoring
`policies to needs of data types improves server throughput and reduces the recovery overhead for continuous media
`from a factor of two to zero.
`
`viii
`
`Page 8 of 137
`
`
`
`Contents
`
`Acknowledgments
`
`Abstract
`
`Chapter 1 Introduction
`
`. . . . . .
`1.1 File Systems: A Perspective
`1.2 Limitations of Existing File Systems . .
`1.3 Summary of Contributions
`.
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`
`1.4 Dissertation Outline . . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . .
`
`Chapter 2 Design Methodologies
`
`. . . . . .
`2.1 Qualitative Comparison . . .
`2.1.1 Disk Bandwidth Management
`.
`2.1.2
`Storage Space Management
`. .
`
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . .
`2.1.3 Buffer Space Management
`. . . . .
`2.2 Experimental Evaluation . .
`. . . . . .
`. . . . .
`2.2.1 Experimental Setup .
`. . . . . .
`2.2.2 Behavior Under Different Loads . . . . .
`2.2.3
`Performance Comparison . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . .
`2.2.4 Overprovisioning Issues
`2.2.5
`Sensitivity to Experimental Parameters
`.
`2.3 Concluding Remarks . . . .
`. . . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`. . . .
`. . . .
`. . . .
`
`Chapter 3 Symphony Architecture: An Overview
`
`3.1 Requirements for a Physically Integrated File System . . . .
`3.1.1
`Service Model
`. . .
`. . . . . .
`. . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`
`. . . . .
`3.1.2 Retrieval Architecture
`. . . . .
`3.1.3
`Placement Techniques
`3.1.4
`Failure Recovery Techniques . .
`3.1.5 Caching Techniques
`. . . . . .
`3.1.6 Meta Data Management
`. . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`3.1.7 Extensibility . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . .
`
`ix
`
`v
`
`vii
`
`1
`
`2
`3
`4
`
`7
`
`8
`
`9
`10
`11
`
`11
`12
`12
`13
`15
`
`17
`19
`19
`
`21
`
`21
`21
`
`22
`22
`23
`23
`23
`
`24
`
`Page 9 of 137
`
`
`
`3.1.8 Requirements Summary . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . . . .
`
`. . . . .
`
`. .
`
`. . . .
`
`. . . .
`
`. . . . . .
`. . . . . .
`. . . . .
`. . . . . .
`3.2 Architecture of Symphony .
`3.2.1 Mechanisms for Enabling Coexistence of Diverse Policies . . .
`3.2.2
`Policies for Managing Heterogeneous Data types . .
`. . . . . .
`3.3 Concluding Remarks . . . .
`. . . . . .
`. . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`
`Chapter 4 Cello Disk Scheduling Framework
`
`4.1 Requirements for a Disk Scheduling Algorithm .
`
`. . . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . .
`
`4.2 The Cello Disk Scheduling Framework .
`4.2.1 Architectural Principles
`. . . .
`4.2.2 The Class-independent Scheduler
`4.2.3 The Class-specific Schedulers .
`4.2.4 Complexity Analysis . . . . . .
`
`. . . . .
`. . . . .
`. . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . . .
`. . . . . .
`. . . . .
`. . . . . .
`4.2.5 Discussion . . . . .
`. . . . .
`. . . . . .
`. . . . . .
`. . . . .
`. . . . . .
`4.3 Experimental Evaluation . .
`. . . . .
`. . . . . .
`. . . . .
`4.3.1 Aligning the Service to Application Needs
`. . . . .
`. . . . . .
`4.3.2 Effect of Proportionate Allocation in Cello . . . . .
`4.3.3
`Proportionate Time-allocation versus Proportionate Byte-allocation . . .
`
`4.3.4 Effect of reassigning idle bandwidth . . .
`4.3.5 Overheads of Cello .
`. . . . . .
`. . . . .
`4.4 Related Work . . .
`. . . . .
`. . . . . .
`. . . . .
`4.5 Concluding Remarks . . . .
`. . . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`
`Chapter 5 Striping Techniques
`
`. . . . . .
`. . . . . .
`. . . . .
`5.1 Determining the Stripe Unit Size . . . .
`5.1.1
`Analytical Models for Determining the Load on the Array . . .
`5.1.2 Validation of the Models . . . .
`. . . . .
`. . . . . .
`. . . . . .
`5.1.3
`Factors Affecting the Optimal Block Size . . . . . .
`. . . . . .
`5.1.4
`Selecting an Optimal Block Size . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . .
`. .
`5.2 Determining the Degree of Striping
`5.2.1 Modeling the Imbalance Across Partitions . . . . . .
`5.2.2 Determining the Partition Size .
`. . . . .
`. . . . . .
`5.3 Related Work . . .
`. . . . .
`. . . . . .
`. . . . .
`. . . . . .
`5.4 Concluding Remarks . . . .
`. . . . . .
`. . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`Chapter 6 Failure Recovery Techniques
`
`. . . . .
`6.1 Exploiting Inherent Redundancy of Video Streams
`6.1.1 Characteristics of the Discrete Cosine Transform . .
`6.1.2
`Image Partitioning Fundamentals
`. . . .
`. . . . . .
`6.1.3 Loss-Resilient JPEG (LRJ) Algorithm . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`. . . .
`. . . .
`
`6.1.4 Loss-Resilient MPEG (LRM) Algorithm . . . . . .
`6.1.5
`Inherently Redundant Array of Disks (IRAD) . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . .
`. . . .
`
`24
`
`24
`24
`27
`28
`
`29
`
`30
`
`32
`32
`33
`40
`41
`
`42
`43
`44
`46
`48
`
`49
`49
`50
`51
`
`52
`
`53
`55
`60
`62
`69
`
`69
`71
`72
`73
`74
`
`75
`
`77
`78
`78
`81
`
`84
`86
`
`x
`
`Page 10 of 137
`
`
`
`6.2 Exploiting Sequentiality of Continuous Media Retrieval . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . .
`
`87
`
`87
`92
`93
`93
`97
`
`. . . .
`. . . .
`. . . .
`. . . .
`. . . .
`
`Parity-based Reconstruction . .
`6.2.1
`Failure Recovery Overheads . .
`6.2.2
`6.2.3 Discussion . . . . .
`. . . . . .
`6.3 Comparative Evaluation . . .
`. . . . . .
`6.4 Experimental Evaluation . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`6.4.1 LRJ/LRM algorithms and the IRAD Architecture . .
`6.4.2
`Parity-Based Failure Recovery .
`. . . . .
`. . . . . .
`6.5 Related Work . . .
`. . . . .
`. . . . . .
`. . . . .
`. . . . . .
`6.6 Concluding Remarks . . . .
`. . . . . .
`. . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`Chapter 7 Symphony Implementation and Evaluation
`
`97
`. . . .
`. . . . 100
`. . . . 102
`. . . . 102
`
`103
`
`7.1 The Data Type Independent Layer
`
`. . .
`
`. . . . .
`
`. . . . . .
`
`. . . . . .
`
`. . . . .
`
`. . . . . .
`
`. . . . 104
`
`7.1.1 The Disk Subsystem . . . . . .
`7.1.2 The Buffer Subsystem . . . . .
`7.1.3 The Resource Manager . . . . .
`7.2 The Data Type Specific Layer
`. . . . .
`7.2.1 The Video Module .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . .
`. . . . . .
`7.2.2 The Text Module . .
`. . . . . .
`. . . . .
`7.2.3 The File Server Interface . . . .
`7.3 Experimental Evaluation of the Symphony Prototype . . . .
`7.3.1
`Performance of Text and Video Clients
`.
`. . . . . .
`7.3.2
`Performance in the presence of disk failure . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`. . . . . .
`
`. . . . 104
`. . . . 107
`. . . . 108
`. . . . 108
`. . . . 109
`
`. . . . 111
`. . . . 111
`. . . . 112
`. . . . 112
`. . . . 114
`
`. . . . .
`7.4 Related Work . . .
`7.5 Concluding Remarks . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . 114
`. . . . 115
`
`Chapter 8 Conclusions
`
`117
`
`. . . . . .
`.
`8.1 Summary of Contributions
`8.2 Directions for Future Research . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . .
`. . . . .
`
`. . . . . .
`. . . . . .
`
`. . . . 117
`. . . . 119
`
`Bibliography
`
`120
`
`xi
`
`Page 11 of 137
`
`
`
`Chapter 1
`
`Introduction
`
`multi: many, multiple, much.
`
`medium: pl media: an individual held to be a channel of communication between the earthly world and
`
`a world of spirits.
`
`—Mariam Webster’s Collegiate Dictionary
`
`Operating systems form an essential software substrate of computing systems. An operating system manages
`
`physical resources such as processors, memory, and storage devices and presents a virtualized view of these re-
`sources to applications. The virtualized view provided for storage devices is that of a file system. A file system
`provides persistent storage of data by implementing files—named objects containing data that are immune to tem-
`porary failures of the operating system. From the user’s perspective, the file system is one of the most ubiquitous
`components of an operating system. Consequently, the services provided by a file system have a critical impact on
`
`the usability of the system and ease of application design. Since the file system manages non-volatile storage devices
`such as disks that are orders of magnitude slower than volatile storage devices such as memory and registers, the
`policies and mechanisms employed by the file system significantly impact the overall system performance. Hence,
`file systems must be designed to maximize utility to applications as well as optimize system performance.
`
`Traditionally, general purpose file systems have focused on supporting a single class of best-effort applications
`and techniques that optimize performance for such applications. The explosive growth of the World Wide Web
`coupled with synergistic advances in computing and communication technologies have led to a proliferation of
`a new generation of applications with significantly differing requirements. Hence, file systems will increasingly
`need to support applications with diverse performance requirements. Consider the following examples of emerging
`
`applications that must be supported by general purpose file systems:
`
`(cid:15) I/O intensive applications: Applications in this class arise in a variety of disciplines such as Biology (e.g.,
`
`simulation of viral structures), Chemistry (e.g., molecular dynamics), Earth Sciences (e.g., seismic data pro-
`cessing), Engineering (e.g., study of aircraft turbulence), Graphics (e.g., parallel rendering systems), and Space
`Sciences (e.g., visualization of satellite images). All of these applications generate, access and process massive
`amounts of data; the volume of data managed is orders of magnitude larger than that managed by conventional
`applications. Moreover, unlike conventional applications that access only textual data, these applications ac-
`
`cess different types of data such as audio, video, images, animations sequences, and text (collectively referred
`
`1
`
`Page 12 of 137
`
`
`
`Table 1.1: Characteristics of data types
`
`Data type
`Text
`Gif image
`CD audio
`MPEG video
`
`Storage requirement Real-time requirements
`4-8KB
`No
`64 KB
`No
`175 MB (384 Kb/s)
`Yes
`2 GB (4 Mb/s)
`Yes
`
`to as multimedia). These data types inherently differ in their characteristics such as size, format, data rate,
`
`and real-time requirements. To illustrate, typical textual and image files are a few kilobytes in size, whereas
`audio and video files are orders of magnitude larger. Moreover, textual data accesses are aperiodic in nature,
`whereas audio and video (continuous media) accesses are periodic and sequential. Finally, textual and image
`accesses are best-effort in nature, while continuous media accesses impose real-time constraints and require
`performance guarantees. Table 1.1 summarizes these characteristics.
`
`(cid:15) Interactive applications: Examples of interactive applications include flight training simulators, virtual worlds,
`and multi-player games. In addition to the requirements imposed by I/O intensive applications, these applica-
`
`tions demand real-time interactivity from the system.
`
`The need for supporting diverse applications and managing heterogeneous data necessitates fundamental changes
`to the services provided by a file system. The design, implementation, and evaluation of a file system that achieves
`these objectives, referred to as an integrated multimedia file system, is the focus of this dissertation.
`The rest of this chapter is organized as follows. A brief survey of existing file systems is presented in Section
`1.1.
`In Section 1.2, we argue that these file systems are inadequate for meeting the requirements of emerging
`
`applications. Section 1.3 summarizes the research contributions of this dissertation, and Section 1.4 presents an
`outline of the dissertation.
`
`1.1 File Systems: A Perspective
`
`File systems have evolved significantly since their inception. Historically the first file systems were developed for
`mainframe operating systems, such as CTSS, OS/360, MVS, VMS and Multics, and were optimized for supporting
`data processing applications such as payroll and inventory control [79]. The advent of the UNIX1 operating system
`in the 1970s led to a proliferation of affordable minicomputers. Early versions of UNIX supported a file system—
`now known as the System V file system—that employed simple storage and retrieval policies, such as small block
`
`sizes and limited prefetching [73]. These policies limited the file server throughput to less than 4% of the peak disk
`bandwidth [56]. The Berkeley Fast File System (FFS), introduced in 4.2BSD UNIX, addressed these limitations by
`increasing the block size used to store files and by using sophisticated storage allocation and read-ahead techniques
`[56]. It also introduced new features such as file locking, long file names and symbolic links.
`
`1UNIX is a registered trademark of the X/OPEN consortium.
`
`2
`
`Page 13 of 137
`
`
`
`The growing popularity of local area networks (LANs) in the eighties saw the emergence of distributed file
`
`systems, such as Sun Microsystem’s Network File System (NFS) and AT&T’s Remote File Sharing (RFS) [87].
`These file systems allowed users to access files stored on remote servers from any computer on a network. Both NFS
`and RFS were designed for LAN environments with a small number of clients; neither scales well to enterprise-wide
`and campus-wide networks with thousands of clients. The Andrew file system (AFS) from CMU and IBM was
`developed to address this limitation [87]. AFS partitioned the network into clusters and used a dedicated server per
`
`cluster to store files frequently accessed by clients in the cluster. AFS also used aggressive caching techniques to
`minimize network traffic. Together, these techniques enabled AFS to scale to a large number of servers servicing
`thousands of clients.
`Several technological trends have driven file system research in the nineties. Falling memory prices have made
`large file caches on file servers commonplace. These caches service a majority of the read requests using cached
`
`data, causing disk traffic to be dominated by write requests. The log-structured file system from Berkeley exploits
`this trend by writing all information to disk as a sequential append-only log, thereby substantially improving disk
`throughput [74]. xFS extends this idea to a distributed file system. In addition, xFS exploits the availability of fast
`local area networks to harness resources available on client machines to improve file I/O performance. All aspects
`of file system services—data storage, data caching and control processing—are distributed across potentially all
`
`machines in the system, leading to a serverless architecture [3].
`Concurrently, advances in compression technologies, coupled with several standardization efforts such as MPEG,
`have led to a proliferation of digitized audio and video. Since audio and video inherently differ in their characteristics
`as compared to textual data, conventional file systems are inadequate for managing these data types. To address
`these limitations, several special-purpose file systems, referred to as continuous media servers, have been developed
`
`[2, 29, 44, 71, 84, 88]. These servers exploit the sequential and periodic nature of continuous media accesses to
`improve server throughput, and employ resource reservation algorithms to provide real-time performance guarantees.
`Recently, the growing popularity of the World Wide Web has led to an increased interest in mechanisms for wide
`area information access. File systems such as WebNFS [12] and Common Internet File System (CIFS) [49] that
`enable global information access over wide area networks (WANs) have been developed to fill this void. These file
`
`systems are optimized for file accesses over networks with large latencies.
`In summary, rapid technological advances have fueled file system research in the past few decades. File systems
`have evolved from simple data storage systems to sophisticated networked information access systems. However, as
`we shall argue in the next section, even these modern file systems have certain fundamental shortcomings that make
`them unsuitable for supporting diverse applications accessing heterogeneous data.
`
`1.2 Limitations of Existing File Systems
`
`The need for managing heterogeneity in application requirements and data characteristics imposes three key require-
`ments on a file system:
`
`(cid:15) Heterogeneity in application requirements: Applications that coexist in general purpose computing envi-
`
`ronments have diverse performance requirements. For instance, interactive best-effort applications, such as
`word-processors and compilers, desire low average response times but no absolute performance guarantees.
`Throughput-intensive best-effort applications, such as ftp and http servers, desire high aggregate throughput
`
`3
`
`Page 14 of 137
`
`
`
`and are less concerned about the response times of individual requests. Soft real-time applications, such as
`
`video players, require performance guarantees from the file system (e.g., bounds on response times) but can
`tolerate occasional violations of these guarantees. To efficiently support diverse applications, a file system
`will be required to simultaneously optimize different performance criteria. To do so, the file system should
`support multiple classes of service (e.g., real-time, interactive) and align the service provided within a class
`with application needs.
`
`(cid:15) Heterogeneity in data characteristics: Emerging applications access multiple types of data. Since data types
`inherently differ in their characteristics, use of a single policy for managing all data types can yield sub-
`
`optimal results. For instance, the least recently used (LRU) cache replacement policy is suitable for textual
`file accesses that exhibit locality of reference, but is completely ineffective for sequential continuous media
`accesses. Similarly, the Interval Caching policy developed for sequential continuous media accesses is unsuit-
`able for textual accesses [26]. Since no single cache replacement policy is suitable for all data types, the file
`system must either sacrifice cache hit ratio or support multiple policies to reconcile conflicting requirements.
`
`Similar arguments are applicable to policies employed for storage, retrieval, and failure recovery. Hence, to
`efficiently support heterogeneous data, the file system should enable the coexistence of multiple data type
`specific specific policies.
`
`(cid:15) Extensibility: Since it is difficult, if not impossible, to foresee requirements imposed by future applications and
`data types, a file system should employ techniques that facilitate easy integration of new application classes
`and data types. This requires file systems to employ an extensible architecture so that file system services can
`
`be easily tailored to meet new requirements.
`
`Unfortunately, existing file systems fail to meet one or more of these requirements. Most existing file systems are
`optimized for a single application class and a single data type. For instance, conventional file systems are optimized
`for best-effort applications that store and retrieve textual data [56]. These file systems provide a best-effort service
`to all applications regardless of their performance requirements. Consequently, they are ineffective at meeting the
`performance requirements of real-time applications. In contrast, continuous media file servers employ predictable
`
`resource allocation techniques to meet the real-time requirements of audio and video. Most of these servers assume
`a predominantly read-only environment with substantially less stringent response time requirements (e.g., video-
`on-demand services in which latencies of a few seconds for initiating continuous media playback are considered
`acceptable). Hence, they are not suitable for conventional textual applications. Finally, since all of these file systems
`have been developed for a single class of applications, none of them employ an extensible architecture. Due to
`
`these limitations of existing file systems, realizing emerging applications requires the development of integrated
`multimedia file systems (henceforth referred to as integrated file systems).
`
`1.3 Summary of Contributions
`
`This dissertation focuses on the design, implementation and evaluation of Symphony—an integrated file system that
`can manage heterogeneity in application requirements and data characteristics. A key thesis of our work is that
`a single technique is inadequate for meeting the diverse requirements of applications and data types, and hence,
`
`4
`
`Page 15 of 137
`
`
`
`an integrated file system should enable the coexistence of multiple application-specific and data-type specific tech-
`
`niques. This is a fundamental departure from existing file systems, which employ a single technique for servicing
`all applications and data types regardless of their requirements.
`This dissertation makes five contributions. First, we propose two different methodologies for designing integrated
`file systems and evaluate their tradeoffs. Second, we design mechanisms that enable the coexistence of diverse data
`type specific policies in the file system. Third, we design data type specific policies that exploit the characteristics of
`
`the data to optimize file server performance. Fourth, we develop a novel two layer architecture for Symphony that
`separates data type independent mechanisms from data type specific policies, and thereby facilitates easy extensions
`to the file system. Fifth, we instantiate our policies and mechanisms in a prototype implementation of Symphony
`and experimentally demonstrate the efficacy of our techniques for managing diverse applications and heterogeneous
`data. In what follows, we describe our contributions in detail.
`
`We first propose two different methodologies for designing integrated file systems—logically integrated file sys-
`tems and physically integrated file systems—and evaluate their tradeoffs. We argue that use of a single physically
`integrated server for all applications is desirable in many environments over logically integrated file systems that
`employ separate servers for each application class. For such environments, we demonstrate that dynamic sharing
`of resources inherent in the former approach yields a manifold performance improvement over employing separate
`
`servers. Based on these results, we choose the physically integrated file system architecture for designing Symphony.
`We then examine the requirements imposed on physically integrated file systems and derive key principles underly-
`ing the design of such file systems. Specifically, we argue that, a physically integrated file system must: (i) export
`multiple classes of service to applications and align the service provided with application needs, (ii) support multi