`Virtual Devices for Virtual Machines
`
`Andrew Kent Warfield
`
`Clare Hall
`
`A dissertation submitted to the University of Cambridge
`for the degree of Doctor of Philosophy
`
`University of Cambridge
`Computer Laboratory
`William Gates Building
`15 JJ Thomson Avenue
`Cambridge CB3 0FD
`UK
`
`Email: andrew.warfield@cl.cam.ac.uk
`
`May 5, 2006
`
`Microsoft Ex. 1045, p. 1
`Microsoft v. Daedalus Blue
`IPR2021-00832
`
`
`
`This dissertation is the result of my own work and includes nothing which
`is the outcome of work done in collaboration except where specifically indi-
`cated in the text.
`This dissertation is not substantially the same as any I have submitted for a
`degree or diploma or any other qualification at any other university.
`No part of this dissertation has already been, or is currently being, submitted
`for any such degree, diploma or other qualification.
`This dissertation does not exceed sixty thousand words.
`
`This dissertation is copyright © 2005 Andrew Warfield.
`All trademarks used in this dissertation are hereby acknowledged.
`
`
`Virtual Devices for Virtual Machines
`Summary
`
`May 5, 2006
`
`Andrew Kent Warfield
`Clare Hall College
`
`Computer systems research has recently seen a huge resurgence of interest in
`hardware virtualization, a software technique originally developed to man-
`age mainframe computers in the 1960s. Using virtual machines (VMs), a
`commodity PC may be divided into isolated “slices”, each perceiving that
`it is executing on separate physical hardware. This thesis considers the ef-
`fective virtualization of I/O devices on commodity hardware and presents
`an approach that allows developers to add new functionality to a piece of
`hardware as a software extension, running in an isolated VM. The new vir-
`tual device is presented to the OS using the existing virtualized hardware
`interface, allowing extensions to be easily applied across a wide range of
`operating systems.
`
`Isolating extensions in their own virtual machines is effectively a “sledge-
`hammer” version of the system decomposition that was attempted by mi-
`crokernels through the 1980s and 1990s. The VM-based approach has the
`benefit of demonstrably working with a broad range of existing systems,
`and allowing developers to build extensions in their OS and language of
`choice. It concurrently maintains the benefits of isolation: extension crashes
`are prevented from disrupting the rest of the system, and extension software
`has a clean and simple interface to devices. This thesis develops this work
`by demonstrating the construction of a set of device extensions for various
`pieces of hardware. Additionally, this thesis demonstrates that device exten-
`sions may be aggregated within cluster environments to implement device
`services, allowing specific device types to be treated as a service throughout
`a cluster of virtual machines.
`
`Several examples are presented to validate the flexibility of device exten-
`sions: A packet symmetry-based rate limiter demonstrates a single-host net-
`work extension that prevents VMs from issuing common forms of denial
`of service attacks. Parallax, a distributed storage system for VMs, demon-
`strates the implementation of a device service for the management of storage
`within a cluster. Finally, device extensions are combined with other virtu-
`alization projects to develop deployable system-wide extensions to virtual
`hardware.
`
`Acknowledgements
`
`I am indebted to my advisor, Steven Hand, who has acted as both friend
`and mentor throughout my doctoral work. Steve is a gifted teacher and
`frequently provided the right guidance at the right time. I hope to be able to
`match his standard in supervising my own graduate students in the future.
`
`The packet symmetry work in Chapter 4 stems from collaboration with
`Christian Kreibich, Jon Crowcroft, Steve Hand, and Ian Pratt. James Bulpin
`developed two excellent implementations of the distributed block store used
`by Parallax in Chapter 5, and Christian Limpach provided the file system
`measurements presented in Figure 5.6. The work in Chapter 6 is a result
`of collaborations in combining the soft device framework with other re-
`search projects. The extensions for debugging are in collaboration with
`Alex Ho, while taint tracking is the result of ongoing collaboration with
`Alex, Michael Fetterman, Christopher Clark, and Steve Hand.
`
`I have been very lucky to be a member of the Systems Research Group
`during a period of time where so much great work has been in progress,
`and am delighted to have been able to participate in such a broad range of
`projects during my time at Cambridge. In particular, James Bulpin, Julian
`Chesterfield, Christopher Clark, Jon Crowcroft, Michael Dales, Tim Dee-
`gan, Michael Fetterman, Keir Fraser, Alex Ho, Eva Kalyvianaki, Christian
`Kreibich, Christian Limpach, Anil Madhavapeddy, Rolf Neugebauer, Ian
`Pratt, Russ Ross, and Steven Smith have all provided insightful discussion
`and generally been really good fun to work with.
`
`Thanks to Steve, Jon, Christian K., Julian, Rolf, Tim D., Alex, and Richard
`Mortier for providing feedback on drafts of this thesis.
`
`Thanks finally to the staff of the Castle Pub, who have provided the impetus
`for more interesting research than they can possibly realize.
`
`Table of Contents
`
`Summary
`Acknowledgements
`
`1 Introduction
`1.1 Motivation
`1.2 Contribution
`1.3 Outline
`1.4 Published Results
`
`2 Background
`2.1 The Virtualization Renaissance
`2.1.1 Role of a Virtual Machine Monitor
`2.1.2 VMMs in Modern Systems
`2.1.3 High-level Principles
`2.2 The Xen Virtual Machine Monitor
`2.2.1 Xen
`2.2.2 Paravirtualized Hardware Interface
`2.2.3 Live VM Migration
`2.3 Summary
`
`3 The Soft Device Architecture
`3.1 Extending Devices
`3.1.1 Performance and Safety
`3.1.2 Software Engineering
`3.2 Split Drivers
`3.2.1 Isolating Driver Code
`3.2.2 VM Device Interface
`3.2.3 The Data Path: Device Channels and Grant Tables
`3.2.4 The Control Path: Control Interfaces and XenStore
`3.2.5 The Virtual Block Interface
`3.2.6 The Virtual Network Interface
`3.2.7 Performance of Split Drivers
`3.3 Soft Devices
`3.3.1 The Device Tap
`3.3.2 The Block Tap
`3.3.3 The Network Tap
`3.3.4 Performance
`3.4 Device Services
`3.4.1 A Device Service Architecture
`3.4.2 Performance Isolation
`3.4.3 Security Isolation Through Narrow Interfaces
`3.4.4 Failure Isolation and Fate Sharing
`3.4.5 Administrative Isolation
`3.5 Summary
`
`4 Soft Devices for Traffic Management
`4.1 Network Device Extensions in Xen
`4.1.1 Preventing Source Address Spoofing
`4.1.2 Resource Isolation and Rate Control
`4.1.3 Migration of Network Addresses
`4.1.4 Virtual Private Networks
`4.2 Operator Responsibility and Denial of Service
`4.3 Packet Symmetry Enforcement
`4.3.1 Packet Symmetry
`4.3.2 Prototype Implementation
`4.4 Summary
`
`5 Soft Devices for Storage
`5.1 Parallax: A Distributed Storage Service for Virtual Machines
`5.1.1 Overall System Design
`5.1.2 Implementation and Evaluation
`5.1.3 Future Work
`5.1.4 Current Status
`5.2 Summary
`
`6 Supporting System-Wide Architectural Change
`6.1 Device Support for Pervasive Debugging
`6.1.1 Hardware Modifications in PDB
`6.1.2 Soft Device Support for Debugging
`6.1.3 Understanding Intrusions
`6.2 Taint-based Data Isolation
`6.2.1 Overview of Taint-tracking
`6.2.2 Device Extensions
`6.3 Summary
`
`7 Related Work
`7.1 Virtual Machines
`7.2 Device Access in Operating Systems
`7.3 Extensibility in System Software
`7.4 Device Virtualization and Extension
`7.5 Smart Devices
`
`8 Conclusion
`8.1 Future Work
`8.1.1 Extending the Existing Extensions
`8.1.2 Other Device Interfaces
`8.2 Summary
`
`List of Tables
`
`2.1 The Paravirtualized x86 Interface
`3.1 Xen’s Virtual Block Device Interface
`3.2 Network Latency Overhead
`3.3 Summary of Approaches to the Management of Virtual Devices
`5.1 VM Interfaces to CVDs
`5.2 Administrative Interfaces to CVDs
`
`List of Figures
`
`2.1 Overview of a VMM-based System
`2.2 Division of the Administrative Role
`2.3 Type I and II VMMs
`2.4 Memory in Xen
`2.5 Migration Timeline
`2.6 Results of Migrating a Running Web Server VM
`3.1 Soft Device Overview
`3.2 Split Driver Structure
`3.3 Shared-Memory Ring
`3.4 Device Channel Example: Block Read
`3.5 XenStore Overview
`3.6 The Block Request Structure
`3.7 System Benchmarks of Split Drivers
`3.8 High-level View of a Traffic-limiting Soft Device
`3.9 Overview of Device Tap Configurations
`3.10 Device Tap Structure
`3.11 Examples of Forwarding Modes
`3.12 Structure of the Block Tap
`3.13 Example of the blktaplib Interface
`3.14 Example of the libipq Interface
`3.15 Block Throughput
`3.16 Block Request Timelines
`3.17 Network Throughput
`3.18 Structure of a Device Service
`4.1 Illustration of Asymmetry-based Rate Limiting
`4.2 Simple DDoS Example: UDP Flood
`4.3 Limiting UDP Flood Based on Packet Symmetry
`4.4 High Throughput TCP Traffic Remains Unaffected
`5.1 Parallax High-Level Architecture
`5.2 CVD Snapshot and Copy-on-Write
`5.3 CVD Tree View—Visualizing the Snapshot Log
`5.4 Administrative Data Structures
`5.5 Throughput and Compile Performance
`5.6 Write Cost of updatedb on File Systems
`6.1 Overview of Taint Tracking
`
`Glossary
`
`ABI: Application Binary Interface
`CVD: Cluster Virtual Disk
`IPI: Inter-processor Interrupt
`OS: Operating System
`VCPU: A Virtual CPU
`VM: Virtual Machine
`VMM: Virtual Machine Monitor
`
`Backend: VM containing a backend driver. (§3.2)
`Backend Driver: A device driver that maps requests from frontend
`    drivers in client VMs onto a physical device driver in the backend
`    VM. (§3.2)
`Communication Ring: A shared memory ring used to pass messages between
`    VMs. (§3.2.3.4)
`Device Tap: A driver that allows interposition and redirection of
`    device requests. (§3.3)
`Device Channel: A mechanism for high-performance inter-VM communication
`    based on the combination of communication rings and event channels.
`    (§3.2.3)
`Device Service: An aggregation of a set of soft devices, allowing a
`    given device class to be managed as a service across a set of
`    physical hosts.
`Domain: Logical container within which a VM executes.
`Domain 0: Privileged domain, capable of administering other domains.
`Driver VM: VM, with hardware access to a device, which hosts a physical
`    device driver. (§3.2)
`Event Channel: One-bit VM-to-VM notification mechanism; effectively a
`    virtual interrupt line. (§3.2.3.1)
`Event Notification: Notification received on an event channel.
`    (§3.2.3.1)
`Extension VM: VM hosting a device extension. (§3.3)
`Foreign Page: A page of memory mapped from another VM.
`Frontend Driver: A simple virtual device driver allowing a VM to access
`    a class of devices (e.g. disks, network interfaces) exported by a
`    backend driver. (§3.2)
`Grant Mapping: A grant table operation allowing a page of memory
`    belonging to one VM to be mapped into another VM as shared memory.
`Grant Table: A per-VM table that lists pages of memory that the VM is
`    sharing (mapping) or transferring to other VMs.
`Grant Transfer: A grant table operation allowing a page of memory
`    belonging to a VM to be relinquished, and given to a specific other
`    VM.
`Guest: A VM running above a hypervisor.
`GuestOS: The OS used in a guest. In the case of Xen, guestOSes are
`    often paravirtualized.
`Hypercall: Analogous to a system call, a hypercall is a request for the
`    hypervisor to perform a privileged operation.
`Hypervisor: syn. Virtual Machine Monitor.
`Machine Memory: In the context of a VMM, machine memory refers to the
`    real hardware memory on the system.
`Page: Unit of memory described by virtual memory hardware on a
`    processor. Notwithstanding processor extensions, a page of memory on
`    the x86 architecture is 4096 bytes.
`Paravirtualization: A technique of hardware virtualization in which the
`    virtual hardware interface is different from the real physical
`    hardware. Paravirtualization is used to improve performance, but
`    requires that hosted operating systems be modified to reflect the
`    modified hardware interface.
`Physical Memory: syn. Pseudo-physical memory.
`Pseudo-physical Memory: Virtualized machine memory, presented as
`    physical memory to a VM.
`Soft Device: A device, presented to a virtual machine, which has had
`    its functionality augmented in software. (§3.3)
`Split Driver: A client-server-style device driver model used to share
`    access to I/O devices in a VMM. (§3.2)
`Tap: syn. Device Tap.
`Virtual Machine: An execution environment in which hosted software is
`    presented with the illusion that it has ownership of a complete
`    physical machine.
`Virtual Machine Monitor: A piece of low-level systems software that
`    multiplexes access to physical hardware across a collection of
`    virtual machines.
`Xen: A virtual machine monitor developed at the University of
`    Cambridge.
`XenBus: A device driver used to map device configuration information
`    from XenStore onto an OS’s device probing interfaces. (§3.2.4.1)
`XenLinux: The paravirtualized version of Linux that runs on Xen.
`XenStore: A persistent hierarchical store used to communicate
`    configuration data in Xen. (§3.2.4.1)
`
`Chapter 1
`
`Introduction
`
`1.1 Motivation
`
`The development of system software to support access to I/O devices is a
`source of many challenges. For years, OS developers have been tasked with
`efficiently bridging the gap between the low-level interfaces of a constantly
`evolving set of device hardware, and the semantically richer, high-level in-
`terfaces required by application software. While the design space between
`these two interfaces is broad and allows considerable room for innovation,
`it has resulted in a wide variety of OSes for common, commodity hardware
`that each have a different and generally incompatible approach to interact-
`ing with devices.
`
`Device driver code is hard to write, and has been described as the most
`error-prone subset of modern operating systems [CYC+01, SBL03]. As the
`driver-OS interface is not standard across systems, device vendors are unable
`to develop robust commercial drivers for every OS on which a driver will be
`used. The lack of driver availability and the difficulty in supporting devices
`as they emerge have been described as a stifling factor in the development of
`new OSes [FBB+97].
`
`As the task of supporting devices on an OS is difficult, the act of extending
`them—installing software to augment the functionality of a given device—is
`very hard indeed. Despite the fact that many interesting research projects
`have shown the power of innovation at the device interface to build ex-
`tensions such as secure [GNA+97], distributed [LT96, SFV+04], or ver-
`sioned [WCG04] storage and packet filtered [EK96, PF01] or intrusion-
`detecting [WCSG04] network interfaces, these efforts have had only min-
`imal impact on real systems. Moreover, development of these extensions is
`complicated by the eccentricities of individual OS device interfaces, and the
`resulting code is invariably hard to maintain or port to other systems.
`
`As commodity systems have become increasingly powerful, and organiza-
`tions have become concerned with the degree of utilization of individual
`physical servers, there has been a renewed interest in hardware virtualiza-
`tion, a technique originally developed for mainframes in the late 1960s
`which allows a physical host to be partitioned into a number of virtual
`machines (VMs). The availability and growing deployment of hardware vir-
`tualization on commodity systems has interesting implications for problems
`relating to device management in systems software; virtualization results in
`both challenges and opportunities in managing devices.
`
`On one hand, hardware virtualization presents a challenge in that it is unsafe
`to grant concurrent access to a device’s physical interface to more than one
`operating system. A system providing virtualization is thus responsible for
`safely multiplexing low-level device access among running virtual machines.
`Not only must VMs be able to share access to physical devices such as disk
`and network, but they must be prevented from interfering with one another
`either maliciously or accidentally.
`
`Conversely, virtualization presents developers with the ability to interact
`with devices below the OS’s hardware interface. As virtualization soft-
`ware runs underneath OS code, interactions with virtualized device hard-
`ware must pass through it. This unique position allows the introduction of
`device extensions that are inherently both isolated from and portable across
`the range of OSes that run in the virtualized environment.
`
`1.2 Contribution
`
`It is the thesis of this work that a well-designed approach to device virtual-
`ization addresses the major problems of portability and extensibility in the
`management of devices on commodity systems. Using the Xen virtual ma-
`chine monitor, a widely-available and robust VMM that has been developed
`at Cambridge over the past four years, this work demonstrates a set of ap-
`proaches to the virtualization and extension of I/O devices for commodity
`hardware. The techniques described are validated through the development
`of a set of practical, useful device extensions targeted primarily at large
`computer installations, such as data centres, where virtualization is used.
`
`The initial contribution of this work is the design and implementation of
`an architecture for the development of device extensions. The combination
`of a physical device and extension software that modifies the behaviour of
`that piece of hardware is described as a soft device. I have constructed
`a set of software tools that allow the construction of soft devices for the
`disk and network device interfaces that exist in Xen today. This approach
`demonstrates that extensions may be written and executed in user-space of
`an isolated VM, allowing developers complete freedom to innovate while
`maintaining reasonable performance.
`
`The second contribution of this thesis is the aggregation of soft device-based
`extensions to form device services. Device services allow device extensions
`to be composed into cluster-wide facilities which serve large numbers of
`virtual machines.
`
`To demonstrate the range, scope and flexibility of these techniques, I have
`explored the construction of both soft devices and device services for a va-
`riety of applications. An additional contribution of this thesis is the explo-
`ration of a set of such examples, specifically storage, traffic management,
`and whole-system extensions, that are targeted to address relevant problems
`in clustered VM environments.
`
`1.3 Outline
`
`The remainder of this thesis is structured as follows:
`
`Chapter 2 is a discussion of the relevant background and aims to familiarize
`the reader with the current state of hardware virtualization in general and
`Xen in particular.
`
`Chapter 3 describes the complete set of mechanisms for device virtualization
`and extension advocated by this thesis. It begins with a detailed presenta-
`tion of the split driver model for providing virtual devices in Xen. Next it
`presents the soft device architecture for constructing isolated device exten-
`sions, and demonstrates implementations of extension support for both disk
`and network devices. The chapter concludes with a description of the device
`service model for aggregating device extensions in cluster environments. The
`remaining chapters of the thesis validate this architecture by demonstrating
`examples of soft device-based extensions.
`
`In Chapter 4, I discuss extensions for the support of network interfaces.
`After surveying the challenges faced in virtualizing network devices, the
`chapter presents an example device extension, a packet symmetry-based rate
`limiter. This extension monitors outbound and inbound packet counts and
`prevents VMs from being used maliciously to mount denial of service at-
`tacks.
`
`Chapter 5 presents Parallax, an example device service to address the stor-
`age requirements of virtualization-based clusters.
`
`Chapter 6 presents a final example of the application of device services.
`This chapter considers the combination of the techniques developed in this
`thesis with more sweeping changes to the virtualization system in order to
`realize full-system architectural change. The chapter presents two examples
`of such change: extensions to support whole-system debugging, and taint-
`based memory protection.
`
`Chapter 7 places this thesis in the context of relevant related work and
`Chapter 8 concludes and discusses directions for future investigation.
`
`1.4 Published Results
`
`Some aspects of this work have been described previously. In reverse chrono-
`logical order, the list of related publications is as follows:
`
`1. C. Kreibich, A. Warfield, J. Crowcroft, S. Hand and I. Pratt. Using
`Packet Symmetry to Curtail Malicious Traffic. In Proceedings of the
`ACM Workshop on Hot Topics in Networks (HotNets), College Park,
`MD, 2005.
`
`Describes initial results regarding the packet symmetry scheme pre-
`sented in Chapter 4.
`
`2. A. Warfield, R. Ross, K. Fraser, C. Limpach and S. Hand. Paral-
`lax: Managing Storage for a Million Machines. In Proceedings of the
`USENIX Workshop on Hot Topics in Operating Systems (HotOS),
`Santa Fe, NM, 2005.
`
`Presents the Parallax prototype, which is extended by the work de-
`scribed in Chapter 5.
`
`3. S. Hand, A. Warfield, K. Fraser, E. Kotsovinos and D. Magenheimer.
`Are Virtual Machine Monitors Microkernels Done Right? In Proceed-
`ings of the USENIX Workshop on Hot Topics in Operating Systems
`(HotOS), Santa Fe, NM, 2005.
`
`Argues that virtual machine monitors provide a practical means of
`achieving the isolation goals sought by microkernel research. The
`theme of this paper represents the nucleus of the argument presented
`in this thesis.
`
`4. C. Clark, K. Fraser, S. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt
`and A. Warfield. Live Migration of Virtual Machines. In Proceed-
`ings of the USENIX Symposium on Networked Systems Design and
`Implementation (NSDI), Boston, MA, 2005.
`
`Presents the design and implementation of live-migration support for
`Xen-based virtual machines. The ability to migrate running virtual
`machines punctuates the separation from real hardware that exists in
`these environments, and helps form the basis of the argument for de-
`vice services presented in Chapter 3, and illustrated by Parallax in
`Chapter 5.
`
`5. A. Warfield, S. Hand, K. Fraser and T. Deegan. Facilitating the De-
`velopment of Soft Devices. In Proceedings of the USENIX Annual
`Technical Conference, Anaheim, CA, 2005.
`
`Discusses the initial implementation of the block tap, an instance of a
`device tap as presented in Chapter 3.
`
`6. K. Fraser, S. Hand, R. Neugebauer, I. Pratt, A. Warfield and M. Wil-
`liamson. Safe Hardware Access with the Xen Virtual Machine Mon-
`itor. In Proceedings of the 1st Workshop on Operating System and
`Architectural Support for the On-Demand IT Infrastructure (OASIS-
`1), Boston, MA, 2004.
`
`Motivates the use of virtual machine monitors to enhance the reliabil-
`ity of legacy device drivers for commodity systems, and explains the
`driver architecture used to share I/O devices across virtual machines
`in Xen. This paper presents an initial discussion of split drivers, which
`are detailed in Chapter 3.
`
`7. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R.
`Neugebauer, I. Pratt and A. Warfield. Xen and the Art of Virtual-
`ization. In Proceedings of the 19th ACM Symposium on Operating
`Systems Principles (SOSP), Lake George, NY, 2003.
`
`Provides a detailed technical description of the initial public release of
`the Xen virtual machine monitor.
`
`8. A. Warfield, S. Hand, T. Harris and I. Pratt. Isolation of Shared Net-
`work Resources in Xenoservers. PlanetLab Design Note PDN-02-
`006, 2002.
`
`Describes early approaches taken to the virtualization of network de-
`vices. This paper predates the driver architecture that is used in this
`thesis.
`
`Chapter 4
`
`Soft Devices for Traffic Management
`
`The virtualization of network resources has been of particular interest in
`recent years, as facilities such as virtual hosting providers and academic re-
`search networks strive to provide data-link (typically Ethernet) layer access
`to shared network resources in a fair and isolated manner. My original in-
`volvement with network virtualization was with the network subsystem of
`Xen, which was described both as a PlanetLab design note [WHTP02] and
`in the original Xen paper [BDF+03].
`
`This chapter explores the application of device extensions to address prob-
`lems relating to the management of network resources in VMM-based envi-
`ronments. As with other examples that are presented throughout this thesis,
`device extensions are particularly relevant in that they first present an op-
`portunity to cleanly extend systems in a manner that has hitherto been dif-
`ficult to achieve. Second, they address emerging problems that result from
`the scale and division of administrative responsibility that are presented by
`large VMM-based environments. These two characteristics may be restated
`specifically in terms of network resources as follows:
`
`1. The ability to place isolated extensions at the network edge.
`
`Soft devices are an especially powerful abstraction with regard to net-
`work resources in that, unlike in-OS extensions, they cannot be di-
`rectly subverted under the compromise of administrative access on the
`client OS. Secondly, extensions may limit a VM’s ability to transmit
`data by directly applying back-pressure: they may simply refuse to ac-
`cept transmissions, forcing the burden of queueing onto the client OS
`
`78
`
`Microsoft Ex. 1045, p. 22
`Microsoft v. Daedalus Blue
`IPR2021-00832
`
`
`
`[VMC+05] M. Vrable, J. Ma, J. Chen, E. Vandekieft, A. Snoeren,
`D. Moore, G. Voelker, and S. Savage. Scalability, fidelity and
`containment in the Potemkin virtual honeyfarm. In Proceed-
`ings of the ACM Symposium on Operating System Principles
`(SOSP), Brighton, UK, October 2005.
`
`[Wal02] C. A. Waldspurger. Memory resource management in VMware
`ESX server. In Proceedings of the 5th Symposium on Operat-
`ing Systems Design and Implementation, pages 181–194, De-
`cember 2002.
`
`[WCG04] A. Whitaker, R. S. Cox, and S. D. Gribble. Configuration
`debugging as search: Finding the needle in the haystack. In
`OSDI, pages 77–90, 2004.
`
`[WCSG04] A. Whitaker, R. Cox, M. Shaw, and S. Gribble. Constructing
`services with interposable virtual hardware. In Proceedings of
`the 1st Symposium on Networked Systems Design and Imple-
`mentation, pages 169–182, March 2004.
`
`[WFHD05] A. Warfield, K. Fraser, S. Hand, and T. Deegan. Facilitating the
`development of soft devices. In Proceedings USENIX Annual
`Technical Conference, pages 379–382, 2005.
`
`[Whe00] D. A. Wheeler. Estimating Linux’s size.
`http://www.dwheeler.com/sloc/, November 2000.
`
`[WHTP02] A. Warfield, S. Hand, T. Harris, and I. Pratt. Isolation of shared
`network resources in Xenoservers. Technical Report PDN-02-
`006, Planet Lab Design Note, November 2002.
`
`[Wil02] Matthew M. Williamson. Throttling viruses: Restricting prop-
`agation to defeat malicious
`In Proceedings of