(12) United States Patent
Rawat et al.
(10) Patent No.: US 9,881,040 B2
(45) Date of Patent: Jan. 30, 2018
(71) Applicant: VMware, Inc., Palo Alto, CA (US)
(72) Inventors: Mayank Rawat, Sunnyvale, CA (US); Ritesh Shukla, Saratoga, CA (US); Li Ding, Cupertino, CA (US); Serge Pashenkov, Los Altos, CA (US); Raveesh Ahuja, San Jose, CA (US)
(73) Assignee: VMware, Inc., Palo Alto, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 336 days.
(21) Appl. No.: 14/831,808
(22) Filed: Aug. 20, 2015
(65) Prior Publication Data
US 2017/0052717 A1    Feb. 23, 2017
(51) Int. Cl.
G06F 3/06 (2006.01)
G06F 17/30 (2006.01)
(52) U.S. Cl.
CPC ........ G06F 17/30327 (2013.01); G06F 3/067 (2013.01); G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 17/30088 (2013.01)
(58) Field of Classification Search
CPC ........ G06F 17/30327; G06F 3/0608; G06F 3/0641; G06F 3/067; G06F 17/30088
See application file for complete search history.
(56) References Cited
8,775,773 B2        7/2014  Acharya et al.
9,720,947 B2 *      8/2017  Aron ........ G06F 17/30327
9,740,632 B1 *      8/2017  Love ........ G06F 12/1018
2015/0058863 A1     2/2015  Karamanolis et al.
2016/0210302 A1 *   7/2016  Xia ........ G06F 3/0619
* cited by examiner
Primary Examiner - Eric S Cardwell
(74) Attorney, Agent, or Firm - Patterson & Sheridan,
(57) ABSTRACT
User data of different snapshots for the same virtual disk are stored in the same storage object. Similarly, metadata of different snapshots for the same virtual disk are stored in the same storage object, and log data of different snapshots for the same virtual disk are stored in the same storage object. As a result, the number of different storage objects that are managed for snapshots does not increase proportionally with the number of snapshots taken. In addition, any one of the multitude of persistent storage back-ends can be selected as the storage back-end for the storage objects according to user preference, system requirement, snapshot policy, or any other criteria. Another advantage is that the storage location of the read data can be obtained with a single read of the metadata storage object, instead of traversing metadata files of multiple snapshots.
20 Claims, 4 Drawing Sheets
[Front-page drawing: virtual disk file descriptor 211 and the Snapshot Management Data Structure, with Storage Objects 1, 2, 3 on the storage devices; the same drawing appears on Sheet 2 of 4.]
`WIZ, Inc. EXHIBIT - 1019
`WIZ, Inc. v. Orca Security LTD.
U.S. Patent    Jan. 30, 2018    Sheet 1 of 4    US 9,881,040 B2
[FIG. 1: block diagram of Host Computer System 100. Legible labels: VM 112_1 through VM 112_N; Applications 118; OS 116; VMM 122_1 through VMM 122_N; SCSI Virtualization Layer 131; Device Switch; Data Access Layer 136; HBA(s); HW Platform.]
U.S. Patent    Jan. 30, 2018    Sheet 2 of 4    US 9,881,040 B2
[FIG. 2: data structures for managing virtual disk snapshots.]
virtual disk file descriptor 211:
geometry =
size =
data region = PTR
Snapshot Management Data Structure:
snapshot_data = OID1
snapshot_metadata = OID2
snapshot_log = OID3
OID1 = PTR1
OID2 = PTR2
OID3 = PTR3
SS1 = tag1; OID2, offset x0
SS2 = tag2; OID2, offset x2
SS3 = tag3
RP = OID2, offset xC
Storage Objects 1, 2, 3 are drawn on Storage Device 162 (under VMFS 230) and on Storage Device 161.
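The snapshot management data structure of FIG. 2 can be pictured as a small record. The following is a minimal sketch only, not the patented implementation: the class names `SMDS` and `SnapshotEntry`, the field names, and the methods `add_snapshot` and `resolve` are hypothetical, and the offsets are kept as the symbolic labels that appear in the drawing (x0, x2, xC) rather than interpreted as numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SnapshotEntry:
    """One snapshot (or running-point) record; layout is an assumption."""
    name: str                           # e.g. "SS1"
    tag: Optional[str] = None           # e.g. "tag1"
    metadata_oid: Optional[str] = None  # object that holds this entry's metadata
    offset: Optional[str] = None        # symbolic offset label from FIG. 2


@dataclass
class SMDS:
    """Hypothetical in-memory view of the per-virtual-disk SMDS."""
    snapshot_data: str       # OID of the user-data storage object
    snapshot_metadata: str   # OID of the metadata storage object
    snapshot_log: str        # OID of the log storage object
    locations: Dict[str, str] = field(default_factory=dict)  # OID -> PTR (path or URI)
    snapshots: List[SnapshotEntry] = field(default_factory=list)
    running_point: Optional[SnapshotEntry] = None

    def add_snapshot(self, entry: SnapshotEntry) -> None:
        # Taking a snapshot adds an entry; no new storage object is created.
        self.snapshots.append(entry)

    def resolve(self, oid: str) -> str:
        # Map an object identifier to its storage location.
        return self.locations[oid]


# Values transcribed from FIG. 2; PTR1..PTR3 stand in for real paths or URIs.
smds = SMDS("OID1", "OID2", "OID3",
            locations={"OID1": "PTR1", "OID2": "PTR2", "OID3": "PTR3"})
smds.add_snapshot(SnapshotEntry("SS1", "tag1", "OID2", "x0"))
smds.add_snapshot(SnapshotEntry("SS2", "tag2", "OID2", "x2"))
smds.add_snapshot(SnapshotEntry("SS3", "tag3"))
smds.running_point = SnapshotEntry("RP", None, "OID2", "xC")
```

In this sketch the number of storage objects stays at three however many snapshot entries are appended, which mirrors the property stated in the abstract.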
U.S. Patent    Jan. 30, 2018    Sheet 3 of 4    US 9,881,040 B2
[FIG. 3: schematic of additional data structures, including B+ trees, for managing virtual disk snapshots. Legible labels: "B+ Tree"; "unit of allocation"; "another unit"; "write data"; WR2; WR4; SS1 = OID2, offset x0; SS2 = OID2, offset x2; SS3; RP = OID2, offset 0; RP = OID2, offset x8; RP = OID2, offset xc; tree entries of the form "base, 0", "base, 3200", "base, 3800", "OID1, 0", "OID1, y1", "OID1, y3", "OID2, x2", "OID2, x3". The remaining labels are illegible.]
U.S. Patent    Jan. 30, 2018    Sheet 4 of 4    US 9,881,040 B2
[FIGS. 4A-4D: flow diagrams. Legible steps, grouped by flow:]
Power On: Read SMDS; Open storage objects; Establish running point (RP).
Snapshot: Set running point as root node of previous snapshot; Create node for new running point; Copy contents of root node of previous snapshot into the new running point node and mark all pointers as pointing to shared nodes.
Read IO: Access snapshot metadata at RP; Traverse tree beginning at RP to locate data; Issue read command to read data from the [illegible].
Write IO: Access snapshot metadata at RP; Traverse tree beginning at RP to find write location and update tree; Issue write command to write data at the [illegible].
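The snapshot, read, and write flows on Sheet 4 can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the method of the embodiments: a single flat node stands in for the B+ tree, and the names `Pointer`, `Node`, `SnapshotMetadata`, `take_snapshot`, `read`, and `write` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Location = Tuple[str, int]  # (OID of the data object, offset); assumed layout


@dataclass
class Pointer:
    target: Location
    shared: bool = False  # True when the target is also reachable from a snapshot


@dataclass
class Node:
    # Flat stand-in for a tree node: block number -> pointer.
    pointers: Dict[int, Pointer] = field(default_factory=dict)


class SnapshotMetadata:
    def __init__(self) -> None:
        self.running_point = Node()
        self.snapshots: Dict[str, Node] = {}

    def take_snapshot(self, tag: str) -> None:
        # The current running point becomes the root node of the snapshot.
        previous_root = self.running_point
        self.snapshots[tag] = previous_root
        # Create a node for the new running point, copy the root's contents,
        # and mark every copied pointer as pointing to shared data.
        new_rp = Node()
        for block, ptr in previous_root.pointers.items():
            new_rp.pointers[block] = Pointer(ptr.target, shared=True)
        self.running_point = new_rp

    def write(self, block: int, target: Location) -> None:
        # Update only the running point; snapshot roots stay untouched.
        self.running_point.pointers[block] = Pointer(target)

    def read(self, block: int) -> Optional[Location]:
        # A single lookup starting at the running point locates the data.
        ptr = self.running_point.pointers.get(block)
        return ptr.target if ptr is not None else None


meta = SnapshotMetadata()
meta.write(0, ("OID1", 0))
meta.write(1, ("OID1", 4096))
meta.take_snapshot("SS1")
meta.write(0, ("OID1", 8192))  # overwrite block 0 after the snapshot
```

After the last write, a read of block 0 resolves to the new location while the SS1 root still points at the old one, and block 1 remains marked as shared in the running point.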
In a virtualized computing environment, virtual disks of virtual machines (VMs) running in a host computer system ("host") are typically represented as files in the host's file system. To back up the VM data and to support linked VM clones, snapshots of the virtual disks are taken to preserve the VM data at a specific point in time. Frequent backup of VM data increases the reliability of the VMs. The cost of frequent backup, i.e., taking frequent snapshots, is high because of the increase in associated storage costs and the adverse impact on performance, in particular read performance, because each read will potentially have to traverse each snapshot level to find the location of the read data.

Solutions have been developed to reduce the amount of storage consumed by snapshots. For example, snapshots can be backed up incrementally by comparing blocks from one version to another and saving only the blocks that have changed from the previous version. Deduplication has also been used to identify content duplicates among snapshots to remove redundant storage content.

Although these solutions have reduced the storage requirements of snapshots, further enhancements are needed for effective deployment in cloud computing environments where the number of VMs and snapshots that are managed is quite large, often several orders of magnitude greater than in conventional data centers. In addition, storage technology has advanced to provide a multitude of persistent storage back-ends, but snapshot technology has yet to fully exploit the benefits that are provided by the different persistent storage back-ends.

FIG. 1 is a block diagram of a virtualized host computer system that implements a snapshot module according to embodiments.

FIG. 2 is a schematic diagram that illustrates data structures for managing virtual disk snapshots according to an embodiment.

FIG. 3 is a schematic diagram that illustrates additional data structures, including B+ trees, for managing virtual disk snapshots according to an embodiment.

FIG. 4A depicts a flow diagram of method steps that are carried out in connection with opening storage objects that are needed to manage snapshots according to an embodiment.

FIG. 4B depicts a flow diagram of method steps that are carried out in connection with taking snapshots according to an embodiment.

FIG. 4C depicts a flow diagram of method steps that are carried out to process a read IO on a virtual disk having one or more snapshots that have been taken according to an embodiment.

FIG. 4D depicts a flow diagram of method steps that are carried out to process a write IO on a virtual disk having one or more snapshots that have been taken according to an embodiment.

According to embodiments, user data of different snapshots for the same virtual disk are stored in the same storage object, which may take the form of a file in a host file system, a file in a network file system, and object storage provisioned as a virtual storage area network (VSAN) object, a virtual volume object, or a cloud storage object. Similarly, metadata of different snapshots for the same virtual disk are stored in the same storage object, and log data of different snapshots for the same virtual disk are stored in the same storage object. As a result, the number of different storage objects that are managed for snapshots does not increase proportionally with the number of snapshots taken. In addition, any one of the multitude of persistent storage back-ends can be selected as the storage back-end for the storage objects containing data for the snapshots. As a result, the form of the storage objects containing data for the snapshots may be selected according to user preference, system requirement, snapshot policy, or any other criteria. Another advantage is that the storage location of the read data can be obtained with a single read of the metadata storage object, instead of traversing metadata files of multiple snapshots.

FIG. 1 is a computer system, shown as host computer system 100, having a hypervisor 108 installed on top of hardware platform 102 to support the execution of virtual machines (VMs) 112_1-112_N through corresponding virtual machine monitors (VMMs) 122_1-122_N. Host computer system 100 may be constructed on a conventional, typically server-class, hardware platform 102, and includes one or more central processing units (CPUs) 103, system memory 104, one or more network interface controllers (NICs) 105, and one or more host bus adapters (HBAs) 106. Persistent storage for host computer system 100 may be provided locally, by a storage device 161 (e.g., network-attached storage or cloud storage) connected to NIC 105 over a network 151 or by a storage device 162 connected to HBA 106 over a network 152.

Each VM 112 implements a virtual hardware platform in the corresponding VMM 122 that supports the installation of a guest operating system (OS) which is capable of executing applications. In the example illustrated in FIG. 1, the virtual hardware platform for VM 112_1 supports the installation of a guest OS 116 which is capable of executing applications 118 within VM 112_1. Guest OS 116 may be any of the well-known commodity operating systems, such as Microsoft Windows®, Linux®, and the like, and includes a native file system layer, for example, either an NTFS or an ext3FS type file system layer. Input-output operations (IOs) issued by guest OS 116 through the native file system layer appear to guest OS 116 as being routed to one or more virtual disks provisioned for VM 112_1 for final execution, but such IOs are, in reality, reprocessed by IO stack 130 of hypervisor 108 and the reprocessed IOs are issued through NIC 105 to storage device 161 or through HBA 106 to storage device 162.

At the top of IO stack 130 is a SCSI virtualization layer 131, which receives IOs directed at the issuing VM's virtual disk and translates them into IOs directed at one or more storage objects managed by hypervisor 108, e.g., virtual disk storage objects representing the issuing VM's virtual disk. A file system device switch (FDS) driver 132 examines the translated IOs from SCSI virtualization layer 131 and, in situations where one or more snapshots have been taken of the virtual disk storage objects, the IOs are processed by a snapshot module 133, as described below in conjunction with FIGS. 4C and 4D.

The remaining layers of IO stack 130 are additional layers managed by hypervisor 108. HFS/VVOL/VSAN driver 134 represents one of the following depending on the particular implementation: (1) a host file system (HFS) driver in cases
where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a file in a file system, (2) a virtual volume (VVOL) driver in cases where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a virtual volume as described in U.S. Pat. No. 8,775,773, which is incorporated by reference herein in its entirety, and (3) a virtual storage area network (VSAN) driver in cases where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a VSAN object as described in U.S. patent application Ser. No. 14/010,275, which is incorporated by reference herein in its entirety. In each case, driver 134 receives the IOs passed through filter driver 132 and trans-

2, 3, storage objects 1, 2, 3 are identified by their object identifiers (OIDs) in the embodiments. SMDS provides a mapping of each OID to a location in storage. In SMDS 220, OID1 is mapped to PTR1, OID2 mapped to PTR2, and OID3 mapped to PTR3. Each of PTR1, PTR2, and PTR3 may be a path to a file in HFS 230 or a uniform resource identifier (URI) of a storage object.

SMDS is created per virtual disk and snapshot module 133 maintains the entire snapshot hierarchy for a single virtual disk in the SMDS. Whenever a new snapshot of a virtual disk is taken, snapshot module 133 adds an entry in the SMDS of that virtual disk. SMDS 220 shows an entry for each of snapshots SS1, SS2, SS3. Snapshot SS1 is

