US009881040B2

(12) United States Patent
     Rawat et al.

(10) Patent No.: US 9,881,040 B2
(45) Date of Patent: Jan. 30, 2018
(54) TRACKING DATA OF VIRTUAL DISK SNAPSHOTS USING TREE DATA STRUCTURES

(71) Applicant: VMware, Inc., Palo Alto, CA (US)

(72) Inventors: Mayank Rawat, Sunnyvale, CA (US); Ritesh Shukla, Saratoga, CA (US); Li Ding, Cupertino, CA (US); Serge Pashenkov, Los Altos, CA (US); Raveesh Ahuja, San Jose, CA (US)
(73) Assignee: VMware, Inc., Palo Alto, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 336 days.

(21) Appl. No.: 14/831,808

(22) Filed: Aug. 20, 2015

(65) Prior Publication Data
     US 2017/0052717 A1     Feb. 23, 2017
(51) Int. Cl.
     G06F 3/06      (2006.01)
     G06F 17/30     (2006.01)

(52) U.S. Cl.
     CPC ........ G06F 17/30327 (2013.01); G06F 3/067 (2013.01); G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 17/30088 (2013.01)

(58) Field of Classification Search
     CPC ........ G06F 17/30327; G06F 3/0608; G06F 3/0641; G06F 3/067; G06F 17/30088
     See application file for complete search history.
(56) References Cited

     U.S. PATENT DOCUMENTS

     8,775,773 B2       7/2014   Acharya et al.
     9,720,947 B2 *     8/2017   Aron .............. G06F 17/30327
     9,740,632 B1 *     8/2017   Love .............. G06F 12/1018
  2015/0058863 A1       2/2015   Karamanolis et al.
  2016/0210302 A1 *     7/2016   Xia ............... G06F 3/0619

* cited by examiner

Primary Examiner — Eric S Cardwell
(74) Attorney, Agent, or Firm — Patterson & Sheridan, LLP
(57) ABSTRACT

User data of different snapshots for the same virtual disk are stored in the same storage object. Similarly, metadata of different snapshots for the same virtual disk are stored in the same storage object, and log data of different snapshots for the same virtual disk are stored in the same storage object. As a result, the number of different storage objects that are managed for snapshots does not increase proportionally with the number of snapshots taken. In addition, any one of the multitude of persistent storage back-ends can be selected as the storage back-end for the storage objects according to user preference, system requirement, snapshot policy, or any other criteria. Another advantage is that the storage location of the read data can be obtained with a single read of the metadata storage object, instead of traversing metadata files of multiple snapshots.

20 Claims, 4 Drawing Sheets
[Sheet 1 of 4, FIGURE 1: Host Computer System 100. VMs 112₁-112ₙ (each running Applications 118 on an OS 116) execute through VMMs 122₁-122ₙ on Hypervisor 108, whose IO stack comprises SCSI Virtualization Layer 131, Filesystem Device Switch 132, Snapshot Module 133, HFS/VVOL/VSAN Driver 134, and Data Access Layer 136. HW Platform 102 provides CPU(s) 103, Memory 104, NIC(s) 105, and HBA(s) 106; Storage Device 161 is reached over network 151 and Storage Device 162 over network 152.]
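As a rough, non-authoritative sketch of the IO path labeled in FIGURE 1 (not something taken from the patent), each layer of IO stack 130 can be pictured as a function over an IO request; every function name and request field below is an illustrative assumption.

```python
from typing import Any, Callable, Dict, List

# Hypothetical model of the FIGURE 1 IO stack 130: each layer is a function
# that takes an IO request (a plain dict here) and returns the reprocessed IO.
IORequest = Dict[str, Any]
Layer = Callable[[IORequest], IORequest]

def scsi_virtualization_layer_131(io: IORequest) -> IORequest:
    # Translate an IO aimed at the VM's virtual disk into an IO aimed at the
    # storage objects that the hypervisor manages for that virtual disk.
    io["target"] = f"virtual-disk-storage-object:{io['vm']}"
    return io

def snapshot_module_133(io: IORequest) -> IORequest:
    # Resolve the IO against snapshot metadata (see FIGS. 4C and 4D).
    io["resolved_via"] = "snapshot metadata at running point"
    return io

def filesystem_device_switch_132(io: IORequest) -> IORequest:
    # Route the IO through the snapshot module only when snapshots exist.
    return snapshot_module_133(io) if io.get("snapshots_taken") else io

def hfs_vvol_vsan_driver_134(io: IORequest) -> IORequest:
    # Backend driver: host file system, virtual volume, or VSAN object,
    # depending on how the storage objects are represented.
    io["backend"] = "HFS | VVOL | VSAN"
    return io

def data_access_layer_136(io: IORequest) -> IORequest:
    # Issue the reprocessed IO through NIC 105 or HBA 106 to device 161 or 162.
    io["issued_to"] = "storage device 161/162"
    return io

IO_STACK_130: List[Layer] = [
    scsi_virtualization_layer_131,
    filesystem_device_switch_132,
    hfs_vvol_vsan_driver_134,
    data_access_layer_136,
]

def process_guest_io(io: IORequest) -> IORequest:
    for layer in IO_STACK_130:
        io = layer(io)
    return io

print(process_guest_io({"vm": "112-1", "op": "read", "lba": 3200, "snapshots_taken": True}))
```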
[Sheet 2 of 4, FIGURE 2: virtual disk 210 and its file descriptor 211 (geometry =, size =, data_region = PTR). Snapshot Management Data Structure 220 records snapshot_data = OID1, snapshot_metadata = OID2, snapshot_log = OID3; the mappings OID1 = PTR1, OID2 = PTR2, OID3 = PTR3; the snapshot entries SS1 (tag1; OID2, offset x0), SS2 (tag2; OID2, offset x2), and SS3 (tag3); and the running point RP = OID2, offset xC. Storage Device 162 hosts VMFS 230 containing a base object and Storage Objects 1, 2, 3; Storage Device 161 is shown as an alternative location for Storage Objects 1, 2, 3.]
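As a reading aid for FIGURE 2, the following is a minimal sketch, assuming Python dataclasses as stand-ins, of the structures the figure labels: file descriptor 211, the Snapshot Management Data Structure (SMDS) 220 with its three storage-object identifiers, the OID-to-PTR mappings, the per-snapshot entries, and the running point. The class and field names are illustrative assumptions, not the patent's definitions, and data_region is kept as an opaque pointer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class SnapshotEntry:
    tag: str                                   # e.g., "tag1" for SS1
    metadata_root: Optional[Tuple[str, int]]   # (OID, offset), e.g., ("OID2", 0x0)

@dataclass
class SMDS:
    """Stand-in for Snapshot Management Data Structure 220 (one per virtual disk)."""
    snapshot_data: str         # OID1: user data of all snapshots
    snapshot_metadata: str     # OID2: tree metadata of all snapshots
    snapshot_log: str          # OID3: log data of all snapshots
    oid_map: Dict[str, str]    # OID -> PTR (a file path, e.g. in VMFS 230, or a URI)
    snapshots: Dict[str, SnapshotEntry] = field(default_factory=dict)
    running_point: Optional[Tuple[str, int]] = None   # RP = (OID, offset)

@dataclass
class FileDescriptor:
    """Stand-in for file descriptor 211 of virtual disk 210."""
    geometry: str
    size: int
    data_region: str           # the figure's "data_region = PTR", kept opaque here

# The state drawn in FIGURE 2, expressed with these stand-ins
# (offsets x0, x2, xC from the figure are rendered as small integers):
smds_220 = SMDS(
    snapshot_data="OID1",
    snapshot_metadata="OID2",
    snapshot_log="OID3",
    oid_map={"OID1": "PTR1", "OID2": "PTR2", "OID3": "PTR3"},
    snapshots={
        "SS1": SnapshotEntry("tag1", ("OID2", 0x0)),
        "SS2": SnapshotEntry("tag2", ("OID2", 0x2)),
        "SS3": SnapshotEntry("tag3", None),
    },
    running_point=("OID2", 0xC),
)
```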
[Sheet 3 of 4, FIGURE 3: B+ trees maintained in metadata storage object OID2 as writes WR1, WR2, WR3, and WR4 are applied to the virtual disk LBA space (0 to 8000) and snapshots are taken. Tree nodes map LBA ranges to (storage object, offset) pairs such as (base, 0), (base, 3200), (base, 3800), (OID1, 0), (OID1, y1), (OID1, y3), and (OID1, y4); leaf tables pair LBAs (3000, 3200, 3500, 3800, 7700, 7900) with pointers OID2, x0 through OID2, x9. Write data is appended to data storage object OID1 one unit of allocation at a time (offsets y1, y3, y4). Successive panels record SS1 with RP = OID2, offset x0; then SS2 with SS1 = OID2, offset x0 and RP = OID2, offset x8; then SS3 with SS2 = OID2, offset x2 and RP = OID2, offset xC.]
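FIGURE 3 shows B+ trees, kept in metadata object OID2, that map virtual disk LBAs to (storage object, offset) pairs, with newly written data consuming space in data object OID1 one unit of allocation at a time. The sketch below is a deliberate simplification: it uses a flat sorted extent list rather than an actual B+ tree, and names such as ExtentMap and ALLOC_UNIT are invented, purely to illustrate the lookup and remap-on-write behavior.

```python
import bisect
from typing import List, Tuple

# Each extent maps [lba, lba + length) to (backing object, offset), e.g.
# ("base", 0) for unmodified blocks or ("OID1", y) for blocks written after a snapshot.
Extent = Tuple[int, int, str, int]   # (lba, length, obj, offset)

class ExtentMap:
    """Simplified stand-in for one tree held in metadata object OID2."""

    ALLOC_UNIT = 4096  # hypothetical unit of allocation in data object OID1

    def __init__(self, disk_size: int):
        # Initially every LBA resolves to the base storage object.
        self.extents: List[Extent] = [(0, disk_size, "base", 0)]
        self.next_free = 0   # next free offset in data object OID1

    def locate(self, lba: int) -> Tuple[str, int]:
        """Return the (object, offset) backing a given LBA (a FIGURE 4C style lookup)."""
        starts = [e[0] for e in self.extents]
        i = bisect.bisect_right(starts, lba) - 1
        start, length, obj, off = self.extents[i]
        assert start <= lba < start + length
        return obj, off + (lba - start)

    def write(self, lba: int, length: int) -> Tuple[str, int]:
        """Record a write WRn: allocate space in OID1 and remap the LBA range."""
        dest = ("OID1", self.next_free)
        self.next_free += -(-length // self.ALLOC_UNIT) * self.ALLOC_UNIT
        new_extents: List[Extent] = []
        for start, ln, obj, off in self.extents:
            end, w_end = start + ln, lba + length
            if end <= lba or start >= w_end:           # extent untouched by the write
                new_extents.append((start, ln, obj, off))
                continue
            if start < lba:                            # keep the left remainder
                new_extents.append((start, lba - start, obj, off))
            if end > w_end:                            # keep the right remainder
                new_extents.append((w_end, end - w_end, obj, off + (w_end - start)))
        new_extents.append((lba, length, dest[0], dest[1]))
        self.extents = sorted(new_extents)
        return dest

# A write resembling WR1 in FIGURE 3 (LBA values are illustrative only).
m = ExtentMap(disk_size=8000)
m.write(3200, 300)
print(m.locate(3300))   # ("OID1", 100): remapped into the newly allocated unit
print(m.locate(100))    # ("base", 100): unwritten blocks still resolve to the base object
```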
[Sheet 4 of 4, FIGURE 4A (VM Power On): 402 Read SMDS; 404 Open storage objects; 406 Establish running point (RP). FIGURE 4B (VM Snapshot): 412 Set running point as root node of previous snapshot; 414 Create node for new running point; 416 Copy contents of root node of previous snapshot into the new running point node and mark all pointers as pointing to shared nodes. FIGURE 4C (Read IO): 422 Access snapshot metadata at RP; 424 Traverse tree beginning at RP to locate data; 426 Issue read command to read data from the location. FIGURE 4D (Write IO): 432 Access snapshot metadata at RP; 434 Traverse tree beginning at RP to find write location and update tree; 436 Issue write command to write data at the location.]
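FIGURES 4A through 4D outline the runtime flows: at power-on the SMDS is read, the storage objects are opened, and a running point (RP) is established; taking a snapshot creates a node for the new running point, copies the previous root node into it, and marks its pointers as shared; reads and writes then traverse the metadata tree starting at RP. The sketch below is a loose, hypothetical rendering of the FIGURE 4B through 4D steps on an in-memory node tree; the Node class and its copy-on-write handling are my own simplifications, not the patented implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class Node:
    """One metadata tree node (conceptually held in metadata storage object OID2)."""
    entries: Dict[int, Tuple[str, int]] = field(default_factory=dict)   # lba -> (object, offset)
    children: Dict[int, "Node"] = field(default_factory=dict)           # key -> child node
    shared: bool = False    # True once the node belongs to an earlier snapshot

def take_snapshot(previous_root: Node) -> Node:
    """FIGURE 4B: create a node for the new running point, copy the previous
    root's contents into it, and mark all pointers as pointing to shared nodes."""
    new_rp = Node(entries=dict(previous_root.entries),
                  children=dict(previous_root.children))
    for child in new_rp.children.values():
        child.shared = True
    return new_rp

def read_io(rp: Node, lba: int) -> Optional[Tuple[str, int]]:
    """FIGURE 4C: traverse the tree beginning at RP to locate the data,
    then issue the read command against the returned (object, offset)."""
    node: Optional[Node] = rp
    while node is not None:
        if lba in node.entries:
            return node.entries[lba]
        node = node.children.get(lba)   # toy single-key descent, not a real B+ tree walk
    return None                         # unmapped: the data still lives in the base object

def write_io(rp: Node, lba: int, location: Tuple[str, int]) -> None:
    """FIGURE 4D: traverse from RP to find the write location and update the tree;
    a shared node is copied before it is modified (copy-on-write)."""
    child = rp.children.get(lba)
    if child is None:
        rp.entries[lba] = location                    # record directly in the RP node
    else:
        if child.shared:                              # never modify a snapshot's node
            child = Node(dict(child.entries), dict(child.children))
            rp.children[lba] = child
        child.entries[lba] = location
    # ...then issue the write command to `location`

# Usage: after a snapshot, the old root is preserved while new writes go to the new RP.
ss1_root = Node(entries={3200: ("OID1", 0)})
rp = take_snapshot(ss1_root)            # SS1's tree is now shared history
write_io(rp, 3500, ("OID1", 4096))      # a post-snapshot write lands only in the new RP
print(read_io(rp, 3200))                # ('OID1', 0): inherited from SS1
print(read_io(ss1_root, 3500))          # None: SS1 is unaffected by the later write
```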
TRACKING DATA OF VIRTUAL DISK SNAPSHOTS USING TREE DATA STRUCTURES

BACKGROUND

In a virtualized computing environment, virtual disks of virtual machines (VMs) running in a host computer system ("host") are typically represented as files in the host's file system. To back up the VM data and to support linked VM clones, snapshots of the virtual disks are taken to preserve the VM data at a specific point in time. Frequent backup of VM data increases the reliability of the VMs. The cost of frequent backup, i.e., taking frequent snapshots, is high because of the increase in associated storage costs and the adverse impact on performance, in particular read performance, because each read will potentially have to traverse each snapshot level to find the location of the read data.

Solutions have been developed to reduce the amount of storage consumed by snapshots. For example, snapshots can be backed up incrementally by comparing blocks from one version to another so that only the blocks that have changed from the previous version are saved. Deduplication has also been used to identify content duplicates among snapshots to remove redundant storage content.

Although these solutions have reduced the storage requirements of snapshots, further enhancements are needed for effective deployment in cloud computing environments, where the number of VMs and snapshots that are managed is quite large, often several orders of magnitude greater than in conventional data center deployments. In addition, storage technology has advanced to provide a multitude of persistent storage back-ends, but snapshot technology has yet to fully exploit the benefits that are provided by the different persistent storage back-ends.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a virtualized host computer system that implements a snapshot module according to embodiments.

FIG. 2 is a schematic diagram that illustrates data structures for managing virtual disk snapshots according to an embodiment.

FIG. 3 is a schematic diagram that illustrates additional data structures, including B+ trees, for managing virtual disk snapshots according to an embodiment.

FIG. 4A depicts a flow diagram of method steps that are carried out in connection with opening storage objects that are needed to manage snapshots according to an embodiment.

FIG. 4B depicts a flow diagram of method steps that are carried out in connection with taking snapshots according to an embodiment.

FIG. 4C depicts a flow diagram of method steps that are carried out to process a read IO on a virtual disk having one or more snapshots that have been taken according to an embodiment.

FIG. 4D depicts a flow diagram of method steps that are carried out to process a write IO on a virtual disk having one or more snapshots that have been taken according to an embodiment.

DETAILED DESCRIPTION

According to embodiments, user data of different snapshots for the same virtual disk are stored in the same storage object, which may take the form of a file in a host file system, a file in a network file system, an object storage provisioned as a virtual storage area network (SAN) object, a virtual volume object, or a cloud storage object. Similarly, metadata of different snapshots for the same virtual disk are stored in the same storage object, and log data of different snapshots for the same virtual disk are stored in the same storage object. As a result, the number of different storage objects that are managed for snapshots does not increase proportionally with the number of snapshots taken. In addition, any one of the multitude of persistent storage back-ends can be selected as the storage back-end for the storage objects containing data for the snapshots. As a result, the form of the storage objects containing data for the snapshots may be selected according to user preference, system requirement, snapshot policy, or any other criteria. Another advantage is that the storage location of the read data can be obtained with a single read of the metadata storage object, instead of traversing metadata files of multiple snapshots.

FIG. 1 is a computer system, shown as host computer system 100, having a hypervisor 108 installed on top of hardware platform 102 to support the execution of virtual machines (VMs) 112₁-112ₙ through corresponding virtual machine monitors (VMMs) 122₁-122ₙ. Host computer system 100 may be constructed on a conventional, typically server-class, hardware platform 102, and includes one or more central processing units (CPUs) 103, system memory 104, one or more network interface controllers (NICs) 105, and one or more host bus adapters (HBAs) 106. Persistent storage for host computer system 100 may be provided locally, by a storage device 161 (e.g., network-attached storage or cloud storage) connected to NIC 105 over a network 151, or by a storage device 162 connected to HBA 106 over a network 152.

Each VM 112 implements a virtual hardware platform in the corresponding VMM 122 that supports the installation of a guest operating system (OS) which is capable of executing applications. In the example illustrated in FIG. 1, the virtual hardware platform for VM 112₁ supports the installation of a guest OS 116 which is capable of executing applications 118 within VM 112₁. Guest OS 116 may be any of the well-known commodity operating systems, such as Microsoft Windows®, Linux®, and the like, and includes a native file system layer, for example, either an NTFS or an ext3FS type file system layer. Input-output operations (IOs) issued by guest OS 116 through the native file system layer appear to guest OS 116 as being routed to one or more virtual disks provisioned for VM 112₁ for final execution, but such IOs are, in reality, reprocessed by IO stack 130 of hypervisor 108, and the reprocessed IOs are issued through NIC 105 to storage device 161 or through HBA 106 to storage device 162.

At the top of IO stack 130 is a SCSI virtualization layer 131, which receives IOs directed at the issuing VM's virtual disk and translates them into IOs directed at one or more storage objects managed by hypervisor 108, e.g., virtual disk storage objects representing the issuing VM's virtual disk. A file system device switch (FDS) driver 132 examines the translated IOs from SCSI virtualization layer 131 and, in situations where one or more snapshots have been taken of the virtual disk storage objects, the IOs are processed by a snapshot module 133, as described below in conjunction with FIGS. 4C and 4D.

The remaining layers of IO stack 130 are additional layers managed by hypervisor 108. HFS/VVOL/VSAN driver 134 represents one of the following depending on the particular implementation: (1) a host file system (HFS) driver in cases
where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a file in a file system, (2) a virtual volume (VVOL) driver in cases where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a virtual volume as described in U.S. Pat. No. 8,775,773, which is incorporated by reference herein in its entirety, and (3) a virtual storage area network (VSAN) driver in cases where the virtual disk and/or data structures relied on by snapshot module 133 are represented as a VSAN object as described in U.S. patent application Ser. No. 14/010,275, which is incorporated by reference herein in its entirety. In each case, driver 134 receives the IOs passed through filter driver 132 and trans-

2, 3, storage objects 1, 2, 3 are identified by their object identifiers (OIDs) in the embodiments. SMDS provides a mapping of each OID to a location in storage. In SMDS 220, OID1 is mapped to PTR1, OID2 mapped to PTR2, and OID3 mapped to PTR3. Each of PTR1, PTR2, and PTR3 may be a path to a file in HFS 230 or a uniform resource identifier (URI) of a storage object.

SMDS is created per virtual disk and snapshot module 133 maintains the entire snapshot hierarchy for a single virtual disk in the SMDS. Whenever a new snapshot of a virtual disk is taken, snapshot module 133 adds an entry in the SMDS of that virtual disk. SMDS 220 shows an entry for each of snapshots SS1, SS2, SS3. Snapshot SS1 is