Proceedings

3rd International Conference on Autonomic Computing

ICAC 2006

Logo artwork provided by Vincent Matossian, Rutgers University
The papers in this book comprise the digest of the meeting mentioned on the cover and title page. They reflect the authors' opinions and are published as presented and without change, in the interest of timely dissemination. Their inclusion in this publication does not necessarily constitute endorsement by the editors or the Institute of Electrical and Electronics Engineers, Inc.

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of U.S. copyright law for private use of patrons those articles in this volume that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For other copying, reprint, or republication permission, write to IEEE Copyrights Manager, IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. All rights reserved. Copyright ©2006 by The Institute of Electrical and Electronics Engineers, Inc.
IEEE Catalog Number: 06EX1303
ISBN: 1-4244-0175-5
Library of Congress: 2006920296
Efficient Data Migration in Self-managing Storage Systems

Vijay Sundaram
Dept. of Computer Science
University of Massachusetts
Amherst, MA 01003
vijay@cs.umass.edu

Timothy Wood
Dept. of Computer Science
University of Massachusetts
Amherst, MA 01003
twood@cs.umass.edu

Prashant Shenoy
Dept. of Computer Science
University of Massachusetts
Amherst, MA 01003
shenoy@cs.umass.edu
Abstract-Self-managing storage systems automate the tasks of detecting hotspots and triggering data migration to alleviate them. This paper argues that existing data migration techniques do not minimize the data copying overhead incurred during reconfiguration, which in turn impacts application performance. We propose a novel technique that automatically detects hotspots and uses the bandwidth-to-space ratio metric to greedily reconfigure the system while minimizing the resulting data copying overhead. We validate our technique with simulations and a prototype implemented in the Linux kernel. Our prototype and simulations show that our algorithm successfully eliminates hotspots with a factor of two reduction in data copying overhead compared to other approaches.
I. INTRODUCTION

As information is increasingly created, processed, and manipulated in digital form, the role of large-scale enterprise storage systems becomes increasingly important. Enterprise storage systems are complex entities that comprise a large number of RAID arrays with one or more data partitions mapped to each array. As storage needs and I/O workloads evolve over time, managing, configuring, and continually tuning such systems becomes a complex task [2], [3]. Consequently, the design of self-managing storage systems is an appealing option; such a system performs automated mapping of data partitions to RAID arrays, monitors the workload on each array, and transparently triggers data migration if hotspots or imbalances are detected in the system.

Until recently, these tasks were performed manually by administrators using sophisticated tools to analyze the load on the system [1]. Despite these tools' capabilities to collect performance data and predict the impact of moving data partitions from one array to another, the decision process remains manual and prone to human error. To address this drawback, recent efforts have focused on automating the tasks of detecting hotspots and triggering data migration to alleviate them [5], [7], [8]. Such a remapping of data partitions to RAID arrays involves a system reconfiguration that either results in downtime or reduced performance during the migration. Consequently, it is essential to minimize the reconfiguration overhead by minimizing the volume of data moved, and thus the time needed to do so. Unfortunately, recently proposed approaches do not focus on minimizing data copying, resulting in higher data copying overheads than necessary.

This paper focuses on the design of automated data migration algorithms that minimize the total data copying overhead incurred during reconfiguration in self-managing storage systems. We also propose automated techniques to detect hotspots and trigger our data migration algorithm.
II. PROBLEM FORMULATION

Consider an enterprise storage system as shown in Figure 1. Such a system comprises multiple, heterogeneous disk arrays, each consisting of one or more RAID devices. A RAID device is constructed by combining a set of disks from the array using RAID techniques [9]. Each RAID device can contain one or more data partitions (also called logical volumes), and it is possible for partitions to span multiple RAID devices, forming a logical RAID.

In this system, there is a hard constraint that the storage needs of all partitions mapped to an array cannot exceed the array's storage capacity. There is also a weak constraint that the total bandwidth utilization of an array be smaller than a threshold τ in order for the array to be considered load balanced. If the bandwidth utilization exceeds this threshold, the array is overloaded and a hotspot is said to exist in the system.

Alleviating the hotspot requires some logical volumes to be moved from overloaded arrays to underloaded ones. The cost of such a system reconfiguration is defined to be the total amount of data moved. If a new configuration does not change the mapping of a particular volume, then that volume does not contribute to the cost of the reconfiguration.
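These constraints can be stated compactly. Writing $V_i$ for the set of volumes mapped to array $i$, $s_j$ and $b_j$ for the size and bandwidth demand of volume $j$, $S_i$ for the storage capacity of array $i$, and $\tau$ for the utilization threshold (notation ours; the paper states the constraints only in prose):

    \sum_{j \in V_i} s_j \le S_i \quad \text{(hard storage constraint)}, \qquad
    \sum_{j \in V_i} b_j \le \tau \quad \text{(weak bandwidth constraint)}.

If $\pi$ and $\pi'$ denote the old and new volume-to-array mappings, the reconfiguration cost is the total size of the volumes whose mapping changes:

    \mathrm{cost}(\pi, \pi') = \sum_{j \,:\, \pi'(j) \ne \pi(j)} s_j .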
III. COST-AWARE OBJECT REMAPPING

A. Background

A self-managing storage system must be able to determine a new mapping of data partitions after detecting a hotspot. Several techniques can be employed for this purpose:

Bin Packing: One approach is to consider the new storage and bandwidth requirement of each logical volume and determine a new mapping of volumes to arrays from scratch using bin packing heuristics [2]. Random permutations of volumes are generated and objects are assigned randomly to arrays until a valid assignment is obtained (a valid assignment satisfies both the hard storage constraint and the bandwidth constraint, eliminating the hotspot).
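As a rough illustration of this style of randomized bin packing, consider the following minimal Python sketch. It is illustrative only, not the heuristic of [2] in detail; the volume and array attributes (name, size, bandwidth, capacity, tau) and the retry bound are our assumptions.

    import random

    def randomized_bin_pack(volumes, arrays, max_tries=1000):
        """Generate random permutations and random assignments until one
        satisfies the storage and bandwidth constraints on every array."""
        for _ in range(max_tries):
            order = list(volumes)
            random.shuffle(order)                  # random permutation of volumes
            space = {a.name: 0.0 for a in arrays}
            load = {a.name: 0.0 for a in arrays}
            mapping = {}
            for v in order:
                a = random.choice(arrays)          # random target array
                mapping[v.name] = a.name
                space[a.name] += v.size
                load[a.name] += v.bandwidth
            # A valid assignment satisfies both constraints on every array.
            if all(space[a.name] <= a.capacity and load[a.name] <= a.tau
                   for a in arrays):
                return mapping
        return None                                # no valid assignment found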
[Fig. 1. An enterprise storage system. The figure shows two disk arrays with four RAID devices each; each RAID device has five disks. Logical volumes are striped across all RAID devices on the first array, while each volume is striped across two RAID devices in the second array.]
BSR-based Approach: The bandwidth-to-space ratio (BSR) has been used as a metric for video placement [4], [6]. Using knapsack heuristics, volumes are ordered in decreasing order of their bandwidth-to-storage ratio. Underloaded arrays are ordered by spareBSR, which is the ratio of spare bandwidth to spare storage space. Volumes are chosen in BSR order and assigned to the arrays with the highest spareBSR. This ensures better utilization of the system bandwidth per unit storage space and leads to a tighter packing.

Randomized reassignment: The previous two approaches are cost-oblivious; they construct a new mapping of volumes to arrays without considering the current mapping. We can modify the bin packing approach to take the current mapping into account by incrementally modifying the current configuration until the hotspot is eliminated. The current configuration is taken as the initial configuration; bin packing is then used to reassign objects from overloaded to underloaded arrays until sufficient load "shifts" to remove the hotspot. Although this algorithm is cost-aware, it does not explicitly attempt to reduce data copying overhead, nor does it consider the possibility of swaps, where two volumes are exchanged. Thus, an entire set of possible solutions is ignored by the approach.

B. Displace and Swap

Displace and Swap (Dswap) is a greedy data migration technique that alleviates hotspots by (i) using the current configuration to incrementally determine a new load-balanced configuration, (ii) using the bandwidth-to-space ratio as a guiding metric to determine which volumes to move and where, and (iii) considering both volume moves and swaps as possible solutions for determining the new configuration. The key idea is to move one or more volumes from overloaded to underloaded arrays, or to swap heavily loaded volumes from overloaded arrays with less loaded volumes on underloaded arrays. BSR is used to guide the selection of volumes and to maximize the reduction in overload per unit of data moved (which in turn reduces data copying overhead).
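The two ratios that drive these greedy choices are straightforward; a small sketch (the attribute names are our assumptions):

    def bsr(volume):
        # Bandwidth-to-space ratio: load removed per unit of data moved.
        return volume.bandwidth / volume.size

    def spare_bsr(array):
        # Spare bandwidth per unit of spare storage on an underloaded array;
        # callers filter out arrays with no free space before ranking.
        return (array.tau - array.used_bw) / (array.capacity - array.used_space)

Source volumes are then visited in decreasing bsr order and candidate destinations in decreasing spare_bsr order, as described next.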
1) Displace: Our approach first considers a displace step, which attempts to move volumes from overloaded arrays to unused storage space on underloaded ones. All arrays that see a hotspot (i.e., a violation of the bandwidth constraint) are considered. Any underloaded array with non-zero unused storage space is a potential destination for overloaded volumes. Overloaded arrays are considered, one at a time, in decreasing order of overload. Within each overloaded array, volumes are considered for possible relocation in decreasing order of their BSR. This yields a set of logical volumes which, when displaced to an underloaded array, will reduce the total load on the array enough to make the hotspot disappear. By selecting volumes in decreasing BSR order, we ensure that we move the minimum amount of data necessary to eliminate the hotspot.

The possible destination arrays are considered in decreasing order of spare BSR (the spare bandwidth to spare storage space ratio) and selected so that neither the bandwidth nor the storage constraint presented earlier is violated. In the event that sufficient storage space is unavailable for a displace step, our technique must resort to volume swaps to alleviate the remaining hotspots.

2) Swap: As in displace, BSR is used as the metric to guide the selection of volumes to be swapped between two arrays. The key idea is to choose the highest-BSR volumes from the most overloaded array and attempt to swap them with the lowest-BSR volumes on the most underloaded array. Doing so yields the maximum reduction in load per unit of data moved and reduces data copying overheads. Choosing the most underloaded array as the destination increases the probability of finding valid swaps.

The swap step must determine sets of overloaded and underloaded volumes such that the following constraints are satisfied:

Constraint C1: There is at least a certain minimum amount of load reduction on the overloaded array.
Constraint C2: The swap does not violate the storage and bandwidth constraints on the underloaded array.
Constraint C3: The swap does not violate the storage constraint on the overloaded array.

If sets satisfying the above three constraints are found, the corresponding volumes are marked for a possible swap. The swap step repeats this process until the hotspot is completely removed from the overloaded array; it then moves on to the next overloaded array and repeats the process. Additional optimizations and the full details of the algorithm are described in [10].

3) Example: Figure 2 illustrates how displace and swap work. Figure 2(a) shows two arrays with bandwidth utilizations of 100% and 40%, respectively. Each box with a number indicates a volume, and an empty box indicates unallocated space. The number in a box indicates the bandwidth requirement of the volume. For simplicity, all volumes are assumed to be of unit size, so the bandwidth requirement of a volume is also its BSR. The bandwidth overload threshold τ is assumed to be 75% for both arrays. As Array 1 is overloaded, the displace and swap algorithm proceeds as follows.

The displace step is invoked first, as the underloaded array has one unit of spare space.
Displace: Figures 2(b) and (c) illustrate a volume being moved from Array 1 to Array 2. The volume selected is the one with the maximum BSR.
Since Array 1 is still overloaded after the displace step, the swap step is invoked.
Swap: Figures 2(d) and (e) illustrate a volume with BSR 10 being swapped with a volume with BSR 0. Thus, a high-BSR volume is swapped with a low-BSR volume.
At this point, the hotspot is eliminated and the algorithm terminates.

[Fig. 2. Illustration of Displace and Swap. Figure omitted; panels (a) through (e) show the arrays before and after each step.]
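A compact sketch of the displace step and the C1-C3 feasibility test, reconstructed from the description above (illustrative only, not the authors' implementation from [10]; the array attributes and the bsr/spare_bsr functions come from the earlier sketch):

    def displace(hot, candidates):
        """Move highest-BSR volumes off an overloaded array into spare space
        on underloaded arrays until its bandwidth constraint holds again."""
        # Destinations: non-zero free space only, highest spare BSR first.
        targets = sorted((a for a in candidates
                          if a.capacity - a.used_space > 0),
                         key=spare_bsr, reverse=True)
        moves = []
        for v in sorted(hot.volumes, key=bsr, reverse=True):
            if hot.used_bw <= hot.tau:
                break                                 # hotspot eliminated
            for t in targets:
                fits = v.size <= t.capacity - t.used_space
                keeps_bw = t.used_bw + v.bandwidth <= t.tau
                if fits and keeps_bw:                 # violate neither constraint
                    moves.append((v, hot, t))
                    hot.volumes.remove(v)
                    hot.used_bw -= v.bandwidth
                    hot.used_space -= v.size
                    t.volumes.append(v)
                    t.used_bw += v.bandwidth
                    t.used_space += v.size
                    break
        return moves

    def valid_swap(out_set, in_set, hot, cold, min_reduction):
        """Constraints C1-C3 for swapping out_set (high-BSR volumes on the
        overloaded array) with in_set (low-BSR volumes on the underloaded)."""
        dbw = sum(v.bandwidth for v in out_set) - sum(v.bandwidth for v in in_set)
        dsz = sum(v.size for v in out_set) - sum(v.size for v in in_set)
        c1 = dbw >= min_reduction                     # C1: enough load relief
        c2 = (cold.used_bw + dbw <= cold.tau and
              cold.used_space + dsz <= cold.capacity) # C2: cold array stays valid
        c3 = hot.used_space - dsz <= hot.capacity     # C3: hot array still fits
        return c1 and c2 and c3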
IV. AUTOMATED HOTSPOT DETECTION

We detect overload by continually monitoring the bandwidth utilization on each array, and we flag a hotspot if the bandwidth constraint is consistently violated on an array.

Load monitoring: Our technique uses the bandwidth utilization of an array as an indicator of its load. The bandwidth utilization is computed as the average utilization of the disks in the array. Since multiple volumes are typically mapped onto an array, each contributes a certain fraction of the total utilization, and the workload seen by each individual volume must be monitored to derive the total array utilization. Our monitoring module is implemented with hooks in the OS kernel which measure the mean request rate and request size for each volume. Knowing these, as well as the seek latency, rotational latency, and per-byte transfer time of the underlying disk, gives an indication of disk utilization. The module maintains a history of the utilizations observed on each array over a long period (e.g., a day or a week).

Hotspot detection: Given a time series (history) of the utilizations seen at each array, a hotspot is said to be present if the bandwidth constraint is violated for a certain percentage of the observations. The threshold used to flag a hotspot is a configurable parameter; small values aggressively alleviate hotspots, while larger values require an overload to persist over a longer period before data migration is triggered. Upon hotspot detection, our displace and swap technique is invoked to determine a new mapping of volumes to arrays, and data migration is triggered to actually move or swap volumes.
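The utilization estimate and the detection rule can be sketched as follows. The per-request service-time model and the default violation fraction are our assumptions; the paper does not give exact formulas.

    def disk_utilization(req_rate, mean_req_size, seek, rotation, per_byte):
        """Fraction of time the disk is busy: each request pays a seek,
        half a rotation on average, and a size-dependent transfer time."""
        service_time = seek + rotation / 2.0 + mean_req_size * per_byte
        return min(req_rate * service_time, 1.0)

    def is_hotspot(history, tau, violation_fraction=0.2):
        """Flag a hotspot when the bandwidth constraint is violated in at
        least a configurable fraction of the observed utilizations; smaller
        fractions alleviate hotspots more aggressively."""
        violations = sum(1 for u in history if u > tau)
        return violations >= violation_fraction * len(history)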
V. EXPERIMENTAL EVALUATION

A. Impact of System Size

We use simulations to evaluate the reconfiguration cost of our algorithm over a wide range of system configurations. The simulator allows us to study the impact of system size on reconfiguration overhead. We vary the system size (the number of arrays) from two to ten, with each array holding 20 disks, and determine the normalized cost of eliminating hotspots over 100 runs. Figure 3(a) depicts the data copying overheads for Random Reassignment and our Dswap approach, both of which are cost-aware, while Figure 3(b) depicts the cost for Randomized bin packing and BSR, both of which are cost-oblivious and reconfigure the system from scratch.

Figure 3(a) shows that Dswap outperforms Random Reassignment by factors of two to three. Moreover, the normalized reconfiguration cost for Dswap remains constant over a range of system sizes, while it increases for the latter. Since Dswap chooses volumes from overloaded arrays carefully based on BSR values, the normalized cost is not sensitive to system size (the data copying overheads increase in proportion to system size, resulting in a constant normalized cost). In Random Reassignment, however, increasing the system size increases the number of volumes to choose from, which also increases the likelihood of making sub-optimal decisions.

Figure 3(b) shows the cost of reconfiguration for the cost-oblivious approaches. Since both approaches reconfigure the system from scratch, the cost of reconfiguration is higher by more than an order of magnitude when compared to that of the cost-aware approaches. Since the probability of a volume being moved to a new array increases with system size, the normalized data copying overheads increase with system size.

[Fig. 3. Impact of system size: normalized reconfiguration cost versus number of arrays. (a) Cost-aware approaches. (b) Cost-oblivious approaches (BSR and Randomized bin packing). Plots omitted.]

B. Heterogeneous Volume Sizes

We have also implemented our techniques in the Linux kernel, version 2.6.11. Our prototype consists of kernel hooks that monitor the request sizes and request rates seen by each logical volume, and a reconfiguration daemon which uses the statistics collected in the kernel to estimate bandwidth requirements. The daemon computes a new configuration if a hotspot is detected and triggers volume migration.
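Tying the pieces together, the daemon's control loop might look as follows. This is a sketch under our assumptions: estimate_array_utilization, swap_step, and migrate are hypothetical helpers standing in for the kernel statistics, the swap search, and the actual data mover, while is_hotspot and displace reuse the earlier sketches.

    import time

    def reconfiguration_daemon(arrays, tau, poll_interval=60):
        """Periodically estimate utilization, detect hotspots against the
        history, and invoke displace-and-swap followed by data migration."""
        while True:
            for a in arrays:
                a.history.append(estimate_array_utilization(a))  # kernel hooks
            hot = [a for a in arrays if is_hotspot(a.history, tau)]
            cold = [a for a in arrays if a not in hot]
            for a in sorted(hot, key=lambda x: x.used_bw, reverse=True):
                moves = displace(a, cold)       # displace step first
                if a.used_bw > a.tau:           # still overloaded: try swaps
                    moves += swap_step(a, cold)
                migrate(moves)                  # actually copy the volumes
            time.sleep(poll_interval)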
We have used our prototype to evaluate systems with heterogeneous volume sizes. We consider two scenarios: one where unused storage space is present, and another where all arrays are full. For the scenario where no free space is available, we configure the first two arrays with six volumes each (two volumes each of size 4GB, 8GB, and 16GB). The other two arrays are configured with 14 volumes, all of size 4GB. Initially, the concurrency factor is set to two for all volume workloads. The request interarrival times are set to 300ms, 1s, 1s, and 500ms, respectively, for the four arrays. As shown in Figure 4, all arrays are initially underloaded. Further, the utilization estimated by our measurement technique closely tracks the imposed workload measured in IOPS.

At t = 100s, we impose an overload on array 0 by increasing the concurrency factor to seven. The workload on array 1 is also increased slightly, but not enough to cause a hotspot. At t = 200s, our prototype correctly identifies a hotspot on array 0 and invokes Dswap. The algorithm swaps two 4GB volumes from array 0 with two similarly sized volumes on array 3, which is underloaded. As can be seen, the technique swaps the smallest volumes from array 0, minimizing the copying overhead. Further, doing so successfully dissipates the hotspot, as indicated by the array utilization at t = 300s. The load on array 3 increases due to the swap, but the array still remains underloaded.

[Fig. 4. Variable volume size; no spare storage space. Plots of per-array utilization and IOPS over time omitted.]

C. Summary of Other Results

We have used our simulator and prototype to examine how our Dswap algorithm works under a variety of overload configurations besides those mentioned here. In addition to system size, we have examined the impact of both bandwidth and storage utilization on the cost of reconfiguration. In all scenarios, we find that our algorithm performs reconfigurations more cheaply, and it is able to successfully eliminate hotspots even in scenarios with very high storage utilization, where the other methods fail to find a configuration that eliminates the hotspot. Our results show that, for a variety of overload configurations, the Dswap approach outperforms the other approaches by a factor of two in terms of data copying overhead. Moreover, the larger the system size or the larger the overload, the greater the performance gap. Results from our prototype implementation show that our techniques correctly measure array utilizations and are effective at detecting hotspots and dissipating them, while imposing negligible overheads on the system. Our full results are available in [10].
VI. CONCLUSION

In this paper, we argued that manual hotspot detection and storage system reconfiguration are tedious and error-prone, and we advocated the design of a self-managing system to automate these tasks. We argued that existing data migration techniques do not minimize the data copying overhead incurred during a reconfiguration, which impacts application performance. We proposed a novel technique that automatically detects hotspots and uses the bandwidth-to-space ratio metric to reconfigure the system while minimizing the resulting data copying overhead. Evaluating our techniques in the Linux kernel and through simulation showed a factor of two reduction in data copying overhead compared to other approaches, without causing noticeable degradation in application performance.

VII. ACKNOWLEDGEMENTS

This research was supported, in part, by NSF grants EEC-0313747, CNS-0323597, and EIA-0080119. Thanks to our anonymous reviewers for their helpful comments on this paper.

REFERENCES

[1] EMC Symmetrix Optimizer. Available from http://www.emc.com/products/storage_management/symm_optimizer.jsp.
[2] E. Anderson, M. Hobbs, K. Keeton, S. Spence, M. Uysal, and A. Veitch. Hippodrome: Running circles around storage administration. In Proceedings of the Usenix Conference on File and Storage Technology (FAST'02), Monterey, CA, pages 175-188, January 2002.
[3] E. Borowsky, R. Golding, P. Jacobson, A. Merchant, L. Schreier, M. Spasojevic, and J. Wilkes. Capacity planning with phased workloads. In Proceedings of WOSP '98, Santa Fe, NM, October 1998.
[4] A. Dan and D. Sitaram. An online video placement policy based on bandwidth and space ratio. In Proceedings of SIGMOD, pages 376-385, May 1995.
[5] E. Anderson et al. An experimental study of data migration algorithms. In Proc. of WAE - International Workshop on Algorithm Engineering, LNCS, 2001.
[6] Y. Guo, Z. Ge, B. Urgaonkar, P. Shenoy, and D. Towsley. Dynamic cache reconfiguration strategies for a cluster-based streaming proxy. In Proceedings of the Eighth International Workshop on Web Content Caching and Distribution (WCW 2003), Hawthorne, NY, September 2003.
[7] J. Hall, J. Hartline, A. Karlin, J. Saia, and J. Wilkes. On algorithms for efficient data migration. In Proceedings of the ACM Symposium on Discrete Algorithms (SODA), 2001.
[8] S. Khuller, Y. Kim, and Y. Wan. Algorithms for data migration with cloning. In Proceedings of the ACM Symposium on Principles of Database Systems, pages 27-36, New York, NY, USA, 2003. ACM Press.
[9] D. Patterson, G. Gibson, and R. Katz. A case for redundant arrays of inexpensive disks (RAID). In Proceedings of ACM SIGMOD'88, pages 109-116, June 1988.
[10] V. Sundaram and P. Shenoy. Efficient data migration in self-managing storage systems. Technical Report TR06-21, Dept. of Computer Science, Univ. of Massachusetts, 2006.