(12) United States Patent
Hölzle et al.

(10) Patent No.: US 6,510,498 B1
(45) Date of Patent: *Jan. 21, 2003

(54) METHOD AND APPARATUS FOR MEMORY ALLOCATION IN A MULTI-THREADED VIRTUAL MACHINE

(75) Inventors: Urs Hölzle, Goleta, CA (US); Steffen Grarup, Palo Alto, CA (US)

(73) Assignee: Sun Microsystems, Inc., Palo Alto, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 09/724,737
(22) Filed: Nov. 28, 2000

Related U.S. Application Data

(63) Continuation of application No. 09/108,047, filed on Jun. 30, 1998, now Pat. No. 6,209,066.

(51) Int. Cl.: G06F 12/00
(52) U.S. Cl.: 711/153; 711/170; 711/173
(58) Field of Search: 711/153, 170, 173; 709/104, 105, 107, 108, 312; 712/206, 215, 245, 246

(56) References Cited

U.S. PATENT DOCUMENTS

5,247,634 A    9/1993  Cline et al.
5,535,361 A    7/1996  Hirata et al.
5,600,596 A    2/1997  Shirakihara
5,727,178 A *  3/1998  Pletcher et al. ............. 711/202
5,893,159 A *  4/1999  Schneider ................... 711/150

OTHER PUBLICATIONS

Robert H. Halstead, Jr., "Multilisp: A Language for Concurrent Symbolic Computation," Oct. 1985, ACM Transactions on Programming Languages and Systems, Vol. 7, No. 4.

* cited by examiner

Primary Examiner: Matthew Kim
Assistant Examiner: Pierre M. Vital
(74) Attorney, Agent, or Firm: Beyer Weaver & Thomas

(57) ABSTRACT

Methods and apparatus for the efficient allocation of shared memory in a multi-threaded computer system are disclosed. In accordance with one embodiment of the present invention, a computer-implemented method for allocating memory shared by multiple threads in a multi-threaded computing system includes partitioning the shared memory into a plurality of blocks, and grouping the multiple threads into at least a first group and a second group. A selected block is allocated to a selected thread which may attempt to allocate an object in the selected block. The allocation of the selected block to the selected thread is based at least partially upon whether the selected thread is a part of the first group or the second group. In one embodiment, grouping the multiple threads into the first group and the second group includes identifying a particular thread and determining whether the particular thread is a fast allocating thread. In such an embodiment, when the particular thread is fast allocating, the particular thread is grouped into the first group.

19 Claims, 14 Drawing Sheets
[Representative drawing: flowchart of a memory allocation process. Legible steps: construct allocation area by allocating memory blocks (432); take first block from allocation area and assign shared block for all threads; run overall system; when a block has overflowed, obtain next available block from allocation area (440); when no new block is available (442), perform garbage collection (456), determine which threads to consider fast allocating (458), assign one new block to each fast allocating thread, and assign one shared block to all other threads (462); otherwise assign the new block to the thread that overflowed the block, or replace the full shared block with a new block (450).]
Petitioners Microsoft Corporation and HP Inc. - Ex. 1017, p. 1
U.S. Patent    Jan. 21, 2003    Sheet 1 of 14    US 6,510,498 B1

Figure 1a (Prior Art)

Figure 1b

Figure 2a (Prior Art)
Sheet 2 of 14

Figure 2b (Prior Art)

Figure 3
Sheet 3 of 14

Figure 4: flowchart of a first memory allocation process. Legible steps: construct allocation area by allocating memory blocks (402); take first block from allocation area and assign shared block for all threads; run overall system; when a block has overflowed and a new block is available (416), assign the new block to the thread that overflowed the block; when another new block is available (414), replace the full shared block with the new block (420).
Sheet 4 of 14

Figure 5a: flowchart of a second memory allocation process. Legible steps: construct allocation area by allocating memory blocks (432); take first block from allocation area and assign shared block for all threads (434); run overall system (436); on overflow (447), assign a new block to the thread that overflowed the block when one is available, or replace the full shared block with a new block (450) when another new block is available (444); otherwise determine which threads to consider fast allocating and assign one new block to each fast allocating thread.
Sheet 5 of 14

Figure 5b: flowchart of the determination of which threads are fast allocating (step 458). Legible steps: for each thread up to the total number of threads, if the thread uses the shared pool, or if the memory allocated by the thread in the last garbage collection interval is below a threshold, set the allocation routine of the thread to locking; otherwise set the allocation routine of the thread to non-locking.
Sheet 6 of 14

Figure 6: flowchart of a third memory allocation process. Legible steps: construct allocation area by allocating memory blocks (602); take first block from allocation area and assign shared block for all threads (604); run overall system (606); obtain next available block from allocation area (610); when no new block is available, perform garbage collection (616) and reset the overflow counters of all threads (617); otherwise increment the overflow counter of the overflowing thread (618); when the counter exceeds a threshold (620), assign the new block to the thread that overflowed the block; otherwise replace the full shared block with the new shared block (622).
Sheet 7 of 14

Figure 7: flowchart of a fourth memory allocation process. Legible steps: construct allocation area by allocating memory blocks (702); take first block from allocation area and assign shared block for all threads (704); increment the overflow counter of the overflowing thread; when the counter exceeds a threshold, assign a new block to the thread that overflowed the shared block; when no other new block is available, determine which threads to consider fast allocating and assign one new block to each fast allocating thread.
Sheet 8 of 14

Figure 8: diagrammatic representation of multiple threads and a shared memory allocation area in accordance with a second embodiment.
Sheet 9 of 14

Figure 9: flowchart of a first memory allocation process of the second embodiment. Legible steps: construct allocation area by allocating small and large memory blocks (802); allocate a small memory block to every thread; run overall system; when a new large block is available, assign it to the thread that overflowed its block (814); otherwise perform garbage collection (816).
Sheet 10 of 14

Figure 10a: flowchart of a second memory allocation process of the second embodiment. Legible steps: construct allocation area by allocating small and large memory blocks (902); allocate a small block to every thread (904); run overall system (906); obtain next available large block from allocation area (910); when a new large block is available (912), assign a new large private block to the thread that overflowed its block (914); otherwise perform garbage collection (916), determine which threads to consider fast allocating (918), assign one new large block to each fast allocating thread (920), and assign one small block to each of the other threads (922).
Sheet 11 of 14

Figure 10b: flowchart of the determination of which threads are fast allocating (step 918). Legible steps: for each thread up to the total number of threads, if the thread had a small block and the memory allocated by the thread in the last garbage collection interval meets a threshold, mark the thread as fast allocating; otherwise mark the thread as slow allocating (942).
Sheet 12 of 14

Figure 11: flowchart of a third memory allocation process of the second embodiment. Legible steps: construct allocation area by allocating small and large memory blocks (952); allocate a small block to each thread (954); run overall system (956); assign a new small block to the thread that overflowed its block (966); obtain next available large block from allocation area (972); when a new large block is available (974), increment the overflow counter of the overflowing thread and, when the counter exceeds a threshold, assign the new large block to the thread that overflowed its block (976); when no new large block is available, perform garbage collection.
Sheet 13 of 14

Figure 12: a typical, general-purpose computer system 1030, including processors, primary storage, RAM, secondary storage, and a network connection (reference numerals 1032, 1034, 1036, 1040).
Sheet 14 of 14

Figure 13: a virtual machine supported by the computer system of FIG. 12, showing a compile-time environment with source code and a compiler, and a runtime environment.
METHOD AND APPARATUS FOR MEMORY ALLOCATION IN A MULTI-THREADED VIRTUAL MACHINE
This is a Continuation application of prior application Ser. No. 09/108,047 filed on Jun. 30, 1998, now U.S. Pat. No. 6,209,066, the disclosure of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to memory allocation in computer systems. More particularly, the present invention relates to efficient, low-overhead memory allocation in multi-threaded, object-based computer systems.

2. Description of the Related Art

As the use of virtual machines in computer technology increases, improving the overall efficiency of a virtual machine is becoming more important. The amount of memory associated with a computer system that includes a virtual machine is typically limited. As such, memory must generally be conserved and recycled. Many computer programming languages enable software developers to dynamically allocate memory within a computer system, while other programming languages require explicit manual deallocation of previously allocated memory, which deallocation may be complicated and prone to error. Languages that require explicit manual memory management include the C and C++ programming languages. Other programming languages utilize automatic storage-reclamation to reclaim memory that is no longer necessary to ensure the proper operation of computer programs that allocate memory from the reclamation system. Such automatic storage-reclamation systems reclaim memory without explicit instructions or calls from computer programs which were previously utilizing the memory.

In object-oriented or object-based systems, the typical unit of memory allocation is commonly referred to as an object or a memory object, as will be appreciated by those skilled in the art. Objects that are in use are generally referred to as "live" objects, whereas objects that are no longer needed to correctly execute computer programs are typically referred to as "garbage" objects. The act of reclaiming garbage objects is commonly referred to as garbage collection, and an automatic storage-reclamation system is often referred to as a garbage collector. Computer programs written in languages such as the Java programming language (developed by Sun Microsystems, Inc.) and the Smalltalk programming language use garbage collection to automatically manage memory.

The use of a compacting garbage collector generally allows objects to be allocated relatively quickly. That is, one advantage of using a compacting garbage collector is fast allocation of objects. Objects may be allocated in a contiguous memory area, e.g., an allocation area, such that the allocation of the objects may be performed by incrementing an allocation pointer by the desired amount of storage. When the end of the allocation area has been reached, a garbage collection may be performed.

One garbage collection method is a generational garbage collection method. A generational garbage collection method is a method in which objects are separated based upon their lifetimes as measured from the time the objects were created. Generational garbage collection is described in more detail in Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons Ltd., 1996), which is incorporated herein by reference in its entirety. "Younger" objects have been observed as being more likely to become garbage than "older" objects. As such, generational garbage collection may be used to increase the overall efficiency of memory reclamation.

In a system that uses generational garbage collection, a special memory area is designated for the allocation of new objects. Such a memory area is generally considered to be a "nursery," as new objects are allocated within the memory area. As will be appreciated by those skilled in the art, the memory area is often referred to as "Eden."

FIG. 1a is a diagrammatic representation of a single thread and a memory allocation area that is dedicated to the single thread. Such a memory allocation area is suitable for implementation within a single-threaded system that uses generational garbage collection. As shown, a memory allocation area 102, which may be known as Eden, is indexed by an allocation pointer 104. In general, Eden 102 is a block of memory in which new objects may be created. When a thread 106, which is associated with Eden 102, attempts to allocate a new object, allocation pointer 104 is typically incremented by the size of the new object, and a check is made to determine if allocation pointer 104 has reached the end of Eden 102. When it is determined that the end of Eden 102 has been reached, a generational garbage collection may be performed to effectively empty Eden 102, thereby allowing new objects to be created by thread 106 within Eden 102.

While the allocation of memory and, hence, new objects, as described with respect to FIG. 1a is effective in a single-threaded system, such an allocation of memory and objects generally may not be used in a multi-threaded system with multiple central processing units (CPUs). By way of example, when two threads concurrently attempt to request space in a single Eden, concurrency problems may arise. As such, in a multi-threaded system, when Eden is a shared resource, access to Eden must generally be synchronized in order to prevent more than one thread from allocating in Eden at any given time. Synchronizing access to Eden may involve associating an allocation lock with Eden that is obtained by a thread when the thread wishes to create a new object, and released by the thread after the new object has been created.

FIG. 1b is a diagrammatic representation of two threads and a memory allocation area shared by the two threads within an overall multi-threaded system. An Eden 112 has an associated allocation pointer 114 which is arranged to indicate the beginning of an unused portion 115 of Eden 112. When threads 116 and 118, which share Eden 112, wish to allocate a new object in Eden 112, they must generally obtain the allocation lock (not shown) associated with Eden 112. Specifically, if thread 116 wishes to access unused portion 115, thread 116 must obtain the allocation lock on Eden 112. Once thread 116 obtains the allocation lock, and it is determined that the end of Eden 112 has not been reached, allocation pointer 114 may be incremented, and a new object may be allocated by thread 116. If the end of Eden 112 has been reached, i.e., when unused portion 115 is null, a garbage collection may be performed to effectively empty Eden 112, thereby allowing new objects to be created by threads 116 and 118.

When access to Eden is synchronized, the allocation of new objects within Eden is typically slowed considerably due to the overhead associated with the acquisition and release of the allocation lock associated with Eden. Each time a thread wishes to create a new object in Eden, the
thread must acquire exclusive rights to Eden, as for example by acquiring an allocation lock. In general, even so-called "fast" locking primitives which are directly implemented by hardware, e.g., a compare-and-swap primitive, may be relatively slow when compared to the base costs associated with allocation. For instance, on a multiprocessor system, a locking primitive may incur a remote cache miss, as will be appreciated by those skilled in the art. In such a system, adding synchronization features often significantly increases the cost of allocation, e.g., by a factor of two or three. Hence, adding synchronization during allocation greatly affects the performance of the overall system.
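The synchronized bump-pointer allocation described above can be sketched as follows. This is a minimal illustration of the general technique, not the patent's implementation; the class name, the sizes, and the use of a mutex rather than a hardware compare-and-swap primitive are assumptions made for the example.

```python
import threading

class Eden:
    """Sketch of a shared allocation area guarded by an allocation lock,
    in the manner of FIG. 1b. Offsets stand in for real addresses."""

    def __init__(self, size):
        self.size = size                  # total bytes in the allocation area
        self.pointer = 0                  # allocation pointer (start of unused portion)
        self.lock = threading.Lock()      # allocation lock serializing all threads

    def allocate(self, nbytes):
        # Every allocation pays for lock acquisition and release, which is
        # the per-object overhead discussed above.
        with self.lock:
            if self.pointer + nbytes > self.size:
                return None               # end of Eden reached: caller would
                                          # trigger a garbage collection
            offset = self.pointer
            self.pointer += nbytes        # bump the allocation pointer
            return offset

eden = Eden(1024)
a = eden.allocate(100)   # first object at offset 0
b = eden.allocate(200)   # next object placed immediately after
```

Even though the critical section is tiny, the lock traffic itself (or an equivalent compare-and-swap) is what multiplies the base allocation cost on a multiprocessor.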
In order to improve performance associated with accessing Eden in a multi-threaded system by avoiding synchronization, each thread in the multi-threaded system may be assigned its own Eden. That is, when each thread has its own Eden, concurrency problems that may arise when more than one thread attempts to access a shared Eden may be avoided. FIG. 2a is a diagrammatic representation of two threads with their own associated Edens, or memory allocation areas. Within a multi-threaded system 200, a first Eden 202, which is referenced by an allocation pointer 204, is associated with a first thread 206. Multi-threaded system 200 also includes a second Eden 212 that is referenced by an allocation pointer 214, and is associated with a second thread 216.

When first thread 206 wishes to allocate a new object, first thread 206 accesses first Eden 202. Similarly, when second thread 216 wishes to allocate a new object, second thread 216 accesses second Eden 212. As each thread 206, 216 has its own exclusive Eden, namely Edens 202 and 212, respectively, no allocation locks are needed to safeguard against two threads attempting to access a single Eden in order to create a new object at any given time.
Although allocating a separate Eden to each thread in a multi-threaded system may eliminate the need for allocation locks, allocating separate Edens often requires a substantial amount of memory. For example, some applications may contain hundreds or even thousands of threads. In addition, some threads may allocate objects at a faster speed than others and, hence, will generally require more memory. The requirement for more memory may lead to frequent garbage collections, performed over all memory, e.g., global garbage collections performed on all Edens, which would require some form of synchronization. As such, overall overhead associated with performing garbage collections on multiple Edens may increase and adversely affect the performance of the overall system, since some Edens may still be relatively empty while others are filled to capacity.

The use of a substantial amount of memory, as well as the increase in the overall overhead associated with garbage collection, that is associated with allocating a separate Eden to each thread in a multi-threaded system may be inefficient and expensive. Reducing the amount of memory used, as well as the frequency of garbage collection, increases the efficiency and generally decreases the costs associated with a multi-threaded system. Dividing an Eden into chunks, or blocks, typically allows an Eden to be shared without requiring allocation locks. The general division of Eden into chunks is described in "Multilisp: A Language for Concurrent Symbolic Computation" by R. Halstead, Jr. (ACM Transactions on Programming Languages and Systems, 7(4):501-538, October 1985), which is incorporated herein by reference in its entirety. FIG. 2b is a diagrammatic representation of two threads and a memory allocation area shared by the two threads in which the memory allocation area is divided into chunks. A multi-threaded system 230 includes an Eden 232 that is divided into chunks 233 which are of a consistent size. In other words, all chunks 233 are approximately the same size. Each thread 236, 238 which shares Eden 232 is allocated an initial chunk. By way of example, thread 236 is initially allocated chunk 233a, while thread 238 is initially allocated chunk 233b.

When a thread, e.g., thread 236, fills its chunk 233a, thread 236 is allocated another chunk 233c. Threads continue to be allocated chunks 233 until no chunks 233 are available, at which time a garbage collection may be performed. It should be appreciated that although the requests for chunks 233 are synchronized, the synchronization generally does not occur as frequently as the allocation synchronization that was previously mentioned.
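The chunked Eden just described can be sketched as follows. This is an illustrative sketch of the general Multilisp-style scheme, not the patent's code; the class names and sizes are invented for the example. The fast path bumps an offset within the thread's own chunk with no locking; only the request for a fresh chunk is synchronized.

```python
import threading

class ChunkedEden:
    """Sketch of an Eden divided into equal-size chunks that are handed
    out to threads on demand. Chunk indices stand in for real addresses."""

    def __init__(self, num_chunks, chunk_size):
        self.chunk_size = chunk_size
        self.free = list(range(num_chunks))   # indices of unassigned chunks
        self.lock = threading.Lock()          # taken only when a chunk is requested

    def request_chunk(self):
        with self.lock:                       # infrequent synchronization
            return self.free.pop(0) if self.free else None

class ChunkAllocator:
    """Per-thread allocator holding the thread's current private chunk."""

    def __init__(self, eden):
        self.eden = eden
        self.chunk = eden.request_chunk()     # initial chunk for the thread
        self.offset = 0

    def allocate(self, nbytes):
        if self.chunk is None or self.offset + nbytes > self.eden.chunk_size:
            self.chunk = self.eden.request_chunk()   # slow path: fresh chunk
            self.offset = 0
            if self.chunk is None:
                return None                   # no chunks left: garbage collect
        addr = (self.chunk, self.offset)      # lock-free bump in own chunk
        self.offset += nbytes
        return addr
```

Note that when an object does not fit in the thread's partially full chunk, the allocator simply abandons the remaining space and takes a fresh chunk, which is the source of the fragmentation the passage goes on to describe.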
Allocating chunks 233 to threads 236, 238 often results in substantial fragmentation, as each chunk 233 must generally be sized to hold a large object. Hence, when a chunk is partially full, and a large object created by a thread does not fit in the partially full chunk, a new chunk will be allocated to the thread to accommodate the large object. The space left in the partially full chunk is then effectively wasted. In addition, the allocation of space in the chunks may be inefficient when threads which are slow allocating are in possession of virtually empty chunks, thereby reserving memory space which may never be needed.

Therefore, what is desired is a method and an apparatus for efficiently allocating memory in a multi-threaded system such as a multi-threaded virtual machine. Specifically, what is needed is a method and an apparatus for allowing threads to create new objects in a memory allocation area, e.g., an Eden, while minimizing memory space, minimizing allocation costs, and improving the efficiency of garbage collection.
SUMMARY OF THE INVENTION

The present invention relates to the efficient allocation of shared memory in a multi-threaded computer system. In accordance with one embodiment of the present invention, a computer-implemented method for allocating memory shared by multiple threads in a multi-threaded computing system includes partitioning the shared memory into a plurality of blocks, and grouping the multiple threads into at least a first group and a second group. A selected block is allocated to a selected thread which may attempt to allocate an object in the selected block. The allocation of the selected block to the selected thread is based at least partially upon whether the selected thread is a part of the first group or the second group. In one embodiment, grouping the multiple threads into the first group and the second group includes identifying a particular thread and determining whether the particular thread is a fast allocating thread. In such an embodiment, when the particular thread is fast allocating, the particular thread is grouped into the first group.
According to another aspect of the present invention, a computer-implemented method for allocating shared memory in a multi-threaded computing system which includes at least a first thread and a second thread involves partitioning the shared memory into a plurality of blocks, and assigning a first block that is accessible to both the first thread and the second thread for the creation of new objects. After the system is allowed to run, a determination is effectively made as to whether the first block has overflowed. If it is determined that the first block has overflowed, the method includes determining whether an attempt by the first thread to allocate the first object in the first block caused the first block to overflow. If such is the case, a second block
is assigned to the first thread. Assignment of the second block to the first thread is arranged to cause the first thread to effectively relinquish the ability to allocate objects in the first block. In one embodiment, the second thread does not have the ability to allocate objects in the second block.

In another embodiment, the method also includes determining when one of the first block and the second block has overflowed, as well as assigning a third block to the first thread when it is determined that the second block overflowed, or assigning the third block to the second thread when it is determined that the first block overflowed. In such an embodiment, when it is determined that the first block overflowed, a fourth block may replace the first block.

According to still another aspect of the present invention, a computer-implemented method for allocating memory in a multi-threaded computing system includes partitioning the memory into a plurality of blocks which includes a first block and a second block that is substantially larger than the first block. The first block is assigned to be accessible to a first thread which is arranged to attempt to allocate a first object in the first block, and the second block is assigned to be accessible to the second thread in order for the second thread to attempt to allocate a second object in the second block. In one embodiment, the first block has a size in the range of approximately 1 kilobyte to approximately 4 kilobytes, and the second block has a size in the range of approximately 16 kilobytes to approximately 32 kilobytes.
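The mixed block sizes of this aspect can be illustrated with a simple partitioning helper. The specific sizes (2 kB and 24 kB) and the split between small and large blocks are illustrative assumptions chosen within the approximate ranges given above, not values from the patent.

```python
def partition_allocation_area(total_bytes, small_size=2048, large_size=24576,
                              fraction_large=0.75):
    """Sketch of partitioning shared memory into small blocks (~1-4 kB)
    and substantially larger blocks (~16-32 kB). Returns the number of
    small and large blocks the area is divided into."""
    large_bytes = int(total_bytes * fraction_large)   # portion reserved for large blocks
    num_large = large_bytes // large_size
    num_small = (total_bytes - num_large * large_size) // small_size
    return num_small, num_large
```

For a 1 MB area with these assumed parameters, the helper yields 32 large blocks and 128 small blocks, with the small blocks intended for threads that allocate little and the large blocks for threads that allocate heavily.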
The present invention will be more readily understood upon reading the following detailed descriptions and studying the various figures of the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention may be understood by reference to the following description taken in conjunction with the accompanying drawings in which:

FIG. 1a is a diagrammatic representation of a thread and a memory allocation area.

FIG. 1b is a diagrammatic representation of two threads and a memory allocation area shared by the two threads.

FIG. 2a is a diagrammatic representation of two threads with their associated memory allocation areas.

FIG. 2b is a diagrammatic representation of two threads and a memory allocation area shared by the two threads in which the memory allocation area is divided into chunks.

FIG. 3 is a diagrammatic representation of multiple threads and a memory allocation area shared by the multiple threads in accordance with a first embodiment of the present invention.

FIG. 4 is a process flow diagram which illustrates the steps associated with a first process of allocating memory in accordance with the first embodiment of the present invention.

FIG. 5a is a process flow diagram which illustrates the steps associated with a second process of allocating memory in accordance with the first embodiment of the present invention.

FIG. 5b is a process flow diagram which illustrates the steps associated with a determination of which threads are considered to be fast allocating threads, i.e., step 458 of FIG. 5a, in accordance with the first embodiment of the present invention.

FIG. 6 is a process flow diagram which illustrates the steps associated with a third process of allocating memory in accordance with the first embodiment of the present invention.
FIG. 7 is a process flow diagram which illustrates the steps associated with a fourth process of allocating memory in accordance with the first embodiment of the present invention.

FIG. 8 is a diagrammatic representation of multiple threads and a memory allocation area shared by the multiple threads in accordance with a second embodiment of the present invention.

FIG. 9 is a process flow diagram which illustrates the steps associated with a first process of allocating memory in accordance with the second embodiment of the present invention.

FIG. 10a is a process flow diagram which illustrates the steps associated with a second process of allocating memory in accordance with the second embodiment of the present invention.

FIG. 10b is a process flow diagram which illustrates the steps associated with a determination of which threads are considered to be fast allocating threads, i.e., step 918 of FIG. 10a, in accordance with the second embodiment of the present invention.

FIG. 11 is a process flow diagram which illustrates the steps associated with a third process of allocating memory in accordance with the second embodiment of the present invention.

FIG. 12 illustrates a typical, general-purpose computer system suitable for implementing the present invention.

FIG. 13 is a diagrammatic representation of a virtual machine which is supported by computer system 1030 of FIG. 12, and is suitable for implementing the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS

The overhead associated with allocating shared memory, e.g., an "Eden," in a multi-threaded system is often significant. Allocating a separate Eden to each thread in a multi-threaded system tends to eliminate the need for allocation locks associated with synchronization. However, allocating separate Edens often requires a substantial amount of memory, and may lead to more frequent garbage collections, thereby potentially adversely affecting the performance of the overall system.

An Eden that is shared by multiple threads may be divided into equal chunks, or blocks, such that each thread has its own block. By allowing each thread to have its own block, an Eden may be shared without requiring allocation locks. However, dividing Eden into equal chunks and allowing each thread to have its own block often results in substantial fragmentation. For example, when a chunk is partially full and a large object created by a thread does not fit in the partially full chunk, a new chunk will be allocated to the thread to accommodate the large object. The space left in the partially full chunk is then effectively wasted. In addition, the allocation of space in the chunks may be inefficient when threads that rarely allocate objects are in possession of virtually empty chunks, thereby reserving memory space which may never be needed. When threads reserve memory space that may not be needed, the space is effectively taken away from threads which may need the memory space. Further, more frequent garbage collections, which involve substantial overhead, are likely to occur in order to free memory for use by threads which need additional memory space.
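The division of threads into fast allocating and slow allocating groups, determined from each thread's allocation volume in the last garbage collection interval (see FIGS. 5b and 10b), can be sketched as follows. The threshold value and the dictionary-based bookkeeping are illustrative assumptions, not details taken from the patent.

```python
def group_threads(bytes_allocated_last_gc, threshold=64 * 1024):
    """Sketch of the grouping policy: threads whose allocation volume in
    the last garbage-collection interval meets the threshold are treated
    as fast allocating and would receive private blocks; the remaining
    threads would share a block with a locking allocation routine."""
    fast, slow = [], []
    for thread_id, nbytes in bytes_allocated_last_gc.items():
        (fast if nbytes >= threshold else slow).append(thread_id)
    return fast, slow
```

Because the measurement is repeated at every garbage collection, a thread can migrate between groups as its allocation behavior changes over time.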
By allowing multiple threads which rarely allocate objects to share chunks or blocks of a shared memory allocation area, while providing threads which frequently allocate objects with "private," or unshared, memory blocks, more memory space is effectively provided to substantially only the threads that need more memory. Hence, more memory space is likely to be filled before garbage collections are performed. In addition, the frequency of garbage collection may also be reduced. Although synchronization is used when slow allocating
