Efficient Java RMI for Parallel Programming

JASON MAASSEN, ROB VAN NIEUWPOORT, RONALD VELDEMA,
HENRI BAL, THILO KIELMANN, CERIEL JACOBS, and RUTGER HOFMAN
Vrije Universiteit, Amsterdam

Java offers interesting opportunities for parallel computing. In particular, Java Remote Method Invocation (RMI) provides a flexible kind of remote procedure call (RPC) that supports polymorphism. Sun's RMI implementation achieves this kind of flexibility at the cost of a major runtime overhead. The goal of this article is to show that RMI can be implemented efficiently, while still supporting polymorphism and allowing interoperability with Java Virtual Machines (JVMs). We study a new approach for implementing RMI, using a compiler-based Java system called Manta. Manta uses a native (static) compiler instead of a just-in-time compiler. To implement RMI efficiently, Manta exploits compile-time type information for generating specialized serializers. Also, it uses an efficient RMI protocol and fast low-level communication protocols.

A difficult problem with this approach is how to support polymorphism and interoperability. One of the consequences of polymorphism is that an RMI implementation must be able to download remote classes into an application during runtime. Manta solves this problem by using a dynamic bytecode compiler, which is capable of compiling and linking bytecode into a running application. To allow interoperability with JVMs, Manta also implements the Sun RMI protocol (i.e., the standard RMI protocol), in addition to its own protocol.

We evaluate the performance of Manta using benchmarks and applications that run on a 32-node Myrinet cluster. The time for a null-RMI (without parameters or a return value) of Manta is 35 times lower than for the Sun JDK 1.2, and only slightly higher than for a C-based RPC protocol. This high performance is accomplished by pushing almost all of the runtime overhead of RMI to compile time. We study the performance differences between the Manta and the Sun RMI protocols in detail. The poor performance of the Sun RMI protocol is in part due to an inefficient implementation of the protocol. To allow a fair comparison, we compiled the applications and the Sun RMI protocol with the native Manta compiler. The results show that Manta's null-RMI latency is still eight times lower than for the compiled Sun RMI protocol and that Manta's efficient RMI protocol results in 1.8 to 3.4 times higher speedups for four out of six applications.

Categories and Subject Descriptors: D.1.3 [Programming Techniques]: Concurrent Programming—distributed programming, parallel programming; D.3.2 [Programming Languages]: Language Classifications—concurrent, distributed, and parallel languages; object-oriented languages; D.3.4 [Programming Languages]: Processors—compilers; run-time environments

General Terms: Languages, Performance

Additional Key Words and Phrases: Communication, performance, remote method invocation

Authors' address: Division of Mathematics and Computer Science, Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands.
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.
© 2001 ACM 0164-0925/01/1100-0747 $5.00

1. INTRODUCTION

There is a growing interest in using Java for high-performance parallel applications. Java's clean and type-safe object-oriented programming model and its support for concurrency make it an attractive environment for writing reliable, large-scale parallel programs. For shared memory machines, Java offers a familiar multithreading paradigm. For distributed memory machines, such as clusters of workstations, Java provides Remote Method Invocation (RMI), which is an object-oriented version of Remote Procedure Call (RPC). The RMI model offers many advantages for distributed programming, including a seamless integration with Java's object model, heterogeneity, and flexibility [Waldo 1998].

Unfortunately, many existing Java implementations have inferior performance of both sequential code and communication primitives, which is a serious disadvantage for high-performance computing. Much effort is being invested in improving sequential code performance by replacing the original bytecode interpretation scheme with just-in-time compilers, native compilers, and specialized hardware [Burke et al. 1999; Krall and Grafl 1997; Muller et al. 1997; Proebsting et al. 1997]. The communication overhead of RMI implementations, however, remains a major weakness. RMI is designed for client/server programming in distributed (Web-based) systems, where network latencies on the order of several milliseconds are typical. On more tightly coupled parallel machines, such latencies are unacceptable. On our Pentium Pro/Myrinet cluster, for example, Sun's JDK 1.2 implementation of RMI obtains a null-RMI latency (i.e., the roundtrip time of an RMI without parameters or a return value) of 1,316 μs, compared to 31 μs for a user-level Remote Procedure Call protocol in C.

Part of this large overhead is caused by inefficiencies in the JDK implementation of RMI, which is built on a hierarchy of stream classes that copy data and call virtual methods. Serialization of method arguments (i.e., converting them to arrays of bytes) is implemented by recursively inspecting object types until primitive types are reached, and then invoking the primitive serializers. All of this is performed at runtime for each remote invocation.
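
To make this cost concrete, the sketch below shows, in plain Java, what reflection-driven serialization of an object's fields looks like. It is only an illustration of the style of approach described above (the class and method names are ours, not the JDK's), handles only a few field types, and omits cycle and duplicate handling:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class ReflectiveSerializer {
    // Serialize an object by walking its fields reflectively, recursing into
    // reference fields until primitive values are reached.
    public static byte[] serialize(Object obj) throws IOException, IllegalAccessException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        writeObject(obj, out);
        out.flush();
        return bytes.toByteArray();
    }

    private static void writeObject(Object obj, DataOutputStream out)
            throws IOException, IllegalAccessException {
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                Class<?> t = f.getType();
                if (t == int.class) out.writeInt(f.getInt(obj));
                else if (t == long.class) out.writeLong(f.getLong(obj));
                else if (t == double.class) out.writeDouble(f.getDouble(obj));
                else if (t == boolean.class) out.writeBoolean(f.getBoolean(obj));
                else if (f.get(obj) instanceof String) out.writeUTF((String) f.get(obj));
                else if (!t.isPrimitive() && f.get(obj) != null) writeObject(f.get(obj), out);
                // other field types omitted for brevity
            }
        }
    }
}

All of this reflective inspection is repeated on every invocation; avoiding exactly this per-call work is what compile-time generated serializers are for.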

Besides inefficiencies in the JDK implementation of RMI, a second reason for the slowness of RMI is the difference between the RPC and RMI models. Java's RMI model is designed for flexibility and interoperability. Unlike RPC, it allows classes unknown at compile time to be exchanged between a client and a server and to be downloaded into a running program. In Java, an actual parameter object in an RMI can be of a subclass of the class of the method's formal parameter. In (polymorphic) object-oriented languages, the dynamic type of the parameter object (the subclass) should be used by the method, not the static type of the formal parameter. When the subclass is not yet known to the receiver, it has to be fetched from a file or HTTP server and be downloaded into the receiver. This high level of flexibility is the key distinction between RMI and RPC [Waldo 1998]. RPC systems simply use the static type of the formal parameter (thereby type-converting the actual parameter), and thus lack support for polymorphism and break the object-oriented model.
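
As a concrete, hypothetical illustration of this difference, consider the remote interface below: the formal parameter type is Task, but a caller may pass any subclass, such as SortTask, and the receiver must execute the subclass's method, downloading its bytecode first if the class is not yet known locally. The interface and class names are ours, chosen only for this example:

import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

class Task implements Serializable {
    public Object execute() { return null; }
}

// A subclass that the server may never have seen at compile time.
class SortTask extends Task {
    private final int[] data;
    SortTask(int[] data) { this.data = data; }
    @Override public Object execute() {
        java.util.Arrays.sort(data);
        return data;
    }
}

interface ComputeServer extends Remote {
    // The formal parameter has static type Task; an RMI may pass a SortTask,
    // and the dynamic type decides which execute() runs on the server.
    Object run(Task task) throws RemoteException;
}

An RPC system would marshal the argument according to the static type Task and lose the SortTask behavior; RMI ships the dynamic type instead, which is why class downloading is needed.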

The key problem is to obtain the efficiency of RPC and the flexibility of Java's RMI. This article discusses a compiler-based Java system, called Manta,¹ which was designed from scratch to efficiently implement RMI. Manta replaces Sun's runtime protocol processing as much as possible by compile-time analysis. Manta uses a native compiler to generate efficient sequential code and specialized serialization routines for serializable argument classes. Also, Manta sends type descriptors for argument classes only once per destination machine, instead of once for every RMI. In this way, almost all of the protocol overhead has been pushed to compile time, off the critical path. The problems with this approach are, however, how to interface with Java Virtual Machines (JVMs) and how to address dynamic class loading. Both are required to support interoperability and polymorphism. To interoperate with JVMs, Manta supports the Sun RMI and serialization protocol, in addition to its own protocol. Dynamic class loading is supported by compiling methods and generating serializers at runtime.

The general strategy of Manta is to make the frequent case fast. Since Manta is designed for parallel processing, we assume that the frequent case is communication between Manta processes, running, for example, on different nodes within a cluster. Manta supports the infrequent case (communication with JVMs) using a slower approach. Hence the Manta RMI system logically consists of two parts:

— A fast communication protocol that is used only between Manta processes. We call this protocol Manta RMI, to emphasize that it delivers the standard RMI programming model to the user; but it can only be used for communication between Manta processes.
— Additional software that makes the Manta RMI system as a whole compatible with standard RMI, so Manta processes can communicate with JVMs.

We refer to the combination of these two parts as the Manta RMI system. We use the term Sun RMI to refer to the standard RMI protocol as defined in the RMI specification [Sun Microsystems 1997]. Note that both Manta RMI and Sun RMI provide the same programming model, but their wire formats are incompatible.

The Manta RMI system thus combines high performance with the flexibility and interoperability of RMI. In a grid computing application [Foster and Kesselman 1998], for example, some clusters can run our Manta software and communicate internally using the Manta RMI protocol. Other machines may run JVMs, containing, for example, a graphical user interface program. Manta communicates with such machines using the Sun RMI protocol, allowing method invocations between Manta and JVMs. Manta implements almost all other functionality required by the RMI specification, including heterogeneity, multithreading, synchronized methods, and distributed garbage collection. Manta currently does not implement Java's security model, as the system is primarily intended for parallel cluster computing.

¹A fast, flexible, black-and-white tropical fish that can be found in the Indonesian archipelago.

The main contributions of this article are as follows.

— We show that RMI can be implemented efficiently and can obtain a performance close to that of RPC systems. The null-RMI latency of Manta RMI over Myrinet is 37 μs, only 6 μs slower than a C-based RPC protocol.
— We show that this high performance can be achieved while still supporting polymorphism and interoperability with JVMs by using dynamic bytecode compilation and multiple RMI protocols.
— We give a detailed performance comparison between the Manta and Sun RMI protocols, using benchmarks as well as a collection of six parallel applications. To allow a fair comparison, we compiled the applications and the Sun RMI protocol with the native Manta compiler. The results show that the Manta protocol results in 1.8 to 3.4 times higher speedups for four out of six applications.

The remainder of the article is structured as follows. Design and implementation of the Manta system are discussed in Section 2. In Section 3, we give a detailed analysis of the communication performance of our system. In Section 4, we discuss the performance of several parallel applications. In Section 5, we look at related work. Section 6 presents conclusions.

2. DESIGN AND IMPLEMENTATION OF MANTA

This section discusses the design and implementation of the Manta RMI system, which includes the Manta RMI protocol and the software extensions that make Manta compatible with Sun RMI.

2.1 Manta Structure

Since Manta is designed for high-performance parallel computing, it uses a native compiler rather than a JIT. The most important advantage of a native compiler is that it can perform more time-consuming optimizations, and therefore (potentially) generate better code.

The Manta system is illustrated in Figure 1. The box in the middle describes the structure of a Manta process, which contains the executable code for the application and (de)serialization routines, both of which are generated by Manta's native compiler. Manta processes can communicate with each other through the Manta RMI protocol, which has its own wire format. A Manta process can communicate with any JVM (the box on the right) through the Sun RMI protocol, using the standard RMI format (i.e., the format defined in Sun's RMI specification).

A Manta-to-Manta RMI is performed with the Manta protocol, which is described in detail in the next section. Manta-to-Manta communication is the common case for high-performance parallel programming, for which our system is optimized. Manta's serialization and deserialization protocols support heterogeneity (RMIs between machines with different byte-orderings or alignment properties).

Fig. 1. Manta/JVM interoperability.

A Manta-to-JVM RMI is performed with a slower protocol that is compatible with the RMI specification and the standard RMI wire format. Manta uses generic routines to (de)serialize the objects to or from the standard format. These routines use reflection, similar to Sun's implementation. The routines are written in C, as is all of Manta's runtime system, and execute more efficiently than Sun's implementation, which is partly written in Java.

To support polymorphism for RMIs between Manta and JVMs, a Manta application must be able to handle bytecode from other processes. When a Manta application requests bytecode from a remote process, Manta will invoke its bytecode compiler to generate the metaclasses, the (de)serialization routines, and the object code for the methods as if they were generated by the Manta source code compiler. Dynamic bytecode compilation is described in more detail in Section 2.4. The dynamically generated object code is linked into the application with the operating system's dynamic linking interface. If a remote process requests bytecode from a Manta application, the JVM bytecode loader retrieves the bytecode for the requested class in the usual way through a shared filesystem or through an HTTP daemon. Sun's javac compiler is used to generate the bytecode at compile time.

The structure of the Manta system is more complicated than that of a JVM. Much of the complexity of implementing Manta efficiently is due to the need to interface a system based on a native-code compiler with a bytecode-based system. The fast communication path in our system, however, is straightforward: the Manta protocol just calls the compiler-generated serialization routines and uses a simple scheme to communicate with other Manta processes. This fast communication path is described below.

2.2 Serialization and Communication

RMI systems can be split into three major components: low-level communication, the RMI protocol (stream management and method dispatch), and serialization. Below, we discuss how the Manta protocol implements each component.

Low-level communication. RMI implementations are typically built on top of TCP/IP, which was not designed for parallel processing. Manta uses the Panda communication library [Bal et al. 1998], which has efficient implementations on a variety of networks. Panda uses a scatter/gather interface to minimize the number of memory copies, resulting in high throughput.

Fig. 2. Structure of Sun and Manta RMI protocols; shaded layers run compiled code.

On Myrinet, Panda uses the LFC communication system [Bhoedjang et al. 2000], which provides reliable communication. LFC is a network interface protocol for Myrinet that is both efficient and provides the right functionality for parallel programming systems. LFC itself is implemented partly by embedded software that runs on the Myrinet Network Interface processor and partly by a library that runs on the host. To avoid the overhead of operating system calls, the Myrinet Network Interface is mapped into user space, so LFC and Panda run entirely in user space. The current LFC implementation does not offer protection, so the Myrinet network can be used by a single process only.

On Fast Ethernet, Panda is implemented on top of UDP, using a 2-way sliding window protocol to obtain reliable communication. The Ethernet network interface is managed by the kernel (in a protected way), but the Panda RPC protocol runs in user space.

The Panda RPC interface is based on an upcall model: conceptually, when a message arrives, a new thread of control is created that executes a handler for the message. The interface was designed to avoid thread switches in simple cases. Unlike active message handlers [von Eicken et al. 1992], upcall handlers in Panda are allowed to block to enter a critical section, but a handler is not allowed to wait for another message to arrive. This restriction allows the implementation to handle all messages using a single thread, so handlers that execute without blocking do not need any context switches.
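
Panda itself is written in C, but the upcall model can be paraphrased in Java as follows; this is only an analogy to clarify the restriction above, not Panda's actual interface:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

interface UpcallHandler {
    // Invoked on the dispatcher thread for every incoming message. The handler
    // may briefly enter a critical section, but must never wait for another
    // message to arrive, or the single dispatcher thread would deadlock.
    void handle(byte[] message);
}

class UpcallDispatcher implements Runnable {
    private final BlockingQueue<byte[]> incoming = new LinkedBlockingQueue<>();
    private final UpcallHandler handler;

    UpcallDispatcher(UpcallHandler handler) { this.handler = handler; }

    void messageArrived(byte[] message) { incoming.add(message); }

    @Override public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                handler.handle(incoming.take());  // one thread services all upcalls
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}

Because every handler runs on this one thread, handlers that do not block incur no thread creation and no context switch, which is the property the Manta RMI protocol exploits below.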

The RMI protocol. The runtime system for the Manta RMI protocol is written in C. It was designed to minimize serialization and dispatch overhead such as copying, buffer management, fragmentation, thread switching, and indirect method calls. Figure 2 gives an overview of the layers in the Manta RMI protocol and compares it with the layering of the Sun RMI system.
The shaded layers denote statically compiled code, while the white layers are mainly JIT-compiled Java (although they contain some native calls). Manta avoids the stream layers of Sun RMI. Instead, RMI parameters are serialized directly into an LFC buffer. Moreover, in the JDK, these stream layers are written in Java, and therefore their overhead depends on the quality of the Java implementation. In Manta, all layers are either implemented as compiled C code or compiler-generated native code. Also, the native code generated by the Manta compiler calls RMI serializers directly, instead of using the slow Java Native Interface. Heterogeneity between little-endian and big-endian machines is handled by sending data in the native byte order of the sender, and having the receiver do the conversion, if necessary.
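
The following minimal sketch shows this receiver-side conversion in Java terms, assuming for illustration that the sender prepends one flag byte identifying its native byte order (this framing is an assumption of the example, not Manta's actual wire format):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderExample {
    static final byte LITTLE = 0, BIG = 1;

    // Sender: write integers in its own native order, which is cheapest.
    static ByteBuffer send(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 * values.length)
                                   .order(ByteOrder.nativeOrder());
        buf.put(ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? LITTLE : BIG);
        for (int v : values) buf.putInt(v);
        buf.flip();
        return buf;
    }

    // Receiver: switch the buffer to the sender's order, converting only if needed.
    static int[] receive(ByteBuffer buf, int count) {
        ByteOrder senderOrder =
                buf.get() == LITTLE ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN;
        buf.order(senderOrder);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) values[i] = buf.getInt();
        return values;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(receive(send(new int[] {1, 2, 3}), 3)));
    }
}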

Another optimization in the Manta RMI protocol is avoiding thread-switching overhead at the receiving node. In the general case, an invocation is serviced at the receiving node by a newly allocated thread, which runs concurrently with the application threads. With this approach, however, the allocation of the new thread and the context switch to this thread will be on the critical path of the RMI. To reduce the allocation overhead, the Manta runtime system maintains a pool of preallocated threads, so the thread can be taken from this pool instead of being allocated. In addition, Manta avoids the context-switching overhead for simple cases. The Manta compiler determines whether a remote method may block. If the compiler can guarantee that a given method will never block, the receiver executes the method without doing a context switch to a separate thread. In this case, the current application thread will service the request and then continue. The compiler currently makes a conservative estimation, and only guarantees the nonblocking property for methods that do not call other methods and do not create objects (since that might invoke the garbage collector, which may cause the method to block). This analysis has to be conservative, since a deadlock situation might occur if an application thread services a method that blocks.
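
The dispatch decision can be summarized with the following sketch (in Java for readability; the actual Manta runtime system is written in C, and its thread pool is preallocated rather than created on demand):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InvocationDispatcher {
    private final ExecutorService threadPool = Executors.newCachedThreadPool();

    // 'mayBlock' corresponds to the flag the compiler derives for each remote method.
    void dispatch(Runnable unmarshalledCall, boolean mayBlock) {
        if (!mayBlock) {
            unmarshalledCall.run();               // serviced inline: no allocation, no context switch
        } else {
            threadPool.execute(unmarshalledCall); // hand off to a pooled thread that may block
        }
    }
}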

The Manta RMI protocol cooperates with the garbage collector to keep track of references across machine boundaries. Manta uses a local garbage collector based on a mark-and-sweep algorithm. Each machine runs this local collector, using a dedicated thread that is activated by the runtime system or the user. The distributed garbage collector is implemented on top of the local collectors, using a reference-counting mechanism for remote objects (distributed cycles remain undetected). If a Manta process communicates with a JVM, it uses the distributed garbage collection algorithm of the Sun RMI implementation, which is based on leasing.
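
For illustration only, the reference-counting bookkeeping described above can be sketched as follows (names and granularity are ours; the real bookkeeping is done inside Manta's C runtime system in cooperation with the local mark-and-sweep collector):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RemoteReferenceTable {
    private final Map<Long, Integer> refCounts = new ConcurrentHashMap<>();
    private final Map<Long, Object> exported = new ConcurrentHashMap<>();

    // A reference to a local object has been sent to another machine.
    void addRef(long objectId, Object obj) {
        exported.put(objectId, obj);                 // keeps the object reachable locally
        refCounts.merge(objectId, 1, Integer::sum);
    }

    // A remote machine reports that it dropped its reference.
    void release(long objectId) {
        Integer remaining = refCounts.merge(objectId, -1, Integer::sum);
        if (remaining != null && remaining <= 0) {
            refCounts.remove(objectId);
            exported.remove(objectId);               // local collector may now reclaim it
        }
    }
}

As the text notes, such a scheme cannot reclaim distributed cycles, which is why they remain undetected.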

The serialization protocol. The serialization of method arguments is an important source of overhead in existing RMI implementations. Serialization takes a Java object and converts (serializes) it into an array of bytes, making a deep copy that includes the referenced subobjects. The Sun serialization protocol is written in Java and uses reflection to determine the type of each object during runtime. The Sun RMI implementation uses the serialization protocol for converting data that are sent over the network. The process of serializing all arguments of a method is called marshalling.

With the Manta protocol, all serialization code is generated by the compiler, avoiding most of the overhead of reflection. Serialization code for most classes is generated at compile time. Only serialization code for classes which are not locally available is generated at runtime, by the bytecode compiler. The overhead of this runtime code generation is incurred only once—the first time the new class is used as an argument to some method invocation. For subsequent uses, the efficient serializer code is then available for reuse.

The Manta compiler also generates the marshalling code for methods. The compiler generates method-specific marshall and unmarshall functions, which (among other things) call the generated routines to serialize or deserialize all arguments of the method. For every method in the method table, two pointers are maintained to dispatch to the right marshaller or unmarshaller, depending on the dynamic type of the given object. A similar optimization is used for serialization: every object has two pointers in its method table to the serializer and deserializer for that object. When a particular object is to be serialized, the method pointer is extracted from the method table of the object's dynamic type and the serializer is invoked. On deserialization, the same procedure is applied.
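
In Java terms, the effect of these per-class serializer pointers can be approximated by giving every serializable class a compiler-generated method that is reached through the object's own method table, so one virtual call replaces runtime reflection. The interface and method names below are illustrative only:

import java.io.DataOutputStream;
import java.io.IOException;

interface GeneratedSerializable {
    // Plays the role of the serializer pointer in the method table.
    void mantaSerialize(DataOutputStream out) throws IOException;
}

class Point implements GeneratedSerializable {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    // A compiler-generated body: it knows the fields statically, no reflection needed.
    @Override public void mantaSerialize(DataOutputStream out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }
}

class SerializerDispatch {
    static void write(GeneratedSerializable obj, DataOutputStream out) throws IOException {
        obj.mantaSerialize(out);  // dispatches on the object's dynamic type
    }
}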

Manta's serialization protocol performs optimizations for simple objects. An array whose elements are of a primitive type is serialized by doing a direct memory copy into the LFC buffer, so the array need not be traversed, as is done by the JDK. In order to detect duplicate objects, the marshalling code uses a table containing objects that have already been serialized. If the method does not contain any parameters that are objects, however, the table is not built up, which again makes simple methods faster.
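
A minimal sketch of these two optimizations, again in Java terms with illustrative names: primitive arrays are copied into the message buffer in one bulk operation instead of being traversed element by element, and the duplicate-detection table is created lazily, so calls without object parameters never pay for it:

import java.nio.ByteBuffer;
import java.util.IdentityHashMap;
import java.util.Map;

class MarshalBuffer {
    private final ByteBuffer buf = ByteBuffer.allocate(64 * 1024);
    private Map<Object, Integer> written;  // allocated only if object parameters occur

    // Fast path for arrays of a primitive type: one bulk copy, no traversal.
    void writeIntArray(int[] a) {
        buf.putInt(a.length);
        buf.asIntBuffer().put(a);
        buf.position(buf.position() + 4 * a.length);
    }

    // Returns true if 'obj' was already written; then only a back-reference is sent.
    boolean writeHandleIfDuplicate(Object obj) {
        if (written == null) written = new IdentityHashMap<>();
        Integer handle = written.get(obj);
        if (handle != null) {
            buf.putInt(handle);
            return true;
        }
        written.put(obj, written.size());
        return false;
    }
}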

Another optimization concerns the type descriptors for the parameters of an RMI call. When a serialized object is sent over the network, a descriptor of its type must also be sent. The Sun RMI protocol sends a complete type descriptor for every class used in the remote method, including the name and package of the class, a version number, and a description of the fields in this class. All this information is sent for every RMI call; information about a class is only reused within a single RMI call. With the Manta RMI protocol, each machine sends the type descriptor only once to any other machine. The first time a type is sent to a certain machine, a type descriptor is sent and the type is given a new type-id that is specific to the receiver. When more objects of this type are sent to the same destination machine, the type-id is reused. When the destination machine receives a type descriptor, it checks if it already knows this type. If not, it loads it from the local disk or an HTTP server. Next, it inserts the type-id and a pointer to the metaclass in a table, for future references. This scheme thus ensures that type information is sent only once to each remote node.
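
The per-destination bookkeeping can be sketched as follows; the cache structure and the textual "wire" output are ours, chosen only to make the idea concrete:

import java.util.HashMap;
import java.util.Map;

class TypeDescriptorCache {
    // destination machine -> (class -> type-id assigned for that destination)
    private final Map<String, Map<Class<?>, Integer>> perDestination = new HashMap<>();

    // Returns the type-id for 'cls'; a full descriptor is emitted only the first
    // time the class is sent to this destination, afterwards only the small id.
    int typeIdFor(String destination, Class<?> cls, StringBuilder wire) {
        Map<Class<?>, Integer> known =
                perDestination.computeIfAbsent(destination, d -> new HashMap<>());
        Integer id = known.get(cls);
        if (id == null) {
            id = known.size();
            known.put(cls, id);
            wire.append("DESCRIPTOR ").append(id).append(' ')
                .append(cls.getName()).append('\n');  // name, version, and fields would go here
        }
        wire.append("TYPEID ").append(id).append('\n');
        return id;
    }
}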

2.3 Generated Marshalling Code

Figures 3, 4, and 5 illustrate the generated marshalling code. Consider the RemoteExample class in Figure 3. The square() method can be called from another machine, so the compiler generates marshalling and unmarshalling code for it.

import java.rmi.*;
import java.rmi.server.UnicastRemoteObject;

public class RemoteExample extends UnicastRemoteObject
        implements RemoteExampleInterface {
    int value;
    String name;

    public RemoteExample() throws RemoteException {
        // UnicastRemoteObject's constructor may throw RemoteException.
    }

    public synchronized int square(int i, String s1, String s2) throws RemoteException {
        value = i;
        name = s1 + s2;
        System.out.println("i = " + i);
        return i*i;
    }
}

Fig. 3. A simple remote class.

marshall__square(class__RemoteExample *this, int i, class__String *s1, class__String *s2) {
    MarshallStruct *m = allocMarshallStruct();
    ObjectTable = createObjectTable();

    writeHeader(m->outBuffer, this, OPCODE_CALL, CREATE_THREAD);
    writeInt(m->outBuffer, i);
    writeObject(m->outBuffer, s1, ObjectTable);
    writeObject(m->outBuffer, s2, ObjectTable);

    // Request message is created, now write it to the network.
    flushMessage(m->outBuffer);

    fillMessage(m->inBuffer);  // Receive reply.
    opcode = readInt(m->inBuffer);
    if (opcode == OPCODE_EXCEPTION) {
        class__Exception *exception = readObject(m->inBuffer, ObjectTable);
        freeMarshallStruct(m);
        THROW_EXCEPTION(exception);
    } else {
        result = readInt(m->inBuffer);
        freeMarshallStruct(m);
        RETURN(result);
    }
}

Fig. 4. The generated marshaller (pseudocode) for the square method.

unmarshall__square(class__RemoteExample *this, MarshallStruct *m) {
    ObjectTable = createObjectTable();

    int i = readInt(m->inBuffer);
    class__String *s1 = readObject(m->inBuffer, ObjectTable);
    class__String *s2 = readObject(m->inBuffer, ObjectTable);

    result = CALL_JAVA_FUNCTION(square, this, i, s1, s2, &exception);
    if (exception) {
        writeInt(m->outBuffer, OPCODE_EXCEPTION);
        writeObject(m->outBuffer, exception, ObjectTable);
    } else {
        writeInt(m->outBuffer, OPCODE_RESULT_CALL);
        writeInt(m->outBuffer, result);
    }

    // Reply message is created, now write it to the network.
    flushMessage(m->outBuffer);
}

Fig. 5. The generated unmarshaller (pseudocode) for the square method.

The generated marshaller for the square() method is shown in Figure 4 in pseudocode. Because square() has Strings as parameters (which are objects in Java), a table is built to detect duplicates. A special create thread flag is set in the header data structure because square potentially blocks: it contains a method call that may block (e.g., in a wait()) and it creates objects, which may trigger garbage collection and thus may also block. The writeObject calls serialize the string objects to the buffer. flushMessage does the actual writing out to the network buffer. The function fillMessage initiates reading the reply.

Pseudocode for the generated unmarshaller is shown in Figure 5. The header is already unpacked when this unmarshaller is called. Because the create thread flag in the header was set, this unmarshaller will run in a separate thread obtained from a thread pool. The marshaller itself does not know about this. Note that the this parameter is already unpacked and is a valid reference for the machine on which the unmarshaller will run.

2.4 Dynamic Bytecode Compilation

To support polymorphism, a Manta program must be able to handle classes that are exported by a JVM, but that have not been statically compiled into the Manta program. To accomplish this, the Manta RMI system contains a bytecode compiler to translate classes to object code at runtime. We describe this bytecode compiler below. Manta uses the standard dynamic linker to link the object code into the running application.

As with the JDK, the compiler reads the bytecode from a file or an HTTP server. Next, it generates a Manta metaclass with dummy function entries in its method table. Since the new class may reference or even subclass other unknown classes, the bytecode compiler is invoked recursively for all referenced unknown classes. Subsequently, the instruction stream for each bytecode method is compiled into a C function. For each method, the used stack space on the Virtual Machine stack is determined at compile time, and a local stack area is declared in the C function. Operations on local variables are compiled in a straightforward way. Virtual function calls and field references can be resolved from the running application, including the newly generated metaclasses. Jumps and exception blocks are implemented with labels, gotos, and nonlocal gotos (setjmp/longjmp). The resulting C file is compiled with the system C compiler, and linked into the running application with the system dynamic linker (called dlopen() in many Unix implementations). The dummy entries in the created metaclass method tables are resolved into function pointers in the dynamically loaded library.

One of the optimizations we implemented had a large impact on the speed of the generated code: keeping the method stack in registers. The trivial implementation of the method stack would be to maintain an array of N 32-bit words, where N is the size of the used stack area of the current method. Since bytecode verification requires that all stack offsets can be computed statically, it is, however, possible to replace the array with a series of N register variables, so the calls to increment or decrement the stack pointer are avoided and the C compiler can keep stack references in registers. A problem is that in the JVM, 64-bit variables are spread over two contiguous stack locations. We solve this by maintaining two parallel stacks, one for 32-bit and one for 64-bit words. Almost all bytecode instructions are typed, so they need to operate only on the relevant stack. Some infrequently used instructions (the dup2 family) copy either two 32-bit words or one 64-bit word, and therefore operate on both stacks. The memory waste of a duplicate stack is moderate, since the C compiler will remove any unreferenced local variables. With this optimization, the application speed of compiled bytecode is generally within 30% of compiled Manta code.

Fig. 6. Example of Manta's interoperability.

2.5 Example Application

Manta's RMI interoperability and dynamic class loading are useful to interoperate with software that runs on a JVM and uses the Sun RMI protocol. For example, consider a parallel program that generates output that must be visualized. The parallel program is compiled with Manta and uses the Manta RMI protocol. The software for the visualization system to be used, however, may run on the Sun JDK and use the Sun RMI protocol. To illustrate this type of interoperability, we implemented a simple example, using a graphical version of one of our parallel applications (successive overrelaxation; see Section 4).

The computation is performed by a parallel program that is compiled with Manta and runs on a cluster computer (see Figure 6). The output is visualized on the display of a workstation, using a graphical user interface (GUI) application written in Java. The parallel application repeatedly performs one iteration of the SOR algorithm and collects its data (a 2-dimensional array) at one node of the cluster, called the coordinator. The coordinator passes the array via the Sun RMI protocol to a remote viewer object, which is part of the GUI application on the workstation.

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket