Morris, III

US005915124A
[11] Patent Number: 5,915,124
[45] Date of Patent: Jun. 22, 1999
`
`[54] METHOD AND APPARATUS FOR A FIRST
`DEVICE ACCESSING COMPUTER MEMORY
`AND A SECOND DEVICE DETECTING THE
`ACCESS AND RESPONDING BY
`PERFORMING SEQUENCE OF ACTIONS
`
[75] Inventor: George Lockhart Morris, III,
Escondido, Calif.

[73] Assignee: NCR Corporation, Dayton, Ohio

[21] Appl. No.: 08/778,938

[22] Filed: Jan. 3, 1997
`
[51] Int. Cl.6 ......... G06F 13/00
[52] U.S. Cl. .......... 395/829; 395/826; 395/836;
                         395/856; 711/146
[58] Field of Search ... 370/218; 395/520,
                         395/872, 828, 847, 836, 842, 829, 473;
                         707/201; 364/578, 528.21

[56] References Cited

U.S. PATENT DOCUMENTS

4,725,971    2/1988  Doshi et al. ........ 364/578
4,807,224    2/1989  Naron et al. ........ 370/218
5,170,470   12/1992  Pindar et al. ....... 395/828
5,247,650    9/1993  Judd et al. ......... 395/500
5,426,737    6/1995  Jacobs .............. 395/847
5,479,654   12/1995  Squibb .............. 707/201
5,649,230    7/1997  Lentz ............... 395/872
5,710,712    1/1998  Labun ............... 364/528.21
5,732,283    3/1998  Rose et al. ......... 395/836
5,765,022    6/1998  Kaiser et al. ....... 395/842
`
`Primary Examiner-Thomas C. Lee
`Assistant Examiner-Michael G. Smith
`Attorney, Agent, or Firm-Gates & Cooper
`
`[57]
`
`ABSTRACT
`
A method of controlling an input/output (I/O) device
connected to a computer to facilitate fast I/O data transfers. An
address space for the I/O device is created in the virtual
memory of the computer, wherein the address space
comprises virtual registers that are used to directly control the
I/O device. In essence, control registers and/or memory of
the I/O devices are mapped into the virtual address space,
and the virtual address space is backed by control registers
and/or memory on the I/O device. Thereafter, the I/O device
detects writes to the address space. As a result, a pre-defined
sequence of actions can be triggered in the I/O device by
programming specified values into the data written into the
mapped virtual address space.
`
`27 Claims, 8 Drawing Sheets
`
[Front-page drawing, corresponding to FIG. 2. Labels: two user
processes' virtual memory with Virtual Hardware (202, 204);
secure view to per-user-process Virtual Hardware (210, 212);
device driver virtual memory in kernel (208); physical memory
on I/O adapter (206), single pages (210, 212).]
`
`WISTRON CORP. EXHIBIT 1021.001

[FIG. 1: conventional network interface data flow. Sender side:
application 102, socket layer 106, user buffer 104, kernel
buffering 108, protocol modules 110, interface driver 112,
conventional network interface 114, network MAC 116.
Receiver side: network MAC 118, conventional network
interface 120, interface driver 122, protocol modules 124,
kernel buffering 126, user buffer 128, socket layer 130,
application 132. Memory bus trips are marked S and R
(for example S5 and R1).]

[FIG. 2: two user processes' virtual memory with Virtual
Hardware (202, 204); secure view to per-user-process Virtual
Hardware (210, 212); device driver virtual memory in kernel
(208); physical memory on I/O adapter (206), single pages
(210, 212).]

[FIG. 3: fast applications 302 and 304; slow application 306;
normal streams processing 308; pass-through driver 310; setup
driver 312; I/O device adapter 314; Virtual Hardware 316 and
318; physical hardware registers 320; commodity interface 322.]


[FIG. 4 (Sheet 4 of 8): processes (404), traditional OS
interface 406, DAI front end 408. Annotation: data is routed
directly to the application.]


[FIG. 5: main memory 502 containing hardware register 504 and
buffer pool 506; adapter memory 512 containing software
register 508, physical address buffer map 510, endpoint table
514 (indexed by application ID), protocol scripts 516, and
endpoint protocol data 518. Access annotations in the drawing:
"hardware RO, OS driver R/W"; "application R/W, hardware RO";
"application RO, hardware R/W".]


[FIG. 6 (Sheet 6 of 8): byte layout of a sample UDP datagram,
offsets 0x00-0x2a (0-42). Fields recoverable from the drawing:

Ethernet header 604 (14 bytes)
    Target Ethernet address (6 bytes)
    Source Ethernet address (6 bytes)
    Protocol type (0x0800 = IP)
IP header 606 (20 bytes)
    Total length = 0x001d (29 bytes: 20-byte IP header plus
        8-byte UDP header plus 1-byte user data)
    Datagram Id = 0xe0a1
    Flag 0x4 DO_NOT_FRAGMENT
    Fragment offset = 0x000
    IP address of source = 128.1.192.7
    IP address of destination = 128.1.192.8
UDP header 608 (8 bytes)
    Source port = 0x0007 (echo datagram)
    UDP length = 0x0009 (8-byte UDP header plus 1-byte user data)
    UDP checksum = 0x0cf8
User data 610 (variable)
    1-byte user datagram = "g"]


[FIG. 7 (Sheet 7 of 8): UDP datagram header template, offsets
0x00-0x29 (0-41). Fields recoverable from the drawing:

Ethernet header 704 (14 bytes)
    Target Ethernet address (6 bytes)
    Source Ethernet address (6 bytes)
    Protocol type (0x0800 = IP)
IP header 706 (20 bytes)
    Datagram Id
    Flag 0x4 DO NOT FRAGMENT
    Fragment offset = 0x000
    IP address of source = 128.1.192.7
    IP address of destination = 128.1.192.8
UDP header 708 (8 bytes)
    Source port = 0x0007 (echo datagram)
    Destination port = 0x3018
    UDP length
    UDP checksum]


[FIG. 8 (Sheet 8 of 8): CPU, system memory, and bridges
connected by Bus A, Bus B, and Bus C (reference numerals 802,
804, 806, 808); SVH device 810 and peer target device 812, with
paths labeled interrupt, data, and header.]

METHOD AND APPARATUS FOR A FIRST
DEVICE ACCESSING COMPUTER MEMORY
AND A SECOND DEVICE DETECTING THE
ACCESS AND RESPONDING BY
PERFORMING SEQUENCE OF ACTIONS

CROSS-REFERENCE TO PARENT
APPLICATION

This application is related to and commonly assigned U.S.
patent application Ser. No. 08/577,678, filed Dec. 21, 1995,
now U.S. Pat. No. 5,768,618 dated Dec. 1, 1998, by G.
Erickson et al., entitled "DIRECT PROGRAMMED I/O
DEVICE CONTROL METHOD USING VIRTUAL
REGISTERS", and which application is incorporated by
reference herein.

BACKGROUND OF THE INVENTION

1. Field of the Invention
This invention relates in general to computer input/output
(I/O) device interfaces, and in particular, to a method of
using virtual registers to directly control an I/O device
adapter to facilitate fast data transfers.
2. Description of Related Art
Modern computer systems are capable of running multiple
software tasks or processes simultaneously. In order to
send information to a software process or receive information
from a software process, an input/output (I/O) device
interface is typically used. An I/O device interface provides
a standardized way of sending and receiving information,
and hides the physical characteristics of the actual I/O
device from the software process. Software processes which
use I/O device interfaces are typically easier to program and
are easier to implement across multiple types of computers
because they do not require knowledge or support for
specific physical I/O devices.
An I/O device interface is typically implemented by a
software program called an I/O device driver. The I/O device
driver must take information coming from an external
source, such as a local area network, and pass it along to the
appropriate software process. Incoming data is frequently
buffered into a temporary storage location in the device
driver's virtual address space (VAS), where it is subsequently
copied to the VAS of the user process during a separate step.
However, as recent advances in communication technology
have rapidly increased the bandwidth of many I/O
devices, the copying step from the I/O device driver's VAS
to the user process' VAS represents a potential bottleneck.
For instance, the bandwidth for fiber optic link lines is now
typically measured in gigabits per second. This tremendous
bandwidth creates a problem when information is copied
within the computer. When information is copied, all data
passes through the computer processor, memory, and internal
data bus several times. Therefore, each of these components
represents a potential bottleneck which will limit the
ability to use the complete communications bandwidth. I/O
latency and bandwidth are impaired by this standard
programming paradigm utilizing intermediate copying.
In addition, programming an I/O device driver usually
involves a user process making an operating system call,
which involves a context switch from the user process to the
operating system. This context switch further inhibits the
ability of computer systems to handle a high I/O data
bandwidth.
Therefore, there is a need for multiple user processes in a
single computing node to be able to simultaneously share
direct access to an I/O device without intervention of the
operating system on a per I/O basis.

SUMMARY OF THE INVENTION

To overcome the limitations in the prior art described
above, and to overcome other limitations that will become
apparent upon reading and understanding the present
specification, the present invention discloses a "Shared
Virtual Hardware" technique that provides the capability for
multiple user processes in a single computing node to
simultaneously share direct access to an I/O device containing
"Virtual Hardware" functionality without operating system
intervention.
The Virtual Hardware functionality, which is described in
the co-pending application cited above, is a method of
controlling an input/output (I/O) device connected to a
computer to facilitate fast I/O data transfers. An address
space for the I/O device is created in the virtual memory of
the computer, wherein the address space comprises virtual
registers that are used to directly control the I/O device. In
essence, control registers and/or memory of the I/O device
are mapped into the virtual address space, and the virtual
address space is backed by control registers and/or memory
on the I/O device. Thereafter, the I/O device detects writes
to the address space. As a result, a predefined sequence of
actions can be triggered in the I/O device by programming
specified values into the data written into the mapped virtual
address space.
The Shared Virtual Hardware technique of the present
invention extends the Virtual Hardware technique by
incorporating additional capability into the Virtual Hardware in
order to share its functionality with peer devices. Therefore,
Shared Virtual Hardware enables significant performance
enhancements to I/O devices which do not contain the
Virtual Hardware functionality.
User processes interacting with standard, off-the-shelf
devices may benefit from the Virtual Hardware technique. In
addition, the system bandwidth requirements for the control
of the off-the-shelf device may be reduced as the controlling
element moves closer in proximity to the device in
multi-tiered bus structures (from host processor to peer level
processor).
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a flow diagram illustrating a conventional I/O
data flow between a sender and a receiver;
FIG. 2 is a block diagram illustrating a Virtual Hardware
memory organization compatible with the present invention;
FIG. 3 is a flow diagram describing the system data flow
of fast and slow applications compatible with the present
invention;
FIG. 4 is a block diagram describing direct application
interface (DAI) and routing of data between processes and
an external data connection which is compatible with the
present invention;
FIG. 5 is a block diagram illustrating the system
organization between a main memory and an I/O device adapter
memory which is compatible with the present invention;
FIG. 6 is a block diagram illustrating a typical
Ethernet-based UDP datagram sent by a user process;
FIG. 7 is a block diagram illustrating a UDP datagram
header template in the I/O device adapter's memory; and
FIG. 8 is a block diagram illustrating a Shared Virtual
Hardware implementation compatible with the present
invention.
`
DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT
In the following description of the preferred embodiment,
reference is made to the accompanying drawings which
form a part hereof, and in which is shown by way of
illustration a specific embodiment in which the invention
may be practiced. It is to be understood that other
embodiments may be utilized and structural changes may be made
without departing from the scope of the present invention.
`Overview
Programming an input/output (I/O) device typically
involves a user process making a call to the operating
system. This involves a context switch that swaps information
in system registers and memory in order to process
incoming data. Further, in many environments, the routing
of I/O data also entails one or more memory-to-memory
copies of the data before the physical I/O occurs on the
actual device. It will be recognized that I/O latency and
bandwidth are impaired by invoking the operating system
through the use of an exception handler, as well as by
performing multiple memory-to-memory copies of the data.
The Virtual Hardware technique, described in co-pending
and commonly assigned U.S. patent application Ser. No.
08/577,678, filed Dec. 21, 1995, by G. Erickson et al.,
entitled "DIRECT PROGRAMMED I/O DEVICE CONTROL
METHOD USING VIRTUAL REGISTERS", and
attorney's docket number 6368, which application is
incorporated by reference herein, provides the capability for
multiple user processes in a single computing node to
simultaneously share direct access to an I/O device without
the intervention of the operating system for each data
transfer as it occurs. The present invention extends this
technique by describing how a device containing Virtual
Hardware functionality can be modified to include Shared
Virtual Hardware functionality to create the most efficient
method to transfer directly between the user process and
off-the-shelf devices resident on the same node.
`Hardware Environment
`FIG. 1 is a flow diagram illustrating a conventional I/O
`data flow between a sender and a receiver. At 102, a sender
`application sends data across a memory bus to a user buffer
`104 via socket layer 106. The data is subsequently buffered
`through the operating system kernel 108 by the protocol
`modules 110, before it is sent out by the interface driver 112
`through conventional network interface 114 to the network
`media access control (MAC) 116. It will be noted that in this
`system model, the data makes at least three trips across the
`memory bus at S2, S3 and S5.
`For the receiving application, the steps are reversed from
`those of the sender application. The data is received via the
`network media access control (MAC) 118 into conventional
`network interface 120. The interface driver 122 and protocol
`modules 124 buffer the data through the operating system
kernel 126. The protocol modules 124 then send the data to
`user buffer 128, where it may be accessed by socket layer
`130 or receiver application 132. It will be noted that in this
`system model, the data makes at least three trips across the
`memory bus at R1, R2, and R4.
`Virtual Hardware Memory Organization
FIG. 2 is a block diagram illustrating a Virtual Hardware
memory organization. I/O device adapters on standard I/O
buses, such as ISA, EISA, MCA, or PCI buses, frequently
have some amount of memory and memory-mapped registers
which are addressable from a device driver in the
operating system. User processes 202, 204 cause I/O operations
by making a request of the operating system which
transfers control to the device driver. The sharing of a single
I/O device adapter by multiple user processes is managed by
the device driver running in the kernel, which also provides
security.
The Virtual Hardware technique maps a portion of
memory 206 physically located on the I/O device adapter
into a device driver's address space 208. The Virtual Hardware
technique also maps sub-portions, e.g., pages, 210,
212, of the I/O device adapter's memory 206 into the
address spaces for one or more user processes 202, 204,
thereby allowing the user processes 202, 204 to directly
program the I/O device adapter without the overhead of the
operating system, including context switches. Those skilled
in the art will recognize that the sub-portions 210, 212 may
be mapped directly from the I/O device adapter's memory
206, or that the sub-portions 210, 212 may be mapped
indirectly from the I/O device adapter's memory 206
through the device driver's address space 208.
The I/O device adapter subsequently snoops the virtual
address space 210, 212 to detect any reads or writes. If a read
or write is detected, the I/O device adapter performs a
specific predefined script of actions, frequently resulting in
an I/O operation being performed directly between the user
process' address space 202, 204 and the I/O device adapter.
The user process triggers the execution of the script by
setting or "spanking" the I/O control registers in 210, 212,
which in turn causes the execution of the script.
The memory mapping typically must be performed in
increments of the page size for the particular system or
environment. Allocating memory in increments of the page
size allows each user process to have a Virtual Hardware
space 210, 212 that is secure from all other processes which
might be sharing the same I/O device adapter. This security
between user processes is maintained by the operating
system in conjunction with virtual memory capabilities
offered by most processors.
Each user process creates a memory mapping by performing
an operating system request to open the I/O device
adapter for access. Having the operating system create the
virtual memory address space allows the operating system
and I/O device driver to grant only very specific capabilities
to the individual user process.
A script is prepared by the operating system for the I/O
device adapter to execute each time the specific user process
programs its specific Virtual Hardware. The user process is
given a virtual address in the user process' address space that
allows the user process very specific access capabilities to
the I/O device adapter.
Virtual Hardware is also referred to as virtual registers.
Virtual registers are frequently used to describe the view
which a single user process has of the addressable registers
of a given I/O device adapter.
Maintaining security between multiple software processes
is important when sharing a single I/O device adapter. If the
I/O device adapter controls a network interface, such as an
Ethernet device, then the access rights granted to the user
process by the operating system could be analogous to a
Transmission Control Protocol (TCP) address or socket.
A TCP socket is defined by a communications transport
layer and defines a set of memory addresses through which
communication occurs. These transport addresses form a
network-wide name space that allows processes to communicate
with each other. A discussion of the form and structure
of TCP sockets and packets, which are well-known within
the art, may be found in many references, including
Computer Networks by Andrew S. Tanenbaum, Prentice-Hall,
New Jersey, 1981, pp. 326-327, 373-377, which is
incorporated by reference herein.
`
Typically, the only capability to which the user process
can get direct access is to send and receive bytes over a
specified socket or transport address range. The user process
is not necessarily given permission to emit any arbitrary
packet on the media (e.g., an Ethernet network). It will be
recognized by those skilled in the art that the Virtual
Hardware technique applies not only to Ethernet or other
interconnect communication devices, but also to almost any
I/O device adapter in use by a multi-user operating system.
`Flow Diagram
`FIG. 3 is a flow diagram describing the system data flow
`of fast and slow applications 302, 304, and 306 compatible
`with the Virtual Hardware technique. A traditional slow
`application 306 uses normal streams processing 308 to send
`data to a pass-through driver 310. The pass-through driver
310 initializes the physical hardware registers 320 of the I/O
device adapter 314 to subsequently transfer the data through
the I/O device adapter 314 to the commodity interface 322.
With the Virtual Hardware technique, fast user applications
302 and 304 directly use a setup driver 312 to initialize the
physical hardware registers 320, then send the data directly
through the I/O device adapter 314 to the commodity
interface 322 via Virtual Hardware 316 and 318. Thus, the
overhead of the normal streams processing 308 and
pass-through driver 310 is eliminated with the use of the Virtual
`Hardware 316 and 318, and fast applications 302 and 304
`are able to send and receive data more quickly than slow
`application 306. As a result, the Virtual Hardware technique
`provides higher bandwidth, less latency, less system
`overhead, and shorter path lengths.
`Direct Application Interface
`FIG. 4 is a block diagram describing a direct application
`interface (DAI) and routing of data between processes and
`an external data connection which is compatible with the
`Virtual Hardware technique. Processes 402 and 404 transmit
`and receive data directly to and from an interconnect 410
(e.g., I/O device adapter) through the DAI interface 408. The
`data coming from the interconnect 410 is routed directly to
`a process 402 or 404 by use of Virtual Hardware and
`registers, rather than using a traditional operating system
`interface 406.
Conceptually, the Virtual Hardware technique may be
thought of as providing each user process with its own I/O
device adapter, which makes the user process and I/O device
adapter logically visible to each other. The user process
initiates a data transfer directly through a write to memory,
thereby avoiding the overhead processing which would be
incurred if the operating system were used to service the data
transfer request. The user process determines the status of
the data transfer through a memory read. The operating
system and I/O device adapter remain free to allocate Virtual
Hardware resources as needed, despite the presence of
multiple user processes.
An I/O device adapter typically can have an arbitrary
amount of random access memory (RAM) ranging from
several hundred kilobytes to several megabytes, which may
be used for mapping several user processes in a single
communications node. Each user process that has access to
the Virtual Hardware is typically assigned a page-sized area
of physical memory on the I/O device adapter, which is then
mapped into the virtual address space of the user process.
The I/O device adapter typically is implemented with snooping
logic to detect accesses within the page-sized range of
memory on the I/O device adapter. If the I/O device adapter
detects access to the physical memory page, a predefined
script is then executed by the I/O device adapter in order to
direct the data as appropriate.
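
As a rough illustration of the page-granular snooping just described, the following C sketch maps a snooped address to the page (and hence the user process) it belongs to. It is a software model only, not the adapter's actual logic; the page size, the page count, and the names vh_window, vh_page_for, and vh_snoop are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define VH_PAGE_SIZE 4096u   /* hypothetical system page size */
#define VH_NUM_PAGES 16u     /* hypothetical number of per-process pages */

/* Adapter memory window that is carved into per-process pages. */
struct vh_window {
    uintptr_t base;                /* first byte of the window */
    unsigned  hits[VH_NUM_PAGES];  /* accesses seen per page */
};

/* Map a snooped address to a page index, or -1 if outside the window. */
int vh_page_for(const struct vh_window *w, uintptr_t addr)
{
    if (addr < w->base)
        return -1;
    uintptr_t offset = addr - w->base;
    if (offset >= (uintptr_t)VH_PAGE_SIZE * VH_NUM_PAGES)
        return -1;
    return (int)(offset / VH_PAGE_SIZE);
}

/* Record an access; a real adapter would run that page's script here. */
int vh_snoop(struct vh_window *w, uintptr_t addr)
{
    int page = vh_page_for(w, addr);
    if (page >= 0)
        w->hits[page]++;
    return page;
}
```

In this model an access anywhere inside a page selects the same page index, which is what lets one page stand for one user process; addresses outside the window are ignored.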
`
Scripts typically serve two functions. The first function is
to describe the protocol the software application is using.
This includes but is not limited to how to locate an application
endpoint, and how to fill in a protocol header template
from the application specific data buffer.
The second function is to define a particular set of
instructions to be performed based upon the protocol type.
Each type of protocol will have its own script. Types of
protocols include, but are not limited to, TCP/IP, UDP/IP,
BYNET lightweight datagrams, deliberate shared memory,
active message handler, SCSI, and File Channel.
`System Organization
FIG. 5 is a block diagram illustrating the system
organization between a main memory and an I/O device adapter
memory which is compatible with the Virtual Hardware
technique. The main memory 502 implementation includes
a hardware register 504 and a buffer pool 506. The I/O
device adapter implementation includes a software register
508 and a physical address buffer map 510 in the adapter's
memory 512. An endpoint table 514 in the memory 512 is
used to organize multiple memory pages for individual user
processes. Each entry within the endpoint table 514 points to
various protocol data 518 in the memory 512 in order to
accommodate multiple communication protocols, as well as
previously defined protocol scripts 516 in the memory 512,
which indicate how data or information is to be transferred
from the memory 512 of the I/O device adapter to the
portions of main memory 502 associated with a user process.
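
One way to picture the endpoint table is as an array indexed by application ID whose entries carry a pointer to a protocol script and a pointer to protocol data. The C sketch below is a minimal model under that assumption; the type and function names, the table size, and the placeholder scripts (which only report header bytes plus user bytes) are hypothetical and are not taken from the patent.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENDPOINTS 8  /* hypothetical table size */

enum protocol_type { PROTO_UDP_IP = 0, PROTO_TCP_IP = 1, PROTO_COUNT };

struct endpoint;
typedef long (*protocol_script)(const struct endpoint *ep, size_t user_bytes);

struct endpoint {
    int                in_use;
    enum protocol_type type;
    protocol_script    script;        /* cf. protocol scripts 516 */
    void              *protocol_data; /* cf. endpoint protocol data 518 */
};

struct endpoint_table {
    struct endpoint entries[MAX_ENDPOINTS]; /* indexed by application ID */
};

/* Placeholder scripts: each only reports header bytes plus user bytes. */
long udp_ip_script(const struct endpoint *ep, size_t user_bytes)
{
    (void)ep;
    return (long)(20 + 8 + user_bytes);   /* IP header + UDP header */
}

long tcp_ip_script(const struct endpoint *ep, size_t user_bytes)
{
    (void)ep;
    return (long)(20 + 20 + user_bytes);  /* IP header + minimum TCP header */
}

/* Driver-side setup: bind an application ID to the script for its type. */
int endpoint_open(struct endpoint_table *t, unsigned app_id,
                  enum protocol_type type, void *protocol_data)
{
    static const protocol_script scripts[PROTO_COUNT] = {
        udp_ip_script, tcp_ip_script
    };
    if (app_id >= MAX_ENDPOINTS || (unsigned)type >= (unsigned)PROTO_COUNT)
        return -1;
    t->entries[app_id].in_use = 1;
    t->entries[app_id].type = type;
    t->entries[app_id].script = scripts[type];
    t->entries[app_id].protocol_data = protocol_data;
    return 0;
}

/* Adapter-side dispatch: run the script bound to an application ID. */
long endpoint_run(const struct endpoint_table *t, unsigned app_id,
                  size_t user_bytes)
{
    if (app_id >= MAX_ENDPOINTS || !t->entries[app_id].in_use)
        return -1;
    const struct endpoint *ep = &t->entries[app_id];
    return ep->script(ep, user_bytes);
}
```

Keeping the script pointer in the entry means the dispatch path never inspects the protocol type; adding a protocol only requires a new script and a new row in the setup table.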
Typically, when a user process opens a device driver, the
process specifies its type, which may include, but is not
limited to, a UDP datagram, source port number, or register
address. The user process also specifies either a synchronous
or asynchronous connection. The device driver sets up the
registers 508 and 504, endpoint table 514, and endpoint
protocol data 518. The protocol script 516 is typically based
upon the endpoint data type, and the endpoint protocol data
518 depends on protocol specific data.
The Virtual Hardware technique may be further enhanced
by utilizing read-local, write-remote memory access. A user
process typically causes a script to execute by using four
virtual registers, which include STARTINGADDRESS,
LENGTH, GO, and STATUS. The user process preferably
first writes information into memory at the locations
specified by the values in the STARTINGADDRESS and
LENGTH virtual registers. Next, the user process
accesses the GO virtual register to commence execution of
the script. Finally, the user process accesses or polls the
STATUS virtual register to determine information about the
operation or completion of this I/O request.
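
The four-register sequence can be sketched in C as a small software model. This is an illustration under stated assumptions, not the adapter's implementation: the struct layout, the status codes, and the function names are hypothetical, and the adapter's snooping is stood in for by a direct function call that in hardware would run asynchronously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for the STATUS virtual register. */
enum vh_status { VH_IDLE = 0, VH_BUSY = 1, VH_DONE = 2 };

/* Hypothetical layout of one process's virtual registers. */
struct vh_regs {
    volatile uintptr_t starting_address; /* STARTINGADDRESS */
    volatile uint32_t  length;           /* LENGTH */
    volatile uint32_t  go;               /* GO: a write triggers the script */
    volatile uint32_t  status;           /* STATUS: polled by the process */
};

/* Simulated adapter memory that receives the user data. */
struct vh_adapter {
    uint8_t  tx_buffer[64];
    uint32_t tx_length;
};

/* Stand-in for snooping plus script execution on the adapter. */
void vh_adapter_service(struct vh_regs *regs, struct vh_adapter *ad)
{
    if (regs->go == 0)
        return;                          /* GO was not written */
    regs->go = 0;
    regs->status = VH_BUSY;
    uint32_t n = regs->length;
    if (n > sizeof ad->tx_buffer)
        n = sizeof ad->tx_buffer;        /* clamp to the model's buffer */
    memcpy(ad->tx_buffer, (const void *)regs->starting_address, n);
    ad->tx_length = n;
    regs->status = VH_DONE;
}

/* User-process side: describe the buffer, write GO, then poll STATUS. */
uint32_t vh_send(struct vh_regs *regs, struct vh_adapter *ad,
                 const void *data, uint32_t len)
{
    regs->starting_address = (uintptr_t)data;
    regs->length = len;
    regs->status = VH_IDLE;
    regs->go = 1;
    vh_adapter_service(regs, ad);        /* asynchronous in real hardware */
    while (regs->status != VH_DONE) {
        /* poll STATUS until the script reports completion */
    }
    return ad->tx_length;
}
```

Note that no operating system call appears anywhere on the send path; the only operations are ordinary memory writes and reads, which is the point of the technique.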
It will be recognized that if all four registers are located
in memory on the I/O device adapter, then less than optimal
performance may result if the user process frequently polls
the STATUS virtual register. It is possible to improve the
performance of the Virtual Hardware technique by
implementing a read-local, write-remote strategy. With such a
strategy, the Virtual Hardware technique stores values which
are likely to be read in locations which are closest to the
reading entity, whereas values which are likely to be written
are stored in locations which are farthest away from the
writing entity. This results in values which are likely to be
read by the user process being stored in cacheable main
memory, and thus minimizes the time required to access the
cached values. Values which are likely to be read by the I/O
device adapter are stored in the non-cacheable memory on
the I/O device adapter. Thus, the registers
STARTINGADDRESS, LENGTH, and GO are physically
located preferably in the non-cacheable memory on the I/O
adapter, whereas the STATUS register is preferably located
in a page of cacheable main memory mapped into the user
process' virtual address space.
FIG. 7 is a block diagram illustrating a UDP datagram
template 702 (without a user data area) residing in the I/O
device adapter's memory. The user process provides the
starting address and the length for the user data in its virtual
address space, and then "spanks" a GO register to trigger the
I/O device adapter's execution of a predetermined script.
The I/O device adapter stores the user data provided by the
user process in the I/O device adapter's memory, and then
transmits the completed UDP datagram 702 over the media.
An example of programming that triggers the I/O device
adapter is provided below:

void udpscript(void *USERDATA_ADDRESS,
               int USERDATA_LENGTH,
               template_t *template)
{
    char *physaddress;

    /* IP total length covers both headers plus the user data. */
    template->IP.TotalLength = sizeof(IPHeader) +
        sizeof(UDPHeader) + USERDATA_LENGTH;
    template->IP.DatagramID = nextid();
    ipchecksum(template);
    /* UDP length covers the UDP header plus the user data. */
    template->UDPLength = sizeof(UDPHeader)
        + USERDATA_LENGTH;
    physaddress = vtophys(USERDATA_ADDRESS,
                          USERDATA_LENGTH);
    udpchecksum(physaddress, USERDATA_LENGTH, template);
}

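
The udpscript listing relies on types and helpers (template_t, IPHeader, UDPHeader, nextid, ipchecksum, vtophys, udpchecksum) that the text does not define. The following C sketch supplies hypothetical stand-ins so the same sequence of steps can be compiled and exercised in user space. All struct layouts and helper bodies here are assumptions: fields are kept in host byte order, vtophys is an identity mapping, and the UDP checksum is simply set to zero, which UDP over IPv4 permits as "no checksum computed".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layouts; fields are kept in host byte order. */
typedef struct {
    uint8_t  VersionIHL, ServiceType;
    uint16_t TotalLength, DatagramID, FlagsFragment;
    uint8_t  TimeToLive, Protocol;
    uint16_t Checksum;
    uint32_t Source, Destination;
} IPHeader;                                   /* 20 bytes */

typedef struct {
    uint16_t SourcePort, DestinationPort, Length, Checksum;
} UDPHeader;                                  /* 8 bytes */

typedef struct { IPHeader IP; UDPHeader UDP; } template_t;

/* Stand-in for nextid(): a simple counter. */
static uint16_t next_id = 0;
uint16_t nextid(void) { return ++next_id; }

/* Stand-in for vtophys(): identity mapping in this user-space sketch. */
char *vtophys(void *vaddr, int length) { (void)length; return (char *)vaddr; }

/* One's-complement sum of n 16-bit words with end-around carry. */
uint16_t ones_sum(const uint16_t *w, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += w[i];
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Store the complement of the header sum in the checksum field. */
void ipchecksum(template_t *t)
{
    uint16_t words[10];
    t->IP.Checksum = 0;
    memcpy(words, &t->IP, sizeof words);
    t->IP.Checksum = (uint16_t)~ones_sum(words, 10);
}

/* A header with a valid checksum sums to 0xFFFF. */
int ipheader_ok(const IPHeader *h)
{
    uint16_t words[10];
    memcpy(words, h, sizeof words);
    return ones_sum(words, 10) == 0xFFFFu;
}

/* Placeholder: zero marks the UDP checksum as not computed. */
void udpchecksum(const char *phys, int len, template_t *t)
{
    (void)phys; (void)len;
    t->UDP.Checksum = 0;
}

/* Same order of steps as the udpscript listing, on the stand-in types. */
void udpscript_sketch(void *user_addr, int user_len, template_t *t)
{
    t->IP.TotalLength = (uint16_t)(sizeof(IPHeader) + sizeof(UDPHeader)
                                   + (size_t)user_len);
    t->IP.DatagramID = nextid();
    ipchecksum(t);
    t->UDP.Length = (uint16_t)(sizeof(UDPHeader) + (size_t)user_len);
    udpchecksum(vtophys(user_addr, user_len), user_len, t);
}
```

For a one-byte payload this yields a total length of 29 and a UDP length of 9, matching the values 0x001d and 0x0009 shown for the sample datagram of FIG. 6.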
`UDP Datagram
FIG. 6 is a block diagram illustrating a UDP datagram 602
`sent by a user process over Ethernet media. However, those
`skilled in the art will recognize that the Virtual Hardware
`technique is applicable to UDP datagrams, LAN protocols,
`secondary storage devices such as disks, CDROMs, and
`tapes, as well as other access methods or protocols.
The example of FIG. 6 shows the actual bytes of a sample
UDP datagram 602 as it might be transmitted over an
Ethernet media. There are four separate portions of this
Ethernet packet: (1) Ethernet header 604, (2) IP header 606,
(3) UDP header 608, and (4) user data 610. All of the bytes
are sent contiguously over the media, with no breaks or
delineation between the constituent fields, followed by
sufficient pad bytes on the end of the datagram 602, if
necessary.
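
The arithmetic behind the four contiguous portions can be checked with a short C sketch. The header sizes (14, 20, and 8 bytes) come from FIG. 6; the 60-byte minimum frame size used for the pad calculation (Ethernet's minimum excluding the 4-byte frame check sequence) is an assumption added for this example, as are the struct and function names.

```c
#include <assert.h>
#include <stddef.h>

enum {
    ETH_HEADER_BYTES    = 14,
    IP_HEADER_BYTES     = 20,
    UDP_HEADER_BYTES    = 8,
    ETH_MIN_FRAME_BYTES = 60   /* assumed minimum, excluding the FCS */
};

struct datagram_layout {
    size_t ip_offset;        /* first byte of the IP header */
    size_t udp_offset;       /* first byte of the UDP header */
    size_t data_offset;      /* first byte of the user data */
    size_t ip_total_length;  /* IP header + UDP header + user data */
    size_t udp_length;       /* UDP header + user data */
    size_t frame_bytes;      /* all four portions, before padding */
    size_t pad_bytes;        /* padding needed to reach the minimum */
};

/* Compute offsets and length fields for a given user data size. */
struct datagram_layout datagram_layout_for(size_t user_bytes)
{
    struct datagram_layout l;
    l.ip_offset = ETH_HEADER_BYTES;
    l.udp_offset = l.ip_offset + IP_HEADER_BYTES;
    l.data_offset = l.udp_offset + UDP_HEADER_BYTES;
    l.ip_total_length = IP_HEADER_BYTES + UDP_HEADER_BYTES + user_bytes;
    l.udp_length = UDP_HEADER_BYTES + user_bytes;
    l.frame_bytes = l.data_offset + user_bytes;
    l.pad_bytes = l.frame_bytes < (size_t)ETH_MIN_FRAME_BYTES
                      ? (size_t)ETH_MIN_FRAME_BYTES - l.frame_bytes
                      : 0;
    return l;
}
```

With one byte of user data, the user data starts at offset 42 (0x2a), the IP total length is 29 (0x001d), and the UDP length is 9 (0x0009), which agrees with the sample datagram in FIG. 6.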
`In the Virtual Hardware technique, the access privileges
`given to the user processes are very narrow. Each user
`process has basically pre-negotiated almost everything
`about the datagram 602, except the actual user data 610. This
`means most of the fields in the three header areas 604, 606,
`and 608 are predetermined.
In this example, the user process and the device driver have
`pre-negotiated the following fields from FIG. 6: (1) Ethernet
`Header 604 (Target Ethernet Address, Source Ethernet
`Address, and Protocol Type); (2) IP Header 606 (Version, IP
`header Length, Service Ty