Erickson et al.

US005768618A
[11] Patent Number: 5,768,618
[45] Date of Patent: Jun. 16, 1998
`
[54] METHOD FOR PERFORMING SEQUENCE
OF ACTIONS IN DEVICE CONNECTED TO
COMPUTER IN RESPONSE TO SPECIFIED
VALUES BEING WRITTEN INTO SNOOPED
SUB PORTIONS OF ADDRESS SPACE

[75] Inventors: Gene R. Erickson; Douglas E.
Hundley, both of Poway; P. Keith
Muller; Curtis H. Stebley, both of San
Diego, all of Calif.

[73] Assignee: NCR Corporation, Dayton, Ohio

[21] Appl. No.: 577,678

[22] Filed: Dec. 21, 1995

[51] Int. Cl.6: G06F 15/02
[52] U.S. Cl.: 395/829
[58] Field of Search: 395/821, 823, 829, 832, 846, 882, 284, 309, 500, 473
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,589,063    5/1986   Shah et al.          395/828
4,777,589   10/1988   Boetlner et al.      395/823
5,016,161    5/1991   Van Loo et al.       395/678
5,016,166    5/1991   Van Loo et al.       395/674
5,127,098    6/1992   Rosenthal et al.     711/202
5,280,587    1/1994   Shimodaica et al.    395/880
5,420,987    5/1995   Reid et al.          395/830
5,548,778    8/1996   Hirayama             395/823
5,553,244    9/1996   Norcross et al.      395/280
5,642,481    6/1997   Pedrizetti           395/185.01
5,671,442    9/1997   Feeney et al.        395/834

FOREIGN PATENT DOCUMENTS

551148       7/1993   European Pat. Off.
`
OTHER PUBLICATIONS

"The Performance of Message-Passing Using Restricted
Virtual Memory Remapping", by Shin-Yuan Tzou and
David P. Anderson, in Software-Practice & Experience, vol.
21(3), 251-267 (Mar. 1991).

"The DASH Local Kernel Structure" by David P. Anderson
and Shin-Yuan Tzou, Report No. UCB/CSD 88/463, Nov. 7,
1988, Computer Science Division (EECS), University of
California, Berkeley 94720.

"A Users' Guide to PICL: A Portable Instrumented
Communication Library" by G.A. Geist et al., Oak Ridge
National Laboratory, Mathematical Sciences Section, P.O.
Box 2009, Bldg. 9207-A, Oak Ridge, TN 37831-8083
(Aug. 1990).

"Architecture and Implementation of Vulcan" by Craig B.
Stunkel et al., IBM Research Division, Yorktown Heights,
New York (Sep. 22, 1993).

"MPI-F: An MPI Prototype Implementation on IBM SP1"
by Hubertus Franke et al., pub. by IBM T.J. Watson
Research Center, Yorktown Heights, New York 10598.

Primary Examiner: Moustafa M. Meky
Attorney, Agent, or Firm: Merchant, Gould, Smith, Edell,
Welter & Schmidt
`
[57] ABSTRACT

A method of controlling an input/output (I/O) device
connected to a computer to facilitate fast I/O data transfers. An
address space for the I/O device is created in the virtual
memory of the computer, wherein the address space
comprises virtual registers that are used to directly control the
I/O device. In essence, control registers and/or memory of
the I/O device are mapped into the virtual address space, and
the virtual address space is backed by control registers
and/or memory on the I/O device. Thereafter, the I/O device
detects writes to the address space. As a result, a pre-defined
sequence of actions can be triggered in the I/O device by
programming specified values into the data written into the
mapped virtual address space.

19 Claims, 7 Drawing Sheets
`
`
`WISTRON CORP. EXHIBIT 1005.001
`
`
`
[FIG. 1: conventional I/O data flow between a SENDER and a
RECEIVER. On each side the drawing shows an application, user
buffer, socket layer, system call interface, kernel buffering and
protocol modules, interface driver, conventional network interface,
and network MAC (reference numerals 102 through 132; data path
steps labeled S1 to S5 and R1 to R5).]
`
`
`
[FIG. 2: virtual hardware memory organization. The drawing shows
the virtual memory of two user processes with virtual hardware
(202, 204), physical memory on the I/O adapter (206) divided into
single pages, the device driver's virtual memory in the kernel
(208), and the per user process secure view to virtual hardware.]
`
`
`
[FIG. 3: system data flow of fast and slow applications. The
drawing shows fast applications (302, 304), a slow application
(306), normal streams processing, a pass-through driver, physical
hardware registers (320) on the adapter (314), and the commodity
interface (322).]
`
`
`
[FIG. 4 (Sheet 4 of 7): direct application interface. The drawing
shows a process (404), the traditional OS interface (406), and the
DAI front end (408), with the note "Data is routed directly to the
application."]
`
`
`
[FIG. 5: organization of main memory and adapter memory.
Main memory (502): hardware register (504; application RO,
hardware R/W) and buffer pool (506).
Adapter memory (512): software register (508; application R/W,
hardware RO), physical address buffer map (510; OS driver R/W,
hardware RO), endpoint table indexed by application ID (514),
protocol scripts (516), and endpoint protocol data (518).]
`
`
`
FIG. 6 (Sheet 6 of 7): sample UDP datagram 602

Offset (hex)  Offset (dec)  Bytes (hex)        Field

Ethernet Header 604 (14 bytes)
00-05         0-5           01 02 03 04 05 06  Target Ethernet Address (6 bytes)
06-0b         6-11          07 08 09 0a 0b 0c  Source Ethernet Address (6 bytes)
0c-0d         12-13         08 00              Protocol Type (0x0800 = IP)

IP Header 606 (20 bytes)
0e            14            45                 Version = 4, IP Header Len (Words) = 5
0f            15            00                 Service Type
10-11         16-17         00 1d              Total Length = 0x001d (29 bytes: 20-byte IP header
                                               plus 8-byte UDP header plus 1-byte user data)
12-13         18-19         e0 a1              Datagram Id = 0xe0a1
14-15         20-21         40 00              Flag 0x4 DO_NOT_FRAGMENT, Fragment Offset = 0x000
16            22            40                 Time-to-Live = 0x40
17            23            11                 IP Protocol = 0x11 (UDP)
18-19         24-25         da 1b              IP Header Checksum = 0xda1b
1a-1d         26-29         80 01 c0 07        IP Address of Source = 128.1.192.7
1e-21         30-33         80 01 c0 08        IP Address of Destination = 128.1.192.8

UDP Header 608 (8 bytes)
22-23         34-35         00 07              Source Port = 0x0007 (echo datagram)
24-25         36-37         30 18              Destination Port = 0x3018
26-27         38-39         00 09              UDP Length = 0x0009 (8-byte UDP header plus
                                               1-byte user data)
28-29         40-41         0c f8              UDP Checksum = 0x0cf8

User Data 610 (variable)
2a            42            67                 1 byte user datagram = "g"
`
`
`
FIG. 7 (Sheet 7 of 7): UDP datagram header template 702
(fields shown as "-- --" are blank in the template)

Offset (hex)  Offset (dec)  Bytes (hex)        Field

Ethernet Header 704 (14 bytes)
00-05         0-5           01 02 03 04 05 06  Target Ethernet Address (6 bytes)
06-0b         6-11          07 08 09 0a 0b 0c  Source Ethernet Address (6 bytes)
0c-0d         12-13         08 00              Protocol Type (0x0800 = IP)

IP Header 706 (20 bytes)
0e            14            45                 Version = 4, IP Header Len (Words) = 5
0f            15            00                 Service Type
10-11         16-17         -- --              Total Length
12-13         18-19         -- --              Datagram Id
14-15         20-21         40 00              Flag 0x4 DO_NOT_FRAGMENT, Fragment Offset = 0x000
16            22            40                 Time-to-Live = 0x40
17            23            11                 IP Protocol = 0x11 (UDP)
18-19         24-25         -- --              IP Header Checksum
1a-1d         26-29         80 01 c0 07        IP Address of Source = 128.1.192.7
1e-21         30-33         80 01 c0 08        IP Address of Destination = 128.1.192.8

UDP Header 708 (8 bytes)
22-23         34-35         00 07              Source Port = 0x0007 (echo datagram)
24-25         36-37         30 18              Destination Port = 0x3018
26-27         38-39         -- --              UDP Length
28-29         40-41         -- --              UDP Checksum
`
`
`
METHOD FOR PERFORMING SEQUENCE
OF ACTIONS IN DEVICE CONNECTED TO
COMPUTER IN RESPONSE TO SPECIFIED
VALUES BEING WRITTEN INTO SNOOPED
SUB PORTIONS OF ADDRESS SPACE
BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates in general to computer input/output
(I/O) device interfaces, and in particular, to a method of
using virtual registers to directly control an I/O device
adapter to facilitate fast data transfers.

2. Description of Related Art

Modern computer systems are capable of running multiple
software tasks or processes simultaneously. In order to
send information to a software process or receive information
from a software process, an input/output (I/O) device
interface is typically used. An I/O device interface provides
a standardized way of sending and receiving information,
and hides the physical characteristics of the actual I/O
device from the software process. Software processes which
use I/O device interfaces are typically easier to program and
are easier to implement across multiple types of computers
because they do not require knowledge or support for
specific physical I/O devices.

An I/O device interface is typically implemented by a
software program called an I/O device driver. The I/O device
driver must take information coming from an external
source, such as a local area network, and pass it along to the
appropriate software process. Incoming data is frequently
buffered into a temporary storage location in the device
driver's virtual address space (VAS), where it is subsequently
copied to the VAS of the user process during a separate step.

However, as recent advances in communication technology
have rapidly increased the bandwidth of many I/O
devices, the copying step from the I/O device driver's VAS
to the user process' VAS represents a potential bottleneck.
For instance, the bandwidth for fiber optic link lines is now
typically measured in gigabits per second. This tremendous
bandwidth creates a problem when information is copied
within the computer. When information is copied, all data
passes through the computer processor, memory, and internal
data bus several times. Therefore, each of these components
represents a potential bottleneck which will limit the
ability to use the complete communications bandwidth. I/O
latency and bandwidth are impaired by this standard
programming paradigm utilizing intermediate copying.

In addition, programming an I/O device driver usually
involves a user process making an operating system call,
which involves a context switch from the user process to the
operating system. This context switch further inhibits the
ability of computer systems to handle a high I/O data
bandwidth.

Therefore, there is a need for multiple user processes in a
single computing node to be able to simultaneously share
direct access to an I/O device without intervention of the
operating system on a per I/O basis.
SUMMARY OF THE INVENTION

To overcome the limitations in the prior art described
above, and to overcome other limitations that will become
apparent upon reading and understanding the present
specification, the present invention discloses a method of
controlling an input/output (I/O) device connected to a
computer to facilitate fast I/O data transfers. An address
space for the I/O device is created in the virtual memory of
the computer, wherein the address space comprises virtual
registers that are used to directly control the I/O device. In
essence, control registers and/or memory of the I/O device
are mapped into the virtual address space, and the virtual
address space is backed by control registers and/or memory
on the I/O device. Thereafter, the I/O device detects writes
to the address space. As a result, a pre-defined sequence of
actions can be triggered in the I/O device by programming
specified values into the data written into the mapped virtual
address space.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow diagram illustrating a conventional I/O
data flow between a sender and a receiver;

FIG. 2 is a block diagram illustrating a virtual hardware
memory organization compatible with the present invention;

FIG. 3 is a flow diagram describing the system data flow
of fast and slow applications compatible with the present
invention;

FIG. 4 is a block diagram describing direct application
interface (DAI) and routing of data between processes and
an external data connection which is compatible with the
present invention;

FIG. 5 is a block diagram illustrating the system
organization between a main memory and an I/O device adapter
memory which is compatible with the present invention;

FIG. 6 is a block diagram illustrating a typical
Ethernet-based UDP datagram sent by a user process; and

FIG. 7 is a block diagram illustrating a UDP datagram
header template in the I/O device adapter's memory.
DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

In the following description of the preferred embodiment,
reference is made to the accompanying drawings which
form a part hereof, and in which is shown by way of
illustration a specific embodiment in which the invention
may be practiced. It is to be understood that other
embodiments may be utilized and structural changes may be made
without departing from the scope of the present invention.

Programming an input/output (I/O) device typically
involves a user software process making a call to the
operating system. This involves a context switch that swaps
information in system registers and memory in order to
process incoming data. Further, in many environments, the
routing of I/O data also entails one or more
memory-to-memory copies of the data before the physical I/O occurs on
the actual device. It will be recognized that I/O latency and
bandwidth are impaired by invoking the operating system
through the use of an exception handler, as well as by
performing multiple memory-to-memory copies of the data.

The present invention provides the capability for multiple
user processes in a single computing node to simultaneously
share direct access to an I/O device without the intervention
of the operating system for each data transfer as it occurs.
Further, the present invention is structured to accommodate
system and I/O device security by using the operating
system to initialize the virtual memory address space for the
user process. In addition, the number of simultaneous user
processes is scalable and is not constrained by the number of
physical registers within the I/O device. This is implemented
by mapping a small portion of the memory of the I/O device
directly into the virtual address space of the user process,
which provides a secure way for the user process to directly
trigger the execution of a prepared I/O script.
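
As a rough illustration of the user-side view of such a mapping, the sketch below obtains one page-sized region and lets the caller touch it like a register. It is only a stand-in under stated assumptions: an anonymous POSIX `mmap` region plays the role of the adapter page, and the names `vhw_page_size`, `vhw_map_page`, and `vhw_unmap_page` are hypothetical. In the arrangement described here, the operating system and device driver would create the mapping when the process opens the adapter.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Size of one mapping unit: the system page size. */
static size_t vhw_page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/* Stand-in for the driver-created mapping (an assumption of this
 * sketch): returns one zero-filled, page-sized region, or NULL. */
static void *vhw_map_page(void)
{
    void *p = mmap(NULL, vhw_page_size(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Releases a page obtained from vhw_map_page(); returns 0 on success. */
static int vhw_unmap_page(void *p)
{
    assert(p != NULL);
    return munmap(p, vhw_page_size());
}
```

Because the unit of mapping is a whole page, each process's view is isolated by the ordinary virtual memory protection of the processor, which is the property the description relies on.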
`
`
`
`
Thus, it will be recognized that the present invention
increases the efficiency of I/O operations in the following
ways:

1. Writing information to and from a user address space
without intermediate memory-to-memory copies.
2. Accessing an I/O device simultaneously from multiple
user processes in a single node.
3. Eliminating calls to the operating system and the
associated context switches on a per I/O basis.
4. Maintaining system security for the I/O device by using
the operating system to initialize the virtual memory
address space of the user process.
5. Accessing the I/O device under the full control of the
resource allocation policies and permissions granted by
the operating system.
6. Working with a plurality of well-known standard
operating systems including, but not limited to, UNIX,
OS/2, Microsoft Windows, Microsoft Windows NT, or
Novell Netware.
7. Providing low-latency high-performance control of I/O
devices.

FIG. 1 is a flow diagram illustrating a conventional I/O
data flow between a sender and a receiver. At 102, a sender
application sends information across the memory bus to a
user buffer 104, which in turn is then read back across the
memory bus by protocol modules 110. The information is
subsequently buffered through the operating system kernel
108 before it is sent out through conventional network
interface 114 to the network media access control (MAC)
116. It will be noted that in this system model, the data
makes at least three trips across the memory bus at S2, S3
and S5. For the receiving application, the steps are reversed
from those of the sender application, and once again the data
makes at least three trips across the memory bus at R1, R4,
and R5.

FIG. 2 is a block diagram illustrating a virtual hardware
memory organization compatible with the present invention.
I/O device adapters on standard I/O buses, such as ISA,
EISA, MCA, or PCI buses, frequently have some amount of
memory and memory-mapped registers which are addressable
from a device driver in the operating system. User
processes 202, 204 cause I/O operations by making a request
of the operating system which transfers control to the device
driver. The sharing of a single I/O device adapter by multiple
user processes is managed by the device driver running in
the kernel, which also provides security.

The present invention maps a portion of memory 206,
physically located on the I/O device adapter, into a device
driver's address space 208. The present invention also maps
sub-portions, e.g., pages, 210, 212, of the I/O device
adapter's memory 206 into the address spaces for one or more
user processes 202, 204, thereby allowing the user processes
202, 204 to directly program the I/O device adapter without
the overhead of the operating system, including context
switches. Those skilled in the art will recognize that the
sub-portions 210, 212 may be mapped directly from the I/O
device adapter's memory 206, or that the sub-portions 210,
212 may be mapped indirectly from the I/O device adapter's
memory 206 through the device driver's address space 208.

The I/O device adapter subsequently snoops the virtual
address space 210, 212 to detect any reads or writes. If a read
or write is detected, the I/O device adapter performs a
specific predefined script of actions, frequently resulting in
an I/O operation being performed directly between the user
process' address space 202, 204 and the I/O device adapter.
The user process triggers the execution of the script by
setting or "spanking" the I/O control registers in 210, 212,
which in turn causes the execution of the script.

The memory mapping must typically be performed in
increments of the page size for the particular system or
environment. Allocating memory in increments of the page
size allows each user process to have a virtual hardware
space 210, 212 that is secure from all other processes which
might be sharing the same I/O device adapter. This security
between software processes is maintained by the operating
system in conjunction with virtual memory capabilities
offered by most processors.

Each user process creates a memory mapping by performing
an operating system request to open the I/O device
adapter for access. Having the operating system create the
virtual memory address space allows the operating system
and I/O device driver to grant only very specific capabilities
to the individual user process.

A script is prepared by the operating system for the I/O
device adapter to execute each time the specific user process
programs its specific virtual hardware. The user process is
given a virtual address in the user process' address space that
allows the user process very specific access capabilities to
the I/O device adapter.

Virtual hardware is also referred to as virtual registers.
Virtual registers are frequently used to describe the view
which a single user process has of the addressable registers
of a given I/O device adapter.

Maintaining security between multiple software processes
is important when sharing a single I/O device adapter. If the
I/O device adapter controls a network interface, such as an
Ethernet device, then the access rights granted to the user
process by the operating system could be analogous to a
Transmission Control Protocol (TCP) address or socket.

A TCP socket is defined by a communications transport
layer and defines a set of memory addresses through which
communication occurs. These transport addresses form a
network-wide name space that allows processes to communicate
with each other. A discussion of the form and structure
of TCP sockets and packets, which are well-known within
the art, may be found in many references, including Computer
Networks by Andrew S. Tanenbaum, Prentice-Hall,
New Jersey, 1981, pp. 326-327, 373-377, which is herein
incorporated by reference.

Typically, the only capability to which the user process
can get direct access is to send and receive bytes over a
specified socket or transport address range. The user process
is not necessarily given permission to emit any arbitrary
packet on the media (e.g., an Ethernet network). It will be
recognized by those skilled in the art that the present
invention applies not only to Ethernet or other interconnect
communication devices, but also to almost any I/O device
adapter in use by a multi-user operating system.

FIG. 3 is a flow diagram describing the system data flow
of fast and slow applications 302, 304, and 306 compatible
with the present invention. A traditional slow application
306 uses normal streams processing 308 to send information
to a pass-through driver 310. The pass-through driver 310
initializes the physical hardware registers 320 of the I/O
device adapter 314 to subsequently transfer the information
through the I/O device adapter 314 to the commodity
interface 322. With the present invention, fast user applications
302 and 304 directly use a setup driver 312 to initialize
the physical hardware registers 320, then send the information
directly through the I/O device adapter 314 to the
commodity interface 322 via virtual hardware 316 and 318.
Thus, the overhead of the normal streams processing 308
and pass-through driver 310 are eliminated with the use of
the virtual hardware 316 and 318 of the present invention,
and fast applications 302 and 304 are able to send and
receive information more quickly than slow application 306.
As a result, the present invention provides higher bandwidth,
less latency, less system overhead, and shorter path lengths.
FIG. 4 is a block diagram describing a direct application
interface (DAI) and routing of data between processes and
an external data connection which is compatible with the
present invention. Processes 402 and 404 transmit and
receive information directly to and from an interconnect 410
(e.g., I/O device adapter) through the DAI interface 408. The
information coming from the interconnect 410 is routed
directly to a process 402 or 404 by use of virtual hardware
and registers, rather than using a traditional operating
system interface 406.

Conceptually, the present invention may be thought of as
providing each user process with its own I/O device adapter,
which makes the user process and I/O device adapter
logically visible to each other. The user process initiates a
data transfer directly through a write to memory, thereby
avoiding the overhead processing which would be incurred
if the operating system were used to service the data transfer
request. The user process determines the status of the data
transfer through a memory read. The operating system and
I/O device adapter remain free to allocate virtual hardware
resources as needed, despite the presence of multiple user
processes.

An I/O device adapter typically can have an arbitrary
amount of random access memory (RAM) ranging from
several hundred kilobytes to several megabytes, which may
be used for mapping several user processes in a single
communications node. Each user process that has access to
the virtual hardware is typically assigned a page-sized area
of physical memory on the I/O device adapter, which is then
mapped into the virtual address space of the user process.
The I/O device adapter typically is implemented with snooping
logic to detect accesses within the page-sized range of
memory on the I/O device adapter. If the I/O device adapter
detects access to the physical memory page, a predefined
script is then executed by the I/O device adapter in order to
direct the data as appropriate.
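
The snooping step described above can be pictured as an address-to-page lookup followed by a script call. The following is a minimal software sketch of that idea, not the adapter's actual logic: the page size, the slot count, and every identifier (`vhw_snooper`, `vhw_snoop_access`, `vhw_count_script`) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VHW_PAGE_SIZE 4096u   /* assumed page size */
#define VHW_MAX_SLOTS 8u      /* assumed number of mapped pages */

/* A "script" is modeled as a callback run when its page is accessed. */
typedef void (*vhw_script_fn)(unsigned slot, void *ctx);

struct vhw_snooper {
    uintptr_t     base;                    /* first snooped adapter address */
    vhw_script_fn script[VHW_MAX_SLOTS];   /* one predefined script per page */
    void         *ctx[VHW_MAX_SLOTS];      /* per-page script state */
};

/* Example script: counts how many times its page was accessed. */
static void vhw_count_script(unsigned slot, void *ctx)
{
    (void)slot;
    ++*(int *)ctx;
}

/* Maps an accessed address to its page slot and runs that slot's
 * script. Returns the slot index, or -1 if the address is outside
 * the snooped range or the page has no script. */
static int vhw_snoop_access(const struct vhw_snooper *s, uintptr_t addr)
{
    assert(s != NULL);
    if (addr < s->base)
        return -1;
    uintptr_t slot = (addr - s->base) / VHW_PAGE_SIZE;
    if (slot >= VHW_MAX_SLOTS || s->script[slot] == NULL)
        return -1;
    s->script[slot]((unsigned)slot, s->ctx[slot]);
    return (int)slot;
}
```

Any offset inside a page selects the same slot, which is why one page per user process is enough to identify who triggered the access.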
Protocol scripts typically serve two functions. The first
function is to describe the protocol the software application
is using. This includes but is not limited to how to locate an
application endpoint, and how to fill in a protocol header
template from the application specific data buffer. The
second function is to define a particular set of instructions to
be performed based upon the protocol type. Each type of
protocol will have its own script. Types of protocols include,
but are not limited to, TCP/IP, UDP/IP, BYNET lightweight
datagrams, deliberate shared memory, active message
handler, SCSI, and File Channel.

FIG. 5 is a block diagram illustrating the system
organization between a main memory and an I/O device adapter
memory which is compatible with the present invention. The
main memory 502 implementation includes a hardware
register 504 and a buffer pool 506. The I/O device adapter
implementation includes a software register 508 and a
physical address buffer map 510 in the adapter's memory
512. An endpoint table 514 in the memory 512 is used to
organize multiple memory pages for individual user
processes. Each entry within the endpoint table 514 points to
various protocol data 518 in the memory 512 in order to
accommodate multiple communication protocols, as well as
previously defined protocol scripts 516 in the memory 512,
which indicate how data or information is to be transferred
from the memory 512 of the I/O device adapter to the
portions of main memory 502 associated with a user process.
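
One way to picture the endpoint table is as an array indexed by application ID whose entries carry a protocol type and a pointer to protocol data, with one script per protocol type. The sketch below is hypothetical throughout (the `ep_` names, the table size, and the idea that a script returns a frame length are assumptions for illustration), and is not taken from the adapter's firmware.

```c
#include <assert.h>
#include <stddef.h>

enum ep_protocol { EP_PROTO_UDP_IP, EP_PROTO_TCP_IP, EP_PROTO_COUNT };

/* One endpoint table entry: which protocol the endpoint uses and
 * where its protocol data (for example a header template) lives. */
struct ep_entry {
    int                  in_use;
    enum ep_protocol     protocol;
    const unsigned char *protocol_data;
    size_t               protocol_data_len;
};

/* A script takes the endpoint and the user data length and returns
 * the resulting frame length, or -1 on error. */
typedef long (*ep_script_fn)(const struct ep_entry *ep, size_t user_len);

#define EP_TABLE_SIZE 16u   /* assumed number of endpoints */

struct ep_adapter {
    struct ep_entry table[EP_TABLE_SIZE];   /* indexed by application ID */
    ep_script_fn    script[EP_PROTO_COUNT]; /* one script per protocol type */
};

/* Example script: the frame is the header template plus the user data. */
static long ep_script_prepend_template(const struct ep_entry *ep, size_t user_len)
{
    return (long)(ep->protocol_data_len + user_len);
}

/* Looks up the endpoint for app_id and runs the script registered
 * for its protocol type. Returns the script result, or -1 if the
 * endpoint is unknown or its protocol has no script. */
static long ep_run_script(const struct ep_adapter *a, unsigned app_id, size_t user_len)
{
    assert(a != NULL);
    if (app_id >= EP_TABLE_SIZE || !a->table[app_id].in_use)
        return -1;
    const struct ep_entry *ep = &a->table[app_id];
    ep_script_fn fn = a->script[ep->protocol];
    return fn ? fn(ep, user_len) : -1;
}
```

Keeping scripts per protocol type rather than per endpoint means many endpoints can share one script while each keeps its own protocol data.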
`
Typically, when a user process opens a device driver, the
process specifies its type, which may include, but is not
limited to, a UDP datagram, source port number, or register
address. The user process also specifies either a synchronous
or asynchronous connection. The device driver sets up the
registers 508 and 504, endpoint table 514, and endpoint
protocol data 518. The protocol script 516 is typically based
upon the endpoint data type, and the endpoint protocol data
518 depends on protocol specific data.

The preferred embodiment of the present invention may
be further enhanced by utilizing read-local, write-remote
memory access. A user process typically causes a script to
execute by using four virtual registers, which include
STARTINGADDRESS, LENGTH, GO, and STATUS. The
user process preferably first writes information into memory
at the locations specified by the values in the
STARTINGADDRESS and LENGTH virtual registers. Next, the
process then accesses the GO virtual register to commence
execution of the script. Finally, the user process accesses or
polls the STATUS virtual register to determine information
about the operation or completion of this I/O request.

It will be recognized that if all four registers are located
in memory on the I/O device adapter, then less than optimal
performance may result if the user process frequently polls
the STATUS virtual register. It is possible to improve the
performance of the present invention by implementing a
read-local, write-remote strategy. With such a strategy, the
present invention stores values which are likely to be read in
locations which are closest to the reading entity, whereas
values which are likely to be written are stored in locations
which are farthest away from the writing entity. This results
in values which are likely to be read by the user process
being stored in cacheable main memory, and thus minimizes
the time required to access the cached values. Values which
are likely to be read by the I/O device adapter are stored in
the non-cacheable memory on the I/O device adapter. Thus,
the registers STARTINGADDRESS, LENGTH, and GO are
physically located preferably in the non-cacheable memory
on the I/O adapter, whereas the STATUS register is preferably
located in a page of cacheable main memory mapped
into the user process' virtual address space.
FIG. 6 is a block diagram illustrating a UDP datagram 602
sent by a user process over Ethernet media. However, those
skilled in the art will recognize that the invention is
applicable to UDP datagrams, LAN protocols, secondary storage
devices such as disks, CDROMs, and tapes, as well as other
access methods or protocols.

The example of FIG. 6 shows the actual bytes of a sample
UDP datagram 602 as it might be transmitted over an
Ethernet media. There are four separate portions of this
Ethernet packet: (1) Ethernet header 604, (2) IP header 606,
(3) UDP header 608, and (4) user data 610. All of the bytes
are sent contiguously over the media, with no breaks or
delineation between the constituent fields, followed by
sufficient pad bytes on the end of the datagram 602, if
necessary.
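
Because the fields are contiguous, each one can be read back from a fixed byte offset. The sketch below copies the 43 sample bytes of FIG. 6 into an array and decodes a few fields from it. The offset constants and the helper `be16` are names chosen for this illustration. Multi-byte fields are read most significant byte first, as the figure shows them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of selected fields within the FIG. 6 datagram. */
enum {
    OFF_ETH_TYPE     = 12,
    OFF_IP_TOTAL_LEN = 16,
    OFF_IP_PROTO     = 23,
    OFF_IP_SRC       = 26,
    OFF_IP_DST       = 30,
    OFF_UDP_SRC_PORT = 34,
    OFF_UDP_DST_PORT = 36,
    OFF_UDP_LEN      = 38,
    OFF_USER_DATA    = 42
};

/* The sample datagram 602, byte for byte as listed in FIG. 6. */
static const uint8_t fig6_datagram[43] = {
    /* Ethernet header (14 bytes) */
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06,
    0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c,
    0x08, 0x00,
    /* IP header (20 bytes) */
    0x45, 0x00, 0x00, 0x1d, 0xe0, 0xa1, 0x40, 0x00,
    0x40, 0x11, 0xda, 0x1b,
    0x80, 0x01, 0xc0, 0x07,
    0x80, 0x01, 0xc0, 0x08,
    /* UDP header (8 bytes) */
    0x00, 0x07, 0x30, 0x18, 0x00, 0x09, 0x0c, 0xf8,
    /* user data (1 byte) */
    0x67
};

/* Reads a 16-bit big-endian field at byte offset off. */
static uint16_t be16(const uint8_t *frame, size_t off)
{
    assert(frame != NULL);
    return (uint16_t)((frame[off] << 8) | frame[off + 1]);
}
```

For example, `be16(fig6_datagram, OFF_IP_TOTAL_LEN)` yields 29, matching the 20-byte IP header plus 8-byte UDP header plus 1 byte of user data stated in the figure.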
In the present application, the access privileges given to
the user processes are very narrow. Each user process has
basically pre-negotiated almost everything about the
datagram 602, except the actual user data 610. This means most
of the fields in the three header areas 604, 606, and 608 are
predetermined.
In this example, the user process and the device driver have
pre-negotiated the following fields from FIG. 6: (1) Ethernet
Header 604 (Target Ethernet Address, Source Ethernet
Address, and Protocol Type); (2) IP Header 606 (Version, IP
header Length, Service Type, Flag, Fragment Offset,
Time_to_Live, IP Protocol, IP Address of Source, and IP Address
of Destination); and (3) UDP Header 608 (Source Port and
Destination Port). Only the shaded fields in FIG. 6, and the
user data 610, need to be changed on a per-datagram basis.

    void senduserdatagram(void *USERDATA_ADDRESS,
                          int USERDATA_LENGTH)
    {
        /* wait till adapter available */
        while (vhr_p->STATUS != IDLE) { };
        vhr_p->STARTINGADDRESS = USERDATA_ADDRESS;
        vhr_p->LENGTH = USERDATA_LENGTH;
        /* trigger adapter */
        vhr_p->GO = 1;
        /* wait till adapter completes */
        while (vhr_p->STATUS == BUSY) { };
    }
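
The per-datagram work left after pre-negotiation can be sketched as filling the blank fields of the FIG. 7 template. The code below is an illustration under stated assumptions, not the adapter's script: the function names are hypothetical, the blank template fields are represented as zero bytes, and the UDP checksum is simply left at zero (which UDP over IPv4 treats as "no checksum") rather than computed. The IP header checksum uses the standard one's-complement sum over the 20 header bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN 42u   /* 14 Ethernet + 20 IP + 8 UDP */

/* FIG. 7 header template, with the blank fields held as zero. */
static const uint8_t fig7_template[HDR_LEN] = {
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06,      /* target Ethernet address */
    0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c,      /* source Ethernet address */
    0x08, 0x00,                              /* protocol type = IP      */
    0x45, 0x00, 0x00, 0x00, 0x00, 0x00,      /* ver/len, TOS, len, id   */
    0x40, 0x00, 0x40, 0x11, 0x00, 0x00,      /* flags, TTL, proto, csum */
    0x80, 0x01, 0xc0, 0x07,                  /* source IP 128.1.192.7   */
    0x80, 0x01, 0xc0, 0x08,                  /* dest IP 128.1.192.8     */
    0x00, 0x07, 0x30, 0x18,                  /* source and dest ports   */
    0x00, 0x00, 0x00, 0x00                   /* UDP length, checksum    */
};

/* Stores v at p as a 16-bit big-endian value. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/* One's-complement Internet checksum over len bytes. */
static uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Builds a frame in out from the template plus len bytes of user
 * data. Returns the frame length, or 0 if it does not fit. */
static size_t build_udp_frame(uint8_t *out, size_t cap, uint16_t datagram_id,
                              const void *data, size_t len)
{
    assert(out != NULL && (data != NULL || len == 0));
    if (len > 0xffffu - 28u || cap < HDR_LEN + len)
        return 0;
    memcpy(out, fig7_template, HDR_LEN);
    if (len)
        memcpy(out + HDR_LEN, data, len);
    put_be16(out + 16, (uint16_t)(20 + 8 + len));     /* IP total length  */
    put_be16(out + 18, datagram_id);                  /* datagram id      */
    put_be16(out + 24, inet_checksum(out + 14, 20));  /* IP header csum   */
    put_be16(out + 38, (uint16_t)(8 + len));          /* UDP length       */
    return HDR_LEN + len;                             /* UDP csum stays 0 */
}
```

Called with datagram id 0xe0a1 and the single byte "g", this produces the Total Length, Datagram Id, IP Header Checksum, and UDP Length bytes listed in FIG. 6.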
To further illustrate the steps performed in accordance
with the present invention, assume that the user process
has initiated access to the I/O device adapter via the methods
described in this disclosure. Further, assume that the user
process has the user data 610 of the datagram 602 resident
in its virtual address space at the locations identified by the
`values in the USERDAT