FIG. 2

[Drawing: virtual hardware memory organization. Labels recoverable
from the sheet: two user processes' virtual memory with virtual
hardware (204); per user process secure view to virtual hardware;
device driver virtual memory in kernel (208); physical memory on I/O
adapter (206), mapped to each view as a single page.]
`Ex.1005.003
`
`DELL
`
`
`
`
`
`
`
FIG. 5

[Drawing: organization of main memory and adapter memory. Labels
recoverable from the sheet:
Main memory 502: hardware register 504 (hardware R/W, application
RO); buffer pool 506.
Adapter memory 512: software register 508 (application R/W, hardware
RO); physical address buffer map 510; endpoint table 514, indexed by
application ID; protocol scripts 516; endpoint protocol data 518 (OS
driver R/W, hardware RO).]
`U.S. Patent
`
`Jun. 16, 1998
`
`Sheet 6 of 7
`
`5,768,618
`
FIG. 6

Sample UDP datagram 602 (43 bytes, offsets 0x00-0x2a)

Offset (hex)  Offset (dec)  Bytes (hex)        Field

Ethernet Header 604 (14 bytes)
00-05         0-5           01 02 03 04 05 06  Target Ethernet Address (6 bytes)
06-0b         6-11          07 08 09 0a 0b 0c  Source Ethernet Address (6 bytes)
0c-0d         12-13         08 00              Protocol Type (0x0800 = IP)

IP Header 606 (20 bytes)
0e            14            45                 Version = 4, IP Header Len (Words) = 5
0f            15            00                 Service Type
10-11         16-17         00 1d              Total Length = 0x001d (29 bytes: 20-byte IP
                                               header plus 8-byte UDP header plus 1-byte
                                               user data)
12-13         18-19         e0 a1              Datagram Id = 0xe0a1
14-15         20-21         40 00              Flag 0x4 DO_NOT_FRAGMENT,
                                               Fragment Offset = 0x000
16            22            40                 Time-to-Live = 0x40
17            23            11                 IP Protocol = 0x11 (UDP)
18-19         24-25         da 1b              IP Header Checksum = 0xda1b
1a-1d         26-29         80 01 c0 07        IP Address of Source = 128.1.192.7
1e-21         30-33         80 01 c0 08        IP Address of Destination = 128.1.192.8

UDP Header 608 (8 bytes)
22-23         34-35         00 07              Source Port = 0x0007 (echo datagram)
24-25         36-37         30 18              Destination Port = 0x3018
26-27         38-39         00 09              UDP Length = 0x0009 (8-byte UDP header plus
                                               1-byte user data)
28-29         40-41         0c f8              UDP Checksum = 0x0cf8

User Data 610 (variable)
2a            42            67                 1 byte user datagram = "g"
`
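The layout of FIG. 6 can be checked mechanically. The following is a minimal sketch, not part of the patent, that copies the 43 bytes of the sample datagram 602 into an array and splits them into the fields named in the figure. The structure `datagram_summary` and the function `summarize_datagram` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 43 bytes of the sample datagram 602, copied from FIG. 6. */
const uint8_t fig6_datagram[43] = {
    /* Ethernet header 604 */
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06,   /* target Ethernet address */
    0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c,   /* source Ethernet address */
    0x08, 0x00,                           /* protocol type: IP */
    /* IP header 606 */
    0x45, 0x00, 0x00, 0x1d, 0xe0, 0xa1, 0x40, 0x00,
    0x40, 0x11, 0xda, 0x1b,
    0x80, 0x01, 0xc0, 0x07,               /* source 128.1.192.7 */
    0x80, 0x01, 0xc0, 0x08,               /* destination 128.1.192.8 */
    /* UDP header 608 */
    0x00, 0x07, 0x30, 0x18, 0x00, 0x09, 0x0c, 0xf8,
    /* user data 610 */
    0x67                                  /* 'g' */
};

/* Hypothetical summary of the fields of interest (illustration only). */
struct datagram_summary {
    unsigned ip_version;
    unsigned ip_header_bytes;
    unsigned ip_total_length;
    unsigned ip_protocol;
    unsigned udp_src_port;
    unsigned udp_dst_port;
    unsigned udp_length;
    size_t   user_data_offset;
    size_t   user_data_length;
};

/* Read a 16-bit big-endian (network order) value. */
static unsigned be16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Split an Ethernet/IP/UDP frame into its fields.
   Returns 0 on success, -1 if the frame is too short or is not IP/UDP. */
int summarize_datagram(const uint8_t *frame, size_t len,
                       struct datagram_summary *out)
{
    const size_t eth_len = 14, udp_hdr_len = 8;

    assert(frame != NULL && out != NULL);
    if (len < eth_len + 20 + udp_hdr_len || be16(frame + 12) != 0x0800)
        return -1;

    const uint8_t *ip = frame + eth_len;
    out->ip_version      = ip[0] >> 4;
    out->ip_header_bytes = (ip[0] & 0x0fu) * 4u;   /* length is in 32-bit words */
    out->ip_total_length = be16(ip + 2);
    out->ip_protocol     = ip[9];
    if (out->ip_header_bytes < 20 || out->ip_protocol != 0x11 ||
        len < eth_len + out->ip_header_bytes + udp_hdr_len)
        return -1;

    const uint8_t *udp = ip + out->ip_header_bytes;
    out->udp_src_port = be16(udp);
    out->udp_dst_port = be16(udp + 2);
    out->udp_length   = be16(udp + 4);
    if (out->udp_length < udp_hdr_len)
        return -1;

    out->user_data_offset = eth_len + out->ip_header_bytes + udp_hdr_len;
    out->user_data_length = out->udp_length - udp_hdr_len;
    return 0;
}
```

Calling `summarize_datagram(fig6_datagram, sizeof fig6_datagram, &s)` reports a 20-byte IP header, a total length of 29, a UDP length of 9, and one byte of user data at offset 42, which agrees with the figure.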
FIG. 7

UDP datagram header template 702 (42 bytes, offsets 0x00-0x29).
Fields shown as "-- --" carry no value on the sheet.

Offset (hex)  Offset (dec)  Bytes (hex)        Field

Ethernet Header 704 (14 bytes)
00-05         0-5           01 02 03 04 05 06  Target Ethernet Address (6 bytes)
06-0b         6-11          07 08 09 0a 0b 0c  Source Ethernet Address (6 bytes)
0c-0d         12-13         08 00              Protocol Type (0x0800 = IP)

IP Header 706 (20 bytes)
0e            14            45                 Version = 4, IP Header Len (Words) = 5
0f            15            00                 Service Type
10-11         16-17         -- --              Total Length
12-13         18-19         -- --              Datagram Id
14-15         20-21         40 00              Flag 0x4 DO_NOT_FRAGMENT,
                                               Fragment Offset = 0x000
16            22            40                 Time-to-Live = 0x40
17            23            11                 IP Protocol = 0x11 (UDP)
18-19         24-25         -- --              IP Header Checksum
1a-1d         26-29         80 01 c0 07        IP Address of Source = 128.1.192.7
1e-21         30-33         80 01 c0 08        IP Address of Destination = 128.1.192.8

UDP Header 708 (8 bytes)
22-23         34-35         00 07              Source Port = 0x0007 (echo datagram)
24-25         36-37         30 18              Destination Port = 0x3018
26-27         38-39         -- --              UDP Length
28-29         40-41         -- --              UDP Checksum
`
METHOD FOR PERFORMING SEQUENCE
OF ACTIONS IN DEVICE CONNECTED TO
COMPUTER IN RESPONSE TO SPECIFIED
VALUES BEING WRITTEN INTO SNOOPED
SUB PORTIONS OF ADDRESS SPACE
`
`BACKGROUND OF THE INVENTION
`
1. Field of the Invention

This invention relates in general to computer input/output
(I/O) device interfaces, and in particular, to a method of
using virtual registers to directly control an I/O device
adapter to facilitate fast data transfers.

2. Description of Related Art

Modern computer systems are capable of running multiple
software tasks or processes simultaneously. In order to
send information to a software process or receive information
from a software process, an input/output (I/O) device
interface is typically used. An I/O device interface provides
a standardized way of sending and receiving information,
and hides the physical characteristics of the actual I/O
device from the software process. Software processes which
use I/O device interfaces are typically easier to program and
are easier to implement across multiple types of computers
because they do not require knowledge or support for
specific physical I/O devices.

An I/O device interface is typically implemented by a
software program called an I/O device driver. The I/O device
driver must take information coming from an external
source, such as a local area network, and pass it along to the
appropriate software process. Incoming data is frequently
buffered into a temporary storage location in the device
driver's virtual address space (VAS), where it is subsequently
copied to the VAS of the user process during a separate step.

However, as recent advances in communication technology
have rapidly increased the bandwidth of many I/O
devices, the copying step from the I/O device driver's VAS
to the user process' VAS represents a potential bottleneck.
For instance, the bandwidth for fiber optic link lines is now
typically measured in gigabits per second. This tremendous
bandwidth creates a problem when information is copied
within the computer. When information is copied, all data
passes through the computer processor, memory, and internal
data bus several times. Therefore, each of these components
represents a potential bottleneck which will limit the
ability to use the complete communications bandwidth. I/O
latency and bandwidth are impaired by this standard
programming paradigm utilizing intermediate copying.

In addition, programming an I/O device driver usually
involves a user process making an operating system call,
which involves a context switch from the user process to the
operating system. This context switch further inhibits the
ability of computer systems to handle a high I/O data
bandwidth.

Therefore, there is a need for multiple user processes in a
single computing node to be able to simultaneously share
direct access to an I/O device without intervention of the
operating system on a per I/O basis.
`
SUMMARY OF THE INVENTION

To overcome the limitations in the prior art described
above, and to overcome other limitations that will become
apparent upon reading and understanding the present
specification, the present invention discloses a method of
controlling an input/output (I/O) device connected to a
computer to facilitate fast I/O data transfers. An address
space for the I/O device is created in the virtual memory of
the computer, wherein the address space comprises virtual
registers that are used to directly control the I/O device. In
essence, control registers and/or memory of the I/O device
are mapped into the virtual address space, and the virtual
address space is backed by control registers and/or memory
on the I/O device. Thereafter, the I/O device detects writes
to the address space. As a result, a pre-defined sequence of
actions can be triggered in the I/O device by programming
specified values into the data written into the mapped virtual
address space.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow diagram illustrating a conventional I/O
data flow between a sender and a receiver;
FIG. 2 is a block diagram illustrating a virtual hardware
memory organization compatible with the present invention;
FIG. 3 is a flow diagram describing the system data flow
of fast and slow applications compatible with the present
invention;
FIG. 4 is a block diagram describing direct application
interface (DAI) and routing of data between processes and
an external data connection which is compatible with the
present invention;
FIG. 5 is a block diagram illustrating the system
organization between a main memory and an I/O device adapter
memory which is compatible with the present invention;
FIG. 6 is a block diagram illustrating a typical
Ethernet-based UDP datagram sent by a user process; and
FIG. 7 is a block diagram illustrating a UDP datagram
header template in the I/O device adapter's memory.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENT

In the following description of the preferred embodiment,
reference is made to the accompanying drawings which
form a part hereof, and in which is shown by way of
illustration a specific embodiment in which the invention
may be practiced. It is to be understood that other
embodiments may be utilized and structural changes may be made
without departing from the scope of the present invention.

Programming an input/output (I/O) device typically
involves a user software process making a call to the
operating system. This involves a context switch that swaps
information in system registers and memory in order to
process incoming data. Further, in many environments, the
routing of I/O data also entails one or more memory-to-memory
copies of the data before the physical I/O occurs on
the actual device. It will be recognized that I/O latency and
bandwidth are impaired by invoking the operating system
through the use of an exception handler, as well as by
performing multiple memory-to-memory copies of the data.

The present invention provides the capability for multiple
user processes in a single computing node to simultaneously
share direct access to an I/O device without the intervention
of the operating system for each data transfer as it occurs.
Further, the present invention is structured to accommodate
system and I/O device security by using the operating
system to initialize the virtual memory address space for the
user process. In addition, the number of simultaneous user
processes is scalable and is not constrained by the number of
physical registers within the I/O device. This is implemented
by mapping a small portion of the memory of the I/O device
directly into the virtual address space of the user process,
which provides a secure way for the user process to directly
trigger the execution of a prepared I/O script.
`
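The mapping of a small, per-process portion of adapter memory can be pictured with a short sketch. The code below is an illustration under stated assumptions, not the patent's implementation: it assumes a 4096-byte page, an adapter with 64 mappable pages, and hypothetical names (`vh_allocator`, `vh_open`, `vh_may_access`). It models only the bookkeeping a driver might keep when a process opens the adapter, and the ownership check that the processor's virtual memory hardware would enforce in a real system.

```c
#include <assert.h>
#include <stddef.h>

#define VH_PAGE_SIZE 4096u   /* assumed page size */
#define VH_MAX_PAGES 64u     /* assumed: 256 KB of mappable adapter memory */

/* Hypothetical driver bookkeeping: which process owns each page-sized
   sub-portion of adapter memory. An owner of 0 means the page is free. */
struct vh_allocator {
    int owner_pid[VH_MAX_PAGES];
};

/* Models the one-time open request: reserve one page for pid.
   Returns the page's byte offset in adapter memory, or -1 if none is free. */
long vh_open(struct vh_allocator *a, int pid)
{
    assert(a != NULL && pid > 0);
    for (size_t i = 0; i < VH_MAX_PAGES; i++) {
        if (a->owner_pid[i] == 0) {
            a->owner_pid[i] = pid;
            return (long)(i * VH_PAGE_SIZE);
        }
    }
    return -1;
}

/* Models the protection check: returns 1 only if the byte at offset
   lies inside a page owned by pid. */
int vh_may_access(const struct vh_allocator *a, int pid, size_t offset)
{
    size_t page = offset / VH_PAGE_SIZE;
    return page < VH_MAX_PAGES && a->owner_pid[page] == pid;
}
```

In this model the number of simultaneous users is bounded by the number of pages of adapter memory rather than by a count of physical registers, and a process can reach only the page handed to it at open time.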
Thus, it will be recognized that the present invention
increases the efficiency of I/O operations in the following
ways:
1. Writing information to and from a user address space
without intermediate memory-to-memory copies.
2. Accessing an I/O device simultaneously from multiple
user processes in a single node.
3. Eliminating calls to the operating system and the
associated context switches on a per I/O basis.
4. Maintaining system security for the I/O device by using
the operating system to initialize the virtual memory
address space of the user process.
5. Accessing the I/O device under the full control of the
resource allocation policies and permissions granted by
the operating system.
6. Working with a plurality of well-known standard
operating systems including, but not limited to, UNIX,
OS/2, Microsoft Windows, Microsoft Windows NT, or
Novell Netware.
7. Providing low-latency high-performance control of I/O
devices.

FIG. 1 is a flow diagram illustrating a conventional I/O
data flow between a sender and a receiver. At 102, a sender
application sends information across the memory bus to a
user buffer 104, which in turn is then read back across the
memory bus by protocol modules 110. The information is
subsequently buffered through the operating system kernel
108 before it is sent out through conventional network
interface 114 to the network media access control (MAC)
116. It will be noted that in this system model, the data
makes at least three trips across the memory bus at S2, S3,
and S5. For the receiving application, the steps are reversed
from those of the sender application, and once again the data
makes at least three trips across the memory bus at R1, R4,
and R5.

FIG. 2 is a block diagram illustrating a virtual hardware
memory organization compatible with the present invention.
I/O device adapters on standard I/O buses, such as ISA,
EISA, MCA, or PCI buses, frequently have some amount of
memory and memory-mapped registers which are addressable
from a device driver in the operating system. User
processes 202, 204 cause I/O operations by making a request
of the operating system which transfers control to the device
driver. The sharing of a single I/O device adapter by multiple
user processes is managed by the device driver running in
the kernel, which also provides security.

The present invention maps a portion of memory 206,
physically located on the I/O device adapter, into a device
driver's address space 208. The present invention also maps
sub-portions, e.g., pages, 210, 212, of the I/O device
adapter's memory 206 into the address spaces for one or more
user processes 202, 204, thereby allowing the user processes
202, 204 to directly program the I/O device adapter without
the overhead of the operating system, including context
switches. Those skilled in the art will recognize that the
sub-portions 210, 212 may be mapped directly from the I/O
device adapter's memory 206, or that the sub-portions 210,
212 may be mapped indirectly from the I/O device adapter's
memory 206 through the device driver's address space 208.

The I/O device adapter subsequently snoops the virtual
address space 210, 212 to detect any reads or writes. If a read
or write is detected, the I/O device adapter performs a
specific predefined script of actions, frequently resulting in
an I/O operation being performed directly between the user
process' address space 202, 204 and the I/O device adapter.
The user process triggers the execution of the script by
setting or "spanking" the I/O control registers in 210, 212,
which in turn causes the execution of the script.

The memory mapping must be typically performed in
increments of the page size for the particular system or
environment. Allocating memory in increments of the page
size allows each user process to have a virtual hardware
space 210, 212 that is secure from all other processes which
might be sharing the same I/O device adapter. This security
between software processes is maintained by the operating
system in conjunction with virtual memory capabilities
offered by most processors.

Each user process creates a memory mapping by performing
an operating system request to open the I/O device
adapter for access. Having the operating system create the
virtual memory address space allows the operating system
and I/O device driver to grant only very specific capabilities
to the individual user process.

A script is prepared by the operating system for the I/O
device adapter to execute each time the specific user process
programs its specific virtual hardware. The user process is
given a virtual address in the user process' address space that
allows the user process very specific access capabilities to
the I/O device adapter.

Virtual hardware is also referred to as virtual registers.
Virtual registers are frequently used to describe the view
which a single user process has of the addressable registers
of a given I/O device adapter.

Maintaining security between multiple software processes
is important when sharing a single I/O device adapter. If the
I/O device adapter controls a network interface, such as an
Ethernet device, then the access rights granted to the user
process by the operating system could be analogous to a
Transmission Control Protocol (TCP) address or socket.
A TCP socket is defined by a communications transport
layer and defines a set of memory addresses through which
communication occurs. These transport addresses form a
network-wide name space that allows processes to communicate
with each other. A discussion of the form and structure
of TCP sockets and packets, which are well-known within
the art, may be found in many references, including Computer
Networks by Andrew S. Tanenbaum, Prentice-Hall,
New Jersey, 1981, pp. 326-327, 373-377, which is herein
incorporated by reference.

Typically, the only capability to which the user process
can get direct access is to send and receive bytes over a
specified socket or transport address range. The user process
is not necessarily given permission to emit any arbitrary
packet on the media (e.g., an Ethernet network). It will be
recognized by those skilled in the art that the present
invention applies not only to Ethernet or other interconnect
communication devices, but also to almost any I/O device
adapter in use by a multi-user operating system.

FIG. 3 is a flow diagram describing the system data flow
of fast and slow applications 302, 304, and 306 compatible
with the present invention. A traditional slow application
306 uses normal streams processing 308 to send information
to a pass-through driver 310. The pass-through driver 310
initializes the physical hardware registers 320 of the I/O
device adapter 314 to subsequently transfer the information
through the I/O device adapter 314 to the commodity
interface 322. With the present invention, fast user applications
302 and 304 directly use a setup driver 312 to initialize
the physical hardware registers 320, then send the information
directly through the I/O device adapter 314 to the
commodity interface 322 via virtual hardware 316 and 318.
Thus, the overhead of the normal streams processing 308
and pass-through driver 310 are eliminated with the use of
`
the virtual hardware 316 and 318 of the present invention,
and fast applications 302 and 304 are able to send and
receive information more quickly than slow application 306.
As a result, the present invention provides higher bandwidth,
less latency, less system overhead, and shorter path lengths.

FIG. 4 is a block diagram describing a direct application
interface (DAI) and routing of data between processes and
an external data connection which is compatible with the
present invention. Processes 402 and 404 transmit and
receive information directly to and from an interconnect 410
(e.g., I/O device adapter) through the DAI interface 408. The
information coming from the interconnect 410 is routed
directly to a process 402 or 404 by use of virtual hardware
and registers, rather than using a traditional operating system
interface 406.

Conceptually, the present invention may be thought of as
providing each user process with its own I/O device adapter,
which makes the user process and I/O device adapter
logically visible to each other. The user process initiates a
data transfer directly through a write to memory, thereby
avoiding the overhead processing which would be incurred
if the operating system were used to service the data transfer
request. The user process determines the status of the data
transfer through a memory read. The operating system and
I/O device adapter remain free to allocate virtual hardware
resources as needed, despite the presence of multiple user
processes.

An I/O device adapter typically can have an arbitrary
amount of random access memory (RAM) ranging from
several hundred kilobytes to several megabytes, which may
be used for mapping several user processes in a single
communications node. Each user process that has access to
the virtual hardware is typically assigned a page-sized area
of physical memory on the I/O device adapter, which is then
mapped into the virtual address space of the user process.
The I/O device adapter typically is implemented with snooping
logic to detect accesses within the page-sized range of
memory on the I/O device adapter. If the I/O device adapter
detects access to the physical memory page, a predefined
script is then executed by the I/O device adapter in order to
direct the data as appropriate.

Protocol scripts typically serve two functions. The first
function is to describe the protocol the software application
is using. This includes but is not limited to how to locate an
application endpoint, and how to fill in a protocol header
template from the application specific data buffer. The
second function is to define a particular set of instructions to
be performed based upon the protocol type. Each type of
protocol will have its own script. Types of protocols include,
but are not limited to, TCP/IP, UDP/IP, BYNET lightweight
datagrams, deliberate shared memory, active message
handler, SCSI, and Fibre Channel.

FIG. 5 is a block diagram illustrating the system organization
between a main memory and an I/O device adapter
memory which is compatible with the present invention. The
main memory 502 implementation includes a hardware
register 504 and a buffer pool 506. The I/O device adapter
implementation includes a software register 508 and a
physical address buffer map 510 in the adapter's memory
512. An endpoint table 514 in the memory 512 is used to
organize multiple memory pages for individual user processes.
Each entry within the endpoint table 514 points to
various protocol data 518 in the memory 512 in order to
accommodate multiple communication protocols, as well as
previously defined protocol scripts 516 in the memory 512,
which indicate how data or information is to be transferred
from the memory 512 of the I/O device adapter to the
portions of main memory 502 associated with a user process.

Typically, when a user process opens a device driver, the
process specifies its type, which may include, but is not
limited to, a UDP datagram, source port number, or register
address. The user process also specifies either a synchronous
or asynchronous connection. The device driver sets up the
registers 508 and 504, endpoint table 514, and endpoint
protocol data 518. The protocol script 516 is typically based
upon the endpoint data type, and the endpoint protocol data
518 depends on protocol specific data.

The preferred embodiment of the present invention may
be further enhanced by utilizing read-local, write-remote
memory access. A user process typically causes a script to
execute by using four virtual registers, which include
STARTINGADDRESS, LENGTH, GO, and STATUS. The
user process preferably first writes information into memory
at the locations specified by the values in the STARTINGADDRESS
and LENGTH virtual registers. Next, the process
then accesses the GO virtual register to commence
execution of the script. Finally, the user process accesses or
polls the STATUS virtual register to determine information
about the operation or completion of this I/O request.

It will be recognized that if all four registers are located
in memory on the I/O device adapter, then less than optimal
performance may result if the user process frequently polls
the STATUS virtual register. It is possible to improve the
performance of the present invention by implementing a
read-local, write-remote strategy. With such a strategy, the
present invention stores values which are likely to be read in
locations which are closest to the reading entity, whereas
values which are likely to be written are stored in locations
which are farthest away from the writing entity. This results
in values which are likely to be read by the user process
being stored in cacheable main memory, and thus minimizes
the time required to access the cached values. Values which
are likely to be read by the I/O device adapter are stored in
the non-cacheable memory on the I/O device adapter. Thus,
the registers STARTINGADDRESS, LENGTH, and GO are
physically located preferably in the non-cacheable memory
on the I/O adapter, whereas the STATUS register is preferably
located in a page of cacheable main memory mapped
into the user process' virtual address space.

FIG. 6 is a block diagram illustrating a UDP datagram 602
sent by a user process over Ethernet media. However, those
skilled in the art will recognize that the invention is applicable
to UDP datagrams, LAN protocols, secondary storage
devices such as disks, CDROMs, and tapes, as well as other
access methods or protocols.

The example of FIG. 6 shows the actual bytes of a sample
UDP datagram 602 as it might be transmitted over an
Ethernet media. There are four separate portions of this
Ethernet packet: (1) Ethernet header 604, (2) IP header 606,
(3) UDP header 608, and (4) user data 610. All of the bytes
are sent contiguously over the media, with no breaks or
delineation between the constituent fields, followed by sufficient
pad bytes on the end of the datagram 602, if necessary.

In the present application, the access privileges given to
the user processes are very narrow. Each user process has
basically pre-negotiated almost everything about the datagram
602, except the actual user data 610. This means most
of the fields in the three header areas 604, 606, and 608 are
predetermined.

In this example, the user process and the device driver have
pre-negotiated the following fields from FIG. 6: (1) Ethernet
Header 604 (Target Ethernet Address, Source Ethernet
Address, and Protocol Type); (2) IP Header 606 (Version, IP
header Length, Service Type, Flag, Fragment Offset,
Time_to_Live, IP Protocol, IP Address of Source, and IP Address
of Destination); and (3) UDP Header 608 (Source Port and
Destination Port). Only the shaded fields in FIG. 6, and the
user data 610, need to be changed on a per-datagram basis.
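The split between pre-negotiated and per-datagram fields can be illustrated with a short sketch. The code below is an assumption-laden illustration rather than the adapter's actual script: it stores the 42 header bytes of FIG. 7 with the per-datagram fields zeroed, then patches the two length fields and the datagram ID at the byte offsets given in the figures. The names `udp_template`, `fill_template`, and the `OFF_*` constants are hypothetical. The checksum fields are deliberately left at zero here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets taken from FIG. 6 and FIG. 7. */
enum {
    HDR_LEN       = 42,     /* Ethernet 14 + IP 20 + UDP 8 */
    OFF_IP_TOTLEN = 0x10,
    OFF_IP_ID     = 0x12,
    OFF_UDP_LEN   = 0x26
};

/* Header template 702: pre-negotiated fields filled in,
   per-datagram fields (lengths, ID, checksums) left as zero. */
const uint8_t udp_template[HDR_LEN] = {
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06,   /* target Ethernet address */
    0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c,   /* source Ethernet address */
    0x08, 0x00,                           /* protocol type: IP */
    0x45, 0x00,                           /* version/header length, service type */
    0x00, 0x00,                           /* total length (per datagram) */
    0x00, 0x00,                           /* datagram ID (per datagram) */
    0x40, 0x00,                           /* flags, fragment offset */
    0x40, 0x11,                           /* time-to-live, protocol = UDP */
    0x00, 0x00,                           /* IP header checksum (per datagram) */
    0x80, 0x01, 0xc0, 0x07,               /* source 128.1.192.7 */
    0x80, 0x01, 0xc0, 0x08,               /* destination 128.1.192.8 */
    0x00, 0x07, 0x30, 0x18,               /* source port, destination port */
    0x00, 0x00,                           /* UDP length (per datagram) */
    0x00, 0x00                            /* UDP checksum (per datagram) */
};

/* Store a 16-bit value in big-endian (network) order. */
static void put_be16(uint8_t *p, unsigned v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Copy the template, append the user data, and patch the length and
   ID fields. Returns the frame length, or 0 if the data does not fit. */
size_t fill_template(uint8_t *out, size_t out_cap,
                     const uint8_t *user_data, size_t user_len,
                     unsigned datagram_id)
{
    assert(out != NULL && user_data != NULL);
    if (user_len > 0xffffu - 28u || out_cap < HDR_LEN + user_len)
        return 0;
    memcpy(out, udp_template, HDR_LEN);
    memcpy(out + HDR_LEN, user_data, user_len);
    put_be16(out + OFF_IP_TOTLEN, (unsigned)(20 + 8 + user_len));
    put_be16(out + OFF_IP_ID, datagram_id);
    put_be16(out + OFF_UDP_LEN, (unsigned)(8 + user_len));
    return HDR_LEN + user_len;
}
```

With one byte of user data ("g") and ID 0xe0a1, the patched frame carries 0x001d in the Total Length field and 0x0009 in the UDP Length field, matching the values shown in FIG. 6.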
To further illustrate the steps performed in accordance
with the present invention, assume that the user process
has initiated access to the I/O device adapter via the methods
described in this disclosure. Further, assume that the user
process has the user data 610 of the datagram 602 resident
in its virtual address space at the locations identified by the
values in the USERDATA_ADDRESS and USERDATA_LENGTH
variables. Also, assume that the address for the
virtual registers 210, 212, 316, 318, or 508 in the memory of
the adapter is stored in the pointer "vhr_p" (virtual hardware
remote pointer) and the address for the control information
in the virtual address space of the user process 504
is stored in the pointer "vhl_p" (virtual hardware local
pointer).

An example of user process programming that triggers the
I/O device adapter is set forth below:
`
    senduserdatagram (void *USERDATA_ADDRESS,
                      int   USERDATA_LENGTH)
    {
        /* wait till adapter available */
        while (vhl_p->STATUS != IDLE) { };
        vhr_p->STARTINGADDRESS = USERDATA_ADDRESS;
        vhr_p->LENGTH          = USERDATA_LENGTH;
        /* trigger adapter */
        vhr_p->GO              = 1;
        /* wait till adapter completes */
        while (vhl_p->STATUS == BUSY) { };
    }
`
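The listing above depends on hardware that snoops the register page. To make the sequence concrete without such hardware, the following is a minimal software simulation, offered as a sketch under stated assumptions. The types `vh_remote`, `vh_local`, and `sim_adapter`, and the function `sim_adapter_snoop`, are hypothetical. The "script" here merely copies the user data into a transmit buffer, and the explicit call to `sim_adapter_snoop` stands in for the adapter noticing the store to GO. The STATUS register is kept in a separate structure from the other three to mirror the read-local, write-remote placement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum vh_status { IDLE = 0, BUSY = 1 };

/* Write-remote registers: conceptually located in adapter memory. */
struct vh_remote {
    const void *STARTINGADDRESS;
    int         LENGTH;
    int         GO;
};

/* Read-local register: conceptually located in cacheable main memory. */
struct vh_local {
    volatile int STATUS;
};

/* Hypothetical stand-in for the adapter and its transmit buffer. */
struct sim_adapter {
    struct vh_remote regs;
    struct vh_local  status;
    uint8_t          tx[256];
    size_t           tx_len;
};

struct sim_adapter adapter;                  /* zeroed: STATUS == IDLE */
struct vh_remote *vhr_p = &adapter.regs;
struct vh_local  *vhl_p = &adapter.status;

/* Models the snooping logic: invoked after a store to the register
   page, it runs the "script" once GO has been set. */
void sim_adapter_snoop(struct sim_adapter *a)
{
    if (a->regs.GO != 1)
        return;
    a->status.STATUS = BUSY;
    size_t n = a->regs.LENGTH > 0 ? (size_t)a->regs.LENGTH : 0;
    if (n > sizeof a->tx)
        n = sizeof a->tx;                    /* sketch: truncate oversized requests */
    memcpy(a->tx, a->regs.STARTINGADDRESS, n);
    a->tx_len = n;
    a->regs.GO = 0;
    a->status.STATUS = IDLE;
}

void senduserdatagram(const void *USERDATA_ADDRESS, int USERDATA_LENGTH)
{
    assert(USERDATA_ADDRESS != NULL && USERDATA_LENGTH >= 0);
    while (vhl_p->STATUS != IDLE) { }        /* wait till adapter available */
    vhr_p->STARTINGADDRESS = USERDATA_ADDRESS;
    vhr_p->LENGTH          = USERDATA_LENGTH;
    vhr_p->GO              = 1;              /* trigger adapter */
    sim_adapter_snoop(&adapter);             /* simulation only */
    while (vhl_p->STATUS == BUSY) { }        /* wait till adapter completes */
}
```

After `senduserdatagram("g", 1)` the simulated transmit buffer holds the single byte "g" and STATUS has returned to IDLE. In a real adapter the script would run concurrently with the polling loop, which is why polling a locally cached STATUS matters.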
Those skilled in the art will also recognize that the
"spanking" of the vhr_p->GO register, i.e., by setting its
value to 1, could be combined with the storing of the length
value into the vhr_p->LENGTH register to reduce the
number of I/O accesses required.
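One way to picture such a combined access is to reserve one bit of the length word as the trigger. The encoding below is purely an assumption made for illustration (top bit as GO, remaining 31 bits as the length), and the names `VH_GO_BIT`, `vh_encode_length_go`, `vh_go_requested`, and `vh_length` are hypothetical. The patent does not specify how the two values would be combined.

```c
#include <assert.h>
#include <stdint.h>

#define VH_GO_BIT      0x80000000u   /* assumed: top bit doubles as GO */
#define VH_LENGTH_MASK 0x7fffffffu   /* assumed: low 31 bits hold the length */

/* User side: build the single word that both stores the length and
   triggers the adapter. */
uint32_t vh_encode_length_go(uint32_t length)
{
    assert(length <= VH_LENGTH_MASK);
    return length | VH_GO_BIT;
}

/* Adapter side: decode the snooped word. */
int vh_go_requested(uint32_t reg)
{
    return (reg & VH_GO_BIT) != 0;
}

uint32_t vh_length(uint32_t reg)
{
    return reg & VH_LENGTH_MASK;
}
```

Under this assumed encoding, a one-byte send needs two stores across the I/O bus (the starting address, then the combined word 0x80000001) instead of three.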
FIG. 7 is a block diagram illustrating a UDP datagram
template 702 (without a user data area) residing in the I/O
device adapter's memory. The user process provides the
starting address and the length for the user data in its virtual
address space, and then "spanks" a GO register to trigger the
I/O device adapter's execution of a predetermined script.
The I/O device adapter stores the user data provided by the
user process in the I/O device adapter's memory, and then
transmits the completed UDP datagram 702 over the media.
An example of programming that triggers the I/O device
adapter is provided below:
`
    udpscript (void       *USERDATA_ADDRESS,
               int         USERDATA_LENGTH,
               template_t *template)
    {
        char *physaddress;
        template->IP.TotalLength = sizeof (IPHeader) +
            sizeof (UDPHeader) + USERDATA_LENGTH;
        template->IP.DatagramID = nextid ();
        ipchecksum (template);
        template->UDP.Length = sizeof (UDPHeader)
            + USERDATA_LENGTH;
        physaddress = vtophys (USERDATA_ADDRESS,
            USERDATA_LENGTH);
        udpchecksum (physaddress, USERDATA_LENGTH, template);
    }
`
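The patent does not list the body of ipchecksum(). As a hedged illustration, the sketch below applies the standard Internet checksum (the one's-complement sum of 16-bit words, as used for IP headers) to the IP header bytes of FIG. 6 with the checksum field zeroed. The function name `ip_header_checksum` and the array `fig6_ip_header` are assumptions introduced for this example. The Total Length field in those bytes is 0x001d, which is the sum computed in the listing above: 20 + 8 + 1 = 29.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over an even number of bytes,
   treating each pair as a big-endian 16-bit word. */
uint16_t ip_header_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    assert(hdr != NULL && len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* IP header 606 from FIG. 6, with the checksum field (bytes 10-11) zeroed. */
const uint8_t fig6_ip_header[20] = {
    0x45, 0x00, 0x00, 0x1d, 0xe0, 0xa1, 0x40, 0x00,
    0x40, 0x11, 0x00, 0x00,
    0x80, 0x01, 0xc0, 0x07,
    0x80, 0x01, 0xc0, 0x08
};
```

Running `ip_header_checksum(fig6_ip_header, 20)` yields 0xda1b, the IP Header Checksum value printed in FIG. 6.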
The script that executes the above function provides the
USERDATA_ADDRESS and USERDATA_LENGTH
which the user process programmed into the adapter's
memory. This information quite likely varies from datagram
602 to datagram 602. The script is also passed the appropriate
datagram 702 template based on the specific software
register (508 in FIG. 5 or 316 in FIG. 3). There are different
scripts for different types of datagrams 702 (e.g., UDP or
TCP). Also, the script would most likely make a copy of the
datagram 702 template (not shown here), so that multiple
datagrams 602 for the same user could be simultaneously in
transit.

Within the udpscript procedure described above, the
nextid() function provides a monotonically increasing
16-bit counter required by the IP protocol. The
ipchecksum() function performs a checksum for the IP
Header 706 portion of the datagram 702. The vtophys()
function performs a translation of the user-provided virtual
address into a physical address usable by the adapter. In all
likelihood, the adapter would have a very limited knowledge
of the user process' virtual address space, probably only
knowing how to map virtual-to-physical for a very limited
range, maybe as small as a single page. Pages in the user
process' virtual address space for such buffers would need to
be fixed. The udpscript procedure would need to be
enhanced if the user data were allowed to span page
boundaries. The udpchecksum() procedure generates a checksum
value for both the UDP Header 708 plus the user data (not
shown).
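The single-page limitation on address translation can be sketched as follows. This is an illustration only: the patent's vtophys() takes an address and a length and returns a pointer, while the hypothetical `vtophys_single_page` below adds an explicit one-entry mapping (`pinned_page`) and an error return so that the page-boundary case is visible. The 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Hypothetical one-entry translation table: the single fixed (pinned)
   user page that the adapter knows how to translate. */
struct pinned_page {
    uintptr_t virt_base;   /* page-aligned user virtual address */
    uint64_t  phys_base;   /* page-aligned physical address */
};

/* Translate a user buffer to a physical address.
   Returns 0 on success, or -1 if the buffer starts outside the pinned
   page or would run past the end of it. */
int vtophys_single_page(const struct pinned_page *pg, uintptr_t vaddr,
                        size_t length, uint64_t *phys_out)
{
    assert(pg != NULL && phys_out != NULL);
    if (vaddr < pg->virt_base)
        return -1;
    uintptr_t offset = vaddr - pg->virt_base;
    if (offset >= PAGE_SIZE_BYTES || length > PAGE_SIZE_BYTES - offset)
        return -1;
    *phys_out = pg->phys_base + offset;
    return 0;
}
```

A 16-byte buffer that begins 8 bytes before the end of the page is rejected by this sketch. Supporting it would require a second translation for the following page, which is the enhancement to udpscript that the text anticipates.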
The adapter would not want to be forced to access the user
data twice over the I/O bus, once for the calculation
performed by the udpchecksum() function, and a second time
for transmission over the media. Instead, the adapter would
most likely retrieve the needed user data from the user
process' virtual address space using direct memory access
(DMA) into the main memory over the bus and retrieving
the user data into some portion of the adapter's memory,
where it could be referenced more efficiently. The programming
steps performed in the udpscript() procedure above
might need to be changed to reflect that.

There are many obvious implementation choices, and the
programming in the udpscript() procedure is not meant to be
exhaustive. For example, a script could allow the user to
specify a different UDP target port, or a different destination
IP address. There are performance and security tradeoffs
based upon the more options offered to the user process. The
example given herein is intended merely to demonstrate the
essence of this invention, and not to be exhaustive.

In summary, the present invention discloses a method of
using virtual registers to directly control an I/O device
adapter to facilitate fast data transfers. The method initializes
a socket endpoint for a virtual register memory connection
within an I/O adapter's main memory. Incoming
data is then written to the virtual memory and detected by
polling or "snooping" hardware. The snooping hardware,
after detecting the write to virtual registers, generates an
exception for the system bus controller. The bus controller
then transfers the data to the I/O device adapter and initiates
the registers of the I/O device adapter to execute a
predetermined script to process the data.

The preferred embodiment of the present invention typically
implements logic on the I/O device adapter to translate
a read or write operation to a single memory address so that
it c