Computer Graphics, Volume 18, Number 3, July 1984

Manipulating Simulated Objects with Real-world Gestures using a Force and Position Sensitive Screen

Margaret R. Minsky
Atari Cambridge Research, Cambridge, Massachusetts
Author's present address: Media Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts

Abstract

A flexible interface to computing environments can be provided by gestural input. We describe a prototype system that recognizes some types of single-finger gestures and uses these gestures to manipulate displayed objects. An experimental gesture input device yields information about single-finger gestures in terms of position, pressure, and shear forces on a screen. The gestures are classified by a "gesture parser" and used to control actions in a fingerpainting program, an interactive computing system designed for young children, and an interactive digital logic simulation.

CR Categories and Subject Descriptors: I.3.6 [Computer Graphics]: Methodology and Techniques - interaction techniques; H.1.2 [Models and Principles]: User/Machine Systems - human information processing; D.2.2 [Software Engineering]: Tools and Techniques - user interfaces; I.3.1 [Computer Graphics]: Hardware Architecture - input devices

General Terms: Design, Experimentation, Languages

Additional Key Words and Phrases: gesture, touch-sensitive screen, visual programming, computers and education, paint programs

1. Introduction

We want to create worlds within the computer that can be manipulated in a concrete, natural way using gesture as the mode of interaction. The effect is intended to have a quality of "telepresence" in the sense that, to the user, the distinction between real and simulated physical objects displayed on a screen can be blurred by letting the user touch, poke, and move the objects around with finger motions.

One goal of this research is to make a natural, general-purpose interface which feels physical. Another goal is to extend some ideas from the Logo pedagogical culture - where young children learn to program and control computing environments [5] - to gestural and dynamic visual representations of programming-like activities.

How could we introduce programming ideas to very young children? They already know how to accomplish goals by using motions and gestures. So, we speculate, it would be easier for them to learn new things if we can give them the effect of handling somewhat abstract objects in our displayed worlds. For this we need to find simple languages of gesture that can be learned mostly by exploration, and to find visual representations that can be manipulated and programmed by these "gesture languages".

The "Put-That-There" project at the MIT Architecture Machine Group [2] has some goals and techniques in common with this research. We also share some goals with the "visual programming" research community.

We wanted multiple sources of gesture information, including position and configuration of the hand, velocity, and acceleration, with which to experiment with hand gestures.
Our first step was to build an experimental input device by mounting a transparent touch-sensitive screen in a force-sensing frame. This yields information about single-finger gestures in terms of position, pressure, and shear forces on the screen. Thus our system can measure the position of a touch, and the direction and intensity of the force being applied.

Sections 2, 3, and 4 of this paper describe environments that we have built that are controlled through this kind of gesture input, and our gesture classification. Section 5 describes the hardware and signal processing we use to recognize these gestures. Section 6 discusses the future directions of this work.
2. Fingerpaint: A First Gesture Environment

To explore the issues involved in this kind of gestural input, we first built a fingerpaint program. The program tracks the motion of a finger (or stylus) on the screen and paints wherever the finger moves. This application makes essential use of the finger's pressure as well as its location. It also uses the shear-force information to smooth the interpretation of the gesture information.

The user's finger squooshes a blob of paint onto the screen (Fig. 1). If the user presses harder, he gets a bigger blob of paint (Fig. 2). The user can choose from several paint colors, and can also paint with simulated spray paint (Fig. 3). In one version of this program, brush "pictures" can be picked up and stamped in other places on the screen.

Figure 1: Fingerpaint
Figure 2: Fingerpaint with Varying Pressure

Directions for a Gesture Paint Program

We would like to improve fingerpaint in the direction of making a painting system that allows more artistic control and remains sensually satisfying. At the same time we want to avoid making the system too complex for young beginners. We plan to implement a "blend" gesture, a set of paint pots out of which to choose colors with the fingers, and some brushstrokes which depend on the force contour of the painting gesture. The idea of magnification proportional to pressure used in the paint program suggests use of pressure to scale objects in other environments.

3. Parsing Gestures for Manipulating Simulated Objects

The paint program follows the finger and implicitly interprets gestures to spread paint on the screen. For applications in which discrete, previously defined objects are to be manipulated using gestures, we need more complex gesture recognition. We want the user to be able to indicate, by gestures, different actions to perform on objects. The process of recognizing these gestures can be thought of as parsing the gestures of a "gesture language".

Our gesture parser recognizes the initiation of a new gesture (just touching the screen after lifting off), then dynamically assigns to it a gesture type. It can recognize three gesture types: the "selection" gesture, the "move" gesture that consists of motion along an approximate line, and the "path" gesture that moves along a path with inflections. We are planning to introduce recognition of a gesture that selects an area of the screen. These gesture types, along with details of their state (particular trajectory, nearest object, pressure, pressure-time contour, shear direction, and so forth), are used by the system to respond to the user's motions.
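A minimal sketch of such a classifier, in Python rather than the system's Lisp, assuming a finished trajectory and simple illustrative thresholds (pressure, shear, and force-time state are omitted here; none of the numbers or tests below come from the actual system):

```python
import math

# Illustrative thresholds only; the system's real values are not given here.
SELECTION_MAX_TRAVEL = 10.0   # total motion (screen units) below which a touch is a "selection"
LINE_FIT_TOLERANCE = 5.0      # max deviation from a straight line for a "move" gesture

def classify_gesture(samples):
    """Classify a finished gesture from its touch samples.

    `samples` is a list of (x, y) positions recorded between touch-down
    and lift-off.  Returns "selection", "move", or "path".
    """
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]

    # A touch that barely moves is a selection (tap or press).
    travel = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    if travel < SELECTION_MAX_TRAVEL:
        return "selection"

    # "Move": the trajectory stays close to the straight line joining
    # the first and last samples.
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    max_dev = max(
        abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0) / length
        for x, y in zip(xs, ys)
    )
    if max_dev < LINE_FIT_TOLERANCE:
        return "move"

    # Anything with inflections is treated as a "path" gesture.
    return "path"
```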
Figure 3: Fingerpaintings
Figure 4: Forward
Figure 5: Arranging Buttons
Figure 6: A Button being Copied
4. Soft Implementations of Some Existing Visually Oriented Systems

To support our experimentation, we built a fairly general system to display the 2-D objects that are manipulated by gestures. The following sections describe environments built from these components (gesture parser and 2-D object system), and some anecdotal findings.

4.1 Button Box

The gesture system Button Box was inspired by some experiments by Radia Perlman with special terminals (called Button Box and Slot Machine) built for preliterate children [6,7]. The Slot Machine is a plexiglass bar with slots to put cards in. Each card represents a program command, for example, a Logo turtle command. A child writes a program by putting the cards in the slots in the order they want, then pushing a big red button at the end of the bar. Each card in sequence is selected by the program counter (a light bulb lights up at the selected card) and that card's command is run. This provides a concrete model of computation and procedure. With various kinds of jump and iteration cards, kids use this physical equipment to learn about control structures and debugging.

The gesture system Button Box is even more flexible than the original specially constructed hardware devices; since it is software, it can be modified and reconfigured. The current implementation makes use of some force and gesture information. It can be viewed as work in progress toward making models of computation that are particularly suited to having their pieces picked up, tapped upon, tossed about, and smudged by finger gestures.

Pictures of buttons that control various actions appear on the screen. In our example domain, the buttons are commands to a Logo-style turtle [1]. For example, one button is the FD (FORWARD) command, another is the RT (RIGHT TURN) command. If the user taps a button rather hard (think of hitting or pressing a mechanical button), the button "does its thing": whatever action the button represents happens. If the FD button is tapped, the display turtle moves forward (Fig. 4).

If the user selects a button by applying fairly constant pressure to it for a longer time than a "tap" gesture, the gesture is interpreted as a desire to move the selected button on the screen. The button follows the finger until the finger lifts off the screen, and the button remains in its new position. This allows the user to organize the buttons in any way that makes sense to him; for example, the user may place buttons in a line in the order in which they should be tapped to make the turtle draw something (Fig. 5).

Some of these buttons control rather concrete actions such as moving the turtle or producing a beep sound. Other buttons represent more abstract concepts; for example, the PU/PD button represents the state of the turtle's drawing pen. When the PU/PD button is tapped it changes the state of the turtle's pen, and it also changes its own label. There are also buttons which operate on the other buttons. The COPY button can be moved to overlap any other button, and then tapped to produce a copy of the overlapped button (Fig. 6).

Some concepts in programming are available in the Button Box world. The environment lends itself to thinking about the visual organization of actions. In our anecdotal studies of non-programmers using the Button Box, most of our subjects produced a library of copies of turtle commands and arranged them systematically on the screen. They then chose from the library the buttons that allowed them to control the turtle in a desired way and arranged them at some favored spot on the screen.
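A simplified dispatch for the tap and press-to-move behaviors described above might look as follows, assuming a hypothetical Button object with action() and move_to() methods and illustrative pressure and duration thresholds (a sketch only, not the system's implementation):

```python
# Illustrative thresholds; not values from the actual system.
TAP_MAX_DURATION = 0.3    # seconds: longer touches are interpreted as "move the button"
TAP_MIN_PRESSURE = 2.0    # a tap must be pressed "rather hard"

def handle_button_gesture(button, gesture):
    """Dispatch a finished gesture aimed at `button`.

    `gesture` is assumed to expose duration, peak_pressure, and the
    sequence of touched positions.
    """
    if gesture.duration <= TAP_MAX_DURATION and gesture.peak_pressure >= TAP_MIN_PRESSURE:
        # A short, hard touch: the button "does its thing".
        button.action()
    else:
        # Sustained, fairly constant pressure: drag the button, leaving it
        # wherever the finger lifts off.
        for x, y in gesture.positions:
            button.move_to(x, y)
```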
There are mechanisms for explicitly creating simple procedures. At this time, only unconditionally ordered sequences of actions represented by sequences of button pushes are available; we are working on representations of conditionals and variables. The user can specify a sequence of buttons to be grouped into a procedure. We have experimented with two ways of gathering buttons into procedures: boxes and magic paint.

The first method uses boxes. The BOX button, when tapped, turns into a box. The box is stretchy and its corners can be moved, so it can be expanded to any size and placed around a group of buttons (or the user can move buttons into the box). There are buttons which, when tapped, make the system "tap" every button in the box in sequence (Fig. 7).

The second method uses "magic paint". Magic paint is a genie button. As the user moves it, it paints. The user uses it to paint over a sequence of buttons. The path created shows the sequence in which buttons should be pushed. When the end of the paint path is tapped, the system "goes along" the path, tapping each button in sequence (Fig. 8).

The user can group buttons with either method and have the system "push its own buttons". The user can also tell the system to create a new button from this grouping. The CLOSE button closes up a box and makes a new button. The new button becomes part of the Button Box world with a status equal to any other button. The new button appears on the screen, can be moved, and can be tapped like any other. Thus it becomes a subroutine (Fig. 9).

Most of our experiments so far have used the box metaphor. We plan to develop gesture semantics for magic paint, which seems more promising because the paint path makes the order of button pushes more explicit than the box grouping. It feels more "gestural" to program by drawing a swooping path.
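Both grouping methods reduce to running an ordered list of buttons; a rough sketch follows, with invented helper names, an arbitrary ordering rule for boxes, and an assumed hit-test radius (none of these details come from the system itself):

```python
def buttons_in_box(box, buttons):
    """Buttons enclosed by a box, ordered here (arbitrarily) top-to-bottom, left-to-right."""
    inside = [b for b in buttons if box.contains(b.position)]
    return sorted(inside, key=lambda b: (b.position[1], b.position[0]))

def buttons_on_paint_path(path, buttons, hit_radius=20.0):
    """Buttons in the order the magic-paint path passes over them."""
    ordered = []
    for x, y in path.points:
        for b in buttons:
            if b not in ordered and b.hit_test(x, y, hit_radius):
                ordered.append(b)
    return ordered

def run_sequence(sequence):
    """Tap every button in the grouping, in sequence."""
    for button in sequence:
        button.action()
```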
Figure 7: Making a Box and Using it to Group Buttons
Figure 8: Using Magic Paint to Group Buttons
Figure 9: Creating a New Button by Closing a Box
Figure 10: Logic objects: Gates, HI input, clock input (blurred), output light
4.2 Logic Design - Rocky's Boots

We have applied the same set of gestures to make a smooth interface to another environment: a graphic logic-design system based on Warren Robinett's program, Rocky's Boots [Robinett 82].

Gates, wires, logic inputs, and outputs are the objects in this environment. The user moves them around in the same way as buttons, and they are always "doing their thing". They connect to each other when an input is brought close to an output. The user can cut them apart by making a gesture while holding a knife (Figs. 10, 11, 12). Since these logic components are objects, like buttons, we can operate on them in the same ways we can with buttons (e.g. copying with the COPY button) (Fig. 13).

There are actions we plan to implement in the logic world, by creating buttons to perform new actions on the objects or, in some cases, by recognizing new gestures. For example, we could get rid of the need for the knife object by recognizing a "cutting" gesture. We haven't yet defined mechanisms for creating new logic elements from combinations of existing ones, but these are the kinds of extensions that can be made with the current gesture repertoire.

4.3 Rooms

All of the gesture-controlled environments (button box programming, interactive logic simulation, and a rudimentary Colorforms™-like environment) have been combined in an information environment we call Rooms. Rooms is an extension of a visual representation for adventure game maps and other visual information designed by Warren Robinett [8,9]. There are rooms which contain objects. The rooms connect to each other through doorways. In our implementation, each room takes up the whole screen, and doors are at the edge of the screen. As the user's finger moves through a doorway, the adjacent room appears on the screen, filled with whatever objects it contains. The user can drag any object (button, logic gate, etc.) through a doorway.

Our environment starts with a room containing the Button Box environment, a room containing the logic objects, a room containing colorforms, and an extra room with a few miscellaneous buttons in it. Colorforms is a more free-form environment consisting of colorform-like shapes that can be moved around and stamped with finger gestures to create pictures. Our shapes are a face, eyes, mouth, and so forth, that can be grabbed and moved around (Fig. 14).

4.4 Combining Environments

One of the tenets of developing good gesture interfaces is that the objects being manipulated by gestures should act like possible physical objects in their reactions to the user's gestures. When the objects are brought near each other they should interact in plausible ways. We have seen that the COPY button can copy logic elements. We introduced logic elements that can act on buttons. A special output, the HAND, taps a button when it is clocked (Fig. 15). The circuit in the figure is an example of parallel processing invented by a user who was experimenting with the interface between the Button World and the Logic World (Fig. 16).

Amazing! Or is it? Nobody is amazed when a real object, like a teapot, can be stacked on top of another kind of real object, for example, a table. However, programs can hardly ever fit together meaningfully, much less smoothly. The object nature of these programming-like environments, and their necessary analogy to physical objects deriving from the gesture interface, has allowed this smooth combination to happen.

Reflecting on our box and magic paint programming metaphors, we can see that in these worlds, procedures are things which embody processes in their behavior. Here, the processes are represented by the paths of the user's gestures as he constructs a configuration of objects on the screen.
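A sketch of the HAND output described above, treated as an ordinary logic element whose clock edge invokes the same tap action a finger would; the class shape and update protocol are assumed for the illustration and are not the system's Lisp implementation:

```python
class Hand:
    """A logic-world output that taps a button-world object when clocked."""

    def __init__(self, target_button=None):
        self.target_button = target_button   # set by dragging the HAND over a button
        self._last_input = 0

    def update(self, input_value):
        # On a rising edge of its logic input, the HAND behaves like a
        # finger tap on the button it is attached to.
        if input_value and not self._last_input and self.target_button is not None:
            self.target_button.action()
        self._last_input = input_value
```

Because buttons and logic elements present the same movable, tappable object interface, a clocked circuit can drive Button Box actions in this way, which is what makes the user-invented "parallel processing" circuit possible.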
5. Gesture System Hardware and Software

Hardware Configuration

We mounted a commercially available transparent, resistive-film, touch-sensitive screen on the face of a color display monitor. The touch screen is supported by four force-measuring strain gauges ("load cells") at the four corners of the screen. The mechanical arrangement is shown in Figure 17.

Figure 17: Mounted Force Sensors and Position Sensitive Screen

We note that a touch screen that used a different arrangement to obtain some force information from strain gauges was built in the 1970's at the MIT Architecture Machine Group.
Figure 11: A Circuit with a Connection being Put In
Figure 12: The Knife being used to Remove a Component
Figure 13: Copying a Logic Object
Figure 14: Colorforms Room
Figure 15: HAND Tapping the FD Button
Figure 16: A "Program" that Draws a Circle
The load cells are mounted so that they supply useful force information through a 0-10 lb. range for finger gestures, and are protected from damage due to overloading. To make the user feel that they are actually "touching" objects on the screen, the surface touched by the user must be as close as possible to the monitor face to prevent parallax problems. In this arrangement, the position-sensitive panel floats about 1/8" above the display surface. A block diagram of the mechanical connections and information flow in our system is shown in Figure 18.

Figure 18: Block Diagram (finger or stylus, touch screen, force sensors, Lisp Machine, color display)

Our gesture system software is written in LISP and runs on a LISP Machine. Raw position and load information is processed by the gesture software into recognized motions or gestures. The effects of the gestures on one of our gesture-interfaced environments are computed, and the LISP Machine color system creates the display seen through the transparent gesture-sensing screen.

Signal Processing

A variety of smoothing and calibration strategies are necessary. The position screen reports a finger position with a nominal resolution of 4K x 4K. The force sensor hardware reports twelve bits of force data from each of the four load cells.

To get the display and touch screens into good registration, the system performs an initial calibration when it is turned on. This includes finding linear scaling factors and offsets for the x and y position components. The x and y force offsets are also calculated.

During operation, an asynchronous process interprets the position and the four force sensor inputs in terms of finger gestures. This process looks for the initiation of a touch, waits until it has computed a trustworthy, filtered initial position, and then signals that a new gesture has begun. This process tracks the trajectory of the finger during a gesture. The tracking of touched positions is aided by a three-point median smoothing algorithm applied to the position data and by using continuity constraints derived from the force data. The gesture process uses local continuity of the shear force magnitude and direction to ignore sudden changes in sensed position which may be due to position screen noise. Most position readings that are associated with these sudden changes must be ignored, since finger trajectories on the screen are mechanically constrained (over short time scales) to have shear forces parallel to the direction of motion, and smooth rotation of shear at inflection points. Another asynchronous process recalibrates the force sensors every few seconds.

The load cells are arranged at the corners of the screen:

1 ------------- 2
|               |
|               |
3 ------------- 4

If the loads (corrected for zero-force offset) on the load cells are S1, S2, S3, and S4, the z-force (pressure) is:

PRESSURE = S1 + S2 + S3 + S4

We compute the components in the plane of the screen from:

X-FORCE (horizontal) = (S1 + S3) - (S2 + S4) - zero calibration offset
Y-FORCE (vertical) = (S1 + S2) - (S3 + S4) - zero calibration offset

During calibration and for these calculations we assume that the screen is flat, the load cells are at the corners, the load cells' data is not noisy, and the touch position from the position-sensing screen (after smoothing) is correct.
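A direct transcription of these formulas, assuming readings s1-s4 already corrected for the zero-force offset and numbered as in the diagram (1 and 2 across the top, 3 and 4 across the bottom); the function name and signature are illustrative only:

```python
def forces_from_load_cells(s1, s2, s3, s4, x_zero_offset=0.0, y_zero_offset=0.0):
    """Compute normal pressure and in-plane force components from the four
    corner load cells (readings already corrected for zero-force offset)."""
    pressure = s1 + s2 + s3 + s4                      # z-force into the screen
    x_force = (s1 + s3) - (s2 + s4) - x_zero_offset   # horizontal (left minus right)
    y_force = (s1 + s2) - (s3 + s4) - y_zero_offset   # vertical (top minus bottom)
    return pressure, x_force, y_force
```

The two zero calibration offsets correspond to the x and y force offsets computed during the start-up calibration.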
There are no corrections for nonlinearity over the screen. There is noticeable nonlinearity, but it does not seem to affect gesture recognition and tracking. Thus the user has to "get the feel" of the screen, since tracking in the corners feels slightly different than in the center. We plan to correct for this. It may become more important if the system is trying to recognize more complex gestures than we have worked with so far.
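A rough sketch of the tracking described above, combining a three-point median filter on raw positions with rejection of sudden jumps whose direction disagrees with the locally continuous shear direction; the angular tolerance and the exact rejection rule are assumptions made for the sketch:

```python
import math

SHEAR_ANGLE_TOLERANCE = math.radians(60)   # assumed: allowed deviation of motion from shear direction

def median3(a, b, c):
    return sorted((a, b, c))[1]

def smooth_positions(raw_points, shear_angles):
    """Three-point median smoothing plus a shear-continuity check.

    `raw_points` is a list of (x, y) touch-screen readings; `shear_angles`
    gives the direction of the shear force (radians) at each sample.
    Readings whose implied motion direction disagrees strongly with the
    shear direction are treated as position-screen noise and dropped.
    """
    smoothed = []
    for i in range(1, len(raw_points) - 1):
        x = median3(raw_points[i - 1][0], raw_points[i][0], raw_points[i + 1][0])
        y = median3(raw_points[i - 1][1], raw_points[i][1], raw_points[i + 1][1])
        if smoothed:
            px, py = smoothed[-1]
            if math.hypot(x - px, y - py) > 0:
                motion_angle = math.atan2(y - py, x - px)
                diff = abs((motion_angle - shear_angles[i] + math.pi) % (2 * math.pi) - math.pi)
                if diff > SHEAR_ANGLE_TOLERANCE:
                    continue   # likely a spurious jump: ignore this reading
        smoothed.append((x, y))
    return smoothed
```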
We have gained an advantage in this setup from multiple sources of gesture information, e.g. the use of local continuity of forces to help track positions. We could theoretically derive position from the force sensors, but the touch-sensitive screen gives us higher position resolution and perhaps more reliability. We do not cross-check, because this system worked well enough for our purposes.

6. Future Directions

Directions for Gesture Programming

There are several force-controlled gestures with which we have experimented briefly and on which we plan to do more work. An example is "flicking". The user can send an object to another part of the screen by flicking it with the finger, as in tiddlywinks. Shear is used to determine the direction of motion, and the force determines the initial velocity of the object, which slows down by "friction".

We intend to create more application worlds to put into the Rooms. For example, we plan extensions to the Button Box, a gesture-controlled kit for making treasure maps, and a button environment in which the buttons represent musical notes and phrases.

We would like to study classes of gestures that are useful in systems intended for expert use, in other words, systems where the gestures may be useful once learned but are not as easy or obvious as the ones in our current repertoire. We see the need to do more careful motion studies, and to record more data about people's ability to use the gestures we recognize. There is already extensive literature in related areas, for example [4].

We have several ideas for gestures we would like to recognize that make more use of force and force-time contours to express analog quantities. We also plan to build a multiple-finger touch version of our current hardware to allow gestures that use more of the hand. We also intend to explore vision-based gesture input, which allows the user much more freedom. There is ongoing research in this laboratory on real-time visual interpretation of whole-body gesture as input to a motion and animation display [3]. We may also explore recognition of hand gestures through vision of hand silhouettes.

Directions for Information Layout

This research has prompted the beginning of a project to use gestural input in "Information Spaces". A prototype of an information organization system has been created that displays representations of file systems and large programs. There are objects displayed in these representations that are analogous to our logic gates and buttons. This system uses a modified mouse and recognizes more "iconic" gestures than the systems described in this paper.

Acknowledgements

I would like to thank Danny Hillis for providing the initial leadership for this project, and for continuing ideas, inventions, and support. Ed Hardebeck has done a large amount of the design and much of the implementation of the systems described in this paper. Others who worked on this project are Dan Huttenlocher, Gregor Kiczales, Warren Robinett, and Fred Thornburgh. David Wallace and David Chapman participated in early design of the gesture environments. Special thanks go to Cynthia Solomon, Director of Atari Cambridge Research, for reading many drafts of this paper; thanks for help with the paper also go to Ed Hardebeck, Danny Hillis, Dan Huttenlocher, and Marvin Minsky.

References

1. Abelson, Harold, and Andrea DiSessa. Turtle Geometry. MIT Press, 1981.
2. Schmandt, C. and Hulteen, E. "The Intelligent Voice-Interactive Interface". Proc. Human Factors in Computer Systems, Gaithersburg, MD, 1982.
3. Hardebeck, Edward F. "Gestural Input to Computers through Visual Recognition of Body Silhouettes". Atari Cambridge Research Internal Memo 3, Atari Cambridge Research, January 1984.
4. Loomis, Jeffrey, Poizner, Howard, Bellugi, Ursula, Blakemore, Alynn, and John Hollerbach. "Computer Graphic Modeling of American Sign Language". Proc. SIGGRAPH '83, Computer Graphics 17, no. 3, July 1983.
5. Papert, Seymour A. Mindstorms. Basic Books, New York, 1980.
6. Perlman, Radia. "Tortis: Toddler's Own Recursive Turtle Interpreter System". Logo Memo 9, Logo Laboratory, Massachusetts Institute of Technology, Cambridge, July 1974.
7. Perlman, Radia. "How to Use the Slot Machine". Logo Working Paper 43, Logo Laboratory, Massachusetts Institute of Technology, Cambridge, January 1976.
8. Atari VCS Adventure. Software product of Atari, Inc., Sunnyvale, CA, 1979.
9. Rocky's Boots. Software product of The Learning Co., Portola Valley, CA, 1982.

Colorforms is a trademark of Colorforms, Inc.