Google Inc. v. At Home Bondholders' Liquidating Trust
IPR2015-00657

CGI Programming
on the World Wide Web

Shishir Gundavaram

Cambridge · Köln · Paris · Sebastopol · Tokyo

O’Reilly & Associates, Inc.
`
`
CGI Programming on the World Wide Web
by Shishir Gundavaram

Copyright © 1996 O’Reilly & Associates, Inc. All rights reserved.
Printed in the United States of America.

Published by O’Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472.

Editors: Andy Oram and Linda Mui
`
Production Editor: Jane Ellin
`
Printing History:

March 1996: First Edition

Nutshell Handbook and the Nutshell Handbook logo are registered trademarks and the Java
Series is a trademark of O’Reilly & Associates, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in this book, and O’Reilly &
Associates, Inc. was aware of a trademark claim, the designations have been printed in caps
or initial caps.

While every precaution has been taken in the preparation of this book, the publisher assumes
no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.

This book is printed on acid-free paper with 85% recycled content, 15% post-consumer waste.
O’Reilly & Associates is committed to using paper with the highest recycled content available
consistent with high quality.
`
ISBN: 1-56592-168-2
`
`[1/98]
`
Chapter 1: The Common Gateway Interface (CGI)
`
• ArchiePlex Gateway
  A gateway to the Archie search server. Allows the user to search for a specific
  string and returns a virtual hypertext document. This useful gateway is
  located at http://pubweb.nexor.co.uk/public/archie/archieplex/archieplex.html.
  A simple Archie gateway is presented in Chapter 10.
`
• Guestbook with World Map
  A guestbook is a forms-based application that allows users to leave messages
  for everyone to see. Though there are numerous guestbooks on the Web, this
  is one of the best. You can access it at http://www.cosy.sbg.ac.at/rec/guestbook/.
`
• Japanese <-> English Dictionary
  A sophisticated CGI program that queries the user for an English word, and
  returns a virtual document with graphic images of an equivalent Japanese
  word, or vice versa. It can be accessed at
  http://www.wg.omron.co.jp/cgi-bin/j-e?SASE=jfriedl.html or at
  http://enterprise.ic.gc.ca/cgi-bin/j-e.
`
Although most of these documents are curiosities, they illustrate the powerful
aspects of CGI. The interface allows for the creation of highly effective virtual
documents using forms and gateways.
`
Internal Workings of CGI

So how does the whole interface work? Most servers expect CGI programs and
scripts to reside in a special directory, usually called cgi-bin, and/or to have a
certain file extension. (These configuration parameters are discussed in the
“Configuring the Server” section in this chapter.) When a user opens a URL
associated with a CGI program, the client sends a request to the server asking for
the file.

For the most part, the request for a CGI program looks the same as it does for all
Web documents. The difference is that when a server recognizes that the address
being requested is a CGI program, the server does not return the file contents
verbatim. Instead, the server tries to execute the program. Here is what a sample
client request might look like:
`
GET /cgi-bin/welcome.pl HTTP/1.0
Accept: www/source
Accept: text/html
Accept: image/gif
User-Agent: Lynx/2.4 libwww/2.14
From: shishir@bu.edu
`
This GET request identifies the file to retrieve as /cgi-bin/welcome.pl. Since the
server is configured to recognize all files in the cgi-bin directory tree as CGI
programs, it understands that it should execute the program instead of relaying it
directly to the browser. The string HTTP/1.0 identifies the communication
protocol to use.
`
The client request also passes the data formats it can accept (www/source,
text/html, and image/gif), identifies itself as a Lynx client, and sends user
information. All this information is made available to the CGI program along with
additional information from the server.
`
The way that CGI programs get their input depends on the server and on the
native operating system. On a UNIX system, CGI programs get their input from
standard input (STDIN) and from UNIX environment variables. These variables
store such information as the input search string (in the case of a form), the
format of the input, the length of the input (in bytes), the remote host and user
passing the input, and other client information. They also store the server name,
the communication protocol, and the name of the software running the server.
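
As a rough illustration of the UNIX case just described, the following Perl sketch gathers a request's raw input from environment variables and standard input. The variable names REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH follow the usual CGI convention rather than anything stated in this passage. The helper name read_cgi_input is hypothetical, and passing the environment and the input handle as arguments is an assumption made so the routine can be tried outside a server.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the raw (still encoded) request data.
# $env is a reference to a hash of environment variables (normally \%ENV);
# $in is a filehandle to read the request body from (normally \*STDIN).
sub read_cgi_input {
    my ($env, $in) = @_;
    my $method = $env->{REQUEST_METHOD} || 'GET';

    if ($method eq 'POST') {
        # Assumed convention: CONTENT_LENGTH holds the body length in bytes.
        my $length = $env->{CONTENT_LENGTH} || 0;
        my $data   = '';
        read($in, $data, $length) if $length > 0;
        return $data;
    }

    # Otherwise the server has placed the query string in the environment.
    return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
}

# When run directly as a script, report what was received.
unless (caller) {
    my $input = read_cgi_input(\%ENV, \*STDIN);
    print "Content-type: text/plain\n\n";
    print "Raw input: $input\n";
}
```

Run from the shell with QUERY_STRING set in the environment, the script simply echoes that string back; under a server it would echo whatever the client sent.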
`
Once the CGI program starts running, it can either create and output a new
document, or provide the URL to an existing one. On UNIX, programs send their
output to standard output (STDOUT) as a data stream. The data stream consists of
two parts. The first part is either a full or partial HTTP header that (at minimum)
describes what format the returned data is in (e.g., HTML, plain text, GIF, etc.). A
blank line signifies the end of the header section. The second part is the body,
which contains the data conforming to the format type reflected in the header.
The body is not modified or interpreted by the server in any way.
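
The two-part data stream can be sketched in a few lines of Perl. This is a minimal example assuming a partial header that carries only a Content-type line; the helper name build_response and the sample HTML are hypothetical.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the data stream a CGI program writes to STDOUT.
sub build_response {
    my ($content_type, $body) = @_;
    # Part one: a partial header naming the data format, ended by a blank line.
    # Part two: the body, in the format the header announced.
    return "Content-type: $content_type\n\n" . $body;
}

my $html = "<HTML>\n<HEAD><TITLE>Hello</TITLE></HEAD>\n"
         . "<BODY><H1>Hello from a CGI program</H1></BODY>\n</HTML>\n";

print build_response('text/html', $html);
```

Because the header here is only partial, the server would be expected to supply the remaining header lines before passing the document on to the client.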
`
A CGI program can choose to send the newly created data directly to the client or
to send it indirectly through the server. If the output consists of a complete HTTP
header, the data is sent directly to the client without server modification. (It’s
actually a little more complicated than this, as we will discuss in Chapter 3, Output
from the Common Gateway Interface.) Or, as is usually the case, the output is sent
to the server as a data stream. The server is then responsible for adding the
complete header information and using the HTTP protocol to transfer the data to
the client.
`
Here is the sample output of a program generating an HTML virtual document,
with the complete HTTP header:

HTTP/1.0 200 OK
Date: Thursday, 22-February-96 08:28:00 GMT
Server: NCSA/1.4.2
MIME-version: 1.0
Content-type: text/html
Content-length: 2000

<HTML>
<HEAD><TITLE>Welcome to Shishir's WWW Server!</TITLE></HEAD>
`
Chapter 4: Forms and CGI
`
First, each form element’s name (specified by the NAME attribute) is equated
with the value entered by the user to create a key-value pair. For example, if the
user entered “30” when asked for the age, the key-value pair would be (age=30).
Each key-value pair is separated by the “&” character.
`
Second, since the variable names for the form element and the actual form data
are standard text, it is possible this text could consist of characters that will
confuse browsers. To prevent possible errors, the encoding scheme translates all
“special” characters to their corresponding hexadecimal codes. These “special”
characters include control characters and certain alphanumeric symbols. For
example, the string “Thanks for the help!” would be converted to
“Thanks%20for%20the%20help%21”. This process is repeated for each key-value
pair to create a query string.*
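
The two encoding steps can be sketched in Perl as follows. The helper names encode_component and build_query_string are hypothetical, and the set of characters left unencoded (letters, digits, underscore, period, hyphen) is an assumption; real browsers may differ in which characters they treat as “special.” The sketch also assumes single-byte characters.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Replace every character outside a conservative "safe" set with %XX,
# where XX is the character's two-digit hexadecimal code.
sub encode_component {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.-])/sprintf("%%%02X", ord($1))/eg;
    return $text;
}

# Join key=value pairs with "&". Pairs arrive as a flat list
# (key, value, key, value, ...) so that their order is preserved.
sub build_query_string {
    my @fields = @_;
    my @pairs;
    while (@fields) {
        my $key   = shift @fields;
        my $value = shift @fields;
        $value = '' unless defined $value;   # empty fields are still sent
        push @pairs, encode_component($key) . '=' . encode_component($value);
    }
    return join('&', @pairs);
}

# Prints: user=Larry%20Bird&age=35&pass=testing
print build_query_string(user => 'Larry Bird', age => 35, pass => 'testing'), "\n";
```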
`
For text and password fields, the user input will represent the value. If no
information was entered, the key-value pair will be sent anyway, with the value
left blank (i.e., “name=”).

For radio buttons and checkboxes, the VALUE attribute represents the value when
the button element is checked. If no VALUE is specified, the value defaults to
“on.” An unchecked checkbox will not be sent as a key-value pair; it will be
ignored.

The CGI program then has to “decode” this information in order to access the
form data. The encoding scheme is the same for both GET and POST.
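
A decoding routine might look like the following Perl sketch. The helper names decode_component and parse_query_string are hypothetical. Treating “+” as a space is an assumption carried over from common browser behavior, and when a key appears more than once this simple version keeps only the last value.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Undo the encoding of a single key or value.
sub decode_component {
    my ($text) = @_;
    $text =~ tr/+/ /;                               # assumed: "+" stands for a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %XX back to one character
    return $text;
}

# Split a query string into a hash of decoded key-value pairs.
sub parse_query_string {
    my ($query) = @_;
    my %form;
    foreach my $pair (split /&/, $query) {
        next if $pair eq '';
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;          # e.g. "name=" has a blank value
        $form{ decode_component($key) } = decode_component($value);
    }
    return %form;
}

my %form = parse_query_string('user=Larry%20Bird&age=35&name=');
foreach my $key (sort keys %form) {
    print "$key = [$form{$key}]\n";
}
```

A blank text field such as “name=” still produces a key, with an empty value, which matches the behavior described for text and password fields.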
`
GET vs. POST

There are two methods for sending form data: GET and POST. The main difference
between these methods is the way in which the form data is passed to the CGI
program. If the GET method is used, the query string is simply appended to the
URL of the program when the client issues the request to the server. This query
string can then be accessed by using the environment variable QUERY_STRING.
Here is a sample GET request by the client, which corresponds to the first form
example:
`
GET /cgi-bin/program.pl?user=Larry%20Bird&age=35&pass=testing HTTP/1.0
Accept: www/source
Accept: text/html
Accept: text/plain
User-Agent: Lynx/2.4 libwww/2.14
`
* Before the forms interface, the only way you could retrieve user information was through a search field
(i.e., <ISINDEX>), which passed the data to the server with spaces converted to plus signs (“+”).
`
Sending Data to the Server
`
As we discussed in Chapter 2, the query string is appended to the URL after the
“?” character. The server then takes this string and assigns it to the environment
variable QUERY_STRING.
`
The GET method has both advantages and disadvantages. The main advantage is
that you can access the CGI program with a query without using a form. In other
words, you can create “canned queries.” Basically, you are passing parameters to
the program. For example, if you want to send the previous query to the program
directly, you can do this:
`
<A HREF="/cgi-bin/program.pl?user=Larry%20Bird&age=35&pass=testing">CGI Program</A>
`
Here is a simple program that will aid you in encoding data:

#!/usr/local/bin/perl

# Prompt for a string and read one line from standard input.
print "Please enter a string to encode: ";
$string = <STDIN>;
chop ($string);    # remove the trailing newline

# Replace each non-word character with "%" and its two-digit hexadecimal code.
$string =~ s/(\W)/sprintf("%%%02x", ord($1))/eg;

print "The encoded string is: ", "\n";
print $string, "\n";
exit(0);
`
This is not a CGI program; it is meant to be run from the shell. When you run the
program, the program will prompt you for a string to encode. The <STDIN>
operator reads one line from standard input. It is similar to the <FILEHANDLE>
construct we have been using. The chop command removes the trailing newline
character (“\n”) from the input string. Finally, the user-specified string is
converted to a hexadecimal value with the sprintf command, and printed out to
standard output.
`
A query is one method of passing information to a CGI program via the URL. The
other method involves sending extra path information to the program. Here is an
example:

<A HREF="/cgi-bin/program.pl/user=Larry%20Bird/age=35/pass=testing">CGI Program</A>
`
The string “/user=Larry%20Bird/age=35/pass=testing” will be placed in the
environment variable PATH_INFO when the request gets to the CGI program. This
method of passing information to the CGI program is generally used to provide
file information, rather than form data. The NCSA imagemap program works in
`
* The information in the password field is not encrypted in any way; it is plain text. You have to be very
careful when asking for sensitive data using the password field. If you want security, please use server
authentication.
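
The extra path information discussed above could be unpacked along the following lines. This is only a sketch: the helper name parse_path_info is hypothetical, and it assumes the path arrives exactly as in the example, with segments of the form key=value that are still encoded. Some servers may decode the path before the program sees it, in which case the decoding step would be unnecessary.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "/key=value/key=value" into a hash.
sub parse_path_info {
    my ($path_info) = @_;
    my %info;
    foreach my $segment (split m{/}, $path_info) {
        next if $segment eq '';                      # piece before the leading "/"
        my ($key, $value) = split /=/, $segment, 2;
        $value = '' unless defined $value;
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # assumed: still encoded
        $info{$key} = $value;
    }
    return %info;
}

# A CGI program would pass $ENV{PATH_INFO}; the sample string is used here
# so the sketch can be run from the shell.
my %info = parse_path_info('/user=Larry%20Bird/age=35/pass=testing');
foreach my $key (sort keys %info) {
    print "$key = $info{$key}\n";
}
```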
`