`Writing Apache Modules
`With Perl and C
`
Lincoln Stein and Doug MacEachern

Beijing - Cambridge - Farnham - Köln - Sebastopol - Tokyo

O'REILLY®

Writing Apache Modules with Perl and C
by Lincoln Stein and Doug MacEachern

Copyright © 1999 O'Reilly & Associates, Inc. All rights reserved.
Printed in the United States of America.

Published by O'Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472.
`
`Editor: Linda Mui
`
`Production Editor: Melanie Wang
`
Printing History:

March 1999: First Edition.
`
The association between the image of a white-tailed eagle and the topic of Apache modules
is a trademark of O'Reilly & Associates, Inc. Nutshell Handbook, the Nutshell Handbook logo,
and the O'Reilly logo are registered trademarks of O'Reilly & Associates, Inc. Many of the
designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O'Reilly & Associates, Inc.
was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher assumes
no responsibility for errors or omissions, or for damages resulting from the use of the
information contained herein.
`
`ISBN: 978-1565925674)
In this chapter:
• Content Handlers as File Processors
• Virtual Documents
• Redirection
• Processing Input
• Apache::Registry
• Handling Errors
• Chaining Content Handlers
• Method Handlers

Content Handlers
`
This chapter is about writing content handlers for the Apache response phase,
when the contents of the page are actually produced. In this chapter you'll learn
how to produce dynamic pages from thin air, how to modify real documents on
the fly to produce effects like server-side includes, and how Apache interacts with
the MIME-typing system to select which handler to invoke.

Starting with this chapter we shift to using the Apache Perl API exclusively for
code examples and function prototypes. The Perl API covers the majority of what
C programmers need to use the C-language API. What's missing are various memory
management functions that are essential to C programmers but irrelevant in
Perl. If you are a C programmer, just have patience and the missing pieces will be
filled in eventually. In the meantime, follow along with the Perl examples and
enjoy yourself. Maybe you'll even become a convert.
`
Content Handlers as File Processors

Early web servers were designed as engines for transmitting physical files from the
host machine to the browser. Even though Apache does much more, the file-oriented
legacy still remains. Files can be sent to the browser unmodified or passed through
content handlers to transform them in various ways before sending them on to the
browser. Even though many of the documents that you produce with modules have no
corresponding physical files, some parts of Apache still behave as if they did.

When Apache receives a request, the URI is passed through any URI translation
handlers that may be installed (see Chapter 7, Other Request Phases, for information
on how to roll your own), transforming it into a file path. The mod_alias
translation handler (compiled in by default) will first process any Alias, ScriptAlias,
Redirect, or other mod_alias directives. If none applies, the http_core default
translator will simply prepend the DocumentRoot directory to the beginning of the URI.
`
Next, Apache attempts to divide the file path into two parts: a "filename" part,
which usually (but not always) corresponds to a physical file on the host's filesystem,
and an "additional path information" part corresponding to additional stuff
that follows the filename. Apache divides the path using a very simple-minded
algorithm. It steps through the path components from left to right until it finds
something that doesn't correspond to a directory on the host machine. The part of
the path up to and including this component becomes the filename, and everything
that's left over becomes the additional path information.
`
Consider a site with a document root of /home/www that has just received a
request for URI /abc/def/ghi. The way Apache splits the file path into filename and
path information parts depends on what directories it finds in the document root:
`
Physical Directory       Translated Filename      Additional Path Information
/home/www                /home/www/abc            /def/ghi
/home/www/abc            /home/www/abc/def        /ghi
/home/www/abc/def        /home/www/abc/def/ghi    empty
/home/www/abc/def/ghi    /home/www/abc/def/ghi    empty

`
Note that the presence of any actual files in the path is irrelevant to this process.
The division between the filename and the path information depends only on
what directories are present.
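
The splitting rule described above can be sketched in a few lines of plain Perl. This is only an illustration, not Apache's actual implementation: the helper name split_path() and the callback that decides what counts as a directory are assumptions made for the example, so that it runs without touching the real filesystem.

```perl
use strict;
use warnings;

# Hypothetical helper: split a translated path into (filename, path_info).
# $is_dir is a callback that reports whether a path is a directory; a real
# server would consult the filesystem instead.
sub split_path {
    my ($docroot, $uri, $is_dir) = @_;
    my @parts    = grep { length } split m{/}, $uri;
    my $filename = $docroot;
    while (@parts) {
        $filename .= '/' . shift @parts;
        # Stop at the first component that is not a directory.
        last unless $is_dir->($filename);
    }
    my $path_info = @parts ? '/' . join('/', @parts) : '';
    return ($filename, $path_info);
}

# Pretend that only /home/www and /home/www/abc exist as directories.
my %dirs = map { $_ => 1 } ('/home/www', '/home/www/abc');
my ($filename, $path_info) =
    split_path('/home/www', '/abc/def/ghi', sub { $dirs{ $_[0] } });
print "$filename $path_info\n";
```

With the two directories assumed here, the sketch reproduces the second row of the table: the filename is /home/www/abc/def and the additional path information is /ghi.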
`
Once Apache has decided where the file is in the path, it determines what MIME
type it might be. This is again one of the places where you can intervene to alter
the process with a custom type handler. The default type handler (mod_mime) just
compares the filename's extension to a table of MIME types. If there's a match, this
becomes the MIME type. If no match is found, then the MIME type is undefined.
Again, note that this mapping from filename to MIME type occurs even when
there's no actual file there.
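
To make the extension lookup concrete, here is a minimal sketch in plain Perl. The table contents and the function name guess_mime_type() are invented for illustration; mod_mime's real table is much larger and is loaded from a configuration file. The point is only that the lookup inspects the name and never the file itself.

```perl
use strict;
use warnings;

# A tiny, made-up extension table; real servers load theirs from a file.
my %MIME_TYPES = (
    html => 'text/html',
    txt  => 'text/plain',
    gif  => 'image/gif',
);

# Return the MIME type for a filename, or undef if the extension is unknown.
# The file does not need to exist: only the name is examined.
sub guess_mime_type {
    my $filename = shift;
    my ($ext) = $filename =~ /\.([^.\/]+)$/;
    return unless defined $ext;
    return $MIME_TYPES{ lc $ext };
}

for my $name ('/home/www/abc/index.html', '/home/www/abc/notes.xyz') {
    my $type = guess_mime_type($name);
    printf "%s => %s\n", $name, defined $type ? $type : '(undefined)';
}
```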
`
There are two special cases. If the last component of the filename happens to be a
physical directory, then Apache internally assigns it a "magic" MIME type, defined
by the DIR_MAGIC_TYPE constant as httpd/unix-directory. This is used by the
directory module to generate automatic directory listings. The second special case
occurs when you have the optional mod_mime_magic module installed and the
file actually exists. In this case Apache will peek at the first few bytes of the file's
contents to determine what type of file it might be. Chapter 7 shows you how to
write your own MIME type checker handlers to implement more sophisticated
MIME type determination schemes.

`
After Apache has determined the name and type of the file referenced by the URI,
it decides what to do about it. One way is to use information hard-wired into the
module's static data structures. The module's handler_rec table, which we
describe in detail in Chapter 10, C API Reference Guide, Part I, declares the
module's willingness to handle one or more magic MIME types and associates a
content handler with each one. For example, the mod_cgi module associates MIME
type application/x-httpd-cgi with its cgi_handler() handler subroutine. When
Apache detects that a filename is of type application/x-httpd-cgi, it invokes
cgi_handler() and passes it information about the file. A module can also declare its
desire to handle an ordinary MIME type, such as video/quicktime, or even a wildcard
type, such as video/*. In this case, all requests for URIs with matching MIME
types will be passed through the module's content handler unless some other
module registers a more specific type.
`
Newer modules use a more flexible method in which content handlers are associated
with files at runtime using explicit names. When this method is used, the
module declares one or more content handler names in its handler_rec array
instead of, or in addition to, MIME types. Some examples of content handler
names you might have seen include cgi-script, server-info, server-parsed, imap-file,
and perl-script. Handler names can be associated with files using either AddHandler
or SetHandler directives. AddHandler associates a handler with a particular
file extension. For example, a typical configuration file will contain this line to
associate .shtml files with the server-side include handler:

    AddHandler server-parsed .shtml

Now, the server-parsed handler defined by mod_include will be called on to process
all files ending in ".shtml" regardless of their MIME type.
`
SetHandler is used within <Directory>, <Location>, and <Files> sections to associate
a particular handler with an entire section of the site's URI space. In the two
examples that follow, the <Location> section attaches the server-parsed method to
all files within the virtual directory /shtml, while the <Files> section attaches
imap-file to all files that begin with the prefix "map-":

    <Location /shtml>
      SetHandler server-parsed
    </Location>

    <Files map-*>
      SetHandler imap-file
    </Files>
`
Surprisingly, the AddHandler and SetHandler directives are not actually implemented
in the Apache core. They are implemented by the standard mod_actions
module, which is compiled into the server by default. In Chapter 7, we show how
to reimplement mod_actions using the Perl API.
`
You'll probably want to use explicitly named content handlers in your modules
rather than hardcoded MIME types. Explicit handler names make configuration
files cleaner and easier to understand. Plus, you don't have to invent a new magic
MIME type every time you add a handler.
`
Things are slightly different for mod_perl users because two directives are needed
to assign a content handler to a directory or file. The reason for this is that the
only real content handler defined by mod_perl is its internal perl-script handler.
You use SetHandler to assign perl-script the responsibility for a directory or partial
URI, and then use a PerlHandler directive to tell the perl-script handler which Perl
module to execute. Directories supervised by Perl API content handlers will look
something like this:

    <Location /graph>
      SetHandler perl-script
      PerlHandler Apache::Graph
    </Location>
`
Don't try to assign perl-script to a file extension using something like AddHandler
perl-script .pl; this is generally useless because you'd need to set PerlHandler
too. If you'd like to associate a Perl content handler with an extension, you
should use the <Files> directive. Here's an example:

    <Files ~ "\.graph$">
      SetHandler perl-script
      PerlHandler Apache::Graph
    </Files>
`
There is no UnSetHandler directive to undo the effects of SetHandler. However,
should you ever need to restore a subdirectory's handler to the default, you can
do it with the directive SetHandler default-handler, as follows:

    <Location /graph/tutorial>
      SetHandler default-handler
    </Location>
`
Adding a Canned Footer to Pages

To show you how content handlers work, we'll develop a module with the Perl
API that adds a canned footer to all pages in a particular directory. You could use
this, for example, to automatically add copyright information and a link back to
the home page. Later on, we'll turn this module into a full-featured navigation bar.

Example 4-1 gives the code for Apache::Footer, and Figure 4-1 shows a screenshot
of it in action. Since this is our first substantial module, we'll step through the code
section by section.
`

Figure 4-1. The footer on this page was generated automatically by Apache::Footer
`
    package Apache::Footer;

    use strict;
    use Apache::Constants qw(:common);
    use Apache::File ();
`
The code begins by declaring its package name and loading various Perl modules
that it depends on. The use strict pragma activates Perl checks that prevent us
from using global variables before declaring them, disallow the use of function
calls without the parentheses, and prevent other unsafe practices. The
Apache::Constants module defines constants for the various Apache and HTTP
result codes; we bring in only those constants that belong to the frequently used
:common set. Apache::File defines methods that are useful for manipulating files.
`
    sub handler {
        my $r = shift;
        return DECLINED unless $r->content_type() eq 'text/html';
`
The handler() subroutine does all the work of generating the content. It is roughly
divided into three parts. In the first part, it fetches information about the requested
file and decides whether it wants to handle it. In the second part, it creates the
canned footer dynamically from information that it gleans about the file. In the
third part, it rewrites the file to include the footer.
`
In the first part of the process, the handler retrieves the Apache request object and
stores it in $r. Next it calls the request's content_type() method to retrieve its
MIME type. Unless the document is of type text/html, the handler stops here and
returns a DECLINED result code to the server. This tells Apache to pass the
document on to any other handlers that have declared their willingness to handle
this type of document. In most cases, this means that the document or image will
be passed through to the browser in the usual way.
`
`
`
        my $file = $r->filename;

        unless (-e $r->finfo) {
            $r->log_error("File does not exist: $file");
            return NOT_FOUND;
        }
        unless (-r _) {
            $r->log_error("File permissions deny access: $file");
            return FORBIDDEN;
        }
`
At this point we go ahead and recover the file path by calling the request object's
filename() method. Just because Apache has assigned the document a MIME type
doesn't mean that it actually exists or, if it exists, that its permissions allow it to be
read by the current process. The next two blocks of code check for these cases.
Using the Perl -e file test, we check whether the file exists. If not, we log an error
to the server log using the request object's log_error() method and return a result
code of NOT_FOUND. This will cause the server to return a page displaying the
404 "Not Found" error (exactly what's displayed is under the control of the
ErrorDocument directive).
`
There are several ways to perform file status checks in the Perl API. The simplest
way is to recover the file's pathname using the request object's filename() method,
and pass the result to the Perl -e file test:

    unless (-e $r->filename) {
        $r->log_error("File does not exist: $file");
        return NOT_FOUND;
    }
`
A more efficient way, however, is to take advantage of the fact that during its path
walking operation Apache already performed a system stat() call to collect filesystem
information on the file. The resulting status structure is stored in the request
object and can be retrieved with the object's finfo() method. So the more efficient
idiom is to use the test -e $r->finfo.

Once finfo() is called, the stat() information is stored into the magic Perl
filehandle _ and can be used for subsequent file testing and stat() operations, saving
even more CPU time. Using the _ filehandle, we next test that the file is readable
by the current process and return FORBIDDEN if this isn't the case. This displays a
403 "Forbidden" error.
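
The _ filehandle is an ordinary Perl feature and can be tried outside of Apache. The following sketch uses a temporary file in place of $r->finfo (an assumption made so the example runs standalone); the first file test performs the stat() call and the later operations reuse its result.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example has something to test.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp;

my $exists   = -e $path;            # performs the stat() system call
my $readable = -r _;                # reuses the cached stat() result
my $mtime    = (stat _)[9];         # still no new system call
my $modtime  = localtime($mtime);   # scalar context: human-readable string

print "exists and readable, last modified $modtime\n"
    if $exists && $readable;
```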
`
`my $modtime = localtime((stat _)[9]);
`

After performing these tests, we get the file modification time by calling stat(). We
can use the _ filehandle here too, avoiding the overhead of repeating the stat()
system call. The modification time is passed to the built-in Perl localtime() function
to convert it into a human-readable string.
`
        my $fh;
        unless ($fh = Apache::File->new($file)) {
            $r->log_error("Couldn't open $file for reading: $!");
            return SERVER_ERROR;
        }
`
At this point, we attempt to open the file for reading using Apache::File's new()
method. For the most part, Apache::File acts just like Perl's IO::File object-oriented
I/O package, returning a filehandle on success or undef on failure. Since we've
already handled the two failure modes that we know how to deal with, we return
a result code of SERVER_ERROR if the open is unsuccessful. This immediately
aborts all processing of the document and causes Apache to display a 500 "Internal
Server Error" message.
`
        my $footer = <<END;
    <hr>
    &copy; 1998 <a href="http://www.ora.com/">O'Reilly & Associates</a><br>
    <em>Last Modified: $modtime</em>
    END
`
`Having successfully opened the file, we build the footer. The footer in this
`example script is entirely static, except for the document modification date that is
`computed on the fly.
`
        $r->send_http_header;

        while (<$fh>) {
            s!(</BODY>)!$footer$1!oi;
        } continue {
            $r->print($_);
        }
`
The last phase is to rewrite the document. First we tell Apache to send the HTTP
header. There's no need to set the content type first because it already has the
appropriate value. We then loop through the document looking for the closing
</BODY> tag. When we find it, we use a substitution statement to insert the footer
in front of it. The possibly modified line is now sent to the browser using the
request object's print() method.
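
The substitution step can be exercised on its own, without a server. In this sketch the helper add_footer() and the sample page are made up for the example; it works on a list of lines held in memory instead of a filehandle and returns the rewritten lines instead of printing them to a request object.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the lines with $footer inserted
# just before the closing </BODY> tag (matched case-insensitively).
sub add_footer {
    my ($footer, @lines) = @_;
    for my $line (@lines) {
        $line =~ s!(</BODY>)!$footer$1!i;
    }
    return @lines;
}

my @page = ("<html><body>\n", "Hello\n", "</body></html>\n");
print add_footer("<hr>Canned footer\n", @page);
```

Because the closing tag is captured and put back as $1, the original capitalization of the tag is preserved, and lines that don't contain it pass through untouched.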
`
`return OK;
`
`
`At the end, we return an OK result code to Apache and end the handler subrou-
`tine definition. Like any other .pm file, the module itself must end by returning a
`true value (usually 1) to signal Perl that it compiled correctly.
`
`
If all this checking for the existence and readability of the file before processing
seems a bit pedantic, don't worry. It's actually unnecessary for you to do this.
Instead of explicitly checking the file, we could have simply returned DECLINED if
the attempt to open the file failed. Apache would then pass the URI to the default
file handler, which will perform its own checks and display the appropriate error
messages. Therefore we could have replaced the file tests with the single line:

    my $fh = Apache::File->new($file) || return DECLINED;
`
Doing the tests inside the module this way makes the checks explicit and gives us
a chance to intervene to rescue the situation. For example, we might choose to
search for a text file of the same name and present it instead. The explicit tests
also improve module performance slightly, since the system wastes a small amount
of CPU time when it attempts to open a nonexistent file. If most of the files the
module serves do exist, however, this penalty won't be significant.

`
Example 4-1. Adding a Canned Footer to HTML Pages

    package Apache::Footer;
    # file: Apache/Footer.pm

    use strict;
    use Apache::Constants qw(:common);
    use Apache::File ();

    sub handler {
        my $r = shift;
        return DECLINED unless $r->content_type() eq 'text/html';

        my $file = $r->filename;

        unless (-e $r->finfo) {
            $r->log_error("File does not exist: $file");
            return NOT_FOUND;
        }
        unless (-r _) {
            $r->log_error("File permissions deny access: $file");
            return FORBIDDEN;
        }

        my $modtime = localtime((stat _)[9]);

        my $fh;
        unless ($fh = Apache::File->new($file)) {
            $r->log_error("Couldn't open $file for reading: $!");
            return SERVER_ERROR;
        }
        my $footer = <<END;
    <hr>
    &copy; 1998 <a href="http://www.ora.com/">O'Reilly & Associates</a><br>
    <em>Last Modified: $modtime</em>
    END

        $r->send_http_header;

        while (<$fh>) {
            s!(</BODY>)!$footer$1!oi;
        } continue {
            $r->print($_);
        }

        return OK;
    }

    1;
    __END__
`
There are several ways to install and use the Apache::Footer content handler. If all
the files that needed footers were gathered in one place in the directory tree, you
would probably want to attach Apache::Footer to that location:

    <Location /footer>
      SetHandler perl-script
      PerlHandler Apache::Footer
    </Location>
`
If the files were scattered about the document tree, it might be more convenient to
map Apache::Footer to a unique filename extension, such as .footer. To achieve
this, the following directives would suffice:

    AddType text/html .footer
    <Files ~ "\.footer$">
      SetHandler perl-script
      PerlHandler Apache::Footer
    </Files>

Note that it's important to associate MIME type text/html with the new extension;
otherwise, Apache won't be able to determine its content type during the MIME
type checking phase.
`
If your server is set up to allow per-directory access control files to include file
information directives, you can place any of these handler directives inside a
.htaccess file. This allows you to change handlers without restarting the server. For
example, you could replace the <Location> section shown earlier with a .htaccess
file in the directory where you want the footer module to be active:

    SetHandler perl-script
    PerlHandler Apache::Footer

`
`A Server-Side Include System
`
The obvious limitation of the Apache::Footer example is that the footer text is
hardcoded into the code. Changing the footer becomes a nontrivial task, and using
different footers for various parts of the site becomes impractical. A much more
flexible solution is provided by Vivek Khera's Apache::Sandwich module. This
module "sandwiches" HTML pages between canned headers and footers that are
determined by runtime configuration directives. The Apache::Sandwich module
also avoids the overhead of parsing the request document; it simply uses the
subrequest mechanism to send the header, body, and footer files in sequence.
`
We can provide more power than Apache::Sandwich by using server-side
includes. Server-side includes are small snippets of code embedded within HTML
comments. For example, in the standard server-side includes that are implemented
in Apache, you can insert the current time and date into the page with a
comment that looks like this:

    Today is <!--#echo var="DATE_LOCAL"-->.
`
In this section, we use mod_perl to develop our own system of server-side
includes, using a simple but extensible scheme that lets you add new types of
includes at a moment's whim. The basic idea is that HTML authors will create files
that contain comments of this form:

    <!--#DIRECTIVE PARAM1 PARAM2 PARAM3 PARAM4...-->
`
A directive name consists of any sequence of alphanumeric characters or underscores.
This is followed by a series of optional parameters, separated by spaces or
commas. Parameters that contain whitespace must be enclosed in single or double
quotes in shell command style. Backslash escapes also work in the expected
manner.
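
Shell-style splitting of this kind is available from the standard Text::ParseWords module. The sketch below shows one way the parameter string could be broken up; the wrapper name parse_params() and the delimiter pattern are assumptions for illustration and are not necessarily how the module developed in this chapter calls it.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Split a directive's parameter string on whitespace or commas, honoring
# quotes and backslash escapes. The second argument (0) tells quotewords()
# to strip the quotes and backslashes from the returned words.
sub parse_params {
    my $params = shift;
    return quotewords('[\s,]+', 0, $params);
}

my @words = parse_params(q{"%A, in anno domini" User-Agent,extra});
print join('|', @words), "\n";
```

The quoted parameter survives as a single word even though it contains both a comma and spaces, while the unquoted remainder is split at the space and the comma.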
`
The directives themselves are not hardcoded into the module but are instead
dynamically loaded from one or more configuration files created by the site
administrator. This allows the administrator to create a standard menu of includes
that are available to the site's HTML authors. Each directive is a short Perl
subroutine. A simple directive looks like this one:

    sub HELLO { "Hello World!"; }

This defines a subroutine named HELLO() that returns the string "Hello World!" A
document can now include the string in its text with a comment formatted like this
one:

    I said <!--#HELLO-->
`
A more complex subroutine will need access to the Apache object and the server-side
include parameters. To accommodate this, the Apache object is passed as the
first function argument, and the server-side include parameters, if any, follow.
Here's a function definition that returns any field from the incoming request's
HTTP header, using the Apache object's header_in() method:

    sub HTTP_HEADER {
        my ($r, $field) = @_;
        $r->header_in($field);
    }

With this subroutine definition in place, HTML authors can insert the User-Agent
field into their document using a comment like this one:

    You are using the browser <!-- #HTTP_HEADER User-Agent -->.
`
Example 4-2 shows an HTML file that uses a few of these includes, and Figure 4-2
shows what the page looks like after processing.

Example 4-2. An HTML File That Uses Extended Server-Side Includes

    <html> <head> <title>Server-Side Includes</title></head>
    <body bgcolor=white>
    <h1>Server-Side Includes</h1>
    This is some straight text.<p>

    This is a "<!-- #HELLO -->" include.<p>

    The file size is <strong><!-- #FSIZE --></strong>, and it was
    last modified on <!-- #MODTIME %x --><p>

    Today is <!-- #DATE "%A, in <em>anno domini</em> %Y" -->.<p>

    The user agent is <em><!--#HTTP_HEADER User-Agent--></em>.<p>

    Oops: <!--#OOPS 0--><p>

    Here is an included file:
    <pre>
    <!--#INCLUDE /include.txt 1-->
    </pre>

    <!--#FOOTER-->
    </body> </html>
`
Implementing this type of server-side include system might seem to be something
of a challenge, but in fact the code is surprisingly compact (Example 4-3). This
module is named Apache::ESSI, for "extensible server-side includes."

Again, we'll step through the code one section at a time.

    package Apache::ESSI;

    use strict;
    use Apache::Constants qw(:common);
    use Apache::File ();
    use Text::ParseWords qw(quotewords);
    my (%MODIFIED, %SUBSTITUTION);
`

Figure 4-2. A page generated by Apache::ESSI
`
We start as before by declaring the package name and loading various Perl library
modules. In addition to the modules that we loaded in the Apache::Footer example,
we import the quotewords() function from the standard Perl Text::ParseWords
module. This routine provides command shell-like parsing of strings that contain
quote marks and backslash escapes. We also define two lexical variables,
%MODIFIED and %SUBSTITUTION, which are global to the package.

    sub handler {
        my $r = shift;
        $r->content_type() eq 'text/html' || return DECLINED;
        my $fh = Apache::File->new($r->filename) || return DECLINED;
        my $sub = read_definitions($r) || return SERVER_ERROR;
        $r->send_http_header;
        $r->print($sub->($r, $fh));
        return OK;
    }
`

The handler() subroutine is quite short. As in the Apache::Footer example,
handler() starts by examining the content type of the document being requested
and declines to handle requests for non-HTML documents. The handler recovers
the file's physical path by calling the request object's filename() method and
attempts to open it. If the file open fails, the handler again returns an error code of
DECLINED. This avoids Apache::Footer's tedious checking of the file's existence
and access permissions, at the cost of some efficiency every time a nonexistent file
is requested.

Once the file is opened, we call an internal function named read_definitions().
This function reads the server-side includes configuration file and generates an
anonymous subroutine to do the actual processing of the document. If an error
occurs while processing the configuration file, read_definitions() returns undef
and we return SERVER_ERROR in order to abort the transaction. Otherwise, we
send the HTTP header and invoke the anonymous subroutine to perform the
substitutions on the contents of the file. The result of invoking the subroutine is sent
to the client using the request object's print() method, and we return a result code
of OK to indicate that everything went smoothly.
`
    sub read_definitions {
        my $r = shift;
        my $def = $r->dir_config('ESSIDefs');
        return unless $def;
        return unless -e ($def = $r->server_root_relative($def));
`
Most of the interesting work occurs in read_definitions(). The idea here is to read
the server-side include definitions, compile them, and then use them to generate
an anonymous subroutine that does the actual substitutions. In order to avoid
recompiling this subroutine unnecessarily, we cache its code reference in the
package variable %SUBSTITUTION and reuse it if we can.

The read_definitions() subroutine begins by retrieving the path to the file that
contains the server-side include definitions. This information is contained in a
per-directory configuration variable named ESSIDefs, which is set in the configuration
file using the PerlSetVar directive and retrieved within the handler with the
request object's dir_config() method (see the end of the example for a representative
configuration file entry). If, for some reason, this variable isn't present, we
return undef. As with other Apache configuration files, we allow this file to be
specified as either an absolute path or a partial path relative to the server root. We pass
the path to the request object's server_root_relative() method. This convenient
function prepends the server root to relative paths and leaves absolute paths
alone. We next check that the file exists using the -e file test operator and return
undef if not.
`
        return $SUBSTITUTION{$def} if $MODIFIED{$def} && $MODIFIED{$def} <= -M _;

Having recovered the name of the definitions file, we next check the cache to see
whether the subroutine definitions are already cached and, if so, whether the file
hasn't changed since the code was compiled and cached. We use two hashes for
this purpose. The %SUBSTITUTION hash holds the compiled code, and %MODIFIED
contains the modification date of the definition file the last time it was compiled.
Both hashes are indexed by the definition file's path, allowing the module to handle
the case in which several server-side include definition files are used for different
parts of the document tree. The -M operator reports a file's age in days, measured
back from the time the Perl interpreter started, so an unchanged file always reports
the same value and an edited file reports a smaller one. If the value listed in
%MODIFIED is less than or equal to the definition file's current -M value, the file
hasn't changed since it was compiled, and we return the cached subroutine.
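
The caching idiom can be tried in isolation. This sketch replaces the compile step with a trivial closure and uses a temporary file as the definitions file; the names cached_loader(), %CACHE, and $compiles are made up for the example. It also tests the cache with exists rather than truth, a small deviation from the chapter's code that avoids a spurious miss when -M happens to return exactly 0.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my (%MODIFIED, %CACHE);
my $compiles = 0;    # counts how often the "compile" step actually runs

# Return a cached code reference for $path, rebuilding it only when the
# file's -M age shows that the file changed since the last build.
sub cached_loader {
    my $path = shift;
    return unless -e $path;
    return $CACHE{$path}
        if exists $MODIFIED{$path} && $MODIFIED{$path} <= -M _;
    $compiles++;
    $CACHE{$path}    = sub { "loaded from $path" };  # stand-in for compiling
    $MODIFIED{$path} = -M _;                          # remember the file's age
    return $CACHE{$path};
}

my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

my $first  = cached_loader($path);
my $second = cached_loader($path);
print "compiled $compiles time(s)\n";
```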
`
        my $package = join "::", __PACKAGE__, $def;
        $package =~ tr/a-zA-Z0-9_/_/c;
`
The next two lines are concerned with finding a unique namespace in which to
compile the server-side include functions. Putting the functions in their own
namespace decreases the chance that function side effects will have unwanted
effects elsewhere in the module. We take the easy way out here by using the path
to the definition file to synthesize a package name, which we store in a variable
named $package.
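
To see what the tr operation produces, the two lines can be run standalone. The function name package_for() and the file path below are hypothetical; the base name is passed in explicitly because __PACKAGE__ would be main outside the module.

```perl
use strict;
use warnings;

# Turn a base package name plus a file path into a legal Perl package name
# by replacing every character outside [a-zA-Z0-9_] with an underscore.
sub package_for {
    my ($base, $path) = @_;
    my $package = join '::', $base, $path;
    $package =~ tr/a-zA-Z0-9_/_/c;
    return $package;
}

# The path below is a made-up example.
print package_for('Apache::ESSI', '/usr/local/apache/conf/essi.defs'), "\n";
```

Note that the colons are replaced as well, so the result is a single flat package name (here Apache__ESSI___usr_local_apache_conf_essi_defs) rather than a nested one.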
`
        eval "package $package; do '$def'";
        if ($@) {
            $r->log_error("Eval of $def did not return true: $@");
            return;
        }
`
We then invoke eval() to compile the subroutine definitions into the newly chosen
namespace. We use the package declaration to set the namespace and do to
load and run the definitions file. We use do here rather than the more common
require because do unconditionally recompiles code files even if they have been
loaded previously. If the eval was unsuccessful, we log an error and return undef.

        $SUBSTITUTION{$def} = sub {
            do_substitutions($package, @_);
        };
        $MODIFIED{$def} = -M $def;    # store modification date
        return $SUBSTITUTION{$def};
    }
`
Before we exit read_definitions(), we create a new anonymous subroutine that
invokes the do_substitutions() function, store this subroutine in %SUBSTITUTION,
and update %MODIFIED with the modification date of the definitions file. We then
return the code reference to our caller. We interpose a new anonymous subroutine
here so that we can add the contents of the $package variable to the list of
variables passed to the do_substitutions() function.
`
    sub do_substitutions {
        my $package = shift;
        my($r, $fh) = @_;
        # Make sure that eval() errors aren't trapped.
        local $SIG{__WARN__};
        local $SIG{__DIE__};
        local $/; #slurp $fh
        my $data = <$fh>;
        $data =~ s/<!--\s*\#(\w+) # start of a function name
                   \s*(.*?)       # optional parameters
                   \s*-->         # end of comment
                  /call_sub($package, $1, $r, $2)/xseg;
        $data;
    }
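
The substitution pattern can be tried without Apache. In the sketch below, the directive table %DIRECTIVES and the helpers expand_directive() and substitute() are hypothetical stand-ins invented for the example; they are not the module's own dispatch code. The regular expression and its x, s, e, and g flags are the same as in the listing above.

```perl
use strict;
use warnings;

# Hypothetical directive table; the real module loads these from a file.
my %DIRECTIVES = (
    HELLO => sub { "Hello World!" },
    UPPER => sub { uc join ' ', @_ },
);

# Hypothetical stand-in for the dispatch step: run the named directive,
# or leave a visible marker if it is unknown.
sub expand_directive {
    my ($name, $params) = @_;
    my $code = $DIRECTIVES{$name} or return "[unknown directive $name]";
    return $code->(split ' ', $params);
}

# Replace every <!--#NAME params--> comment with the directive's result.
sub substitute {
    my $data = shift;
    $data =~ s/<!--\s*\#(\w+)   # start of a function name
               \s*(.*?)         # optional parameters
               \s*-->           # end of comment
              /expand_directive($1, $2)/xseg;
    return $data;
}

print substitute(qq{I said "<!--#HELLO-->" and <!-- #UPPER a b -->.\n});
```

The e flag makes Perl evaluate the replacement as code for every match, the s flag lets a comment span several lines, and the non-greedy (.*?) keeps one comment from swallowing the next.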
`
When handler() invokes the anonymous subroutine, it calls do_substitutions() to