by Jeffry Dwight
The Common Gateway Interface (CGI) specification lets Web servers execute other programs and incorporate their output into the text, graphics, and audio sent to a Web browser. The server and the CGI program work together to enhance and customize the World Wide Web's capabilities.
By providing a standard interface, the CGI specification lets developers use a wide variety of programming tools. CGI programs work the magic behind processing forms, looking up records in a database, sending e-mail, building on-the-fly page counters, and dozens of other activities. Without CGI, a Web server can offer only static documents and links to other pages or servers. With CGI, the Web comes alive: it becomes interactive, informative, and useful. CGI can also be a lot of fun!
In this chapter, you'll learn about the fundamentals of CGI: how it originated, how it's used today, and how it will be used in the future.
Browsers and Web servers communicate by using the Hypertext Transfer
Protocol (HTTP). Tim Berners-Lee at CERN developed the World Wide
Web using HTTP and one other incredibly useful concept: the Uniform
Resource Locator (URL). The URL is an addressing scheme that lets
browsers know where to go, how to get there, and what to do after
they reach the destination. Technically, an URL is a form of Universal
Resource Identifier (URI) used to access an object with existing
Internet protocols. Because this book deals only with existing
protocols, it calls all URIs URLs and doesn't worry about the
technical hair-splitting. URIs are defined by RFC 1630.
| ON THE WEB |
http://ds.internic.net/rfc/rfc1630.txt This site contains a copy of RFC 1630, which you can read if you're interested in more details about URIs. |
In a simplified overview, six things normally happen when you fire up your Web browser and visit a site on the World Wide Web:
The important point here is that after the server has responded, it breaks the connection. If the document you get back has links to other documents (inline graphics, for instance), your browser goes through the whole routine again. Each time you contact the server, it's as if you'd never been there before, and each request yields a single document. This is what's known as a stateless connection.
Fortunately, most browsers keep a local copy, called a cache, of recently accessed documents. When the browser notices that it's about to re-fetch something already in the cache, it just supplies the information from the cache rather than contact the server again. This alleviates a great deal of network traffic.
Using a cache is fine for retrieving static text or displaying
graphics, but what if you want dynamic information? What if you
want a page counter or a quote-of-the-day? What if you want to
fill out a guest book form rather than just retrieve a file? The
next section can help you out.
| The State of HTTP |
Because the server doesn't remember you between visits, the HTTP 1.0 protocol is called stateless. This means that the server doesn't know the state of your browser, whether this is the first request you've ever made or whether this is the hundredth request for information making up the same visual page. Each GET or POST (the two main methods of invoking a CGI program) in HTTP 1.0 must carry all the information necessary to service the request. This makes distributing resources easy but places the burden of maintaining state information on the CGI application. A "shopping cart" script is a good example of needing state information. When you pick an item and place it in your virtual cart, you need to remember that it's there so that when you get to the virtual check-out counter, you know what to pay for. The server can't remember this for you, and you certainly don't want the user to have to retype the information each time he or she sees a new page. Your program must track all the variables itself and figure out, each time it's called, whether it's been called before, whether this is part of an ongoing transaction, and what to do next. Most programs do this by shoveling hidden fields into their output, so when your browser calls again, the hidden information from the last call is available. In this way, it figures out the state you're supposed to have and pretends you've been there all along. From the user's point of view, it all happens behind the scenes. The Web has used HTTP 1.0 since 1990, but since then, many proposals for revisions and extensions have been discussed. |
| ON THE WEB |
http://www.w3.org/hypertext/WWW/Protocols/ If you're interested in the technical specifications of HTTP, both current and future, stop by this site. http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-state-mgmt-03.txt Of particular interest to CGI programmers is the proposal for maintaining state information at the server. HTTP 1.1, when approved and in widespread use, will provide a great number of improvements for state information. In the meantime, however, the protocol is stateless, and that's what your programs will have to remember. |
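The hidden-field technique described in the sidebar on HTTP state can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name hidden_fields and the dictionary of state values are hypothetical, and a real shopping cart would also have to validate whatever comes back from the browser.

```python
from html import escape

def hidden_fields(state):
    """Render each name/value pair as a hidden form field.

    The browser sends these fields back with the next request, which is
    how a script can "remember" earlier steps over a stateless protocol.
    """
    return "\n".join(
        '<INPUT TYPE="hidden" NAME="%s" VALUE="%s">'
        % (escape(name, quote=True), escape(value, quote=True))
        for name, value in state.items()
    )
```

A script would embed the returned text inside the form it writes out, so the next call arrives carrying the state from this one.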
Your Web browser doesn't know much about the documents it asks for. It just submits the URL and finds out what it's getting when the answer comes back. The server supplies certain codes, using the Multipurpose Internet Mail Extensions (MIME) specifications, to tell the browser what's what. This is how your browser knows to display a graphic but save a .Zip file to disk. Most Web documents are created with HTML, just plain text with embedded instructions for formatting and displaying.
The server is only smart enough to send documents and to tell the browser what kind of documents they are. But the server also knows one other key thing: how to launch other programs. When a server sees that an URL points to a file, it sends back the contents of that file. When the URL points to a program, however, the server starts the program. The server then sends back the program's output as if it were a file.
What does this accomplish? Well, for one thing, a CGI program can read and write data files (a Web server can only read them) and produce different results each time you run it. This is how page counters work. Each time the page counter is called, it finds the previous count from information stored on the server (usually in a file), increments it by one, and creates a .Gif or .Jpg file on the fly as its output. The server sends the graphic data back to the browser just as if it were a real file living somewhere on the server.
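The counting half of a page counter might look like the following Python sketch. It assumes the count lives in a small text file whose path you supply; the name bump_counter is hypothetical, and the sketch stops short of rendering the count as a graphic.

```python
def bump_counter(path):
    """Read the previous count from `path`, add one, store it, and return it."""
    try:
        with open(path, "r", encoding="ascii") as fh:
            count = int(fh.read().strip() or "0")
    except FileNotFoundError:
        count = 0  # first visit: no count file exists yet
    count += 1
    with open(path, "w", encoding="ascii") as fh:
        fh.write(str(count))
    return count
```

Note that nothing here stops two copies of the script from updating the file at the same moment; a production counter would need some form of file locking.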
NCSA Software Development maintains the CGI specification. You'll
find the specification online at the World Wide Web Consortium:
http://www.w3.org/hypertext/WWW/CGI/. This document goes
into great detail, including history, rationales, and implications.
If you don't already have a copy, download one and keep it handy.
You won't need it to understand the examples in this book, but
it will give you a wonderful overview of CGI and help you think
through your own projects in the future.
| NOTE |
The current version of the CGI specification is 1.1. The information you'll find at www.w3.org is composed of continually evolving specifications, proposals, examples, and discussions. You should keep this URL handy (make a bookmark) and check in from time to time to see what's new. |
A CGI program isn't anything special by itself. That is, it doesn't
do magic tricks or require a genius to create it. In fact, most
CGI programs are fairly simple things written in C or Perl, two
popular programming languages.
| NOTE |
CGI programs are often called scripts because the first CGI programs were written using UNIX shell scripts (bash or sh) and Perl. Perl is an interpreted language, somewhat like a DOS batch file but much more powerful. When you execute a Perl program, the Perl instructions are interpreted and compiled into machine instructions right then. In this sense, a Perl program is a script for the interpreter to follow, much as Shakespeare's Hamlet is a script for actors to follow. Other languages, like C, are compiled ahead of time, and the resulting executable isn't normally called a script. Compiled programs usually run faster but are more complicated to program and harder to modify. In the CGI world, however, interpreted and compiled programs are both called scripts. That's the term this chapter will use from now on. |
Before the server launches the script, it prepares a number of environment variables representing the current state of the server, who is asking for the information, and so on. The environment variables given to a script are exactly like normal environment variables, except that you can't set them from the command line. They're created on the fly and last only until that particular script is finished. Each script gets its own unique set of variables. In fact, a busy server often has many scripts executing at once, each with its own environment.
You'll learn about the specific environment variables in the later "Designing CGI Applications" section. For now, it's enough to know that they're present and contain important information that the script can retrieve.
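As a rough illustration, here is how a Python script might pick up a few of those variables. The four names shown are standard CGI 1.1 variables; the function name read_cgi_variables and the choice to substitute an empty string for a missing variable are assumptions made for this sketch.

```python
import os

def read_cgi_variables(environ=None):
    """Return a few CGI variables, using "" for any the server did not set."""
    environ = os.environ if environ is None else environ
    names = ("REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE", "CONTENT_LENGTH")
    return {name: environ.get(name, "") for name in names}
```

Passing a plain dictionary instead of the real environment makes the function easy to try from the command line, where the server's variables don't exist.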
Depending on how the server invokes the script, the server may
also pass information another way. Although each server handles
things a little differently, and although Windows servers often
have other methods available, the CGI specification calls for
the server to use its STDOUT (standard output), which the script
sees as its STDIN (standard input), to pass information to the script.
| Standard Input and Output |
STDIN and STDOUT are mnemonics for standard input and standard output, two predefined stream/file handles. Each process inherits these two handles already open. Command-line programs that write to the screen usually do so by writing to STDOUT. If you redirect the input to a program, you're really redirecting STDIN. If you redirect the output of a program, you're really redirecting STDOUT. This mechanism is what allows pipes to work. If you do a directory listing and pipe the output to a sort program, you're redirecting the STDOUT of the directory program (DIR or LS) to the STDIN of the sort program. For Web servers, STDOUT is the feed leading to the script's STDIN. The script's STDOUT feeds back to the server's STDIN, making a complete route. From the script's point of view, STDIN is what comes from the server, and STDOUT is where it writes its output. Beyond that, the script doesn't need to worry about what's being redirected where. The server uses its STDOUT when invoking a CGI program with the POST method. For the GET method, the server doesn't use STDOUT. In both cases, however, the server expects the CGI script to return its information via the script's STDOUT. This standard works well in the text-based UNIX environment where all processes have access to STDIN and STDOUT. In the Windows environments, however, STDIN and STDOUT are available only to non-graphical (console-mode) programs. To complicate matters further, Windows NT creates a different sort of STDIN and STDOUT for 32-bit programs than it does for 16-bit programs. Because most Web servers are 32-bit services under Windows NT, this means that CGI scripts have to be 32-bit console-mode programs. That leaves popular languages such as Visual Basic 1.0 - 3.0 and Delphi 1.0 out in the cold. One popular Windows NT server, the freeware HTTPS from EMWAC, can talk only to CGI programs this way. Fortunately, there are several ways around this problem. 
Some Windows NT servers, notably Bob Denny's WebSite, use a proprietary technique using .Ini files to communicate with CGI programs. This technique, which may well become an open standard soon, is called CGI-WIN. A server supporting CGI-WIN writes its output to an .Ini file instead of STDOUT. Any program can then open the file, read it, and process the data. Unfortunately, using any proprietary solution like this one means your scripts will work only on that particular server. For servers that don't support CGI-WIN, you can use a wrapper program. Wrappers do what their name implies: They wrap around the CGI program like a coat, protecting it from the unforgiving Web environment. Typically, these programs read STDIN for you and write the output to a pipe or file. Then they launch your program, which reads from the file. Your program writes its output to another file and terminates. The wrapper picks up your output from the file and sends it back to the server via STDOUT, deletes the temporary files, and terminates itself. From the server's point of view, the wrapper was the CGI program. |
| ON THE WEB |
http://www.greyware.com/greyware/software/cgishell.htp This site has a wrapper program called CGIShell. CGIShell lets you use almost any 16- or 32-bit programming environment to write CGI scripts. |
The script picks up the environment variables and reads STDIN as appropriate. It then does whatever it was designed to do and writes its output to STDOUT.
The MIME codes the server sends to the browser let the browser know what kind of file is about to come across the network. Because this information always precedes the file itself, it's usually called a header. The server can't send a header for information generated on the fly by a script because the script could send audio, graphics, plain text, HTML, or any one of hundreds of other types. Therefore, the script is responsible for sending the header. So in addition to its own output, whatever that may be, the script must supply the header information. Failure to do so always means failure of the script because the browser won't understand the output.
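A minimal sketch of that responsibility in Python: the header line comes first, then a blank line, then the script's own output. The function name build_response is hypothetical, and the sketch assumes a single Content-Type header is all the script needs to send.

```python
def build_response(body, content_type="text/html"):
    """Prefix the body with a Content-Type header and the blank line that ends it."""
    header = "Content-Type: " + content_type + "\n"
    return header + "\n" + body
```

A script would finish with something like sys.stdout.write(build_response("<H1>Hello</H1>")), so that the header reaches the server before any of the document does.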
The following are the broad steps of the CGI process, simplified for clarity:
It's a bit more complicated than a normal HTML retrieval, but hardly daunting, and that's all there is to how CGI works. Well, no; there's more, but that's the essential mechanism. The scripts become extensions to the server's repertoire of static files and open up the possibilities for real-time interactivity.
Just like any other file on a server, CGI scripts have to live somewhere. Depending on your server, CGI scripts may have to live all in one special directory. Other servers let you put scripts anywhere you want.
Typically, whether required by the server or not, Webmasters put all the scripts in one place. This directory is usually part of the Web server's tree, often just one level beneath the Web server's root. By far the most common directory name is CGI-BIN, a tradition started by the earliest servers that supported CGI. UNIX hacks will like the BIN part, but because the files are rarely named *.Bin and often aren't in binary format anyway, the rest of the world rolls its eyes and shrugs. Today, servers usually let you specify the name of the directory and often support multiple CGI directories for multiple virtual servers (that is, one physical server that pretends to be many different ones, each with its own directory tree).
Suppose that your UNIX Web server is installed so that the fully qualified path name is /usr/bin/https/Webroot. The CGI-BIN directory would then be /usr/bin/https/Webroot/cgi-bin. That's where you, as Webmaster, put the files. From the Web server's point of view, /usr/bin/https/Webroot is the directory tree's root. So if there was a file in that directory named Index.html, you'd refer to that file with an /index.html URL. A script called Myscript.pl in the CGI-BIN directory would be referred to as /cgi-bin/myscript.pl.
On a Windows or Windows NT server, much the same thing happens.
The server might be installed in C:\Winnt35\System32\Https, with
a server root of D:\Webroot. You'd refer to the file Default.htm
in the server root as /Default.htm; never mind that its real location
is D:\Webroot\Default.htm. If your CGI directory is D:\Webroot\Scripts,
you'd refer to a script called Myscript.exe as /Scripts/Myscript.exe.
| NOTE |
Although URL references always use forward slashes, even on Windows and Windows NT machines, file paths are separated by backslashes here. On a UNIX machine, both types of references use forward slashes. |
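The mapping from a URL to a file beneath the server root can be sketched as follows. This is an assumption-laden illustration, not how any particular server does it: the name url_to_file_path is hypothetical, and the only safety check is a refusal of ".." segments.

```python
import ntpath
import posixpath

def url_to_file_path(server_root, url_path, windows=False):
    """Join the segments of a URL path onto the server root directory."""
    parts = [p for p in url_path.split("/") if p]
    if ".." in parts:
        # Never let a request climb out of the server's directory tree.
        raise ValueError("URL may not contain '..' segments")
    pathmod = ntpath if windows else posixpath  # backslashes vs. forward slashes
    return pathmod.join(server_root, *parts)
```

With the directories used above, the URL /cgi-bin/myscript.pl lands in /usr/bin/https/Webroot/cgi-bin, and /Scripts/Myscript.exe lands in D:\Webroot\Scripts when windows=True.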
For the sake of simplicity, assume that your server is configured to look for all CGI scripts in one spot and that you've named that spot CGI-BIN off the server root. If your server isn't configured that way, you might want to consider changing it. For one thing, in both UNIX and Windows NT, you can control the security better if all executables are in one place (by giving the server process execute privileges only in that directory). Also, with most servers, you can specify that scripts may run only if they're found in the CGI-BIN directory. This lets you keep rogue users from executing anything they want from directories under their control.
CGI scripts, by their very nature, place an extra burden on the Web server. They're separate programs, which means the server process must spawn a new task for every CGI script that's executed. The server can't just launch your program and then sit around waiting for the response; chances are good that others are asking for URLs in the meantime. So the new task must operate asynchronously, and the server has to monitor the task to see when it's done.
The overhead of spawning a task and waiting for it to complete is usually minimal, but the task itself will use system resources (memory and disk) and also will consume processor time slices. Even so, any server that can't run two programs at a time isn't much of a server. But remember the other URLs being satisfied while your program is running? What if there are a dozen or a hundred of them, and what if most of them are also CGI scripts? A popular site can easily garner dozens of hits almost simultaneously. If the server tries to satisfy all of them and each one takes up memory, disk, and processor time, you can quickly bog your server down so far that it becomes worthless.
There's also the matter of file contention. Not only are the various processes (CGI scripts, the server itself, plus whatever else you may be running) vying for processor time and memory, they may be trying to access the same files. For example, a guestbook script may be displaying the guestbook to three browsers while updating it with the input from a fourth. (There's nothing to keep the multiple scripts running from being the same script multiple times.) The mechanisms for ensuring a file is available (locking it while writing and releasing it when done) all take time: operating system time and simple computation time. Making a script foolproof this way also makes the script bigger and more complex, meaning longer load times and longer execution times.
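Here is one way the lock-write-release cycle might look for a guestbook on a UNIX system, sketched in Python. It assumes advisory locking through the standard fcntl module, which isn't available on Windows; the name append_entry is hypothetical.

```python
import fcntl

def append_entry(path, entry):
    """Append one line to the guestbook while holding an exclusive lock."""
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # wait until no other copy holds the lock
        try:
            fh.write(entry.rstrip("\n") + "\n")
            fh.flush()  # push the data out before giving up the lock
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)  # release even if the write failed
```

The lock is advisory: it only protects the file from other programs that also ask for the lock, so every script that touches the guestbook has to follow the same routine.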
Does this mean you should shy away from running CGI scripts? Not at all. It just means you have to know your server's capacity, plan your site a bit, and monitor performance on an ongoing basis. No one can tell you to buy a certain amount of RAM or to allocate a specific amount of disk space. Those requirements will vary based on what server software you run, what CGI scripts you use, and what kind of traffic your server sees. However, following are some rules of thumb for several operating systems that you can use as a starting point when planning your site.
The best present you can buy your Windows NT machine is more memory. While Windows NT Server runs with 12M of RAM, it doesn't run well until it has 16M and doesn't shine until it has 32M-64M. Adding RAM beyond that probably won't make much difference unless you're running a few very hungry applications, such as SQL Server. If you give your server 32M of RAM, a generous swap file, and a fast disk, it should be able to handle a dozen simultaneous CGI scripts without sweating or producing a noticeable delay in response. In most circumstances, it also helps to change Windows NT Server's memory management optimization from the default Maximize Throughput for File Sharing to Balance. This tells Windows NT to keep fewer files in cache, so more RAM is immediately available for processes.
Of course, the choice of programming language will affect each variable greatly. A tight little C program hardly makes an impact, whereas a Visual Basic program, run from a wrapper and talking to a SQL Server back end, will gobble up as much memory as it can. Visual Basic and similar development environments are optimized for ease of programming and best runtime speed, not small code and quick loading. If your program loads seven DLLs, an OLE control, and an ODBC driver, you may notice a significant delay. Scripts written in a simpler programming environment, though, such as C or Perl, run just as fast on Windows NT as they do on a UNIX system and often much faster because of Windows NT's multithreaded and preemptive scheduling architecture.
UNIX machines are usually content with significantly less RAM than Windows NT computers, for a number of reasons. First, most of the programs, including the operating system itself and all its drivers, are smaller. Second, it's unusual, if not downright impossible, to use an X Windows program as a CGI script. This means that the resources required are far fewer. Maintenance and requisite system knowledge, however, are far greater. There are trade-offs in everything, and what UNIX gives you in small size and speed, it more than makes up with complexity. In particular, setting Web server permissions and getting CGI to work properly can be a nightmare for the UNIX novice. Even experienced system administrators often trip over the unnecessarily arcane configuration details. After the system is set up, though, adding new CGI scripts goes smoothly and seldom requires adding memory.
If you give your UNIX computer 16M of RAM and a reasonably fast hard disk, it will run quickly and efficiently for any reasonable number of hits. Database queries will slow it down, just as they would if the program weren't CGI. Due to UNIX's multiuser architecture, the number of logged-on sessions (and what they're doing) can significantly affect performance. It's a good idea to let your Web server's primary job be servicing the Web rather than users. Of course, if you have capacity left over, there's no reason not to run other daemons, but it's best to choose processes that consume resources predictably so that you can plan your site.
Of course, a large, popular site (say, one that receives several hits each minute) will require more RAM, just as on any platform. The more RAM you give your UNIX system, the better it can cache, and therefore, the faster it can satisfy requests.
A CGI application is much more like a system utility than a full-blown application. In general, scripts are task-oriented rather than process-oriented. That is, a CGI application has a single job to do: It initializes, does its job, and then terminates. This makes it easy to chart data flow and program logic. Even in a GUI environment, the application doesn't have to worry much about being event-driven: The inputs and outputs are defined, and the program will probably have a top-down structure with simple subroutines.
Programming is a discipline, an art, and a science. The mechanics of the chosen language, coupled with the parameters of the operating system and the CGI environment, make up the science. The conception, the execution, and the elegance (if any) can be either art or science. But the discipline isn't subject to artistic fancy and is platform-independent. This section deals mostly with programming discipline, concentrating on how to apply that discipline to your CGI scripts.
When your script is invoked by the server, the server passes information to the script via environment variables and, in the case of POST, via STDIN. GET and POST are the two most common request methods you'll encounter, and probably the only ones you'll need to deal with. (HEAD and PUT are also defined but seldom used for CGI.) The request method tells your script how it was invoked; based on that information, the script can decide how to act. The request method used is passed to your script via the environment variable called, appropriately enough, REQUEST_METHOD.
| URL Encoding |
The HTTP 1.0 specification calls for URL data to be encoded in such a way that it can be used on almost any hardware and software platform. Information specified this way is called URL-encoded; almost everything passed to your script by the server will be URL-encoded. Parameters passed as part of QUERY_STRING or PATH_INFO will take the form variable1=value1&variable2=value2 and so forth, for each variable defined in your form. Variables are separated by the ampersand. If you want to send a real ampersand, it must be escaped, that is, encoded as a two-digit hexadecimal value representing the character. Escapes are indicated in URL-encoded strings by the percent sign. Thus, %25 represents the percent sign itself. (25 is the hexadecimal representation of the ASCII value for the percent sign.) All characters above 127 (7F hexadecimal) or below 33 (21 hexadecimal) are escaped by the server when it sends information to your CGI program. This includes the space character, which is escaped as %20. Also, the plus sign needs to be interpreted as a space character. Before your script can deal with the data, it must parse and decode it. Fortunately, these are fairly simple tasks in most programming languages. Your script scans through the string looking for an ampersand. When found, your script chops off the string up to that point and calls it a variable. The variable's name is everything up to the equal sign in the string; the variable's value is everything after the equal sign. Your script then continues parsing the original string for the next ampersand, and so on, until the original string is exhausted. After the variables are separated, you can safely decode them: turn each plus sign into a space and each percent-sign escape into the character it represents. When the server passes data to your script with the POST method, check the environment variable called CONTENT_TYPE. If CONTENT_TYPE is application/x-www-form-urlencoded, then your data needs to be decoded before use. |
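The parse-then-decode routine from the sidebar can be written out by hand in Python, as in the sketch below. The function names decode_component and parse_query are hypothetical, escapes are mapped straight to the character with that code (an assumption that suits plain ASCII data), and a repeated variable name simply keeps its last value.

```python
import string

HEX_DIGITS = set(string.hexdigits)

def decode_component(text):
    """Turn '+' into a space, then replace each %XX escape with its character."""
    text = text.replace("+", " ")
    out = []
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        if text[i] == "%" and len(pair) == 2 and set(pair) <= HEX_DIGITS:
            out.append(chr(int(pair, 16)))  # e.g. %25 becomes "%"
            i += 3
        else:
            out.append(text[i])  # ordinary character, or a stray percent sign
            i += 1
    return "".join(out)

def parse_query(query):
    """Split on '&' first, then on the first '=', and only then decode."""
    variables = {}
    for pair in query.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        variables[decode_component(name)] = decode_component(value)
    return variables
```

Splitting before decoding matters: an escaped ampersand (%26) inside a value must not be mistaken for a separator. In everyday Python you'd more likely reach for urllib.parse.parse_qsl, which does the same job.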
The basic structure of a CGI application is simple and straightforward: initialization, processing, output, and termination. Because this section deals with concepts, flow, and programming discipline, I'll use pseudocode rather than a specific language for the examples.
Ideally, a script follows these steps in order (with appropriate subroutines for do-initialize, do-process, and do-output):
Real life is rarely this simple, but I'll give the nod to proper form while acknowledging that you'll seldom see it.
Initialization. The first thing your script must do when it starts is determine its input, environment, and state. Basic operating-system environment information can be obtained the usual way: from the system registry in Windows NT, from standard environment variables in UNIX, from .Ini files in Windows, and so forth.
State information will come from the input rather than the operating environment or static variables. Remember: Each time CGI scripts are invoked, it's as if they've never been invoked before. The scripts don't stay running between calls. Everything must be initialized from scratch, as follows:
| NOTE |
Although GET and POST are the only currently defined operations that apply to CGI, you may encounter PUT or HEAD from time to time if your server supports them and the user's browser uses them. PUT was offered as an alternative to POST but never received approved RFC status and isn't in general use. HEAD is used by some browsers to retrieve just the headers of an HTML document and isn't applicable to CGI programming. Other oddball request methods may be out there too. Your code should check explicitly for GET and POST and refuse anything else. Don't assume that if the request method isn't GET then it must be POST or vice versa. |
The following is the initialization phase in pseudocode:
    retrieve any operating system environment values desired
    allocate temporary storage for variables
    if environment variable REQUEST_METHOD equals "GET" then
        retrieve contents of environment variable QUERY_STRING;
        if QUERY_STRING is not null, parse it and decode it;
    else if REQUEST_METHOD equals "POST" then
        retrieve contents of environment variable QUERY_STRING;
        if QUERY_STRING is not null, parse it and decode it;
        retrieve value of environment variable CONTENT_LENGTH;
        if CONTENT_LENGTH is greater than zero, read CONTENT_LENGTH bytes from STDIN;
        parse STDIN data into separate variables;
        retrieve contents of environment variable CONTENT_TYPE;
        if CONTENT_TYPE equals application/x-www-form-urlencoded then decode parsed variables;
    else if REQUEST_METHOD is neither "GET" nor "POST" then
        report an error;
        deallocate temporary storage;
        terminate
    end if
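For readers who'd like to see the initialization phase in a real language, here is one possible Python rendering. Treat it as a sketch under assumptions: the name initialize is hypothetical, the standard library's parse_qsl does the parsing and decoding in one step, and a POST body that isn't URL-encoded is simply ignored rather than parsed.

```python
import os
import sys
from urllib.parse import parse_qsl

def initialize(environ=None, stdin=None):
    """Collect the request's variables from QUERY_STRING and, for POST, STDIN."""
    environ = os.environ if environ is None else environ
    stdin = sys.stdin.buffer if stdin is None else stdin

    method = environ.get("REQUEST_METHOD", "")
    if method not in ("GET", "POST"):
        # Check explicitly for GET and POST and refuse anything else.
        raise ValueError("unsupported request method: %r" % method)

    # Both methods may carry a query string on the URL.
    query = environ.get("QUERY_STRING", "")
    variables = dict(parse_qsl(query, keep_blank_values=True))

    if method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        if length > 0:
            body = stdin.read(length)  # read exactly CONTENT_LENGTH bytes
            content_type = environ.get("CONTENT_TYPE", "")
            if content_type == "application/x-www-form-urlencoded":
                text = body.decode("ascii", errors="replace")
                variables.update(parse_qsl(text, keep_blank_values=True))
    return variables
```

Because the environment and the input stream are parameters, you can exercise the function with a plain dictionary and an in-memory byte stream before putting it behind a server.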
Processing. After initializing its environment by reading and parsing its input, the script is ready to get to work. What happens in this section is much less rigidly defined than during initialization. During initialization, the parameters are known (or can be discovered), and the tasks are more or less the same for every script you'll write. The processing phase, however, is the heart of your script, and what you do here will depend almost entirely on the script's objectives.
| Row, Row, Row Your Script |
In the UNIX world, a character stream is a special kind of file. STDIN and STDOUT are character streams by default. The operating system helpfully parses streams for you, making sure that everything going through is proper seven-bit ASCII or an approved control code. Seven-bit? Yes. For HTML, this doesn't matter. However, if your script sends graphical data, using a character-oriented stream means instant death. The solution is to switch the stream over to binary mode. In C, you do this with the setmode function: setmode(fileno(stdout), O_BINARY). You can change horses in mid-stream with the complementary setmode(fileno(stdout), O_TEXT). A typical graphics script will output the headers in character mode and then switch to binary mode for the graphical data. In the Windows NT world, streams behave the same way for compatibility reasons. A nice simple \n in your output gets converted to \r\n for you when you write to STDOUT. This doesn't happen with regular Windows NT system calls, such as WriteFile(); you must specify \r\n explicitly if you want CRLF. Those who speak mainly UNIX will frown at the term CRLF, while those who program on other platforms might not recognize \n or \r\n. CRLF meet \r\n. \r is how C programmers specify a carriage return (CR) character. \n is how C programmers specify a line feed (LF) character. (That's Chr$(10) for LF and Chr$(13) for CR to you Basic programmers.) Alternate words for character mode and binary mode are cooked and raw, respectively; those in the know will use these terms instead of the more common ones. Whatever words you use and on whatever platform, there's another problem with streams: by default, they're buffered. Buffered means that the operating system hangs onto the data until a line-terminating character is seen, the buffer fills up, or the stream is closed. 
This means that if you mix buffered printf() statements with unbuffered low-level calls such as write() or WriteFile(), things will probably come out jumbled even though they may all write to STDOUT. printf() writes buffered to the stream; the low-level routines output directly. The result is an out-of-order mess. You may lay the blame for this at the feet of backward compatibility. Beyond the existence of many old programs, streams have no reason to default to buffered and cooked. These should be options that you turn on when you want them, not turn off when you don't. Fortunately, you can get around this problem with the statement setvbuf(stdout, NULL, _IONBF, 0), which turns off all buffering for the STDOUT stream. Another solution is to avoid mixing types of output statements; even so, that won't make your cooked output raw, so it's a good idea to turn off buffering anyway. Many servers and browsers are cranky and dislike receiving input in drabs and twaddles. |
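The sidebar's advice is phrased in terms of C. A rough Python counterpart, offered as an assumption rather than a translation, is to bypass the text layer altogether: write both the header and the data as bytes to the binary stream underneath sys.stdout, and flush when done. The name send_binary is hypothetical.

```python
import sys

def send_binary(data, content_type, out=None):
    """Write a header and raw bytes through a binary stream, then flush."""
    out = sys.stdout.buffer if out is None else out  # the raw side of STDOUT
    header = "Content-Type: %s\r\n\r\n" % content_type
    out.write(header.encode("ascii"))  # CRLF spelled out; nothing is translated
    out.write(data)
    out.flush()  # don't leave the tail of the output sitting in a buffer
```

Because every write goes through the same binary stream, nothing is converted along the way and nothing can arrive out of order.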
The following is a pseudocode representation of a simple processing phase whose objective is to recapitulate all the environment variables gathered in the initialization phase:
output header "content-type: text/html\n"
output required blank line to terminate header "\n"
output "<HTML>"
output "<H1>Variable Report</H1>"
output "<UL>"
for each variable known
    output "<LI>"
    output variable-name
    output "="
    output variable-value
loop until all variables printed
output "</UL>"
output "</HTML>"
This has the effect of creating a simple HTML document containing a bulleted list. Each item in the list is a variable, expressed as name=value.
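For readers who prefer real code, here is one way the pseudocode might look in C. Treat it as a sketch: write_variable_report() and write_escaped() are hypothetical names, and the escaping of <, >, and & is my own addition so that odd variable values can't break the page.

```c
#include <assert.h>
#include <stdio.h>

/* Write text with the HTML-special characters escaped. */
static void write_escaped(FILE *out, const char *text)
{
    for (; *text != '\0'; ++text) {
        switch (*text) {
        case '<': fputs("&lt;", out);  break;
        case '>': fputs("&gt;", out);  break;
        case '&': fputs("&amp;", out); break;
        default:  fputc(*text, out);   break;
        }
    }
}

/* Hypothetical helper: emit the CGI header and a bulleted list of
 * every "name=value" string in env (a NULL-terminated array). */
void write_variable_report(FILE *out, char *const env[])
{
    size_t i;

    assert(out != NULL && env != NULL);

    fputs("Content-type: text/html\n", out);  /* header */
    fputs("\n", out);                         /* blank line ends the header */
    fputs("<HTML>\n<H1>Variable Report</H1>\n<UL>\n", out);
    for (i = 0; env[i] != NULL; ++i) {
        fputs("<LI>", out);
        write_escaped(out, env[i]);           /* already in name=value form */
        fputs("\n", out);
    }
    fputs("</UL>\n</HTML>\n", out);
}
```

On a POSIX system, you could declare extern char **environ; and call write_variable_report(stdout, environ) to report the variables the server actually handed your script.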
Termination
Termination is nothing more than cleaning up after yourself and quitting. If you've locked any files, you must release them before letting the program end. If you've allocated memory, semaphores, or other objects, you must free them. Failure to do so may result in a "one-shot wonder" of a script: one that works only the first time. Worse yet, your script may hinder, or even break, the server itself or other scripts by failing to free up resources and release locks.
On some platforms, most notably Windows NT and to a lesser extent UNIX, your file handles and memory objects are closed and reclaimed when your process terminates. Even so, it's unwise to rely on the operating system to clean up your mess. For instance, under Windows NT, the behavior of the file system is undefined when a program locks all or part of a file and then terminates without releasing the locks.
Make sure that your error-exit routine, if you have one (and you should), knows about your script's resources and cleans up just as thoroughly as the main exit routine does.
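One way to give the normal exit and the error exit the same cleanup is to put it in a single routine and register that routine with atexit(). The sketch below assumes two illustrative resources, a temporary file and a work buffer; all the names are hypothetical. File locks are left out because releasing them is platform-specific.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical script-wide resources; the names are illustrative only. */
static FILE *g_data_file = NULL;
static char *g_work_buffer = NULL;

/* Release everything the script holds. Safe to call more than once. */
void cgi_cleanup(void)
{
    if (g_data_file != NULL) {
        fclose(g_data_file);
        g_data_file = NULL;
    }
    free(g_work_buffer);          /* free(NULL) does nothing */
    g_work_buffer = NULL;
}

/* Nonzero while the script still holds a resource. */
int cgi_holds_resources(void)
{
    return g_data_file != NULL || g_work_buffer != NULL;
}

/* Acquire the resources and register the cleanup so that exit(),
 * whether reached from the normal path or an error path, releases
 * them. Returns 0 on success, -1 on failure. */
int cgi_acquire(size_t buffer_size)
{
    static int registered = 0;

    if (!registered) {
        if (atexit(cgi_cleanup) != 0)
            return -1;
        registered = 1;
    }
    cgi_cleanup();                /* drop anything left from an earlier call */
    g_work_buffer = malloc(buffer_size);
    g_data_file = tmpfile();      /* stands in for a real data file */
    if (g_work_buffer == NULL || g_data_file == NULL) {
        cgi_cleanup();
        return -1;
    }
    return 0;
}
```

Keep in mind that atexit() handlers run only when the program ends through exit() or a return from main(); they don't run after abort(), so an error routine should finish with exit().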
Now that you've seen a script's basic structure, you're ready to learn how to plan a script from the ground up:
Step 1, of course, is this section's topic, so let's look at that process in more depth:
Here's a brief overview of the standard environment variables you're likely to encounter. Each server implements the majority of them consistently, but there are variations, exceptions, and additions. In general, you're more likely to find a new, otherwise undocumented variable rather than a documented variable omitted. The only way to be sure, though, is to check your server's documentation.
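Because a given server may leave a variable unset, it's prudent to read each one defensively. The following is a minimal sketch in C; cgi_env() is a hypothetical helper name, and getenv() is the standard library call that does the real work.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return the named environment variable, or
 * fallback when the server did not set it (or set it to nothing). */
const char *cgi_env(const char *name, const char *fallback)
{
    const char *value;

    assert(name != NULL);
    value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}
```

A script might call cgi_env("REQUEST_METHOD", "GET") or cgi_env("QUERY_STRING", "") and carry on without having to test for NULL at every turn.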
This section is taken from the NCSA specifications and is the closest thing to "standard" as you'll find. The following environment variables are set each time the server launches an instance of your script and are private and specific to that instance:
CGI programmers face two portability issues: platform independence and server independence. By platform independence, I mean the capability of the code to run without modification on a hardware platform or operating system different from the one for which it was written. Server independence is the capability of the code to run without modification on another server using the same operating system.
Platform Independence
The best way to keep your CGI script portable is to use a commonly available language and avoid platform-specific code. It sounds simple, right? In practice, this means using either C or Perl and not doing anything much beyond formatting text and outputting graphics.
Does this leave Visual Basic, AppleScript, and UNIX shell scripts out in the cold? Yes, I'm afraid so, for now. However, platform independence isn't the only criterion to consider when selecting a CGI platform. There's also speed of coding, ease of maintenance, and ability to perform the chosen task.
Certain types of operations simply aren't portable. If you develop for 16-bit Windows, for instance, you'll have great difficulty finding equivalents on other platforms for the VBX and DLL functions you use. If you develop for 32-bit Windows NT, you'll find that all your asynchronous Winsock calls are meaningless in a UNIX environment. If your shell script does a system() call to launch grep and pipe the output back to your program, you'll find nothing remotely similar in the Windows NT environment. And AppleScript is good only on Macintoshes.
If one of your mandates is the capability to move code among platforms with a minimum of modification, you'll probably have the best success with C. Write your code using the standard functions from the ANSI-C libraries and avoid making other operating system calls. Unfortunately, following this rule will limit your scripts to very basic functionality. If you wrap your platform-dependent code in self-contained routines, however, you minimize the work needed to port from one platform to the next. As you saw in the section "Planning Your Script," when talking about encapsulation, a properly designed program can have any module replaced in its entirety without affecting the rest of the program. Using these guidelines, you may have to replace a subroutine or two, and you'll certainly have to recompile; however, your program will be portable.
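To make the idea of a self-contained, platform-dependent routine concrete, here's a small sketch. The function platform_temp_dir() is hypothetical, and the choice of the TEMP and TMPDIR variables (with "." and "/tmp" as fallbacks) is an assumption about common conventions rather than a rule. The point is that the #ifdef lives in one place and callers never see it.

```c
#include <stdlib.h>

/* Hypothetical wrapper: the one routine that knows where temporary
 * files belong on each platform. Callers never see the #ifdef. */
const char *platform_temp_dir(void)
{
    const char *dir;

#ifdef _WIN32
    dir = getenv("TEMP");               /* conventional on Windows */
    if (dir == NULL || dir[0] == '\0')
        dir = ".";
#else
    dir = getenv("TMPDIR");             /* conventional on UNIX */
    if (dir == NULL || dir[0] == '\0')
        dir = "/tmp";
#endif
    return dir;
}
```

Porting then means rewriting this one routine and recompiling; the rest of the program keeps calling platform_temp_dir() as before.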
Perl scripts are certainly easier to maintain than C programs, mainly because there's no compile step. You can change the program quickly when you figure out what needs to be changed. And there's the rub: Perl is annoyingly obtuse, and the libraries tend to be much less uniform (even between versions on the same platform) than C libraries are. Also, Perl for Windows NT is fairly new and still quirky, although the most recent versions are much more stable.
Server Independence
Far more important than platform independence (unless you're writing scripts only for your own pleasure) is server independence. Server independence is fairly easy to achieve, but for some reason seems to be a stumbling block to beginning script writers. To be server independent, your script must run without modification on any server using the same operating system. Only server-independent programs can be useful as shareware or freeware, and without a doubt, server independence is a requirement for commercial software.
Most programmers think of obvious issues, such as not assuming that the server has a static IP address. The following are some other rules of server independence that, although obvious once stated, nevertheless get overlooked time and time again:
When you talk about CGI libraries, there are two possibilities: libraries of code you develop and want to reuse in other projects, and publicly available libraries of programs, routines, and information.
Personal Libraries
If you follow the advice given earlier in this chapter in the "Planning Your Script" section about writing your code in a black box fashion, you'll soon discover that you're building a library of routines that you'll use over and over. For instance, after you puzzle out how to parse out URL-encoded data, you don't need to do it again. And when you have a basic main() function written, it will probably serve for every CGI program you ever write. This is also true for generic routines, such as querying a database, parsing input, and reporting runtime errors.
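A URL-decoding routine is a good example of something worth writing once and keeping. The sketch below is one possible version, not the only way to do it; url_decode() and hex_value() are hypothetical names. It turns each + into a space and each %XX sequence into the byte it stands for, and it copies a malformed % through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hexadecimal digit; c must satisfy isxdigit(). */
static int hex_value(int c)
{
    if (isdigit(c))
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Decode URL-encoded text from src into dst, whose capacity is
 * dst_size bytes including the terminating '\0'.
 * Returns the decoded length, or -1 if dst is too small. */
long url_decode(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;
    while (*src != '\0') {
        if (out + 1 >= dst_size)        /* keep room for the '\0' */
            return -1;
        if (*src == '+') {
            dst[out++] = ' ';
            src += 1;
        } else if (*src == '%' && isxdigit((unsigned char)src[1])
                               && isxdigit((unsigned char)src[2])) {
            dst[out++] = (char)(hex_value((unsigned char)src[1]) * 16
                              + hex_value((unsigned char)src[2]));
            src += 3;
        } else {
            dst[out++] = *src;
            src += 1;
        }
    }
    dst[out] = '\0';
    return (long)out;
}
```

In a real script, you'd normally split the input on & and = first and decode each name and value separately, so that an encoded & or = inside a value isn't mistaken for a separator.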
How you manage your personal library depends on the programming language you use. With C and assembler, you can precompile code into actual .Lib files, with which you can then link your programs. Although possible, this likely is overkill for CGI and doesn't work for interpreted languages, such as Perl and Visual Basic. (Although Perl and VB can call compiled libraries, you can't link with them in a static fashion the way you can with C.) The advantage of using compiled libraries is that you don't have to recompile all of your programs when you make a change to code in the library. If the library is loaded at runtime (a DLL), you don't need to change anything. If the library is linked statically, all you need do is relink.
Another solution is to maintain separate source files and simply include them with each project. You might have a single, fairly large, file that contains the most common routines while putting seldom used routines in files of their own. Keeping the files in source format adds a little overhead at compile time but not enough to worry about, especially when compared to the time savings you gain by writing the code only once. The disadvantage of this approach is that when you change your library code, you must recompile all your programs to take advantage of the change.
Nothing can keep you from incorporating public-domain routines into your personal library either. As long as the copyright and license allow you to use and modify the source code without royalties or other stipulations, feel free to strip out the interesting bits and toss them into your library. Well-designed and well-documented programs provide the basis for new programs. If you're careful to isolate the program-specific parts into subroutines, there's no reason not to cannibalize an entire program's structure for your next project.
You can also develop platform-specific versions of certain subroutines and, if your compiler will allow it, automatically include the correct ones for each type of build. At the worst, you'll have to manually specify which subroutines you want.
The key to making your code reusable this way is to make it as generic as possible. Not so generic that, for instance, a currency printing routine needs to handle both yen and dollars, but generic enough that any program that needs to print out dollar amounts can call that subroutine. As you upgrade, swat bugs, and add capabilities, keep each function's inputs and outputs the same, even when you change what happens inside the subroutine. This is the black box approach in action. By keeping the calling convention and the parameters the same, you're free to upgrade any piece of code without fear of breaking older programs that call your function.
Another technique to consider is using function stubs. Say that you decide eventually that a single routine to print both yen and dollars is actually the most efficient way to go. But you already have separate subroutines, and your old programs wouldn't know to pass the additional parameter to the new routine. Rather than go back and modify each program that calls the old routines, just "stub out" the routines in your library so that the only thing they do is call the new, combined routine with the correct parameters. In some languages, you can do this by redefining the routine declarations; in others, you actually need to code a call and pay the price of some additional overhead. But even so, the price is far less than that of breaking all your old programs.
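Here's how the stub technique might look in C, using the yen-and-dollars example. Everything in this sketch is hypothetical, including the function names and the output formats; it assumes a compiler that supplies snprintf().

```c
#include <stdio.h>

/* Hypothetical currency codes for the new, combined routine. */
enum currency { CURRENCY_DOLLARS, CURRENCY_YEN };

/* New combined routine: format amount into buf for either currency.
 * Returns what snprintf() returns. */
int format_currency(char *buf, size_t size, double amount,
                    enum currency kind)
{
    if (kind == CURRENCY_YEN)
        return snprintf(buf, size, "JPY %.0f", amount); /* no fractions */
    return snprintf(buf, size, "$%.2f", amount);
}

/* Stubs: the old entry points keep their original parameters and
 * simply forward to the new routine, so old callers need no changes. */
int format_dollars(char *buf, size_t size, double amount)
{
    return format_currency(buf, size, amount, CURRENCY_DOLLARS);
}

int format_yen(char *buf, size_t size, double amount)
{
    return format_currency(buf, size, amount, CURRENCY_YEN);
}
```

In C, you could also replace each stub with a #define that expands to the format_currency() call, which avoids the extra function call at the cost of a recompile.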
Public Libraries
The Internet is rich with public-domain sample code, libraries, and precompiled programs. Although most of what you'll find is UNIX-oriented (because it has been around longer), there's nevertheless no shortage of routines for Windows NT.
Here's a list of some of the best sites on the Internet with a brief description of what you'll find at each site. This list is far from exhaustive. Hundreds of sites are dedicated to or contain information about CGI programming. Hop onto your Web browser and visit your favorite search engine. Tell it to search for "CGI" or "CGI libraries" and you'll see what I mean. To save you the tedium of wading through all the hits, I've explored many of them for you. The following are the ones that struck me as most useful:
I could go on listing sites forever it seems, but that's enough to get you started.
By far, the biggest limitation of CGI is its statelessness. As you learned at the beginning of this chapter, an HTTP Web server doesn't remember callers between requests. In fact, what appears to the user as a single page may actually be made up of dozens of independent requests: either all to the same server or to many different servers. In each case, the server fulfills the request and then hangs up and forgets the user ever dropped by.
The capability to remember what a caller was doing the last time through is called remembering the user's state. HTTP, and therefore CGI, doesn't maintain state information automatically. The closest things to state information in a Web transaction are the user's browser cache and a CGI program's cleverness. For example, if a user leaves a required field empty when filling out a form, the CGI program can't pop up a warning box and refuse to accept the input. The program's only choices are to either output a warning message and ask the user to hit the browser's back button or output the entire form again, filling in the value of the fields that were supplied and letting the user try again, either correcting mistakes or supplying the missing information.
There are several workarounds for this problem, none of them terribly satisfactory. One idea is to maintain a file containing the most recent information from all users. When a new request comes through, hunt up the user in the file and assume the correct program state based on what the user did the last time. The problems with this idea are that it's very hard to identify a Web user, and a user may not complete the action, yet visit again tomorrow for some other purpose. An incredible amount of effort has gone into algorithms to maintain state only for a limited time period, a period that's long enough to be useful, but short enough not to cause errors. However, these solutions are terribly inefficient and ignore the other problem, identifying the user in the first place.
You can't rely on the user to provide his identity. Not only do some want to remain anonymous, but even those who want you to know their names can misspell them from time to time. Okay, then, what about using the IP address as the identifier? Not good. Everyone going through a proxy uses the same IP address. Which particular employee of Large Company, Ltd., is calling at the moment? You can't tell. Not only that, but many people these days get their IP addresses assigned dynamically each time they dial in. You certainly don't want to give Joe Blow privileges to Jane Doe's data just because Joe got Jane's old IP address this time.
The only reliable form of identity mapping is that provided by the server, using a name-and-password scheme. Even so, users simply won't put up with entering a name and password for each request, so the server caches the data and uses one of those algorithms mentioned earlier to determine when the cache has gone invalid.
Assuming that the CEO of your company hasn't used his first name or something equally guessable as his password and that no one has rifled his secretary's drawer or looked at the yellow sticky note on his monitor, you can be reasonably sure that when the server tells you it's the CEO, then it is the CEO. So then what? Your CGI program still has to go through hoops to keep your CEO from answering the same questions repeatedly as he queries your database. Each response from your CGI program must contain all the information necessary to go backward or forward from that point. It's ugly and tiresome, but necessary.
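One common way to make each response carry its own state is to tuck that state into hidden form fields, so the browser sends it back with the next request. The text above doesn't prescribe a mechanism, so take this as just one sketch of the idea; write_hidden_field() and write_attr_escaped() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Write text with the characters that are special inside an HTML
 * attribute value escaped. */
static void write_attr_escaped(FILE *out, const char *text)
{
    for (; *text != '\0'; ++text) {
        switch (*text) {
        case '&': fputs("&amp;", out);  break;
        case '<': fputs("&lt;", out);   break;
        case '>': fputs("&gt;", out);   break;
        case '"': fputs("&quot;", out); break;
        default:  fputc(*text, out);    break;
        }
    }
}

/* Hypothetical helper: emit one hidden form field so that the next
 * request carries this piece of state back to the script. */
void write_hidden_field(FILE *out, const char *name, const char *value)
{
    assert(out != NULL && name != NULL && value != NULL);
    fputs("<INPUT TYPE=\"hidden\" NAME=\"", out);
    write_attr_escaped(out, name);
    fputs("\" VALUE=\"", out);
    write_attr_escaped(out, value);
    fputs("\">\n", out);
}
```

Remember that anyone can view and alter a hidden field by looking at the page source, so your script should treat the values that come back with the same suspicion as any other user input.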
The second main limitation inherent in CGI programs is related to the way the HTTP specification is designed around delivery of documents. HTTP was never intended for long exchanges or interactivity. This means that when your CGI program wants to do something like generate a server-pushed graphic, it must keep the connection open. It does this by pretending that multiple images are really part of the same image.
The poor user's browser keeps displaying its "connection active" signal, thinking it's still in the middle of retrieving a single document. From the browser's point of view, the document just happens to be extraordinarily long. From your script's point of view, the document is actually made up of dozens, perhaps hundreds, of separate images, each one funneled through the pipe in sequence and marked as the next part of a gigantic file that doesn't really exist anywhere.
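The mechanism Netscape browsers understand for this trick is a response of type multipart/x-mixed-replace, in which each image is one part of a single, never-quite-finished document. The following sketch shows roughly how a script might emit such a response. The function names and the boundary string are my own assumptions, and a real script would also want its output stream unbuffered and in binary mode.

```c
#include <assert.h>
#include <stdio.h>

/* The boundary is an arbitrary choice for this sketch; it must not
 * occur inside any part's data. */
#define PUSH_BOUNDARY "CgiSketchBoundary"

/* Start the multipart response that holds the connection open. */
void push_begin(FILE *out)
{
    assert(out != NULL);
    fputs("Content-type: multipart/x-mixed-replace;boundary="
          PUSH_BOUNDARY "\n\n", out);
    fputs("--" PUSH_BOUNDARY "\n", out);
}

/* Send one part (for example, one image); the browser replaces the
 * previous part with it. Pass nonzero for is_last on the final part.
 * Returns 0 on success, -1 on a write error. */
int push_part(FILE *out, const char *mime_type,
              const void *data, size_t length, int is_last)
{
    assert(out != NULL && mime_type != NULL && data != NULL);
    fprintf(out, "Content-type: %s\n\n", mime_type);
    if (fwrite(data, 1, length, out) != length)
        return -1;
    /* The trailing boundary tells the browser this part is complete. */
    fputs(is_last ? "\n--" PUSH_BOUNDARY "--\n"
                  : "\n--" PUSH_BOUNDARY "\n", out);
    return fflush(out) == 0 ? 0 : -1;   /* send the part right away */
}
```

A script would call push_begin(stdout) once and then push_part() for each image in turn, pausing between calls for as long as the animation requires.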
Perhaps when the next iteration of the HTTP specification is released, and when browsers and servers are updated to take advantage of a keep-alive protocol, we'll see some real innovation. In the meantime, CGI is what it is, warts and all. Although CGI is occasionally inelegant, it's nevertheless still very useful and a lot of fun.
The tips, techniques, examples, and advice this book gives you will get you going immediately with your own scripts. You should be aware, however, that the CGI world is in a constant state of change, more so perhaps, than most of the computer world. Fortunately, most servers will stay compatible with existing standards, so you won't have to worry about your scripts not working. Here's a peek at the brave new world coming your way.
Java comes from Sun Microsystems as an open specification designed for platform-independence. Java code is compiled by a special Java compiler to produce bytecodes that can run on a Java Virtual Machine. Rather than produce and distribute executables as with normal CGI (or most programs), Java writers distribute instructions that are interpreted at runtime by the user's browser. The important difference here is that whereas CGI scripts execute on the server, a Java applet is executed by the client's browser. A browser equipped with a Java Virtual Machine is called a Java Browser. Netscape Navigator 2.0 and later, among other browsers, supports Java.
If you're interested in reading the technical specifications, you'll find that http://java.sun.com/whitePaper/java-whitepaper-1.html has pages' worth of mind-numbingly complete information:
Following the incredible popularity of the Internet and the unprecedented success of companies such as Netscape, Microsoft has entered the arena and declared war. With their own Web server, their own browsers, and a plethora of backend services (and don't forget unparalleled marketing muscle and name recognition), Microsoft is going to make an impact on the way people look at and use the Internet.
Along with some spectacular blunders, Microsoft has had its share of spectacular successes. One such success is Visual Basic, the all-purpose, anyone-can-learn-it Windows programming language. VB was so successful that Microsoft made it the backbone of their office application suite. Visual Basic for Applications (VBA) has become the de facto standard scripting language for Windows. While not as powerful as some other options (Borland's Delphi in some regards, or C programs in general), VB nevertheless has two golden advantages: It's easy to learn, and it has widespread support from third-party vendors and users.
When Microsoft announced it was getting into the Web server business, no one was terribly surprised to learn that they intended to incorporate VB or that they wanted everyone else to incorporate VB, too. VB Script, a subset of VBA, is now in prerelease design, but thousands of developers are feverishly busy playing with it and getting ready to assault the Internet with their toys.
You can get the latest technical specifications from http://www.microsoft.com/vbscript/vbsmain.htm.
VB Script, when it obtains Internet community approval and gets
implemented widely, will remove many of the arcane aspects from
CGI programming. No more fussing with C++ constructors or worrying
about stray pointers. No concerns about a crash bringing the whole
system down. No problems with compatibility. Distribution will
be a snap because everyone will already have the DLLs or will
be able to get them practically anywhere. Debugging can be done
on the fly, with plain-English messages and help as far away as
the F1 key. Code runs both server-side and client-side, whichever
makes the most sense for your application. Versions of the runtimes
will soon be available for Sun, HP, Digital, and IBM flavors of
UNIX, and are already available to developers for Windows 95 and
Windows NT. What's more, Microsoft is licensing VB Script for
free to browser developers and application developers. They want
VB Script to become a standard.
| VB Script |
On the CD-ROMs accompanying this book, you'll find Aclist.exe and Vbsdoc.exe. Aclist.exe is a self-extracting archive file containing all the runtime DLLs, source code examples, and ActiveX controls currently available for VB Script. Vbsdoc.exe is a self-extracting archive containing all the documentation for VB Script. |
So where's the rub? All that, if true, sounds pretty good, even wonderful. Well, yes; it is, but VB applications of whatever flavor have a two-fold hidden cost: RAM and disk space. With each release, GUI-based products tend to become more powerful and more friendly but also take up more disk space and more runtime memory. And don't forget that managing those resources in a GUI environment also racks up computing cycles, mandating a fast processor. Linux users with a 286 clone and 640K of RAM won't see the benefits of VB Script for a long, long time.
Although text-only UNIX machines don't comprise a large share of the paying market, they do nevertheless make up a large percentage of Internet users. Historically, the Internet community has favored large, powerful servers rather than large, powerful desktops. In part, this is due to the prevalence of UNIX on those desktops. In a text-based environment where the most demanding thing you do all day is the occasional grep, processing power and RAM aren't constant worries. As much as early DOS machines were considered "loaded" if they had 640K RAM, UNIX machines in use today often use that amount, or even less, for most applications. Usually, only high-end workstations for CAD-CAM or large LAN servers come equipped with substantial RAM and fast processors.
In the long run, of course, such an objection is moot. Within a few years, worries about those with 286s will be ludicrous; prices keep falling while hardware becomes more powerful. Anyone using less than a Pentium or fast RISC chip in the year 2000 won't get anyone's sympathy. But my concern isn't for the long run. VB Script will be there, along with a host of other possibilities as yet undreamed, and we'll all have the microprocessor horsepower to use and love it. But in the meantime, developers need to keep current users in mind and try to keep from disenfranchising them. The Internet thrives on its egalitarianism. Just as a considerate Webmaster produces pages that can be read by Lynx or Netscape Navigator, developers using Microsoft's fancy (and fascinating) new tools must keep in mind that many visitors won't be able to see their work for now.
The Virtual Reality Modeling Language (VRML) produces some spectacular effects. VRML gives you entire virtual worlds, or at least interactive, multi-participant, real-time simulations thereof. Or rather, it will give you those things someday. Right now, the 1.0 specification can only give you beautiful 3-D images with properties such as light source direction, reactions to defined stimuli, levels of detail, and true polygonal rendering.
VRML isn't an extension to HTML but is modeled after it. Currently, VRML works with your Web browser. When you click a VRML link, your browser launches a viewer (helper application) to display the VRML object. Sun Microsystems and others are working on integrating VRML with Java to alleviate the awkwardness of this requirement.
The best primer on VRML I've found is at http://vrml.wired.com/vrml.tech/vrml10-3.html. When you visit, you'll find technical specifications, sample code, and links to other sites. Also of interest is a theoretical paper by David Raggett at Hewlett-Packard. You can find it at http://vrml.wired.com/concepts/raggett.html.
You'll also want to visit the VRML Repository at http://www.sdsc.edu/vrml. This well-maintained and fascinating site offers demos, links, and technical information you won't find elsewhere.
Objects in VRML are called nodes and have characteristics: perspective, lighting, rotation, scale, shape hints, and so on. The MIME type for VRML files is x-world/x-vrml; you'll need to find and download viewers for your platform and hand-configure your browser to understand that MIME type.
VRML objects aren't limited to graphics. Theoretically, VRML can be used to model anything: MIDI data, waveform audio data, textures, and even people, eventually.
Of particular interest in the area of VRML is the notion of location independence. That is, when you visit a virtual world, some bits of it may come from your own computer, some objects from a server in London, another chunk from NASA, and so forth. This already happens with normal Web surfing; sometimes the graphics for a page come from a different server than does the text or only the page counter might be running on another server. While handy, this capability doesn't mean much for standard Web browsing. For processor-intensive applications such as virtual reality modeling, however, this type of independence makes client/server computing sensible and practical. If your machine needs only the horsepower to interpret and display graphics primitives, while a hundred monster servers are busy calculating those primitives for you, it just might be possible to model aspects of reality in real-time.
Process Software has proposed a standard called ISAPI (Internet Server Application Programming Interface), which promises some real advantages over today's CGI practices. You can read the proposal for yourself at http://www.microsoft.com/intdev/inttech/isapi.htm or contact Process Software directly at http://www.process.com.
In a nutshell, the proposal says that it doesn't make sense to spawn external CGI tasks the traditional way. The overhead is too high, the response time too slow, and coordinating the tasks burdens the Web server. Instead of using interpreted scripts or compiled executables, Process proposes using DLLs (dynamic link libraries). DLLs have a number of advantages:
Process Software has gone beyond proposing the specification; they've implemented it in Purveyor, their own server software. I've tried it, and they're right: CGI done through an ISAPI DLL performs much faster than CGI done the traditional way. There are even ISAPI wrappers: DLLs that let established CGI programs use the new interface.
Microsoft's Internet Information Server (IIS) uses ISAPI DLLs. There are already dozens of third-party freeware and shareware ISAPI DLLs available, and Microsoft provides several tutorials, examples, and guidelines at http://www.microsoft.com/developer/tech/internet/server/isfilter.htm.
My guess is that it won't be long before you see ISAPI implemented on all Windows NT servers. Eventually, it will become available for UNIX-based servers, too.