by Eric Ladd
One of the new sensations on the Web today is an individualized page, set up according to your own specifications. But how do you let the server know what your specs are? Is there a way that users can provide information to servers and get a personalized response in return?
The answer, of course, is "yes," and the way to do it is with World Wide Web forms. Forms gather data from Web surfers using a variety of different input fields or controls, many of which are similar to controls found in Windows and Macintosh operating systems. The server receives data and then hands it off to a separate program for processing. The output of the separate program is typically an HTML page constructed with information provided on the form. The custom-generated HTML page is sent back to the user's client program via the server.
This chapter examines how to create Web forms and gives an overview of some of the behind-the-scenes activity that has to occur to produce the custom pages that Web users have come to love.
Forms are the visible or "front-end" portion of interactive pages. Users enter information into form fields and click a button to submit the data. The browser then packages the data, opens an HTTP connection, and sends the data to a server. Things then move to the transparent or "back-end" part of the process.
Web servers are programs that know how to distribute Web pages. They are not programmed to be able to process data from every possible form, so the best they can do is to hand off the form data to a program that does know what to do with it. This hand-off occurs with the help of the Common Gateway Interface or CGI, a set of standards by which servers communicate with external programs.
The program that processes the form data is typically called a CGI script or a CGI program. The script or program performs some manipulations of the data and composes a response, typically an HTML page. The response page is handed back to the server (via CGI) which, in turn, passes it along to the browser that initiated the request.
Forms and CGI are opposite sides of the same coin. Both are essential
to create interactive pages, but it is the forms side of the coin
that the user sees.
| NOTE |
When a CGI script or program composes an HTML page, it is said to be generating HTML on the fly. The ability to generate pages on the fly is what makes custom responses to database and forms submission possible. |
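As a rough illustration of generating HTML on the fly, the following Python sketch builds a complete CGI response as a string. The function name build_response and the sample visitor name are assumptions made for this example; a real CGI program can be written in any language the server is able to run.

```python
#!/usr/bin/env python3
"""Minimal sketch of a CGI-style program that generates HTML on the fly."""
import html


def build_response(visitor_name):
    """Return a CGI response: a header line, a blank line, then the HTML page."""
    # Escape the value so user input cannot inject markup into the page.
    safe_name = html.escape(visitor_name)
    body = (
        "<HTML><HEAD><TITLE>Welcome</TITLE></HEAD>\n"
        f"<BODY><H1>Hello, {safe_name}!</H1></BODY></HTML>\n"
    )
    # The server expects a Content-Type header followed by a blank line.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI program writes its response to standard output.
    print(build_response("Web surfer"), end="")
```

The server passes whatever the program writes to standard output back to the browser, which renders it like any other HTML page.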
HTML's form support is simple and complete. A handful of HTML tags create the most popular elements of modern graphical interfaces, including text windows, check boxes and radio buttons, pull-down menus, and push buttons.
Composing HTML forms might sound like a complex task, but you
need to master surprisingly few tags to do it. All form-related
tags occur between the <FORM> and </FORM>
container tags. If you have more than one form in an HTML document,
the closing </FORM> tag is essential for distinguishing
between the multiple forms.
| TIP |
Adding a </FORM> tag immediately after creating a <FORM> tag is a good practice; then you can go back to fill in the contents. Following this procedure helps you avoid leaving off the closing tag once you've finished. |
Each HTML form has three main components: the form header, one or more named input fields, and one or more action buttons.
The form header and the <FORM> tag are actually
one and the same. The <FORM> tag takes the three
attributes shown in Table 11.1. The ACTION attribute
is required in every <FORM> tag.
| Attribute | Purpose |
| ACTION | Specifies the URL of the processing script |
| ENCTYPE | Specifies how the form data is encoded when it is sent; needed for forms that upload a file |
| METHOD | Tells the browser how it should send the form data to the server (GET or POST) |
ACTION. ACTION is set equal to the URL of the processing script so that the browser knows where to send the form data once it is entered. Without it, the browser would have no idea where the form data should go.
The ACTION URL can also contain extra path information at the end of it. The extra path information passes on to the script so that it can correctly process the data. The extra path information is not found anywhere on the form so it is transparent to the user. Allowing for the possibility of extra path information, an ACTION URL has the following form:
protocol://server/path/script_file/extra_path_info
You can use the extra path information to pass an additional file name or directory information to a script. For example, on some servers, the image map facility uses extra path information to specify the name of the map file. The name of the map file follows the path to the image map script. A sample URL might be http://www.your_firm.com/cgi-bin/imagemap/homepage.
The name of the script is imagemap, and homepage is the name of the map file used by the image map.
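On the script side, CGI delivers the extra path information in the PATH_INFO environment variable. The sketch below, written in Python purely for illustration, shows how a script might recover the map file name from the sample URL; the function name map_file_from_path_info is an assumption made for this example.

```python
import os


def map_file_from_path_info(environ):
    """Return the extra path information with its leading slash removed."""
    # CGI places everything after the script name in PATH_INFO.
    extra_path = environ.get("PATH_INFO", "")
    return extra_path.lstrip("/")


if __name__ == "__main__":
    # For the sample URL, the server would set PATH_INFO to "/homepage".
    print(map_file_from_path_info(os.environ))
```

Passing the environment in as a parameter keeps the function easy to try out with an ordinary dictionary instead of a live server.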
METHOD=GET|POST. METHOD specifies the HTTP method to use when passing the data to the script and can be set to values of GET or POST. When you're using the GET method, the browser appends the form data to the end of the URL of the processing script. The POST method sends the form data to the server in the body of the HTTP request, separate from the URL.
METHOD is not a mandatory attribute of the <FORM>
tag. In the absence of a specified method, the browser uses the
GET method.
| CAUTION |
Some servers may have operating environment limitations that prevent them from processing a URL that exceeds a certain number of characters, typically 1 kilobyte of data. This limitation can be a problem when you're using the GET method to pass a large amount of form data. Because the GET method appends the data to the end of the processing script URL, you run a greater risk of passing a URL that's too big for the server to handle. If URL size limitations are a concern on your server, you should use the POST method to pass form data. |
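To get a feel for when the limit matters, the following Python sketch measures the URL that a GET submission would produce and suggests POST when it is too long. The 1,024-character limit and the function name choose_method are assumptions made for this example; the real limit depends on your server.

```python
from urllib.parse import urlencode

URL_LIMIT = 1024  # assumed limit; check your own server's documentation


def choose_method(action_url, fields, limit=URL_LIMIT):
    """Return "GET" if the encoded URL fits within the limit, else "POST"."""
    # With GET, the encoded data is appended to the script URL after a "?".
    full_url = action_url + "?" + urlencode(fields)
    return "GET" if len(full_url) <= limit else "POST"
```

A form with a few short text fields fits comfortably, while a form with a large multiline comment field can easily exceed the limit.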
ENCTYPE. The ENCTYPE attribute was introduced by Netscape to support uploading a file as form input. It tells the browser how to encode the form data before sending it. The default encoding, application/x-www-form-urlencoded, cannot carry file contents, so a form that uploads a file sets ENCTYPE to multipart/form-data and uses the POST method. ENCTYPE does not create the input field for the file name; to prompt for a file to upload, you'll need to use an <INPUT> tag with TYPE set equal to FILE.
As an example of the three <FORM> tag attributes, examine the following HTML:
<FORM ACTION="process_it.cgi" METHOD=POST ENCTYPE="multipart/form-data">
Enter the name of the HTML file to validate:
<INPUT TYPE="FILE" NAME="html_file">
<INPUT TYPE="SUBMIT" VALUE="Validate it!">
</FORM>
The form header of this short form instructs the server to process the form data using the program named process_it.cgi. Form data is passed using the POST method and, because the form uploads a file, is encoded as multipart/form-data.
The named input fields typically comprise the bulk of a form.
The fields appear as standard GUI controls such as text boxes,
check boxes, radio buttons, and menus. You assign each field a
unique name that eventually becomes the variable name used in
the processing script.
| TIP |
If you are not coding your own processing scripts, be sure to sit down with your programmer to agree on variable names. The names used in the form should exactly match those used in coding the script. |
You can use several different GUI controls to enter information
into forms. The controls for named input fields appear in Table
11.2.
| Field Type | HTML Tag |
| Text Box | <INPUT TYPE="TEXT"> |
| Password Box | <INPUT TYPE="PASSWORD"> |
| Checkbox | <INPUT TYPE="CHECKBOX"> |
| Radio Button | <INPUT TYPE="RADIO"> |
| Hidden Field | <INPUT TYPE="HIDDEN"> |
| Images | <INPUT TYPE="IMAGE"> |
| File | <INPUT TYPE="FILE"> |
| Text Window | <TEXTAREA>...</TEXTAREA> |
| Menu | <SELECT>...<OPTION>...</SELECT> |
You'll notice in Table 11.2 that the <INPUT> tag
handles the majority of named input fields. <INPUT>
is a stand-alone tag that, thanks to the many values of its TYPE
attribute, can place most of the fields you need on your forms.
<INPUT> also takes other attributes, depending
on which TYPE is in use. These additional attributes
are covered for each type, as appropriate, over the next several
sections.
| NOTE |
The <INPUT> tag and other tags that produce named input fields just create the fields themselves. You, as the form designer, must include some descriptive text next to each field so that users know what information to enter. You may also need to use line breaks, paragraph breaks, and non-breaking spaces to create the spacing you want between form fields. |
| TIP |
Because browsers ignore white space and the text labels to the left of the boxes differ in length, lining up the left edges of text input boxes on multiple lines is difficult. In this instance, HTML tables are invaluable. By setting up the text labels and input fields as cells in the same row of an HTML table, you can produce a nicely formatted form. To learn more about forms using table conventions, consult Chapter 12, "Tables." |
Text and Password Fields. Text and password
fields are simple data entry fields. The only difference between
them is that text typed into a password field appears
on-screen as asterisks (*).
| CAUTION |
Using a password field may protect users' passwords from the people looking over their shoulders, but it does not protect the password as it travels over the Internet. To protect password data as it moves from browser to server, you need to use some type of encryption or similar security measure. Authenticating both the server and the client with signed digital certificates is another step you can take to keep Internet transactions secure. |
The most general text or password field is produced by the HTML (attributes in square brackets are optional):
<INPUT TYPE="{TEXT|PASSWORD}" NAME="Name" [VALUE="default_text"]
[SIZE="width"] [MAXLENGTH="max_width"]>
The NAME attribute is mandatory because it provides a unique identifier for the data entered into the field.
The optional VALUE attribute allows you to place some default text in the field, rather than have it initially appear blank. This capability is useful if a majority of users will enter a certain text string into the field. In such cases, you can use VALUE to put the text into the field, thereby saving most users the effort of typing it.
The optional SIZE attribute gives you control over how
many characters wide the field should be. The default SIZE
is typically 20 characters, although this number can vary from
browser to browser. MAXLENGTH is also optional and allows
you to specify the maximum number of characters that can be entered
into the field.
| NOTE |
Previously, the SIZE attribute took the form SIZE="width,height" where setting a height (other than 1) produced a multiline field. With the advent of the <TEXTAREA>...</TEXTAREA> tag pair for creating multiline text windows, height has become something of a vestige and is ignored by most browsers. |
Figure 11.1 shows a form used to prompt for a login ID and password. Notice how password text appears as asterisks. The corresponding HTML is shown in Figure 11.2.
Figure 11.1 : Text and password fields enable you to create a login page for your site.
Check Boxes. Check boxes are used to provide users with several choices. Users can select as many choices as they want. An <INPUT> tag that is used to produce a check box option has the following syntax:
<INPUT TYPE="CHECKBOX" NAME="Name" VALUE="Value" [CHECKED]>
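Each check box option is created by its own <INPUT> tag, and the simplest approach is to give each one its own unique NAME. If several check box options share the same NAME, the browser sends one NAME=VALUE pair for each checked option, so the script can tell the choices apart only if their VALUEs differ and it is written to handle repeated names.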
The VALUE attribute specifies which data is sent to the server if the corresponding check box is chosen. This information is transparent to the user. The optional CHECKED attribute preselects a commonly selected check box when the form is rendered on the browser screen.
Figure 11.3 shows a page with several check box options. The HTML that produces the check boxes is shown in Figure 11.4.
| NOTE |
If they are selected, check box options show up in the form data sent to the server. Options that are not selected do not appear. |
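Because unselected check boxes are simply missing from the form data, a script has to treat absence as "not checked." The Python sketch below illustrates this with the standard parse_qs function; the check box names news, weather, and sports and the function name selected_options are assumptions made for this example.

```python
from urllib.parse import parse_qs


def selected_options(encoded_data, option_names):
    """Map each check box name to True if it was checked, else False."""
    # Only checked boxes appear in the submitted data.
    submitted = parse_qs(encoded_data)
    return {name: name in submitted for name in option_names}


if __name__ == "__main__":
    # The user checked "news" and "sports" but left "weather" unchecked.
    print(selected_options("news=yes&sports=yes", ["news", "weather", "sports"]))
```

The script needs its own list of expected check box names, since the form data alone cannot reveal which options were offered but left blank.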
Radio Buttons. When you set up options with a radio button format, you should make sure that the options are mutually exclusive so that a user won't try to select more than one.
The HTML code to produce a set of three radio button options is as follows:
<INPUT TYPE="RADIO" NAME="Name" VALUE="VALUE1" [CHECKED]>Option 1<P>
<INPUT TYPE="RADIO" NAME="Name" VALUE="VALUE2">Option 2<P>
<INPUT TYPE="RADIO" NAME="Name" VALUE="VALUE3">Option 3<P>
The VALUE and CHECKED attributes work exactly the same as they do for check boxes, although you should have only one preselected radio button option. A fundamental difference with a set of radio button options is that they all have the same NAME. This is permissible because the user can select only one of the options.
An application of radio buttons is demonstrated in Figure 11.5; the corresponding HTML is in Figure 11.6.
Figure 11.6 : Each radio button option is created by an <INPUT> tag with TYPE set to RADIO.
Hidden Fields. Technically, hidden fields are not meant for data input. Instead, they let you send information to the server about a form without displaying that information anywhere on the form itself. The general format for including hidden fields is as follows:
<INPUT TYPE="HIDDEN" NAME="name" VALUE="value">
One possible use of hidden fields is to allow a single general script to process data from several different forms. The script needs to know which form is sending the data and a hidden field can provide this information without requiring anything on the part of the user.
Another application of hidden fields is for carrying input from
one form to another. This lets you split up a long form into several
smaller forms and still keep all of the user's input in one place.
| NOTE |
Because hidden fields are transparent to users, it doesn't matter where you put them in your HTML code. Just make sure they occur between the <FORM> and </FORM> tags that define the form that contains the hidden fields. |
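When a long form is split into several smaller ones, the script that produces the second form can write the answers from the first form back out as hidden fields. The following Python sketch shows one way this might look; the function name hidden_fields is an assumption made for this example.

```python
import html


def hidden_fields(previous_answers):
    """Return one <INPUT TYPE="HIDDEN"> tag per name/value pair."""
    tags = []
    for name, value in previous_answers.items():
        # Escape quotes and angle brackets so the values cannot break the tag.
        tags.append(
            '<INPUT TYPE="HIDDEN" NAME="{}" VALUE="{}">'.format(
                html.escape(name, quote=True), html.escape(value, quote=True)
            )
        )
    return "\n".join(tags)


if __name__ == "__main__":
    # Answers collected on the first form are carried into the second form.
    print(hidden_fields({"first": "Joe", "last": "Schmoe"}))
```

The generated tags would be placed anywhere between the <FORM> and </FORM> tags of the second form, so the earlier answers are resubmitted along with the new ones.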
Files. You can upload an entire file to a server by using a form. The first step is to set the ENCTYPE attribute in the <FORM> tag to multipart/form-data and the METHOD to POST. Then give the user a field for entering a file name by including an <INPUT> tag with TYPE set equal to FILE:
<FORM ACTION="whatever.cgi" METHOD=POST ENCTYPE="multipart/form-data">
What file would you like to submit:
<INPUT TYPE="FILE" NAME="your_file">
...
</FORM>
Being able to send an entire file is useful when submitting a document produced by another program, for example, an Excel spreadsheet, a résumé in Word format, or just a plain Notepad text file.
Text and password boxes are used for simple, one-line input fields. You can create multiline text windows that function in much the same way by using the <TEXTAREA> and </TEXTAREA> container tags. The HTML syntax for a text window is as follows:
<TEXTAREA NAME="Name" [ROWS="rows"] [COLS="columns"]>
Default_window_text
</TEXTAREA>
The NAME attribute gives the text window a unique identifier just as it does with the variations on the <INPUT> tag. The optional ROWS and COLS attributes allow you to specify the dimensions of the text window as it appears on the browser screen. The default number of rows and columns varies by browser. For example, Internet Explorer uses three rows and thirty columns as defaults.
The text that appears between the <TEXTAREA> and </TEXTAREA> tags shows up in the input window by default. To type in something else, users need to delete the default text and enter their text.
Multiline text windows are ideal for entry of long pieces of text such as feedback comments or e-mail messages (see Figures 11.7 and 11.8). Some corporate sites on the Web that collect information on potential employees may ask you to copy and paste your entire résumé into multiline text windows!
The final technique for creating a named input field is to use the <SELECT> and </SELECT> container tags to produce pull-down or scrollable option menus (see Figures 11.9 and 11.10). The HTML code used to create a general menu is as follows:
<SELECT NAME="Name" [SIZE="size"] [MULTIPLE]>
<OPTION [SELECTED]>Option 1</OPTION>
<OPTION [SELECTED]>Option 2</OPTION>
<OPTION [SELECTED]>Option 3</OPTION>
...
<OPTION [SELECTED]>Option n</OPTION>
</SELECT>
Figure 11.10 : Each menu item is created with an <OPTION> tag.
In the <SELECT> tag, the NAME attribute
again gives the input field a unique identifier. The optional
SIZE attribute lets you specify how many options should
be displayed when the menu renders on the browser screen. If you
have more options than you have space to display them, you can
access them either by using a pull-down window or by scrolling
through the window with scroll bars. The default SIZE
is 1. If you want to let users choose more than one menu option,
include the MULTIPLE attribute. When MULTIPLE
is specified, users can choose multiple options by holding down
the Control key and clicking the options they want.
| NOTE |
If you specify the MULTIPLE attribute and SIZE=1, a one-line scrollable list box displays instead of a drop-down list box. This box appears because you can select only one item (not multiple items) in a drop-down list box. |
Each option in the menu is specified inside of its own <OPTION> container tag. If you want an option to be preselected, include the SELECTED attribute in the appropriate <OPTION> tag. The value passed to the server is the menu item that follows the <OPTION> tag unless you supply an alternative using the VALUE attribute. For example:
<SELECT NAME="STATE">
<OPTION VALUE="NY">New York</OPTION>
<OPTION VALUE="DC">Washington, DC</OPTION>
<OPTION VALUE="FL">Florida</OPTION>
...
</SELECT>
In the preceding menu, the user clicks a state name, but it is the state's two-letter abbreviation that passes to the server.
The handy <INPUT> tag returns to provide an easy way of creating the form action buttons you see in many of the preceding figures. Buttons can be of two types: Submit and Reset. Clicking a Submit button instructs the browser to package the form data and send it to the server. Clicking a Reset button clears out any data entered into the form and sets all the named input fields back to their default values.
Regular Submit and Reset Buttons. Any form you compose should have a Submit button so that users can submit the data they enter. The one exception to this rule is a form containing only one input field. For such a form, pressing Enter automatically submits the data. Reset buttons are technically not necessary but are usually provided as a user courtesy.
To create Submit or Reset buttons, use the <INPUT> tags as follows:
<INPUT TYPE="SUBMIT" VALUE="Submit Data">
<INPUT TYPE="RESET" VALUE="Clear Data">
Use the VALUE attribute to specify the text that appears on the button. You should set VALUE to a text string that concisely describes the function of the button. If VALUE is not specified, the button text reads Submit Query for Submit buttons and Reset for Reset buttons.
Using Images as Submit Buttons. You can use a custom image as the Submit button for your forms, so that clicking the image instructs the browser to submit the form data (see Figures 11.11 and 11.12). To do this, you set TYPE equal to IMAGE in your <INPUT> tag and you provide the URL of the image you want to use with the SRC attribute:
<INPUT TYPE="IMAGE" SRC="images/submit_button.gif">
Figure 11.12 : An <INPUT> tag with TYPE set to IMAGE is the key to creating your own Submit button.
You can also use the ALIGN attribute in this variation of the <INPUT> tag to control how text appears next to the image (TOP, MIDDLE, or BOTTOM), or to float the image in the left or right margins (LEFT or RIGHT).
| Multiple Submit Buttons |
It's possible to have more than one Submit button on a form (see Figure 11.13), although there is not yet consistent browser support for multiple Submit buttons. You distinguish between Submit buttons by using the NAME attribute in the <INPUT> tags used to create the buttons. For example, you might have: <INPUT TYPE="SUBMIT" NAME="SEARCH" VALUE="Conduct Search"> to produce buttons that allow users to search the information they've entered or add the information they've entered to a database. Because there is only tentative support for multiple Submit buttons, you may want to hold off on implementing them until they are standard. |
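On the script side, a browser that supports named Submit buttons includes the NAME=VALUE pair of the button that was clicked in the form data, so the script can check which name is present. The Python sketch below assumes the SEARCH button shown above plus a second, hypothetical button named ADD; the function name pressed_button is also an assumption made for this example.

```python
from urllib.parse import parse_qs


def pressed_button(encoded_data, button_names=("SEARCH", "ADD")):
    """Return the NAME of the Submit button found in the form data, or None."""
    submitted = parse_qs(encoded_data)
    for name in button_names:
        # Only the button that was clicked contributes a NAME=VALUE pair.
        if name in submitted:
            return name
    return None
```

Returning None when no button name is found gives the script a way to fall back to a default action for browsers that do not send the button's name.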
It's possible to put more than one form on a single HTML page. The customized Microsoft Network page, shown in Figure 11.14, shows single-field forms that query various Web search engines. Each of these has its own form header, named input fields, and action buttons (see Figure 11.15). Closing off each form with a </FORM> tag is critical so that the browser distinguishes between one form and another.
Figure 11.15 : An individual form is required for each search field in Figure 11.14.
Once a user enters some form data and clicks a submit button, the browser does two things. First, it packages the form data into a single string, a process called encoding. Then it sends the encoded string to the server by either the GET or POST HTTP method. The next two sections provide some details on each of these steps.
When a user clicks the Submit button on a form, his or her browser gathers all the data and strings it together in NAME=VALUE pairs, each separated by an ampersand (&) character. This process is called encoding. It is done to package the data into one string that is sent to the server.
Consider the following HTML code:
<FORM ACTION="http://www.your_firm.com/cgi-bin/form.cgi" METHOD="POST">
<INPUT TYPE="TEXT" NAME="first">
<INPUT TYPE="TEXT" NAME="last">
<INPUT TYPE="SUBMIT">
</FORM>
If a user named Joe Schmoe enters his name into the form produced by the preceding HTML code, his browser creates the following data string and sends it to the CGI script:
first=Joe&last=Schmoe
If the GET method is used instead of POST, the same string is appended to the URL of the processing script, producing the following encoded URL:
http://www.your_firm.com/cgi-bin/form.cgi?first=Joe&last=Schmoe
A question mark (?) separates the script URL from the
encoded data string.
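The packaging step can be reproduced outside a browser. This Python sketch uses the standard urlencode function to build the same data string and the corresponding GET URL; the function names encode_form and get_url are assumptions made for this example.

```python
from urllib.parse import urlencode


def encode_form(fields):
    """Join (name, value) pairs as NAME=VALUE strings separated by "&"."""
    return urlencode(fields)


def get_url(action_url, fields):
    """Build the URL a browser would request when METHOD is GET."""
    # A question mark separates the script URL from the encoded data.
    return action_url + "?" + encode_form(fields)


if __name__ == "__main__":
    fields = [("first", "Joe"), ("last", "Schmoe")]
    print(encode_form(fields))
    print(get_url("http://www.your_firm.com/cgi-bin/form.cgi", fields))
```

Passing the fields as a list of pairs keeps them in the same order in which they appear on the form.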
| Storing Encoded URLs |
As you learned in the previous discussion of URL encoding, packaging form data into a single text string follows a few simple formatting rules. Consequently, you can fake a script into believing that it is receiving form data without using a form. To do so, you simply send the URL that would be constructed if a form were used. This approach may be useful if you frequently run a script with the same data set. For example, suppose you frequently search the Web index Yahoo for new documents related to the scripting language JavaScript. If you are interested in checking for new documents several times a day, you could fill out the Yahoo search query each time. A more efficient way, however, is to store the query URL as a bookmark. Each time you select that item from your bookmarks, a new query generates as if you had filled out the form. The stored URL would look like the following: http://search.yahoo.com/bin/search?p=JavaScript |
Further encoding occurs with data that is more complex than a single word. Such encoding simply replaces spaces with the plus character and translates any other possibly troublesome character (control characters, the ampersand and equal sign, some punctuation, and so on) to a percent sign, followed by its hexadecimal equivalent. Thus, the following string:
I love HTML!
becomes:
I+love+HTML%21
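The same substitution rules are available in Python's standard library, which makes them easy to experiment with. In this sketch, the function names encode_value and decode_value are assumptions made for this example; quote_plus and unquote_plus are the standard functions that do the work.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_value(text):
    """Replace spaces with "+" and troublesome characters with %XX codes."""
    return quote_plus(text)


def decode_value(encoded):
    """Reverse the encoding, as a processing script would."""
    return unquote_plus(encoded)


if __name__ == "__main__":
    encoded = encode_value("I love HTML!")
    print(encoded)                # I+love+HTML%21
    print(decode_value(encoded))  # I love HTML!
```

A processing script applies the decoding step to every name and value it receives before using the data.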
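You have two ways to read the form data submitted to a CGI script, depending on the METHOD the form used. The METHOD the form used (either GET or POST) is stored in an environment variable called REQUEST_METHOD and, based on that, the data should be read in one of two ways. If the method is GET, the encoded data is found in the QUERY_STRING environment variable. If the method is POST, the script reads the encoded data from standard input, and the CONTENT_LENGTH environment variable gives the number of bytes to read.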
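A minimal Python sketch of a script that handles both methods follows. It relies on the standard CGI environment variables REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH; the function name read_form_data is an assumption made for this example, and the sketch assumes the encoded data is plain ASCII so that characters and bytes coincide.

```python
import os
import sys
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the decoded form fields as a dict of name -> list of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the encoded string arrives on standard input, and
        # CONTENT_LENGTH says how much of it to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        encoded = stdin.read(length)
    else:
        # GET: the encoded string is in the QUERY_STRING variable.
        encoded = environ.get("QUERY_STRING", "")
    # parse_qs splits on "&" and "=" and undoes the "+" and %XX encoding.
    return parse_qs(encoded)


if __name__ == "__main__":
    fields = read_form_data(os.environ, sys.stdin)
    print("Content-Type: text/plain\r\n\r\n", end="")
    for name, values in fields.items():
        print(name, "=", ", ".join(values))
```

Each field maps to a list of values because a name can legitimately appear more than once in the form data, for example with a menu that uses the MULTIPLE attribute.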
Web designers have discovered many ways to use forms to enhance users' experiences. This chapter closes with a quick look at some examples of creative uses of Web forms.
AltaVista has quickly become one of the most prolific online search indexes on the Web. AltaVista searches frequently return tens of thousands of results and can include Web documents and posts to Usenet newsgroups.
Figure 11.16 shows AltaVista's advanced search query form. The form uses two drop-down menus, two multiline text windows, two text fields, and a submit button to support users in composing their queries.
Figure 11.16 : How many results? AltaVista uses several form elements on its Advanced Query page.
If you went to Macromedia's User Conference in September 1996, you might have registered using the form shown in Figure 11.17. The extensive form collects attendee information, which registration option you'd like, what seminar you want to attend, and how you want to pay.
Figure 11.17 : Many technology industry conferences now permit online registration via a Web form.
Microsoft, Netscape, and other companies now offer customized pages each time users visit their sites. Users can supply information on how to configure the page and the company's server uses this information to generate a fresh and tailored page for the user at each visit. Figure 11.18 shows the form you fill out to set up your custom page with Excite.
| Creating Custom Pages |
Web sites support custom pages through the use of cookies. Cookies are bits of information that are stored on your hard drive by your browser at the instruction of a server. Whenever your browser requests a page from that server, it sends the stored cookie information along with the request. In the case of custom pages, the cookie file contains the customization parameters. When you contact a server where you have a custom page, the server reads your customization information from the cookie your browser sends and uses it to compose a page for you. While cookies are a compelling idea, they also create serious security issues. After all, you are letting another computer write to your hard drive. Both Netscape Navigator and Microsoft Internet Explorer can be set up to notify you whenever a cookie transmission is requested. You can choose to accept or reject the request based on your comfort level with the server making the request. As a hypothetical example of what a destructive cookie can do, consider the following. When a cookie is stored, it is accompanied by information that specifies a range of URLs for which the cookie is valid. Suppose a software company plants a cookie on your hard drive that is valid for URLs from a competitor's site and that will interfere with the cookies you would ordinarily receive from the competitor. This harms you because you're not able to see what the competitor firm has to offer and it harms the competitor firm because it can't get the word out about its products. While you can argue that the destructive cookie isn't erasing your files, it is probably setting up its creator company for a monopoly lawsuit! |
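To make the customization step concrete, here is a small Python sketch of how a CGI script might read customization parameters from the cookie data it receives. A CGI script finds the cookie text in the HTTP_COOKIE environment variable; the cookie names layout and color and the function name read_preferences are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def read_preferences(cookie_header):
    """Return the cookies in a Cookie header as a plain dict."""
    jar = SimpleCookie()
    # The header is a list of name=value pairs separated by semicolons.
    jar.load(cookie_header)
    return {name: morsel.value for name, morsel in jar.items()}


if __name__ == "__main__":
    import os

    # Under CGI, the browser's cookie data arrives in HTTP_COOKIE.
    print(read_preferences(os.environ.get("HTTP_COOKIE", "")))
```

With the preferences in hand, the script can generate the tailored page on the fly in the same way it would respond to a form submission.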
As Web surfers gain more confidence in the security of business transactions over the Internet, you'll see more and more online stores cropping up. Figure 11.19 takes you to Paramount Studios' Studio Store where you can purchase T-shirts, sweatshirts, hats, jackets, and mugs. As you select items, you carry them with you in a virtual shopping bag. When you're finished, click your way to the checkout counter where the items in your bag are tabulated.
You have a couple of options when it comes to paying for your purchases. One approach is to supply your credit card number, though many users are reluctant to do this for fear of having their card number intercepted by an ill-intentioned hacker. Another payment method is to use a service like CyberCash, which provides you with "digital money" to spend on the Web. Before CyberCash will transfer the digital money to the vendor, it authenticates both you and the vendor to make sure the right money is going to the right place. This approach is much less risky than providing a credit card number and you will see this idea really take off in the coming months.