Chapter 17

Proposed Additions to HTML

by Eric Ladd


HTML has been evolving continuously since its introduction at the start of the 1990s. The HTML standard is an open standard, which means, in part, that members of the entire Internet community are welcome to submit proposals for additional HTML tags. For the most part, however, companies that produce browser software, such as Netscape and Microsoft, have been the driving forces behind the introduction of many new tags.

Netscape and Microsoft aren't the only organizations that are extending HTML. Many other individuals and corporations are making proposals to the World Wide Web Consortium (W3C) for consideration in future releases of the standard.

In fact, it was James Seidman at Spyglass who developed the proposal for using the <MAP> and <AREA> tags for client-side image maps. Sun, the company that created Java, is interested in advancing the <APPLET> tag and other tags used to embed Java programs in Web pages. Even if they are not producing browser software, each of these firms has a stake in the evolution of HTML as the Web content developer's primary tool.

When the W3C released HTML 3.2, it made clear that many issues remained to be considered for later releases of the standard. Some of these revolve around proposals that were made for HTML 3.0 but not incorporated into the 3.2 standard. Others concern broader content-related issues, such as how to render mathematical characters on a browser screen or how to embed objects into documents. The resolution of these issues will continue to drive the development of HTML.

This chapter examines some of the proposals still "on the drawing board," a number of which were proposed for HTML 3.0 but not adopted. Reading this chapter gives you a glimpse into HTML's future and an idea of some of the issues that standards bodies such as the W3C face when developing a standard.

NOTE
The W3C maintains an Activity Statement on HTML that you can read at http://www.w3.org/pub/WWW/MarkUp/Activity.

What Happened to HTML 3.0?
HTML 3.0 refers to a set of proposals made to the W3C for consideration in a new HTML standard. The proliferation of so many browser-specific HTML tags made the 3.0 draft so large that the W3C considered full implementation of all of the proposals unwise. In the W3C's words, "standardization and deployment of the whole proposal [would prove] unwieldy."
So as not to delay the advancement of the HTML standard, the W3C released HTML 3.2, a standard that included all of the functionality of HTML 2.0 plus some of the already widely deployed HTML extensions such as those for tables and floating images.
The HTML 3.0 draft has been allowed to expire and will not be maintained. However, many of the 3.0 proposals are still under consideration and may find their way into later versions of HTML.

Setting Up Search Ranges

The <RANGE> tag was proposed for HTML 3.0 but was not made part of the HTML 3.2 standard. According to the proposal, placing a <RANGE> tag in the document head allows you to set up a range in the document for searching. <RANGE> takes the CLASS attribute, which is set equal to SEARCH to set up a search range, and the FROM and UNTIL attributes, which designate the beginning and end of the search range. A sample <RANGE> tag might look like the following:

<RANGE CLASS=SEARCH FROM="start" UNTIL="finish">

The start and finish markers are set up in the body of the document using the <SPOT ID="start"> and <SPOT ID="finish"> tags at the points where you want the search range to begin and end, respectively.
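
Putting the two pieces together, a document that follows the proposal might be laid out roughly as in the sketch below. The title and body text are placeholders invented for illustration, and because <RANGE> and <SPOT> were never adopted, you should not expect a browser to act on them.

```html
<HTML>
<HEAD>
<TITLE>Product Catalog</TITLE>
<!-- Proposed HTML 3.0 syntax: declare a search range between two spots -->
<RANGE CLASS=SEARCH FROM="start" UNTIL="finish">
</HEAD>
<BODY>
<P>Navigation links and other material outside the search range.</P>
<!-- The search range begins here -->
<SPOT ID="start">
<P>Catalog entries that should be searchable go here.</P>
<!-- The search range ends here -->
<SPOT ID="finish">
<P>Copyright notice, also outside the search range.</P>
</BODY>
</HTML>
```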

NOTE
The proposal of the <RANGE> tag is a testament to how important it is that your documents be searchable. Make sure you do everything you can to make your documents friendly to robots and other indexing programs.

Setting Up Tab Stops

A number of HTML 3.0 proposals called for authors to have greater control over page layout. One interesting idea that was not made part of HTML 3.2 proposed the addition of a <TAB> tag, which allows you to set up your own tab stops in a document. To use a tab stop, you need first to define it using the ID attribute:

My first tab stop is <TAB ID="first">here, followed by some other text.

The preceding HTML sets up the first tab stop in front of the letter "h" in the word "here." To use the tab stop, you use the <TAB> tag with the TO attribute:

<TAB TO="first">This sentence starts below the word "here."

On the browser screen, the "T" in the word "This" is aligned directly below the "h" in the word "here."

With the implementation of cascading style sheets, which permit good control over indentation and other layout attributes, it is unclear whether the <TAB> tag will receive consideration for later standards.

Logical Text Styles

While there are no new logical styles in HTML 3.2, several of them were proposed as part of HTML 3.0. The styles are shown in Table 17.1. Because many of these proposals are still under consideration, it's still possible that you'll see any or all of these tags used in the future. All of the tags shown in Table 17.1 are container tags. The closing tags are left off in the interest of space.

Table 17.1  New Logical Styles Proposed in HTML 3.0

Style Name        Tag
Abbreviation      <ABBREV>
Acronym           <ACRONYM>
Author's name     <AU>
Deleted text      <DEL>
Inserted text     <INS>
Person's name     <PERSON>
Short quotation   <Q>

NOTE
Recall that logical styles are often rendered differently on different browsers.

Most of the new logical styles are self-explanatory. Text marked with the <Q> style will appear in quotation marks appropriate to the document's language context. The <INS> and <DEL> styles are expected to be useful in the context of legal documents. The <PERSON> and <AU> styles mark a person's name for easier extraction by indexing programs.
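
As a rough illustration, the fragment below applies the proposed styles to a few short paragraphs. The names and sentences are invented for the example, the closing tags follow from the statement that these are all container tags, and the actual rendering would depend on the browser.

```html
<!-- Author credit, marked for indexing programs -->
<P>Article by <AU>Pat Author</AU></P>

<!-- A person's name, an abbreviation, and a short quotation -->
<P><PERSON>Chris Example</PERSON> of Your Firm, <ABBREV>Inc.</ABBREV>, said
<Q>The new release ships next month.</Q></P>

<!-- Deleted and inserted text, as in a revised legal document -->
<P>The license fee is <DEL>$50</DEL> <INS>$75</INS> per copy.</P>

<!-- An acronym -->
<P>These pages are written in <ACRONYM>HTML</ACRONYM>.</P>
```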

Establishing the Document's Language Context

The World Wide Web is truly global, though it's easy to forget that if you're always preparing documents in the same language. If you've coded other-than-English-language HTML, you're probably aware of the challenge in creating a proper context for the language when you are using an English-language-based keyboard.

HTML 3.0 called for two different ways to change the language context of a document. The first is the <LANG> container tag. Text between <LANG> and </LANG> is modified by the browser to match the language context of the document (presumably set somewhere in the document head or in the <BODY> tag). For example, if the document's language context were set to Spanish, the Spanish greeting

<LANG>Hola!</LANG>

is rendered as

¡Hola!

Additionally, many HTML 3.0 tags were proposed to support the LANG attribute, which changes the language context for the content of that tag. For example, the <Q> tag, discussed in the previous section, renders a quotation in double quotation marks if the language context is English. Because « and » are the quotation marks used in Spanish, you could switch the quotation to a Spanish context with

El Presidente dijo <Q LANG="es">Gracias por votar por mí.</Q>

to produce

El Presidente dijo «Gracias por votar por mí.»

Expanded List Support

Some HTML 3.0 proposals also called for greater support for creating lists. Suggested improvements included list headers and extensions to the unordered list tag, which are described in the sections that follow.

List Headers

Lists frequently have titles over them, and the only way to put one there is with boldface type or a heading style. The HTML 3.0 draft included an <LH> container tag that encloses a list's title and automatically renders it in boldface over the list. The tag is also helpful for indexing programs by giving them an easy way to pluck off the list's title.

A list with a list header looks like the following:

<UL>
<LH>Web Browsers</LH>
<LI>Microsoft Internet Explorer</LI>
<LI>NCSA Mosaic</LI>
<LI>Netscape Navigator</LI>
<LI>Spyglass Mosaic</LI>
</UL>

Figure 17.1 shows what the list header looks like on the Internet Explorer screen.

Figure 17.1 : A boldface heading over your lists is easy to do with the <LH> tag.

Unordered List Extensions

The <UL> tag was to be greatly extended under HTML 3.0 to give more precise control over what bullet character to use when rendering a list. The most flexible option called for an SRC attribute that points to an image file containing a custom bullet character. Presumably, this permits the use of a custom bullet while sparing the author all of the alignment issues that can arise when using your own bullet character.

Another proposed bullet-related attribute was DINGBAT, taken from the name of the icon-based font. DINGBAT is set to a standard value representing one of the characters from the Dingbats character set. For example, you could set DINGBAT="QUESTION" to use the question mark icon as your bullet. This might be appropriate in an FAQ marked up as an unordered list.

The <UL> tag was to have a number of other interesting attributes under HTML 3.0. These are summarized in Table 17.2.

Table 17.2  Extended Attributes of the <UL> Tag

Attribute              Purpose
CLEAR=LEFT|RIGHT|ALL   Starts the list clear of left, right, or both margins
PLAIN                  Suppresses printing of bullet characters
WRAP=HORIZ|VERT        Used to create multicolumn lists either horizontally or vertically

NOTE
According to the proposal, you can also use the CLEAR attribute to compel browsers to leave a certain amount of space between many page elements and surrounding items. For example, CLEAR="5 en" leaves five en spaces between page elements and any other items around it.
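
The proposed <UL> attributes described in this section might be combined as in the following sketch. The image file name and the list items are placeholders, and because none of these attributes became part of HTML 3.2, a browser is likely to ignore them and fall back to its default bullets.

```html
<!-- Custom bullet image; images/star.gif is a hypothetical file -->
<UL SRC="images/star.gif">
<LI>First item</LI>
<LI>Second item</LI>
</UL>

<!-- Question mark dingbat as the bullet, as might suit an FAQ -->
<UL DINGBAT="QUESTION">
<LI>What is HTML?</LI>
<LI>What is the W3C?</LI>
</UL>

<!-- No bullets, with items arranged in columns horizontally -->
<UL PLAIN WRAP=HORIZ>
<LI>Home</LI>
<LI>Products</LI>
<LI>Support</LI>
<LI>Contact</LI>
</UL>
```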

Admonishments

The <NOTE> tag proposed for HTML 3.0 lets you set up admonishments such as notes, warnings, and cautions on your pages. The text of the admonishment appears between the <NOTE> and </NOTE> tags. Additionally, you can include an image with your admonishment using the SRC attribute of the <NOTE> tag. SRC and other attributes of <NOTE> are summarized in Table 17.3.

Table 17.3  Attributes of the <NOTE> Tag

Attribute                    Purpose
CLASS=NOTE|CAUTION|WARNING   Specifies the type of admonishment
SRC="url"                    Provides the URL of an image to precede the admonishment text
CLEAR=LEFT|RIGHT|ALL         Starts the admonishment clear of left, right, or both margins

A sample admonishment might look like the following:

<NOTE CLASS=WARNING SRC="images/hand.gif" CLEAR=ALL>WARNING! You are
about to provide your credit card number to a non-secure server!</NOTE>

As compelling as it may be to have admonishments on your Web pages, the <NOTE> tag was not made part of the HTML 3.2 standard.

Footnotes

One HTML 3.0 proposal called for an <FN> tag used to define footnotes. To set up a footnote, you use the <FN> tag together with its ID attribute, as follows:

<FN ID="footnote1">SGML = Standard Generalized Markup Language</FN>

Then, you must tag the footnoted text with an <A> container tag that includes an HREF pointing to the footnote. For "footnote1" in the preceding example, we could tag every instance of the acronym SGML with

<A HREF="#footnote1">SGML</A> is the parent language of HTML.

When users click SGML, they see the footnote telling them what SGML stands for. The proposal calls for footnotes to be displayed in pop-up windows, though it isn't clear that all browsers will be able to support this. For example, a text-only browser such as Lynx would have to implement footnotes in a different way.

Non-Scrolling Banners

Banners are defined as regions in a document that should not scroll. The HTML 3.0 draft pointed to many of the same applications that frames are good for (logos, disclaimers and copyright notices, and navigation aids) as possible banners (see Figure 17.2).

Figure 17.2 : Navigational image maps are good choices for non-scrolling regions such as frames or banners.

HTML 3.0 called for two ways to place a banner in a document. You could reference externally defined banners by using the <LINK> tag in the document head. The REL attribute is set to BANNER and the HREF attribute is set to the URL of the document containing the banner information. For example,

<LINK REL="BANNER" HREF="http://www.your_firm.com/navigation.html">

Referencing an external banner provides the advantage of having to update only one file if changes need to be made.

You could also define a banner right in your document by using the <BANNER> and </BANNER> tags. Any text or graphics between these two tags become banner elements for your page.
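
An inline banner might look something like the sketch below. The file names and link targets are hypothetical, and placing the banner at the top of the document body is an assumption made for the example rather than something stated above.

```html
<BODY>
<!-- Everything inside BANNER is meant to stay put while the page scrolls -->
<BANNER>
<IMG SRC="images/logo.gif" ALT="Company Logo">
<A HREF="index.html">Home</A> |
<A HREF="products.html">Products</A> |
<A HREF="support.html">Support</A>
</BANNER>

<!-- Ordinary, scrolling page content -->
<H1>Welcome</H1>
<P>The rest of the page scrolls as usual.</P>
</BODY>
```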

NOTE
While these two approaches can provide the non-scrolling elements that frames can, it is not clear whether they permit good control over the placement of the elements. Remember that with frames, you can place multiple non-scrolling elements virtually anywhere on the screen. However, banners, as proposed, are much easier to implement and maintain than framed layouts.

Figures

The <FIG> tag was proposed as an alternative to the <IMG> tag for larger graphics, though it was not included in the HTML 3.2 specification. As you might expect, <FIG> requires the SRC attribute to specify the URL of the image file to be loaded. <FIG> can also take the attributes shown in Table 17.4. The BLEEDLEFT and BLEEDRIGHT values of the ALIGN attribute align the figure all the way to the left and right edges of the browser window, respectively.

Table 17.4  Attributes of the <FIG> Tag

Attribute                                              Purpose
SRC="url"                                              Gives the URL of the image file to load
NOFLOW                                                 Disables the flow of text around the figure
ALIGN=LEFT|RIGHT|CENTER|JUSTIFY|BLEEDLEFT|BLEEDRIGHT   Specifies an alignment for the figure
UNITS=unit_of_measure                                  Specifies a unit of measure for the WIDTH and HEIGHT attributes (default is pixels)
WIDTH=width                                            Specifies the width of the image in units designated by the UNITS attribute
HEIGHT=height                                          Specifies the height of the image in units designated by the UNITS attribute
IMAGEMAP                                               Denotes the figure as an image map

The <FIG> tag is different from the <IMG> tag in that it has a companion </FIG> tag. Together, <FIG> and </FIG> can contain text, including captions and photo credits, which are rendered with the figure. Captions are enclosed with the <CAPTION> and </CAPTION> tags, and photo credits are enclosed with the <CREDIT> and </CREDIT> tags. Regular text found between the <FIG> and </FIG> tags wraps around the figure unless the NOFLOW attribute is specified.
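
Based on the description above, a figure with a caption and a photo credit might be marked up as follows. This is only a sketch: the image file, dimensions, caption, and credit are all invented for illustration.

```html
<!-- images/bridge.jpg is a hypothetical file; sizes are in pixels by default -->
<FIG SRC="images/bridge.jpg" ALIGN=LEFT WIDTH=200 HEIGHT=150>
<CAPTION>The bridge at sunrise</CAPTION>
<CREDIT>Photo by Pat Photographer</CREDIT>
<P>This paragraph is regular text inside the figure container.</P>
</FIG>
```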

Figure 17.3 shows an example of a photo with a caption, photo credit, and surrounding text. To accomplish the layout you see in the figure, the HTML author had to use a two-column table and a floating image. Once the <FIG> tag is fully implemented, layouts with figure captions, credits, and wrapping text will be much easier to create.

Figure 17.3 : Photos, along with their captions and credits, will be easier to place with the <FIG> tag than with the <IMG> tag.

Another feature proposed for the <FIG> and </FIG> tag pair is the capability to overlay two images. This is accomplished with the <OVERLAY> tag, which specifies a second image to overlay the image given in the <FIG> tag. HTML to produce an overlay might look like the following:

<FIG SRC="main_image.gif" WIDTH=250 HEIGHT=186 ALIGN=LEFT>
     <OVERLAY SRC="overlay.gif">
     <P>The image to the left is actually two images,
     one on top of the other.</P>
</FIG>

According to the proposal, the <FIG> tag provides another method for implementing client-side image maps. The key to using the <FIG> and </FIG> tags for a client-side image map is that these tags can contain text that acts as an alternative to the image being placed by them. Thus, any text between the <FIG> and </FIG> tags is much like text assigned to the ALT attribute of the <IMG> tag. For example, the HTML

<IMG SRC="logo.gif" ALT="Company Logo" WIDTH=120 HEIGHT=80>

and

<FIG SRC="logo.gif" WIDTH=120 HEIGHT=80>
Company Logo
</FIG>

essentially do the same thing.

To implement a client-side image map with the <FIG> and </FIG> tags, you need to place the information previously found in the map file between these tags. This is done with the <A> tag as follows:

<FIG SRC="images/main.gif" IMAGEMAP>
<B>Select a portion of the site to visit:</B>
<UL>
<LI><A HREF="http://www.your_firm.com/geninfo.html"
SHAPE="rect 6,7,102,86">General Information</A></LI>
<LI><A HREF="http://www.your_firm.com/press.html"
SHAPE="circle 283,118,320,155">Press Releases</A></LI>
<LI><A HREF="http://www.your_firm.com/annrept.html"
SHAPE="polygon 77,181,59,142,156,145,156,233,134,213,79,233,30,206">Annual Report</A></LI>
</UL>
</FIG>

The HREF attribute in each <A> tag contains the URL to load when the user clicks a hot region, and the SHAPE attribute contains the information needed to define each hot region. SHAPE is set to the name of the region's shape, followed by a space and then the coordinates that specify the region. The numbers in the coordinate list are separated by commas.

SHAPE also has a secondary function in this setting. If the image file specified in the SRC attribute of the <FIG> tag is placed on the page, then the browser ignores any HTML between the <FIG> and </FIG> tags unless it is an <A> tag with a SHAPE attribute specified.

On the other hand, if the image is not placed, then the browser renders the HTML between the two tags. The result for the preceding HTML example is a bulleted list of links that can act as a text alternative for your image map. This is an important feature of client-side image maps done with the <FIG> and </FIG> tags: They degrade into a text alternative for non-graphical browsers, for browsers with image loading turned off, for browsers that don't support the <FIG> and </FIG> tags, or when the desired image file cannot be loaded.

TIP
If the <FIG> tag does become part of standard HTML, make sure that the alternative text between the <FIG> and </FIG> tags is formatted nicely into something like a list or a table. Users will appreciate this extra effort.

NOTE
The <FIG> tag approach to client-side image maps was passed over in HTML 3.2 in favor of the <MAP> and <AREA> tag approach.

Mathematical Symbols

The rendering of mathematical symbols and equations has always been tricky on the Web. Authors used to have to place symbols, Greek letters, and other mathematical characters as separate images. When you consider that a browser has to open a separate HTTP connection to download an image, it becomes easy to imagine how long it might take to download a page with heavy mathematical content. Clearly, then, a better way to publish mathematical documents on the Web is needed.

Tags and entities to support mathematical content were proposed for HTML 3.0, but they were not adopted into the 3.2 standard. They are still under consideration, though, and you should expect to see them incorporated into a later version of the standard. Indeed, the W3C has formed an HTML Math Editorial Review Board to continue working on the math proposals. The board is composed of representatives from symbolic computation software vendors, scientific publishers, and the American Mathematical Society.

The next two sections discuss the high points of mathematical HTML as proposed in the HTML 3.0 draft. Because the proposals are continually being updated and improved, the actual implementation may differ slightly from what is presented here, but any differences are likely to be minor.

NOTE
Mathematical HTML draws heavily from the LaTeX mathematical typesetting language. LaTeX users will find the conventions in the HTML math proposals to be very familiar.

Mathematical Tags

All mathematical content is enclosed between the <MATH> and </MATH> tags. <MATH> can take the CLASS attribute if the mathematical content is restricted to a certain mathematical sub-discipline:

<MATH CLASS="ALGEBRA.LINEAR">

Or it can take the CLASS attribute if the content is restricted to another branch of scientific study:

<MATH CLASS="PHYSICS">

A number of other tags are valid inside the <MATH> and </MATH> tags. These are summarized in Table 17.5.

Table 17.5  Mathematical HTML Tags

Tag       Purpose
<ABOVE>   Places a line, arrow, or symbol over an expression
<ARRAY>   Used to create matrices
<BAR>     Places a bar over an expression
<BELOW>   Places a line, arrow, or symbol under an expression
<BOX>     Used for hidden grouping symbols
<DOT>     Places a single dot over an expression
<DDOT>    Places a double dot over an expression
<HAT>     Places a hat (^) over an expression
<OVER>    Places one expression over another
<ROOT>    Used to render a root other than the square root
<SQRT>    Used to render a square root sign
<SUB>     Used to create a subscript
<SUP>     Used to create a superscript
<TEXT>    Inserts plain text inside a math element
<TILDE>   Places a tilde (~) over an expression
<VEC>     Denotes an expression as a vector by placing an arrow over it

Additionally, there are tags you can use to override the default text formatting inside the <MATH> and </MATH> tags. The <B> container tag renders its contents in boldface and the <T> container tag renders its contents in an upright font. <BT> combines the effects of the <B> and <T> tags.
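
To give a feel for how these tags might fit together, the sketch below marks up a few simple expressions. It assumes that <SQRT>, <SUP>, <SUB>, <VEC>, and <BAR> are all used as containers around the expression they affect. Because the math proposals were still being revised, the exact syntax may well differ from what is shown here.

```html
<!-- c equals the square root of (a squared plus b squared) -->
<MATH>c = <SQRT>a<SUP>2</SUP> + b<SUP>2</SUP></SQRT></MATH>

<!-- A subscripted term -->
<MATH>x<SUB>1</SUB> + x<SUB>2</SUB></MATH>

<!-- A vector v and a bar over x -->
<MATH><VEC>v</VEC> = <BAR>x</BAR></MATH>
```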

TIP
A number of the tags in Table 17.5 have been abbreviated using SGML's SHORTREF capability. You can use underscore (_) for <SUB>, a caret (^) for <SUP>, an opening brace ({) for <BOX>, and a closing brace (}) for </BOX>. So to render x², you could use x<SUP>2</SUP> or, more simply, x^2.

Mathematical Entities

One of the greater obstacles to rendering mathematical content on a browser screen is all of the special characters needed. Even though HTML can handle any character in the ISO-Latin1 character set, there is still a need for additional characters, such as Greek letters and operator symbols like the integral sign.

Each of these special characters has an HTML entity proposed to represent it in an HTML document. Recall that entities begin with an ampersand and end with a semicolon.

For example, you could use the HTML

&int;2x - 1 dx = x^2 - x + c

to produce

∫2x - 1 dx = x² - x + c

Greek letter entities would be represented by spelling out their names. You can distinguish between uppercase and lowercase Greek letters by capitalizing the first letter in the spelled out name. For example, &pi; would be a lowercase pi (π) and &Pi; would be an uppercase pi (Π).
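
As a small sketch of how the Greek letter entities might appear inside mathematical markup, the lines below use the lowercase and uppercase pi entities just described. The formulas are chosen only for illustration, and the markup assumes the <MATH>, <SUP>, and <SUB> tags behave as proposed.

```html
<!-- Area of a circle, using the lowercase pi entity -->
<MATH>A = &pi;r<SUP>2</SUP></MATH>

<!-- Uppercase pi in front of a subscripted term -->
<MATH>&Pi; x<SUB>i</SUB></MATH>
```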

What's Next

In the Activity Statement on HTML that it issued when HTML 3.2 was released, the W3C identified several areas where it will be concentrating its efforts. You should expect to see future releases of HTML that incorporate the following: