Chapter 27

Adding Audio to Your Web Site

by Tom Lockwood



Whether you are putting together Web sites for musicians or radio stations, or just want to use background sounds, sooner or later you will be asked to add audio to a Web site. This chapter is designed as a good starting point; if your use of audio is modest, you will probably find all the information necessary for creating a nice Music Page in this chapter.

Audio files of various formats are only now becoming common on the Web. Yes, music students and scientists have been putting audio data on the Internet for quite some time to share with their colleagues, but they have been the exception. The lack of audio on the Internet is directly related to the fact that audio is used minimally on computer systems in general; except for "system sounds," your computer is probably mute. If audio were a more intrinsic part of the way that people use application programs, then audio would have found its way onto the Web much earlier with a much more defined role.

Today, most sound files found on the Web are:

Recently a new category of music has appeared. Both Microsoft's Internet Explorer and Netscape Navigator support background music in one way or another. Just as it is difficult to go to a mall or your dentist without hearing Muzak, it will soon be difficult to visit a Web page without hearing WebMuzak.

How and Where to Acquire Audio Content

There are a variety of places you can acquire high quality audio content. Much of it is labeled "public domain." If it is not public domain, you can frequently get permission from the author to use a sample of their music.

Audio Content from the Internet

By using any search engine and keywords such as "audio clips," "MIDI files," and "aiff Beatles," you will quickly find many pages with the type of music that you want, already in the proper format. The fact that someone's Web site has a folder labeled "public_domain" does not relieve you of your responsibility to make sure the piece of music you are going to use does not fall under copyright protection. If the audio clip you are considering is a sound effect or recording of birds singing, you are probably safe. However, beware of audio clips from Star Trek or TV shows that are labeled public domain.

Downloading audio content from Web pages is as easy as downloading images or other content. Audio files can be very large, so beware of long download times.

CAUTION
You should get copyright permission before placing content on the Internet. Copyright laws do apply to the Internet. Do not think that using audio clips will go unnoticed. ASCAP has been charging youth camps $500.00 per year to sing songs like "Happy Birthday To You" around the campfire; they have actually been enforcing these fees. If you use unauthorized content, you can expect to be the target of copyright litigation in the future (especially if you or your company has some money).

From CDs or DAT Tapes

A wealth of music is available on CD. There are many musicians and music compilation houses that will sell you a CD full of public domain music and sounds, generally for a very reasonable price ($50.00). In addition, you can "rent audio" on a "needle drop" basis from some sources. This is common in the television business, where video producers are always looking for background music and jingles for commercials. When you rent audio on a needle drop basis, you are actually buying it per "cut." That is, three 10 second segments will cost you three times more than a one minute segment. Pricing scales and practices vary.

Recorded from "Live Sources"

Narration, of course, is something that you can record yourself. Make sure that it meets the quality standards that people throughout your target audience will find acceptable. Creating professional quality audio is a complicated business. Don't think that by plugging a microphone into the back of your computer you will be able to create a narration that sounds as good as something done in a recording studio, monitored by trained audio technicians. On the other hand, if all that you are concerned with is annotating an image, this low-quality approach may be completely acceptable to your audience.

Creating Your Own Music

If you are a musician, many of the previous concerns become less important. You can create your own music with old-fashioned analog instruments, then have the music digitized for use on your Web site. Or you can create and edit your own MIDI files using computer software, a synthesizer, or a combination of the two. MIDI music files are now playable by most browsers and have many advantages over traditional audio files.

Choosing Appropriate Audio Formats

There are many file formats available to choose from. The first three formats described below (Au, Aiff, Wav) are all very similar in their characteristics and applications. MPEG, MIDI, and MOD, discussed at the end of this section, are interesting because of their differences and the way they represent audio information. As browsers become more powerful, the need for the user to be aware of these different formats becomes less important. It is still critical that you know something about the audio formats so that you make the correct choices when putting audio on Web pages.

Au

Au files use µ-law compression; this is a 2:1 compression ratio and is similar in its quality and file size to the Aiff and Wav formats. Au is a format originally employed on NeXT and Sun workstations. It is a very popular format and is supported by most browsers.
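To give a feel for how µ-law squeezes a wide range of loudness into a single byte, here is a minimal Python sketch of the continuous µ-law curve with µ = 255. It is an illustration only: real Au files use a segmented, table-friendly version of this curve, and the function names and the simple 8-bit quantizer below are assumptions made for this example, not part of any Au library.

```python
import math

MU = 255  # companding constant commonly used for 8-bit mu-law

def mulaw_compress(x: float) -> float:
    """Map a sample in [-1.0, 1.0] onto the mu-law curve (also in [-1.0, 1.0])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Invert mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def to_8bit(y: float) -> int:
    """Quantize a companded value in [-1.0, 1.0] to one unsigned byte.
    (Hypothetical quantizer for illustration, not the Au byte layout.)"""
    return round((y + 1.0) / 2.0 * 255)

if __name__ == "__main__":
    # Quiet samples get a large share of the output range,
    # which is why one byte per sample can still sound acceptable.
    for sample in (0.01, 0.1, 0.5, 1.0):
        y = mulaw_compress(sample)
        print(f"{sample:5.2f} -> {y:.3f} -> byte {to_8bit(y)}")
```

The point of the curve is that quiet passages keep most of their detail while loud passages are stored more coarsely, so a 16-bit sample can be stored in 8 bits with less audible damage than simple truncation would cause.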

Aiff and Aifc

Aiff is a format common on Silicon Graphics UNIX Workstations and Apple Macintosh computer systems. It allows a variety of recording rates and bits. This format is supported by most browsers. Aifc (also referred to as Aiff-c) is a compressed version of Aiff.

Wav

Wav is a format found commonly on the Windows platform. It is supported by most browsers and has file sizes very comparable to Aiff and Au formats.

MPEG

MPEG is a format originally introduced as a video compression algorithm. It is interesting as an audio format because it provides significant compression with a minimal loss in audio quality. The process of compressing an audio file for MPEG is called encoding. Encoding is processor intensive. Once a file is encoded, it must then be decoded on the client side; and again, decoding is processor intensive. The advantage of MPEG is that you can get 8:1 audio compression that sounds very good (most people will never notice the difference between a compressed and uncompressed file), or 16:1 audio compression, which does not sound bad (audiophiles may swoon, but the rest of us may notice that the compressed file sounds slightly "hollow" or "stuffy").

Should You Use MPEG?
Should you use MPEG? Do you have the tools and the desire to encode audio files (the tools are available as shareware)? Do your clients have machines that are fast enough to decode MPEG files on-the-fly? Some testing is probably in order. If particular browsers do not support MPEG decoding, many decoders do exist that can be configured as helper applications for the browser.

With the growth of all multimedia content on the Web, you can expect that MPEG players and MPEG support will increase rapidly. After all, an MPEG audio file is simply an MPEG movie file without the video information.

MIDI

MIDI is altogether different from any of the previously discussed audio formats. MIDI contains the "notes" of a song rather than digitized sound. MIDI also indicates which "instruments" should play these notes. The advantage of MIDI is that the file size is very small in comparison to the other formats. The disadvantage is that the MIDI file will sound different depending on which audio samples are included with each system on the client side. For example, Beethoven's Moonlight Sonata may sound like it is being played on a grand piano on your computer, but sound like it is being played on a Playskool xylophone on my computer. If you are going to use MIDI files, it is a good idea to hear them on several platforms to make sure that you are, indeed, conveying the appropriate mood. How efficient is MIDI? The entire Moonlight Sonata, which lasts over five minutes, is a 12K MIDI file.

In order for MIDI to sound good, you need good "samples" of the instruments that are specified in the MIDI file. There are relatively inexpensive soundcards for Windows machines that provide extensive MIDI support. On the Macintosh, high quality MIDI sound is available in software using the QuickTime 2.0 Musical Instruments extension.
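The size advantage is easier to appreciate once you see what a MIDI "note" looks like. The following Python sketch builds the raw three-byte messages that start and stop a single note. The helper names are invented for this illustration, and a real MIDI file also stores timing information and header data that are left out here.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI note-on message.

    channel is 0-15; note and velocity are 0-127 (60 is middle C).
    """
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel: int, note: int) -> bytes:
    """Build the matching 3-byte note-off message."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x80 | channel, note, 0])

# One second of CD quality stereo audio, for comparison.
CD_BYTES_PER_SECOND = 2 * 2 * 44_100

if __name__ == "__main__":
    one_note = note_on(0, 60, 100) + note_off(0, 60)
    print(len(one_note), "bytes describe one note, however long it is held")
    print(CD_BYTES_PER_SECOND, "bytes hold one second of CD quality sound")
```

A held note costs the same few bytes whether it lasts one second or ten, which is why a whole piano piece can fit in a file of a few kilobytes.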

NOTE
MIDI files can be used "outside" of the computer system. Some users will have synthesizers connected to their computer systems, and in other cases, the synthesizers can accept MIDI files on a computer disk.

It is not unusual to find archives of MIDI formatted files that are a megabyte in size and that contain hundreds of songs.

Of course there are some obvious limitations to MIDI:

TIP
If you can find a MIDI sound that you like, you may consider using it as a background sound. MIDI files download very quickly and sound very clear.

MOD

Mod is a format much like MIDI; it is a series of notes and instruments. In addition, Mod files include the samples of the instruments that are being used. Because of this, Mod files are much larger than MIDI files. Mod does guarantee what the audio will sound like. At this time, Mod is not supported as a plug-in for browsers.

Saving Disk Space and Transfer Time

Audio files can be huge. One minute of CD quality sound will require 10M of disk storage space. Not only are you looking at storage problems at your end, but also unreasonable download times on your clients' side. To save significant space, you must throw away some audio information (read: quality). There is only one way that you can determine how much you can reduce the quality of the audio without bothering your audience: by listening to the audio at different rates. The following chart shows how you can reduce audio file size. All of the following files were created in the Aiff format (similar results will occur with µ-law and Wav files). The "Original File" is the first row: a 30 second audio file recorded from CD.

Rate          Channels   Bits   File Size
44.1 kHz      Stereo     16     5,160K
11.025 kHz    Stereo     16     1,290K
11.025 kHz    Stereo      8     645K
11.025 kHz    Mono        8     322K
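To make the last row of the chart less abstract, here is a rough Python sketch of what "throwing away audio information" means in practice. It mixes stereo down to mono, keeps one frame in four, and keeps only the top 8 bits of each sample. The function name and the unsigned 8-bit output are assumptions for this example; a real audio editor would also filter the sound before dropping samples, so treat this as a picture of the idea rather than a recommended converter.

```python
def reduce_quality(stereo16, keep_every=4):
    """Convert 16-bit stereo frames to 8-bit mono at a lower sample rate.

    stereo16   -- sequence of (left, right) signed 16-bit samples
    keep_every -- keep one frame in this many (4 turns 44.1 kHz into 11.025 kHz)
    """
    mono8 = bytearray()
    for left, right in stereo16[::keep_every]:
        mono = (left + right) // 2        # mix the two channels into one
        mono8.append((mono >> 8) + 128)   # keep the top 8 bits, stored unsigned
    return bytes(mono8)

if __name__ == "__main__":
    one_second = [(1000, -1000)] * 44_100   # stand-in for one second of CD audio
    original_bytes = len(one_second) * 2 * 2
    reduced_bytes = len(reduce_quality(one_second))
    print(original_bytes, "bytes before,", reduced_bytes, "bytes after")
```

Each of the three steps (fewer samples, one channel, fewer bits) removes data independently, which is why the chart shrinks by a factor of four, then two, then two again.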

Which Setting Should You Use?

You must be the judge. Listen to the audio quality and make the following decisions for your audience. Some inexpensive computer audio systems can handle no more than 8kHz mono 8-bit audio. Who is your client? Are they audiophiles or people looking for chocolate chip cookie recipes? One way around this problem is to include a "low quality" sample, then let the very interested visitor download a higher quality file.

Determining Audio File Size

There is a simple formula that you can use to calculate audio rate:

Bytes/channel * Channels/sample * Samples/sec. = bytes/sec.

Here is an example for CD quality stereo sound:

2 bytes/channel * 2 channels/sample * 44,100 samples/sec = 176,400 bytes/sec
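If you would rather let the computer do this arithmetic, the formula translates directly into a few lines of Python. This is only a sketch; the function names are made up for this example, and K and M are taken to mean 1,024 and 1,048,576 bytes.

```python
def bytes_per_second(bytes_per_channel, channels, samples_per_second):
    """Audio data rate, following the formula above."""
    return bytes_per_channel * channels * samples_per_second

def storage_kilobytes(rate, seconds):
    """Storage needed for a clip, in K (1,024 bytes)."""
    return rate * seconds / 1024

def storage_megabytes(rate, seconds):
    """Storage needed for a clip, in M (1,048,576 bytes)."""
    return rate * seconds / (1024 * 1024)

if __name__ == "__main__":
    cd_rate = bytes_per_second(2, 2, 44_100)
    print("CD quality rate:", cd_rate, "bytes/sec")
    print("One second:", storage_kilobytes(cd_rate, 1), "K")
    print("Five minutes:", storage_megabytes(cd_rate, 5 * 60), "M")
```

Changing any one of the three arguments (for example, 1 byte per channel, or 1 channel, or 11,025 samples per second) shows immediately how much space each compromise saves.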

One second of CD quality audio requires 172.26K of storage space. A five minute song requires 50M of storage space. Most of your clients will be viewing these pages with 28.8Kbps modems or less. A good connection with a modem of this type can transfer 1M of data in about five minutes. This results in a transfer time of about four hours for a five minute CD quality file. Therefore, you must make some compromises between the audio quality you would like to provide and the amount of time it takes to download that quality. Silicon Graphics provides the following suggestions as a place to start in making your tough quality/file size decisions:

Content           Channels   Bits/Bytes per Channel   Rate      Bytes per Second
Speech            1          8/1                      8 kHz     8,000
Monaural music    1          8/1                      16 kHz    16,000
Stereo music      2          8/1                      16 kHz    32,000
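To see what these settings mean for a visitor on a modem, you can estimate download times with a short Python sketch. The calculation below is a best case: it assumes the modem delivers its full rated speed and ignores protocol overhead, so real transfers will be somewhat slower. The function name and the 30 second speech clip are examples invented for this illustration.

```python
MODEM_BITS_PER_SECOND = 28_800   # the 28.8Kbps modem discussed above

def download_seconds(size_bytes, link_bits_per_second=MODEM_BITS_PER_SECOND):
    """Best-case transfer time for a file over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_second

if __name__ == "__main__":
    # 30 seconds of speech at 8,000 bytes per second (mono, 8-bit, 8 kHz).
    speech_clip = 8_000 * 30
    print("Speech clip:", download_seconds(speech_clip), "seconds")

    # Five minutes of CD quality stereo at 176,400 bytes per second.
    cd_song = 176_400 * 5 * 60
    print("CD quality song:", download_seconds(cd_song) / 3600, "hours")
```

Under these assumptions the speech clip arrives in roughly a minute, while the CD quality song takes about four hours, which matches the estimate given earlier in this section.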

Putting Audio Clips on Your Web Pages

Placing audio clips on your Web pages can be accomplished in one of two ways. You can either include the clip as a selectable Hyperlink or you can include the clip as background music that loads and plays by default.

Including the Audio File as a Hyperlink

You can include an audio file as a hyperlink, just like any other type of file. To include audio, make sure that the audio clip is in a format that can be played by all important browsers.

Include the audio clip's name in the HREF attribute of the anchor tag, for example:

<A HREF="openingdoor.aiff">Opening Door Sound Effect</A>

Make sure that the audio clip loads and plays properly on different platforms and with the versions of browsers that you are targeting.

Optimizing Your Web Site

Place all audio clips on your site in the same audio format (this may not be desirable if you want to share MIDI and other audio clips). Choose the format that you like and stick with it. This will be particularly welcome for visitors who have older browsers that do not support the file types you have on your site. To require users to get one plug-in (or helper application) to listen to content on your site is acceptable. To make them get two or three plug-ins is inconsiderate. If a new format becomes popular, you may want to include the file in two formats (Au and MPEG, for example).

Include as detailed a description as you think is necessary. Again, why is the clip there? With a good description, you may be able to inform the user if this is the file they want, or assure them that the long download will be worth the wait (see Figure 27.1).

Figure 27.1 : Always label your audio content clearly.

List the file size. Some visitors using a slow modem will not download a 10M audio file no matter how interesting it may be.

As browsers become more sophisticated, they can handle more and more different formats automatically. However, some of your visitors may not have the latest browsers, helper applications, or plug-ins installed. Because of this, you may want to include references to places where the appropriate plug-ins can be found, or at least a note about the file format.
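If your site has many clips, the labeling advice in this section (describe the clip, name the format, list the size) is easy to apply consistently with a small script. The Python sketch below builds one hyperlink per clip; the function name and the label layout are just one possible convention, not a standard.

```python
import html

def audio_link(href, title, size_bytes, fmt):
    """Return an anchor tag whose visible text names the format and file size."""
    size_k = max(1, round(size_bytes / 1024))          # size in K, at least 1
    label = f"{title} ({fmt}, {size_k:,}K)"
    return f'<A HREF="{html.escape(href, quote=True)}">{html.escape(label)}</A>'

if __name__ == "__main__":
    print(audio_link("openingdoor.aiff", "Opening Door Sound Effect",
                     329_728, "AIFF"))
```

In practice you would read the size from the file itself (for example with os.path.getsize) rather than typing it in, so the label never goes out of date when you replace a clip.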

Netscape Navigator and Internet Explorer Extensions

Netscape Navigator and Microsoft's Internet Explorer provide tags that extend and enhance the use of audio on Web pages.

Adding Audio as a Background Sound

Microsoft started the background music phenomenon by introducing the nonstandard (enhanced) tag BGSOUND (background sound). The following code simply points to a sound file that is read and played:

<BGSOUND SRC="myvoice.wav">

Implementing audio in this way works only for the Internet Explorer browser. There is a more versatile tag, EMBED, that gives you a great deal of flexibility regarding how a background sound is played.

The EMBED tag is supported by most browsers. So, unless there is something that you only want Internet Explorer clients to hear, you should probably use the EMBED tag instead of the BGSOUND tag.

The EMBED tag can be used to play an audio file in the background by including it like this:

<EMBED SRC="http://www.my.home/audiofile.aiff" HIDDEN=TRUE AUTOSTART=TRUE>

In the previous example, SRC points to the file's URL, the argument HIDDEN=TRUE indicates that there will be no controls present, and AUTOSTART=TRUE indicates that the audio will start playing in the background as soon as the file is loaded.

CAUTION
Make sure that your background audio files are small. Once a client leaves the page the audio file is on, it stops loading (he won't hear a thing). Don't go to a lot of work creating beautiful background sounds only to have no one hear them.

Adding Controls to Your Audio Files

With Netscape's LiveAudio and Microsoft's ActiveX, you can easily place audio controls on your Web pages. This section is not intended exclusively for programmers, as the steps for placing audio controls on your Web pages are straightforward.

There are several good reasons to add controls to your audio files:

To add an audio control panel to your Web page, use the following tag:

<EMBED SRC="URL to audio file" HEIGHT=60 WIDTH=144 CONTROLS=CONSOLE>

An actual control panel is implemented like this:

<EMBED SRC="moonlight.aiff" HEIGHT=60 WIDTH=144 CONTROLS=CONSOLE>

This command opens an audio file with a full size control console, which can be seen in Figure 27.2.

Figure 27.2 : A full size control console as it appears on the user's screen.

You can also add a smaller control console with the following lines:

<EMBED SRC="moonlight.aiff" HEIGHT=15 WIDTH=144 
CONTROLS=SMALLCONSOLE AUTOSTART=TRUE>

You will notice that HEIGHT is 15 instead of 60 and that SMALLCONSOLE is specified; this creates the control panel shown in the lower left hand corner of Figure 27.3. In addition, AUTOSTART is set to TRUE, which causes the audio from this clip to load and play at startup.

Figure 27.3 : A smaller control console is created by changing HEIGHT and adding SMALLCONSOLE.


NOTE
Placing an audio control for a background sound is very considerate. It is possible that the client does not want to hear the audio clip that you selected. If you do put a series of audio controls on one page, make sure that only one has AUTOSTART=TRUE unless you explicitly want more than one audio clip playing in the background simultaneously.

CAUTION
Each file that is specified with an EMBED command will be downloaded to the client when the Web page is read. This can lead to extreme download times. Please see the next section for a solution.

How to Keep Download Times Reasonable

If you follow the previous examples, you can create your own interactive audio Web page. But each clip may take a significant amount of time to download. So, what happens if you would like to put 25 audio clips on a Web page, each with its own small control console? Download time could be a problem. In order to work around this difficulty, you can include a JavaScript function that will let you defer the loading of the audio file until the play button is pushed.

This is a simple feature to implement as long as you pay attention to a couple of details.

First, create a file with the OnPlay function calling an audio file, for example,

<SCRIPT LANGUAGE=SoundScript>
OnPlay (http://URL/audiofile.mid);
</SCRIPT>

Save this script and name it as if it were an audio file (Newworldscript.mid for example).

Then, in your HTML document, include the script's name in place of an audio file's name in the EMBED command. The following is an example of how to do this.

The audio file that you want to play is newworld.mid.

  1. Create the three line text file:
    <SCRIPT LANGUAGE=SoundScript>
    OnPlay(newworld.mid);
    </SCRIPT>
  2. Save this file as Newworldscript.mid.

TIP
The name is not important (except for the file's extension). It is just easier to keep track of all of these files if you have a naming convention that you can recognize.

CAUTION
You must use the same extension on this file as the type of file that this script will call. Otherwise, the browser will get ready to play one file type, read a second type, and give you an "Invalid file type" error.

  3. In your HTML document, include an EMBED command similar to the ones shown earlier.
    <EMBED SRC="Newworldscript.mid" HEIGHT=15 WIDTH=144 CONTROLS=SMALLCONSOLE>

In this example, the audio file's controls will be present, but the actual file will not begin to download until you press the Play button.
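With 25 clips on a page, writing 25 of these little script files by hand gets tedious. The Python sketch below generates the script text and a matching file name for any clip, keeping the same extension as the audio file as the caution above requires. The function name and the "script" naming suffix are assumptions chosen to mirror the example in this section.

```python
import posixpath
from urllib.parse import urlparse

def make_stub(audio_url, suffix="script"):
    """Return (stub_filename, stub_text) for a deferred-loading script file.

    The stub keeps the audio file's extension so that the browser
    prepares for the same file type that the script will load.
    """
    name = posixpath.basename(urlparse(audio_url).path)
    stem, ext = posixpath.splitext(name)
    if not ext:
        raise ValueError("audio URL needs a file extension")
    text = (
        "<SCRIPT LANGUAGE=SoundScript>\n"
        f"OnPlay({audio_url});\n"
        "</SCRIPT>\n"
    )
    return f"{stem}{suffix}{ext}", text

if __name__ == "__main__":
    filename, text = make_stub("newworld.mid")
    print("Save as:", filename)
    print(text, end="")
```

Note that this sketch produces a lowercase file name; whatever name you save the file under, use exactly the same spelling and capitalization in the SRC of the EMBED command, since many Web servers treat file names as case sensitive.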

Differences between Netscape and Internet Explorer

At the time of this writing, including a single EMBED command that references an audio file will have different effects in Netscape and Internet Explorer. The following is a standard EMBED tag:

<EMBED SRC="groovietune.au">

Internet Explorer recognizes the file format and provides default audio control for you (see Figure 27.4). In Netscape, nothing appears; Navigator requires that you specify a console type with the CONTROLS attribute, otherwise no controls will appear. Therefore, make sure to include CONTROLS and a console type. That way, both Netscape Navigator and Internet Explorer clients will have an audio panel.

Figure 27.4 : Internet Explorer provides a default audio control for you when you use the EMBED tag.

Internet Explorer ignores CONSOLE and SMALLCONSOLE; it recognizes the file type by the extension used and provides a control panel.

Editing Audio Files and Format Conversion

The term Audio Editor covers a wide range of application programs that are used, in some way, to modify audio files. In some cases, they are simply conversion utilities letting you change between file formats. In many cases, Audio Editors provide a rich set of features for modifying sounds (see Figure 27.5). Effects like echoes, reverbs, and fades can be applied. Others serve as multitrack mixers.

Figure 27.5 : Sound Sculptor II by Jeff Smith is a very powerful Shareware multi-track audio editing application that runs on the Macintosh.

Many of these applications are Shareware or Freeware, meaning that you can get a lot of experience editing audio at a very reasonable price. Of course, there are commercial packages available, many of which are part of multimedia packages. After spending some time with these applications, you will be in a very good position to determine exactly what you need for your day-to-day use.

Avoiding that Pop

Each audio card is different. Some of these cards make a "snap" or "pop" when they are either initially powered or, more commonly, when the audio stream ends abruptly (see Figure 27.6). You can "fade out" your audio files to help reduce this problem. Most audio editing applications provide controls that let you perform a number of effects on your audio files. Let's look at how we can correct this problem.

Figure 27.6 : This audio file may cause a "pop" when your computer finishes playing it. Notice the abrupt end of the audio signal.

By making some adjustments in the audio editor we can modify the audio so that it fades out very smoothly (see Figure 27.7).

Figure 27.7 : Using GoldWave, by Chris S. Craig, the audio clip in Figure 27.6 has been modified so that it now fades out. With this simple modification the "pop" will be greatly reduced or eliminated.
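What the audio editor does when it applies a fade is simple enough to sketch in Python. The example below ramps the last few samples of a clip down to silence in a straight line. It assumes signed samples in which zero means silence, and the function name is invented for this illustration; a real editor offers smoother curves and works on the file formats directly.

```python
def fade_out(samples, fade_length):
    """Return a copy of samples whose last fade_length values ramp down to zero.

    Assumes signed samples centered on zero (zero means silence).
    """
    fade_length = min(fade_length, len(samples))
    faded = list(samples)
    start = len(faded) - fade_length
    for i in range(fade_length):
        gain = (fade_length - 1 - i) / fade_length   # ends at exactly 0.0
        faded[start + i] = int(faded[start + i] * gain)
    return faded

if __name__ == "__main__":
    clip = [1000] * 8                 # a clip that ends abruptly at full level
    print(fade_out(clip, 4))          # the last four samples now ramp down
```

Because the final sample is exactly zero, the signal no longer stops abruptly at a high level, which is the condition shown in Figure 27.6 that tends to produce the pop.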

Considering Real-time "Streaming" Audio

The process of playing audio or video over a network in real time is referred to as streaming. With a fast network and a dedicated server, streaming is not a problem. However, when network traffic is variable or network speed is slow, streaming becomes a tremendous challenge.

Why is Streaming Audio Important?

Streaming audio is being used in many ways. It is possible to net-cast live events like concerts and speeches. If you could distort time slightly, you could be a "Net Dead Head," traveling from San Francisco to Chicago to New Orleans, without ever leaving your living room.

Being able to transmit audio live provides a tremendous diversity of content: every radio station in the country, street sounds from your favorite corner in New Orleans, every keynote address at every industry meeting. All of them live.

Another advantage of streaming has nothing to do with the "live" part of the equation. Let's assume that you are doing research on Bill Clinton's speeches, and their topics, according to region. You could go to the President Clinton virtual museum (if one exists) and listen to all of his speeches. That would be possible today with non-streaming audio, but consider the problems. You must first download very large audio files to your hard disk drive, then play and listen to them, and, because of space constraints, delete them, only to find one week later that you need the information from the August 26th speech that you just deleted from your disk. With a library built upon streaming data, you simply connect to the virtual museum, find the file that you are looking for, then start to play it. When you have heard enough, you can stop it; if you want to replay part of the speech, you can rewind it, or you can fast forward to the section and topic that you want to listen to. A good museum would provide hyperlinks to different parts of the audio by topic and keyword.

Streaming audio changes the way we think about and use audio. Storage and transmission time, our two biggest problems, have been solved.

RealAudio

RealAudio is the pioneer in popularizing audio streaming over the Internet. Using RealAudio encoding, a one minute long 2.6M .Wav or .Au file can be reduced to either a 60K audio file, designed to be streamed over a 14.4Kbps connection, or a 113K audio file, designed to be streamed over a 28.8Kbps connection. There is no way that any compression scheme can get this magnitude of compression without sacrificing significant audio information. However, in the case of the 60K file, you can clearly hear a speaker's voice and, with the 113K file, you can hear mono music that sounds like a good AM radio station. RealAudio accomplishes the compression by throwing out a lot of information contained in the original audio clip. For example, the 60K files are optimized for the dynamic range of the human voice; dynamic range is intentionally limited so that the "important" audio information is not discarded.
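A quick calculation shows why these particular file sizes suit these particular modems. The Python sketch below works out the average data rate and the compression ratio for the two encoded files described above. It assumes that K means 1,024 bytes and M means 1,024K; the function names are invented for this example.

```python
def average_bits_per_second(size_kilobytes, seconds):
    """Average data rate needed to play a file of this size in real time."""
    return size_kilobytes * 1024 * 8 / seconds

def compression_ratio(original_kilobytes, encoded_kilobytes):
    """How many times smaller the encoded file is than the original."""
    return original_kilobytes / encoded_kilobytes

if __name__ == "__main__":
    original_k = 2.6 * 1024                      # the 2.6M one minute source file
    for encoded_k, modem_bps in ((60, 14_400), (113, 28_800)):
        rate = average_bits_per_second(encoded_k, 60)
        ratio = compression_ratio(original_k, encoded_k)
        print(f"{encoded_k}K file: {rate:.0f} bits/sec on a {modem_bps} bits/sec "
              f"modem, about {ratio:.0f}:1 compression")
```

Both encoded files need only a little over half of the modem's rated speed, which leaves headroom for the overhead and slowdowns of a real connection.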

In addition to encoding, RealAudio uses a different network protocol. RealAudio can use TCP (Transmission Control Protocol), but in most cases uses UDP (User Datagram Protocol). While TCP assures complete transmission of data (the emphasis is on data integrity), it is rather poor at maintaining a constant rate of transmission (critical if you are trying to listen to audio). UDP, on the other hand, loses data frequently, but is much better than TCP at maintaining a constant rate. For streamed audio, this constant rate is more important than an occasional "drop out." Heavy network traffic will wreak havoc on any streamed data. You need a RealAudio Server. You cannot just place files in RealAudio format on your site and have them stream in real-time.

Macromedia's Shockwave

Shockwave is more than streamed audio. It is actually designed as a solution for delivering multimedia content over the network. Macromedia's Director product is wildly popular for creating and organizing multimedia presentations. It is only natural that streaming audio should be part of that solution. Shockwave provides compression rates that allow audio streaming. Shockwave operates using TCP so audio is transmitted completely. In order to combat some of the dropouts that you would invariably encounter using a TCP connection, Shockwave has incorporated features that let you specify how much of the audio file you want to download before the audio starts playing. That is, Shockwave creates an audio buffer protecting itself from network delays. Shockwave does not require special server software; however, it does require the purchase of the authoring software.
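The idea behind such a buffer can be shown with a little arithmetic. The Python sketch below estimates how long a listener must wait before playback can start so that the clip never runs dry. It is not Shockwave's actual algorithm; it is a simplified model that assumes the network delivers data at a steady rate, and the function name and figures are made up for illustration.

```python
def preload_seconds(clip_seconds, clip_bits_per_second, network_bits_per_second):
    """Waiting time before playback can begin without stalling.

    Simplified model: the network delivers data at a constant rate.
    """
    download_time = clip_seconds * clip_bits_per_second / network_bits_per_second
    return max(0.0, download_time - clip_seconds)

if __name__ == "__main__":
    # A one minute clip encoded at 16,000 bits/sec over a link that
    # really delivers only 12,000 bits/sec.
    print(preload_seconds(60, 16_000, 12_000), "seconds of buffering needed")

    # The same link with a clip encoded at 8,000 bits/sec.
    print(preload_seconds(60, 8_000, 12_000), "seconds of buffering needed")
```

When the network is faster than the clip's data rate, no waiting is needed in this model; in practice a small buffer is still useful because real network throughput is never perfectly steady.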

Which Product Is Better?

At the time of this writing, RealAudio is the more mature product and provides a greater feature set than does Shockwave (for example, the ability to pause, rewind, and do live broadcasts). RealAudio does require a special server. If you do not have access to a RealAudio server, you will have to find one that you can use (at least for your audio links). Shockwave does not require a special server. Both products require software purchases. If you are serious about putting together a site with lots of audio content, you should carefully evaluate these as well as other streaming audio products that may appear in the market. Preparing this type of content is a considerable time investment and you certainly want to make the best decision for your clients and your business model. Both RealAudio and Macromedia are constantly improving their products. For example, RealAudio has just added stereo support. So check around and see what is new. Some of these new features make dramatic improvements to your listening experience.

To get started using RealAudio, you need to invest almost $800.00 to provide a server that five people can access simultaneously. Providing a server that can stream to as many as one hundred simultaneous users will cost over $11,000.00. In addition, there are hardware concerns; you need a tremendously wide connection for your server once you start streaming audio. Please check the appropriate specifications at www.realaudio.com.

What Is the Future of Audio on the Internet?

You can expect that all common (and even some uncommon) audio formats will be supported by browsers in the future. This is a very competitive industry. Supporting additional audio formats is fairly easy and gives the browser companies something to talk about. As transmission moves from phone lines to cable, satellite, and other high bandwidth technologies, the size and format of audio files will become less important. What will remain are the two most important audio considerations: