Introducing the World Wide Web

The Documents

What kind of documents can a client retrieve and display? Web documents can contain text, images, animations, audio, PostScript, and other file formats. Most Web documents consist primarily of text with other types of files embedded inside.

HTML

For client software to format a document, the file must contain appropriate HyperText Markup Language (HTML) codes. HTML is an application of the Standard Generalized Markup Language (SGML); that is, it is a markup language defined using SGML. An HTML file is an ASCII file with the HTML codes included. Formatting a document means that the client reads the HTML codes, interprets them, and determines how the document should look online. Clients vary in how they depict different elements on the screen, although most clients follow standard display conventions. (Most clients give the user the option of altering the display defaults so online presentation can be customized.)

HTML codes identify the elements of a file such as headings, paragraphs, or lists. Character elements (e.g., emphasis, variable) are also available. The elements of a file are identified with HTML tags--codes enclosed between angle brackets (< >). Most tags are paired, with the closing tag including a slash. For example, <STRONG>important</STRONG> usually displays the word "important" in bold online; <H1>NCSA Mosaic</H1> may display "NCSA Mosaic" as a large, prominent heading, depending on the browser you are using.
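As a rough illustration of how a client might read tags and decide on a display, the sketch below uses Python's standard html.parser module. The names ToyFormatter and render, and the display choices (asterisks for STRONG, capitals for H1), are assumptions made for this example; real clients such as NCSA Mosaic apply their own conventions.

```python
from html.parser import HTMLParser


class ToyFormatter(HTMLParser):
    """Render a tiny subset of HTML as plain text (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tags that are currently open
        self.output = []     # rendered text fragments, in order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <STRONG> arrives as "strong".
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # A closing tag (the one with the slash) ends the innermost element.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        current = self.open_tags[-1] if self.open_tags else None
        if current == "strong":
            self.output.append("*" + data + "*")      # stand-in for bold
        elif current == "h1":
            self.output.append(data.upper() + "\n")   # stand-in for a heading
        else:
            self.output.append(data)


def render(html_text):
    """Return the plain-text rendering chosen by ToyFormatter."""
    formatter = ToyFormatter()
    formatter.feed(html_text)
    formatter.close()
    return "".join(formatter.output)
```

Calling render("<STRONG>important</STRONG>") yields the word wrapped in asterisks. A different client could map the same tag to a different appearance, which is the sense in which HTML identifies elements rather than fixing their look.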

A hyperlink to another file is created with a tag that includes complete information about the document's location. Images are included with another special tag.
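To make the two kinds of tags concrete, here is a minimal sketch that collects hyperlink and image locations from a piece of HTML. It assumes the standard A tag with an HREF attribute for links and the IMG tag with a SRC attribute for images; the class name LinkCollector and the file name logo.gif are hypothetical and used only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the locations named in <A HREF=...> and <IMG SRC=...> tags."""

    def __init__(self):
        super().__init__()
        self.links = []   # values of HREF attributes on <A> tags
        self.images = []  # values of SRC attributes on <IMG> tags

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names are lowercased by HTMLParser;
        # attribute values keep their original case.
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


def collect(html_text):
    """Return (links, images) found in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links, collector.images


# "logo.gif" is a made-up image name for this example.
SAMPLE = '<A HREF="http://www.cs.trinity.edu/">Trinity</A> <IMG SRC="logo.gif">'
```

Running collect(SAMPLE) returns one link and one image location, which is roughly the information a client needs before it can fetch the linked document or the embedded picture.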

URL

How does a client know what server to contact for a file? The document information such as server location and filename is included using a specific format--called a URL--that is embedded with a special HTML tag. URL stands for Uniform Resource Locator, which is a Web standard for providing the name of the server as well as the document's path and filename on the server. URLs take the form:

protocol://hostname.domain[:port]/path/filename

where protocol is one of the following (and port is optional):

Protocol   Server Type
--------   -----------
file       your local system or an anonymous FTP (File Transfer Protocol) server
ftp        an anonymous FTP server
http       a World Wide Web server
gopher     a Gopher server
news       an NNTP news server
telnet     opens a telnet session
wais       a WAIS (Wide Area Information Server) server
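The URL form above can be taken apart programmatically. The following sketch uses urlsplit from Python's standard urllib.parse module; the function name describe_url is an assumption for this example, and port 8080 in the second URL is a made-up value chosen only to show the optional port.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts named in protocol://hostname.domain[:port]/path/filename."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "http", "ftp", "gopher"
        "host": parts.hostname,     # hostname.domain
        "port": parts.port,         # None when the URL omits the optional port
        "path": parts.path,         # /path/filename on the server
    }


# The first URL is built from the host and path discussed in this document.
# The second adds a hypothetical port number.
WITHOUT_PORT = "http://www.cs.trinity.edu/About/The_Courses/cs301/html/intro/index.html"
WITH_PORT = "http://www.cs.trinity.edu:8080/About/The_Courses/cs301/html/intro/index.html"
```

When the port is omitted, describe_url reports None for it, and a client would fall back to the default port for the protocol.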

Making the Connection

The URL is included in a document in a special HTML hyperlink tag:

<A HREF="http://www.cs.trinity.edu/About/The_Courses/cs301/html/intro/index.html">Introducing the World Wide Web</A>

This coding makes the words "Introducing the World Wide Web" the online hyperlink that connects to the file index.html on the Trinity University Computer Science Web server, www.cs.trinity.edu. The file is located in the /About/The_Courses/cs301/html/intro directory. (All directories are relative to the document root configured in the HTTP server software, not to the root of the server machine's file system.)

The creation of links is handled by the author of an HTML file. Hyperlinks are inserted using the <A HREF> tag shown in the example above. In addition, when a URL is known (perhaps one colleague forwards a URL to another), it may be entered directly in a special field in the client. The client software then requests the new document.
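What "requests the new document" involves can be sketched for the http case. The function below only builds the text a client would send; it does not open a network connection. It assumes an HTTP/1.0-style GET request and the conventional default port 80; the function name build_request is hypothetical.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Return (host, port, request_text) for an http URL.

    This is an illustrative sketch of an HTTP/1.0-style request,
    not a complete client.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http URLs")

    host = parts.hostname
    port = parts.port or 80      # assumed default port for http
    path = parts.path or "/"     # an empty path means the server's top-level document
    if parts.query:
        path += "?" + parts.query

    # The Host header is optional in HTTP/1.0 but commonly sent.
    request_text = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
    return host, port, request_text
```

For a URL typed directly into the client, such as http://www.cs.trinity.edu, the sketch yields the host to contact, port 80, and a request for the path "/".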

Most clients display the URLs of hyperlinks before the reader clicks on the anchor to activate the retrieval process. Seeing the URL can be helpful because some important elements of the URL (particularly the hostname and domain) can be deciphered before activating the link. The host www.csi.uottawa.ca, for example, ends in .ca, the country code for Canada; it is located in Ontario. Your client will likely take longer to access distant servers, and the files may take longer to load. Image files (with a .gif or .xbm file extension) are usually large and also may take longer to load.

HTML Editors

How are HTML documents created? HTML files are actually just ASCII files with the special HTML codes. A file can be created with any software that can save a document as an ASCII file (e.g., your favorite word processor).
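Because an HTML file is only ASCII text, any program that can write text can produce one. The sketch below is one way to do so in Python; the function name save_html, the file name intro.html, and the sample page contents are assumptions for this example.

```python
import os
import tempfile

# A small sample page; the contents are illustrative only.
PAGE = (
    "<TITLE>Introducing the World Wide Web</TITLE>\n"
    "<H1>Introducing the World Wide Web</H1>\n"
    "<P>HTML files are <STRONG>plain ASCII</STRONG> text.</P>\n"
)


def save_html(path, text):
    """Write text to path as an ASCII file.

    encoding="ascii" makes Python raise UnicodeEncodeError if the text
    contains any non-ASCII character; newline="\n" keeps line endings
    unchanged on every platform.
    """
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(text)


if __name__ == "__main__":
    # Write the sample page into the system's temporary directory.
    target = os.path.join(tempfile.gettempdir(), "intro.html")
    save_html(target, PAGE)
    print("wrote", target)
```

The resulting file can be opened in a Web client exactly like one produced by a word processor's "save as ASCII text" option.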

Fortunately, there are many software packages (called editors) available to help create HTML files. Editors designed specifically for this purpose eliminate the need for typing HTML tags by hand--usually HTML tags are on a menu or in a dialog box, and the software inserts the codes.
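The core of what such an editor does when a menu item is chosen can be approximated in a few lines. This is a minimal sketch, not how any particular editor is implemented; the function name wrap_in_tag is hypothetical. It uses html.escape from Python's standard library so that characters with special meaning in HTML are not mistaken for tags.

```python
from html import escape


def wrap_in_tag(tag, text):
    """Return text enclosed in an opening and a closing HTML tag.

    Characters such as < and & in the text are escaped so the client
    does not interpret them as markup.
    """
    name = tag.upper()
    return "<{0}>{1}</{0}>".format(name, escape(text, quote=False))
```

For example, wrap_in_tag("strong", "important") produces <STRONG>important</STRONG>, the same markup an author would otherwise type by hand.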



mosaic@ncsa.uiuc.edu