Introduction

This document is an introduction to the World Wide Web. It is intended to be a gentle primer for users who have heard of the Web and wish to learn more. It explains the concepts underlying the Web and how to try it out for yourself. It is not intended to be a guide to providing information on the Web.

This document is available on the Web, as well as being posted fortnightly to the Usenet newsgroups comp.infosystems.www.users, alt.hypertext, comp.infosystems.www.providers and comp.infosystems.www.misc. It is available via anonymous ftp. For instructions on retrieving the latest version of this document, consult the last section, called ``How to obtain this document''.

This document was last revised on Mon Dec 12 17:31:17 NZD 1994 by Nathan Torkington.

Table of Contents

  1. Introduction
  2. Table of Contents
  3. The Vision of the Web
  4. What is in the Web
  5. How to See More
  6. Providing Information
  7. See Also
  8. Books with More Information
  9. How to obtain this document

The Vision of the Web

The World Wide Web is the vision of programs that can understand the numerous different information-retrieval protocols (FTP, Telnet, NNTP, WAIS, gopher, ...) in use on the Internet today as well as the data formats of those protocols (ASCII, GIF, Postscript, DVI, TeXinfo, ...) and provide a single consistent user-interface to them all. In addition, these programs would understand a new protocol (HTTP) and a new data format (HTML) both geared toward hypermedia.

The programs already exist --- ``Netscape'', ``Lynx'', ``Mosaic'', ``TkWWW'', ``Cello'' and CERN's ``LineMode Browser'' are in use at hundreds, if not thousands, of sites around the Internet today. The ability of the programs to understand existing protocols means that they can access the huge body of gopherspace, FTP files, WAIS databases and news articles already extant. In addition to this, a large amount of new hypertext is being introduced through HTTP and HTML.

What is in the Web

gopher is a similar system to the Web, but not as powerful. The gopher software implements its own protocol, with limited access to other protocols. gopherspace, as the information accessible through gopher is called, consists of menus which can contain text files, binary files, images, keyword-search items, or more menus. The principal limitation of gopher is that it can't exploit hypertext. See the entry on ``gopher'' in the section ``See Also'' for information on obtaining the gopher software.

Hypertext is a term first coined by Ted Nelson. It is the logical combination of computers and text --- a computer interface to text which allows cross-references to be followed. In a graphical situation, the user can follow cross-references by clicking with their mouse on the cross-referenced phrase. This would bring up the document at the ``other end'' of the cross-reference. Hypermedia is the extension of this to include graphics and audio as things which can be selected or viewed.
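
As a rough illustration of what a cross-reference looks like to a program, the sketch below pulls the links out of a small piece of HTML using Python's standard html.parser module. The sample text and the file names primer.html and servers.html are made up for the example, and LinkCollector and extract_links are hypothetical names; this is not how any particular browser is written.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (target, phrase) pairs for every <a href="..."> anchor."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, phrase) pairs
        self._href = None    # target of the anchor currently open, if any
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html_text):
    """Return the cross-references found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# Made-up sample document with two cross-references.
SAMPLE = (
    '<p>Read the <a href="primer.html">primer</a> first, '
    'then the <a href="servers.html">server guide</a>.</p>'
)
links = extract_links(SAMPLE)
```

Each pair holds the document at the ``other end'' of the cross-reference and the phrase the user would select to follow it.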

WAIS is a full-text database system produced by Thinking Machines Corp, and placed in the public domain. Full-text databases allow retrieval of documents by specifying any of the words which occur in them. WAIS also gives document ranking and (with the appropriate extensions) boolean searches. WAIS servers communicate with users' programs via the ANSI standard Z39.50 protocol. See the entries on ``WAIS'' and ``Z39.50'' in the section ``See Also'' for information on obtaining the WAIS software.
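
The idea of full-text retrieval with ranking can be sketched in a few lines of Python. This is a toy inverted index, not the WAIS implementation: the scoring rule (count how often the query words occur in each document) is an assumption chosen for simplicity, and the document names and texts are invented for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to a Counter of {document name: occurrences}."""
    index = defaultdict(Counter)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word][name] += 1
    return index


def search(index, query):
    """Return names of documents containing any query word, best match first.

    Assumed ranking: total occurrences of the query words, ties broken by name.
    """
    scores = Counter()
    for word in tokenize(query):
        for name, count in index.get(word, {}).items():
            scores[name] += count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked]


# Invented sample documents.
DOCS = {
    "gopher.txt": "gopher menus contain text files and more menus",
    "ftp.txt": "ftp copies files between computers",
    "wais.txt": "wais retrieves documents by the words in the documents",
}
INDEX = build_index(DOCS)
```

With this index, search(INDEX, "files menus") returns the gopher document ahead of the FTP one, because the gopher text contains more of the query words.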

FTP is the standard Internet protocol for copying files between computers. A very large amount of information is available via anonymous FTP, a variant of FTP where a set of files is made available for public access. See the entry on ``FTP'' in the section ``See Also'' for information on obtaining the source code to an FTP server. See the entry ``FTP by Mail'' for instructions on doing FTP through e-mail.
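
For readers who program, an anonymous FTP session can be sketched with Python's standard ftplib module. The function name list_anonymous_dir and its ftp_factory parameter are hypothetical, introduced only so that the sketch can be tried without a network connection; calling login() with no arguments logs in as the user ``anonymous''.

```python
from ftplib import FTP


def list_anonymous_dir(host, directory, ftp_factory=FTP):
    """Log in anonymously to host and return the file names in directory.

    ftp_factory is a hypothetical hook for substituting a stand-in class;
    by default the real ftplib.FTP class is used, which connects to host.
    """
    ftp = ftp_factory(host)
    try:
        ftp.login()           # no user name given: anonymous login
        ftp.cwd(directory)    # change to the public directory
        return ftp.nlst()     # names of the files found there
    finally:
        ftp.quit()            # always close the session politely
```

For example, list_anonymous_dir("ftp.uu.net", "/networking/ftp/wuarchive-ftpd/") would ask for the directory named in the ``FTP'' entry of the section ``See Also''.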

NNTP is a protocol used for moving around Usenet News. This is like the bulletin-board of the Internet (although plenty of non-Internet users also contribute), with articles being contributed on a wide variety of subjects. The articles are grouped into newsgroups depending on their content --- the author of an article specifies which newsgroup(s) it is to go in. See the entry on ``NNTP'' in the section ``See Also'' for information on obtaining the source code to an NNTP server.

Documents on the Web are referred to using URLs (Uniform Resource Locators). A URL looks like http://www.vuw.ac.nz/~gnat/ideas/www-primer.html. It consists of three parts --- the method of retrieving the document (http), an optional machine name (www.vuw.ac.nz) and a pathname (/~gnat/ideas/www-primer.html). The URL format is an Internet standard --- see the entry ``URL'' in the section ``See Also'' to find more information on URLs.
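
The three parts can be picked out mechanically. The sketch below uses Python's standard urllib.parse module; the function name url_parts is hypothetical, and the example URL is the one given above.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return (method, machine name, pathname) for a URL.

    The machine name is None when the URL does not contain one.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


method, machine, pathname = url_parts(
    "http://www.vuw.ac.nz/~gnat/ideas/www-primer.html"
)
```

Here method is "http", machine is "www.vuw.ac.nz" and pathname is "/~gnat/ideas/www-primer.html".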

How to See More

Several computers on the Internet have public-access World Wide Web clients accessible by telnet. Here is the current list:
telnet.w3.org
A telnettable browser provided by the W3 coalition.
ukanaix.cc.ukans.edu
Offers Lynx, a full screen browser which requires a vt100 terminal. Log in as www. Does not allow users to "go" to arbitrary URLs, so GET YOUR OWN COPY of Lynx and install it on your system if your administrator has not done so already. The best plain-text browser, so move mountains if necessary to get your own copy of Lynx!
www.njit.edu
(or telnet 128.235.163.2) Log in as www. A full-screen browser at the New Jersey Institute of Technology, USA.
www.huji.ac.il
A dual-language Hebrew/English database, with links to the rest of the world. The line mode browser, plus extra features. Log in as www. Hebrew University of Jerusalem, Israel.
sun.uakom.cs
Slovakia. Has a slow link; only use from nearby.
www.tky.hut.fi
(or telnet 128.214.6.102). Log in as www. Offers several browsers, including Lynx (goto option is disabled there also).
fserv.kfki.hu
Hungary. Has a slow link; only use from nearby. Log in as www.

If you are interested in the Web, compile or FTP one of the browsers so you can browse from your own machine. Browsers exist for Amigas, IBM PCs, Unix, VMS, Macintoshes, and the X Window System. The entry on ``Browsers'' in the section ``See Also'' has a list of which browsers are available for which computers.

Providing Information

To add information to the Web, you need either an HTTP server, a gopher server, an FTP server, or a WAIS database server. These are all available as source code via anonymous FTP (see the relevant entries in the section ``See Also'' for information on obtaining the source code for these servers). Which server you choose depends on your needs.

If you only want to serve plain-ASCII databases, install a WAIS server. If you want to serve unformatted text, with the option for WAIS searching, install a gopher server and WAIS. If you want to deliver hypertext, and speed is unimportant, use an FTP server (beware, though --- FTP is very slow for this). If you want to deliver hypertext with reasonable speed, use an HTTP server.
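
The advice in the paragraph above amounts to a small decision table, sketched here in Python. The function name suggest_server and the labels "ascii-database", "text" and "hypertext" are hypothetical, invented for this illustration only.

```python
def suggest_server(content, need_speed=True):
    """Return the server recommended above for a kind of content.

    content is one of the hypothetical labels "ascii-database", "text"
    or "hypertext"; need_speed only matters for hypertext.
    """
    if content == "ascii-database":
        return "WAIS server"
    if content == "text":
        return "gopher server and WAIS"
    if content == "hypertext":
        # FTP can deliver hypertext, but only when speed is unimportant.
        return "HTTP server" if need_speed else "FTP server"
    raise ValueError("unknown kind of content: " + repr(content))
```

For example, suggest_server("hypertext") gives "HTTP server", while suggest_server("hypertext", need_speed=False) gives "FTP server".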

A thorough discussion of the merits and disadvantages of the various HTTP servers appears in the companion document ``An Information Provider's Guide to Web Servers''. See the section ``How to obtain this document'' for more information. What follows is a summarised version of that document.

There are four HTTP servers available without restriction to academic users. They are:

The CERN Server
This has mapping (ability to redirect requests), a security filter, and can act as a gateway to most things.
The NCSA Server
This is a small and simple server, with the ability to act as an annotation server as well. It also understands the gopher setup, and can run on top of the same data.
Plexus
This is written in Perl (see the entry on ``Perl'' in the section ``See Also'' for more information on Perl) by Tony Sanders (sanders@bsdi.com). It comes with ArchiePlex, an archie gateway (see the entry on ``archie'' in the section ``See Also'' for more information on archie) and various calendar, manual page and finger gateways. It even has a converter from setext to HTML (see the entry on ``setext'' in the section ``See Also'' for more information on setext).
gn
This is a small server, written in C, able to provide both gopher and HTTP/1.0 access to the same data.

See the entry on ``Obtaining the servers'' in the section ``See Also'' for instructions on obtaining these HTTP servers. The newsgroup comp.infosystems.www.providers is a good place to ask questions for help on compilation and setup.

If you are serving hypertext to the Web, you will need to know about HTML (the HyperText Markup Language) and the converters that exist between HTML and RTF, LaTeX and others. See the document ``An Information Provider's Guide to HTML'', posted fortnightly to comp.infosystems.www.providers, for more information on HTML and HTTP servers. The entry on ``HTML'' in the section ``See Also'' has more information on obtaining this document. Note that you don't need to know about HTML if you're not serving hypertext.

See Also

Archie
Archie is a database of files available via anonymous FTP. You can specify a filename, part of a filename, or a regular expression, and archie will give you the names of the computers that have the file you asked for available via anonymous FTP. For more information on archie, see the file README available through anonymous FTP in the directory pub/archie/doc on archie.ans.net.
Browsers
MS-DOS users have several choices, depending on their software installation. Windows users, with an appropriate Winsock-compliant TCP/IP stack, should use Netscape, WinWeb, Cello or NCSA Mosaic. PC-NFS users with NIS and a suitably configured NIS server can run any of the above clients.

Macintosh users should try MacWeb, MacMosaic or MacWWW.

X Window System users should try Netscape or XMosaic --- XMosaic requires the Motif libraries to compile, but precompiled binaries are available for many platforms. Netscape is only provided in binary form. A similar interface is provided by TkWWW --- TkWWW uses the tcl/tk language and graphics libraries.

Unix users can obtain CERN's simple LineMode Browser. The Lynx browser is harder to compile but looks better.

VMS users can use the LineMode browser or Lynx, or the VMS WWW browser (see ``VMS'' for information on obtaining this browser).

Cello
Cello is available via anonymous FTP from fatty.law.cornell.edu in the directory /pub/LII/Cello/.
CERN Server
The CERN server is available via anonymous FTP from info.cern.ch, in the directory /pub/www/src/ as WWWLineMode_XXX.tar.Z where XXX is a version number.
gn
gn is available via anonymous FTP from ftp.acns.nwu.edu in the directory /pub/gn/.
gopher
The gopher software is available via anonymous FTP from boombox.micro.umn.edu in the /pub/gopher/ directory. The gopher protocol is documented in RFC 1436 (see the entry ``RFC'' to find out how to obtain copies of RFCs).
FTP
The source for a reliable and useful FTP server is available via anonymous FTP from ftp.uu.net in the directory /networking/ftp/wuarchive-ftpd/
FTP By Mail
Send e-mail to mail-server@rtfm.mit.edu with ``send usenet/news.answers/finding-sources'' in the body.
HTML
HTML, the HyperText Markup Language, is documented in the file html-spec.txt.Z, available via anonymous FTP from info.cern.ch in the directory /pub/www/doc/ or via anonymous FTP from ftp.uu.net in the directory /networking/info-service/www/doc/. The document ``An Information Provider's Guide to HTML'' is posted fortnightly to the same Usenet newsgroups as this document, and is available via FTP from the same places (see the section ``How to obtain this document'' for more information).
LineMode Browser
The PC-NFS version is available via anonymous FTP from info.cern.ch in the directory /pub/www/bin/pc-nfs/wwwpcnfs.zip. The source-code (which compiles under Unix and VMS) is available via anonymous FTP from info.cern.ch in the directory /pub/www/src/ as WWWLineMode_XXX.tar.Z, where XXX is a version number.
Lynx
Lynx requires the ``curses'' full-screen library, and is available via anonymous FTP from ftp2.cc.ukans.edu.
MacWWW
MacWWW is available via anonymous FTP from info.cern.ch in the directory /pub/www/bin/mac/.
Mosaic
The Mosaic suite are NCSA's clients, and they are leading the field at the moment. They are all GUI-based, and available via anonymous FTP from ftp.ncsa.uiuc.edu in /Web/. Versions exist for Macintoshes (System 7 needed), PCs running Microsoft Windows, and X11. The source to the X version is available, but requires Motif. Precompiled binaries are available via FTP for a number of architectures, as well as for the microcomputers.
NCSA Server
The NCSA server is available via anonymous FTP from ftp.ncsa.uiuc.edu in the directory /Web/httpd/ both as Unix source code and as a Microsoft Windows binary.
Netscape
Try ftp.mcom.com and ftp2.mcom.com in the directory /netscape/.

Mirror sites are available --- if you cannot connect to ftp.mcom.com, you will be given a list of alternative sites to try.

NNTP
The Net-News Transfer Protocol (NNTP) is described in RFC 977 (see the subsection ``RFC'' for information on obtaining RFCs). Several implementations are available; the latest and most efficient is INN (available via anonymous FTP from ftp.uu.net in the directory /networking/news/nntp/inn/).
Obtaining the servers
See the relevant entries: CERN Server, NCSA Server, gn and Plexus.
Perl
Perl is an interpreted language, especially good for text handling. It is available via anonymous FTP from ftp.uu.net in the directory /pub/languages/perl/ as perl.tar.gz.
Plexus
Plexus Version 2.2.1 is available via ftp to austin.bsdi.com, login: ftp, password: yourid, in plexus/2.2.1/dist/Plexus-2.2.1.tar.Z. Version 3.0i-beta (pre-release, not for production use) is available via ftp to austin.bsdi.com, login: ftp, password: yourid, in plexus/3.0-beta/prerelease/Plexus-3.0i.tar.Z.
Provider's Guide
The document ``An Information Provider's Guide to Web Servers'' is posted fortnightly to the same Usenet newsgroups as this document, and is available via FTP from the same places (see the section ``How to obtain this document'' for more information).
RFC
RFC stands for ``Request for Comments''. Internet RFCs are documentation of protocols, proposals, pipe-dreams and plans. They are numbered sequentially from 1, and are available via anonymous FTP from nic.ddn.mil in the directory /rfc/.
setext
setext stands for Structure Enhanced Text, and is a markup system that provides a way to format ASCII documents with visually unobtrusive anchors to parts of it above the paragraph level. More information is available on the Web as http://www.bsdi.com/setext/
TkWWW
TkWWW is available via anonymous FTP from any X11 site in the contrib/ directory --- TkWWW uses the tcl/tk language and graphics libraries. It has a rudimentary HTML editing system, and is actively under development.
URL
The draft URL specification is available via anonymous FTP from info.cern.ch in the directory /pub/www/doc/ as urlX.txt where X is a version number.
VMS
The Hebrew University of Jerusalem have a VMS browser tested under UCX/Multinet and UCX_APX (Alpha). It uses the VMS/SMG screen routines and is available via anonymous FTP from www.huji.ac.il in the directory /www/vms_client/.
WAIS
The Thinking Machines release of WAIS is available via anonymous FTP from ftp.uu.net in the directory /networking/info-service/wais/ as wais-8-b5.1.tar.Z. The CNIDR release of freeWAIS is available via anonymous FTP from ftp.cnidr.org in the directory /pub/NIDR.tools/ as freeWAIS-0.2.tar
Z39.50
ANSI standard Z39.50 is a standard for communication for information retrieval. The draft specification is available via anonymous FTP in the same place as either version of the WAIS source. The file is probably called z3950-spec.txt. ANSI charge for paper copies of the real standard.
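
To make the ``Archie'' entry at the start of this section more concrete, here is a toy Python version of the three kinds of lookup it describes: by filename, by part of a filename, and by regular expression. It is only a sketch and says nothing about how archie itself is built. The host names, file names and the function find_hosts are all invented for the example.

```python
import re

# Invented index: machine name -> files it offers via anonymous FTP.
INDEX = {
    "ftp.example.org": ["/pub/tools/widget-1.0.tar.Z", "/pub/README"],
    "archive.example.net": ["/mirrors/lang/parser.tar.gz", "/pub/README"],
}


def _basename(path):
    """Return the filename part of a path."""
    return path.rsplit("/", 1)[-1]


def find_hosts(index, query, mode="exact"):
    """Return the sorted names of machines holding a matching filename.

    mode is "exact" (whole filename), "substring" (part of a filename)
    or "regex" (regular expression searched for within the filename).
    """
    if mode == "exact":
        matches = lambda name: name == query
    elif mode == "substring":
        matches = lambda name: query in name
    elif mode == "regex":
        pattern = re.compile(query)
        matches = lambda name: pattern.search(name) is not None
    else:
        raise ValueError("unknown mode: " + repr(mode))
    return sorted(
        host
        for host, paths in index.items()
        if any(matches(_basename(path)) for path in paths)
    )
```

For example, find_hosts(INDEX, "README") names both machines, while find_hosts(INDEX, "parser", mode="substring") names only archive.example.net.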

How to obtain this document

The latest version of this document is always available on the Web as http://www.vuw.ac.nz/~gnat/ideas/www-primer.html, and the most recently posted ASCII version will be available via anonymous FTP from wuarchive.wustl.edu in the directory /doc/misc/www/.

This document is part of a series: ``World Wide Web Primer'', ``An Information Provider's Guide to HTML'', and ``An Information Provider's Guide to Web Servers''. The other documents in the series are available from the archives above.

Please send feedback to the author, Nathan Torkington, at the e-mail address Nathan.Torkington@vuw.ac.nz --- all discussion will be treated as public domain and may be used in future versions of this document.