This document is an introduction to the World Wide Web. It is
intended to be a gentle primer for users who have heard of the Web and
wish to learn more. It explains the concepts underlying the Web and
how to try it out for yourself. It is not intended to be a
guide to providing information on the Web.
This document is available on the Web, as well as being posted
fortnightly to the Usenet newsgroups
comp.infosystems.www.misc. It is also available via anonymous FTP.
For instructions on retrieving the latest version of this document,
consult the last section, called ``How to obtain this document''.
This document was last revised on Mon Dec 12 17:31:17 NZD 1994
by Nathan Torkington.
Table of Contents
- Table of Contents
- The Vision of the Web
- What is in the Web
- How to See More
- Providing Information
- See Also
- Books with More Information
- How to obtain this document
The Vision of the Web
The World Wide Web is the vision of programs that can understand the
numerous different information-retrieval protocols (FTP, Telnet, NNTP,
WAIS, gopher, ...) in use on the Internet today as well as the data
formats of those protocols (ASCII, GIF, Postscript, DVI, TeXinfo, ...)
and provide a single consistent user-interface to them all. In
addition, these programs would understand a new protocol (HTTP) and a
new data format (HTML) both geared toward hypermedia.
The programs already exist --- ``Netscape'', ``Lynx'', ``Mosaic'',
``TkWWW'', ``Cello'' and CERN's ``LineMode Browser'' are in use at
hundreds, if not thousands, of sites around the Internet today. The
ability of the programs to understand existing protocols means that
they can access the huge body of gopherspace, FTP files, WAIS
databases and news articles already extant. In addition to this,
large amounts of new hypertext are being introduced through HTTP and
What is in the Web
gopher is a similar system to the Web, but not as powerful.
The gopher software implements its own protocol, with limited access
to other protocols. gopherspace, as the information accessible
through gopher is called, consists of menus which can contain text
files, binary files, images, keyword-search items, or more menus. The
principal limitation of gopher is that it can't exploit hypertext.
See the entry on ``gopher'' in the section ``See Also'' for
information on obtaining the gopher software.
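As a rough illustration of what a gopher menu looks like on the wire,
here is a minimal Python sketch that splits one menu line into its
fields. It assumes the line layout described in RFC 1436 (a one-character
item type, then a display string, selector, host and port separated by
tabs). The sample line, its host name and the names parse_menu_line and
MenuItem are made up for this example, and only a handful of item types
are listed.

```python
from typing import NamedTuple

# A partial selection of the item-type characters defined in RFC 1436.
ITEM_TYPES = {
    "0": "text file",
    "1": "menu",
    "7": "keyword-search item",
    "9": "binary file",
    "g": "GIF image",
    "I": "image",
}


class MenuItem(NamedTuple):
    kind: str
    title: str
    selector: str
    host: str
    port: int


def parse_menu_line(line: str) -> MenuItem:
    """Split one gopher menu line into its tab-separated fields."""
    line = line.rstrip("\r\n")
    type_char, rest = line[0], line[1:]
    title, selector, host, port = rest.split("\t")[:4]
    kind = ITEM_TYPES.get(type_char, "unknown")
    return MenuItem(kind, title, selector, host, int(port))


# Hypothetical menu line: the host and selector are invented for illustration.
sample = "1Campus Information\t/campus\tgopher.example.org\t70\r\n"
item = parse_menu_line(sample)
print(item.kind, "-", item.title)
```

Every entry in a menu is just another (selector, host, port) triple, which
is why a menu can point at more menus on other machines but cannot embed
a link in the middle of a piece of text.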
Hypertext is a term first coined by Ted Nelson. It is the
logical combination of computers and text --- a computer interface to
text which allows cross-references to be followed. In a graphical
situation, the user can follow cross-references by clicking with their
mouse on the cross-referenced phrase. This would bring up the
document at the ``other end'' of the cross-reference.
Hypermedia is the extension of this to include graphics and
audio as things which can be selected or viewed.
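To make the idea of a cross-reference concrete, the following Python
sketch lists the links found in a small piece of HTML, pairing each
target with the phrase the reader would select. It uses the standard
library's html.parser module. The page text, the file names in it and the
class name LinkCollector are invented for this example, and this is only
one simple way of finding links, not how any particular browser does it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (target, phrase) pairs for every <a href=...> anchor."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # target of the anchor we are currently inside
        self._text = []     # pieces of the phrase seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page: both file names are made up for illustration.
page = ('<p>See the <a href="primer.html">Web primer</a> and '
        '<a href="guide.html">the guide</a>.</p>')
collector = LinkCollector()
collector.feed(page)
links = collector.links
print(links)
```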
WAIS is a full-text database system produced by Thinking
Machines Corp, and placed in the public domain. Full-text databases
allow retrieval of documents by specifying any of the words which
occur in them. WAIS also gives document ranking and (with the
appropriate extensions) boolean searches. WAIS servers communicate
with users' programs via the ANSI standard Z39.50 protocol. See the
entries on ``WAIS'' and ``Z39.50'' in the section ``See Also'' for
information on obtaining the WAIS software.
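The idea of full-text retrieval with ranking can be sketched in a few
lines of Python. The example below builds a word-to-document index and
ranks documents by how often the query words occur in them. This is a toy
illustration only: it is not the algorithm WAIS uses and it does not speak
Z39.50. The document names, their contents and the function names are all
invented for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to a Counter of {document name: occurrences}."""
    index = defaultdict(Counter)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word][name] += 1
    return index


def search(index, query):
    """Return documents containing any query word, best match first."""
    scores = Counter()
    for word in tokenize(query):
        for name, count in index.get(word, {}).items():
            scores[name] += count
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical documents, made up for illustration.
docs = {
    "gopher.txt": "gopher menus contain text files and more menus",
    "wais.txt": "WAIS is a full-text database with ranked search",
    "ftp.txt": "FTP copies files between computers",
}
index = build_index(docs)
results = search(index, "text files")
print(results)
```

A document that matches both query words is ranked above documents that
match only one, which is the essence of document ranking.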
FTP is the standard Internet protocol for copying files between
computers. A very large amount of information is available via
anonymous FTP, a variant of FTP where a set of files is made
available for public access. See the entry on ``FTP'' in the section
``See Also'' for information on obtaining the source code to an FTP
server. See the entry ``FTP by Mail'' for instructions on doing FTP
NNTP is a protocol used for moving Usenet News around.
Usenet is like the bulletin-board of the Internet (although plenty of
non-Internet users also contribute), with articles being contributed
on a wide variety of subjects. The articles are grouped into
newsgroups depending on their content --- the author of an
article specifies which newsgroup(s) it is to go in. See the entry on
``NNTP'' in the section ``See Also'' for information on obtaining the
source code to an NNTP server.
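The way an author assigns an article to newsgroups can be seen in the
article's headers. The Python sketch below reads the Newsgroups header
of an article and splits it into group names. It assumes the usual Usenet
article layout of mail-style headers with a comma-separated Newsgroups
line. The article text, the sender address and the function name
newsgroups_of are invented for this example; the two group names are ones
mentioned elsewhere in this document.

```python
from email import message_from_string

# Hypothetical article, made up for illustration.
article = """\
From: reader@example.org
Newsgroups: comp.infosystems.www.misc,comp.infosystems.www.providers
Subject: Which browser should I try?

I have heard of the Web and would like to learn more.
"""


def newsgroups_of(raw_article):
    """Return the list of newsgroups an article was posted to."""
    message = message_from_string(raw_article)
    header = message.get("Newsgroups", "")
    return [group.strip() for group in header.split(",") if group.strip()]


groups = newsgroups_of(article)
print(groups)
```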
Documents on the Web are referred to using URLs (Uniform Resource
Locators). A URL looks like
http://www.vuw.ac.nz/~gnat/ideas/www-primer.html. It consists
of three parts --- the method of retrieving the document
(http), an optional machine name (www.vuw.ac.nz) and a
pathname (/~gnat/ideas/www-primer.html). The URL format is an
Internet standard --- see the entry ``URL'' in the section ``See
Also'' to find more information on URLs.
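The three parts can be pulled out of the example URL with a few lines of
Python, using the standard library's urllib.parse module. This is a
sketch for illustration; the variable names method, machine and pathname
simply follow the wording used above.

```python
from urllib.parse import urlsplit

url = "http://www.vuw.ac.nz/~gnat/ideas/www-primer.html"
parts = urlsplit(url)

method = parts.scheme      # how to retrieve the document
machine = parts.hostname   # which machine to ask
pathname = parts.path      # which document on that machine

print(method, machine, pathname)
```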
How to See More
Several computers on the Internet have public-access World Wide Web
clients accessible by telnet. Here is the current list:
If you are interested in the Web, compile or FTP one of the browsers
so you can browse from your own machine. Browsers exist for Amigas,
IBM PCs, Unix, VMS, Macintoshes, and the X Window System. The entry
on ``Browsers'' in the section ``See Also'' has a list of which
browsers are available for which computers.
A telnettable browser provided by the W3 coalition.
Offers Lynx, a full screen browser which requires a vt100
terminal. Log in as www. Does not allow users to "go" to arbitrary
URLs, so GET YOUR OWN COPY of Lynx and install it on your system if
your administrator has not done so already. The best plain-text
browser, so move mountains if necessary to get your own copy of Lynx!
(or telnet 188.8.131.52) Log in as www. A full-screen browser in New
Jersey Institute of Technology. USA.
A dual-language Hebrew/English database, with links to the rest of the
world. The line mode browser, plus extra features. Log in as
www. Hebrew University of Jerusalem, Israel.
Slovakia. Has a slow link; use only from nearby.
(or telnet 184.108.40.206). Log in as www. Offers several
browsers, including Lynx (goto option is disabled there also).
Hungary. Has a slow link; use only from nearby. Log in as www.
Providing Information
To add information to the Web, you need either a HTTP server, a gopher
server, an FTP server, or a WAIS database server. These are all
available in source-code via anonymous FTP (see the relevant entries
in the section ``See Also'' for information on obtaining the source
code for these servers). Which server you choose depends on your
If you only want to serve plain-ASCII databases, then install a
WAIS server. If you want to serve unformatted text, with the option
for WAIS searching, install a gopher server and WAIS. If you want to
deliver hypertext, and speed is unimportant, use an FTP server
(beware, though --- FTP is very slow for this). If you want to
deliver hypertext with reasonable speed, use an HTTP server.
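To give a feel for what an HTTP server does, here is a minimal Python
sketch that serves one hypertext page and then fetches it back over HTTP.
It uses Python's standard http.server module, which is not one of the
servers discussed in this document, and it listens only on the local
machine. The page contents and the names PageHandler and fetch_once are
invented for this example; a real site would use one of the servers
described below.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page, made up for illustration.
PAGE = b'<html><body><h1>Hello</h1><a href="/">home</a></body></html>'


class PageHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same small HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start the server, request the page once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = (response.status, response.read())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once()
print(status, len(body), "bytes")
```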
A thorough discussion of the merits and disadvantages of the various
HTTP servers appears in the companion document ``An Information
Provider's Guide to Web Servers''. See the section ``How to obtain
this document'' for more information. What follows is a summarised
version of that document.
There are four HTTP servers available without restriction to academic
users. They are:
- The CERN Server
- This has mapping (ability to redirect
requests), a security filter, and can act as a gateway to most things.
- The NCSA Server
- This is a small and simple server, with the
ability to act as an annotation server as well. It can also
understand the gopher setup, and can run on top of the same data.
- Plexus
- This is written in Perl (see the entry on ``Perl'' in
the section ``See Also'' for more information on Perl) by Tony Sanders
(email@example.com). It comes with ArchiePlex, an archie gateway (see
the entry on ``archie'' in the section ``See Also'' for more
information on archie) and various calendar, manual page and finger
gateways. It even has a converter from setext to HTML (see the entry
on ``setext'' in the section ``See Also'' for more information on
- gn
- This is a small server, written in C, able to provide both
gopher and HTTP/1.0 access to the same data.
See the entry on ``Obtaining the servers'' in the section ``See Also''
for instructions on obtaining these HTTP servers. The newsgroup
comp.infosystems.www.providers is a good place to ask questions for
help on compilation and setup.
If you are serving hypertext to the Web, you will need to know about
HTML (the HyperText Markup Language) and the converters that exist
between HTML and RTF, LaTeX and others. See the document ``An
Information Provider's Guide to HTML'', posted fortnightly to
comp.infosystems.www.providers, for more information on HTML, and HTTP
servers. The entry on ``HTML'' in the section ``See Also'' has more
information on obtaining this document. Note that you don't need
to know about HTML if you're not serving hypertext.
See Also
Archie is a database of files available via anonymous FTP. You can
specify a filename, part of a filename, or a regular expression, and
archie will give you the names of the computers that have the file
you asked for available via anonymous FTP. For more information on
archie, see the file README available through anonymous FTP in the
directory pub/archie/doc on archie.ans.net.
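The kind of lookup archie performs can be sketched in Python as a search
over a catalogue of host names and file paths. This is a toy model for
illustration, not archie's actual implementation or query protocol. The
catalogue contents, the host names and the function name find_hosts are
all invented for the example.

```python
import re

# Hypothetical catalogue: host name -> files offered via anonymous FTP.
CATALOGUE = {
    "ftp.example.org": ["/pub/www/lynx.tar.Z", "/pub/docs/README"],
    "archive.example.net": ["/mirrors/lynx.tar.Z", "/mirrors/perl.tar.gz"],
}


def find_hosts(pattern, catalogue=CATALOGUE, regex=False):
    """Return the sorted hosts that offer a file whose name matches.

    With regex=False the pattern is treated as part of a filename;
    with regex=True it is treated as a regular expression.
    """
    if regex:
        matches = re.compile(pattern).search
    else:
        def matches(name):
            return pattern in name
    hosts = set()
    for host, paths in catalogue.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]
            if matches(filename):
                hosts.add(host)
    return sorted(hosts)


by_substring = find_hosts("lynx")
by_regex = find_hosts(r"^perl\..*gz$", regex=True)
print(by_substring, by_regex)
```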
MS-DOS users have several choices, depending on their software
installation. Windows users, with an appropriate Winsock-compliant
TCP/IP stack, should use Netscape, WinWeb, Cello or NCSA Mosaic.
PC-NFS users with NIS and a suitably configured NIS server can run any
of the above clients.
Macintosh users should try MacWeb, MacMosaic or MacWWW.
X Window System users should try Netscape or XMosaic --- XMosaic
requires the Motif libraries to compile, but precompiled binaries are
available for many platforms. Netscape is only provided in binary
form. A similar interface is provided by TkWWW --- TkWWW uses the
tcl/tk language and graphics libraries.
Unix users can obtain CERN's simple LineMode Browser. The Lynx
browser is harder to compile but looks better.
VMS users can use the LineMode browser or Lynx, or the VMS WWW browser
(see ``VMS'' for information on obtaining this browser).
Cello is available via anonymous FTP from fatty.law.cornell.edu in the
- CERN Server
The CERN server is available via anonymous FTP from info.cern.ch, in
the directory /pub/www/src/ as WWWLineMode_XXX.tar.Z where XXX is a
gn is available via anonymous FTP from ftp.acns.nwu.edu in the
The gopher software is available via anonymous FTP from
boombox.micro.umn.edu in the /pub/gopher/ directory. The gopher
protocol is documented in RFC 1436 (see the entry ``RFC'' to find out
how to obtain copies of RFCs).
The source for a reliable and useful FTP server is available via
anonymous FTP from ftp.uu.net in the directory
- FTP By Mail
Send e-mail to firstname.lastname@example.org with ``send
usenet/news.answers/finding-sources'' in the body.
HTML, the HyperText Markup Language, is documented in the file
html-spec.txt.Z, available via anonymous FTP from info.cern.ch in the
directory /pub/www/doc/ or via anonymous FTP from ftp.uu.net in the
directory /networking/info-service/www/doc/. The document ``An
Information Provider's Guide to HTML'' is posted fortnightly to the
same Usenet newsgroups as this document, and is available via FTP from
the same places (see the section ``How to obtain this document'' for
- LineMode Browser
The PC-NFS version is available via anonymous FTP from info.cern.ch in
the directory /pub/www/bin/pc-nfs/wwwpcnfs.zip. The source-code
(which compiles under Unix and VMS) is available via anonymous FTP
from info.cern.ch in the directory /pub/www/src/ as
WWWLineMode_XXX.tar.Z, where XXX is a version number.
Lynx requires the ``curses'' full-screen library, and is available via
anonymous FTP from ftp2.cc.ukans.edu.
MacWWW is available via anonymous FTP from info.cern.ch in the
The Mosaic suite are NCSA's clients, and they are leading the field at
the moment. They are all GUI-based, and available via anonymous FTP
from ftp.ncsa.uiuc.edu in /Web/. Versions exist for Macintoshes
(System 7 needed), Microsoft Windows endowed PCs, and X11. The source
to the X version is available, but requires Motif. Precompiled
binaries are available for FTP for a number of architectures, as well
as for the microcomputers.
- NCSA Server
The NCSA server is available via anonymous FTP from ftp.ncsa.uiuc.edu
in the directory /Web/httpd/ both as Unix source code and as a
Microsoft Windows binary.
Try ftp.mcom.com and ftp2.mcom.com in the directory /netscape/.
Mirror sites are available --- if you cannot connect to ftp.mcom.com,
you will be given a list of alternative sites to try.
The Net-News Transfer Protocol (NNTP) is described in RFC 977 (see the
subsection ``RFC'' for information on obtaining RFCs). Several
implementations are available; the latest and most efficient is INN
(available via anonymous FTP from ftp.uu.net in the directory
- Obtaining the servers
See the relevant sections: CERN server, NCSA Server,
gn and Plexus.
Perl is an interpreted language, especially good for text handling.
It is available for anonymous FTP from ftp.uu.net in the directory
/pub/languages/perl/ as perl.tar.gz.
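Entries like this one all follow the same pattern: a host, a directory
and a filename to fetch by anonymous FTP. As a sketch, here is how such a
retrieval might look in Python using the standard library's ftplib
module. The function names retrieve_command and fetch_anonymous are
invented for this example, and the download itself is left commented out
because it needs network access and the 1994 host and path named above
may no longer exist.

```python
from ftplib import FTP


def retrieve_command(filename):
    """Build the FTP command that asks the server to send a file."""
    return f"RETR {filename}"


def fetch_anonymous(host, directory, filename, destination):
    """Download one file by anonymous FTP. Needs network access."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()         # no arguments: logs in as user "anonymous"
        ftp.cwd(directory)  # change to the directory named in the entry
        with open(destination, "wb") as out:
            ftp.retrbinary(retrieve_command(filename), out.write)


# Not executed here; the host and path are the ones quoted in this entry
# and are assumed, not verified, to still be reachable.
# fetch_anonymous("ftp.uu.net", "/pub/languages/perl/", "perl.tar.gz",
#                 "perl.tar.gz")
print(retrieve_command("perl.tar.gz"))
```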
Plexus Version 2.2.1 is available via ftp to austin.bsdi.com, login:
ftp, password: yourid, in plexus/2.2.1/dist/Plexus-2.2.1.tar.Z.
Version 3.0i-beta (pre-release, not for production use) is available
via ftp to austin.bsdi.com, login: ftp, password: yourid, in
- Provider's Guide
The document ``An Information Provider's Guide to Web Servers'' is
posted fortnightly to the same Usenet newsgroups as this document, and
is available via FTP from the same places (see the section ``How to
obtain this document'' for more information).
RFC stands for ``Request for Comments''. Internet RFCs are
documentation of protocols, proposals, pipe-dreams and plans. They
are numbered sequentially from 1, and are available for anonymous FTP
from nic.ddn.mil in the directory /rfc/.
setext stands for Structure Enhanced Text, and is a markup system that
provides a way to format ASCII documents with visually unobtrusive
anchors to parts of it above the paragraph level. More information is
available on the Web as http://www.bsdi.com/setext/
TkWWW is available via anonymous FTP from any X11 site in the contrib/
directory --- TkWWW uses the tcl/tk language and graphics libraries.
It has a rudimentary HTML editing system, and is actively under
The draft URL specification is available via anonymous FTP from
info.cern.ch in the directory /pub/www/doc/ as urlX.txt where X is a
The Hebrew University of Jerusalem have a VMS browser tested under
UCX/Multinet and UCX_APX (Alpha). It uses the VMS/SMG screen routines
and is available via anonymous FTP from www.huji.ac.il in the
The Thinking Machines release of WAIS is available via anonymous FTP
from ftp.uu.net in the directory /networking/info-service/wais/ as
wais-8-b5.1.tar.Z. The CNIDR release of freeWAIS is available via
anonymous FTP from ftp.cnidr.org in the directory /pub/NIDR.tools/ as
ANSI standard Z39.50 is a standard for communication for information
retrieval. The draft specification is available via anonymous FTP
in the same place as either version of the WAIS source. The file is
probably called z3950-spec.txt. ANSI charge for paper copies of the
How to obtain this document
The latest version of this document is always available on the Web as
http://www.vuw.ac.nz/~gnat/ideas/www-primer.html, and the most
recently posted ASCII version will be available via anonymous FTP from
wuarchive.wustl.edu in the directory /doc/misc/www/.
This document is part of a series: ``World Wide Web Primer'', ``An
Information Provider's Guide to HTML'', and ``An Information
Provider's Guide to Web Servers''. The other documents in the series
are available from the archives above.
Please send feedback to the author, Nathan Torkington, at the e-mail
address Nathan.Torkington@vuw.ac.nz --- all discussion will be treated
as public domain and may be used in future versions of this document.