Never has a technical subject been popularized so quickly
by the press, and adopted so readily by the public, as the information
highway of the future - the Internet (1). With the Internet growing at
some 10-15% per month, trying to write about the resources
available to the analytical chemistry community is like mud
wrestling - a real mess. While there are many changes taking
place in the use of computers in chemistry, particularly in how
chemists work, obtain information and publish (2), this
presentation will emphasize only one - the Internet.
Once upon a time, some two decades ago, the US Department of
Defense's (DOD) Advanced Research Projects Agency (ARPA) created
a computer network to connect DOD researchers around the USA.
From this limited and humble beginning, the Internet has
developed. It is now a nebulous collection of 10,000+
computers hooked together connecting 15 million (plus another
10,000 by the time you finish reading this article) academic,
industrial, and government users of computers throughout most of
the world (1a). And yes, even Siberia is on the network! The
Internet is a resource which would have been science fiction just
a few years ago, but today it is a reality for sharing documents,
data, databases, information, software, and ideas. If you thought
dealing with your local government or any large bureaucracy was
difficult, wait until you venture into the cyberspace of the Internet.
The purpose of this article is both to whet your appetite
for what is available on the Internet and to provide some
very basic information and suggestions on how to access the
Internet and locate and retrieve this information. A complete list
of resources is both impossible to compile (as the Internet resources seem
to grow faster than a cancer cell multiplies and change faster
than the AIDS virus mutates) and too overpowering for
anyone to handle, or for TrAC to publish. The goal of this
article is to provide enough starting points for you to go off
into the Internet (called cyberspace) and explore for yourself.
For further information about the Internet, the recent book by Ed
Krol is a very good starting point (3).
Most universities and government offices are on the
Internet, as well as a growing number of commercial firms (4).
The resources on the Internet are not all free (actually none are
free - someone has to pay for all the connections, the hardware
and software to interface to the other computers), but for all
practical purposes, for better and/or worse, they appear free.
Certainly the bulk of what I will describe here is "effectively"
free. Very valuable commercial database resources for analytical
chemists, such as Chemical Abstracts Service's STN network and DIALOG,
both available on the Internet, require paid accounts to access
these computer systems. Two important points should be made at this
time. First, the resources on the Internet come and go, as local
decisions as to what is available change like the tides.
Secondly, and most importantly, there are no accepted quality
control procedures for Internet resources. Some are good, some
are not. Often there are two versions of the same information,
so be sure you locate the most recent source. Some sources are
copyrighted and should not be accessible or accessed, but they are,
so use proper judgement.
Before going any further, please look at Table 1, where a few critical terms are defined. As electronic information becomes more widely used, there is now a book on how to cite such information properly (5).
Table 1. Condensed Internet Glossary
ARCHIE: A database, originally developed at McGill University in Canada, which keeps track of the millions of files on the over 900 ftp'able computers on the Internet.
E-Mail: Allows one to electronically send and receive messages via computer.
FTP: The file transfer protocol, by which one can request a file (of data or a software program) from a remote computer, or send one to it. Normally one uses the login name "anonymous" and one's Internet address (e.g., "firstname.lastname@example.org") as the password to get into the computer from which you want to get a file.
Gopher: A software package developed at the University of Minnesota that allows someone to "go-for" files and readily search information physically located on other computers on the Internet. (Actually the name comes from the university mascot.)
Internet: A collection of computers located throughout the world which are connected together and support four major services: electronic mail (e-mail), bulletin boards/discussion groups, file transfers between machines (ftp), and remote logins from any machine on the network (telnet).
USENET: A group of computer systems that exchange news. Most of the USENET bulletin boards are of a very general nature, and there are over 3500 of these news groups.
VERONICA: Named for ARCHIE's girlfriend. Veronica, developed at the University of Nevada, helps to access gopher directories and the files they contain.
WAIS: The Wide Area Information Server is a tool to quickly and easily retrieve database information from computers on the Internet. It is likely to be replaced by a similar but more powerful search tool, the WWW (described next).
WWW: The World Wide Web is similar to WAIS, but differs in that it is a
Swiss-developed software search system
allowing one to search databases in a "hypertext" fashion (where data is
directly linked to other data).
Internet addresses are usually given a name, which is an alias for the real numeric address. In as many cases as possible I will try to give both, as some computer system software requires the numeric address, while others are more tolerant of simple names.
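As an illustration of the distinction between names and numeric addresses, the following Python sketch checks which of the two forms a given address string takes. The function name is_numeric_address and the sample address 192.0.2.10 are assumptions made for this example only; they do not come from any of the systems described here.

```python
import ipaddress


def is_numeric_address(host):
    """Return True if `host` is a dotted numeric (IPv4) address, False if it is a name."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        # Not a valid dotted-quad, so treat it as a host name (alias).
        return False


# Hypothetical sample values, for illustration only.
print(is_numeric_address("192.0.2.10"))   # numeric form
print(is_numeric_address("archie.au"))    # name form
```

On a machine with a working network connection, a name can usually be translated into its numeric address with socket.gethostbyname(name) from Python's standard socket module.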
ARCHIE (a shortened version of "ARCHIvE") is a
continuously updated database of about 900 FTP'able sites for
innumerable software items and files. There are a number of
identical ARCHIE database sites around the world:
quiche.cs.mcgill.ca 22.214.171.124 (The original ARCHIE)
archie.funet.fi 126.96.36.199 (Europe)
archie.doc.ic.ac.uk 188.8.131.52 (UK)
archie.au 184.108.40.206 (Australia)
To access any of these systems, log in as "ARCHIE"; no password is required. The two most useful commands are WHATIS (e.g., whatis chemistry) and PROG (e.g., prog ampac). WHATIS locates software by subject, and PROG locates the ftp sites that hold a named program. You will need to use PROG to find one (usually of many) possible ftp addresses.
EXAMPLE OF INTERNET ACCESS VIA ftp
There are a number of chemistry resources on the Internet,
and the best way to find them is from lists of such resources
collected by someone and put on the network. One source of
chemistry related Internet sites can be found on the computer
host machine in Greece called leon.nrcps.ariadne-t.gr (or
220.127.116.11). On this machine there is a file called
"/pub/chemistry/sites.chem" which lists locations of chemistry
related resources. On the same computer is a list of chemistry
related mailing lists, bulletin boards, and news groups, which is
found in a file called "/pub/chemistry/m_lists.chem".
To obtain these files (some of which are simple text or ASCII and some of which are binary) one can simply ftp them. The following example shows how I was able to ftp (connect) to the computer which has the file called "sites.chem" (in this case a computer with a UNIX operating system), transfer that file back to my computer's disk area (a VAX computer with a VMS operating system in my case), and then type out the first few records of this file, showing a number of sites where information can be obtained. What I typed is shown in bold; the remainder is the computer response. (There is a good deal of verbiage, which is typical of such activity.) Basically, all that one does is ftp to the computer you want to access, log in as "anonymous", go to the directory/sub-directory that holds the file you want, and use the command "get" to transfer the file to your computer (it took 58.11 seconds to get a file of 53633 bytes from the computer in Greece). Then, once you log off the remote computer, type the first few lines of the file you just moved to your computer.
The last part of the output in Figure 1 lists file names from the first seven sources, which are machines in England (.uk), Germany (.de), Japan (.jp), Germany (.de), Texas (utexas.edu), Australia (.au), and Michigan (umich.edu). Hopefully the names provided by the authors of the files are descriptive enough to let you know if you are interested in actually obtaining copies of them. It should be noted that different computer systems (VAX, Sun, HP, etc.) often have different commands, so blindly repeating what is shown below at another ftp site may not work, in part or in whole.
asrr.arsusda.gov Wollongong FTP User Process (Version 5.1)
Using 8-bit bytes.
220 leon FTP server (SunOS 4.1) ready.
Name (leon.nrcps.ariadne-t.gr:srheller): anonymous
331 Guest login ok, send ident as password.
230 Guest login ok, access restrictions apply.
250 CWD command successful.
200 PORT command successful.
150 ASCII data connection for /bin/ls (18.104.22.168,3900) (0 bytes).
drwxr-xr-x 2 root 1 512 Apr 30 1993 coll_chem.doc
drwxr-xr-x 2 root 1 512 Jul 19 08:02 ftp.sites.chem
drwxr-xr-x 2 root 1 512 Jul 19 08:01 m_lists.chem
-r--r--r-- 1 root 1 1332 Jul 19 07:59 readme.07-93
drwxr-xr-x 2 root 1 512 Jul 19 08:03 related_fields
226 ASCII Transfer complete.
348 bytes transfered in 1.50 seconds (0.23 Kbytes/second)
250 CWD command successful.
200 PORT command successful.
150 ASCII data connection for /bin/ls (22.214.171.124,3923) (0 bytes).
total 68
-r--r--r-- 1 root 1 14877 Jul 19 08:02 prgms-dtbs.07-93
-r--r--r-- 1 root 1 53633 Jul 19 08:02 sites.07-93
226 ASCII Transfer complete.
146 bytes transfered in 0.62 seconds (0.23 Kbytes/second)
200 PORT command successful.
150 Binary data connection for sites.07-93 (126.96.36.199,3925) (53633 bytes).
226 Binary Transfer complete.
53633 bytes transfered in 58.11 seconds (0.90 Kbytes/second)
$ type sites.07-93;1
Directory: /pub/chemistry/ftp.sites.chem, file sites.07-93
*Note* Refer to file prgms-dtbs.07-93 at directory
~~~~~~ /pub/chemistry/ftp.sites.chem, from this host.
/tex/aston/digests/texline/no12 file chemist.tex
/tex/aston/digests/texline/no6 file chemistry.tex
/msdos/mswindows3/demos file chemwin.zip
/tex file ChemTeX.sh.Z
/pub/mac/graphics file mac-molecule-15.hqx
/pub/mac/graphics file mac-molecule-17.hqx
/pub/mac/graphics file mac-molecule-images.hqx
/pub/mac/graphics/quicktime file macmolecule-movie.hqx
/pub/mac/graphics/quicktime file molecule-movie.hqx
/pub/TeX/dhdurz1 file chemstrt.zoo
/graphics/gif/m file molecule
/micros/mac/info-mac/app file mac-molecule-17.hqx
/micros/mac/info-mac/app file mac-molecule-images.hqx
/micros/mac/info-mac/art file 3d-molecule.hqx
/micros/mac/info-mac/art/qt file macmolecule-movie.hqx
/micros/mac/info-mac/Old/app file mac-molecule-15.hqx
/micros/mac/info-mac/Old/app file mac-molecule-part1.hqx.Z
/micros/mac/info-mac/Old/app file mac-molecule-part2.hqx.Z
/micros/mac/info-mac/Old/art/qt file molecule-movie.hqx
/micros/mac/info-mac/Old/card file chemical-inventory-201.hqx.Z
/micros/mac/info-mac/Old/card file chemical-inventory-21.hqx
/micros/pc/oak/education file chemical.arc
/micros/pc/simtel-20/education file chemical.arc
/usenet/comp.sources.misc/volume2 file molecule.Z
/mac/misc/chemistry file acidsandbasespartone.sit.hqx
/mac/misc/chemistry file acidsandbasesparttwo.sit.hqx
/mac/misc/chemistry file aminoacid.sit.hqx
/mac/misc/chemistry file atomicstructurepartone.sit.hqx
/mac/misc/chemistry file atoms.sit.hqx
/mac/misc/chemistry file ballandstick3.04demo.cpt.hqx
/mac/misc/chemistry file bonding.sit.hqx
/mac/misc/chemistry file chem101.sit.hqx
/mac/misc/chemistry file chemequilibrium.sit.hqx
/mac/misc/chemistry file chemicalkinetics.sit.hqx
/mac/misc/chemistry file chemistrychapter1.sit.hqx
/mac/misc/chemistry file chemquiz.sit.hqx
/mac/misc/chemistry file chemriddles.sit.hqx
/mac/misc/chemistry file crystaltutordemo.cpt.hqx
/mac/misc/chemistry file elements.sit.hqx
/mac/misc/chemistry file elementtwo.sit.hqx
/mac/misc/chemistry file gas.sit.hqx
/mac/misc/chemistry file gasspectra.cpt.hqx
/mac/misc/chemistry file heatofvap.sit.hqx
(The list continues but I stopped the printing at this point.)
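The ftp session above can also be scripted. The following is a minimal Python sketch, assuming the standard ftplib module and a server that still accepts anonymous logins. The function names fetch_file and kbytes_per_second and the placeholder password are assumptions made for illustration; they are not part of the session shown, and the host in the commented usage lines may no longer be reachable.

```python
import io
from ftplib import FTP


def fetch_file(ftp, directory, filename):
    """Log in anonymously, change directory, and return the file contents as bytes.

    `ftp` is an already connected ftplib.FTP object (or anything with the
    same login/cwd/retrbinary methods).
    """
    # Anonymous login; by convention the password is your own e-mail address.
    ftp.login(user="anonymous", passwd="user@example.org")  # placeholder address
    ftp.cwd(directory)
    buffer = io.BytesIO()
    # Binary-mode retrieval, the scripted counterpart of the "get" command.
    ftp.retrbinary("RETR " + filename, buffer.write)
    return buffer.getvalue()


def kbytes_per_second(n_bytes, seconds):
    """Transfer rate in Kbytes/second, taking 1 Kbyte as 1024 bytes."""
    return n_bytes / 1024 / seconds


# Hypothetical usage, following the directory and file names in the session above:
# with FTP("leon.nrcps.ariadne-t.gr") as ftp:
#     data = fetch_file(ftp, "/pub/chemistry/ftp.sites.chem", "sites.07-93")
#     print(data.decode("ascii", errors="replace")[:500])

# The rate reported for the 53633-byte transfer can be checked directly.
print(round(kbytes_per_second(53633, 58.11), 2))
```

Passing the connection in as an argument keeps the function independent of any particular host, so it can be tried against whatever anonymous ftp server is actually available.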
Another source of information, which is likely to increase
in the future, is ftp sites from publishers. One such system
is that of Springer-Verlag. Their machine,
trick.ntp.springer.de, has a variety of information, ranging
from the table of contents of their journals, to demonstration
versions of programs, such as the MOBY molecular modeling system.
Just recently ISI has provided access to The Scientist, via ftp,
at ds.intercis.net in the pub/the-scientist directory. No doubt
other publishers will soon begin to create similar resources.
BULLETIN BOARDS AND LIST SERVERS
The best way to find out more about what is available is to join one (or more) of the many bulletin boards, discussion groups, or list servers which are available on the Internet. A sample of these is listed, in alphabetical order, below in Table 2 (6). To gain access (and often 10-20+ e-mail messages a day, giving a new meaning to the phrase "junk-mail"), just send an e-mail to the Internet address for the list, and as the subject heading and/or content of the mail message type "subscribe". (If you find the discussion group verbiage too much or not of interest, the reverse command is "unsubscribe", either for a period of time (while you are on vacation) or forever.)
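For readers who prefer to script such requests, the following Python sketch composes a subscription message of the kind just described, with the command placed in both the subject heading and the body. The function name build_list_request and both addresses are placeholders assumed for this example; individual list servers often expect a slightly different command, as the entries in Table 2 show.

```python
from email.message import EmailMessage


def build_list_request(list_address, sender, command="subscribe"):
    """Compose a list-server request with the command as both subject and body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    msg["Subject"] = command
    msg.set_content(command)
    return msg


# Placeholder addresses, for illustration only.
request = build_list_request("some-list@example.org", "chemist@example.org")
print(request["Subject"])
print(request.get_content().strip())
```

Actually sending the message is not shown; it would normally go through your local mail system, for example with Python's standard smtplib module. Passing command="unsubscribe" produces the reverse request.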
Table 2. Examples of Chemistry List-servers
ACS COMP Newsletter
The Newsletter of ACS Computer Division (COMP) is now available in electronic form from the Computational Chemistry List archives. To retrieve it, ftp to kekule.osc.edu and look in the pub/chemistry/comp_news directory or mail the command send ./comp_news/vol17.txt from chemistry to email@example.com.
Buckminster Fullerene mailer.
To subscribe, send the message: INTRO or HELP to: firstname.lastname@example.org. To get the bibliography, send the keyword BIBLIO to the same address. The service is maintained by Jack Fischer's group at the University of Pennsylvania.
This fullerene database is accessible via anonymous
ftp at: physics.arizona.edu. login: ftp passwd:your e-mail
address. It is found in: /usr/ftp/asc. The site has a program
called PCBIB that allows searches of the database by keywords.
The resource is based on materials from Professor Richard E.
Smalley's BuckyBall Bibliography. One can telnet and search
for this information at sabio1.library.arizona.edu. Login is
"sabio". Choose "Other databases" from the menu.
Computational Chemistry List.
To subscribe, send the message: send help from
chemistry to: OSCPOST@oscsunb.osc.edu
CHEM-COMP, Computational Chemistry.
To subscribe, send the message: join chem-comp firstname lastname to: email@example.com
CHEMED-L, Chemistry Education Discussion List.
To subscribe, send the message: subscribe chemed-l firstname lastname to: firstname.lastname@example.org
CHMINF-L, the Chemical Information Sources Discussion List.
Indiana University. CHMINF-L may be joined by sending
the message: subscribe chminf-l to:
email@example.com. CHMINF-L covers all information
sources that can be used to answer questions a chemist might
have.
CMTS-L, the Chemical Management and Tracking Systems List.
CMTS-L serves as a forum for the exchange of ideas on the
establishment of computerized systems to manage chemical
inventories. To subscribe, send the message: subscribe CMTS-L
firstname lastname to: firstname.lastname@example.org
CORROS-L, The Corrosion Interest List.
To subscribe, send the message: subscribe corros-l firstname lastname to: email@example.com
FORENS-L, Forensic Sciences Discussion Group.
To subscribe, send the message: subscribe firstname
lastname to: FORENS-REQUEST@ACC.FAU.EDU
HIRIS-L, High Resolution IR Spectroscopy List.
Send the message SUB HIRIS-L firstname lastname
ICS-L, International Chemometrics Society.
To subscribe, send the message: subscribe ICS-L
JACS Supplementary Material; January 1993-.
JACS supplementary material is available as TIFF images,
or when supplied by the author, as ASCII files. Retrieval can be
done by ftp, electronic mail, or direct connection to the ACS
information server. This is not a free service.
List-server for exchange of information on microscale
materials. To subscribe, send the message: subscribe
Molecular Modeling lists:
MOSSBA-L, Mossbauer Spectroscopy, Software & Forum.
Send the message: subscribe mossba-l firstname lastname
REACTIVE: A discussion list about air sampling and monitoring of short-lived pollutants. Send the message: subscribe reactive firstname lastname to: firstname.lastname@example.org
The laboratory safety list can be joined by sending the message: subscribe safety to: email@example.com
Springer-Verlag Journals Preview Service.
Tables of contents, titles (article heads), and summaries (abstracts) of papers from 30 Springer journals in the life sciences and radiology are available three to six weeks before the appearance of the printed version. There is a modest annual fee for the abstract information, but the other data can be had at no cost. To subscribe, send the message: help to: svjps@dhdspri6
SVSERV, Springer-Verlag demonstration files.
The server has new books published by Springer-Verlag, demonstrations of their software and databases, tables of contents, etc. Send the command: subscribe inf to: firstname.lastname@example.org
USENET News Groups: sci.chem; sci.chem.organomet; sci.engr.chem; sci.polymers. The USENET News Groups allow easy access to a variety of topics. Check with your local computer service to see how to access these services.
This quick tour of the Internet has provided a limited
snapshot of what is available. Only by taking the time to
explore the vast number of storehouses of information connected
to the network will you be able to take advantage of the
Internet and make it a useful resource for your everyday
activities. But one thing is certain: this is a resource which
you will not be able to neglect, as electronic information and
data become part of the daily life of the chemist.
1. For example, see a) The New York, page 1, November 3, 1993; b) The Wall Street Journal, page B1, September 3, 1993; c) The New York Times, page C1, May 18, 1993; and d) The Washington Post, Business Section, page 5, May 17, 1993.
2. S. R. Heller, "Chemical Information Activities: What the Future Holds", J. Chem. Inf. Comput. Sci., 33, 284-291 (1993).
3. THE WHOLE INTERNET - User's Guide & Catalog by Ed Krol, O'Reilly & Associates, 103 Morris Street, Suite A, Sebastopol, CA 95472 (Phone: 800-998-9938; FAX: 707-829-0104). The cost of this book is about $25. This company also has a number of other items about the Internet and related subjects. For details write to the company or get their latest product announcements by e-mailing the request "subscribe ora-news firstname lastname organization name" to email@example.com.
4. For those who do not have access to the Internet in their organization, a number of commercial companies are now providing Internet access for a fee. One such service is DELPHI, which has over 600 local dial-up access points in the USA. You can get a free trial of this service by instructing your modem to dial 1-800-365-4636. Press the carriage return (CR) once or twice and use "joindelphi" as the user login name and "cpt31x" as the password. For additional information call 1-800-695-4005.
A similar service is being developed in Europe by EUnet Limited. They can be contacted by phone at 31-20-592-5109 (FAX: 592-5155) or by e-mail at info.eu.net
5. "Electronic Style: A Guide to Citing Electronic Information", Meckler Publishing, 11 Ferry Lane West, Westport, CT 06880. The price of this 65 page book is $15.
6. Much of this information was compiled by Dr. Gary Wiggins (WIGGINS@UCS.INDIANA.EDU) at Indiana University, who runs the chemical information list-server. This list (of over 1400 subscribers) is an excellent place to ask for help on finding chemical information or chemical data sources.