An Experimental International Conversational Mass Spectral Search
System
By S. R. HELLER, H. M. FALES, G. W. A. MILNE and
R. J. FELDMANN
(National Institutes of Health, Bethesda, Maryland, U.S.A.)
N. R. DALY, D. C. MAXWELL and A. McCORMICK
(The Mass Spectrometry Data Centre, A.W.R.E., Aldermaston,
Reading, U.K.)
An interactive, conversational mass spectral search (MSS) system,
available over ordinary telephone lines using teletypewriter
terminals, has been used by over 200 scientists in the U.S.A. and
Canada since 1971. The system is used an average of 25 times per
day. It was originally located on a PDP-10 computer in the
Division of Computer Research and Technology (D.C.R.T.) at the
National Institutes of Health (N.I.H.) in Bethesda, Maryland, and
supported by the National Heart and Lung Institute (N.H.L.I.).
The MSS is one component of a Chemical Information System being
developed at D.C.R.T. The system has recently been transferred to
the worldwide General Electric (G.E.) timesharing network, which
allows it to be used for the cost of a subscription fee,
the computer charge, and a local telephone call in any of over
300 cities in Japan, the United States, Canada, Great Britain and
nine other countries on the European continent. Further local
service telephone facilities are expected. Most cities have both
low (10 Hz) and high (30 Hz) speed telephone service available.
At this meeting the system will be demonstrated by calling a
local Edinburgh telephone number.
Details of the system have been presented elsewhere (1-6) and the
use, value, and future of the system will be highlighted here.
The present program options on the system operating on the G.E.
network include:
1. Peak and intensity search
2. Molecular weight search
3. Complete and partial molecular formula search
4. Peak and molecular weight search
5. Peak and molecular formula search
6. Molecular weight and formula search
7. Dissimilarity index comparison
8. Spectrum printout
9. Automatic and manual microfiche retrieval
10. CRAB - comments and complaints
11. HARVEST - entering of new data
12. NEWS - news of the system
13. MSDC code list
At present, consideration is being given to implementing the
display of spectra on graphics terminals. This option, available
at N.I.H., was not transferred to the G.E. system due to software
and system incompatibilities. In addition, a reverse spectrum
search, that is, a search for losses from the parent ion, from 0
(the parent ion) to -100, is under development. M.S.D.C. codes and
Chemical Abstracts Service (CAS) Registry Numbers (REGN) are to
be added to the file for future searching by structural and
functional groups.
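The idea behind the reverse spectrum search can be sketched as follows. This is only a minimal illustration, assuming that each peak is re-expressed as its offset from the parent ion and that offsets between 0 and -100 are retained; the function name, the data layout and the peak values are hypothetical and are not taken from the MSS programs or file.

```python
def neutral_losses(peaks, parent_mass, max_loss=100):
    """Re-express each peak as its offset from the parent ion.

    peaks: list of (m/e, intensity) pairs.
    Returns (offset, intensity) pairs with offset in [-max_loss, 0],
    where 0 is the parent ion itself, ordered from 0 downwards.
    """
    losses = []
    for mz, intensity in peaks:
        offset = mz - parent_mass
        if -max_loss <= offset <= 0:
            losses.append((offset, intensity))
    return sorted(losses, reverse=True)

# Illustrative peak list only (assumed values, not from the MSS file).
example = [(43, 100), (58, 30), (15, 20)]
losses = neutral_losses(example, parent_mass=58)
```

Here the peak at m/e 43 becomes a loss of 15 from a parent ion at m/e 58, so that spectra sharing the same losses could be compared even when their parent ions differ.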
FIGURE 1
In general, the response to the system has been favourable, and
this positive reaction is the prime reason for making
the system available on such a scale to the scientific
community. The main comments which users have made about the
system during its trial period on the N.I.H. computer have been
concerned with the size and nature of the spectral file. There
are a large number of exact replicate spectra, and in addition
there are in some cases several similar spectra of the same
substance obtained from different sources. For example, there are
7 benzene spectra, 3 hydrogen spectra, 3 thiophene spectra and 7
acetone spectra. Obviously it is desirable to delete exact
replicates. However, the consensus of opinion at a workshop
session on matching systems held at the A.S.M.S. meeting in San
Francisco was that holding several spectra of the same compound
obtained under different conditions could be useful, particularly
for more complex molecules, e.g. cholesterol, whose spectra may
be quite sensitive to instrumental parameters. In addition to
replication of spectra there are in some cases errors in peak
location and intensity. Many spectra were recorded starting at
m/e 40 or even as high as m/e 60; because their lower-mass peaks
are absent from the file, these spectra are often lost as
possible answers. The file is admittedly
inadequate in such areas as drugs, steroids, pesticides,
organometallics and other biochemical materials in general (amino
acids, lipids, sugars, etc.). It is hoped that many of the
deficiencies of the data file will be put right in the very near
future and that subscribers to the search system will assist in
still further improving it by submitting new spectra either
through the on-line HARVEST option or by sending spectra to
the Mass Spectrometry Data Centre at Aldermaston. It
should be pointed out that maintaining and improving a
data base of this size and complexity is an expensive
procedure. In addition to producing the data base there
are costs involved in loading and storing it on the G.E.
computer system. It is for these reasons that it is
necessary to charge a fairly high subscription rate for
use of the system.
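As an illustration of the distinction drawn above between exact replicates and merely similar spectra of the same substance, the following minimal sketch groups entries whose peak lists are identical. The data layout, identifiers and peak values are assumptions made for the example; this is not the procedure actually used to maintain the file.

```python
from collections import defaultdict

def find_exact_replicates(spectra):
    """Group spectrum ids whose peak lists are identical.

    spectra: dict mapping a spectrum id to a list of (m/e, intensity) pairs.
    Returns a list of id groups (each sorted) with more than one member.
    """
    groups = defaultdict(list)
    for spectrum_id, peaks in spectra.items():
        key = tuple(sorted(peaks))  # order-independent fingerprint
        groups[key].append(spectrum_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]

# Hypothetical entries: A and B are exact replicates, C is only similar.
spectra = {
    "A": [(78, 100), (77, 20), (52, 15)],
    "B": [(52, 15), (77, 20), (78, 100)],
    "C": [(78, 100), (77, 25), (51, 18)],
}
replicates = find_exact_replicates(spectra)
```

Only A and B are reported. Entry C differs in intensity and peak position and would be kept, in line with the view that several spectra of one compound obtained under different conditions can be useful.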
The remainder of this paper will be directed towards
examples of some of the search options. Figures 1 and 2
are examples of the peak and intensity search option and
are designed to show that only a few peaks are usually
needed to narrow the number of possible answers down to
just a few. Indeed, in Fig. 2, three peaks lead to only one
answer.
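The narrowing effect of successive peaks can be sketched in a few lines. This is a minimal illustration only: it assumes that each query gives an m/e value with an acceptable intensity range, and the four-entry library, its identifiers and its intensities are invented for the example rather than drawn from the MSS file or from Figs. 1 and 2.

```python
def peak_search(library, queries):
    """Return ids of spectra that contain every queried peak.

    library: dict mapping spectrum id to {m/e: intensity}.
    queries: list of (m/e, min_intensity, max_intensity) triples.
    """
    hits = []
    for spectrum_id, peaks in library.items():
        if all(mz in peaks and lo <= peaks[mz] <= hi for mz, lo, hi in queries):
            hits.append(spectrum_id)
    return sorted(hits)

# Hypothetical library and queries.
library = {
    "S1": {43: 100, 58: 30, 15: 20},
    "S2": {43: 100, 71: 40, 86: 10},
    "S3": {43: 60, 58: 100, 29: 35},
    "S4": {57: 100, 43: 80, 29: 50},
}
queries = [(43, 50, 100), (58, 20, 100), (15, 10, 100)]

# Number of candidate answers after entering one, two and three peaks.
counts = [len(peak_search(library, queries[:n])) for n in (1, 2, 3)]
```

In this toy case the candidates fall from four to two to one as peaks are added. The sketch also shows why a spectrum recorded from m/e 40 upwards cannot be returned for a query on a lower-mass peak: the peak is simply absent from its entry.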
Figure 3 shows the molecular weight search for a molecular
weight of 151, and indicates the presence of possible
duplicate spectra.
An example of one of the combination search options, the
molecular weight and peak search, is shown in Fig. 4. In
this search only one peak along with the molecular weight
was needed to narrow the possible answers to three.
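A combination search of this kind can be sketched as a molecular weight filter followed by a single peak test. The record layout, identifiers and values below are assumptions made for illustration and do not reproduce the entries of Fig. 4 or the internal organisation of the file.

```python
def mw_and_peak_search(library, mol_weight, mz, min_intensity, max_intensity):
    """Return ids of entries with the given molecular weight and peak.

    library: dict mapping entry id to {"mw": int, "peaks": {m/e: intensity}}.
    """
    hits = []
    for entry_id, entry in library.items():
        if entry["mw"] != mol_weight:
            continue  # molecular weight filter
        intensity = entry["peaks"].get(mz)
        if intensity is not None and min_intensity <= intensity <= max_intensity:
            hits.append(entry_id)
    return sorted(hits)

# Hypothetical entries.
library = {
    "E1": {"mw": 151, "peaks": {109: 100, 151: 40}},
    "E2": {"mw": 151, "peaks": {136: 100, 151: 60}},
    "E3": {"mw": 151, "peaks": {109: 90, 151: 35}},
    "E4": {"mw": 150, "peaks": {109: 100, 150: 20}},
}
by_weight = [k for k, v in library.items() if v["mw"] == 151]
answers = mw_and_peak_search(library, 151, 109, 50, 100)
```

The molecular weight alone leaves three candidates in this toy library, and one additional peak reduces them to two.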
The complete molecular formula search option is shown in
Fig. 5. Again the presence of more than one spectrum for
the same compound is evident throughout the list.
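The difference between a complete and a partial formula search can be illustrated as below. The sketch assumes that a partial search matches every entry whose counts agree for the elements specified, whatever other elements are present; this interpretation, the simple formula parser and the four entries are all assumptions and may differ from the behaviour of the MSS option.

```python
import re

def parse_formula(formula):
    """Parse a simple formula such as 'C8H9NO2' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts

def formula_search(library, query, partial=False):
    """Return ids of entries whose formula matches the query.

    Complete search: all element counts must be equal.
    Partial search (assumed meaning): only the queried elements are checked.
    """
    wanted = parse_formula(query)
    hits = []
    for entry_id, formula in library.items():
        have = parse_formula(formula)
        if partial:
            match = all(have.get(el) == n for el, n in wanted.items())
        else:
            match = have == wanted
        if match:
            hits.append(entry_id)
    return sorted(hits)

# Hypothetical entries; F1 and F2 stand for two spectra of one compound.
library = {"F1": "C8H9NO2", "F2": "C8H9NO2", "F3": "C8H9NO", "F4": "C7H5NO3"}
complete = formula_search(library, "C8H9NO2")
partial = formula_search(library, "C8H9", partial=True)
```

The complete search returns both F1 and F2, which mirrors the appearance of more than one spectrum for the same compound in the list, while the partial search also admits F3.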
After obtaining answers from searches such as those
illustrated, most users wish to see the spectrum from the
file for visual comparison. Since the spectra are all
stored on-line on direct access discs, this is a simple
matter, and Fig. 6 shows a sample spectrum printout for
the simple spectrum of HBr. In addition to this on-line
spectrum printout, microfiches of the file have been
computer generated for viewing and are expected to be made
available at cost to registered users of the system.
FIGURE 6
In summary, the MSS offers a number of options for searching a large on-line data
base, now available 24 hours a day, seven days a week on the G.E. computer
network. With the support of mass spectrometrists the system should improve in
depth, providing a valuable tool for the mass spectral information needs of the
worldwide scientific community.
REFERENCES
1. Heller, S. R., 'Conversational Mass Spectral Retrieval System and Its Use as an Aid in Structure
Determination', Anal. Chem., 1972, 44, 1951.
2. Heller, S. R., Fales, H. M. and Milne, G. W. A., 'An Interactive Mass Spectral Search System', J.
Chem. Ed., 1972, 49, 725.
3. Heller, S. R., Fales, H. M. and Milne, G. W. A., 'A Conversational Mass Spectral Search and
Retrieval System. II. Combined Search Options', Org. Mass Spectrom., 1972, 7, 107.
4. Heller, S. R., Koniver, D. A. and Milne, G. W. A., 'A Conversational Mass Spectral Search System.
III. Display and Plotting of Spectra and Dissimilarity Comparisons', Anal. Chem., submitted.
5. Heller, S. R., Feldmann, R. J., Fales, H. M. and Milne, G. W. A., 'A Conversational Mass Spectral
Search System. IV. The Evolution of a System for the Retrieval of Mass Spectral Information', J.
Chem. Soc., submitted.
6. Heller, S. R., 'DCRT/CIS Mass Spectral Search System User's Manual', November 1972, D.C.R.T., N.I.H., Bethesda, Maryland.