An Experimental International Conversational Mass Spectral Search System

By S. R. HELLER, H. M. FALES, G. W. A. MILNE and R. J. FELDMANN

(National Institutes of Health, Bethesda, Maryland, U.S.A.)

N. R. DALY, D. C. MAXWELL and A. McCORMICK

(The Mass Spectrometry Data Centre, A.W.R.E., Aldermaston, Reading, U.K.)

An interactive, conversational mass spectral search (MSS) system, available over ordinary telephone lines using teletypewriter terminals, has been used by over 200 scientists in the U.S.A. and Canada since 1971. The system is used an average of 25 times per day. It was originally located on a PDP-10 computer in the Division of Computer Research and Technology (D.C.R.T.) at the National Institutes of Health (N.I.H.) in Bethesda, Maryland, and was supported by the National Heart and Lung Institute (N.H.L.I.).

The MSS is one component of a Chemical Information System being developed at D.C.R.T. The system has recently been transferred to the worldwide General Electric (G.E.) timesharing network, which allows it to be used for the cost of a subscription fee, the computer charge, and a local telephone call to any of over 300 cities in Japan, the United States, Canada, Great Britain and nine other countries on the European continent. Further local service telephone facilities are expected. Most cities have both low (10 Hz) and high (30 Hz) speed telephone service available. At this meeting the system will be demonstrated by calling a local Edinburgh telephone number.

Details of the system have been presented elsewhere (1-6); the use, value, and future of the system are highlighted here. The program options at present available on the system operating on the G.E. network include:

1. Peak and intensity search

2. Molecular weight search

3. Complete and partial molecular formula search

4. Peak and molecular weight search

5. Peak and molecular formula search

6. Molecular weight and formula search

7. Dissimilarity index comparison

8. Spectrum printout

9. Automatic and manual microfiche retrieval

10. CRAB - comments and complaints

11. HARVEST - entering of new data

12. NEWS - news of the system

13. MSDC code list

At present, consideration is being given to implementing the display of spectra on graphics terminals. This option, available at N.I.H., was not transferred to the G.E. system because of software and system incompatibilities. In addition, a reverse spectrum search, that is, a search for losses from the parent ion from 0 (parent ion) to -100, is under development. M.S.D.C. codes and Chemical Abstracts Service (CAS) Registry Numbers (REGN) are to be added to the file for future searching by structural and functional groups.
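The idea of searching on losses from the parent ion can be sketched as follows. This is only an illustrative sketch, not the program under development: the function name `neutral_losses`, the representation of a spectrum as a mapping of m/e to relative intensity, and the sample values are all assumptions made for the example.

```python
def neutral_losses(peaks, parent_mass, max_loss=100):
    """Re-express each peak as its loss from the parent ion.

    The parent ion itself maps to 0 and fragments map to negative values;
    losses larger than max_loss are dropped (0 to -100 by default).
    """
    losses = {}
    for mass, intensity in peaks.items():
        loss = mass - parent_mass  # 0 for the parent ion, negative for fragments
        if -max_loss <= loss <= 0:
            losses[loss] = intensity
    return losses


# Hypothetical spectrum (m/e -> relative intensity); values are illustrative only.
spectrum = {120: 40, 105: 100, 77: 60, 15: 5}
print(neutral_losses(spectrum, parent_mass=120))
```

In this sketch the peak at m/e 15 corresponds to a loss of 105 and therefore falls outside the 0 to -100 window.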

FIGURE 1

In general, the response to the system has been favourable, and this positive reaction is the prime reason we have been encouraged to make the system available on such a scale to the scientific community. The main comments made by users during the trial period on the N.I.H. computer have concerned the size and nature of the spectral file.

There are a large number of exact replicate spectra and, in some cases, several similar spectra of the same substance obtained from different sources. For example, there are 7 benzene spectra, 3 hydrogen spectra, 3 thiophene spectra and 7 acetone spectra. Obviously it is desirable to delete exact replicates. However, the consensus of opinion at a workshop session on matching systems held at the A.S.M.S. meeting in San Francisco was that holding several spectra of the same compound obtained under different conditions could be useful, particularly for more complex molecules, e.g. cholesterol, whose spectra may be quite sensitive to instrumental parameters.

In addition to replication of spectra, there are in some cases errors in peak location and intensity. Many spectra were recorded starting at m/e 40, or even as high as m/e 60; because the lower peaks are absent, these spectra are often lost as possible answers. The file is admittedly inadequate in such areas as drugs, steroids, pesticides, organometallics and other biochemical materials in general (amino acids, lipids, sugars, etc.).

It is hoped that many of the deficiencies of the data file will be put right in the very near future and that subscribers to the search system will assist in improving it still further by submitting new spectra, either through the on-line HARVEST option or by sending spectra to the Mass Spectrometry Data Centre at Aldermaston. It should be pointed out that maintaining and improving a data base of this size and complexity is an expensive procedure. In addition to producing the data base, there are costs involved in loading and storing it on the G.E. computer system. It is for these reasons that it is necessary to charge a fairly high subscription rate for use of the system.
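Returning to the replicate spectra discussed above, the distinction between exact replicates (candidates for deletion) and similar spectra of the same substance (possibly worth keeping) can be illustrated with a short sketch. The record layout, the function name `find_exact_replicates` and the sample entries are hypothetical and are not taken from the actual file.

```python
from collections import defaultdict


def find_exact_replicates(entries):
    """Group entries whose name and full peak list are identical."""
    groups = defaultdict(list)
    for entry_id, name, peaks in entries:
        # Identical name and identical (m/e, intensity) pairs give the same key.
        key = (name, tuple(sorted(peaks.items())))
        groups[key].append(entry_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical entries: (identifier, name, {m/e: relative intensity}).
ENTRIES = [
    (1, "substance X", {78: 100, 52: 20}),
    (2, "substance X", {78: 100, 52: 20}),  # exact replicate of entry 1
    (3, "substance X", {78: 100, 52: 17}),  # similar spectrum, different source
    (4, "substance Y", {58: 100, 43: 90}),
]
print(find_exact_replicates(ENTRIES))
```

Only entries 1 and 2 are grouped; entry 3 differs in one intensity and is therefore retained as a separate spectrum of the same substance.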

The remainder of this paper is directed towards examples of some of the search options. Figures 1 and 2 are examples of the peak and intensity search option and are designed to show that only a few peaks are usually needed to narrow the possible answers down to just a few. Indeed, in Fig. 2, three peaks lead to only one answer.
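The narrowing effect of successive peaks can be sketched as follows. This is a minimal illustration and not the MSS program itself: it assumes each query is an m/e value with an acceptable intensity range, and the small file `SPECTRA`, the function `peak_search` and all values are hypothetical.

```python
# Hypothetical file: name -> {m/e: relative intensity}; values are illustrative only.
SPECTRA = {
    "compound A": {43: 100, 58: 30, 15: 20},
    "compound B": {43: 100, 71: 45, 27: 35},
    "compound C": {78: 100, 52: 20, 51: 18},
    "compound D": {43: 80, 58: 100, 71: 10},
}


def peak_search(candidates, mass, lo, hi):
    """Keep candidates whose intensity at the given m/e lies within lo..hi."""
    return {
        name: peaks
        for name, peaks in candidates.items()
        if lo <= peaks.get(mass, 0) <= hi  # an absent peak counts as intensity 0
    }


# Each additional peak is applied to the survivors of the previous step.
hits = SPECTRA
for mass, lo, hi in [(43, 70, 100), (58, 20, 100), (71, 5, 100)]:
    hits = peak_search(hits, mass, lo, hi)
    print(f"m/e {mass}, intensity {lo}-{hi}: {len(hits)} candidate(s)")
```

With these illustrative values the three peaks reduce the four hypothetical entries to three, then two, then one.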

Figure 3 shows the molecular weight search for a molecular weight of 151, and indicates the presence of possible duplicate spectra.

An example of one of the combination search options, the molecular weight and peak search, is shown in Fig. 4. In this search only one peak, along with the molecular weight, was needed to narrow the possible answers to three.
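A combined search of this kind can be sketched as a molecular weight filter followed by a peak filter. The `Entry` record, the function `mw_and_peak_search` and the sample entries are assumptions made for illustration; they do not reproduce the file or the output of Fig. 3 or Fig. 4.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    mol_weight: int
    peaks: dict  # m/e -> relative intensity


# Hypothetical entries; values are illustrative only.
ENTRIES = [
    Entry("compound A", 151, {109: 100, 151: 40}),
    Entry("compound B", 151, {136: 100, 151: 70}),
    Entry("compound C", 151, {109: 60, 80: 100}),
    Entry("compound D", 152, {109: 100, 152: 50}),
]


def mw_and_peak_search(entries, mol_weight, mass, min_intensity=1):
    """Return entries with the given molecular weight that also show a peak at mass."""
    return [
        e for e in entries
        if e.mol_weight == mol_weight and e.peaks.get(mass, 0) >= min_intensity
    ]


by_mw = [e for e in ENTRIES if e.mol_weight == 151]   # molecular weight alone
combined = mw_and_peak_search(ENTRIES, 151, 109)      # molecular weight plus one peak
print(len(by_mw), [e.name for e in combined])
```

The single added peak removes one of the three entries that share the molecular weight, which is the narrowing the combination options are intended to provide.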

The complete molecular formula search option is shown in Fig. 5. Again the presence of more than one spectrum for the same compound is evident throughout the list.
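The difference between a complete and a partial molecular formula search can be sketched as follows. The interpretation of "partial" used here (the elements named in the query must match exactly, while any other elements are unconstrained) is an assumption, as are the function names and the sample entries.

```python
import re


def parse_formula(formula):
    """Parse a formula such as 'C6H5Cl' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def formula_search(entries, query, partial=False):
    """Complete search: all counts equal. Partial search: queried elements equal."""
    wanted = parse_formula(query)
    hits = []
    for name, formula in entries:
        have = parse_formula(formula)
        if partial:
            matched = all(have.get(el) == n for el, n in wanted.items())
        else:
            matched = have == wanted
        if matched:
            hits.append(name)
    return hits


# Hypothetical entries; two share a formula, as replicate spectra would.
ENTRIES = [
    ("entry 1", "C6H6"),
    ("entry 2", "C6H6"),
    ("entry 3", "C6H5Cl"),
    ("entry 4", "C6H6O"),
]
print(formula_search(ENTRIES, "C6H6"))
print(formula_search(ENTRIES, "C6H6", partial=True))
```

In the complete search the two entries with identical formulas both appear, mirroring the way replicate spectra show up in the list; the partial search additionally admits the entry containing oxygen.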

After obtaining answers from searches such as those illustrated, most users wish to see the spectrum from the file for visual comparison. Since the spectra are all stored on-line on direct access discs, this is a simple matter, and Fig. 6 shows a sample spectrum printout for the simple spectrum of HBr. In addition to this on-line spectrum printout, microfiches of the file have been computer generated for viewing and are expected to be made available at cost to registered users of the system.



FIGURE 6

In summary, the MSS offers a number of options for searching a large on-line data base, now available 24 hours a day, seven days a week on the G.E. computer network. With the support of mass spectrometrists the system should improve in depth, providing a valuable tool for the mass spectral information needs of the worldwide scientific community.

REFERENCES

1. Heller, S. R., 'Conversational Mass Spectral Retrieval System and Its Use as an Aid in Structure Determination', Anal. Chem., 1972, 44, 1951.

2. Heller, S. R., Fales, H. M. and Milne, G. W. A., 'An Interactive Mass Spectral Search System', J. Chem. Ed., 1972, 49, 725.

3. Heller, S. R., Fales, H. M. and Milne, G. W. A., 'A Conversational Mass Spectral Search and Retrieval System. II. Combined Search Options', Org. Mass Spectrom., 1972, 7, 107.

4. Heller, S. R., Koniver, D. A. and Milne, G. W. A., 'A Conversational Mass Spectral Search System. III. Display and Plotting of Spectra and Dissimilarity Comparisons', Anal. Chem., submitted.

5. Heller, S. R., Feldmann, R. J., Fales, H. M. and Milne, G. W. A., 'A Conversational Mass Spectral Search System. IV. The Evolution of a System for the Retrieval of Mass Spectral Information', J. Chem. Soc., submitted.

6. Heller, S. R., 'DCRT/CIS Mass Spectral Search System User's Manual', November 1972, D.C.R.T., N.I.H., Bethesda, Maryland.