NBS Mass Spectral Database, PC Version 1.02 (Database 1-A). Program by Dr. Stephen E. Stein, National Bureau of Standards, Office of Standard Reference Data, Building 221, Room A-325, Gaithersburg, MD 20899. List Price $750.00



In the fall of 1973 the first version of the Mass Spectral Search System (MSSS) was made publicly available to the scientific community on the General Electric Mark III computer network (1). At that time it consisted of slightly under 9,000 non-unique spectra and required a large timesharing system to store the database and programs. Now, some 15 years later, the database has grown to about 44,000 unique spectra, and the entire system can be searched almost as quickly and flexibly on an IBM PC as it could on that million-dollar computer system.

The National Bureau of Standards (NBS), Office of Standard Reference Data (OSRD), has created an excellent and inexpensive source of mass spectral data and search software for the scientific community. The NBS MS system is well designed to serve the researcher who needs access to a large mass spectral database for spectral identification or as an aid in determining the structure of an unknown from mass spectral data. It is also a very useful tool for the classroom, both for a course on mass spectrometry or spectral interpretation and for any course in organic chemistry.

The database has been described previously (2-4). It consists of about 44,000 spectra, all of them EI (electron impact). Each spectrum has a complete list of masses (m/z values) and intensities, a chemical name for searching (only the first 25 characters), a display option for showing synonyms, the molecular formula (and partial formula), molecular weight, CAS Registry Number, source of the spectrum, and a Quality Index (QI) assigned to the spectrum (5). Regular updates of the database are planned; these will add spectra and replace existing spectra with higher quality data. At present users cannot add their own spectra to the database.

The search software, written in FORTRAN (although that is really irrelevant, as the user never gets near the actual source code), is almost as flexible and extensive as the version of the same database and search system currently available on a number of commercial timesharing systems. The menus are easy to use, and the help messages are indeed helpful. One can search the database by ID number (an internal numbering code found on each spectrum), by CAS Registry Number, by chemical name (using only the first 25 characters, but since very few of us can type more than 25 characters of a chemical name without making an error, this does not seem to be a severe limitation), by molecular formula (partial or complete), by molecular weight, by abundances of 10 major peaks, and by a complete sequential search of the entire database. About the only thing really missing is a search by neutral losses. The results of a search or spectrum look-up can be very quickly displayed (m/z values and abundances) or plotted. The peaks and intensities of 1-Decene are shown printed out in Figure 1 and plotted in Figure 2, and a plot comparing 1-Decene and 3-Decene is shown in Figure 3. The plots in Figures 2 and 3 each took about four minutes to print on an Epson GQ-3500 laser printer, which is what the manual indicated.

Searches can be done either as a quick look-up (less than 1-2 seconds) using the 10 largest peaks or as a sequential search of the whole database (with filters or screens such as elements present or molecular weight), which usually takes under a minute of elapsed time. Some capabilities are not yet as good as one might desire. For example, using the 10 largest peaks of Cholesterol, I was able to retrieve the compound only if I entered the peaks in exactly the same order as found in the database. Considering that the intensities (abundances) of some masses were very close to one another (37.9 % and 37.3 % in one case, and 22.7 %, 21.4 %, and 21.0 % in another), it is not very probable that a spectrum from an instrument and the library spectrum will have exactly the same intensity order. Hence I feel some acceptable margin of difference needs to be built into this module for it to be useful and provide acceptable answers. Sequential searching worked well and no problems were encountered. Molecular formula searches posed a problem, although the behavior is properly explained in the user manual: correct answers came only when the proper upper and lower case was used. For example, neither c7h4brn nor C7H4BRN is found in the database, but C7H4BrN is. A very nice feature of the molecular formula search is that it does not matter in what order the elements are entered, so C7H4NBr will also be found. It would be nice to see some consistency in the ways to exit or quit a module. In some cases the response is Q for quit, in other cases it is E for end. Typing E when Q is mandated, and vice versa, quickly reminds one of this inconsistency.
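The formula behavior described above (case matters, element order does not) can be illustrated with a minimal Python sketch. This is not the NBS program's code, which is in FORTRAN and which I have not seen; the function name formula_key and the parsing rule are assumptions made purely for illustration. Element symbols are read as one upper case letter optionally followed by one lower case letter, and the result is a sorted list of (element, count) pairs that serves as an order-independent key.

```python
import re
from collections import Counter

# One upper case letter, an optional lower case letter, then an optional count.
# Symbols are NOT checked against the periodic table (an assumption of this sketch).
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def formula_key(formula: str) -> tuple:
    """Return an order-independent key for a molecular formula string.

    Hypothetical helper: raises ValueError when the text cannot be split
    into element symbols, e.g. when it is typed entirely in lower case.
    """
    counts = Counter()
    pos = 0
    while pos < len(formula):
        match = _TOKEN.match(formula, pos)
        if match is None:
            raise ValueError(f"cannot parse {formula!r} at position {pos}")
        symbol, digits = match.groups()
        counts[symbol] += int(digits) if digits else 1
        pos = match.end()
    # Sorting the (symbol, count) pairs removes any dependence on entry order.
    return tuple(sorted(counts.items()))


if __name__ == "__main__":
    print(formula_key("C7H4BrN"))
    print(formula_key("C7H4NBr") == formula_key("C7H4BrN"))
    print(formula_key("C7H4BRN") == formula_key("C7H4BrN"))
```

Under this rule C7H4BrN and C7H4NBr give the same key, C7H4BRN is read as separate B and R symbols and so gives a different key, and c7h4brn is rejected outright, which is consistent with the behavior I observed.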

Since the original system was released in September 1987 there have been software updates and improvements, with a number of program errors corrected; thus it is likely that by the time this review is published there will be further features in the system which are not mentioned here. The December 1987 release, Version 1.02, was the one reviewed here. The database and software are provided on floppy disks: 5 1/4 inch 360K or 1.2 MB, or the new 3 1/2 inch 1.44 MB. The number of disks needed ranges from 42 (360K) to "only" 11 (1.44 MB). Loading the database and programs is easy, although it does take at least 15-20 minutes. Thus you should have sufficient storage capacity on your hard disk to leave the database on the system, else searching will be a considerable inconvenience. The system has been nicely modularized so that you can choose which system features you want to have available should you not have all of the needed 14 MB of disk storage available. The minimum system needs are 8 MB of hard disk space, which provides for the search program and the basic database. The system runs best on an AT class machine, a PS/2 machine, or a 386 class machine. The programs require 512K of RAM. EGA, CGA, and Hercules graphics, all of which are optional for the spectrum plotting, can be used. The system is quite flexible in allowing one to specify the disk drive for the data (other than the standard drive C), and allows one to have the database on a different drive from the programs. Data can be printed on Epson and compatible printers, and the system does allow for high resolution plotting using an HP Laserjet+ or equivalent printer. I have installed and tested the system on an IBM AT, an Epson Equity III+, and a number of no-name clones. I had no trouble with any machine I tried. Only my wrist became sore after loading all the floppy disks. The system comes with a xeroxed 12 page manual and a 22 page Appendix, which lacks a good set of examples but is otherwise adequate. Lastly, there is a 32 page Appendix of spectral errors which indicates why spectra on the NBS magnetic tape, which is leased to scientists and instrument manufacturers, are not in the PC version of the database. I would give the technical information and accuracy of the manual high marks.

As for searching the database, the system is quite easy to use and rates a good grade for user friendliness. Searches by CAS Registry Number, ID number, and molecular formula took less than a second of elapsed time on an Epson Equity III+. The hyphens normally found in CAS Registry Numbers are not needed in a search; if you try to enter a hyphen the program does not allow it, so only the actual digits are accepted. I like features like that, which "know" enough to let you enter something in more than one way and still get the right answer. A molecular weight search took 1-3 seconds, depending on what constraints, if any, were placed on the search.
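For readers who keep their own lists of CAS Registry Numbers, the digits-only form is easy to produce. The short Python sketch below is an illustration only, with function names of my own choosing (normalize_cas, cas_check_digit_ok); it is not taken from the NBS software. It strips hyphens and, as an extra that the program is not stated to perform, applies the standard CAS check-digit rule: the last digit equals the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10.

```python
def normalize_cas(text: str) -> str:
    """Return a CAS Registry Number as digits only (hyphens removed).

    Hypothetical helper; a CAS number has 5 to 10 digits in total.
    """
    digits = text.strip().replace("-", "")
    if not (digits.isascii() and digits.isdigit() and 5 <= len(digits) <= 10):
        raise ValueError(f"not a CAS Registry Number: {text!r}")
    return digits


def cas_check_digit_ok(digits: str) -> bool:
    """Apply the standard CAS check-digit rule to a digits-only number."""
    body, check = digits[:-1], int(digits[-1])
    # Weight 1 for the digit next to the check digit, 2 for the next, and so on.
    total = sum(weight * int(d) for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


if __name__ == "__main__":
    cholesterol = normalize_cas("57-88-5")
    print(cholesterol, cas_check_digit_ok(cholesterol))
```

For Cholesterol (57-88-5) the weighted sum is 8 + 16 + 21 + 20 = 65, so the check digit 5 is confirmed.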

The slowest search in the system, a sequential search of the entire database using 10 peaks, took about 5 minutes on a 12 MHz Epson Equity III+ with a 40 MB hard disk. There are bugs in the system, as indicated above, but these are being corrected as they are reported, and users are being sent updated search programs.
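The order sensitivity noted earlier for the quick 10-peak look-up suggests comparing sets of peaks rather than sequences of peaks. The Python sketch below shows one way a sequential pass might do this; it is my own illustration and not the algorithm used by the NBS program, which the manual does not spell out. The names top_peaks and rank_candidates, the min_shared threshold, and the toy spectra are all invented for the example.

```python
def top_peaks(spectrum: dict, n: int = 10) -> set:
    """Return the m/z values of the n most intense peaks, ignoring their order."""
    ranked = sorted(spectrum, key=lambda mz: spectrum[mz], reverse=True)
    return set(ranked[:n])


def rank_candidates(query: dict, library: dict, n: int = 10, min_shared: int = 8) -> list:
    """Scan every library spectrum and keep those sharing enough top peaks.

    query maps m/z to abundance; library maps a compound name to such a map.
    Returns (name, shared_peak_count) pairs, best match first.
    """
    query_peaks = top_peaks(query, n)
    hits = []
    for name, spectrum in library.items():
        shared = len(query_peaks & top_peaks(spectrum, n))
        if shared >= min_shared:
            hits.append((name, shared))
    # Most shared peaks first; ties broken alphabetically for a stable listing.
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


if __name__ == "__main__":
    # Toy data, invented for illustration: not real spectra.
    library = {
        "A": {41: 100.0, 55: 80.0, 70: 79.0, 29: 10.0},
        "B": {43: 100.0, 57: 90.0, 71: 50.0},
    }
    # Same compound as "A", but the two close peaks come out in the other order.
    query = {41: 100.0, 55: 79.0, 70: 80.0, 29: 12.0}
    print(rank_candidates(query, library, n=3, min_shared=3))
```

Because only membership in the top n is compared, two peaks of nearly equal abundance can swap places between the instrument spectrum and the library spectrum without losing the match. Lowering min_shared below n gives the further margin of difference asked for above.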

At present there are no plans for a Macintosh version of the system. CD-ROM is being considered, but clearly it would be a lot slower, require the user to purchase a CD reader, and probably would allow for less frequent updating. Adding structure output to the plots is now underway, and full sub-structure search of the entire database on the PC is probably not too far away. All in all, this is a very impressive, well designed system, and at its low cost it should be bought by every mass spec lab, organic chemistry lab, and organic chemistry/spectroscopy course. Useful, easy to use databases for the PC (and hopefully the Macintosh too) are the wave of the future. See the future now by getting this database and learning how to make effective use of it. It surely will be followed by many others like it in the next few years.

Stephen R. Heller, Agricultural Research Service, Beltsville, MD 20705-2350.







References:

1. S. R. Heller, J. M. McGuire, and W. L. Budde, Envir. Sci. & Tech., 9, 210 (1975).

2. S. R. Heller and G. W. A. Milne, "EPA/NIH Mass Spectral Data Base", in five volumes, part of the NBS' National Standard Reference Data Series of Critical Data Compilations (GPO # SN 003-003-01987-9, NSRDS-NBS 63), 4634 pages, US Government Printing Office (1978 and reprinted 1980).

3. S. R. Heller and G. W. A. Milne, "EPA/NIH Mass Spectral Data Base", Supplement Number One, in two volumes, part of the NBS' National Standard Reference Data Series of Critical Data Compilations (GPO # SN 003-003-02268-3, NSRDS-NBS 63, Suppl. 1, 2151 pages, December 1980).

4. S. R. Heller, G. W. A. Milne, and L. H. Gevantman, "EPA/NIH Mass Spectral Data Base", Supplement Number Two, in two volumes, part of the NBS' National Standard Reference Data Series of Critical Data Compilations (GPO # SN 003-003-02268-3, NSRDS-NBS 63, Suppl. 2, 1110 pages, 1983).

5. J. G. Dillard, S. R. Heller, F. W. McLafferty, G. W. A. Milne, and R. Venkataraghavan, Org. Mass Spectrom., 16, 48-49 (1981).