NBS Mass Spectral Database, PC Version 1.02 (Database 1-A).
Program by Dr. Stephen E. Stein, National Bureau of Standards,
Office of Standard Reference Data, Building 221, Room A-325,
Gaithersburg, MD 20899. List Price $750.00
In the fall of 1973 the first version of the Mass Spectral
Search System (MSSS) was made publicly available to the scientific
community on the General Electric Mark III computer network (1).
At that time it consisted of slightly under 9,000 non-unique
spectra and required a large timesharing system to store the
database and programs. Now, some 15 years later, the database has
increased to about 44,000 unique spectra, and the entire system can
be searched almost as quickly and flexibly on an IBM PC as it
could be on that million-dollar computer system.
The National Bureau of Standards (NBS), Office of Standard
Reference Data (OSRD) has created an excellent and inexpensive
source of mass spectral data and search software for the scientific
community. The NBS MS system is well designed to serve the
researcher who needs access to a large mass spectral database for
spectral identification or as an aid in structure determination of
an unknown from mass spectral data. It is also a very useful tool
for the classroom, both for a course on mass spectrometry or
spectral interpretation and for any course in organic
chemistry.
The database, whose details have been described previously (2-4), consists of about 44,000 spectra, all of them EI (electron impact) spectra.
Each spectrum has a complete list of masses (m/z values) and
intensities, a chemical name (only the first 25 characters) for
searching as well as a display option for showing synonyms,
molecular formula (and partial formula), molecular weight, CAS
Registry Number, source of the spectrum, and a Quality Index (QI)
assigned to the spectrum (5). Regular updates of the database are
planned; these will add spectra and replace existing spectra with
higher quality data. At present users cannot add their own spectra
to the database.
The search software, written in FORTRAN (although that is
really irrelevant as the user never gets near the actual source
code) is almost as flexible and extensive as the same database and
search system which is currently available on a number of
commercial timesharing systems. The menus are easy to use, and
the help messages are indeed helpful. One can search the database
by ID number (an internal numbering system code found on each
spectrum), by the CAS Registry Number, by the chemical name (using
only the first 25 characters, but since very few of us can type
more than 25 characters of a chemical name without making an error this
does not seem to be a severe limitation), by molecular formula
(partial or complete), by molecular weight, by abundances of 10
major peaks, and by a complete sequential search of the entire
database. About the only thing really missing is a search by
neutral losses. The results of a search or spectrum look-up can
be very quickly displayed (m/z values and abundances) or plotted.
The peaks and intensities of 1-Decene, printed out and plotted,
are shown in Figures 1 and 2, respectively, and a plot
comparing 1-Decene and 3-Decene is shown in Figure 3. The plots
shown in Figures 2 and 3 each took about four minutes (which is
what the manual indicated it would take) to be printed on an Epson
GQ-3500 laser printer.
Searches can be done either as a quick look-up using the 10 largest peaks, which takes no more than 1-2 seconds, or as a sequential search of the whole database (with filters or screens such as elements present or molecular weight), which usually takes under a minute of elapsed time. Some capabilities are not yet as good as one might desire. For example, using the 10 largest peaks of Cholesterol, I was able to retrieve the compound only if I entered the peaks in exactly the same order as found in the database. Considering that the intensities (abundances) of some masses were very close to one another (37.9% and 37.3% in one case, and 22.7%, 21.4%, and 21.0% in another), it is not very probable that a spectrum from an instrument and the one in the library will have exactly the same intensity order. Hence I feel some acceptable margin of difference needs to be built into this module for it to be useful and provide acceptable answers. Sequential searching worked well, and no problems were encountered. Molecular formula searches posed a problem, although the behavior is properly explained in the user manual: correct answers came only when upper and lower case were entered properly. For example, neither c7h4brn nor C7H4BRN is found in the database, but C7H4BrN is. A very nice feature of the molecular formula search is that it does not matter in what order the elements are entered, so that C7H4NBr will also be found. It would be nice to see some consistency in the ways to exit or quit a module. In some cases the response is Q for quit; in other cases it is E for end. Typing E when Q is mandated, or vice versa, quickly reminds one of this inconsistency.
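The margin of difference suggested above could take several forms. The following is a minimal Python sketch of one of them; it is not the algorithm used by the NBS program. It contrasts a strict comparison of rank order with a comparison that accepts a match when enough of the largest peaks are shared and their abundances agree within a tolerance. The function names, the 5-point tolerance, the minimum number of shared peaks, and the sample spectra (including their m/z values) are all hypothetical; only the near-equal abundance values are taken from the example in the text.

```python
def top_peaks(spectrum, n=10):
    """Return the n most abundant peaks as a dict {m/z: abundance}."""
    ranked = sorted(spectrum.items(), key=lambda peak: peak[1], reverse=True)
    return dict(ranked[:n])


def strict_match(unknown, reference, n=10):
    """True only if the n largest peaks occur in exactly the same rank order."""
    return list(top_peaks(unknown, n)) == list(top_peaks(reference, n))


def tolerant_match(unknown, reference, n=10, tolerance=5.0, min_shared=8):
    """True if at least min_shared of the top peaks are common to both spectra
    and their abundances (percent of base peak) differ by <= tolerance."""
    u, r = top_peaks(unknown, n), top_peaks(reference, n)
    shared = [mz for mz in u if mz in r and abs(u[mz] - r[mz]) <= tolerance]
    return len(shared) >= min_shared


# Hypothetical six-peak spectra: the m/z values are made up for illustration.
library = {43: 100.0, 55: 37.9, 57: 37.3, 81: 22.7, 95: 21.4, 107: 21.0}
measured = {43: 100.0, 55: 37.1, 57: 38.2, 81: 21.2, 95: 21.9, 107: 22.5}
unrelated = {41: 100.0, 69: 80.0, 83: 60.0, 97: 40.0, 111: 30.0, 125: 20.0}

print(strict_match(measured, library, n=6))                    # rank order differs
print(tolerant_match(measured, library, n=6, min_shared=5))    # within tolerance
print(tolerant_match(unrelated, library, n=6, min_shared=5))   # no shared peaks
```

In this sketch the measured spectrum fails the strict comparison because two pairs of near-equal peaks swap rank, yet it passes the tolerant comparison. Suitable tolerance and threshold values would have to be chosen from real instrument data.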
Since the original system was released in September 1987 there
have been software updates and improvements, with a number of
program errors corrected. It is thus likely that by the time this
review is published there will be further features in the system which
are not mentioned here. The December 1987 release, Version 1.02,
was the one reviewed here. The database and software are provided
on floppy disks - the 5 1/4 inch 360K and 1.2 MB, or the new 3 1/2
inch 1.44 MB. The number of disks needed ranges from 42 (360K) to
"only" 11 for the 1.44 MB. Loading the database and programs is
easy, although it does take at least 15-20 minutes. Thus you
should have sufficient storage capacity on your hard disk to leave
the database on the system, else searching will be a considerable
inconvenience. The system has been nicely modularized so that you
can choose which system features you want to have available should
you not have all of the needed 14 MB of disk storage available.
The minimum system needs are 8 MB of hard disk space, which
provides for the search program and the basic database. The system
runs best on an AT class machine, a PS/2 machine, or a 386 class
machine. The programs require 512K of RAM. EGA, CGA, and
Hercules graphics, all of which are optional for the spectrum
plotting, can be used. The system is quite flexible in allowing
one to specify the disk drive for the data (other than the standard
drive C), and allows one to have the database on a different drive
from the programs. Data can be printed on Epson and compatible
printers, and the system does allow for high resolution plotting
using an HP Laserjet+ or equivalent printer. I have installed and
tested the system on an IBM AT, Epson Equity III+, and a number of
no-name clones. I had no trouble with any machine I tried. Only
my wrist became sore after loading all the floppy disks. The
system comes with a xeroxed 12 page manual, and a 22 page Appendix
which lacks a good set of examples, but is otherwise adequate.
Lastly, there is a 32 page Appendix of spectral errors which
indicates why spectra on the NBS magnetic tape, which is leased to
scientists and instrument manufacturers, are not in the PC version
of the database. I would give the technical information and
accuracy of the manual high marks.
As for searching the database, the system is quite easy to use and rates a good grade for user friendliness. Searches by CAS Registry Number, ID number, and molecular formula took less than a second of elapsed time on an Epson Equity III+. The hyphens normally found in CAS Registry Numbers are not needed in a search; if you try to enter a hyphen, the program does not allow it, so only the actual digits are accepted. I like features like that which "know" enough to let you enter something in more than one way and still get the right answer. A molecular weight search took 1-3 seconds, depending on what, if any, constraints were placed on the search.
The slowest search in the system, a sequential search of the entire
database using 10 peaks, took about 5 minutes using an Epson Equity
III+ with a 40MB hard disk operating at a clock speed of 12MHz.
There are bugs in the system, as indicated above, but these are
being corrected as they are reported, and users are being sent
updated search programs.
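The handling of CAS Registry Number input mentioned above can be illustrated with a short Python sketch. It is not taken from the NBS program, which accepts digits only; it shows one way a front end could accept a number typed with or without hyphens, reduce it to digits, and verify it with the standard CAS check-digit rule (the last digit equals the weighted sum of the preceding digits, counted from the right with weights 1, 2, 3, ..., modulo 10). The function names are hypothetical.

```python
def normalize_cas(text):
    """Strip hyphens and spaces from a CAS Registry Number; return the digits."""
    digits = text.replace("-", "").replace(" ", "")
    # A CAS RN has 2-7 digits, then 2 digits, then 1 check digit (5-10 in all).
    if not (digits.isascii() and digits.isdigit() and 5 <= len(digits) <= 10):
        raise ValueError(f"not a CAS Registry Number: {text!r}")
    return digits


def cas_check_digit_ok(digits):
    """Verify the final check digit of a hyphen-free CAS Registry Number."""
    body, check = digits[:-1], int(digits[-1])
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def format_cas(digits):
    """Re-insert the conventional hyphens for display."""
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


# Cholesterol (57-88-5) and 1-Decene (872-05-9), entered in two styles.
for entry in ("57-88-5", "57885", "872-05-9", "872059"):
    digits = normalize_cas(entry)
    print(entry, "->", digits, format_cas(digits), cas_check_digit_ok(digits))
```

Validating the check digit before searching would catch most single-digit typing errors, which a plain digits-only input field cannot do.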
At present there are no plans for a Macintosh version of the
system. CD-ROM is being considered, but clearly it would be a lot
slower, require the user to purchase a CD reader, and probably
would allow for less frequent updating. Adding structure output to
the plots is now underway, and full sub-structure search of the
entire database on the PC is probably not too far away. All in
all, this is a very impressive, well designed system, and for the
low cost, this should be bought by every mass spec lab, organic
chemistry lab, and organic chemistry/spectroscopy course. Useful,
easy to use databases for the PC (and hopefully the Macintosh too)
are the wave of the future. See the future now by getting this
database and learning how to make effective use of it. It surely will be
followed by many others like it in the next few years.
Stephen R. Heller, Agricultural Research Service, Beltsville, MD
20705-2350.
References:
1. S. R. Heller, J. M. McGuire, and W. L. Budde, Envir. Sci. &
Tech., 9, 210(1975).
2. S. R. Heller and G. W. A. Milne, "EPA/NIH Mass
Spectral Data Base", in five volumes, part of the
NBS' National Standard Reference Data Series of
Critical Data Compilations (GPO# SN 003-003-01987-9,
#NSRDS-NBS 63), 4634 pages, US Government Printing
Office (1978 and reprinted 1980).
3. S. R. Heller and G. W. A. Milne, "EPA/NIH Mass
Spectral Data Base", Supplement Number One, in two
volumes, part of the NBS' National Standard Reference
Data Series of Critical Data Compilations, (GPO # SN
003-003-02268-3, NSRDS-NBS 63, Suppl. 1, 2151 pages,
December 1980).
4. S. R. Heller, G. W. A. Milne, and L. H. Gevantman,
"EPA/NIH Mass Spectral Data Base", Supplement Number
Two, in two volumes, part of the NBS' National
Standard Reference Data Series of Critical Data
Compilations, (GPO # SN 003-003-02268-3, NSRDS-NBS
63, Suppl. 2, 1110 pages, 1983).
5. J. G. Dillard, S. R. Heller, F. W. McLafferty, G. W.
A. Milne, and R. Venkataraghavan, Org. Mass Spectrom., 16,
48-49(1981).