The gas chromatograph/mass spectrometer coupled to the minicomputer offers an accurate, economic approach to the detection of pollutants in all media

Stephen R. Heller John M. McGuire William L. Budde


Washington, DC 20460 Athens, GA 30601 Cincinnati, OH 45268

The identification and measurement of the concentration of specific organic compounds that contaminate the environment have concerned environmental research scientists for many years. Gross measurements of organic pollution such as the chemical oxygen demand (COD), biochemical oxygen demand (BOD), and total organic carbon (TOC) tests are of no value in situations where information is needed about specific pollutants. Firm identifications are required to determine:

A recent example of the importance of making firm identifications of specific compounds was the discovery of a wide variety of compounds, including the potential carcinogens, chloroform, bromodichloromethane, and dibromochloromethane, in several drinking water supplies.

The earliest approaches to separation, identification, and measurement were based on detailed chemical-instrumental procedures that required relatively inexpensive instrumentation. The colorimetric 4-aminoantipyrine derivative procedure for phenol and the electron capture detector-gas chromatography procedures for chlorinated hydrocarbon pesticides are typical examples of this approach. Although these specialized techniques are justified for compounds of special significance (pesticide residue analyses in foods, for example), there are several inherent limitations to the approach. In order to include all environmentally significant compounds, thousands of detailed procedures would have to be developed, tested, and documented; implementing these procedures would be extremely slow. This approach includes no provision for finding new contaminants except by an unexpected interference in a particular method.

Computerized GC/MS

With the development of environmental concerns in the late 1960's, it was recognized that general analytical methods of high information content were required; these methods would facilitate the economical analysis of a wide variety of organic pollutants. The use of the mass spectrometer as a detector for the gas chromatograph (GC/MS) was developed during the 1960's. The minicomputer was invented at this time and applied to GC/MS to utilize the information produced by the mass spectrometer. Computerized GC/MS quickly revolutionized the field of trace organic analysis, and made very significant contributions to research in medicine, biochemistry, flavors, odors, and organic geochemistry. One advantage of the technique is that it substantially increases the capacity of a staff to handle large numbers of environmental samples, and to accurately identify specific organic compounds.

Based on the promising GC/MS results obtained at the Southeast Water Laboratory in the late 1960's, the U.S. Environmental Protection Agency (EPA) in 1971 made a major commitment to computerized GC/MS for organic pollutant analysis. Over 20 systems were installed in laboratories across the country. The labs included the Regional Surveillance and Analysis Facilities, which carry out EPA's monitoring function; several field investigation centers, which are part of the enforcement arm of EPA; several pesticide laboratories; and a number of R&D facilities.

Like any automated method, computerized GC/MS does not obviate the need for skilled manpower. In the field laboratories, GC/MS equipment is under the supervision of a chemist with an advanced degree or equivalent experience. Research support for field labs is obtained from spectrometrists, analytical chemists, electronics engineers, and laboratory minicomputer specialists in research and development laboratories.


Figure 1. Typical computerized GC/MS system

The identification of pollutants at the part-per-billion level with a high degree of confidence in the result has become nearly routine in several EPA laboratories. What was once an impossible task for a staff of 100 working six months sometimes can be accomplished by a skilled individual in a few hours.

Most of EPA's installed systems use Finnigan quadrupole mass spectrometers controlled by Digital Equipment Corporation PDP-8 minicomputers. However, some Varian and Hewlett-Packard spectrometers are used with Varian 620 and H-P 2100 minicomputers. A typical EPA configuration is shown in Figure 1.

The gas chromatograph is a powerful tool for the separation of mixtures of volatile organic compounds. However, since conventional GC detectors provide no qualitative information about the sample, they are either removed entirely or used only during solvent venting or other ancillary functions. Like conventional GC detectors, the mass spectrometer is very sensitive. But, in contrast to the single-channel response of most conventional GC detectors, the mass spectrometer provides a multichannel response (abundance measurements of ions of different masses) that reveals a great deal about the molecular structure and composition of organic compounds. This information may be displayed graphically by a fast cathode ray tube or a hard copy plotter, printed in digital form, or transmitted over conventional voice-grade telephone circuits to other data handling systems. Figure 2 is a gas chromatogram constructed from mass spectrometric data. Two mass spectra, retrieved from the minicomputer disk storage, correspond to the indicated points on the chromatogram.
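One common way to construct a chromatogram from stored mass spectra is to sum the ion abundances in each scan, giving one point per scan. The following is a minimal sketch of that idea, not the program used on the EPA systems; the scan layout, the function names, and the abundance values are all illustrative assumptions.

```python
from typing import Dict, List

# Assumption: each stored scan maps an integer ion mass to its abundance.
Scan = Dict[int, float]


def reconstruct_chromatogram(scans: List[Scan]) -> List[float]:
    """Sum the ion abundances of each scan to get one trace point per scan."""
    return [sum(scan.values()) for scan in scans]


def peak_scan_indices(trace: List[float]) -> List[int]:
    """Return the indices of local maxima in the trace (interior points only)."""
    return [
        i
        for i in range(1, len(trace) - 1)
        if trace[i - 1] < trace[i] >= trace[i + 1]
    ]


# Made-up scans in acquisition order; these are not measured data.
scans = [
    {43: 5.0, 57: 3.0},
    {43: 50.0, 57: 30.0, 71: 10.0},
    {43: 8.0, 57: 4.0},
    {77: 40.0, 94: 100.0},
    {77: 2.0, 94: 6.0},
]

trace = reconstruct_chromatogram(scans)
peaks = peak_scan_indices(trace)

# The spectrum stored at each peak index can then be retrieved for inspection.
spectra_at_peaks = [scans[i] for i in peaks]
```

Because every scan is kept, the spectrum behind any point on the trace remains available after the run, which is what allows spectra to be retrieved for indicated points on the chromatogram.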

In non-computerized GC/MS systems, mass spectra generally are not acquired continuously during a GC run. Continuous acquisition of mass spectra generates far more information than can be processed economically or accurately by hand. In a computerized system, mass spectra are acquired continuously; the minicomputer handles all information and controls the operation of the quadrupole spectrometer simultaneously. Data are stored temporarily on a magnetic disk or tape before background corrections are made, and before data are output by plotting, printing, or transmission.


Figure 2. A gas chromatogram constructed from MS data of an industrial waste sample

Assumptions and interpretation

Even with the hardware and computer programs mentioned above, it still may not be possible to make valid identifications of organic pollutants. A basic assumption of this method, that the organic compounds are sufficiently volatile for gas chromatography, must be met if conventional GC/MS is to be useful.

Another basic assumption is that the chromatograph produces a clean separation of the organic components in a mixture, the classic problem of chromatography. This is made somewhat more manageable with a mass spectrometer detector. An experienced user can frequently ascertain that the separation is clean and free of overlaps by examination of the consistency of the mass spectra obtained at various points across the GC peak. This is not possible with the simpler, conventional GC detector. By suitable background corrections, the GC/MS computer user usually can isolate both spectra of overlapping peaks.
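The consistency check and background correction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure implemented on any particular data system: the cosine measure, the 0.95 threshold, the function names, and the abundance values are all chosen here for the example.

```python
import math
from typing import Dict

# Assumption: a spectrum maps an integer ion mass to its abundance.
Spectrum = Dict[int, float]


def subtract_background(spectrum: Spectrum, background: Spectrum) -> Spectrum:
    """Subtract a background spectrum mass by mass, dropping non-positive results."""
    corrected = {}
    for mass, abundance in spectrum.items():
        net = abundance - background.get(mass, 0.0)
        if net > 0:
            corrected[mass] = net
    return corrected


def cosine_similarity(a: Spectrum, b: Spectrum) -> float:
    """Cosine of the angle between two spectra treated as abundance vectors."""
    dot = sum(a[m] * b[m] for m in a if m in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def peak_looks_clean(front: Spectrum, apex: Spectrum, tail: Spectrum,
                     threshold: float = 0.95) -> bool:
    """Treat the GC peak as one component if spectra across it agree closely."""
    return (cosine_similarity(front, apex) >= threshold
            and cosine_similarity(apex, tail) >= threshold)


# Made-up values: a constant background at masses 28 and 32.
background = {28: 10.0, 32: 3.0}
front_raw = {28: 10.0, 32: 3.0, 43: 20.0, 57: 10.0}
apex_raw = {28: 10.0, 32: 3.0, 43: 100.0, 57: 50.0}
tail_raw = {28: 10.0, 32: 3.0, 43: 30.0, 57: 15.0}
overlapped_tail_raw = {28: 10.0, 32: 3.0, 43: 30.0, 57: 15.0, 91: 60.0}

front = subtract_background(front_raw, background)
apex = subtract_background(apex_raw, background)
tail = subtract_background(tail_raw, background)
overlapped_tail = subtract_background(overlapped_tail_raw, background)

clean = peak_looks_clean(front, apex, tail)
overlapped = peak_looks_clean(front, apex, overlapped_tail)
```

In the second case the ion at mass 91 appears only on the tail of the peak, so the spectra no longer agree and the peak is flagged as a probable overlap.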

With the mass spectrum of a pure component in hand, only one additional step is required to make a correct identification: the interpretation of the data. Interpretation may be made by applying the theory of mass spectra, and the rules of fragmentation of ions in the gas phase. This process is tedious, and it is difficult to sustain the necessary deductive reasoning process for the long periods of time required to make a large number of identifications. Lack of sufficient knowledge about the details of the fragmentation process further limits the effectiveness of this approach in interpreting some spectra. In addition, the mass spectrometrist may take advantage of the collections of reference mass spectra that have been accumulated in recent years. Empirical methods have been developed for searching a file of reference mass spectra to find a similar or exact match of an experimental mass spectrum.

Any empirical search and match system has two fundamental components:

Computerized search systems

In 1971, EPA undertook the development of a computerized search system with a grant awarded to the Battelle Memorial Institute of Columbus, Ohio. This system was patterned after an approach developed by Biemann and associates at the Massachusetts Institute of Technology. A significant feature of this system is that data are automatically transmitted over conventional voice-grade telephone lines directly from the minicomputer to a program running in a large-scale remote time-sharing computer.

The remote computer has access to the data base, conducts a search for a match based on the transmitted mass and abundance data, and sends the results back to the minicomputer in a matter of seconds. A major advantage of this system is that the names of compounds whose spectra are similar to the spectrum of the unknown are automatically printed in order of the similarity of their spectra to the spectrum of the unknown. The degree of similarity is measured by a numerical value on a scale of 0 to 1 that is included on the printout. Since the whole operation is relatively automatic, probable identification can be made without full-time interaction with a highly trained spectrometrist.
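A ranked search of this kind can be sketched as follows. The similarity measure used here (shared abundance divided by total abundance after scaling each spectrum to its base peak) is a stand-in chosen because it falls on a scale of 0 to 1; it is not claimed to be the index computed by the Biemann/Battelle system. The compound names and spectra are invented for illustration, and every spectrum is assumed to be non-empty.

```python
from typing import Dict, List, Tuple

# Assumption: a spectrum maps an integer ion mass to its abundance.
Spectrum = Dict[int, float]


def normalize(spectrum: Spectrum) -> Spectrum:
    """Scale abundances so the base (largest) peak equals 100."""
    base = max(spectrum.values())
    return {mass: 100.0 * abundance / base for mass, abundance in spectrum.items()}


def similarity(unknown: Spectrum, reference: Spectrum) -> float:
    """Stand-in similarity on a 0-to-1 scale: shared abundance over total abundance."""
    u, r = normalize(unknown), normalize(reference)
    masses = set(u) | set(r)
    shared = sum(min(u.get(m, 0.0), r.get(m, 0.0)) for m in masses)
    total = sum(max(u.get(m, 0.0), r.get(m, 0.0)) for m in masses)
    return shared / total


def search(unknown: Spectrum, library: Dict[str, Spectrum],
           top: int = 3) -> List[Tuple[str, float]]:
    """Return the best-matching names with scores, most similar first."""
    scored = [(name, similarity(unknown, ref)) for name, ref in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]


# Invented reference file; these are not real reference spectra.
library = {
    "compound A": {43: 100.0, 58: 40.0, 71: 10.0},
    "compound B": {77: 60.0, 94: 100.0},
    "compound C": {43: 100.0, 58: 35.0, 85: 20.0},
}
unknown = {43: 50.0, 58: 20.0, 71: 5.0}

results = search(unknown, library)
for name, score in results:
    print(f"{score:.3f}  {name}")
```

Printing the score beside each name lets the analyst judge how much weight to give the top candidate and whether the runners-up deserve a closer look.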

The major part of the data base used by EPA was acquired from the Mass Spectrometry Data Centre (MSDC), an agency of the British government located at Aldermaston, England. Included in this collection were a number of sub-collections: the American Petroleum Institute's file of mass spectra, the Dow Chemical Co.'s collection, the American Society for Testing and Materials collection, and several other smaller sets of spectra. The original data base consisted of about 10,600 spectra including an undetermined number of duplicates. This data base was augmented by 600 EPA pollutant spectra.

About the same time, the National Heart and Lung Institute (NHLI) of the National Institutes of Health (NIH) implemented a matching system that, from the user's point of view, was somewhat different. The data base selected was a slightly updated Aldermaston file, but the data entry and the search procedure were different. The user enters mass and abundance data, one pair at a time, from an inexpensive keyboard/printer terminal that is interfaced to a conventional voice-grade telephone line. This terminal has no obligatory connection to a GC/MS minicomputer, which allows a large group of users of non-computerized GC/MS systems to test and evaluate the spectrum matching system. The user-entered data are transmitted to a large remote time-sharing computer that has access to the data base. The search is conducted, and the number of spectra in the file having the mass/abundance pair is returned to the user in a matter of seconds. By a repetition of this procedure the number of spectra that fit can be reduced until the choice is among a small number of spectra. The user then requests the names of these compounds to be printed at his terminal.
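The interactive narrowing described above can be sketched as a repeated filter over the file. This is a minimal illustration, not the NIH program: entering each abundance as a low-to-high window, the function names, and the four invented spectra are all assumptions of this sketch.

```python
from typing import Dict

# Assumption: abundances are stored as percent of the base peak.
Spectrum = Dict[int, float]


def narrow(candidates: Dict[str, Spectrum], mass: int,
           low: float, high: float) -> Dict[str, Spectrum]:
    """Keep only spectra whose abundance at `mass` lies within [low, high]."""
    return {
        name: spectrum
        for name, spectrum in candidates.items()
        if low <= spectrum.get(mass, 0.0) <= high
    }


# Invented file of spectra; these are not real reference spectra.
library = {
    "compound A": {43: 100.0, 58: 40.0, 71: 10.0},
    "compound B": {77: 60.0, 94: 100.0},
    "compound C": {43: 100.0, 58: 5.0, 85: 20.0},
    "compound D": {43: 90.0, 58: 45.0, 100: 15.0},
}

# Each entry is one mass/abundance pair typed by the user: (mass, low, high).
entries = [(43, 80.0, 100.0), (58, 30.0, 50.0), (71, 5.0, 20.0)]

candidates = library
counts = []
for mass, low, high in entries:
    candidates = narrow(candidates, mass, low, high)
    counts.append(len(candidates))
    print(f"mass {mass}: {len(candidates)} spectra fit")

# Once the count is small, the user asks for the names.
names = sorted(candidates)
print(names)
```

Only a count is returned after each pair, so the user decides which peak to enter next and when the remaining set is small enough to list.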

An important feature of the NIH system is that the user can reach the data base with information other than mass and abundance data. Spectra can be retrieved based on molecular weights, partial or complete molecular formulas, mass losses from the molecular ion, MSDC classification codes, and combinations of all of these. Furthermore, complete spectra can be typed out or plotted at the user's terminal. Since the user must impose his judgment in entering data, this system is oriented to the experienced user. The flexibility of this system permits its use in situations where a good match is not available in the data base. Spectra can be retrieved that have features similar to the experimental spectrum, and these provide clues to the identity of the unknown. The NIH system has found wide use in many government laboratories, including EPA. In addition, a number of private industry and university laboratories have had the opportunity to test the system.
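Retrieval by molecular weight, by mass loss from the molecular ion, or by both together might look like the sketch below. The record layout, the function names, the minimum-abundance cutoff, and the three invented records are assumptions made for illustration and do not describe the NIH implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    name: str
    molecular_weight: int
    spectrum: Dict[int, float]  # ion mass -> abundance


def has_loss(record: Record, loss: int, min_abundance: float = 1.0) -> bool:
    """True if a peak appears at (molecular weight - loss)."""
    fragment_mass = record.molecular_weight - loss
    return record.spectrum.get(fragment_mass, 0.0) >= min_abundance


def search(records: List[Record], molecular_weight: Optional[int] = None,
           loss: Optional[int] = None) -> List[str]:
    """Apply whichever criteria were supplied; combining them narrows the result."""
    hits = records
    if molecular_weight is not None:
        hits = [r for r in hits if r.molecular_weight == molecular_weight]
    if loss is not None:
        hits = [r for r in hits if has_loss(r, loss)]
    return [r.name for r in hits]


# Invented records; these are not real reference spectra.
records = [
    Record("compound A", 100, {100: 20.0, 85: 100.0, 57: 40.0}),
    Record("compound B", 100, {100: 50.0, 72: 100.0}),
    Record("compound C", 120, {120: 30.0, 105: 100.0}),
]

by_weight = search(records, molecular_weight=100)
by_loss = search(records, loss=15)
combined = search(records, molecular_weight=100, loss=15)
```

Treating each criterion as an optional filter is one simple way to support both single-key and combined searches with the same routine.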

Worldwide MSSS

With the development and refining of these systems it became apparent that a consolidation of the two systems would be economical and beneficial. The EPA in conjunction with the NHLI, the Food and Drug Administration (FDA), and the MSDC is supporting the consolidation of the systems into an international mass spectral search system (MSSS). The goal of this merger is to provide a user-oriented, flexible, and self-supporting MSSS for the worldwide mass spectrometry community. The entire system will be designed to encourage experimentation in the expectation that a better and more useful system will evolve. Significant advantages of the merged systems include worldwide access to the same data base, and continuous updating of the data base.

In order to attain the goal of a truly worldwide system, it was decided to implement the MSSS on a commercial time-sharing computer system supported by a well-developed communications network. An international time-sharing system was selected that is accessible by a local telephone call from many cities in the U.S. and overseas.

The MSDC has contracted with the time-sharing service to provide the computer and the communications network for the MSSS. A small royalty fee is paid to MSDC by each user, and this will be used for maintenance, storage, and "update" costs for the system. The time-sharing system fees are for computer and network connect time only. There are more than 100 accounts on the system, with an average of two new accounts being added each week. An exact copy of the MSSS will remain on the time-sharing system (a PDP-10) at the NIH, but it will only be accessible to the NIH, EPA, and FDA for further research and development. As new and improved versions are proven reliable, they will be transferred to the commercial system by the MSDC. A summary of current MSSS options is given in Table 1.

Future development

Currently funded R&D projects include a vigorous effort to expand and improve the quality of the combined data base of mass spectra. The present data base consists of an expanded version of the original MSDC file, another collection acquired from the book publisher, John Wiley & Sons, and spectra collected by EPA. This amounts to about 30,000 spectra. The EPA and FDA have contracted to acquire new spectra of particular interest to each agency. In addition, contractors are evaluating the spectra in the present data base, removing erroneous and duplicate spectra, and developing guidelines for the establishment of a large, high-quality file. All participating agencies are working to collect existing files of spectra from spectrometrists throughout the world for inclusion in the data base. EPA is establishing collaborative efforts in data collection and software techniques with environmental groups in the European Community (EC) and environmental and agricultural groups in Canada. Many contributions have been received from scientists around the world, making MSSS a truly user-oriented and user-accepted system. A goal of 50,000 quality spectra by 1976 has been set.

The original EPA-developed minicomputer-to-remote-computer direct transmission system was compatible with only one GC/MS system that employed a Digital Equipment Corp. PDP-8 processor. Work is in progress to develop minicomputer and remote computer programs for many of the minicomputers that are used on different GC/MS data systems. This effort is receiving some support from GC/MS manufacturers and system houses such as Finnigan, Du Pont, Hewlett-Packard, Spectrometer Data Systems, Systems Industries, and Varian-MAT; these companies are participating in the development of direct transmission programs for their particular minicomputers.

It is emphasized that the data base and the software for accessing and searching it are separate and distinct. Therefore, a number of different and perhaps experimental software search systems may be operational simultaneously with the same data base. It is expected that in the near future new developments in software that use a mass spectral data base will be available. Indeed, a user may wish to develop specialized software and compare it to the existing operational software; this is being encouraged by the system designers. Possible future software includes structural interrogating systems (searches for all spectra of beta-chloroamines, for example), the self-training interpretive and retrieval system (STIRS) developed by McLafferty and associates at Cornell University, software based on learning machine or pattern recognition techniques, and software based on Wiswesser Line Notation (WLN), Chemical Abstracts Service (CAS) Registry Number (REGN), and structure connection table files. Another proposal is to include the time and place of sampling for pollutants as input along with unknown spectra. This would permit retrievals by distribution of identified and even unidentified pollutants.


Table 1. Mass Spectral Search System (MSSS): current options

1. Peak and Intensity Search

2. Loss and Intensity Search

3. Molecular Weight Search

4. MSDC Code Search

5. Molecular Formula Search

(a) Complete

(b) Partial, Stripped

6. Peak and Loss Search

7. Peak and Molecular Weight Search

8. Peak and Molecular Formula Search

9. Peak and MSDC Code Search

10. Loss and Molecular Weight Search

11. Loss and Molecular Formula Search

12. Loss and MSDC Code Search

13. Molecular Weight and MSDC Code Search

14. Molecular Weight and Molecular Formula Search

15. Complete Search (Biemann/Battelle)

16. Dissimilarity Comparison

17. Spectrum/Source Print-out

18. Spectrum/Source Display

19. Spectrum/Source Plotting

20. Spectrum/Source Microfiche

21. Crab-Comments and Complaints

22. Entering New Data

(a) Minicomputer Interface

(b) Data Collection Sheets

23. News-News of the MSSS

24. MSDC Bulletin-Literature Search

25. CAS Registry Data

26. WLN

27. SSS-Substructure Search of CAS Data

As virtually all GC/MS systems become computerized, another component of MSSS is expected to evolve. This would be the capability of a GC/MS data system minicomputer to extract from the remote computer, by a direct telephone connection, a subset of the large, continuously updated master data base. This minidata base of 500-2000 spectra would be retained at the local computer site, and minicomputer software would be used to search the small library locally. Specialized users who have large numbers of unknowns in one area of concern (pesticides, food additives, or drugs, for example) would have the benefit of decreased costs, yet would retain the advantages of a uniform, updated data base and the backup of the large data base. This approach is far more economical and feasible than an attempt to develop a complete MSSS on a local spectrometer data system minicomputer. Support for a large data base requires costly peripherals such as large disks for each minicomputer; flexible search systems with many options require relatively large core memories for each minicomputer. It is difficult to develop time-sharing minicomputer software to permit simultaneous data acquisition and data base searching. The maintenance of a large data base on a small system is costly and time consuming; thus it tends to become static.
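The subset idea can be sketched as a filter over the master file followed by a purely local search. Everything in this sketch is hypothetical: the `Entry` layout, the category labels, the function names, and the invented spectra; the 2000-spectrum limit simply reflects the upper end of the range mentioned above.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    name: str
    category: str               # hypothetical area-of-concern label
    spectrum: Dict[int, float]  # ion mass -> abundance


def extract_subset(master: List[Entry], category: str,
                   limit: int = 2000) -> List[Entry]:
    """Copy at most `limit` entries in one area of concern from the master file."""
    selected = [entry for entry in master if entry.category == category]
    return selected[:limit]


def local_search(local: List[Entry], mass: int,
                 min_abundance: float) -> List[str]:
    """Search only the local subset for spectra with a peak at `mass`."""
    return [
        entry.name
        for entry in local
        if entry.spectrum.get(mass, 0.0) >= min_abundance
    ]


# Invented master file; these are not real reference spectra.
master = [
    Entry("pesticide A", "pesticides", {235: 100.0, 165: 40.0}),
    Entry("drug A", "drugs", {58: 100.0, 91: 20.0}),
    Entry("pesticide B", "pesticides", {235: 15.0, 109: 100.0}),
]

# Re-running the extraction against the updated master keeps the local copy current.
local = extract_subset(master, "pesticides")
hits = local_search(local, mass=235, min_abundance=50.0)
```

An unknown that finds no match in the local subset would still be referred to the large remote data base.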

With the further development of the MSSS, the effectiveness of computerized GC/MS should improve substantially at a small additional cost. The cost per identification should decrease dramatically in the next few years.

Additional reading

MSSS Users Manual, MIDSD, PM-218, EPA, Washington, DC 20460 (May 1974).

Heller, S. R., Anal. Chem., 44, 1951 (1972).

Kwok, K. S., Venkataraghavan, R., and McLafferty, F. W., J. Am. Chem. Soc., 95, 4185 (1973).

McGuire, J. M., Alford, A. L., and Carter, M. H., EPA Research Report, EPA-R2-73-234 (July 1973).

Hertz, H. S., Hites, R. A., and Biemann, K., Anal. Chem., 43, 681 (1971).

23 EPA laboratories have computerized GC/MS systems


Cincinnati, OH (Methods Development & Quality Assurance Research Lab.)

Athens, GA

West Kingston, RI

Duluth, MN

Gulf Breeze, FL

Ada, OK

Corvallis, OR

Research Triangle Park, NC

Cincinnati, OH (Water Supply Research Lab.)

College, AK


Denver, CO

Cincinnati, OH


Needham Heights, MA (Region I)

Edison, NJ (Region II)

Annapolis, MD (Region III)

Athens, GA (Region IV)

Chicago, IL (Region V)

Houston, TX (Region VI)

Kansas City, KS (Region VII)

Alameda, CA (Region IX)

Redmond, WA (Region X)


Bay Saint Louis, MS

Beltsville, MD

Stephen R. Heller is a computer specialist in the EPA's Management Information and Data Systems Division. Dr. Heller has been involved in developing a chemical information system.

John M. McGuire is chief of the chromatography and mass spectrometry section of EPA's Southeast Environmental Research Laboratory. Dr. McGuire's research interests involve ways of increasing speed and accuracy of pollutant identification.

William L. Budde is chief of the advanced instrumentation section at EPA's Methods Development and Quality Assurance Research Laboratory. Dr. Budde is concerned with methods development in gas chromatography-mass spectrometry and laboratory automation.

Coordinated by LRE