THE NIH/EPA CHEMICAL INFORMATION SYSTEM IN SUPPORT OF LAB CHEMISTS



By Stephen R. Heller and Rudolph Potenzone Jr.

EPA, MIDSD, PM-218, Washington, DC 20460.



Making computers an integral tool for the research lab chemist can be a slow and often difficult process. However, for better or worse, computers are here, and the working chemist will be better off knowing what they can do to support research. Computer literacy is probably more important for a chemist today than learning two languages.



The phenomenal growth in the use of computers in analytical instrumentation, from the balance to the latest Fourier Transform spectrometer, has been seen by all. Using computers to do literature searching is also becoming commonplace, albeit expensive and sometimes less than ideal.



An even newer use of computers, which many feel will be one of the most useful by the end of the century, is chemical data searching and manipulation, coupled with data analysis techniques (such as pattern recognition and molecular modeling). By chemical data searching, we mean the ability to use a computer, such as a time-sharing computer available over ordinary telephone lines throughout most of the world, to search for chemical classes and retrieve specific data (a mass spectrum or heat of formation), or vice versa. The NIH/EPA Chemical Information System provides this ability. It is a collection of primarily numeric data bases, linked together, as seen in Figure 1, by central data bases of chemical structures, names, and related information. The system is being used by more than 1400 scientists in 21 countries for a yearly subscription fee and an hourly usage charge. (Universities and public libraries pay no subscription fee and are given $100 per month of free CIS usage.) One needs only a computer terminal, coupled to the telephone line with a modem or acoustic coupler, to have vast amounts of chemical information at one's fingertips.
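The linkage described above, in which separate numeric data bases are tied together through central files of structures and names, can be pictured with a small sketch. Everything in it is an assumption made for illustration: the registry key "REG-0001", the dictionary layout, the component names, and the function `lookup` are hypothetical and do not reflect the actual CIS file design or command language.

```python
# Illustrative sketch only: a hypothetical layout, not the real CIS design.

# Central file: one entry per chemical, keyed by a made-up registry key.
CENTRAL = {
    "REG-0001": {"names": ["carbazole"], "formula": "C12H9N"},
}

# Component data bases hold their numeric data under the same key.
# The stored values are placeholders, not real measurements.
COMPONENTS = {
    "mass_spectra": {"REG-0001": "<spectrum placeholder>"},
    "thermodynamics": {},
}


def lookup(name):
    """Return (key, {component: data}) for a chemical name, or None."""
    wanted = name.strip().lower()
    for key, entry in CENTRAL.items():
        if wanted in (n.lower() for n in entry["names"]):
            # Collect data from every component that knows this key.
            hits = {c: recs[key] for c, recs in COMPONENTS.items() if key in recs}
            return key, hits
    return None
```

Because every component shares the central key, a single name lookup can report which kinds of data exist for a chemical, which is the idea behind going from a chemical to its data "or vice versa."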



For example, a lab chemist who wants to know whether a chemical matching a particular structure or molecular formula is in the EPA inventory of existing chemicals in commerce in the USA needs only to plug into the NIH/EPA Chemical Information System (CIS) to find the answer. At the same time, the chemical locator capability of the CIS shows whether the mass spectrum is available for printout and whether the chemical can be purchased from a supplier. If one does not know the name, a chemical formula (drawn on a computer terminal with a simple set of instructions) can be entered into the computer, and one can find out whether the chemical is one of some 4500 chemicals sold by Eastman Kodak Company in its KODAK Laboratory Chemicals Catalog.
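The locator idea in this example, matching a molecular formula and then listing the sources that carry the chemical, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CIS search algorithm: the record layout, the functions `parse_formula` and `locate`, and the "EXAMPLE LIST" entry are hypothetical. Only the carbazole source and catalog number are taken from the Figure 2 example discussed in this article.

```python
import re

# Toy records; field names and layout are hypothetical.
RECORDS = [
    {"name": "carbazole", "formula": "C12H9N",
     # Source and catalog number as reported for Figure 2 in the text.
     "sources": {"Kodak Laboratory and Specialty Chemicals": "600"}},
    {"name": "benzene", "formula": "C6H6",
     # Placeholder source, invented for this sketch.
     "sources": {"EXAMPLE LIST": "0000"}},
]


def parse_formula(formula):
    """Turn a simple formula such as 'C12H9N' into {'C': 12, 'H': 9, 'N': 1}."""
    counts = {}
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(digits or 1)
    return counts


def locate(formula, partial=False):
    """Return (name, sources) for each record whose formula matches.

    Full match: identical element counts.
    Partial match: every element given in the query has the same count
    in the record; elements not named in the query are ignored.
    """
    query = parse_formula(formula)
    hits = []
    for rec in RECORDS:
        have = parse_formula(rec["formula"])
        if partial:
            ok = all(have.get(el) == n for el, n in query.items())
        else:
            ok = have == query
        if ok:
            hits.append((rec["name"], rec["sources"]))
    return hits
```

With these toy records, `locate("C12H9N")` returns carbazole together with its catalog source, and `locate("C12", partial=True)` narrows the list to records containing exactly twelve carbons. The sketch handles only simple formulas without parentheses or charges.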



How often has one found a reference to a chemical but not known where to obtain a sample? As more and more information is put into the computer, such questions are becoming easy to answer on demand, day or night, weekday or weekend. The most used component of the CIS, the Structure and Nomenclature Search System (SANSS), costs $85 per hour and can be searched by chemical name, structure, molecular formula (partial or full), molecular weight, chemical class code, or partial chemical structure. The SANSS data base contains more than 225,000 chemicals and more than 650,000 names associated with them. These chemicals are drawn from 72 different sources, ranging from the Merck Index, the KODAK Laboratory Chemicals Catalog, and EPA Chemicals in Commerce in the USA to mass spectral and toxicology data.



This chemical locator feature is being expanded to some 200 lists or sources of information about a chemical. Valuable feedback from the multinational CIS user community indicates that one of the most pressing needs of the lab chemist is to find a source for a chemical. This need is going to be met, as the test example in Figure 2 shows, by adding chemical catalogs to the sources of information about chemicals. Figure 2 shows how a user can quickly look for the carbazole ring system. In this example the first chemical found is carbazole itself. (The other five answers are salts, addition complexes, and the same ring system with different bonding.) Further, the list of non-CIS sources (that is, sources which are not on-line in the CIS computer system) shows that carbazole is available for purchase from Kodak Laboratory and Specialty Chemicals as chemical number 600.



For those interested in more information about this system, a few recent references are given.



References



1. G. W. A. Milne, R. Potenzone, and S. R. Heller, "Environmental Uses of the NIH/EPA Chemical Information System," Science, 215, 371-375 (1982).



2. S. R. Heller, R. Potenzone, G. W. A. Milne, and C. Fisk, "Computers in Analytical Chemistry," Trends in Analytical Chemistry, 1, 41-45 (1981).



3. S. R. Heller and G. W. A. Milne, "How to Plug into a Listing of Chemical Information," Industrial Chemical News, 1, 20-21 (October 1981); ibid., 18-19 (December 1981).



4. S. R. Heller and G. W. A. Milne, "Online Spectroscopic Databases," American Laboratory, 12, 3, 33-48 (1980).



To obtain details on how to access the system, please contact:

CIS Operations Staff, CIS Inc., 7215 York Road, Baltimore, MD 21212 (800 368-2251 or 301 321-8440).