A Look at the Future of Chemical Information

Stephen R. Heller
USDA, ARS, Beltsville, MD 20705-2350 USA

The current state of chemical information in a number of areas is presented. The author then discusses several of these areas in detail and predicts what is likely to be the state of the field by the year 2000.

INTRODUCTION

This presentation is designed to stimulate discussion of what new technology will be available to chemists, applied to chemistry, and most importantly used by chemists in their everyday activities by the beginning of the next century. The use of computers in chemistry has gone from just calculations to cover a very broad area of chemistry. This paper delves into many of these areas and tries to give the current state of the use of computers and what the author believes the use of computers in the field of chemistry will be at the beginning of the 21st century.

BACKGROUND

Max Planck, more widely known in some places for Planck's constant, is, I feel, not given proper credit for Planck's Law, which is: "New scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it." [1]. The widespread use of computers in chemistry has clearly been handicapped by a number of factors, a major one of which is the lack of familiarity of chemists and managers in the field of chemistry with this new technology. This is true in all job areas: academia, government, and industry. At present, the routine use of computers in a chemistry lab or office is rare. Why is this the case? There are a few hundred thousand chemists in the USA. Combined with those in other developed countries, one could estimate some 500,000 chemists as a potential market for computers and computer systems.

Please note that when the qualitative phrases "few" or "low" are used for the overall use of a particular piece of computer hardware, a computer program, or a computerized database, the phrase is meant in comparison to the overall potential purchase and use by some 500,000 users. While selling a total (over the lifetime of the program) of a few hundred or even a few thousand molecular modeling or structure drawing programs is, today, a major accomplishment in chemistry, it is a minor event relative to the daily sales of word processing, database management, spreadsheet, and other such programs.

COMPUTER AND CHEMICAL INFORMATION ISSUES

Table 1 summarizes both the issues to be discussed here and the current and predicted levels of activity in these areas. Space in these proceedings does not permit a proper and full analysis of all of these topics.



Table 1
Issues for Discussion

Topic                        Today                              2000
---------------------------  ---------------------------------  ------------------------------------------------
Computer Literacy            Low - Moderate                     Moderate - High
Computers                    PC-DOS                             Mac/Windows/OS-2
Telecommunications           Moderate Usage                     Only the dead don't use INTERNET
Interfaces                   Frightful & Difficult              Transparent & Voice Based
Graphics                     Usage in Infancy                   Predominant Usage
CD-ROM                       Few                                Many
Chemical Information         Raw                                Processed
Online Usage for Chemistry   Low                                Low
SDI                          Manual or by Post                  Electronic
Databases                    Bibliographic                      Numeric, Factual (e.g. SpecInfo)
Beilstein                    E-V Series being published         E-V Series still being published
Chemical Catalogs            Online Searching                   Online Ordering
Chemical Identification      CAS RN & BRN                       Chemical Structure
Molecular Modeling           Few                                Some
Educational Software         Random Usage                       Integrated with Textbooks
Publishing                   Semi-Electronic                    Mostly Electronic
Books                        Thought of as probable dinosaurs   Thought of as probable dinosaurs
Instruments                  Semi-automated                     Full-Automated with ISO Data Transfer Standards;
                                                                Routine Spectral Analysis (SpecInfo)


The heart of the matter is computer literacy. Growing up with, being familiar with, and making regular use of computers and computer systems of information will not become the norm and "triumph" (per Max Planck) without the necessary atmosphere and background being part of one's educational upbringing. The current state of education in many parts of the world will make this difficult. However, one would hope that in college and graduate school there would be sufficient competence to train the upcoming generation of chemists to be familiar with computers.

Using computers consists of two parts: writing programs and using programs. Writing programs is really a rather limited issue. A computer is a tool. When a chemist gets too involved in the tool then he or she is, more often than not, no longer doing chemistry. What matters is using programs. To do this effectively and properly one needs to know what a computer can do in the area in which one needs to solve a problem. I don't need to be an automotive engineer to know that to get somewhere I need a car to drive there and I need to know how to drive it. The same is true with computers. Understanding what a computer can do is important. Then either finding software and hardware to do it, or getting someone to produce what is needed to get the job done, is relatively simple. Chemists don't use computers as an end in themselves. Chemists should use computers as one of many tools to do their job.

Today the use of computers is relatively low. Most chemists use computers for administrative purposes (like writing this manuscript). Use of computers for electronic communication appeals to a small but growing number of chemists. There are reasons for this, and they include the lack of modems and related dedicated phone lines. There is also the lack of computer accounts on the necessary computer networks (INTERNET, BITNET, CompuServe, etc.). Along with this are the problems in connecting between networks. If I want to telephone someone in another city or country I need only get the phone number from a telephone operator (except for unlisted numbers). With computer networks, there is no phone book and no operator. All numbers (actually computer network addresses) are unlisted. That does present (using a good chemical expression) a minor energy barrier to solving a problem. However, one can see changes coming. A few years ago a business card had a name, title, address, and phone number. Today many business cards have FAX and INTERNET addresses. This is part of computer literacy. This is progress.

Another major problem with computer programs is the downright difficulty of using them. Pacman and Nintendo (the popular video games of the 1980's and early 1990's) never came with manuals. Some manuals seem more designed for weight lifting than for explaining how to use a particular computer program. Installing and running programs is a major energy barrier for many people. Some of the larger companies in the USA (e.g. WordPerfect) provide toll-free phone numbers for helping customers install and effectively use their programs. WordPerfect spends over $500,000 per month on such phone calls. Most other companies provide free help (but not toll-free telephone calls), especially to new customers for the first 90 days. My preferred philosophy is that if I must read the manual to use the computer program, I probably am better off without it. There is no way someone can become proficient in using a wide variety of programs, remembering what each does and how to perform particular tasks, while also doing their assigned job as a chemist. As we all know, few people use their VCR's to record TV shows, because they can't figure out how to do it. This even created a market for a device which automatically sets up the VCR to record based on a set of 5 digits you type into the device. The 5 digits are published in newspapers in the USA every day next to each TV program listing.

Table 1 speaks of today's interfaces as being frightful and difficult. One can only hope and expect that as computers become more powerful and better software engineers graduate and get jobs, the interfaces in the year 2000 will become transparent and even voice-based. One way to get better interfaces in the future is through the extended use of graphics in computers. Today the use of high resolution graphics (1024 x 1024 pixels) is low. Color screens are small (12 - 14 inches) and expensive. By the year 2000 I would expect that every computer will have a 20 inch color monitor with at least 2048 x 2048 resolution, along with a color laser printer or plotter with the same capabilities.

CD-ROM's are just beginning to find use in chemistry. Again the lack of good software, adequate computer hardware, and available databases has limited the growth and use of this medium. CD-ROM's, which today store about 660 million characters (about 330,000 pages of text), will, by the year 2000, replace many reference books on the chemist's bench and bookshelf. A few pioneers in this area, such as the Beilstein Institute in Frankfurt, Germany, under the leadership of Clemens Jochum, are leading the way to what will clearly be the library of the future. The Beilstein Current Facts CD-ROM has about one year of extracted data from the literature, along with a computer chemical structure search system, all neatly tied together. Someday, the weekly issue of Chemical Abstracts will come to each chemist this way. Each chemist will have the Merck Index, CRC Handbook of Chemistry and Physics, ACS Directory of Graduate Research, and a few ACS journals, all on CD-ROM's. By the year 2000 it should be possible to custom order a set of books on CD-ROM. For example, the ACS Symposium Series of several hundred books could be entered into computer readable form and then books "printed" on a CD-ROM on demand, the same way floppy disks are copied today. Using keywords or phrases, one could select a set of books one might want on one's bookshelf (actually a CD-ROM jukebox device), and send the order for such a disk to be mastered and mailed (sorry about that, the US Postal Service will still be in business in the year 2000). Certainly custom made orders would be more expensive than pre-packaged ones, but well within the means of many chemists. Groups of chemists, such as the polymer or materials chemists, could create their own CD-ROM's based on existing volumes already printed. IUPAC could create a CD-ROM of Pure and Applied Chemistry. Spectral database companies like SpecInfo in Germany could easily create customized CD-ROM spectral databases of either specific nuclei or particular classes of compounds. The list is almost endless.
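The capacity figures quoted above lend themselves to a quick back-of-the-envelope check. The following is only a rough sketch in Python: the 660 million characters and 330,000 pages come from the text, while the function name discs_needed and the example collection of 300 books of 400 pages each are hypothetical values chosen purely for illustration (they are not actual figures for the ACS Symposium Series), and the estimate counts plain text only, with no figures or structure diagrams.

```python
# Figures quoted in the text: one CD-ROM holds about 660 million
# characters, or about 330,000 pages of text.
CD_ROM_CAPACITY_CHARS = 660_000_000
PAGES_PER_DISC = 330_000

# Page density implied by those two figures (characters per page).
CHARS_PER_PAGE = CD_ROM_CAPACITY_CHARS // PAGES_PER_DISC


def discs_needed(num_books, pages_per_book,
                 chars_per_page=CHARS_PER_PAGE,
                 capacity=CD_ROM_CAPACITY_CHARS):
    """Estimate how many CD-ROMs a text-only book collection would fill.

    num_books and pages_per_book are assumptions supplied by the caller;
    the result ignores figures, indexes, and search software overhead.
    """
    total_chars = num_books * pages_per_book * chars_per_page
    # Ceiling division: a partly filled disc still counts as one disc.
    return -(-total_chars // capacity)


if __name__ == "__main__":
    # Hypothetical collection: 300 books of 400 pages each.
    print("Characters per page:", CHARS_PER_PAGE)
    print("Discs for 300 books x 400 pages:", discs_needed(300, 400))
```

Under these assumptions a collection of several hundred books of plain text would fit on a single disc, which is what makes the custom "printed on demand" CD-ROM plausible.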

A useful way to think about how the dissemination of information will evolve is to look at how various companies are currently making their strategic plans in this area. As this conference is being held in Yokohama, Japan, it seems most appropriate to use as an example the Kinokuniya Bookstore Ltd. A group within Kinokuniya, under the leadership of Mr. I. Miura, has been developing computer literacy and capability within the company over the past decade in order to meet the total information needs of the chemist. Initially Kinokuniya provided just books and journals. Today this far-sighted group has expanded to providing online information and CD-ROM databases.



ECONOMIC ISSUES [2]

The recent (and perhaps still current) recession in a number of developed countries of the world has led to the re-invention of how to sell products. When people don't fly, airlines lower their fares to fill seats. When people don't buy automobiles, General Motors, Ford, and Chrysler, along with foreign car companies, lower their prices to stimulate sales. When hotels have occupancy rates below 50% and need 65% occupancy to at least break even financially, they offer cheap rooms. There are many more examples outside of the chemical information area, but it should suffice to state that the Japanese predominance in the consumer electronics industry clearly shows that lower prices lead to higher volumes and generally higher profits. As for examples in chemical information, one need only mention such publications as the Merck Index (priced at $30) or the CRC Handbook of Chemistry and Physics (priced at $99), now in its 73rd edition and currently edited by David Lide, a leading authority in scientific databases. Both of these products sell in the tens of thousands of copies. One would hope that companies in the field of chemical information will begin to experiment with new marketing approaches which will both increase the usage of their products and reach a larger segment of the chemistry population. Without a greater volume of usage it is possible that information will remain a commodity for only a small portion of the chemical community.

CONCLUSION

The economics of chemical information have, up to this point in time, made it a tool for the wealthy in the more developed nations of the world. Computers and the related technology described in this article hold the potential promise that by the 21st century more chemical information and computer systems will be available to the entire worldwide community. These additional users should allow the costs of the products being developed to be spread across a much wider number of people, leading to higher usage, higher productivity, and lower costs for all computer-related products.

REFERENCES

[1] M. Planck, "Scientific Autobiography and Other Papers", Williams & Norgate, London (1950), pages 33-34.

[2] S. Heller, "Proceedings of the 15th International Online Information Meeting", London, December 1991, pages 47-50.

[3] M. Williams, "Proceedings of the National Online Meeting", New York, May 1992, pages 1-4.