The Beilstein System - An Introduction

Stephen R. Heller

In This Chapter

This chapter presents recent developments of the Beilstein database and its component system parts, along with an overview of the many topics, databases, and software systems covered in this book. A short description of the Gmelin database is also presented.

  • Introduction to the Beilstein System
  • Evolution of the Beilstein Institute
  • Chemical Abstracts and Beilstein
  • NetFire
  • CrossFire Gmelin System
  • Coverage of the Periodic Table
  • Chapter Overviews

    This book is designed both to update the reader on the status of the Beilstein database, its search system, and derived electronic information, and to provide the chemistry community with a comprehensive and up-to-date view of the overall activities and scope of the current Beilstein system. When the first ACS book on Beilstein was published in 1990 (1), there was a renewed and growing interest in Beilstein as it moved away from its past as a print-based, German-language reference that existed only as a very large series of books (the Beilstein Handbuch der Organischen Chemie or, in English, the Beilstein Handbook of Organic Chemistry). In the short time since 1990, the staff of Beilstein (first in the Beilstein Institute and now in Beilstein Information Systems) have continued their vast modernization program, which has resulted in many new chemistry data and information resources for the chemical community. This book is a tribute to Konrad Beilstein (Figure 1), the man who had the foresight to realize that high quality scientific chemical data and information would be a timeless resource for chemists.



    Figure 1 - Friedrich Konrad Beilstein


    Introduction to the Beilstein System

    Throughout the history of abstracting scientific literature in chemistry, the name Beilstein has had a unique position of quality and value to the chemist. As a young graduate student in organic chemistry I first came across Beilstein as part of my synthesis work on a bicyclic nitrogen ring system. In 1963 it took me a considerable amount of time to search for this class of compounds and translate the appropriate sections of the Beilstein Handbook into English; now those activities take a few seconds, because the database is computerized and the information is essentially all in English. The transformation of this sleeping giant into a modern information resource is amazing and of great value to the ongoing progress of most of academic and industrial chemistry.

    Table 1.1 is a brief chronology of Beilstein - the man, the Handbook, and the database. It is easy to see that the period between the first ACS book and this book was a time of major development and change at Beilstein.

    Table 1.1 - Beilstein Chronology

    1838: Friedrich Konrad Beilstein was born in St. Petersburg, Russia

    1865: Beilstein became Professor of Organic Chemistry, St. Petersburg

    1866: Beilstein became the Chair of Chemistry at the Imperial Technical Institute in St. Petersburg

    1881: First Edition of the Beilstein Handbook (2 volumes, 1500 compounds, 2200 pages)

    1885: Second Edition (3 volumes, 4080 pages)

    1906: Third Edition (8 volumes, 11,000 pages)

    1906: Death of Friedrich Konrad Beilstein

    1918: Fourth Edition (German Chemical Society, Berlin)

    1984: Fifth Edition in English (480 volumes, 400,000 pages)

    1988: Beilstein Online on STN

    1989: Beilstein Online on DIALOG

    1990: ACS Symposium Series book - The Beilstein Database

    1995: CrossFire - Beilstein file in-house with ca. 6,000,000 organic compounds with properties data

    1996: CrossFire Gmelin - Gmelin file in-house available with ca. 1,000,000 inorganic and organometallic compounds with properties data

    1997: ACS Book - The Beilstein Database and Search System

    Even though the Beilstein Handbook of Organic Chemistry covers "only" organic chemistry, it has been a critical resource for most of the 20th century. As we approach the 21st century, the management and leadership of Beilstein Information Systems GmbH have further developed this valuable resource into a number of practical, everyday tools for the chemist. The purpose of this book, the successor to the ACS Symposium Series book entitled "The Beilstein Online Database - Implementation, Content, and Retrieval", is to demonstrate how the Beilstein database and associated software products have evolved in the 7 years since the first book was published.

    With all the changes, remodeling, and reorganization of Beilstein in the past few years into two separate organizations, the Beilstein Institute and Beilstein Information Systems (explained in further detail later in this chapter), it is probably best to first answer the question "What is Beilstein, now?" Perhaps the best way to begin to answer this question is to show what Beilstein was in the past. Beilstein was an enormous set of reference books (Figure 1.2) that, although very well organized, occupied a great deal of space. It was relatively difficult to locate information quickly when it was spread throughout the five different series composing the handbook.

    Figure 1.2. The current Beilstein Handbook, from the Basic Series to Supplements I - V.

    To find out what Beilstein is today, one need only search a number of Internet resources. A recent search showed many citations of the word "Beilstein". For example, the Lycos (2) search engine retrieved almost 600 records. Other search engines returned fewer: Alta-Vista (3) produced some 200 citations, InfoSeek (4) produced 76 citations, and WebCrawler (5) produced 59 citations. Lastly, Yahoo! (6) found Beilstein in just one citation, the Beilstein web site. These various citations included a number of references to the Beilstein Internet World Wide Web (WWW) site, www.beilstein.com, as well as many chemistry department WWW sites around the world that have either the Beilstein Handbook or access to Beilstein in electronic format. There are also references to Beilstein the town in Germany, Beilstein jade, the Beilstein mountain in Austria, and so on.

    Although there are many answers to "What is Beilstein, now?", this book will cover only those that relate to the areas of chemistry in which the Beilstein Institute (7) and Beilstein Information Systems (7) are involved. When the first ACS book was published in 1990, it was easy to say that Beilstein was the Handbook and the online database. Today, a mere 7 years later, the name "Beilstein" represents the Handbook, Beilstein Online, CrossFire, CrossFireplusReactions, CrossFire Gmelin, Autonom, Current Facts, and more.

    Evolution of the Beilstein Institute

    To create this "new world" of the 21st century Beilstein, a new administrative approach was developed and implemented by the two presidents of the Beilstein Institute during the early 1990s, Clemens Jochum and Reiner Luckenbach. The evolution of the Beilstein Institute from a world class institute to a world class modern organization involved a considerable change in approach, attitude, and internal and external operations. Beilstein has now remade itself into a commercial venture, and it is run in a most business-like manner. The transformation of all staffing and financial activities, from outsourcing the creation of the database, to the termination of German government support, to having all services paid for by users (a critical goal for the success of the Beilstein system), has produced a completely new Beilstein. Virtually nothing but the name and the high quality are the same after this massive reorganization effort.

    Beilstein has also changed administratively from being only the Beilstein Institute to being both the Beilstein Institute and Beilstein Information Systems. Beilstein was funded for years by the German publisher Springer-Verlag and the German government; now a private company, Information Handling Systems (IHS), is responsible for the finances. The new corporate structure and these overall relationships are shown in Figure 1.3 (8). In this new structure, the database is 74% owned by the Beilstein Institute and 26% owned by IHS. The marketing company, Beilstein Information Systems GmbH, is 74% owned by IHS and 26% owned by the Beilstein Institute. Lastly, the American subsidiary, Beilstein Information Systems Inc, is 100% owned by the German company Beilstein Information Systems GmbH. The new Beilstein database is now marketed and sold by IHS, a highly successful information company, which chose Beilstein as its first project in the scientific electronic database area. The Beilstein Institute staff now continues to produce only the Beilstein Handbook.

    Beilstein Corporate Structure



    Figure 1.3. The new Beilstein Corporate Structure

    Lastly, software such as Autonom (the automatic structure-to-nomenclature program) has been developed. It is from all of these changes and additions to Beilstein that the title of this chapter, The Beilstein System, was chosen. Because Beilstein is now a total system made up of many parts and activities, "system" seems to be the most appropriate term to use.

    Chemical Abstracts and Beilstein

    One other point needs to be stressed before going further. The Beilstein system is primarily a system of data, that is, extracted and processed information. Chemical Abstracts Service, a part of the ACS, has produced Chemical Abstracts (CA) since 1907. CA is another valuable and critical source of chemical information. Chemical Abstracts and Beilstein are thought of as competitors by some in the field of chemical information. In reality, the CA and Beilstein databases are complementary tools. CA covers virtually all of the chemical literature - organic, inorganic, physical, analytical, polymer, materials, and so on. CA also covers the literature from a very different and valuable perspective, by providing an abstract that is a synopsis of each article. In comparison, Beilstein primarily covers the chemical literature focused on organic chemistry, dating back as far as 1771. Beilstein covers the chemical literature in which scientific data are presented, providing all the data presented in each published article as given by the original authors. Even though Beilstein has covered "classical" chemistry for most of the 19th and 20th centuries, as it moves into the 21st century it has responded to the needs of the chemical community and expanded its coverage of the literature to include toxicological and physiological effects of chemicals. CA and Beilstein each have their audiences, which can overlap but are often quite different, and the applications of these two resources are also quite different. To obtain complete information, the research chemist must search both files to find all the hard data and other information on any compound of interest.

    NetFire

    Two potentially important topics are covered in a very cursory way in this book, because they are too new for proper presentation and discussion. The first is NetFire, the new Internet-based current awareness organic chemistry literature search service from Beilstein. NetFire was released in December 1996 and made available to the chemical community for testing and evaluation at no cost for the first half of 1997, so there is insufficient experience with it to include it here. One short description of NetFire has been written (9). Basically, NetFire is a database of the titles, authors, and abstracts of journal articles published from 1980 to the present in over 140 of the top journals in organic and medicinal chemistry. The Internet WWW query-by-form mechanism permits the formulation of a great variety of inquiries. You may search by author, by words included in the abstract, or by title (or words therein). You may also restrict your search to a certain journal or time range. Also included in the output is a cross reference number to CrossFire, for those who have access to the entire Beilstein database.
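
    The query-by-form searching described above can be pictured as a set of optional field filters that are combined with a logical AND. The following Python sketch is purely illustrative: the Article record, the search function, its parameter names, the matching rules (case-insensitive substring matching), and the sample records are all hypothetical and are not taken from NetFire, whose actual implementation is not described in this chapter.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Article:
    """Hypothetical bibliographic record with the fields named in the text."""
    title: str
    authors: Tuple[str, ...]
    abstract: str
    journal: str
    year: int


def _contains_all(text: str, words: Optional[Iterable[str]]) -> bool:
    """True if every word occurs in text (case-insensitive).

    An empty or missing word list means the field is unconstrained.
    """
    if not words:
        return True
    lowered = text.lower()
    return all(w.lower() in lowered for w in words)


def search(records: Iterable[Article],
           author: Optional[str] = None,
           title_words: Optional[Iterable[str]] = None,
           abstract_words: Optional[Iterable[str]] = None,
           journal: Optional[str] = None,
           year_from: Optional[int] = None,
           year_to: Optional[int] = None) -> List[Article]:
    """Return the records that satisfy every filled-in form field."""
    hits = []
    for r in records:
        if author and not any(author.lower() in a.lower() for a in r.authors):
            continue
        if not _contains_all(r.title, title_words):
            continue
        if not _contains_all(r.abstract, abstract_words):
            continue
        if journal and r.journal.lower() != journal.lower():
            continue
        if year_from is not None and r.year < year_from:
            continue
        if year_to is not None and r.year > year_to:
            continue
        hits.append(r)
    return hits


# Invented sample records, used only to exercise the sketch.
SAMPLE = [
    Article("Synthesis of a bicyclic nitrogen ring system", ("Doe, J.",),
            "A route to bicyclic amines is described.", "Journal A", 1985),
    Article("Catalytic hydrogenation of alkenes", ("Roe, R.", "Doe, J."),
            "Palladium catalysts are compared.", "Journal B", 1992),
    Article("Bicyclic lactams as intermediates", ("Poe, E.",),
            "Synthesis and reactions of bicyclic lactams.", "Journal A", 1996),
]

if __name__ == "__main__":
    # Title word, journal, and time range combined in one query.
    for hit in search(SAMPLE, title_words=["bicyclic"],
                      journal="Journal A", year_from=1990):
        print(hit.year, hit.title)
```

    Leaving a field empty simply removes that constraint, which is how a single form can serve both broad searches (author only) and narrow ones (title words within one journal and a range of years).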

    CrossFire Gmelin System

    The second topic covered only briefly in this book is the CrossFire Gmelin system, which is the Gmelin database provided in-house under the Beilstein CrossFire software. The Gmelin database, produced by the Gmelin Institute of Frankfurt, Germany, is the most comprehensive collection of factual data on organometallic (coordination) compounds, alloys, glasses, ceramics, minerals and physical chemistry. At present the Gmelin database contains more than 1 million chemical substances and over 900,000 reactions reported in the literature from 1772 up to the present, together with Gmelin handbook citations from 1772 to 1975. The Gmelin contents include approximately:

    470,000 coordination compounds

    55,000 alloys

    14,000 glasses and ceramics

    3,200 minerals

    With the Gmelin database as part of the in-house Beilstein CrossFire system, the two most comprehensive factual databases covering the areas of organic, inorganic and organometallic compounds are now readily available under a speedy, convenient and user friendly client-server based software system.

    Coverage of the Periodic Table

    Before outlining the details of this book, it is instructive to describe Beilstein's and Gmelin's coverage of the periodic table of the chemical elements. The chemical elements included in Beilstein and Gmelin are presented in Figures 1.4 and 1.5.




    Figure 1.4. The Periodic Table showing which elements are covered in the Beilstein database.

    Figure 1.5. The Periodic Table showing which elements are covered in the Gmelin database.

    Chapter Overviews

    Chapter 2, written by the recently retired Reiner Luckenbach, who was the president of the Beilstein Institute during the very exciting and turbulent times of the massive evolution of Beilstein into its current organizational structure, gives the background of the original Beilstein Handbook. The data quality control efforts are also described. This chapter helps put into perspective all of the other chapters in this book. It is the extensive reworking of the existing Beilstein Handbook contents, coupled with the careful addition of newly extracted data and information, that has produced the evolving Beilstein system for the 21st century.

    Chapter 3

    Even though the Beilstein system has many new features, which are described in later chapters of this book, the Beilstein Online database, one of the original components of the Beilstein system, is still a very valuable resource for chemists. Andreas Barth, of STN - Karlsruhe, who worked with the Beilstein Institute in the late 1980s to develop the first version of the online database, has revised and updated his original contribution from the ACS Symposium Series book. This chapter includes information on the general structure of the information in the database and in each chemical compound record. A variety of search examples for data, physical properties, chemical reactions and other information are given. Thus Chapter 3 serves as the basic reference point for most of the information contained in this book.

    Chapter 4

    Wendy Warr, a well known expert consultant in chemical information, working with Bernd Wollny at Beilstein Information Systems, describes the Current Facts database, its content, its search capabilities, and its value to the chemist. Current Facts on CD-ROM is the Beilstein database of structures, data, and literature citations, designed to be used on a PC to provide recent information on chemicals reported in the literature. Current Facts contains one year's worth of information and is updated quarterly, with the most recent 3 months of information replacing the oldest 3 months. The database goes back to 1990. One nice feature of this product is the structure searching capability built into it.

    Chapter 5

    Chapter 5 begins the treatment of the new Beilstein for the 21st century. This chapter, "Computer Systems for Substructure Searching", was written by one of the designers and developers of CrossFire, Dirk Walkowiak, of Softron GmbH, in conjunction with John Barnard, a chemical information consultant and an expert in structure handling and search systems. The chapter starts with a history of and introduction to chemical structure searching. From there, the specifics of the CrossFire structure search system for the in-house Beilstein system (general architecture, search algorithms, structure coding, and reaction retrieval) are described for the first time. The chapter describes the way in which CrossFire, which currently runs on the IBM RISC System/6000 computer system with a UNIX operating system (other platforms, such as DEC and Windows NT, will be available in the near future), allows for very rapid searching of the more than 7 million chemical structures in the Beilstein database.

    Chapter 6

    After the treatment of structure searching and CrossFire in Chapter 5, Alexander (Sandy) Lawson of Beilstein Information Systems discusses CrossFireplusReactions in Chapter 6. Another improvement in the Beilstein system has been the extraction of reaction information, which is the heart of the Beilstein Handbook of Organic Chemistry. One of the critical and unique features of Beilstein has been the fact that virtually all data reported in the chemical and related literature for a chemical substance are entered into the system. All compounds entered into the Beilstein database have actually been synthesized, and their structures have been assigned unambiguously.

    This chemical approach of having a database comprising actually synthesized chemical substances (as opposed to the abstracting approach of CA) is one of the basic features of Beilstein that makes it so valuable to the bench chemist. In CA, published papers are abstracted on the basis of their being published in the chemical literature that CA covers (some 13,000 journals) plus patents. Even though the bulk of this material is the same information as is found in both the Beilstein Handbook of Organic Chemistry and the database, there are differences. CA is a much larger database because its coverage goes beyond organic compounds and includes inorganic chemicals. Some chemicals that do not exist are in the CA database for two main reasons. Either they are chemicals that are being studied in theory or in computer analysis and calculation programs, or they are needed by the CA indexing system to allow for the proper indexing of derivatives (e.g., salts and parent ring systems) of a chemical. (This indexing should not be viewed as negative, because the CA indexing system and index database is one of its hidden virtues and a very valuable resource.) CA, owing to its broader coverage of the chemical literature, also has more reaction information in many cases for chemicals from more recent publications.

    The combination of the reaction information content of Beilstein, which goes back to the 18th century, and the powerful new Beilstein Commander makes the new Beilstein product, CrossFireplusReactions, a valuable and unique resource. CrossFireplusReactions is a database and search system that should be a part of every synthetic organic chemistry research laboratory.

    Chapters 7 and 8

    Even though the chemical industry is the major user of the various Beilstein products, the chemist's first exposure to Beilstein comes when he or she is in school, either as an undergraduate or graduate student. Chapters 7 and 8, written by chemical information and library experts from leading academic institutions in the United States and Europe, Engelbert (Bert) Zass (Chapter 7) and Ken Rouse and Roger Beckman (Chapter 8), describe how Beilstein is being taught and used in the university setting both in Europe and the United States.

    In Chapter 7, Bert Zass (ETH, Switzerland) provides an excellent comparison and discussion of a number of reaction databases that are available both online and in-house, with emphasis on the Beilstein reaction database and the Beilstein CrossFireplusReactions system.

    In Chapter 8, Ken Rouse (University of Wisconsin) and Roger Beckman (Indiana University) describe a consortium of universities that have joined forces to make the Beilstein database available to a large community of students and academics via the in-house CrossFire system. In addition to discussing the very positive response from the academic user community, the authors talk about the economics and finances of the system.

    Chapter 9

    From this background of the CrossFire system and examples of use in academia, Wendy Warr, in Chapter 9, discusses and explains its practical everyday value to the industrial chemist, and in particular to the pharmaceutical chemist. Training, pricing, integration of information sources, NetFire, and CrossFire Gmelin are all covered in this chapter.

    Chapter 10 describes Autonom, a software program that gives the chemical names of structures following International Union of Pure and Applied Chemistry (IUPAC) nomenclature rules. This chapter, written by the developer of AUTONOM, Janusz Wisniewski of Beilstein Information Systems, is a detailed description of the performance of this program. The algorithmic approach that was used is also discussed in detail. AUTONOM, which stands for AUTOmatic NOMenclature, is a program every chemist who has ever published a manuscript with a chemical structure will love. Chemists can easily draw structures, but few can name a structure according to the extensive and often complicated IUPAC or ACS/CAS naming rules. Instead of leaving the chemist to refer to structures as I, II, III, ..., XL, AUTONOM gives IUPAC names for organic and inorganic chemical structures. For companies that need to register substances under their proper chemical names in order to obtain approval from government and regulatory bodies, AUTONOM is a wonderful tool. This chapter describes version 2.0 of AUTONOM, the classes of organic and inorganic compounds it covers, and its limitations.

    By the time you reach the end of Chapter 10, you will be well versed in what Beilstein is today and what it is evolving into, and you will know how the various Beilstein products can be of invaluable assistance in the everyday activities of almost every chemist.

    References

    1. Heller, S. R., Editor, The Beilstein Online Database - Implementation, Content, and Retrieval; ACS Symposium Series #436, American Chemical Society, Washington, DC, 1990.

    2. The Lycos internet address is: http://www.lycos.com.

    3. The Alta-Vista internet address is: http://altavista.digital.com.

    4. The InfoSeek internet address is: http://guide.infoseek.com.

    5. The WebCrawler internet address is: http://webcrawler.com.

    6. The Yahoo! internet address is: http://www.yahoo.com.

    7. Both the Beilstein Institute and Beilstein Information Systems are located at Varrentrappstrasse 40-42, Carl-Bosch Haus, Frankfurt (Main) 90, D-60486 Germany.

    8. This figure is adapted from a talk by Bob Massie, CAS, at the Herman Skolnik ACS Award symposium for Reiner Luckenbach and Clemens Jochum, Chicago, IL, August 1995.

    9. Heller, S. R., Trends in Anal. Chem., 16, pp. 112-115 (1997). The Internet address for this article is: http://www.elsevier.nl:80/inca/homepage/saa/trac/frames.shtml