|
1
|
- Stephen Heller, Stephen Stein, & Dmitrii Tchekhovskoi
- Physical & Chemical Properties Division
- NIST
- Gaithersburg, MD
- srheller@nist.gov
|
|
2
|
|
|
3
|
- Costs are high
- No cost for manuscript submission. Under ANY economic model, the high
volume of manuscripts submitted via the Internet will overwhelm the
system.
- Lack of leadership at research institutions to demand changes in
researchers' publication behavior.
- Difficulty of instituting change
|
|
4
|
1. Institutional membership of NAR means that corresponding
authors based at the member institution will qualify for substantially
discounted NAR Open Access publication charges ($950 compared with $1900
per article in 2006).
2. Online access free of charge in 2006. Please note that a 2006
print subscription does not include institutional membership.
3. Online access will be completely free of charge in 2005. A
print subscription or institutional membership provides discounted
publication charges for corresponding authors based at the member
institution. See www.nar.oupjournals.org/openaccess
and
http://www3.oup.co.uk/nar/special/14/default.html
|
|
5
|
|
|
6
|
|
|
7
|
|
|
8
|
|
|
9
|
|
|
10
|
|
|
11
|
- Chemical structure is the true ‘identifier’
- But, structure representations are not unique or convenient for
computers.
- So, convert structure to a unique ‘name’ by fixed algorithms
- The IUPAC International Chemical Identifier (InChI)
|
|
12
|
- Chemicals
- Fast isomerization (tautomerization)
- Ill-defined connectivity
- Chemists
- Differing conventions
- Depends on discipline, education and convenience
- Imprecision/uncertainty
|
|
13
|
- Chemistry
- ‘Normalize’ Input Structure
- Math
- ‘Canonicalize’ (label the atoms)
- Equivalent atoms get the same label
- Format
- ‘Serialize’ Labeled Structure
- Output as character string (‘name’)
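The canonicalize and serialize steps above can be illustrated with a toy sketch. It is not the InChI algorithm: it assumes a molecule given as a list of heavy-atom symbols plus bonds, skips normalization, and finds the canonical numbering by brute force over all renumberings, which is only practical for very small structures. The function name `canonical_string` and the output format are hypothetical.

```python
from itertools import permutations


def canonical_string(elements, bonds):
    """Toy canonicalizer: try every atom renumbering and keep the
    lexicographically smallest (atoms, bonds) description.
    Brute force, for illustration only; not the InChI algorithm."""
    n = len(elements)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] gives the new index of that atom
        relabeled_elements = [None] * n
        for old, new in enumerate(perm):
            relabeled_elements[new] = elements[old]
        relabeled_bonds = sorted(
            tuple(sorted((perm[a], perm[b]))) for a, b in bonds
        )
        candidate = (tuple(relabeled_elements), tuple(relabeled_bonds))
        if best is None or candidate < best:
            best = candidate
    atoms, edges = best
    # Serialize: atom symbols, then 1-based bond list
    atom_part = "".join(atoms)
    bond_part = ",".join(f"{a + 1}-{b + 1}" for a, b in edges)
    return f"{atom_part}/{bond_part}"


# Ethanol heavy atoms entered in two different orders: C-C-O and O-C-C
drawing_1 = canonical_string(["C", "C", "O"], [(0, 1), (1, 2)])
drawing_2 = canonical_string(["O", "C", "C"], [(0, 1), (1, 2)])
print(drawing_1, drawing_2, drawing_1 == drawing_2)
```

Both input orders yield the same string, which is the point of canonical labeling: the 'name' depends on the structure, not on how it was drawn or numbered.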
|
|
14
|
- Divide structure into ‘layers’
- Each layer ‘refines’ structure
- Ignore ‘Electron Density’
- Use simple ‘connectivity’ only
- Ignore bond type and electron location
- Stereochemistry
- sp2 and sp3 only
- Free rotation around single bonds
- No Z/E stereo for small rings (default)
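The layered design can be seen by splitting an identifier string at its `/` separators. The sketch below is a simplified reader, not part of the InChI software: the function `split_layers` and the descriptive layer labels are assumptions made for illustration, and the example uses the standard InChI for ethanol (prefix `1S`) rather than a version 1.0 string. Repeated layers, such as those in fixed-hydrogen or reconnected sections, are not handled.

```python
# Informal labels for single-letter layer prefixes (labels are illustrative)
LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protons",
    "b": "double-bond stereo",
    "t": "tetrahedral stereo",
    "m": "enantiomer flag",
    "s": "stereo type",
    "i": "isotopes",
}


def split_layers(inchi):
    """Split an InChI string into a dict of layers (simplified sketch)."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI" or not body:
        raise ValueError("not an InChI string")
    # First two segments are the version and the chemical formula
    version, formula, *rest = body.split("/")
    layers = {"version": version, "formula": formula}
    for segment in rest:
        # Unknown prefixes are kept under their own letter
        name = LAYER_NAMES.get(segment[:1], segment[:1])
        layers[name] = segment[1:]
    return layers


ethanol = split_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
for name, value in ethanol.items():
    print(f"{name}: {value}")
```

Each later layer refines the earlier ones: two records that agree on formula and connectivity but differ in a stereo or isotope layer describe the same skeleton at different levels of detail.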
|
|
15
|
|
|
16
|
|
|
17
|
|
|
18
|
- Identify compounds at the known level of detail
- Convention-free (mostly)
- Generate quickly from structure
- Contains all essential connectivity information
- Simple ASCII representation
|
|
19
|
|
|
20
|
|
|
21
|
|
|
22
|
|
|
23
|
|
|
24
|
- Description:
Version 1.0 of the Identifier expresses chemical structures in a
standard machine-readable format, in terms of atomic connectivity,
tautomeric state, isotopes, stereochemistry, and electronic charge. It
deals with neutral and ionic well-defined, covalently-bonded organic
molecules, and also with inorganic, organometallic and coordination
compounds.
- We propose to promote actively the use of the algorithm and its
associated implementations to developers of commercial chemical
software, database compilers and publishers of chemical information, in
order to enable sharing of molecular information throughout the
worldwide community of chemical scientists.
- We propose also to extend the applicability of the Identifier to
polymeric structures, and to explore the need for and the practicality
of an extension to cover Markush structures.
- In addition, we will evaluate the need for inclusion of information on
other attributes such as phases and excited states, and take steps to
include such information if appropriate.
|
|
25
|
- 1. Sophie Rovner, C&E News, "CHEMICAL 'NAMING' METHOD UNVEILED",
August 22, 2005, Volume 83, Number 34, pp. 39-40
- 2. International chemical identifier goes online, Chem. World, 16 May
2005
- 3. M.D. Prasanna, J. Vondrasek, A. Wlodawer and T.N. Bhat, Application
of InChI to Curate, Index, and Query 3-D Structures, Proteins:
Structure, Function, and Bioinformatics, 2005, 60, 1-4
- 4. Enhancement of the chemical semantic web through the use of InChI
identifiers, S.J. Coles, N.E. Day, P. Murray-Rust, H.S. Rzepa and Y.
Zhang, Org. Biomol. Chem., 2005, 3(10), 1832-1834
- 5. InChI FAQ, by Nick Day (Unilever Centre for Molecular Informatics,
Cambridge University)
- 6. Representation and Use of Chemistry in the Global Electronic Age, P.
Murray-Rust, H.S. Rzepa, S.M. Tyrrell and Y. Zhang, Org. Biomol. Chem.,
2004, 3192-3203 [www.ch.ic.ac.uk/rzepa/obc/]
- 7. That INChI feeling, Reactive Reports, issue 40, Sep 2004
- 8. Unique labels for compounds, Chem. & Eng. News, 2 Dec 2002
- 9. Chemists synthesize a single naming system, Nature, 23 May 2002
- 10. That IChI feeling ... The Alchemist, 24 Apr 2002
- 11. What's in a Name? The Alchemist, 21 Mar 2002
- 12. Stephen E. Stein, Stephen R. Heller, and Dmitrii
Tchekhovskoi, An Open Standard for Chemical Structure Representation:
The IUPAC Chemical Identifier, Proceedings of the 2003 International
Chemical Information Conference (Nimes), Infonortics, pp. 131-143.
|
|
26
|
- NIST – 150,000 structures
- PubChem project – 5.2+ million structures
- ISI – 2+ million structures
- IBM – 1.6+ million structures
- NCI Database – 23+ million structures
- EPA – DSSTox database – 1,450 structures
- KEGG database – 9,584 structures
- UCSF ZINC – 3.3 million structures
|
|
27
|
|
|
28
|
|
|
29
|
|
|
30
|
|
|
31
|
|
|
32
|
- Steve Bachrach, Steve Bryant, Denise Creech, Nick Day, Rene Deplanque,
Guenter Grethe, Stevan Harnad, Sami Kassab, Gary Mallard, Randy
Marcinko, Alan McNaught, Bill Milne, Carmen Nitsche, Chris Reed, Rich
Roberts, Peter Murray-Rust, Henry Rzepa, Steve Stein, Peter Shepherd,
Bill Town, Andrea Twiss-Brooks, Wendy Warr, and Ann Wolpert
|