|
1
|
|
|
2
|
- Stephen R. Heller
- Alan McNaught
- Igor Pletnev
- Stephen E. Stein
- Dmitrii Tchekhovskoi
|
|
3
|
|
|
4
|
|
|
5
|
- Using an InChI/InChIKey, you can search
knowing you will find a match if it is there, without worrying whether the
structure was coded differently by another person or program.
- The InChI/InChIKey is a system
for both public and private (fee-based) sources.
|
|
6
|
|
|
7
|
- 1. Easy to generate (It will use existing software.)
- 2. Expressive (It will contain structural information.)
- 3. Unique/Unambiguous
- 4. Easy to search for structure via Internet search engines (Google,
Yahoo, Microsoft Live, etc.) using the InChI (hash) Key.
|
|
8
|
|
|
9
|
|
|
10
|
|
|
11
|
|
|
12
|
- The InChI string has been found to be too long for Internet search
engines to use, hence the need for a fixed-length InChIKey. The InChIKey
is a 25-character (14 + 8 + 1 check + 1 flag + 1 dash) hash code of the
InChI string. It is made up of four (4) parts:
- AAAAAAAAAAAAAA-BBBBBBBBCD
- 14 characters for the basic structure
- 8 characters for the layers
- 1 character is a “check” character
- 1 character is a flag indicating certain features (e.g., fixed or not
fixed hydrogen atoms)
- A hash code is a fixed-length condensed digital representation of a
variable-length character string.
- The InChIKey is based on a truncated SHA-256 cryptographic hash function
(http://en.wikipedia.org/wiki/SHA-2).
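The layout above can be read off a key by position. The following is a minimal sketch, not part of the InChI software: the name `split_inchikey` and the `InChIKeyParts` type are hypothetical, and the assignment of the last two positions (check first, then flag) simply follows the order in which this slide lists them, which is an assumption.

```python
from typing import NamedTuple


class InChIKeyParts(NamedTuple):
    basic_structure: str  # 14 characters (the "A" block)
    layers: str           # 8 characters (the "B" block)
    check: str            # 1 character ("C"; position assumed from the slide's order)
    flag: str             # 1 character ("D"; position assumed from the slide's order)


def split_inchikey(key: str) -> InChIKeyParts:
    """Split a 25-character key laid out as AAAAAAAAAAAAAA-BBBBBBBBCD."""
    prefix = "InChIKey="
    if key.startswith(prefix):
        key = key[len(prefix):]
    # 14 + 1 dash + 8 + 1 + 1 = 25 characters, with the dash at index 14.
    if len(key) != 25 or key[14] != "-":
        raise ValueError(f"not a 25-character InChIKey: {key!r}")
    return InChIKeyParts(key[:14], key[15:23], key[23], key[24])


parts = split_inchikey("InChIKey=BJHIKXHVCXFQLS-UYFOZJQFBH")
print(parts)
```

The sketch only checks the length and the dash position; it does not recompute or validate the check character.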
|
|
13
|
- The principal new features of the InChIKey are:
- A fixed-length (25-character) condensed digital representation of the
Identifier, to be known as the InChIKey. In particular, this will:
- * facilitate web searching, previously complicated by unpredictable
breaking of InChI character strings by search engines
- * allow development of a web-based InChI lookup service
- * permit an InChI representation to be stored in fixed-length fields
- * make chemical structure database indexing easier
- * allow verification of InChI strings after network transmission.
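Two of these points, fixed-length storage and verification after transmission, follow from general properties of a hash. The sketch below illustrates them with Python's standard `hashlib`. It is not the InChIKey algorithm: the real key uses its own letter encoding of a truncated SHA-256 digest, whereas this sketch just truncates the hexadecimal digest. The function names and the 22-character length are assumptions chosen for illustration.

```python
import hashlib


def short_digest(text: str, length: int = 22) -> str:
    """Return the first `length` hex characters of the SHA-256 digest of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def survived_transmission(received: str, expected_digest: str) -> bool:
    """True if the received string hashes to the digest computed by the sender."""
    return short_digest(received) == expected_digest


# The two fructose InChI strings quoted in this presentation.
d_fructose = ("InChI=1/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8"
              "/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m1/s1")
l_fructose = ("InChI=1/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8"
              "/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m0/s1")

sent = short_digest(d_fructose)
print(len(sent))                                       # always `length` characters
print(survived_transmission(d_fructose, sent))         # intact string
print(survived_transmission(d_fructose[:40], sent))    # string broken in transit
print(short_digest(l_fructose) == sent)                # a different structure
```

Whatever the length of the input string, the digest fits a fixed-width field, and a string that was cut or altered in transit no longer matches the digest computed from the original.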
|
|
14
|
|
|
15
|
D-Fructose (natural)
InChI=1/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m1/s1
InChIKey=BJHIKXHVCXFQLS-UYFOZJQFBH
L-Fructose
InChI=1/C6H12O6/c7-1-3(9)5(11)6(12)4(10)2-8/h3,5-9,11-12H,1-2H2/t3-,5-,6-/m0/s1
InChIKey=BJHIKXHVCXFQLS-FUTKDDECBR
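The two keys share their first 14-character block and differ only after the dash, which matches the layout in which the first block encodes the basic structure and the remainder encodes the other layers. A minimal sketch of that comparison follows; the helper name `same_basic_structure` is hypothetical.

```python
def same_basic_structure(key_a: str, key_b: str) -> bool:
    """Compare only the first block (before the dash) of two InChIKeys."""
    return key_a.split("-")[0] == key_b.split("-")[0]


# Keys copied from the slide above.
d_fructose_key = "BJHIKXHVCXFQLS-UYFOZJQFBH"
l_fructose_key = "BJHIKXHVCXFQLS-FUTKDDECBR"

print(same_basic_structure(d_fructose_key, l_fructose_key))  # first blocks match
print(d_fructose_key == l_fructose_key)                      # full keys differ
```

Searching on the first block alone would therefore find both enantiomers, while searching on the full key distinguishes them.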
|
|
16
|
|
|
17
|
|
|
18
|
|
|
19
|
- 1. InChI is the only publicly available method for creating a unique
chemical identifier for a given chemical structure. In addition, InChI
has a number of other valuable attributes, noted below.
- 2. InChI is free, open-source software.
- 3. Any organization (public or private) can use it for internal
and/or external structure files at no cost.
|
|
20
|
- 4. It is sponsored by IUPAC and primarily implemented by the US
scientific standards agency, NIST.
- 5. It allows the scientific and medical/healthcare communities to
use the InChIKey as a universal chemical identifier. This means
InChIs can be freely searched for via Internet search engines.
- 6. The InChIKey unlocks the data and information from all sites
around the world that choose to use it. It gives commercial chemical
information providers (e.g., Thieme, Elsevier, Thomson, Prous Science,
and John Wiley) a free structure and number/linking system.
|