The most widely used, and perhaps best understood, object of discussion in
chemistry is the chemical compound. We define chemical compounds by their
chemical structure, typically shown as 2D diagrams or held as ‘connection
tables’.
|
|
|
|
Pronounceable names have been developed for oral and written communication,
though deriving a name from a structure can require highly complex rules known
only to experts. ‘Understanding’ these names requires reversing the naming
process to recover the original structure. Names are therefore a less direct,
and often less effective, means of identifying chemicals.
|
|
|
|
In the current digital age, where compounds are represented digitally, the
need for effective identifiers is no less pressing. Freed from the restriction
of ‘pronounceability’, chemical identifiers can be tied more directly to
structures. In fact, they can be derived directly from a structure by
algorithm, so that any structure that can be drawn can be ‘identified’.
|
|
|
|
This project aims to develop such a set of algorithms, whose output serves as
the unique identifier for a compound: its digital signature. Since a series of
characters is the usual means of storing and transmitting information, the
output format is such a string, derived from a structure.
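As a rough illustration of what deriving a string from a structure can mean, the sketch below turns a small connection table into a string that does not depend on how the atoms happen to be numbered. It is not the project's algorithm: the function name `canonical_identifier`, the input layout (a list of element symbols plus a list of bonds), and the output syntax are all assumptions made for this example. It tries every atom numbering and keeps the smallest serialization, which is only practical for very small structures.

```python
from itertools import permutations


def canonical_identifier(atoms, bonds):
    """Return a string that depends only on the structure, not on atom numbering.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, order) tuples using 0-based atom indices

    Hypothetical toy scheme: brute force over all renumberings (factorial cost).
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] gives the new index of that atom.
        new_atoms = [None] * n
        for old, new in enumerate(perm):
            new_atoms[new] = atoms[old]
        # Rewrite each bond with the new indices, lower index first, then sort.
        new_bonds = sorted(
            (min(perm[i], perm[j]), max(perm[i], perm[j]), order)
            for i, j, order in bonds
        )
        candidate = (tuple(new_atoms), tuple(new_bonds))
        # Keep the lexicographically smallest description seen so far.
        if best is None or candidate < best:
            best = candidate
    atom_part = ".".join(best[0])
    bond_part = ",".join(f"{i}-{j}:{order}" for i, j, order in best[1])
    return f"{atom_part}/{bond_part}"


# Two different numberings of the same heavy-atom skeleton (C-C-O) ...
ethanol_a = canonical_identifier(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_b = canonical_identifier(["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])
# ... and a different structure built from the same atoms (C-O-C).
ether = canonical_identifier(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

print(ethanol_a == ethanol_b)  # same structure, same string
print(ethanol_a == ether)      # different structure, different string
```

The point of the design is that the string is a function of the structure alone: any drawing of the same connection table yields the same characters, and a different connectivity yields different ones. A practical scheme would replace the exhaustive search with an efficient canonical numbering and would also have to handle hydrogens, charges, stereochemistry and similar details that this sketch ignores.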
|
|
|
|
|
|
While the most fundamental description of the identity of a compound is its
structure, this requires a picture, which is not usable in speech and is often
inconvenient in text. Pronounceable names are very efficient for common
substances, which often acquire a ‘trivial’ name, but can be cumbersome or
impossible for complex compounds.
|
|
|
|
|
|
Describe what has been done and what remains to be done.
|