Chemical Name to Structure: OPSIN, an Open Source Solution
2011; American Chemical Society; Volume: 51; Issue: 3; Language: English
DOI: 10.1021/ci100384d
ISSN: 1549-960X
Authors: Daniel M. Lowe, Peter Corbett, Peter Murray‐Rust, Robert C. Glen
Topic(s): Analytical Chemistry and Chromatography
Abstract: We have produced an open source, freely available algorithm (Open Parser for Systematic IUPAC Nomenclature, OPSIN) that interprets the majority of organic chemical nomenclature in a fast and precise manner. This has been achieved using an approach based on a regular grammar. This grammar is used to guide tokenization, a potentially difficult problem in chemical names. From the parsed chemical name, an XML parse tree is constructed that is operated on in a stepwise manner until the structure has been reconstructed from the name. Results from OPSIN on various computer-generated name/structure pair sets are presented. These show exceptionally high precision (99.8%+) and, when using general organic chemical nomenclature, high recall (98.7–99.2%). This software can serve as the basis for future open source developments of chemical name interpretation.
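
The idea of letting a regular grammar guide tokenization, and then building an XML parse tree from the tokens, can be illustrated with a small sketch. This is not OPSIN's grammar, lexicon, or API: the token classes, the state names, and the functions `tokenize` and `to_xml` are all hypothetical, and the toy vocabulary covers only a handful of simple alkane names. The grammar is written as a finite-state machine whose current state decides which token classes may be tried next, so that, for example, "meth" is read as a stem in "methane" but as part of the substituent "methyl" in "2-methylpropane".

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon (NOT OPSIN's): token class -> pattern.
LEXICON = {
    "locant": re.compile(r"\d+(?:,\d+)*"),
    "hyphen": re.compile(r"-"),
    "multiplier": re.compile(r"di|tri|tetra"),
    "substituent": re.compile(r"methyl|ethyl|chloro|bromo"),
    "stem": re.compile(r"meth|eth|prop|but|pent|hex"),
    "suffix": re.compile(r"ane|ene"),
}

# Hypothetical regular grammar as a finite-state machine:
# state -> list of (allowed token class, next state).
TRANSITIONS = {
    "start": [("locant", "after_locant"), ("multiplier", "after_multiplier"),
              ("substituent", "after_substituent"), ("stem", "after_stem")],
    "after_locant": [("hyphen", "after_hyphen")],
    "after_hyphen": [("multiplier", "after_multiplier"),
                     ("substituent", "after_substituent")],
    "after_multiplier": [("substituent", "after_substituent")],
    "after_substituent": [("hyphen", "expect_locant"),
                          ("multiplier", "after_multiplier"),
                          ("substituent", "after_substituent"),
                          ("stem", "after_stem")],
    "expect_locant": [("locant", "after_locant")],
    "after_stem": [("suffix", "end")],
    "end": [],
}
ACCEPTING = {"end"}


def tokenize(name, pos=0, state="start"):
    """Return a list of (token class, text) pairs, or None if the name is rejected.

    Only token classes permitted by the current grammar state are tried;
    if a choice leads to a dead end, the search backtracks to the next one.
    """
    if pos == len(name):
        return [] if state in ACCEPTING else None
    for token_class, next_state in TRANSITIONS[state]:
        match = LEXICON[token_class].match(name, pos)
        if match:
            rest = tokenize(name, match.end(), next_state)
            if rest is not None:
                return [(token_class, match.group())] + rest
    return None


def to_xml(tokens):
    """Wrap the token sequence in a flat XML tree, one element per token."""
    root = ET.Element("molecule")
    for token_class, text in tokens:
        ET.SubElement(root, token_class).text = text
    return root


if __name__ == "__main__":
    tokens = tokenize("2,2-dimethylpropane")
    print(ET.tostring(to_xml(tokens), encoding="unicode"))
```

In this sketch a name that the grammar cannot derive, such as "2-propane", is rejected by returning `None` rather than being tokenized incorrectly, which loosely mirrors the preference for precision described in the abstract. Turning the resulting tree into a chemical structure, the stepwise processing the abstract mentions, is outside the scope of this example.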