Article Open access Peer-reviewed

Employing fingerprinting of medicinal plants by means of LC-MS and machine learning for species identification task

2018; Nature Portfolio; Volume: 8; Issue: 1 Language: English

10.1038/s41598-018-35399-z

ISSN

2045-2322

Authors

Pavel Kharyuk, Dmitry Nazarenko, Ivan Oseledets, И. А. Родин, О. А. Шпигун, Andrey Tsitsilin, Mikhail Lavrentyev

Topic(s)

Traditional Chinese Medicine Studies

Abstract

A dataset of liquid chromatography-mass spectrometry (LC-MS) measurements of medicinal plant extracts from 74 species was generated and used to train and validate plant species identification algorithms. Various strategies for data handling and feature space extraction were tested. Constrained Tucker decomposition, large-scale (more than 1500 variables) discrete Bayesian networks, and autoencoder-based dimensionality reduction coupled with a continuous Bayes classifier and logistic regression were optimized to achieve the best accuracy. Even with all retention time values eliminated, accuracies of up to around 85% were achieved on the validation set for plant species and plant organ identification. The benefits and drawbacks of the algorithms used were discussed. A preliminary test showed that the developed approaches tolerate changes in the data caused by different extraction methods and/or equipment. The dataset, comprising more than 2200 chromatograms, was published in an open repository.

Reference(s)