NLP

Gecko2vec

Gecko2vec is embedding software based on mol2vec. It applies the word2vec algorithm to the representation of atmospheric molecules. It builds a large, unique database of atmospheric molecules in which the embeddings retain information on molecular structure (i.e. the distribution of functional groups) and chemical composition. These embeddings support further investigation of molecular properties with machine learning algorithms.
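As a rough illustration of the mol2vec-style idea, a molecule can be treated as a "sentence" of substructure "words", and its embedding taken as the sum of the word vectors. The sketch below is not Gecko2vec's code: the token names, the three-dimensional vectors, and the helper functions embed_molecule and cosine_similarity are all hypothetical, and the vectors are hand-written rather than learned by word2vec.

```python
import math

# Hypothetical 3-dimensional vectors for substructure "words".
# In a real pipeline these would be learned by word2vec from a corpus of
# molecule "sentences"; the numbers here are made up for illustration only.
SUBSTRUCTURE_VECTORS = {
    "CH3": [0.9, 0.1, 0.0],
    "CH2": [0.8, 0.2, 0.0],
    "OH":  [0.1, 0.9, 0.2],
    "OOH": [0.0, 0.8, 0.9],
}


def embed_molecule(tokens):
    """Sum the vectors of a molecule's substructure tokens."""
    dim = len(next(iter(SUBSTRUCTURE_VECTORS.values())))
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(SUBSTRUCTURE_VECTORS[token]):
            total[i] += value
    return total


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy molecules written as lists of functional-group tokens (assumed encoding).
ethanol = embed_molecule(["CH3", "CH2", "OH"])
propanol = embed_molecule(["CH3", "CH2", "CH2", "OH"])
methyl_hydroperoxide = embed_molecule(["CH3", "OOH"])

sim_alcohols = cosine_similarity(ethanol, propanol)
sim_mixed = cosine_similarity(ethanol, methyl_hydroperoxide)
print(f"ethanol vs propanol:             {sim_alcohols:.3f}")
print(f"ethanol vs methyl hydroperoxide: {sim_mixed:.3f}")
```

With these toy vectors the two alcohols end up closer to each other than ethanol is to the hydroperoxide, which is the kind of structural signal the embeddings are meant to retain.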

Mechanism Synthesizer

The mechanism synthesizer is an autoencoder that unfolds and reduces automatically generated chemical mechanisms of atmospheric chemistry. The synthesizer encodes chemical reactions in a multidimensional chemical space and identifies the most representative reactions with unsupervised learning algorithms. It relies on multidimensional representations of atmospheric molecules from a word2vec implementation (i.e. Gecko2vec) and on natural language processing algorithms for text and reaction classification.
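One possible way to pick "representative" reactions from encoded vectors is to cluster them and keep the reaction nearest to each cluster centre. The sketch below uses a small k-means as a stand-in, since the description does not name the unsupervised algorithm actually used, and it skips the autoencoder entirely. The reaction labels, the two-dimensional vectors, and the functions kmeans and representative_reactions are hypothetical.

```python
import math


def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Add the point that is farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(distance(p, c) for c in centroids))
        )
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def representative_reactions(reaction_vectors, k):
    """Return the reaction closest to each of the k cluster centroids."""
    names = list(reaction_vectors)
    centroids = kmeans([reaction_vectors[n] for n in names], k)
    return sorted(
        min(names, key=lambda n: distance(reaction_vectors[n], c))
        for c in centroids
    )


# Made-up encoded reactions; in practice these would come from the encoder.
REACTION_VECTORS = {
    "R1": [0.0, 0.0],
    "R2": [0.2, 0.0],
    "R3": [0.1, 0.1],
    "R4": [5.0, 5.0],
    "R5": [5.2, 5.0],
    "R6": [5.1, 4.9],
}

representatives = representative_reactions(REACTION_VECTORS, k=2)
print(representatives)
```

Here six toy reactions collapse to two representatives, one per cluster, which mirrors the reduction step in spirit; the choice of k and of the distance metric are assumptions of this sketch.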