I am a third-year Postdoctoral Scholar in physical chemistry and cheminformatics at AirUCI at the University of California, Irvine (UCI). My primary research interests are algorithms, artificial intelligence, data modelling, and their applications to real-world problems. I work with Prof. Manabu Shiraiwa at the intersection of atmospheric chemistry, artificial intelligence, and Natural Language Processing (NLP). I am currently developing APIs that predict physicochemical properties of atmospheric chemical species. I also work as a data scientist, developing predictive models for the private sector and startups.
Before joining UCI, I obtained a Ph.D. in applied physical chemistry from Sorbonne University in Paris while in residence at the Institut Pierre-Simon Laplace (IPSL). During my Ph.D., I developed a subpackage of a community software package that simulates physicochemical reactions in the gas phase. Prior to my Ph.D., I pursued an MSc. degree in Physical Chemistry at the University of Copenhagen, where I worked on computational chemistry and statistical thermodynamics applied to atmospheric chemistry under the supervision of Prof. Matthew S. Johnson.
Ph.D. in Atmospheric Sciences, 2018
Sorbonne University
MSc. in Physical Chemistry, 2014
University of Copenhagen
BSc. in Chemistry, 2011
University of Padua
Developed AI-driven software for text analysis and classification, increasing information extraction and productivity by 30%
Responsibilities include:
Application of machine learning algorithms to atmospheric chemical modelling
Responsibilities include:
Applying NLP techniques (word2vec, embeddings, t-SNE, text classification) to molecular modelling for classification of atmospheric chemical reactions
Predicting molecular physical properties using supervised and unsupervised machine learning algorithms
Developing community software that simulates the generation and evolution of air pollutants (chemical kinetics)
Collaboration
Responsibilities include:
Gecko2vec is an embedding software based on mol2vec. It applies the word2vec algorithm to the representation of atmospheric molecules. It builds a large and unique database of atmospheric molecules, in which the embedding representations retain information on molecular structure (i.e. the distribution of functional groups) and chemical composition. This allows further investigation of molecular properties via machine learning algorithms.
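The idea of treating molecules as text can be illustrated with a minimal sketch. This is not the gecko2vec code: the token names, the way molecules are split into tokens, and the 2-dimensional vectors are all hypothetical toy values chosen for illustration. The sketch shows the (center, context) pairs that a skip-gram word2vec model would train on, and one common mol2vec-style way to obtain a molecule vector, namely summing the vectors of its substructure tokens.

```python
import math

# Hypothetical "sentences": each molecule is written as a sequence of
# substructure tokens (token names are illustrative only).
molecules = {
    "ethanol": ["CH3", "CH2", "OH"],
    "propanol": ["CH3", "CH2", "CH2", "OH"],
    "acetaldehyde": ["CH3", "CHO"],
}


def skipgram_pairs(tokens, window=1):
    """Return the (center, context) pairs a skip-gram model would train on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy 2-dimensional token vectors standing in for trained embeddings
# (assumed values, not the output of any real training run).
token_vectors = {
    "CH3": [1.0, 0.0],
    "CH2": [0.8, 0.2],
    "OH": [0.0, 1.0],
    "CHO": [0.3, -0.9],
}


def molecule_vector(tokens):
    """Sum the token vectors to get a single vector for the molecule."""
    dim = len(next(iter(token_vectors.values())))
    return [sum(token_vectors[t][d] for t in tokens) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


vectors = {name: molecule_vector(toks) for name, toks in molecules.items()}

print(skipgram_pairs(molecules["ethanol"]))
print(cosine(vectors["ethanol"], vectors["propanol"]))
print(cosine(vectors["ethanol"], vectors["acetaldehyde"]))
```

With these toy values, the two alcohols end up closer to each other than to the aldehyde, which is the kind of functional-group information the embeddings are meant to retain.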
The mechanism synthesizer is an autoencoder that unfolds and reduces automatically generated chemical mechanisms of atmospheric chemistry. The synthesizer encodes chemical reactions in a multidimensional chemical space and identifies the most representative reactions via unsupervised learning algorithms. It relies on multidimensional representations of atmospheric molecules obtained from a word2vec implementation (i.e. gecko2vec) and on Natural Language Processing algorithms for text and reaction classification.
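The selection step can be illustrated with a small sketch. It does not implement the autoencoder and is not the synthesizer's code: the reaction names and their 2-dimensional encodings are hypothetical, and k-means clustering with a "closest member to each centroid" rule is only one assumed choice of unsupervised algorithm for picking representative reactions.

```python
def dist2(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-first initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(far))
    clusters = []
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(idx)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            mean([points[i] for i in members]) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return centroids, clusters


def select_representatives(encoded, k):
    """Return, for each cluster, the reaction closest to the cluster centroid."""
    names = list(encoded)
    points = [encoded[n] for n in names]
    centroids, clusters = kmeans(points, k)
    reps = []
    for c, members in enumerate(clusters):
        if members:
            best = min(members, key=lambda i: dist2(points[i], centroids[c]))
            reps.append(names[best])
    return reps


# Hypothetical encoded reactions (toy 2-D vectors, two obvious groups).
reactions = {
    "rxn_a": [0.0, 0.0],
    "rxn_b": [0.2, 0.1],
    "rxn_c": [0.4, 0.5],
    "rxn_d": [5.0, 5.0],
    "rxn_e": [5.3, 5.1],
    "rxn_f": [5.6, 5.5],
}

print(select_representatives(reactions, k=2))
```

Keeping one actual reaction per cluster, rather than the centroid itself, means the reduced mechanism contains only reactions that exist in the original one.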