Automated Generation of Input Data for Machine-Learning-Based Predictions of Ni(I) Dimer Formation
Time: Tuesday, June 23rd, 3:49pm - 3:52pm
Description: Ni(I) dimers are useful for selective synthesis and catalyze complex reactions, which is why there is high demand for finding similar catalysts. Testing candidates experimentally for their ability to form this type of catalyst is time-consuming and difficult to do systematically. Therefore, the researchers of the Schoenebeck Group at the RWTH Institute of Organic Chemistry hope to explore suitable ligands, the ions or molecules attached to the dinuclear nickel core, through machine learning (ML).
Previously, the workflow for creating the data from which ML features are extracted was only partially implemented and required a lot of manual interaction, which made it error-prone. This project focuses on developing a fully automated, Python-based framework that generates ML input for identifying ligands that form new, reactive Ni(I) dimers. The framework adapts the previous workflow to be more efficient, includes automatic error detection, and requires little user intervention. The ML data set is generated in silico by applying density functional theory (DFT) calculations to a library of so-called species that ligands and nickel can form. In addition, this approach yields an input data set large enough to provide insights into identifying novel, reactive Ni(I) dimers. This is more difficult to achieve with purely experimental data, which is often limited in size and unintentionally biased; both remain major challenges for ML in chemistry. As a next step, different ML algorithms and their effectiveness on the data set generated by the automated framework will be investigated.
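As a rough illustration of what "automated with error detection" can mean for such a workflow, the following minimal Python sketch loops over every ligand/species combination, runs a calculation, and records failures instead of stopping. It is not the framework described above: the names `JobResult`, `run_library`, and `fake_dft`, the retry policy, and the placeholder ligand and species labels are all assumptions made for this example, and `fake_dft` is a stand-in that performs no real DFT calculation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class JobResult:
    """Outcome of one calculation for a ligand/species pair (hypothetical record)."""
    ligand: str
    species: str
    energy: Optional[float] = None
    error: Optional[str] = None


def run_library(
    ligands: Iterable[str],
    species: Iterable[str],
    calculate: Callable[[str, str], float],
    max_retries: int = 1,
) -> List[JobResult]:
    """Run `calculate` for every ligand/species pair and log failures.

    A failed job is retried up to `max_retries` times; if it still fails,
    the error message is stored so the run can continue without user input.
    """
    species = list(species)  # reused in the inner loop for every ligand
    results: List[JobResult] = []
    for ligand in ligands:
        for sp in species:
            result = JobResult(ligand, sp)
            for _attempt in range(max_retries + 1):
                try:
                    result.energy = calculate(ligand, sp)
                    result.error = None
                    break
                except RuntimeError as exc:
                    result.error = str(exc)
            results.append(result)
    return results


def fake_dft(ligand: str, species: str) -> float:
    """Stand-in for a real DFT job (assumption: no quantum chemistry is run here)."""
    if ligand == "bad_ligand":
        raise RuntimeError("SCF did not converge")
    return -1.0 * len(ligand) - 0.1 * len(species)


if __name__ == "__main__":
    # Placeholder labels; a real library would hold actual ligand and species definitions.
    for job in run_library(["ligand_A", "bad_ligand"], ["monomer", "dimer"], fake_dft):
        status = "ok" if job.error is None else f"failed: {job.error}"
        print(job.ligand, job.species, status)
```

Passing the calculation in as a callable keeps the loop independent of any particular quantum chemistry program, so in this sketch the error handling can be tested without running a real job.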