Classification of Hemilabile Ligands Using Machine Learning

Abstract

Discovery of hemilabile ligands that optimally balance reactivity and stability is important for identifying novel catalyst structures. We design a workflow for identifying ligands in the Cambridge Structural Database (CSD) that have been crystallized with distinct denticities and are thus identifiable as hemilabile ligands. To overcome the difficulty of identifying negative examples (non-hemilabile ligands) in our data set, we implement a semi-supervised learning approach using a label-spreading algorithm together with a set of heuristic rules based on ligand frequency of appearance. We show that a heuristic based on coordinating atom identity alone is not sufficient to determine whether a ligand is hemilabile, and that our trained machine-learning classification models are instead needed to predict whether a bi-, tri-, or tetradentate ligand is hemilabile with high accuracy and precision. We gain deeper insight into the factors that govern ligand hemilability by conducting feature importance analysis on our models, finding that the second, third, and fourth coordination spheres all play an important role in ligand hemilability.
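
To illustrate the general idea behind label spreading mentioned in the abstract, the following is a minimal pure-Python sketch of the standard iterative scheme on a toy data set. It is not the authors' implementation: the two-dimensional feature vectors, the RBF affinity, the parameter values (`alpha`, `gamma`, `n_iter`), and the function name `label_spreading` are all assumptions made for illustration, and the frequency-based heuristic rules from the paper are not reproduced here. Labels of 1 stand for "hemilabile", 0 for "non-hemilabile", and -1 for "unlabeled".

```python
import math

def label_spreading(X, y, alpha=0.8, gamma=1.0, n_iter=200):
    """Propagate known labels to unlabeled points (marked -1).

    Iterates F <- alpha * S @ F + (1 - alpha) * Y, where S is the
    symmetrically normalized RBF affinity matrix and Y holds the
    one-hot encoded known labels.
    """
    n = len(X)
    classes = sorted({label for label in y if label != -1})
    n_cls = len(classes)

    # RBF affinity between every pair of points, with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
                W[i][j] = math.exp(-gamma * d2)

    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

    # One-hot matrix of the initially known labels (all zeros if unlabeled).
    Y = [[1.0 if y[i] == c else 0.0 for c in classes] for i in range(n)]

    # Iteratively spread label mass along the affinity graph.
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [[alpha * sum(S[i][k] * F[k][c] for k in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_cls)]
             for i in range(n)]

    # Assign each point the class with the largest accumulated score.
    return [classes[max(range(n_cls), key=lambda c: F[i][c])]
            for i in range(n)]

# Hypothetical toy descriptors: two well-separated clusters of "ligands",
# with only one labeled example in each cluster.
features = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
labels = [1, -1, -1, 0, -1, -1]

predicted = label_spreading(features, labels)
print(predicted)
```

In this toy setting each unlabeled point inherits the label of the single labeled point in its cluster, which is the behavior that makes the approach useful when confirmed negative examples are scarce.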

Publication
submitted
Ilia Kevlishvili
Postdoctoral Associate
Chenru Duan
Chemistry PhD
Heather J. Kulik
Associate Professor of Chemical Engineering