Merging first-principles calculations with machine learning (ML), we aim to accelerate the exploration of catalytic behaviour in novel materials. We focus on two-dimensional (2D) Ti2CTy MXenes, whose versatile surface chemistry makes them particularly compelling candidates for catalysis. Resolving their composition and structure under realistic conditions is beyond the reach of standard density functional theory (DFT) because of its computational cost. To address this challenge, we generate a comprehensive dataset of 50,000 DFT calculations for training and 10,000 for testing, encompassing both Ti2CTy MXene configurations and molecular systems, along with an additional test dataset of 1,000 genuinely new, larger systems to investigate how well models generalise. We train and validate widely used, competitive machine learning interatomic potential (MLIP) models (EquiformerV2, MACE, MatRIS, and UPET) that accurately predict atomic forces and formation energies, the quantities that DFT must repeatedly compute for structural and catalytic investigations, for these 2D materials. This combined DFT–ML framework achieves a computational acceleration of the order of ∼1–4 × 10³ (on a CPU) while maintaining the desired level of accuracy (∼±10 meV/Å for forces and ∼±1 meV for per-atom energies), paving the way for more efficient investigations of MXene catalytic behaviour. Moreover, we perform an extensive qualitative evaluation of the trained models, showcasing the importance of a comprehensive simulation-based comparison beyond benchmark metrics. The dataset, the trained models, and the code are available at https://huggingface.co/datasets/CatalystAnonymous/catalyst_mxenes.
May 30, 2026
Merging advanced computations with machine learning, we aim to accelerate the exploration of catalytic behaviour in novel materials. We focus on two-dimensional (2D) Ti$_2$CT$_y$ MXenes, whose versatile surface chemistry makes them particularly compelling candidates for catalysis. However, resolving their composition and structure under realistic conditions requires going beyond the systems typically studied with density functional theory (DFT): the computational cost of such calculations limits the accessible system sizes and timescales and calls for more efficient approaches. To address this challenge, we generate a comprehensive dataset of 50,000 DFT calculations for training and 10,000 for testing, encompassing both Ti$_2$CT$_y$ MXene configurations and molecular systems, along with an augmented dataset in which systems are artificially repeated to investigate how well models generalise to larger systems. Employing advances in geometric deep learning, we train and validate an equivariant (i.e., symmetry-aware) model (EquiformerV2) that accurately predicts atomic forces and formation energies, the quantities that DFT must repeatedly compute for structural and catalytic investigations, for these 2D materials. This combined DFT–ML framework achieves a computational acceleration of the order of ${\sim}10^3$–$10^4$ (on a CPU) while maintaining DFT-level accuracy (${\sim} {\pm} 45$ meV/Å for forces and ${\sim} {\pm} 6$ meV for per-atom energies), paving the way for more efficient investigations of MXene catalytic behaviour. Moreover, we confirm that the model's total energy prediction error grows linearly with the number of atoms in an input system, while the force error remains constant; together with the equivariant model design, this behaviour is necessary for a robust model. The dataset, the trained models, and the code are available at \url{https://github.com/CataLiUst}.
Nov 24, 2025
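The scaling claim in the earlier abstract (total energy error growing linearly with the number of atoms while the force error stays unchanged) can be illustrated with a small sketch. It is not the authors' evaluation code. It assumes an idealised, perfectly extensive model whose predictions for a system repeated k times are exactly k times the energy and a tiled copy of the forces. All numbers are synthetic, and the helper names (`force_mae`, `energy_errors`, `repeat_system`) are hypothetical.

```python
import numpy as np


def force_mae(f_pred, f_ref):
    """Mean absolute error over all force components (here in eV/Å)."""
    return float(np.mean(np.abs(np.asarray(f_pred) - np.asarray(f_ref))))


def energy_errors(e_pred, e_ref, n_atoms):
    """Return (absolute total-energy error, absolute per-atom error) in eV."""
    total = abs(e_pred - e_ref)
    return total, total / n_atoms


def repeat_system(e_pred, e_ref, f_pred, f_ref, k):
    """Idealised k-fold repetition (assumption): energies scale by k, forces are tiled."""
    return k * e_pred, k * e_ref, np.tile(f_pred, (k, 1)), np.tile(f_ref, (k, 1))


# Synthetic toy system; none of these values come from the paper.
rng = np.random.default_rng(0)
n_atoms = 8
f_ref = rng.normal(size=(n_atoms, 3))
f_pred = f_ref + 0.05 * rng.normal(size=(n_atoms, 3))
e_ref = -40.0
e_pred = e_ref + 0.01 * n_atoms  # constant per-atom offset of 0.01 eV

base_total, base_per_atom = energy_errors(e_pred, e_ref, n_atoms)
base_force = force_mae(f_pred, f_ref)

# Repeat the system k times and recompute the same metrics.
k = 4
e_pred_k, e_ref_k, f_pred_k, f_ref_k = repeat_system(e_pred, e_ref, f_pred, f_ref, k)
rep_total, rep_per_atom = energy_errors(e_pred_k, e_ref_k, k * n_atoms)
rep_force = force_mae(f_pred_k, f_ref_k)

print(f"total energy error: {base_total:.3f} eV -> {rep_total:.3f} eV")
print(f"per-atom energy error: {base_per_atom:.4f} eV -> {rep_per_atom:.4f} eV")
print(f"force MAE: {base_force:.4f} eV/Å -> {rep_force:.4f} eV/Å")
```

Under this assumption the total error is multiplied by k while the per-atom energy error and the force MAE are unchanged, which is why per-atom energies and forces are the natural size-independent metrics to report.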