Rapid advances in deep learning have sharply increased the demand for large, high-quality databases in chemical research. LanGroup has developed several quantum chemistry datasets to support machine learning studies in chemistry.
A comprehensive quantum-chemistry database of 443,106 small organic molecules with up to 10 heavy atoms (C, N, O and F). It covers both ground-state and excited-state properties, making it particularly valuable for machine learning applications in excited-state research.
The first quantum chemistry repository of its scale to systematically combine ground-state and S₀–S₁ conical intersection (CI) geometries and energies. The dataset contains 260,541 small molecules (C, N, O, F; up to 10 heavy atoms). It lets researchers train models that predict CI geometries and energies directly and at scale, opening the door to high-throughput virtual screening of photoreactions and accelerated mechanistic discovery.
This dataset includes 200 molecules from ChEMBL (16-25 heavy atoms: C, N, O, F), with conformations generated by RDKit and StoL and then optimized at the B3LYP/6-31G*/BJD3 level. Each molecule has several optimized conformations, providing a rich source of data for benchmarking conformation generation methods.
NAMD simulation results for keto isocytosine, published in Phys. Chem. Chem. Phys., 2022, 24, 24362-24382.
NAMD simulation results for C₂N₂H₆.
Each dataset has its own page with download links and access instructions.