Rapid advances in deep learning have sharply increased the demand for large, high-quality databases in chemical research. LanGroup has developed several quantum chemistry datasets to support machine learning studies in chemistry.
This comprehensive quantum-chemistry database includes 443,106 small organic molecules of up to 10 atoms, with C, N, O and F as heavy atoms. It covers both ground-state and excited-state properties, making it particularly valuable for machine learning applications in excited-state research.
This dataset includes 200 molecules (16-25 heavy atoms: C, N, O, F) from ChEMBL, with conformations generated by RDKit and StoL, then optimized at the B3LYP/6-31G*/BJD3 level. It contains several optimized conformations for each molecule, providing a rich source of data for benchmarking conformation generation methods.
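The size and element criteria stated above (16-25 heavy atoms, restricted to C, N, O and F) can be expressed as a simple filter. The sketch below is illustrative only: the function names and the use of a molecular-formula string as input are assumptions made here, not the procedure actually used to select molecules from ChEMBL.

```python
import re

# Heavy elements named in the dataset description.
ALLOWED_HEAVY = {"C", "N", "O", "F"}


def parse_formula(formula: str) -> dict[str, int]:
    """Parse a flat formula such as 'C18H21NO3' into element counts.

    Hypothetical helper: handles only element symbols followed by
    optional integer counts (no brackets, charges, or isotopes).
    """
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts: dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def meets_criteria(formula: str, lo: int = 16, hi: int = 25) -> bool:
    """Return True if the heavy atoms are only C/N/O/F and number lo..hi."""
    counts = parse_formula(formula)
    heavy = {el: n for el, n in counts.items() if el != "H"}
    if not set(heavy) <= ALLOWED_HEAVY:
        return False  # contains an element outside C, N, O, F
    return lo <= sum(heavy.values()) <= hi


if __name__ == "__main__":
    for f in ["C18H21NO3", "C6H6", "C17H19ClN2O"]:
        print(f, meets_criteria(f))
```

A filter of this kind is only a pre-screen on composition; it says nothing about how conformations are generated or optimized.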
Each dataset page contains detailed information about accessing the data. Please visit the individual dataset pages for specific download links and access instructions.