Small Molecule
Big Data
Infinite Truth

QCDGE_CI: Quantum Chemistry Dataset with Ground-state and Conical-Intersection Structures

Conical intersections (CIs) play a central role in photoinduced reactions. However, comprehensive conical-intersection datasets that could advance our understanding of excited-state reaction processes remain scarce. To address this gap, we constructed a quantum chemistry dataset containing ground-state and conical-intersection structures of small molecules (up to ten heavy atoms: C, N, O, F). Ground-state geometries were optimized at the semi-empirical OM2 level, with single-point energies calculated at the OM2/MRCI level. Conical-intersection geometries and energies were also computed at the OM2/MRCI level. This dataset is designed to enable a deep integration of photochemistry with machine learning, bridging the gap between photochemical insight and data-driven approaches.

How to access?

If you are downloading over the public web, please download via url.

If you are downloading over the LAN, please download via url.

The following files are provided:

  1. Final_property.hdf5: A binary file that stores the geometries and energies of all compounds, including their ground-state minima and S₀-S₁ CIs.
  2. final_all.csv: An auxiliary file containing the InChI strings, SMILES strings, compound types, ring numbers, and heavy atom counts.
  3. extract_data_from_QCDGE_CI.py: A script for extracting molecular geometries and energies from the HDF5 file.
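
Before running the provided extraction script, it can help to check what the two data files actually contain. The following is a minimal sketch, not the dataset's own tooling: it assumes the third-party h5py package is installed, and it makes no assumption about the group or dataset names inside Final_property.hdf5 or the column names in final_all.csv, since neither is documented here. The helper names list_hdf5_datasets and read_csv_header are hypothetical.

```python
import csv


def list_hdf5_datasets(path):
    """Return (name, shape, dtype) for every dataset found in an HDF5 file."""
    # h5py is a third-party package (assumed installed); it is imported here
    # so that the CSV helper below still works without it.
    import h5py

    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


def read_csv_header(path):
    """Return the column names from the first row of a CSV file."""
    with open(path, newline="", encoding="utf-8") as fh:
        return next(csv.reader(fh))
```

Calling list_hdf5_datasets("Final_property.hdf5") and read_csv_header("final_all.csv") from the download directory lists the stored arrays and the auxiliary columns, which shows how the file is organized before any geometries or energies are extracted.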

How to cite?

Citation information will be provided in the subsequent paper.