Architecture¶
Current repository structure¶
dfode_kit/cli/: CLI entrypoints and subcommandsdfode_kit/cases/: explicit case init, presets, sampling, and DeepFlame-facing helpersdfode_kit/data/: contracts, HDF5 I/O, integration, augmentation, and labeling utilitiesdfode_kit/models/: model architectures and registriesdfode_kit/training/: training configuration, training loops, registries, and preprocessingdocs/agents/: agent-facing operational and planning docstests/: lightweight repository and harness tests
Current refactor themes¶
1. Harness engineering¶
The repository now includes:
AGENTS.md- local verification commands
- lightweight CI
- documentation topology for agents and maintainers
2. Data contracts and workflow boundaries¶
A contracts layer is used to make HDF5 dataset assumptions explicit and testable.
The canonical dfode_kit.data package now also owns the main data-preparation boundary:
- HDF5 sampling outputs
- HDF5-to-NumPy conversion
- perturbation-based augmentation
- CVODE/Cantera labeling
- integration utilities used by downstream workflows
3. Config-driven training¶
The training stack is moving toward explicit config objects and registries so new model architectures and trainer types can be added without editing a monolithic training loop.
4. Agent-friendly CLI¶
The CLI now uses lighter command discovery and deferred heavy imports for improved usability in minimal environments.
Architectural end state of the recent refactor¶
The repository has now completed the transition away from the older compatibility layout. In particular, these legacy layers are removed from main:
dfode_kit/cli_tools/dfode_kit/df_interface/dfode_kit/data_operations/dfode_kit/runtime_config.py- legacy
dfode_coremodel/train compatibility packages
The current published docs should therefore treat cli, cases, data, models, runtime, and training as the only canonical implementation homes.