Package topology migration plan¶
Status¶
Draft staged plan for moving toward the target package topology in package-topology-spec.md.
Goal¶
Reach the target topology through small, behavior-preserving slices that are easy to review and verify.
Non-goals¶
- no big-bang rename of the whole package
- no simultaneous redesign of APIs and directory layout
- no broad feature work mixed into topology-only branches
Refactor strategy¶
Prefer slices that follow this pattern: 1. extract or split logic behind tests 2. add compatibility imports/shims if needed 3. switch call sites 4. remove old path only after verification
This minimizes breakage and keeps diffs understandable.
Phase 0: invariants before moves¶
Before major moves, preserve or add tests around: - CLI command listing and help paths - HDF5 contract helpers - init plan behavior - runtime config behavior - run-case planning behavior - training config/registry behavior
If a move lacks a small invariant test, add one first.
Phase 1: rename only the most stable package boundary¶
Candidate¶
cli_tools/ -> cli/
Why first¶
- package boundary is already conceptually clean
- command behavior is already tested
- low domain ambiguity
Slice contents¶
- create
dfode_kit/cli/ - move
main.py,command_loader.py,commands/ - keep import compatibility from old paths temporarily if needed
- update tests/docs/import sites
Success check¶
dfode-kit --list-commandsunchanged- per-command
--helpbehavior unchanged - lazy command loading preserved
Phase 2: split h5_kit.py by concern before broader package renames¶
Why now¶
h5_kit.py is one of the clearest mixed-responsibility hotspots.
Target outcome¶
- keep schema checks in
data/contracts.py - move pure HDF5 I/O helpers to
data/io_hdf5.py - move model/integration helpers to more appropriate modules later
Suggested slices¶
- extract read/write/stack helpers
- switch callers
- only then rename package path if useful
Success check¶
h52npyand sampling-related HDF5 paths behave identically- no model/training logic remains hidden in an I/O-named module unless clearly temporary
Phase 3: data_operations/ -> data/¶
Why after Phase 2¶
It is easier once the most confusing mixed module is split first.
Target mapping¶
contracts.py->data/contracts.pyaugment_data.py->data/augment.pylabel_data.py->data/label.py- extracted HDF5 helpers ->
data/io_hdf5.py
Success check¶
- user-facing CLI behavior unchanged
- internal imports simpler and more domain-named
Phase 4: df_interface/ -> cases/¶
Why this is a later slice¶
There is some ambiguity in how to divide: - presets - case planning - DeepFlame-specific setup - case sampling
Recommended sequence¶
- stabilize public responsibilities first
- rename package second
- split files only where the split improves clarity immediately
Target mapping¶
case_init.py->cases/init.pyflame_configurations.py->cases/presets.pysample_case.py->cases/sampling.pyoneDflame_setup.py->cases/deepflame.pyor split further
Success check¶
initandsamplecommands still work the same way- preset behavior remains explicit and documented
Phase 5: dfode_core/ -> models/ + training/¶
Why this is larger¶
This is really two reorganizations: - model code - trainer/config/train loop code
Recommended sequence¶
- move model registry and model implementations to
models/ - move trainer config/registry/train loop to
training/ - relocate or split
preprocess.pyonly after its actual ownership is clear - remove in-package
dfode_core/test/
Success check¶
- train command behavior preserved
- model/trainer registries still work
- package navigation becomes clearer for experimentation work
Phase 6: runtime_config.py -> runtime/config.py¶
Why this is a clean later slice¶
This is small and conceptually stable, but not urgent.
Possible follow-up¶
- move reusable run-case planning/application helpers from CLI helper modules into
runtime/run_case.py
Success check¶
- runtime config file path/schema unchanged
configandrun-caseCLI behavior unchanged
Compatibility rules during migration¶
- prefer re-export shims for old import paths during transition
- mark old paths as deprecated in docs/comments before removing them
- remove shims only after downstream call sites are updated and verified
Branching guidance¶
Keep branches focused by package boundary, for example:
- refactor/cli-package-rename
- refactor/h5-io-split
- refactor/data-package-rename
- refactor/cases-package-rename
- refactor/models-training-split
- refactor/runtime-package
Verification checklist per slice¶
- update or add targeted unit tests
- run
make verify - build docs if docs or import paths changed
- smoke-check relevant CLI commands
- check for accidental eager heavy imports on help/list paths
Initial recommended order¶
cli_tools/->cli/- split
h5_kit.py data_operations/->data/df_interface/->cases/dfode_core/model->models/dfode_core/train->training/runtime_config.py->runtime/config.py- evaluate remaining
preprocess.py/ utility cleanup
Definition of done for the overall direction¶
The topology migration is meaningfully complete when: - major package names match stable responsibilities - mixed-responsibility files have been split at the obvious boundaries - old compatibility paths are removed or minimized - docs and tests refer to the new package structure - a new contributor can infer where to add code without reading large amounts of historical context