Workflow Contract#

Back to reference

macroforecast keeps statistical functions callable and puts workflow composition in the forecasting runner.

Ownership#

Module

Owns

Does not own

macroforecast.data

Canonical pandas panels and metadata.

Model-ready matrices.

macroforecast.preprocessing

Callable data-cleaning transforms and reusable preprocessing specs.

Expanding/rolling schedule decisions.

macroforecast.filters

Callable one-series filters and smoothers, including HP, Hamilton, Savitzky-Golay, wavelet-style components, and AlbaMA.

Panel feature matrices or forecast-origin scheduling.

macroforecast.feature_engineering

Callable feature/target transforms, filter-to-feature wrappers, and reusable feature specs.

Forecast-origin scheduling.

macroforecast.feature_analysis

Feature-matrix and learned-weight diagnostics after feature construction.

Predictor construction or model fitting.

macroforecast.window

Estimation mode, validation/test time frames, retrain/retune cadence, and reusable stage policies.

Low-level transformation formulas.

macroforecast.models

Single-estimator model callables and model-owned hyperparameter spaces.

Forecast loops, forecast combination, or fit-time model ensembles.

macroforecast.model_ensemble

Fit-time composition of multiple member models into one ModelFit.

Forecast-output combination after runner execution.

macroforecast.model_selection

Parameter search over supplied data and validation windows.

Global run orchestration.

macroforecast.forecasting

Runner-level composition and forecast combination.

Low-level transformation formulas.

macroforecast.metrics

Forecast scoring, forecast ranking, and metric resolution.

Data splits, model fitting, or statistical tests.

macroforecast.tests

Forecast-comparison statistical tests and residual diagnostics.

Forecast scoring or model fitting.

macroforecast.evaluation

Namespace wrapper for metrics and tests.

Callable metric/test functions.

macroforecast.interpretation

Post-fit importance and effect summaries.

Model fitting, feature construction, or forecast testing.

macroforecast.output

Output tables/JSON summaries, artifact writing, schema-aware manifests, hashes, compression, and provenance.

Paper/report presentation style.

macroforecast.reporting

Presentation table formatting, LaTeX/HTML/Markdown rendering, and figure-ready data.

Artifact writing or workflow design.

Review Pages#

Use these pages before opening individual function references:

Page

Use

Documentation Map

Decide which page to inspect for a specific question.

Legacy Callable Coverage

Check whether an old runtime feature is covered, intentionally removed, or deferred.

Reference Verification

Check formula/reference anchors and future verification priorities.

Public Python API

Check importable public symbols.

Runner Loop#

The runner is the only module that combines stages:

for origin in window.iter_origins(panel.index):
    preprocessing_fit_panel = rows_allowed_by(preprocessing_policy, origin)
    fitted_preprocessing = preprocess_spec.fit(
        preprocessing_fit_panel,
        policy=preprocessing_policy.scope,
    )
    processed = fitted_preprocessing.transform(rows_needed_by_runner, ...)

    feature_fit_panel = rows_allowed_by(feature_policy, origin)
    builder = feature_spec.fit(feature_fit_panel)
    train_features = builder.transform(processed, index=train_dates)
    test_features = builder.transform(processed, index=test_dates)

    model_selection_features = rows_allowed_by(model_selection_policy, origin)
    selected_params = model_selection.select_params(model, model_selection_features, ...)
    fit = model(train_features.X, train_features.y, **selected_params)
    forecast = fit.predict(test_features.X)

This keeps expanding, rolling, fixed-sample, retrain cadence, and retune cadence in macroforecast.window and macroforecast.forecasting, not inside individual preprocessing or feature functions. Full-sample preprocessing remains available through reprocess(...); origin-local preprocessing uses preprocess_spec(...) inside the runner.

Post-run objects follow the same separation:

report = mf.evaluation.evaluate_report(result)
tests = mf.tests.model_confidence_set(loss_panel)
explain = mf.interpretation.permutation_importance(model, X, y)

bundle = mf.output.bundle_outputs(
    forecasts=result,
    evaluation=report,
    tests={"mcs": tests},
    interpretation={"importance": explain},
)
paper_table = mf.reporting.report_table(report.scores)
manifest = mf.output.write_artifacts(bundle, "results/run_001")

output handles named outputs and files. reporting handles presentation formatting. Neither module decides the modeling design.

Stage Policies#

window.stage_policy(...) is the shared timing grammar for runner stages.

mf.window.stage_policy("origin_available")
mf.window.stage_policy("fit_window")
mf.window.stage_policy(
    "fixed_reference",
    reference_start="2000-01-01",
    reference_end="2019-12-31",
    update="never",
)

The same policy object can be supplied as preprocessing_policy, feature_policy, or model_selection_policy in forecasting.run(...). This makes full-sample, expanding, rolling, fixed-reference, and scheduled-refit designs expressible without putting time logic inside low-level transformation functions.

One-Shot Convenience#

The direct one-shot helpers remain useful:

processed = mf.preprocessing.reprocess(panel)
features = mf.feature_engineering.build_features(processed, target="INDPRO", horizon=1)
fit = mf.models.ridge(features)

They are convenience calls. For strict out-of-sample work, use preprocess_spec(...), feature_spec(...), and forecasting.run(...) so fitted transforms are learned inside each training window.

Fit Policies#

Public time policy should come from window, not from feature functions. Older one-shot feature helpers may still expose narrow convenience arguments, but new runner-compatible code should use feature_spec(...) and let the runner decide which rows belong to each fit.