macroforecast.forecasting#

macroforecast.forecasting is the workflow composition module. It connects window, preprocessing, feature_engineering, model_selection, models, model_ensemble, and metrics/tests.

run#

run_forecast is an alias for run. New code can call run(...); use run_forecast(...) only when the longer name makes a script clearer.

macroforecast.forecasting.run(
    data,
    model,
    *,
    window=None,
    preprocessing=None,
    preprocessing_policy=None,
    features=None,
    feature_policy=None,
    model_selection=None,
    model_selection_policy=None,
    model_selection_metric="mse",
    maximize_model_selection=False,
    preset=None,
    params=None,
    target=None,
    horizon=1,
    horizons=None,
    forecast_policy="direct",
    future_feature_policy=None,
    target_transform=None,
    combination=None,
    save_models=True,
    model_store="trained_model",
)

| Input | Type | Default | Meaning |
| --- | --- | --- | --- |
| data | FeatureSet, DataBundle, DataSpec, (panel, metadata), or pandas-like panel | required | Prebuilt model matrix or canonical panel. DataBundle metadata, including native frequencies, is preserved. |
| model | str, callable, ModelSpec, list, or mapping | required | One or more model or fit-time model-ensemble specs to fit at each origin. |
| window | WindowSpec, str, or None | None | Forecast experiment design: estimation mode, validation, test origins, retrain and retune cadence. |
| preprocessing | PreprocessSpec or None | None | Callable preprocessing operations. |
| preprocessing_policy | StagePolicy, str, or None | origin_available when preprocessing is supplied | Where preprocessing may fit and apply: full_panel, origin_available, fit_window, or fixed_reference. |
| features | FeatureSpec or None | None | Feature and target construction operations. For panel-input models such as dfm_mixed_mariano_murasawa, leave this as None. |
| feature_policy | StagePolicy, str, or None | fit_window | Where stateful feature engineering such as PCA may fit. |
| model_selection | SearchSpec, model-keyed mapping, or None | None | Hyperparameter candidate generation and search method. None uses each model’s owned default search space; a model-keyed value of None disables model selection for that model. |
| model_selection_policy | StagePolicy, str, or None | fit_window | Which feature rows are supplied to model selection. |
| model_selection_metric | str or callable | "mse" | Objective used during model selection. |
| maximize_model_selection | bool | False | Whether larger selection scores are better. |
| preset | str, mapping, or None | None | Model-owned search-space preset. |
| params | mapping or None | None | Fixed model parameters. |
| target | str or None | None | Required for raw panel input when features is omitted, and required for panel-input models unless every model spec sets the same target parameter. |
| horizon | positive int | 1 | Target horizon when features is omitted. |
| horizons | positive int, sequence, or None | None | Multiple target horizons. Provide either horizon or horizons, not both. |
| forecast_policy | str | "direct" | Target/forecast construction policy: "direct", "direct_average", "path_average", "recursive", or alias "iterated". |
| future_feature_policy | str or None | None | Used only for recursive forecasting. None becomes "target_lags". Use "observed_future" only for explicit oracle/scenario paths where future predictors are known or supplied. |
| target_transform | str or None | None | Optional target transform override. For direct_average, "growth" becomes "average_growth" and "value" becomes "average_value"; for path_average, "growth" means average step growth forecasts and "value" means step forecasts of an already transformed one-period target; for recursive, allowed values are level, change, growth, and log_growth. |
| combination | str, CombinationSpec, sequence, mapping, or None | None | Optional forecast-combination requests. Combined forecasts are appended as additional model rows. |
| save_models | bool | True | Save each fitted origin/model object and its metadata. |
| model_store | str or path-like | "trained_model" | Root directory for saved fitted models. |

Output: ForecastResult.

ForecastResult methods:

| Method | Input | Output | Meaning |
| --- | --- | --- | --- |
| to_frame() | none | DataFrame | Copy of forecast table. |
| evaluate(**kwargs) | arguments forwarded to mf.metrics.evaluate_forecasts | DataFrame | Forecast-score table. |
| with_sidecar(name, value) | sidecar name and runtime object | ForecastResult | Copy with a named sidecar recorded in metadata. |
| get_sidecar(name, default=None) | sidecar name | object | Return an attached sidecar. |
| sidecar_names() | none | tuple | Names of attached sidecars. |
| with_oshapley(X, y, models, window=..., ...) | explicit aligned oShapley/PBSV inputs | ForecastResult | Build and attach an oShapley/PBSV sidecar through mf.interpretation.oshapley_from_forecast_result(...). |
| with_anatomy(X, y, models, window=..., ...) | explicit aligned backend inputs | ForecastResult | Backend alias for with_oshapley(...). |
| with_dual(model, X_train, y_train, X_test=None, ...) | fitted model and explicit train/test design | ForecastResult | Build and attach a DualML observation-weight sidecar through mf.interpretation.dual_from_forecast_result(...). |
| anatomy_explain(anatomy, **kwargs) | precomputed anatomy.Anatomy object or saved path | DataFrame | Convenience call to mf.interpretation.anatomy_explain(...), with forecast-result metadata attached. |
| pbsv(anatomy, **kwargs) | precomputed backend object or saved path | DataFrame | Convenience call to mf.interpretation.pbsv(...). |
| oshapley_vi(anatomy, **kwargs) | precomputed backend object or saved path | DataFrame | Convenience call to mf.interpretation.oshapley_vi(...). |
| anatomy_pbsv(anatomy, **kwargs) | precomputed backend object or saved path | DataFrame | Backend alias for pbsv(...). |
| anatomy_oshapley_vi(anatomy, **kwargs) | precomputed backend object or saved path | DataFrame | Backend alias for oshapley_vi(...). |

pbsv(...) and oshapley_vi(...) do not create the backend object. Use with_oshapley(...) or mf.interpretation.oshapley_from_forecast_result(...) when the sidecar should be built from explicit X/y, model specs, and WindowSpec.

Dual interpretation is also attached after the runner. forecasting.run() does not infer observation weights automatically because the forecast table does not contain the exact fitted estimator, training feature matrix, training target, and forecast-row feature matrix. Use with_dual(...) or mf.interpretation.dual_from_forecast_result(...) with those objects explicitly.

import macroforecast as mf

# `panel` is assumed to be a canonical panel loaded beforehand.
pre = mf.preprocessing.preprocess_spec(
    transform="none",
    outliers="none",
    impute="mean",
    standardize="zscore",
    frame="keep",
)

features = mf.feature_engineering.feature_spec(
    target="INDPRO",
    horizon=1,
    predictors="all",
    lags=(0, 1, 2),
    pca_components=3,
)

window = mf.window.spec(
    estimation=mf.window.estimation_expanding(min_size=120),
    val=mf.window.val_last_block(size=24),
    test=mf.window.test_origins(horizon=1, step=1),
)

result = mf.forecasting.run(
    panel,
    ["ridge", "lasso"],
    window=window,
    preprocessing=pre,
    preprocessing_policy=mf.window.stage_policy("origin_available"),
    features=features,
    feature_policy=mf.window.stage_policy("fit_window"),
    model_selection=mf.model_selection.search_spec("ridge", preset="small"),
    model_selection_policy=mf.window.stage_policy("fit_window"),
    preset="small",
)

model_selection=None means “use registered model defaults” when a model has a search space. To run fixed parameters with no tuning, pass a model-keyed mapping such as model_selection={"ridge": None}. To evaluate a single explicit candidate through the model-selection ledger, pass model_selection={"ridge": mf.model_selection.fixed({"alpha": 0.1})}.

Forecast Policies#

forecast_policy decides how the target is constructed and how forecasts are written to the forecast table.

| Policy | Target construction | Model fit | Forecast row |
| --- | --- | --- | --- |
| "direct" | y[t+h] or a direct transform such as growth/change. | One model per requested horizon. | date is the target date t+h; horizon is h. |
| "direct_average" | Direct average target, such as average change or average growth over steps 1..h. | One model per requested horizon. | prediction and actual are horizon-average objects. |
| "path_average" | Step-level targets for steps 1..h. | One model per future step, then average the step forecasts. | prediction and actual are averages of the step forecasts/outcomes. |
| "recursive" / "iterated" | One-step target y[t+1]. | Fit one one-step model at each origin, then feed predicted target values back into the feature panel for steps 2..h. | prediction and actual are expressed at the requested horizon h under the selected target_transform. |

Examples:

# Fit separate direct models for h=1, 3, and 12.
direct = mf.forecasting.run(
    panel,
    "ridge",
    target="INDPRO",
    horizons=[1, 3, 12],
    model_selection={"ridge": None},
)

# Direct average growth target over the next 12 months.
direct_avg = mf.forecasting.run(
    panel,
    "ridge",
    target="INDPRO",
    horizon=12,
    forecast_policy="direct_average",
    target_transform="growth",
    model_selection={"ridge": None},
)

# Fit step 1..12 models and average the step growth forecasts.
path_avg = mf.forecasting.run(
    panel,
    "ridge",
    target="INDPRO",
    horizon=12,
    forecast_policy="path_average",
    target_transform="growth",
    model_selection={"ridge": None},
)

# Recursive / iterated forecast: fit one-step AR-style model and iterate to h=12.
recursive = mf.forecasting.run(
    panel,
    "ridge",
    target="INDPRO",
    horizon=12,
    forecast_policy="recursive",
    target_transform="level",
    model_selection={"ridge": None},
)

Target availability rule:

For feature-matrix direct and direct-average forecasts, a training row u is used only when u + h <= origin. For path-average forecasts, each step model uses only rows satisfying u + step <= origin. Hyperparameter validation splits are rebuilt under the same rule. Forecast rows expose window["target_availability_end"], window["target_availability_end_pos"], and window["target_availability_lag"] so users can audit the effective fit sample. Path-average rows also expose window["target_availability_by_step"]. The availability end can be earlier than the generic WindowSpec fit_end, because fit_end is an information-window boundary while h-step target labels realize later.
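The rule can be illustrated with a minimal sketch that works on integer row positions instead of dates. The helper usable_training_rows is hypothetical and not part of macroforecast; it only shows which rows satisfy u + h <= origin.

```python
def usable_training_rows(n_rows, origin_pos, horizon):
    """Return row positions u with u + horizon <= origin_pos.

    Hypothetical helper for illustration; positions index the input panel.
    """
    last_usable = min(origin_pos - horizon, n_rows - 1)
    return list(range(0, last_usable + 1))


# Origin at position 10: a 1-step model may train on rows 0..9,
# while a 3-step model may train only on rows 0..7.
rows_h1 = usable_training_rows(n_rows=20, origin_pos=10, horizon=1)
rows_h3 = usable_training_rows(n_rows=20, origin_pos=10, horizon=3)

# Path-average: each step model has its own availability end.
availability_by_step = {
    step: usable_training_rows(20, 10, step)[-1] for step in (1, 2, 3)
}
```

Longer horizons shrink the effective fit sample, which is why the availability end recorded on a forecast row can sit before fit_end.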

Test-origin rule:

window.test.first_origin and window.test.last_origin are origin dates. step=1 means every emitted row in the input index, so overlapping h-step macro forecasts are supported. The runner writes a scored row only when the realized target date is available. With drop_incomplete=True, an h-step origin t is kept for scoring only if t + h is inside the panel. For a monthly panel ending in 2017-12, h=24 origins after 2015-12 are not evaluable even though the origin dates themselves exist. If an entire final calendar block has no evaluable origins, WindowSpec.validate(...) reports no test origins; replication scripts should skip that tail block rather than count it as a forecast error.
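The monthly example above can be checked with a short sketch. Months are represented as (year, month) tuples and the helper names are hypothetical; the runner itself works on the input index rather than on this arithmetic.

```python
def month_index(year, month):
    """Map a calendar month to a running integer so months can be added."""
    return year * 12 + (month - 1)


def evaluable_origins(first, last_origin, panel_end, horizon):
    """Keep monthly origins t with t + horizon <= panel_end.

    All arguments except horizon are (year, month) tuples. Hypothetical helper.
    """
    end = month_index(*panel_end)
    kept = []
    for t in range(month_index(*first), month_index(*last_origin) + 1):
        if t + horizon <= end:
            kept.append((t // 12, t % 12 + 1))
    return kept


# Monthly panel ending 2017-12, h=24, candidate origins 2015-01..2017-12.
origins = evaluable_origins((2015, 1), (2017, 12), (2017, 12), horizon=24)
last_evaluable = origins[-1]

# A tail block made only of late origins has nothing to score.
tail_block = evaluable_origins((2016, 1), (2017, 12), (2017, 12), horizon=24)
```

Here last_evaluable is (2015, 12) and tail_block is empty, which corresponds to the case where a replication script should skip the block.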

For FRED-MD-style replications where preprocessing has already produced the one-period target object, use forecast_policy="direct_average", target_transform="value" or target_transform="average_value" for direct average targets, and use forecast_policy="path_average", target_transform="value" for path-average step targets.

Recursive forecasting has an explicit future-feature contract:

| future_feature_policy | Meaning | Use when |
| --- | --- | --- |
| "target_lags" | Default. The runner builds or requires target-lag features only, then updates the target path with its own predictions. It does not invent future exogenous predictors. | Real-time recursive or iterated forecasting. |
| "observed_future" | The runner uses future non-target predictor values already present in the panel while recursively updating only the target. This is an oracle/scenario path. | Scenario analysis, controlled simulations, or cases where future predictor paths are genuinely known. |

When features=None and forecast_policy="recursive", the runner creates feature_spec(target=target, predictors=[], lags=None, target_lags=(0, 1), horizon=1). In row-date convention, target_lags=(0, 1) means the current target value at the one-step information date and its previous value. For supplied FeatureSpec with future_feature_policy="target_lags", regular predictors must be empty and target_lags must be declared with lag 0 included, because predicted target values are written into the next row before the next one-step forecast is formed. For supplied FeatureSpec with future_feature_policy="observed_future", regular predictors are allowed, but the user is responsible for the future predictor path.
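The following sketch shows why lag 0 must be available under "target_lags": each predicted value is written back and becomes the lag-0 input of the next one-step forecast. The linear coefficients are illustrative stand-ins for any fitted one-step model, and recursive_forecast is a hypothetical helper, not a macroforecast function.

```python
def recursive_forecast(history, intercept, coef_lag0, coef_lag1, horizon):
    """Iterate a one-step model on target lags (0, 1) out to `horizon` steps.

    The coefficients stand in for a fitted one-step model; they are
    illustrative values, not output from macroforecast.
    """
    path = list(history)  # observed target values up to the origin
    for _ in range(horizon):
        lag0, lag1 = path[-1], path[-2]
        next_value = intercept + coef_lag0 * lag0 + coef_lag1 * lag1
        path.append(next_value)  # the prediction becomes lag 0 for the next step
    return path[len(history):]


steps = recursive_forecast(
    history=[1.0, 2.0], intercept=0.0, coef_lag0=0.5, coef_lag1=0.25, horizon=3
)
```

No exogenous predictor appears in the loop, which mirrors the requirement that regular predictors be empty under "target_lags".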

For feature-matrix models, the runner uses the requested horizon to restrict test origins to dates where t+h is observable, but it fits/predicts one origin row per forecast origin. This keeps origin as the information date and date as the realized target date.

Panel-input models consume the canonical panel directly instead of an engineered X, y matrix. They cannot be mixed with feature-matrix models in one runner call. Currently forecasting.run(..., features=None) supports dfm_mixed_mariano_murasawa and dfm_unrestricted_midas as native mixed-frequency panel models. It keeps DataBundle metadata such as native_frequency_by_column so the model can separate monthly and quarterly columns inside each fit window. Panel-input runner calls currently fit fixed model parameters; pass model_selection={model_name: None} for these models.

For standard MIDAS regressions, build the mixed-frequency lag matrix explicitly and pass it as a FeatureSet. This keeps calendar anchoring in mixed_frequency_lags() and model weighting in midas_almon(), midas_beta(), midas_step(), or unrestricted_midas():

X_midas = mf.feature_engineering.mixed_frequency_lags(
    mixed,
    target="GDPC1",
    columns=["PAYEMS", "INDPRO"],
    lags=range(0, 12),
    target_frequency="quarterly",
    anchor_position="period_end",
    drop_missing=True,
)
y = mixed.panel["GDPC1"].reindex(X_midas.index).rename("GDPC1").to_frame()
features = mf.feature_engineering.FeatureSet(
    X=X_midas,
    y=y,
    metadata={"feature_engineering": {"method": "mixed_frequency_lags"}},
    target="GDPC1",
    targets=("GDPC1",),
    horizons=(1,),
    predictors=tuple(X_midas.columns),
)
result = mf.forecasting.run(
    features,
    "midas_beta",
    window=mf.window.spec(
        estimation=mf.window.estimation_expanding(min_size=40),
        val=mf.window.val_last_block(size=12),
        test=mf.window.test_origins(horizon=1, step=1),
    ),
    params={"midas_beta": {"beta_params": (1.0, 2.0), "alpha": 0.1}},
    model_selection={"midas_beta": None},
)

If the analysis intentionally follows the common full-sample empirical workflow, preprocess first and pass the processed panel to run(..., preprocessing=None). If the analysis is a real-time forecasting exercise, pass preprocess_spec(...) plus an explicit preprocessing_policy.

Stage policies are intentionally shared across preprocessing, feature engineering, and model selection:

| Scope | Meaning |
| --- | --- |
| full_panel | Fit the stage once on the full panel. Useful for retrospective replication designs. |
| origin_available | Fit using observations available up to each origin. This supports common macro cleaning designs, including EM imputation on variables observed by that origin. |
| fit_window | Fit only on the model fit window and apply that fitted state to validation/test rows. |
| fixed_reference | Fit on a named reference period, then keep that state fixed. Useful for fixed PCA loadings or fixed standardization windows. |

Each StagePolicy also has an update cadence. For preprocessing and feature engineering, run() refits or reuses the fitted state according to "every_origin", "on_retrain", "never", a positive integer cadence, or a pandas date offset such as "12ME". This lets the same runner express both full re-estimation designs and fixed-reference designs such as “fit PCA loadings once, then keep the loadings fixed while origins roll forward.”
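A rough sketch of how an update cadence could translate into a per-origin refit decision is given below. It is not the runner's implementation. It assumes that the first origin always fits new state and that an integer cadence counts test origins; date-offset cadences such as "12ME" are left out. The function should_refit is hypothetical.

```python
def should_refit(update, origin_number, retrained):
    """Decide whether a stage fits new state at this origin.

    origin_number counts test origins from 0; retrained says whether the
    model is retrained at this origin. Hypothetical helper.
    """
    if origin_number == 0:
        return True  # assumption: the first origin always needs fitted state
    if update == "every_origin":
        return True
    if update == "on_retrain":
        return retrained
    if update == "never":
        return False
    if isinstance(update, int) and update > 0:
        return origin_number % update == 0  # assumption: cadence in origins
    raise ValueError(f"unsupported update cadence: {update!r}")


# Fixed-reference design: fit once, then reuse the state as origins roll.
fixed = [should_refit("never", i, retrained=True) for i in range(4)]
# Integer cadence: refit at every third origin.
every_third = [should_refit(3, i, retrained=False) for i in range(7)]
```

The True entries in these lists play the role of the updated flag that each origin stage record carries.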

The runner metadata records the window, each stage policy, preprocessing options, feature-engineering options, model-selection spec, model specs, and origin stage records. Each origin stage record includes an updated flag showing whether that stage fitted new state at that origin.

Before any origin is fitted, run() validates the resolved WindowSpec against the input index. Window validation errors, such as invalid inner validation splits or reuse_params=False with skipped retune origins, stop the run with window validation failed: ....

Execution Order#

For raw panel input, run() executes one test origin at a time:

| Step | Owner | What happens |
| --- | --- | --- |
| 1 | meta | Read package defaults such as random seed and default stage scopes. |
| 2 | window | Resolve estimation, validation, test, retrain, and retune rows. |
| 3 | preprocessing | Fit or reuse the preprocessing state according to preprocessing_policy; transform the rows needed by the origin. |
| 4 | feature_engineering | Fit or reuse the feature builder according to feature_policy; create X_fit, y_fit, X_selection, y_selection, X_test, and y_test. |
| 5 | model_selection | If enabled, evaluate model parameter candidates on validation splits supplied by the window. |
| 6 | models or model_ensemble | Fit the model or fit-time ensemble on the origin fit window with selected or fixed parameters. |
| 7 | models | Save the fitted object and JSON sidecar when save_models=True. |
| 8 | forecasting | Collect point, variance, and quantile forecasts. |
| 9 | forecasting | Append requested forecast-combination rows and write the run metadata ledger. |

If data is already a FeatureSet, preprocessing and feature construction are skipped. The runner slices the supplied X and y by the window plan and then runs model selection, model fitting, prediction, and optional model storage.

If every selected model has input_kind="panel" and features=None, the runner uses the window plan to slice the panel directly into fit and test panels. This is the path used by mixed-frequency DFM models. Runner-level preprocessing can run on this path with preprocess_spec(...) and preprocessing_policy; feature engineering is skipped because panel-input models consume native panel columns.

The result metadata records which execution path was used:

| run.input_path | Input | Stages run inside run() |
| --- | --- | --- |
| panel_to_features | Raw canonical panel, DataBundle, DataSpec, or (panel, metadata) with feature-matrix models | Optional preprocessing, feature engineering, model selection, model fitting, optional combination. |
| feature_set | Prebuilt FeatureSet | Model selection, model fitting, optional combination. Preprocessing and feature construction are assumed to have already happened. |
| panel_model | Canonical panel with panel-input models such as mixed-frequency DFM | Optional preprocessing, panel slicing, panel-model fitting, optional combination. Feature engineering is skipped. |

Forecast Table#

ForecastResult.to_frame() returns one row per (origin, test date, model).

| Column | Meaning |
| --- | --- |
| date | Test target date. |
| origin | Forecast origin from the window row. |
| origin_pos | Integer position of the origin in the input index. |
| horizon | Forecast horizon for the row. |
| forecast_policy | Policy used to construct the row: direct, direct_average, path_average, or recursive. |
| target | Source target variable. |
| model | Runner alias, such as ridge or a key from a model mapping. |
| model_spec | Canonical registered model name. |
| prediction | Point forecast. |
| variance_prediction | Forecast variance when the fitted model exposes predict_variance(...); otherwise None. |
| quantile_predictions | Per-row quantile dictionary when the fitted model exposes predict_quantiles(X); otherwise None. |
| actual | Realized target value when available. |
| params | Actual fixed-plus-selected parameters used for the origin fit. |
| model_selection | Model-selection ledger for the origin/model, including retuned. |
| stored_model | Model and metadata sidecar paths when save_models=True; otherwise None. |
| window | Full window row used for the origin. |
| preprocessed | True when runner-level preprocessing was active for the row; otherwise False. |
| combined | True for runner-created combination rows; False for base model rows. |
| combination | Combination spec dictionary for combined rows; otherwise None. |

Metadata Structure#

ForecastResult.metadata is JSON-ready and contains:

| Key | Meaning |
| --- | --- |
| metadata_schema | Stable metadata contract: kind="forecast_result", schema version, execution path, forecast-table columns, and stage-record columns. |
| run | Forecast count, model count, execution path, forecast policy, horizons, meta config, metadata level, and model storage settings. |
| data | Input panel summary from macroforecast.data.panel_info(...). |
| window | Complete WindowSpec.to_dict() output. |
| stage_policies | Resolved preprocessing, feature-engineering, and model-selection policies. |
| preprocessing | PreprocessSpec.to_dict() output, or None. |
| forecast_policy | Policy metadata including method, horizons, future_feature_policy, and whether observed future predictors were used. |
| features | FeatureSpec.to_dict() output or supplied FeatureSet metadata. |
| model_selection | Search spec metadata, model-keyed search metadata, or None. |
| combination | List of resolved forecast-combination specs. |
| models | Runner aliases plus model spec metadata. |
| stages | Origin-level preprocessing and feature-engineering fit records unless metadata_level="minimal". |

metadata_schema.version is currently 1. Code that consumes runner outputs should check this value before assuming column or metadata shape. The forecast_table_columns entry lists the stable row fields emitted by the runner. stage_record_columns lists the ledger fields used when a stateful stage is fitted or reused at an origin.
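A consumer-side check might look like the sketch below. It runs on a plain dict shaped like ForecastResult.metadata and assumes that kind, version, and forecast_table_columns are nested under the metadata_schema key; the function name check_forecast_metadata is hypothetical.

```python
SUPPORTED_SCHEMA_VERSION = 1


def check_forecast_metadata(metadata):
    """Validate the metadata contract before reading runner output.

    `metadata` is a plain dict shaped like ForecastResult.metadata; only the
    keys named in the surrounding text are inspected. Hypothetical helper.
    """
    schema = metadata["metadata_schema"]
    if schema.get("kind") != "forecast_result":
        raise ValueError(f"unexpected metadata kind: {schema.get('kind')!r}")
    if schema.get("version") != SUPPORTED_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version: {schema.get('version')!r}")
    return list(schema.get("forecast_table_columns", []))


# Stand-in metadata for illustration; real runs carry many more keys.
example_metadata = {
    "metadata_schema": {
        "kind": "forecast_result",
        "version": 1,
        "forecast_table_columns": ["date", "origin", "model", "prediction"],
    }
}
columns = check_forecast_metadata(example_metadata)
```

Failing early on an unknown version avoids reading columns that a later schema may have renamed or removed.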

Runner Examples#

Full-Sample Empirical Preprocessing#

This pattern matches many retrospective empirical designs: clean the panel once, then run the forecasting experiment on the processed panel.

processed = mf.preprocessing.preprocess(
    panel,
    transform="official",
    outliers="iqr",
    outlier_action="flag_as_nan",
    impute="em_factor",
    standardize="zscore",
    frame="keep",
).panel

features = mf.feature_engineering.feature_spec(
    target="INDPRO",
    horizon=1,
    predictors="all",
    lags=(0, 1, 2),
)

result = mf.forecasting.run(
    processed,
    ["ridge", "lasso"],
    window=window,
    features=features,
    preprocessing=None,
    preset="small",
    save_models=False,
)

Window-Local Preprocessing#

This pattern is stricter for real-time forecasting. Preprocessing state is fit inside each origin’s fit window and then applied to validation/test rows.

pre = mf.preprocessing.preprocess_spec(
    transform="official",
    outliers="iqr",
    impute="mean",
    standardize="zscore",
    standardize_columns="predictors",
    frame="keep",
)

result = mf.forecasting.run(
    panel,
    "ridge",
    window=window,
    preprocessing=pre,
    preprocessing_policy=mf.window.stage_policy("fit_window", update="every_origin"),
    features=features,
    feature_policy=mf.window.stage_policy("fit_window", update="on_retrain"),
)

Fixed Standardization Reference#

Use a fixed preprocessing reference when scaling or imputation state should come from a named historical period rather than from every rolling origin.

result = mf.forecasting.run(
    panel,
    "ridge",
    window=window,
    preprocessing=pre,
    preprocessing_policy=mf.window.stage_policy(
        "fixed_reference",
        reference_start="1985-01-31",
        reference_end="2004-12-31",
        update="never",
    ),
    features=features,
)

Fixed PCA Loadings#

Use a fixed feature reference when factor loadings should be estimated once and held fixed while forecast origins move forward.

pca_features = mf.feature_engineering.feature_spec(
    target="INDPRO",
    horizon=1,
    predictors="all",
    lags=(0, 1),
    pca_components=8,
)

result = mf.forecasting.run(
    panel,
    "ols",
    window=window,
    preprocessing=None,
    features=pca_features,
    feature_policy=mf.window.stage_policy(
        "fixed_reference",
        reference_start="1985-01-31",
        reference_end="2004-12-31",
        update="never",
    ),
)

When feature_policy is full_panel or fixed_reference, runner-level preprocessing must be absent or preprocessing_policy must be full_panel. This keeps fixed feature state fitted on one well-defined processed panel.

Multiple Models with Model-Keyed Model Selection#

model_selection can be one shared SearchSpec or a model-keyed mapping. A model-keyed None disables model selection for that model. Mapping keys can be the output alias, such as linear, or the registered spec name, such as ridge.

result = mf.forecasting.run(
    panel,
    {"linear": "ridge", "sparse": "lasso", "tree": "random_forest"},
    window=window,
    features=features,
    model_selection={
        "linear": mf.model_selection.grid({"alpha": [0.01, 0.1, 1.0]}),
        "sparse": mf.model_selection.cv_path(
            param="alpha",
            values=[0.001, 0.01, 0.1, 1.0],
        ),
        "tree": mf.model_selection.random_search(
            {
                "n_estimators": mf.model_selection.randint(100, 500),
                "max_depth": mf.model_selection.choice([2, 3, 4, None]),
            },
            n_iter=12,
            random_state=123,
        ),
    },
    model_selection_policy=mf.window.stage_policy("fit_window"),
    model_selection_metric="mse",
)

params and preset follow the same alias-or-spec-key rule. Unknown keys raise an error instead of being silently ignored. For a single model, direct parameter names are also accepted, including dict-valued parameters such as params={"base_params": {"alpha": 0.1}} for a fit-time ensemble spec. This is also how model-local preprocessing options are expressed. For example, Elastic Net can standardize predictors inside each fit window while a tree model uses the raw feature matrix:

result = mf.forecasting.run(
    panel,
    ["elastic_net", "random_forest"],
    window=window,
    features=features,
    params={
        "elastic_net": {"standardize": True},
        "random_forest": {"n_estimators": 200, "max_features": 1 / 3},
    },
)

This differs from macroforecast.preprocessing.standardize_panel(), which is a panel preprocessing step. If that preprocessing is run on the full panel outside the runner, it uses full-sample moments. If leakage-safe scaling is needed for all models, use runner preprocessing specs and window policies. If only selected models need scaling, use model-local options such as standardize=True.

Forecast rows record the actual fixed-plus-selected parameter set in the params column. For example, if params={"ridge": {"fit_intercept": False}} and model selection picks {"alpha": 0.1}, the row records both values.
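The key rule and the fixed-plus-selected merge can be sketched in a few lines. The helper resolve_model_params is hypothetical, and the preference for an alias entry over a spec-name entry is an assumption made only for this illustration.

```python
def resolve_model_params(models, params):
    """Split model-keyed params by alias, rejecting unknown keys.

    models maps output alias -> registered spec name. Hypothetical helper.
    """
    known = set(models) | set(models.values())
    unknown = set(params) - known
    if unknown:
        raise KeyError(f"unknown model keys in params: {sorted(unknown)}")
    resolved = {}
    for alias, spec_name in models.items():
        # Assumption: an alias entry wins over a spec-name entry.
        resolved[alias] = dict(params.get(alias, params.get(spec_name, {})))
    return resolved


models = {"linear": "ridge", "tree": "random_forest"}
fixed = resolve_model_params(
    models,
    {"linear": {"fit_intercept": False}, "random_forest": {"n_estimators": 200}},
)

# Model selection picks alpha; the forecast row records both sets together.
selected = {"alpha": 0.1}
row_params = {**fixed["linear"], **selected}
```

The two parameter sets do not overlap in this example, so the merge order does not affect row_params.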

Forecast Combination In The Runner#

combination asks the runner to append combined forecasts after all base model forecasts have been collected. Combination rows use the same date, origin, origin_pos, horizon, actual, and window fields as the base rows, with model set to the combination name.

The models= filter inside a combination spec refers to the output model column, not the registry model_spec. If model={"bagged": "bagging"}, use models=["bagged"] when selecting that fit-time ensemble for a forecast combination.

result = mf.forecasting.run(
    panel,
    ["ridge", "lasso", "random_forest"],
    window=window,
    features=features,
    preset="small",
    combination="mean",
)

result.to_frame().query("model == 'combined_mean'")

Multiple combinations can be requested together:

result = mf.forecasting.run(
    panel,
    {"linear": "ridge", "sparse": "lasso", "tree": "random_forest"},
    window=window,
    features=features,
    combination={
        "avg": "mean",
        "dmspe": {
            "method": "dmspe",
            "models": ["linear", "sparse", "tree"],
            "discount": 0.95,
        },
        "best_two": {
            "method": "best_n",
            "n": 2,
        },
    },
)

inverse_mspe, dmspe, and best_n use only historical forecast errors when forming the current combined forecast. The current row’s realized value is used only after that row’s weight or best-model decision has already been made.
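The ordering described here, weights first and realized value afterwards, is sketched below with a textbook inverse-MSPE weighting. The exact formula used by the runner is not shown in this section, so both the weighting and the equal-weight fallback for an empty history are assumptions.

```python
def inverse_mspe_weights(past_errors):
    """Weights proportional to 1 / mean squared past error, per model.

    past_errors maps model name -> list of errors from earlier rows.
    With no history yet, fall back to equal weights (an assumption).
    Assumes at least one nonzero past error per model otherwise.
    """
    names = list(past_errors)
    if any(len(past_errors[name]) == 0 for name in names):
        return {name: 1.0 / len(names) for name in names}
    inverse = {
        name: 1.0 / (sum(e * e for e in past_errors[name]) / len(past_errors[name]))
        for name in names
    }
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


forecasts = [{"ridge": 1.0, "lasso": 4.0}, {"ridge": 2.0, "lasso": 2.5}]
actuals = [2.0, 2.0]

history = {"ridge": [], "lasso": []}
combined = []
for row, actual in zip(forecasts, actuals):
    weights = inverse_mspe_weights(history)  # built from earlier rows only
    combined.append(sum(weights[m] * row[m] for m in row))
    for m in row:  # the realized value enters the history afterwards
        history[m].append(actual - row[m])
```

The second combined forecast leans toward ridge because ridge had the smaller error on the first row; the second row's own actual plays no part in its weights.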

Custom forecast combinations use the same runner hook:

def blend(forecasts, *, actual, weight=0.5):
    return weight * forecasts.iloc[:, 0] + (1.0 - weight) * forecasts.iloc[:, -1]

result = mf.forecasting.run(
    panel,
    {"ridge": "ridge", "lasso": "lasso"},
    window=window,
    features=features,
    combination=mf.forecasting.custom_combination(
        "ridge_lasso_blend",
        blend,
        models=["ridge", "lasso"],
        weight=0.25,
    ),
)

The callable receives a wide forecast matrix indexed by (date, origin, origin_pos, horizon) and an actual series aligned to the same index:

func(forecasts: pandas.DataFrame, *, actual: pandas.Series, **params)

It must return a Series or one-dimensional array-like object with the same length. The runner appends the output as rows with combined=True and records the callable name in metadata["combination"].

Mixed-Frequency DFM In The Runner#

Use the panel-input path for native monthly/quarterly state-space models. The input should be a DataBundle whose metadata records column-level native frequencies.

mixed = mf.data.combine(monthly_bundle, quarterly_bundle, frequency="native")

window = mf.window.spec(
    estimation=mf.window.estimation_expanding(min_size=120),
    val=mf.window.val_last_block(size=24),
    test=mf.window.test_origins(horizon=1, step=3),
)

result = mf.forecasting.run(
    mixed,
    "dfm_mixed_mariano_murasawa",
    window=window,
    target="GDPC1",
    params={
        "dfm_mixed_mariano_murasawa": {
            "n_factors": 1,
            "factor_order": 1,
        }
    },
    model_selection={"dfm_mixed_mariano_murasawa": None},
    features=None,
)

The result metadata records run.panel_model_runner=True and keeps data.native_frequency_counts, data.output_frequency_counts, and data.frequency="mixed" when those fields are present on the input bundle. The requested target must be present in the panel before fitting begins; a missing target raises a direct error before any model backend is called.

For all runner paths, fitted model predict(X_test) output is validated before records are appended. Array-like predictions are positional. A pandas Series or single-column DataFrame must either use X_test.index or the default RangeIndex(len(X_test)); any other index is rejected to avoid silently writing NaN forecasts. predict_quantiles(X_test) follows the same DataFrame index rule when it returns a DataFrame.
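The index rule can be expressed with plain lists standing in for pandas indexes. This is a sketch of the acceptance condition only, with a hypothetical function name; it does not reproduce the runner's validation code or error messages.

```python
def index_is_accepted(prediction_index, x_test_index):
    """Accept the X_test index or the default 0..n-1 positional index.

    Both arguments are sequences of index labels. Hypothetical helper.
    """
    expected = list(x_test_index)
    default = list(range(len(expected)))  # mirrors RangeIndex(len(X_test))
    return list(prediction_index) in (expected, default)


x_test_index = ["2010-01-31", "2010-02-28"]

same_index = index_is_accepted(["2010-01-31", "2010-02-28"], x_test_index)
default_index = index_is_accepted([0, 1], x_test_index)
# Shifted labels would align to nothing and produce NaN forecasts.
shifted_index = index_is_accepted(["2010-02-28", "2010-03-31"], x_test_index)
```

Only the first two cases pass; the shifted index is the situation the runner rejects.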

Quantile and Variance Outputs#

Models that expose variance or quantile prediction methods fill extra forecast columns automatically. The runner first tries predict_variance(X_test) for models such as hemisphere_nn, then falls back to predict_variance(horizon=len(X_test)) for volatility models.

quantile_result = mf.forecasting.run(
    panel,
    "quantile_regression_forest",
    window=window,
    features=features,
    params={
        "quantile_regression_forest": {
            "n_estimators": 200,
            "quantile_levels": (0.1, 0.5, 0.9),
            "random_state": 123,
        }
    },
    model_selection={"quantile_regression_forest": None},
)

quantile_result.to_frame()[["prediction", "quantile_predictions"]]

variance_result = mf.forecasting.run(
    panel,
    "garch11",
    window=window,
    features=mf.feature_engineering.feature_spec(target="y", horizon=1),
    model_selection={"garch11": None},
)

variance_result.to_frame()[["prediction", "variance_prediction"]]

Model Storage#

Model storage is on by default. Use a custom root when the run should keep its fitted objects separate from other experiments.

stored = mf.forecasting.run(
    panel,
    ["ridge", "random_forest"],
    window=window,
    features=features,
    model_store="trained_model/monthly_baseline",
)

stored.to_frame()[["model", "stored_model"]]

Disable storage for fast exploratory runs:

preview = mf.forecasting.run(
    panel,
    "ridge",
    window=window,
    features=features,
    save_models=False,
)

Trained Model Storage#

By default, run() saves the fitted model object produced at each forecast origin. The runner decides which object to save: after model selection, it refits the model on the origin fit window with the selected best parameters, then delegates the actual pickle and JSON write to macroforecast.models.save_fit().

The default root is relative to the current working directory:

trained_model/{model_name}/origin_{origin_pos}_h{horizon}_{origin}.pkl
trained_model/{model_name}/origin_{origin_pos}_h{horizon}_{origin}.json
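As a rough illustration of the layout above, the template can be filled in with pathlib. The helper stored_model_paths is hypothetical, and the way the origin label is formatted ("2019-12" here) is an assumption; the actual label depends on the panel index.

```python
from pathlib import Path


def stored_model_paths(root, model_name, origin_pos, horizon, origin):
    """Build the pickle and JSON sidecar paths for one fitted model.

    Hypothetical helper; `origin` is assumed to be an already formatted label.
    """
    stem = f"origin_{origin_pos}_h{horizon}_{origin}"
    folder = Path(root) / model_name
    return folder / f"{stem}.pkl", folder / f"{stem}.json"


pkl_path, json_path = stored_model_paths("trained_model", "ridge", 120, 1, "2019-12")
print(pkl_path.as_posix())   # trained_model/ridge/origin_120_h1_2019-12.pkl
print(json_path.as_posix())  # trained_model/ridge/origin_120_h1_2019-12.json
```

Because the root is relative, the same call resolves to a different absolute location whenever the working directory changes.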

Model selection remains a runner responsibility because it depends on the window, validation split, model-selection policy, and model-owned search space. Model persistence remains a model-module utility because it only knows how to save a fitted object and a metadata sidecar.

The forecast table includes a stored_model dictionary for each row:

| Key | Meaning |
|---|---|
| model_path | Pickle path for the fitted model, or None when the object cannot be pickled. |
| metadata_path | JSON metadata path written for the fitted model. |
| save_error | None on success, otherwise the pickle error string. |

The sidecar JSON records the model alias, canonical model spec, fit metadata, fit diagnostics, selected parameters, model-selection ledger, and window row used for the fit. For custom/local callables that cannot be pickled, the runner still writes the JSON sidecar and records save_error; forecasting continues.
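The "pickle if possible, always write the sidecar" behaviour can be sketched with the standard library. The function save_fit_sketch is a hypothetical stand-in and is not macroforecast.models.save_fit(); only the three returned keys follow the stored_model contract described above.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_fit_sketch(model, metadata, stem):
    """Write a pickle when possible and always write the JSON sidecar.

    Hypothetical stand-in for illustration only.
    """
    stem = Path(stem)
    stem.parent.mkdir(parents=True, exist_ok=True)
    model_path, save_error = None, None
    try:
        # Serialise in memory first so a failure leaves no partial file.
        payload = pickle.dumps(model)
    except Exception as exc:  # pickle can raise several exception types
        save_error = str(exc)
    else:
        model_path = stem.parent / (stem.name + ".pkl")
        model_path.write_bytes(payload)
    metadata_path = stem.parent / (stem.name + ".json")
    metadata_path.write_text(json.dumps(metadata))
    return {
        "model_path": None if model_path is None else str(model_path),
        "metadata_path": str(metadata_path),
        "save_error": save_error,
    }


root = Path(tempfile.mkdtemp())
picklable = save_fit_sketch(
    {"coef": [0.1, 0.2]}, {"model": "ridge"}, root / "ridge" / "origin_0_h1_2019-12"
)
local_callable = save_fit_sketch(
    lambda X: X, {"model": "custom"}, root / "custom" / "origin_0_h1_2019-12"
)
```

In this sketch the lambda cannot be pickled, so local_callable has model_path set to None and a non-empty save_error, while its JSON sidecar is still written.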

When a ForecastResult is passed to mf.output.write_artifacts(...), the artifact manifest also records the stored_model pickle and sidecar paths. The output writer does not copy those model files; it links the already-written runner artifacts into the manifest with stored_model_pickle and stored_model_metadata records.

To disable storage:

result = mf.forecasting.run(
    panel,
    "ridge",
    window=window,
    features=features,
    save_models=False,
)

Benchmark Forecasts#

Benchmark-relative metrics are evaluated after run(), but the benchmark forecast should normally be generated at the forecasting stage. Include the benchmark model in the same runner call so it uses the same preprocessing, feature policy, window, validation split, forecast origin, horizon, and target support as the candidate models.

result = mf.forecasting.run(
    panel,
    ["ridge", "ols"],
    window=window,
    features=features,
)

scores = result.evaluate(
    metrics=("mse", "relative_mse", "r2_oos"),
    benchmark_model="ols",
)

External benchmark forecasts are also allowed when they come from a published system or an existing CSV. Append those rows to the forecast table first, using the same model, date or origin, horizon, target, actual, and prediction contract. macroforecast.metrics.evaluate_forecasts() then checks that candidate and benchmark supports match before computing relative metrics.
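The support check matters because a relative metric is only meaningful when both models forecast the same origins, horizons, and targets. The sketch below uses plain dictionaries instead of the forecast table; the helper names are hypothetical, and relative MSE is taken to be candidate MSE divided by benchmark MSE, which may differ in detail from the library's definition.

```python
def support(rows, model):
    """Set of (origin, horizon, target) keys forecast by one model."""
    return {(r["origin"], r["horizon"], r["target"]) for r in rows if r["model"] == model}


def mse(rows, model):
    """Mean squared error over all rows of one model."""
    errors = [(r["actual"] - r["prediction"]) ** 2 for r in rows if r["model"] == model]
    return sum(errors) / len(errors)


def relative_mse(rows, candidate, benchmark):
    """Candidate MSE divided by benchmark MSE, after checking supports match."""
    if support(rows, candidate) != support(rows, benchmark):
        raise ValueError("candidate and benchmark supports differ")
    return mse(rows, candidate) / mse(rows, benchmark)


def row(model, origin, actual, prediction):
    return {"model": model, "origin": origin, "horizon": 1, "target": "y",
            "actual": actual, "prediction": prediction}


# Rows produced by the runner (illustrative values).
rows = [row("ridge", "2020-01", 1.0, 1.5), row("ridge", "2020-02", 2.0, 2.5)]
# Rows appended from an external source under the same column contract.
rows += [row("external", "2020-01", 1.0, 2.0), row("external", "2020-02", 2.0, 3.0)]

ratio = relative_mse(rows, "ridge", "external")
r2_oos = 1.0 - ratio
print(ratio, r2_oos)  # 0.25 0.75
```

Dropping one external row would make the two supports differ, and relative_mse would raise instead of comparing averages taken over different periods.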

ForecastResult#

macroforecast.forecasting.ForecastResult(forecasts, metadata={})

| Attribute | Type | Meaning |
|---|---|---|
| forecasts | pandas DataFrame | One row per emitted forecast. |
| metadata | dict | Window, preprocessing, feature, model, and model-selection metadata. |
| sidecars | dict | Runtime objects attached after forecasting, such as a ForecastShapleyResult. |

The forecast table always includes prediction. If the fitted model exposes predict_variance(X_test) or predict_variance(horizon=...), the runner also fills variance_prediction; otherwise that column is None. If the fitted model exposes predict_quantiles(X_test), the runner also fills quantile_predictions with a per-row dictionary such as {"0.1": value, "0.5": value, "0.9": value}; otherwise that column is None.
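The per-row dictionary layout can be illustrated without the library. Both helpers below are hypothetical, and building keys with str(level) is an assumption inferred from the example keys shown above.

```python
def quantile_rows(levels, quantile_matrix):
    """Turn an n_rows x n_levels matrix into one {level: value} dict per row."""
    keys = [str(level) for level in levels]
    return [dict(zip(keys, row)) for row in quantile_matrix]


def quantile_column(rows, level):
    """Pull one quantile level back out as a flat list."""
    return [row[str(level)] for row in rows]


levels = (0.1, 0.5, 0.9)
matrix = [[-1.0, 0.0, 1.0], [0.5, 1.0, 1.5]]
rows = quantile_rows(levels, matrix)

lower = quantile_column(rows, 0.1)
upper = quantile_column(rows, 0.9)
widths = [hi - lo for lo, hi in zip(lower, upper)]
print(rows[0])  # {'0.1': -1.0, '0.5': 0.0, '0.9': 1.0}
print(widths)   # [2.0, 1.0]
```

Keeping the levels as string keys makes each row JSON-ready, at the cost of one lookup step when an interval width or a single quantile series is needed.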

Methods:

| Method | Output |
|---|---|
| to_frame() | Forecast table copy. |
| evaluate(**kwargs) | Calls macroforecast.metrics.evaluate_forecasts() on this result. |
| with_sidecar(name, value) | Returns a copy with a named runtime sidecar. |
| with_oshapley(X, y, models, window=..., **kwargs) | Builds and attaches an oShapley/PBSV sidecar. |
| with_anatomy(X, y, models, window=..., **kwargs) | Backend alias for with_oshapley(...). |
| with_dual(model, X_train, y_train, X_test=None, **kwargs) | Builds and attaches a dual interpretation sidecar. |
| get_sidecar(name, default=None) | Retrieves one sidecar. |
| sidecar_names() | Lists attached sidecars. |
| to_dict() | JSON-ready dictionary. |
| to_json(path=None) | JSON text and optional file write. |
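Since with_sidecar(name, value) returns a copy, the original result is left untouched. The copy-on-attach pattern can be sketched with a frozen dataclass; ResultSketch is a hypothetical stand-in and says nothing about how ForecastResult is actually implemented.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical stand-in that mimics the sidecar methods listed above."""

    forecasts: tuple
    metadata: dict = field(default_factory=dict)
    sidecars: dict = field(default_factory=dict)

    def with_sidecar(self, name, value):
        # Build a new sidecar dict so the original instance is not mutated.
        return replace(self, sidecars={**self.sidecars, name: value})

    def get_sidecar(self, name, default=None):
        return self.sidecars.get(name, default)

    def sidecar_names(self):
        return list(self.sidecars)


base = ResultSketch(forecasts=(0.1, 0.2))
tagged = base.with_sidecar("notes", {"author": "analyst"})

print(base.sidecar_names())    # []
print(tagged.sidecar_names())  # ['notes']
```

Under this pattern, several sidecars can be chained (for example one per interpretation method) while earlier result objects remain valid for comparison.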

Direct Forecast Combination Functions#

Forecast combination lives in macroforecast.forecasting because it combines forecast outputs, not model fits. Fit-time member-model composition lives in macroforecast.model_ensemble.

| Function | Meaning |
|---|---|
| combine_mean(forecasts) | Equal-weight average. |
| combine_median(forecasts) | Cross-model median. |
| combine_trimmed_mean(forecasts, trim=0.1) | Trim extremes before averaging. |
| combine_winsorized_mean(forecasts, limits=(0.1, 0.1)) | Winsorize extremes before averaging. |
| combine_inverse_mspe(forecasts, y_true, discount=1.0) | Inverse discounted MSPE weights. |
| combine_dmspe(forecasts, y_true, discount=1.0) | Alias for inverse discounted MSPE. |
| combine_best_n(forecasts, y_true, n=3) | Average historically best n models. |
| combine_bates_granger(forecasts, y_true, ...) | Minimum error-variance weights (full covariance, 1969). |
| combine_granger_ramanathan(forecasts, y_true, variant=...) | Regression combination (ols/no_intercept/constrained). |
| combine_constrained_ls(forecasts, y_true, ...) | Non-negative weights summing to one. |
| combine_eigenvector(forecasts, y_true, ...) | Principal-component (Hsiao-Wan) combination. |
| combine_regularized(forecasts, y_true, penalty=...) | Ridge/Lasso-penalised regression weights. |
| combine_linear_pool(means, sds, weights=None) | Linear opinion pool of Gaussian densities. |
| combine_log_pool(means, sds, weights=None) | Logarithmic opinion pool of Gaussian densities. |
| combination_spec(method, name=None, models=None, **params) | Build a reusable runner combination spec. |
| custom_combination(name, func, models=None, **params) | Build a runner combination spec from a user callable. |
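The arithmetic behind a few of these rules can be sketched with the standard library. The functions below are illustrative stand-ins, not the macroforecast implementations. They assume forecasts is a list of periods, each holding one forecast per model, that the discount gives the most recent error a weight of one, and that pool weights sum to one.

```python
from statistics import fmean, median


def mean_combination(forecasts):
    """Equal-weight average across models for each period."""
    return [fmean(row) for row in forecasts]


def median_combination(forecasts):
    """Cross-model median for each period."""
    return [median(row) for row in forecasts]


def inverse_mspe_weights(forecasts, y_true, discount=1.0):
    """Weights proportional to 1 / discounted sum of squared errors.

    Assumed convention: the error at period t is scaled by discount**(T-1-t).
    A model with zero loss would need special handling and is not covered.
    """
    n_periods, n_models = len(forecasts), len(forecasts[0])
    losses = [
        sum(
            discount ** (n_periods - 1 - t) * (y_true[t] - forecasts[t][m]) ** 2
            for t in range(n_periods)
        )
        for m in range(n_models)
    ]
    inverse = [1.0 / loss for loss in losses]
    total = sum(inverse)
    return [value / total for value in inverse]


def linear_pool_moments(means, sds, weights):
    """Mean and standard deviation of a weighted mixture of Gaussians."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (s ** 2 + m ** 2) for w, m, s in zip(weights, means, sds))
    return mean, (second - mean ** 2) ** 0.5


def log_pool_moments(means, sds, weights):
    """Mean and standard deviation of the Gaussian implied by a log pool."""
    precision = sum(w / s ** 2 for w, s in zip(weights, sds))
    mean = sum(w * m / s ** 2 for w, m, s in zip(weights, means, sds)) / precision
    return mean, precision ** -0.5


forecasts = [[2.0, 3.0], [3.0, 4.0]]  # two periods, two models
y_true = [1.0, 2.0]
weights = inverse_mspe_weights(forecasts, y_true)
print(mean_combination(forecasts))    # [2.5, 3.5]
print(median_combination(forecasts))  # [2.5, 3.5]

linear = linear_pool_moments([0.0, 2.0], [1.0, 1.0], [0.5, 0.5])
log = log_pool_moments([0.0, 2.0], [1.0, 1.0], [0.5, 0.5])
```

In this example the first model has a quarter of the second model's squared-error loss, so it receives about 0.8 of the weight. The two pools share a mean of 1.0, but the linear pool has a standard deviation of about 1.41 against 1.0 for the log pool, because a mixture also counts disagreement between the component means.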