macroforecast.model_ensemble#

macroforecast.model_ensemble owns fit-time model composition. These callables receive one aligned X, y training sample, fit multiple member models inside that training window, and return one ModelFit whose predict(X_new) returns one forecast series. This is different from macroforecast.forecasting.combination, which combines already-produced forecast rows after models have been fitted.

Public Surface#

Symbol	Kind	Purpose
`bagging`	fit function	Bootstrap or block-bootstrap member fits.
`subagging`	fit function	Sampling-without-replacement member fits.
`random_subspace`	fit function	Member fits on random feature subsets.
`stacking`	fit function	Out-of-fold base predictions plus a meta learner.
`super_learner`	fit function	SuperLearner-style OOF convex weighted average.
`booging`	fit function	Bagged overfit stochastic boosting with feature perturbation.
`BaggingRegressor`	estimator class	Backend for `bagging` and `subagging`.
`RandomSubspaceRegressor`	estimator class	Backend for `random_subspace`.
`StackingRegressor`	estimator class	Backend for `stacking`.
`SuperLearnerRegressor`	estimator class	Backend for `super_learner`.
`BoogingRegressor`	estimator class	Backend for `booging`.
`MODEL_ENSEMBLE_BASE_ESTIMATORS`	registry	Supported inner estimators for `base`, `models`, and `meta_model`.
`MODEL_ENSEMBLE_SPECS`	registry	Registered fit-time ensemble specs.
`list_model_ensemble_bases`	helper	Return supported inner estimators and backend names.
`get_model_ensemble`	spec helper	Resolve a name, callable, or spec.
`list_model_ensemble_specs`	spec helper	Return a registry table.
`describe_model_ensemble`	spec helper	Return parameter/default/search-space documentation.
`model_ensemble_search_space`	spec helper	Return a preset search space.
`custom_model_ensemble`	extension helper	Build a user-owned `ModelSpec` with `family="model_ensemble"`.

Boundary#

Question	Use
Fit several member models inside one training window and expose one predictor?	`macroforecast.model_ensemble`
Fit independent models and combine their OOS forecast rows?	`macroforecast.forecasting.combination`
Use tree boosting as one estimator, such as XGBoost or LightGBM?	`macroforecast.models`

Common Contract#

All public fit functions in this namespace use the same callable shape:

Item	Contract
Input `X`	pandas-like predictor matrix. The index is preserved in fitted diagnostics. Rows are aligned with `y` first; any remaining missing values are then filled with `0.0` inside member estimators.
Input `y`	pandas-like target series. Required unless `X` is a `FeatureSet`.
Output	`macroforecast.models.ModelFit`. `fit.model` is the callable name, such as `subagging`, and `fit.predict(X_new)` returns one prediction series.
Metadata	`fit.metadata` stores the resolved training count, ensemble parameters, base-model choices, seeds, and an implementation note.
Diagnostics	`fit.diagnostics["model_ensemble"]` stores member/fold diagnostics when available. Point fitted values and residual metrics are also stored by the shared model wrapper.

Common diagnostics:

Key	Produced by	Meaning
`n_members`	all estimator-backed ensembles	Number of fitted member models stored in the ensemble.
`member_samples`	`bagging`, `subagging`, `booging`	Compact row-sampling ledger for each member.
`member_features`	`bagging`, `random_subspace`, `booging`	Feature subset ledger for each member.
`oob_predictions`, `oob_residuals`, `oob_metrics`	`bagging`, `subagging` when OOB rows exist	Out-of-bag diagnostics from member fits that did not use a row.
`folds`	`stacking`, `super_learner`	Fold ledger used to create out-of-fold predictions.
`oof_predictions`	`stacking`, `super_learner`	Out-of-fold library matrix.
`weights`, `oof_risk`	`super_learner`	Convex library weights and OOF MSE by learner.

Example:

fit = macroforecast.model_ensemble.subagging(
    X_train,
    y_train,
    base="ridge",
    n_estimators=50,
    max_samples=0.632,
)
pred = fit.predict(X_test)

Runner usage:

result = macroforecast.forecasting.run(
    panel,
    model="super_learner",
    target="INDPRO",
    horizon=1,
    features=macroforecast.feature_engineering.feature_spec(
        target="INDPRO",
        horizon=1,
        lags=12,
    ),
    params={
        "super_learner": {
            "models": ("ridge", "lasso", "random_forest"),
            "n_splits": 5,
            "weight_method": "nnls",
        }
    },
)

When a model ensemble is passed through a runner alias, downstream forecast combination selects the alias, not the registry name:

result = macroforecast.forecasting.run(
    panel,
    {"bagged": "bagging", "linear": "ridge"},
    combination={
        "linear_plus_bagged": {
            "method": "mean",
            "models": ["linear", "bagged"],
        }
    },
)

R And Paper Alignment#

macroforecast callable	R / paper reference	Alignment	Difference
`bagging`	`ipred::bagging` / `ipredbagg`	Multiple resampled training sets, one member model per draw, average predictions.	R default is tree-focused; macroforecast allows several sklearn-compatible base regressors.
`subagging`	`ipredbagg(ns < n)`	Samples fewer than `n` observations without replacement.	Exposed as a separate callable for clarity.
`random_subspace`	`regRSM::regRSM` and random-forest `mtry` logic	Repeated random predictor-subspace fits.	General member-level feature bagging, not tree split-level `mtry` and not regRSM’s variable-importance final-model selection.
`stacking`	`caretEnsemble::caretStack`	Fits a meta model on out-of-fold base predictions.	Adds `splitter="forward"` for macro time ordering; R caret examples often use generic CV.
`super_learner`	`SuperLearner::SuperLearner`	Uses OOF library predictions and nonnegative weights that sum to one.	Regression-only callable; supports `nnls`, `best`, and `equal` weights, not every R family/loss/plugin.
`booging`	Goulet Coulombe, To Bag is to Prune and the `bagofprunes` R source `Booging(y, X, X.new, ...)`	Samples rows with `sampling_rate`, samples features with `mtry`, optionally appends two perturbed fake feature copies, fits intentionally overfit stochastic boosting members, and averages member predictions.	Uses sklearn `GradientBoostingRegressor` as the boosting backend. Continuous scaling is train-only for leakage-safe estimator semantics, while the R script scales train and `X.new` jointly inside one prediction call.

The added functions beyond the old bagging/booging pair are subagging, random_subspace, stacking, and super_learner. They cover the main fit-time ensemble families that are useful before producing forecast rows: row-resampling ensembles, feature-subspace ensembles, OOF meta-learner ensembles, and convex-weight OOF library ensembles.

Not every R ensemble feature is copied. ipred double-bagging/bundling is a classification-oriented extension that passes extra learner predictions into trees; in macroforecast, cross-model forecast combinations belong in macroforecast.forecasting.combination. regRSM variable-importance final model selection belongs closer to feature selection and is not bundled into this fit-time ensemble callable.

Paper Citation And Scope#

The Booging implementation is based on:

Philippe Goulet Coulombe. 2024. To Bag is to Prune. arXiv:2008.07063v5.

The method code is cross-checked against the author’s public bagofprunes R source, especially Booging(y, X, X.new, ...) in PGC_Bag_of_Prunes_v200829.R. The package treats the paper as a method port, not as a full empirical replication. The goal is to make the randomized greedy-ensemble mechanism callable inside the macroforecast workflow.

The paper's mechanism maps onto macroforecast as follows:

Paper idea	Meaning for macroforecast
Random Forest has a large in-sample versus out-of-sample R-squared wedge	The model can overfit member trees without necessarily damaging the ensemble forecast.
Bagging plus perturbation performs implicit pruning	Averaging many randomized greedy paths cancels unstable noise-fitting steps while preserving stable signal-fitting steps.
Greedy separability	Early greedy steps are not re-optimized after later overfit steps, so useful structure can survive overfitting.
Perfectly random forest null argument	Under pure noise, averaging many random predictions tends toward a sample-mean-like forecast, matching the optimal pruning intuition.
LASSO contrast	Global re-optimization along a regularization path weakens the same mechanism; setting the penalty to zero collapses to OLS rather than randomized implicit pruning.
Booging	Applies bagging and perturbation to boosted trees, so the stopping point is regularized by averaging randomized overfit boosting paths.
MARSquake	Applies the same outer idea to MARS through randomized predictor admission in the forward pass.
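The null argument in the table can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's derivation: each hypothetical member is treated as a fully overfit predictor that returns one randomly chosen training target, and `simulate_null_average` is a name introduced here for illustration only.

```python
import random
import statistics


def simulate_null_average(y_train, n_members, seed=0):
    """Average the forecasts of many 'perfectly random' overfit members.

    Assumption for illustration: under pure noise, each member's forecast
    for a new point is one training target picked uniformly at random.
    """
    rng = random.Random(seed)
    member_forecasts = [rng.choice(y_train) for _ in range(n_members)]
    return statistics.fmean(member_forecasts)


# Pure-noise targets: no predictor carries signal.
noise_rng = random.Random(1)
y_train = [noise_rng.gauss(0.0, 1.0) for _ in range(50)]

sample_mean = statistics.fmean(y_train)
one_member = simulate_null_average(y_train, n_members=1)
many_members = simulate_null_average(y_train, n_members=20000)

# A single member is a noisy draw; the large ensemble sits close to the sample mean.
print(round(sample_mean, 3), round(one_member, 3), round(many_members, 3))
```

The single-member forecast is as noisy as the data, while the averaged forecast lands near the sample mean, which is the forecast a fully pruned model would give under pure noise.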

This package currently implements Booging. MARSquake is documented as a related paper method but is not exposed as a first-class callable yet, because exact alignment requires either a stable Python analogue of R `earth::earth` with a custom `allowed` callback or a package-native MARS forward-pass implementation. The plain MARS model remains available under macroforecast.models; that is not the same as MARSquake.

The paper’s empirical evidence spans simulated DGPs, many non-macro benchmark datasets, and six US macro forecasting tasks covering GDP, unemployment, and inflation horizons. For macroforecast, the relevant takeaway is operational: Booging is useful when small noisy macro samples make the exact boosting stopping point unstable, and the user wants randomized overfitting plus averaging as an alternative regularization route.

Bagging, Random Forest, And Booging#

These functions expose a common decomposition:

Method	Row sampling	Feature perturbation	Base learner	macroforecast location
Bagging	bootstrap or block bootstrap	optional member-level `max_features`	any registered inner base learner	`macroforecast.model_ensemble.bagging`
Random-subspace bagging	bootstrap/subsampling	member-level `max_features`	any registered inner base learner	`bagging(..., max_features=...)` or `random_subspace(...)`
Random forest	bootstrap plus split-level feature search	split-level tree `mtry`	CART trees	`macroforecast.models.random_forest`
Booging	row subsampling	member-level `mtry`, optional fake feature copies	overfit boosted trees	`macroforecast.model_ensemble.booging`

The distinction matters. bagging(base="decision_tree", max_features="sqrt") is a random-subspace tree ensemble, not an exact random forest, because the feature subset is drawn once per member rather than at each tree split. Use macroforecast.models.random_forest when the exact random-forest backend is the target. Use booging when the target is Goulet Coulombe’s randomized greedy-boosting ensemble.
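The member-level draw described above can be sketched in a few lines. This is an illustration only: `draw_member_features` is a hypothetical helper, and the floor-of-square-root rule for `"sqrt"` is an assumption rather than the documented macroforecast rounding.

```python
import math
import random


def draw_member_features(columns, n_members, seed=0):
    """Draw one sqrt-sized feature subset per member (member-level draw)."""
    rng = random.Random(seed)
    k = max(1, int(math.sqrt(len(columns))))
    return [sorted(rng.sample(columns, k)) for _ in range(n_members)]


columns = [f"x{i}" for i in range(9)]
member_features = draw_member_features(columns, n_members=3)

for b, subset in enumerate(member_features):
    # Every split inside member b may use only this fixed subset.
    # A random forest would instead redraw candidate features at each split.
    print(b, subset)
```

Because the subset is fixed before the tree is grown, a feature left out of a member's draw can never appear anywhere in that member, which is what separates this design from split-level `mtry`.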

Registered Specs#

MODEL_ENSEMBLE_SPECS mirrors the model registry but is intentionally separate from macroforecast.models.MODEL_SPECS.

macroforecast.model_ensemble.list_model_ensemble_specs()

Name	Default search	Presets	Backend
`bagging`	`random`	`small`, `standard`, `wide`	internal member resampling + sklearn-compatible base estimators
`subagging`	`random`	`small`, `standard`, `wide`	internal subagging + sklearn-compatible base estimators
`random_subspace`	`random`	`small`, `standard`, `wide`	internal random subspace + sklearn-compatible base estimators
`stacking`	`random`	`small`, `standard`, `wide`	internal OOF stacking + sklearn-compatible base/meta estimators
`super_learner`	`random`	`small`, `standard`, `wide`	internal SuperLearner-style OOF NNLS/equal/best weighting
`booging`	`random`	`small`, `standard`, `wide`	internal augmentation/bagging + sklearn.ensemble.GradientBoostingRegressor

The forecasting runner and macroforecast.model_selection.search_spec() both resolve these names, so model="bagging" now means the fit-time ensemble spec, not a base model spec.

Inner Base Estimators#

`bagging`, `subagging`, and `random_subspace` use `base=...`. `stacking` and `super_learner` use `models=(...)`, and `stacking` also uses `meta_model=...`. These names are intentionally narrower than `macroforecast.models.MODEL_SPECS`: they are inner sklearn-compatible estimators used inside a fit-time ensemble.

macroforecast.model_ensemble.list_model_ensemble_bases()

Name	Backend
`ols`	`sklearn.linear_model.LinearRegression`
`ridge`	`sklearn.linear_model.Ridge`
`lasso`	`sklearn.linear_model.Lasso`
`elastic_net`	`sklearn.linear_model.ElasticNet`
`decision_tree`	`sklearn.tree.DecisionTreeRegressor`
`random_forest`	`sklearn.ensemble.RandomForestRegressor`
`extra_trees`	`sklearn.ensemble.ExtraTreesRegressor`
`gradient_boosting`	`sklearn.ensemble.GradientBoostingRegressor`
`knn`	`sklearn.neighbors.KNeighborsRegressor`
`svr`	`sklearn.svm.SVR`

bagging#

macroforecast.model_ensemble.bagging(
    X,
    y,
    *,
    base="ridge",
    n_estimators=50,
    max_samples=0.8,
    random_state=0,
    base_params=None,
    strategy="standard",
    block_length=4,
    replace=True,
    max_features=None,
)

Input: X is a pandas-like predictor matrix and y is a target series. Output: ModelFit wrapping BaggingRegressor. predict(X_new) returns the mean across member predictions. predict_quantiles(X_new, levels=...) returns empirical quantiles across members.

The fitted object stores out-of-bag diagnostics when any observation is left out by at least one member: fit.diagnostics["model_ensemble"]["oob_predictions"], oob_residuals, oob_metrics, and n_members. It also stores member_features, even when max_features=None; this makes generic bagging, random-subspace bagging, and CART-like tree bagging inspectable with the same metadata contract.
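How a mean forecast and member-based quantiles relate can be shown with plain lists. This is a minimal sketch, assuming linear interpolation between order statistics; the interpolation rule actually used by `predict_quantiles` is not stated here, and `empirical_quantile` and `summarize_members` are hypothetical names.

```python
def empirical_quantile(sorted_values, level):
    """Linear-interpolation quantile of an already sorted list, 0 <= level <= 1."""
    position = level * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def summarize_members(member_predictions, levels=(0.1, 0.5, 0.9)):
    """Return the ensemble mean and empirical quantiles for one forecast row."""
    ordered = sorted(member_predictions)
    mean = sum(ordered) / len(ordered)
    quantiles = {level: empirical_quantile(ordered, level) for level in levels}
    return mean, quantiles


# Hypothetical forecasts from five members for a single row of X_new.
member_predictions = [1.0, 2.0, 3.0, 4.0, 10.0]
mean, quantiles = summarize_members(member_predictions)
print(mean, quantiles)
```

The quantiles describe the spread across resampled members, not a calibrated predictive distribution, so they are best read as a dispersion diagnostic.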

Parameter	Default	Meaning
`base`	`"ridge"`	Base estimator name.
`n_estimators`	`50`	Number of member fits.
`max_samples`	`0.8`	Row sample fraction per member.
`random_state`	`0`	Resampling seed.
`base_params`	`None`	Parameters passed to the base estimator.
`strategy`	`"standard"`	`standard` row draws or moving `block` draws.
`block_length`	`4`	Moving-block length when `strategy="block"`.
`replace`	`True`	Whether rows are sampled with replacement.
`max_features`	`None`	Optional member-level feature-subspace size. Accepts `None`/`"all"`, a fraction, an integer count, `"sqrt"`, or `"log2"`.
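The three row-sampling modes implied by `strategy` and `replace` can be sketched as follows. This is an illustration under assumptions, not the library code: `draw_rows` is a hypothetical helper, and the rounding of `max_samples` and the block placement rule are choices made for this example.

```python
import random


def draw_rows(n_rows, max_samples, rng, replace=True, strategy="standard", block_length=4):
    """Return row positions for one member (illustrative only)."""
    n_draw = max(1, int(round(max_samples * n_rows)))
    if strategy == "block":
        # Moving-block bootstrap: stitch together randomly placed contiguous blocks.
        rows = []
        while len(rows) < n_draw:
            start = rng.randrange(0, n_rows - block_length + 1)
            rows.extend(range(start, start + block_length))
        return rows[:n_draw]
    if replace:
        # Classic bootstrap: rows may repeat.
        return [rng.randrange(n_rows) for _ in range(n_draw)]
    # Subsampling: each row appears at most once.
    return rng.sample(range(n_rows), n_draw)


rng = random.Random(0)
bootstrap_rows = draw_rows(20, 0.8, rng)
block_rows = draw_rows(20, 0.8, rng, strategy="block")
subsample_rows = draw_rows(20, 0.8, rng, replace=False)
print(bootstrap_rows, block_rows, subsample_rows, sep="\n")
```

Block draws keep short runs of adjacent observations together, which is why they are the usual choice when serial dependence in the training window matters.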

subagging#

macroforecast.model_ensemble.subagging(
    X,
    y,
    *,
    base="ridge",
    n_estimators=50,
    max_samples=0.632,
    random_state=0,
    base_params=None,
    max_features=None,
)

subagging is bagging(..., replace=False, strategy="standard") with a default max_samples of 0.632 instead of 0.8. It follows the ipredbagg(ns < n) distinction where sampling fewer than n rows without replacement creates subagging rather than bootstrap bagging.

Input: X, y as above. Output: ModelFit wrapping BaggingRegressor with fit.model == "subagging". predict(X_new) averages member predictions.

Diagnostics store member_samples, OOB diagnostics when available, and n_members. fit.metadata["replace"] is always False.

Parameter	Default	Meaning
`base`	`"ridge"`	Base estimator name.
`n_estimators`	`50`	Number of subsampled member fits.
`max_samples`	`0.632`	Row sample fraction per member.
`random_state`	`0`	Resampling seed.
`base_params`	`None`	Parameters passed to the base estimator.
`max_features`	`None`	Optional member-level feature-subspace size, with the same accepted values as `bagging`.
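The out-of-bag bookkeeping behind these diagnostics can be sketched with a placeholder base learner. This is a minimal illustration, assuming each member simply predicts the mean of its sampled targets; `subagging_oob` is a hypothetical name and does not reproduce the `BaggingRegressor` internals.

```python
import random


def subagging_oob(y, n_estimators=50, max_samples=0.632, seed=0):
    """Subsample rows without replacement and build out-of-bag predictions.

    Placeholder base learner: each member predicts the mean of its sampled targets.
    """
    rng = random.Random(seed)
    n_rows = len(y)
    n_draw = max(1, int(max_samples * n_rows))
    oob_sums = [0.0] * n_rows
    oob_counts = [0] * n_rows
    for _ in range(n_estimators):
        in_bag = set(rng.sample(range(n_rows), n_draw))
        member_prediction = sum(y[i] for i in in_bag) / n_draw
        for i in range(n_rows):
            if i not in in_bag:  # this member never saw row i
                oob_sums[i] += member_prediction
                oob_counts[i] += 1
    # Rows used by every member have no OOB prediction and stay None.
    return [s / c if c else None for s, c in zip(oob_sums, oob_counts)]


y = [0.5, 1.5, 1.0, 2.0, 0.0, 1.2, 0.8, 1.7, 0.3, 1.0]
oob_predictions = subagging_oob(y)
oob_residuals = [yi - pi for yi, pi in zip(y, oob_predictions) if pi is not None]
print(len(oob_residuals))
```

Each row's OOB prediction averages only the members that did not train on it, which is why OOB residuals behave like a cheap validation error.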

random_subspace#

macroforecast.model_ensemble.random_subspace(
    X,
    y,
    *,
    base="ridge",
    n_estimators=100,
    max_features=0.5,
    max_samples=1.0,
    random_state=0,
    base_params=None,
)

Each member model sees a random subset of columns. max_features can be a fraction of columns, an integer count, "sqrt", or "log2". This is useful when p is large and the package user wants a fit-time model ensemble distinct from random-forest split-level mtry.

predict_quantiles(X_new, levels=...) returns empirical quantiles across random subspace members. The fitted diagnostics include member_features, which records which columns each member saw.

Parameter	Default	Meaning
`base`	`"ridge"`	Base estimator name.
`n_estimators`	`100`	Number of random feature-subspace fits.
`max_features`	`0.5`	Fraction, integer count, `"sqrt"`, or `"log2"` column count per member.
`max_samples`	`1.0`	Row subsample fraction per member.
`random_state`	`0`	Feature and row sampling seed.
`base_params`	`None`	Parameters passed to the base estimator.
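The accepted `max_features` forms can be read as a single rule that maps a setting to a column count. The sketch below is an assumption about how such a rule might look: `resolve_max_features` is a hypothetical helper, and the floor rounding and clipping to at least one column are illustrative choices, not documented behavior.

```python
import math


def resolve_max_features(max_features, n_columns):
    """Translate a max_features setting into a column count between 1 and n_columns."""
    if max_features is None or max_features == "all":
        count = n_columns
    elif max_features == "sqrt":
        count = int(math.sqrt(n_columns))
    elif max_features == "log2":
        count = int(math.log2(n_columns))
    elif isinstance(max_features, float):
        count = int(max_features * n_columns)  # fraction of columns
    elif isinstance(max_features, int):
        count = max_features  # absolute column count
    else:
        raise ValueError(f"unsupported max_features: {max_features!r}")
    return min(max(count, 1), n_columns)


resolved = {
    setting: resolve_max_features(setting, n_columns=16)
    for setting in (None, "sqrt", "log2", 0.5, 3)
}
print(resolved)
```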

stacking#

macroforecast.model_ensemble.stacking(
    X,
    y,
    *,
    models=("ridge", "lasso", "random_forest"),
    meta_model="ridge",
    n_splits=5,
    splitter="forward",
    random_state=0,
    model_params=None,
    meta_params=None,
    passthrough=False,
)

stacking creates out-of-fold base predictions, fits meta_model on those predictions, then refits every base model on the full training sample. For macro forecasting, the default splitter="forward" validates each fold only on a block that comes after its training data. splitter="blocked" and splitter="kfold" are available when the user wants R-style generic cross-validation behavior.

models must contain unique names because model_params is keyed by model name. The fitted diagnostics include the OOF prediction matrix.

Parameter	Default	Meaning
`models`	`("ridge", "lasso", "random_forest")`	Base model library. Names must be unique.
`meta_model`	`"ridge"`	Meta learner fit on OOF predictions.
`n_splits`	`5`	Number of OOF validation folds.
`splitter`	`"forward"`	OOF splitter: `forward`, `blocked`, or `kfold`.
`random_state`	`0`	Base/meta seed.
`model_params`	`None`	Per-base model parameter dictionary keyed by model name.
`meta_params`	`None`	Meta-model parameters.
`passthrough`	`False`	Whether the meta learner also receives original `X`.

Output: ModelFit wrapping StackingRegressor. Diagnostics include folds and oof_predictions.
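The idea behind a forward splitter can be sketched as an expanding-window fold generator. This is an illustration under assumptions: `forward_folds` is a hypothetical helper, and the equal-block layout is one plausible design, not necessarily the fold ledger that `stacking` records.

```python
def forward_folds(n_rows, n_splits):
    """Yield (train_rows, validation_rows) with training strictly before validation."""
    block = n_rows // (n_splits + 1)
    if block < 1:
        raise ValueError("not enough rows for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * block
        # The last fold absorbs any leftover rows at the end of the sample.
        valid_end = n_rows if k == n_splits else train_end + block
        yield list(range(train_end)), list(range(train_end, valid_end))


folds = list(forward_folds(n_rows=24, n_splits=5))
for train_rows, valid_rows in folds:
    print(len(train_rows), valid_rows[0], valid_rows[-1])
```

In this layout the first block is only ever used for training, so it receives no out-of-fold prediction; the meta learner would then be fit on the remaining rows.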

super_learner#

macroforecast.model_ensemble.super_learner(
    X,
    y,
    *,
    models=("ridge", "lasso", "random_forest"),
    n_splits=5,
    splitter="forward",
    weight_method="nnls",
    random_state=0,
    model_params=None,
)

super_learner uses the same OOF library matrix as stacking but fits a convex weighted average instead of a general meta model. weight_method="nnls" uses nonnegative least squares and normalizes weights to sum to one. best gives all weight to the lowest OOF-MSE learner, matching the discrete Super Learner idea. equal uses equal weights.

The fitted estimator exposes weights_ and oof_predictions_. models must contain unique names because weights and model_params are keyed by model name. fit.diagnostics["model_ensemble"] stores weights, oof_risk, and oof_predictions.

Parameter	Default	Meaning
`models`	`("ridge", "lasso", "random_forest")`	Base learner library. Names must be unique.
`n_splits`	`5`	Number of OOF validation folds.
`splitter`	`"forward"`	OOF splitter: `forward`, `blocked`, or `kfold`.
`weight_method`	`"nnls"`	`nnls`, `best`, or `equal`.
`random_state`	`0`	Base learner seed.
`model_params`	`None`	Per-base learner parameters keyed by model name.

Output: ModelFit wrapping SuperLearnerRegressor. Diagnostics include folds, oof_predictions, oof_risk, and weights.
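The `best` and `equal` weighting rules can be sketched directly from an OOF prediction table. This is a minimal illustration with hypothetical names (`oof_mse`, `library_weights`) and made-up numbers; the `nnls` rule is left out because it needs a constrained least-squares solver.

```python
def oof_mse(y, predictions):
    """Mean squared error over rows that have an out-of-fold prediction."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)


def library_weights(y, oof_predictions, weight_method="best"):
    """Return (weights, oof_risk) for the 'best' and 'equal' rules only."""
    oof_risk = {name: oof_mse(y, preds) for name, preds in oof_predictions.items()}
    if weight_method == "equal":
        weights = {name: 1.0 / len(oof_predictions) for name in oof_predictions}
    elif weight_method == "best":
        winner = min(oof_risk, key=oof_risk.get)
        weights = {name: float(name == winner) for name in oof_predictions}
    else:
        raise ValueError("this sketch covers only 'best' and 'equal'")
    return weights, oof_risk


# Hypothetical OOF predictions from two learners on four validation rows.
y = [1.0, 2.0, 3.0, 4.0]
oof_predictions = {
    "ridge": [1.5, 2.5, 2.5, 3.5],
    "lasso": [0.0, 1.0, 5.0, 6.0],
}
weights, oof_risk = library_weights(y, oof_predictions, weight_method="best")
# The ensemble forecast is the weighted sum of library predictions.
combined = [
    sum(weights[name] * oof_predictions[name][i] for name in oof_predictions)
    for i in range(len(y))
]
print(weights, oof_risk)
```

Both rules produce nonnegative weights that sum to one, so the combined forecast always stays inside the range spanned by the library predictions.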

booging#

macroforecast.model_ensemble.booging(
    X,
    y,
    *,
    B=100,
    sampling_rate=0.75,
    mtry=0.8,
    data_aug=False,
    noise_level=0.3,
    shuffle_rate=0.2,
    n_trees=1000,
    tree_depth=3,
    nu=0.3,
    bf=0.5,
    n_augmented_copies=2,
    scale_continuous=True,
    fix_seeds=True,
    random_state=0,
)

booging fits many intentionally overfit stochastic gradient-boosting members. It is the boosted-tree analogue of the randomized greedy ensemble logic behind random forests: draw rows, draw features, fit an overfit boosted-tree path, and average over many perturbed paths.

The callable accepts the paper/R-code names, with dots replaced by underscores:

R code name	macroforecast name	Meaning
`B`	`B`	Number of overfit boosting members.
`sampling.rate`	`sampling_rate`	Row fraction sampled without replacement per member.
`mtry`	`mtry`	Feature-subspace size per member. Accepts fractions, integer counts, `"sqrt"`, or `"log2"`.
`data.aug`	`data_aug`	Whether to append fake perturbed feature copies.
`noise.level`	`noise_level`	Gaussian noise scale for continuous fake copies after standardization.
`shuffle.rate`	`shuffle_rate`	Share of rows shuffled inside binary fake copies.
`bf`	`bf`	Stochastic boosting subsample share inside each boosted member.
`n.trees`	`n_trees`	Boosting stages inside each member.
`tree.depth`	`tree_depth`	Inner boosting tree depth.
`nu`	`nu`	Inner boosting learning rate.

Backward-compatible names are still accepted: sample_frac, inner_n_estimators, inner_learning_rate, inner_max_depth, inner_subsample, max_features, da_noise_frac, and da_drop_rate.

Backend parameter mapping:

R `gbm` / Booging parameter	sklearn / macroforecast target
`gbm::gbm(n.trees=...)`	`GradientBoostingRegressor(n_estimators=...)` through `n_trees`
`gbm::gbm(shrinkage=...)`	`GradientBoostingRegressor(learning_rate=...)` through `nu`
`gbm::gbm(interaction.depth=...)`	`GradientBoostingRegressor(max_depth=...)` through `tree_depth`
`gbm::gbm(bag.fraction=...)`	`GradientBoostingRegressor(subsample=...)` through `bf`
`Booging(..., sampling.rate=...)`	outer row subsampling before each boosted member
`Booging(..., mtry=...)`	outer member-level feature subsampling before each boosted member

When data_aug=True, macroforecast follows the R algorithm’s fake-column structure. Continuous variables are standardized and then copied with Gaussian noise. Binary variables are copied after shuffling a shuffle_rate share of rows. Two fake copies are used by default, matching the R source’s fake1_ and fake2_ construction. Prediction uses deterministic fake copies of X_new, so calling predict() does not draw new perturbations.
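The two perturbation types can be sketched on plain lists. This is an illustration under assumptions, not a port of the R source: `fake_continuous_copy` and `fake_binary_copy` are hypothetical helpers, and permuting values among a randomly chosen share of rows is one reading of the shuffle step.

```python
import random


def fake_continuous_copy(standardized_column, noise_level, rng):
    """Copy a standardized column and add Gaussian noise with the given scale."""
    return [value + rng.gauss(0.0, noise_level) for value in standardized_column]


def fake_binary_copy(binary_column, shuffle_rate, rng):
    """Copy a binary column and shuffle the values of a random share of rows."""
    copy = list(binary_column)
    n_shuffle = int(round(shuffle_rate * len(copy)))
    rows = rng.sample(range(len(copy)), n_shuffle)
    values = [copy[i] for i in rows]
    rng.shuffle(values)
    for i, value in zip(rows, values):
        copy[i] = value
    return copy


rng = random.Random(0)
continuous = [-1.2, -0.4, 0.0, 0.4, 1.2]
binary = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0]

fake1_continuous = fake_continuous_copy(continuous, noise_level=0.3, rng=rng)
fake1_binary = fake_binary_copy(binary, shuffle_rate=0.2, rng=rng)
print(fake1_continuous, fake1_binary)
```

Shuffling preserves the share of ones in the binary copy while weakening its link to the target, and the noise scale is meaningful for continuous copies only because the columns are standardized first.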

Algorithm:

1. Align X and y under the standard ModelFit contract.
2. Detect binary predictors as columns with exactly two finite unique values.
3. Standardize continuous predictors using the training sample.
4. If data_aug=True, append n_augmented_copies fake copies: continuous fake columns receive Gaussian perturbations, and binary fake columns receive row-shuffle perturbations.
5. For each member b = 1, ..., B, draw rows at sampling_rate, draw columns at mtry, fit an overfit stochastic gradient-boosting member, and store the row and feature ledgers.
6. Predict by averaging member predictions. Quantile forecasts use empirical quantiles across member predictions.
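The binary-detection and train-only standardization steps above can be sketched as follows. This is a minimal illustration with hypothetical helpers (`is_binary`, `fit_scaler`, `apply_scaler`); using the population standard deviation is an assumption, and the estimator may use a different convention.

```python
import math
import statistics


def is_binary(column):
    """True when the column has exactly two distinct finite values."""
    finite_values = {v for v in column if math.isfinite(v)}
    return len(finite_values) == 2


def fit_scaler(train_column):
    """Learn mean and standard deviation from the training sample only."""
    mean = statistics.fmean(train_column)
    std = statistics.pstdev(train_column)
    # Guard against constant columns so the division below stays defined.
    return mean, (std if std > 0 else 1.0)


def apply_scaler(column, mean, std):
    """Standardize a column with previously learned statistics."""
    return [(v - mean) / std for v in column]


train = {"rate": [1.0, 2.0, 3.0, 4.0], "recession": [0.0, 0.0, 1.0, 0.0]}
new = {"rate": [10.0], "recession": [1.0]}

# Binary columns are left unscaled; continuous columns get a train-only scaler.
scalers = {name: fit_scaler(col) for name, col in train.items() if not is_binary(col)}
# X_new is transformed with training statistics, so it never influences the fit.
new_scaled = {
    name: apply_scaler(col, *scalers[name]) if name in scalers else col
    for name, col in new.items()
}
print(scalers, new_scaled)
```

Fitting the scaler on the training rows and only applying it to new rows is what keeps the fit independent of future predictor values.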

Input: X, y as above. Output: ModelFit wrapping BoogingRegressor. predict(X_new) averages overfit boosting members, and predict_quantiles(X_new, levels=...) returns member-forecast empirical quantiles.

Diagnostics include member_samples, member_features, augmentation_summary, and n_members. member_features names original and fake columns; the augmentation summary reports the number of binary and continuous features, the number of fake copies, and the leakage boundary for continuous scaling.

Parameter	Default	Meaning
`B`	`100`	Number of overfit boosting members.
`sampling_rate`	`0.75`	Row sample fraction per member.
`mtry`	`0.8`	Feature-subspace size per member.
`data_aug`	`False`	Whether to append perturbed fake feature copies.
`noise_level`	`0.3`	Gaussian noise scale for continuous fake copies.
`shuffle_rate`	`0.2`	Binary-feature row-shuffle share for fake copies.
`n_trees`	`1000`	Boosting stages inside each member.
`tree_depth`	`3`	Inner boosting tree depth.
`nu`	`0.3`	Inner boosting learning rate.
`bf`	`0.5`	Stochastic gradient boosting subsample share. For samples below 100 rows, this is floored at `0.4`, matching the R code.
`n_augmented_copies`	`2`	Number of fake feature copies when `data_aug=True`.
`scale_continuous`	`True`	Standardize continuous variables before fake-copy perturbation.
`fix_seeds`	`True`	Use deterministic member seeds analogous to the R source’s `set.seed(2020+b)`.
`random_state`	`0`	Member, row, column, and augmentation seed.

Important implementation note: the R script scales continuous X and X.new jointly because it receives the training and new matrices in the same function call. The macroforecast estimator uses train-only scaling, because a standard fit/predict object must not use future X_new information during fit(). This is the deliberate leakage-safe difference; the member sampling, fake-copy perturbation, boosting backend role, and averaging logic follow the R algorithm.

Custom Extensions#

Use custom_model_ensemble() when the user wants a fit-time ensemble that behaves like a model spec:

spec = macroforecast.model_ensemble.custom_model_ensemble(
    "my_ensemble",
    my_fit_function,
    default_params={"B": 20},
)

Use macroforecast.forecasting.custom_combination() instead when the custom logic combines forecast rows after runner execution.