macroforecast.models#
macroforecast.models contains directly callable model-fitting functions. Each
function accepts pandas data, fits immediately, and returns a fitted result
object with a predict() method.
lasso_path is intentionally not a public model family. Use lasso() with a
chosen alpha, or use get_model("lasso") with model_selection.select_params()
to choose alpha from the lasso-owned search space.
Return Objects#
Model functions return ModelFit.
macroforecast.models.ModelFit(
    estimator,
    model,
    feature_names=(),
    target_name=None,
    metadata={},
    diagnostics={},
)
| Attribute | Type | Description |
|---|---|---|
| estimator | object | Fitted underlying estimator. |
| model | str | Canonical model name. |
| feature_names | tuple[str, …] | Feature columns used at fit time. |
| target_name | str or None | Target name when available. |
| metadata | dict | Fit metadata such as n_obs. |
| diagnostics | dict | Model-specific fitted diagnostics that can be collected safely. |
ModelFit.predict(X) returns a pandas.Series named "prediction" and keeps
the index of the provided X when X is a DataFrame.
For custom fitted objects used in forecasting.run(...), predict(X_test)
may return an array-like object or a pandas Series/single-column DataFrame.
Array-like output is treated positionally. Pandas output must either be indexed
exactly like X_test.index or use the default positional index
RangeIndex(len(X_test)). Any other index raises an error rather than silently
creating missing forecasts. The same DataFrame index rule applies to
predict_quantiles(X_test) when a fitted object exposes quantile forecasts.
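The index rule above can be sketched as a small standalone check. This is only an illustration under stated assumptions: align_prediction is a hypothetical helper name, not a macroforecast function, and the real runner may word its errors differently.

```python
import numpy as np
import pandas as pd


def align_prediction(pred, index):
    """Hypothetical check mirroring the index rule described above."""
    if isinstance(pred, pd.DataFrame):
        if pred.shape[1] != 1:
            raise ValueError("expected a single-column DataFrame")
        pred = pred.iloc[:, 0]
    if isinstance(pred, pd.Series):
        positional = pd.RangeIndex(len(index))
        # Accept only an exact index match or the default positional index.
        if pred.index.equals(index) or pred.index.equals(positional):
            return pd.Series(pred.to_numpy(), index=index, name="prediction")
        raise ValueError("prediction index does not match X_test.index")
    # Array-like output is treated positionally.
    values = np.asarray(pred).reshape(-1)
    if len(values) != len(index):
        raise ValueError("prediction length does not match X_test")
    return pd.Series(values, index=index, name="prediction")


idx = pd.Index(["2000-01", "2000-02", "2000-03"])
aligned = align_prediction([0.1, 0.2, 0.3], idx)
```

A Series indexed in a different order (for example reversed) is rejected rather than reindexed, which is what prevents silently missing forecasts.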
ModelFit.to_dict() returns JSON-ready fit metadata. It records the canonical
model name, the underlying estimator class name, the fitted feature names, the
target name, n_features, the fit metadata, and JSON-ready diagnostics.
It does not serialize the fitted estimator itself.
fit = macroforecast.models.ridge(X, y, alpha=0.5)
fit.to_dict()
Output shape:
{
    "model": "ridge",
    "estimator": "sklearn.linear_model._ridge.Ridge",
    "feature_names": ["x1", "x2"],
    "target_name": "y",
    "n_features": 2,
    "metadata": {"n_obs": 120, "alpha": 0.5},
    "diagnostics": {
        "fitted_values": {"name": "fitted", "index": [...], "data": [...]},
        "residuals": {"name": "residual", "index": [...], "data": [...]},
        "metrics": {"n": 120, "mae": 0.04, "mse": 0.003, "rmse": 0.055},
        "coefficients": {"name": "coefficient", "index": ["x1", "x2"], "data": [...]},
        "selected_features": ["x1", "x2"],
    },
}
ModelFit.to_metadata() wraps the same block under {"model": ..., "fit": ...}
for downstream forecasting and result records.
Volatility functions return VolatilityFit, which extends ModelFit with
predict_variance(horizon=1) and conditional_volatility.
Some model families expose backend estimator classes as public symbols. These are useful when users need estimator-native attributes, custom wrappers, or type checks; the normal user entry point remains the lowercase fit function.
| Estimator class | Fit function | Meaning |
|---|---|---|
|  |  | Internal MARS-style spline regressor. |
|  |  | Competition-based LGB+ hybrid tree/linear boosting backend. |
|  |  | Alternating LGB^A+ hybrid tree/linear boosting backend. |
|  |  | Quantile forest backend. |
|  |  | Macro Random Forest backend. |
|  |  | Supervised PCA backend. |
|  |  | Supervised scaled-PCA backend. |
|  |  | ARCH/GARCH volatility backend. |
|  |  | Realized-GARCH backend. |
|  |  | Generic assemblage/Albacore-style constrained aggregation backend. |
Fit Persistence#
models owns the low-level persistence format for fitted model objects.
Forecasting runners decide which fitted object should be saved and which
window, selection, and parameter metadata should be attached.
saved = macroforecast.models.save_fit(
    fit,
    "trained_model/ridge/origin_0_h1_20000131.pkl",
    metadata={
        "alias": "ridge",
        "params": {"alpha": 0.1},
        "model_selection": selection_metadata,
    },
)
loaded = macroforecast.models.load_fit(saved.model_path)
| Function | Input | Output | Meaning |
|---|---|---|---|
| save_fit | Fitted model object and output paths. | SavedModel | Writes pickle plus JSON sidecar. |
| load_fit | Pickle path. | fitted object | Loads a saved fit. |
SavedModel.to_dict() returns model_path, metadata_path, and
save_error. If a custom/local model cannot be pickled, model_path is
None, save_error records the failure, and the JSON sidecar is still
written. The sidecar always includes the available fit.to_metadata() block
when the fit exposes it.
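The "pickle plus JSON sidecar" behavior, including the fallback when pickling fails, can be sketched with the standard library alone. This is a rough illustration, not the package's implementation: save_with_sidecar, the .json sidecar suffix, and the record layout are assumptions made for the example.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_with_sidecar(obj, model_path, metadata):
    """Hypothetical helper: pickle `obj` and always write a JSON sidecar."""
    model_path = Path(model_path)
    model_path.parent.mkdir(parents=True, exist_ok=True)
    metadata_path = model_path.with_suffix(".json")  # assumed sidecar name
    record = {"model_path": str(model_path), "save_error": None}
    try:
        # Serialize first so a failure never leaves a partial pickle behind.
        payload = pickle.dumps(obj)
        model_path.write_bytes(payload)
    except Exception as exc:  # unpicklable custom/local objects land here
        record["model_path"] = None
        record["save_error"] = f"{type(exc).__name__}: {exc}"
    record["metadata_path"] = str(metadata_path)
    metadata_path.write_text(json.dumps({**record, "metadata": metadata}))
    return record


with tempfile.TemporaryDirectory() as tmp:
    ok = save_with_sidecar({"alpha": 0.1}, Path(tmp) / "ok.pkl", {"alias": "ridge"})
    bad = save_with_sidecar(lambda x: x, Path(tmp) / "bad.pkl", {"alias": "local"})
    bad_sidecar_exists = Path(bad["metadata_path"]).exists()
```

The lambda stands in for a custom/local model that cannot be pickled: the model path comes back as None, the error is recorded, and the sidecar is still written.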
Fit Diagnostics#
Diagnostics are collected on a best-effort basis. A model only records values that the fitted backend exposes and that can be computed without changing the fit. Missing keys mean the model does not expose that diagnostic, not that the fit failed.
Common keys:
| Key | Recorded when | Meaning |
|---|---|---|
| fitted_values | Estimator exposes | In-sample fitted values indexed like the aligned target. |
| residuals |  | Training residuals. |
| metrics |  | Residual count, mean, standard deviation, MAE, MSE, and RMSE. |
| coefficients | Estimator exposes | Coefficients indexed by feature name when possible. |
|  | Estimator exposes | Scalar or list intercept. |
| selected_features | Nonzero coefficients or estimator selection metadata is available. | Selected feature names. |
| feature_importance | Estimator exposes | Tree-style importances sorted descending. |
|  | Estimator exposes factor loadings, loadings, or components. | Factor/PCA loading matrix when available. |
|  | Estimator exposes component-level selection metadata. | Selected source features for supervised PCA-style components. |
|  | Estimator records iterative training history. | Epoch loss or backend-specific training trace. |
|  | Volatility estimator exposes fitted conditional volatility. | In-sample conditional volatility path. |
|  | Volatility estimator exposes fitted parameter estimates. | Fitted volatility-model parameters. |
Example:
fit = macroforecast.models.random_forest(X, y, n_estimators=200)
fit.diagnostics["feature_importance"].head()
Model Specs And Hyperparameter Spaces#
Model functions fit immediately. Model specs are the model-selection objects: they keep the fit callable together with model-owned defaults, tunable parameters, and preset search spaces.
model = macroforecast.models.get_model("lasso", preset="standard")
result = macroforecast.model_selection.select_params(
    model,
    X,
    y,
    window=macroforecast.window.expanding(min_train_size=120),
    metric=macroforecast.metrics.rmse,
)
fit = model(X, y, **result.best_params)
ModelSpec#
macroforecast.models.ModelSpec(
    name,
    family,
    fit_func,
    default_params={},
    parameters=(),
    search_spaces={},
    default_search_method="grid",
    default_preset="standard",
    input_kind="supervised",
    preset="standard",
    params={},
    backend="internal",
    requires_extra=None,
    requires_scaling=False,
    recommended_preprocessing=(),
)
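| Attribute | Type | Meaning |
|---|---|---|
| name | str | Canonical model name. |
| family | str | Model family such as linear or tree. |
| fit_func | callable | Underlying fit function. |
| default_params | dict | Model-owned default keyword arguments. |
| parameters | tuple | ModelParameter entries describing the model's parameters. |
| search_spaces | dict | Preset-specific hyperparameter candidates. |
| default_search_method | str | Search method normally used for the model. |
| default_preset | str | Default hyperparameter preset. |
| input_kind | str | Input convention: supervised, target, panel, or volatility. |
| preset | str | Active search-space preset. |
| params | dict | User-fixed model parameters. |
| backend | str | Implementation backend, for example sklearn.linear_model.Ridge. |
| requires_extra | str or None | Optional dependency extra required to fit the model. |
| requires_scaling | bool | Whether the model is scale-sensitive and expects explicit preprocessing. |
| recommended_preprocessing | tuple[str, …] | Short preprocessing notes attached to metadata. |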
ModelSpec is callable:
model = macroforecast.models.get_model("ridge", params={"alpha": 0.5})
fit = model(X, y)
ModelSpec.to_dict() returns a detailed JSON-ready specification including
defaults, fixed params, parameter descriptions, and all preset search spaces.
ModelSpec.to_metadata() returns the compact runner-facing block:
{
    "model": "ridge",
    "model_family": "linear",
    "model_preset": "small",
    "input_kind": "supervised",
    "backend": "sklearn.linear_model.Ridge",
    "requires_extra": None,
    "requires_scaling": False,
    "recommended_preprocessing": [],
    "default_search_method": "cv_path",
    "default_params": {"alpha": 1.0},
    "params": {"alpha": 0.5},
    "search_space": {"alpha": [0.01, 0.1, 1.0]},
}
get_model#
macroforecast.models.get_model(model, *, preset=None, params=None)
| Input | Type | Meaning |
|---|---|---|
| model | str, callable, or ModelSpec | Model name, registered model callable, or existing spec. |
| preset | str or None | Search-space preset to attach. |
| params | dict or None | Fixed model parameters to attach. |

| Output | Type | Meaning |
|---|---|---|
| return | ModelSpec | Callable model spec with model-owned defaults and spaces. |
custom_model#
Build a user-owned ModelSpec without registering a package model.
macroforecast.models.custom_model(
    name: str,
    fit_func,
    *,
    family: str = "custom",
    default_params: Mapping[str, object] | None = None,
    parameters: tuple[ModelParameter, ...] = (),
    search_spaces: dict[str, dict[str, tuple[object, ...]]] | None = None,
    default_search_method: str = "grid",
    default_preset: str = "standard",
    input_kind: str = "supervised",
    backend: str = "custom",
    requires_extra: str | None = None,
    requires_scaling: bool = False,
    recommended_preprocessing: tuple[str, ...] = (),
    description: str | None = None,
) -> ModelSpec
Callable Contract#
The default supervised contract is:
fit_func(X: pandas.DataFrame, y: pandas.Series, **params) -> fitted_object
The fitted object must expose:
fitted_object.predict(X_test)
predict(X_test) may return a pandas Series, a single-column DataFrame, or
an array-like object with length len(X_test). Pandas output must either use
X_test.index or RangeIndex(len(X_test)); any other index is rejected by
forecasting.run(...).
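A minimal sketch of a fit callable that satisfies this contract is shown below. MeanFit and mean_model are hypothetical names chosen for illustration (the offset parameter is likewise an assumption); nothing here calls macroforecast itself.

```python
import pandas as pd


class MeanFit:
    """Fitted object: stores a constant forecast level."""

    def __init__(self, level):
        self.level = level

    def predict(self, X_test):
        # Return a Series indexed exactly like X_test, as the contract requires.
        return pd.Series(self.level, index=X_test.index, name="prediction")


def mean_model(X, y, offset=0.0):
    """fit_func(X, y, **params) -> fitted_object; X is deliberately unused."""
    return MeanFit(float(y.mean()) + offset)


X = pd.DataFrame({"x1": [1.0, 2.0, 3.0]}, index=["2000-01", "2000-02", "2000-03"])
y = pd.Series([1.0, 2.0, 3.0], index=X.index)
fit = mean_model(X, y, offset=0.5)
pred = fit.predict(X.iloc[-1:])
```

Because predict() reuses X_test.index, the output passes the index rule without relying on the positional fallback.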
Set input_kind when the custom model follows another convention:
| input_kind | Fit callable receives | Use case |
|---|---|---|
| supervised |  | Regression-style models. |
| target |  | Target-only time-series models. |
| panel |  | Panel-input models. |
| volatility |  | Volatility or density models. |
search_spaces uses the same model-owned preset contract as registered models:
import macroforecast as mf

# mean_model, panel, window, and features are assumed to be defined earlier.
model = mf.models.custom_model(
    "mean_model",
    mean_model,
    default_params={"offset": 0.0},
    search_spaces={
        "small": {"offset": (-0.1, 0.0, 0.1)},
        "standard": {"offset": (-0.5, 0.0, 0.5)},
    },
)
result = mf.forecasting.run(
    panel,
    {"mean": model},
    window=window,
    features=features,
    preset={"mean": "small"},
)
custom_model() does not mutate the global registry. Pass the returned
ModelSpec directly to forecasting.run(...), model_selection.select_params(...),
or model_search_space(...).
list_model_specs#
macroforecast.models.list_model_specs(family=None)
Returns a DataFrame with one row per registered model: name, family,
input_kind, backend, requires_extra, requires_scaling,
recommended_preprocessing, default_search_method, default_preset,
available presets, and n_tunable.
describe_model#
macroforecast.models.describe_model(model)
Returns a DataFrame with parameter-level documentation and preset search spaces.
Example:
| parameter | default | tunable | small_space | standard_space |
|---|---|---|---|---|
|  |  |  |  |  |
|  |  |  |  |  |
model_search_space#
macroforecast.models.model_search_space(model, *, preset=None)
Returns the model-owned candidate dictionary for the selected preset.
MODEL_SPECS is the public registry backing get_model(...),
list_model_specs(), describe_model(...), and model_search_space(...).
macroforecast.models.model_search_space("random_forest", preset="small")
Output:
{
    "n_estimators": (50, 100),
    "max_depth": (3, 5, None),
    "min_samples_leaf": (1, 3),
}
Presets#
| Preset | Purpose |
|---|---|
| small | Fast smoke tests and short interactive checks. |
| standard | Default analysis-scale search space. |
|  | Larger search space for more expensive runs. |
Input And Output Conventions#
| Input kind | Callable shape | Use case |
|---|---|---|
| supervised |  | Most regression, factor, and tree models. Fit-time ensembles use the same shape in |
| target |  | Univariate target-only models such as |
| panel |  | Multivariate time-series models such as |
| volatility |  | Return and volatility models. |
For supervised models, X may be a pandas DataFrame, a 2-D array, or a
FeatureSet. y may be a Series, 1-D array, or one-column DataFrame. If X
is a FeatureSet, y can be omitted.
All non-volatility model functions return ModelFit. Volatility functions
return VolatilityFit.
Scaling Policy#
The clean model API does not silently standardize predictors for models that
are traditionally scale-sensitive. Instead, those models advertise
requires_scaling=True through ModelSpec, list_model_specs(),
model_search_space(), and select_params() metadata.
lasso(..., standardize=True) and elastic_net(..., standardize=True) are
explicit opt-in replication helpers; the default remains False.
There are two different scaling locations:
macroforecast.preprocessing.standardize_panel() standardizes a panel before model fitting. If it is run on the full sample outside the forecasting runner, it uses full-sample moments. In a leakage-safe run, use runner preprocessing specs and window policies so the scaling state is fitted only on allowed rows.
model(..., standardize=True) standardizes inside that model’s own fit call. It is useful when only selected models need model-local scaling, or when a lasso/elastic-net replication requires the penalty grid to be defined on window-local standardized predictors. For broader model-specific transformations beyond scaling, run separate model pipelines or use a model-pipeline runner layer rather than hiding those transformations inside a single estimator.
Current scale-sensitive callable models:
| Model | Backend | Scaling policy |
|---|---|---|
|  |  | Standardize predictors with |
|  |  | Standardize predictors before fitting. |
|  |  | Standardize predictors before fitting. |
|  |  | Standardizes |
|  |  | Standardizes |
|  | torch dual-head dense network | Standardizes |
|  | torch-native Aionx DensityHNN port | Standardizes |
nn, lstm, gru, transformer, and density_hnn standardize X and y
inside each fit window and map predictions back to the target scale.
hemisphere_nn standardizes X and keeps the target in original units because
its variance head is a compact density-forecast object. The metadata for all of
these neural models records requires_extra="deep" and requires_scaling=False.
Registered Model Catalog#
| Model | Family | Input kind | Default search | Presets |
|---|---|---|---|---|
|  | linear | supervised |  | none |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | assemblage | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  |  |
|  | linear | supervised |  | none |
|  | linear | supervised |  |  |
|  | nonparametric | supervised |  |  |
|  | nonparametric | supervised |  |  |
|  | linear | supervised |  |  |
|  | support_vector | supervised |  |  |
|  | support_vector | supervised |  |  |
|  | support_vector | supervised |  |  |
|  | neural | supervised |  |  |
|  | neural | supervised |  |  |
|  | neural | supervised |  |  |
|  | neural | supervised |  |  |
|  | neural | supervised |  |  |
|  | neural | supervised |  |  |
|  | composite | supervised |  |  |
|  | composite | supervised |  |  |
|  | composite | supervised |  |  |
|  | composite | supervised |  |  |
|  | timeseries | target |  |  |
|  | timeseries | panel |  |  |
|  | timeseries | panel |  |  |
|  | timeseries | panel |  |  |
|  | timeseries | target |  | none |
|  | timeseries | target |  | none |
|  | timeseries | target |  | none |
|  | mixed_frequency | panel |  |  |
|  | mixed_frequency | supervised |  |  |
|  | mixed_frequency | supervised |  |  |
|  | mixed_frequency | supervised |  |  |
|  | mixed_frequency | supervised |  |  |
|  | mixed_frequency | supervised |  |  |
|  | mixed_frequency | panel |  |  |
|  | factor | supervised |  |  |
|  | factor | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | spline | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | tree | supervised |  |  |
|  | volatility | volatility |  |  |
|  | volatility | volatility |  |  |
|  | volatility | volatility |  |  |
Linear Models#
Linear implementation map#
The linear family mixes thin sklearn wrappers, hybrid macroforecast code, and
package-native solvers. backend in ModelSpec records that distinction so
metadata exported by describe_model(), list_model_specs(), and saved
ModelFit objects is inspectable.
| Model | Implementation class | Runtime backend |
|---|---|---|
|  | external wrapper |  |
|  | external wrapper |  |
|  | external wrapper |  |
|  | external wrapper |  |
|  | external wrapper |  |
|  | external wrapper |  |
|  | hybrid | macroforecast adaptive weights, final |
|  | hybrid | macroforecast adaptive weights, final |
|  | package-native | augmented ridge design solved by |
|  | package-native | custom objective solved by |
|  | package-native | custom difference-penalty objective solved by SLSQP |
|  | package-native | Albacore/assemblage-derived constrained aggregation objectives solved by SLSQP |
|  | package-native | expanded time-varying design solved by |
|  | package-native | Python port of |
|  | package-native | proximal-gradient group-lasso solver |
|  | package-native | proximal-gradient sparse-group-lasso solver |
|  | package-native | componentwise L2 boosting loop |
external wrapper means the statistical estimator is delegated to an external
package and macroforecast only standardizes the callable contract, metadata,
diagnostics, and persistence. hybrid means macroforecast owns the macro-level
algorithmic transformation and delegates the final convex solver. package-native
means the objective, iteration, or coefficient path logic is implemented inside
macroforecast, using NumPy/SciPy only for basic numerical linear algebra or
generic optimization.
R source comparison map#
The following R sources are the comparison surface for the linear models where macroforecast owns nontrivial logic. These sources are not vendored into the package. They are used as independent algorithm references; the Python code keeps short source cues in comments and implements the corresponding mathematical objective in macroforecast’s callable API.
| macroforecast model | R package/source to inspect | Comparison target | Current equivalence status |
|---|---|---|---|
|  |  | Adaptive lasso is expressible in R by computing initial weights and passing them as | Same fixed-weight objective after macroforecast standardizes |
|  |  | Adaptive elastic net uses the same adaptive weights with elastic-net mixing. | Same fixed-weight idea with mean-one penalty weights; macroforecast delegates the final fit to sklearn |
|  |  | NNLS solves least squares under coefficient non-negativity. | Equivalent to NNLS on the augmented design |
|  |  | Compare target-shrinkage/Tikhonov logic, not an identical regression API. | No exact same R regression callable found. macroforecast solves |
|  | fused L2 ridge family in | Compare L2 fusion/smoothness penalty structure. | Not identical domain: R source is primarily fused ridge for precision matrices; macroforecast applies an L2 finite-difference penalty directly to regression coefficients. |
|  |  | Nonnegative component weights, optional target-weight shrinkage, sum-to-one basket constraint. | Same fixed-alpha objective family as the R CVXR fit: SSE plus feature-std-scaled target shrinkage, |
|  |  | Sort components into rank space, estimate nonnegative smooth rank weights with a mean-matching constraint. | Same fixed-alpha rank objective family: row sorting, fused difference penalty on scaled rank weights, |
|  |  | Generic component/rank supervised aggregation. | Exposes the reusable primitives without requiring inflation data. Paper-specific inflation semantics live in the |
|  |  | Random-walk coefficients in a time-varying regression. | Same modeling prior idea, different inference: |
|  |  | Goulet Coulombe TVP ridge / two-step ridge regression. | Direct Python port of the R |
|  |  | Group penalty over coefficient blocks. | Same group-lasso penalty family for Gaussian loss; macroforecast uses a single-alpha proximal-gradient solver rather than a full regularization path. |
|  |  | Sparse group lasso objective with group and feature-level penalties. | Same penalty decomposition; macroforecast uses one selected |
|  |  | Componentwise gradient/L2 boosting. | Same Gaussian componentwise L2 update: center predictors by default, select the base learner by normalized correlation, and apply shrinkage. The paper’s per-step random candidate rule is expressed as |
The direct implementations should be reviewed against the objective, scaling, intercept handling, penalty normalization, and solver stopping rule separately. Matching an R package name is not enough: several R implementations solve a path problem, a Bayesian state-space problem, or a matrix-estimation problem, whereas macroforecast exposes a single callable forecasting estimator.
ols#
macroforecast.models.ols(X, y)
Fits ordinary least squares.
| Item | Value |
|---|---|
| Input |  |
| Output |  |
| Backend |  |
| Default params | none |
| Tunable params | none |
| Preset search spaces | none |
ridge#
macroforecast.models.ridge(X, y, *, alpha=1.0)
Fits ridge regression with an L2 penalty.
Backend: sklearn.linear_model.Ridge.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | L2 penalty strength. |

| Preset |  |
|---|---|
|  |  |
|  |  |
|  |  |
Default model-selection method: cv_path.
nonneg_ridge#
macroforecast.models.nonneg_ridge(X, y, *, alpha=1.0, fit_intercept=True)
Fits ridge regression with coefficients constrained to be non-negative. This
uses SciPy NNLS on an augmented ridge design, so it does not require cvxpy.
Backend: package-native augmented ridge design plus scipy.optimize.nnls.
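The augmented-design idea can be sketched as follows. This is an illustration on synthetic data, assuming X and y are already centered so no intercept is needed; it is not the package's own code, and the data and alpha value are made up for the example.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 4))
y = X @ np.array([1.0, 0.0, 0.5, -0.7]) + 0.1 * rng.normal(size=80)

alpha = 1.0
k = X.shape[1]
# Stack sqrt(alpha) * I under X and zeros under y, so that
# ||y_aug - X_aug b||^2 = ||y - X b||^2 + alpha * ||b||^2.
X_aug = np.vstack([X, np.sqrt(alpha) * np.eye(k)])
y_aug = np.concatenate([y, np.zeros(k)])

# NNLS on the augmented system is ridge with non-negative coefficients.
coef, _ = nnls(X_aug, y_aug)

ridge_objective = np.sum((y - X @ coef) ** 2) + alpha * np.sum(coef ** 2)
augmented_objective = np.sum((y_aug - X_aug @ coef) ** 2)
```

The two objective values agree by construction, which is why a plain NNLS solver is sufficient and no convex-optimization dependency is required.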
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | L2 penalty strength. |
| fit_intercept | True | fixed by preset | Fit an intercept outside the constrained coefficients. |
Default model-selection method: cv_path.
shrink_to_target_ridge#
macroforecast.models.shrink_to_target_ridge(
    X,
    y,
    *,
    alpha=1.0,
    prior_target=None,
    simplex=False,
    nonneg=False,
    fit_intercept=True,
    max_iter=1000,
    tol=1e-9,
)
Fits a ridge-type model where coefficients are shrunk toward a user-specified
target vector. prior_target can be a scalar, a sequence ordered like X
columns, or a mapping from column name to target coefficient. If
prior_target=None, the target is zero, except under simplex=True, where the
target is a uniform coefficient vector. simplex=True constrains coefficients
to sum to one and uses no intercept; nonneg=True also enforces non-negative
coefficients. The solver is SciPy SLSQP.
Backend: package-native objective plus scipy.optimize.minimize(method="SLSQP").
R comparison: this is a regression analogue of target-ridge/Tikhonov
shrinkage. rags2ridges uses target ridge for covariance and precision
matrices, not a direct X, y regression callable, but the same target idea is
present: shrink an estimated parameter object toward a target rather than
toward zero. In the unconstrained regression case, macroforecast solves
min_beta ||y - X beta||^2 + alpha ||beta - beta0||^2
with closed-form normal equation
(X'X + alpha I) beta = X'y + alpha beta0
after optional centering for the intercept. simplex=True changes the problem
into a forecast-combination form: coefficients must sum to one, no intercept is
fit, and prior_target=None means a uniform target vector.
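The unconstrained closed form can be checked numerically. The sketch below uses synthetic data and an illustrative target vector (both assumptions), skips the intercept, and verifies that the gradient of the stated objective vanishes at the normal-equation solution.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + 0.1 * rng.normal(size=50)

alpha = 2.0
beta0 = np.full(3, 1.0 / 3.0)  # illustrative target vector

# Normal equation: (X'X + alpha I) beta = X'y + alpha beta0
beta = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y + alpha * beta0)

# Gradient of ||y - X beta||^2 + alpha ||beta - beta0||^2 at the solution.
grad = -2 * X.T @ (y - X @ beta) + 2 * alpha * (beta - beta0)

# With a very large alpha the solution collapses onto the target.
beta_heavy = np.linalg.solve(X.T @ X + 1e9 * np.eye(3), X.T @ y + 1e9 * beta0)
```

The constrained variants (simplex or nonneg) have no such closed form, which is consistent with the SLSQP backend described above.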
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | Strength of shrinkage toward prior_target. |
| prior_target | None | fixed by preset | Scalar, sequence, mapping, or None. |
| simplex | False | fixed by preset | Constrain coefficients to sum to one. |
| nonneg | False | fixed by preset | Constrain coefficients to be non-negative. |
| fit_intercept | True | fixed by preset | Fit an intercept unless simplex=True. |
| max_iter | 1000 | fixed by preset | SLSQP iteration cap. |
| tol | 1e-9 | fixed by preset | SLSQP tolerance. |
Default model-selection method: cv_path.
fused_difference_ridge#
macroforecast.models.fused_difference_ridge(
    X,
    y,
    *,
    alpha=1.0,
    difference_order=1,
    mean_equality=False,
    nonneg=False,
    fit_intercept=True,
    max_iter=1000,
    tol=1e-9,
)
Fits ridge regression with a finite-difference penalty on adjacent
coefficients. This is useful when columns have an ordered meaning such as lag
age, maturity, or horizon and neighboring coefficients should vary smoothly.
difference_order=1 penalizes first differences; larger orders penalize
higher-order coefficient curvature. mean_equality=True adds a conservation-style
constraint that the fitted and observed sums match, and it uses no intercept.
Backend: package-native finite-difference objective plus SLSQP.
R comparison: rags2ridges::ridgeP.fused uses a fused L2 penalty for related
precision matrices. fused_difference_ridge() uses the same penalty idea on a
single ordered regression-coefficient vector. With no sign or equality
constraints, macroforecast solves
min_beta ||y - X beta||^2 + alpha ||D beta||^2
where D is the finite-difference matrix over adjacent coefficients. The
closed-form normal equation is
(X'X + alpha D'D) beta = X'y
after optional centering for the intercept. mean_equality=True is a
macro-forecasting conservation variant; it constrains fitted and observed sums
to match and is intentionally outside the rags2ridges precision-matrix API.
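A short numerical sketch of the unconstrained case follows. difference_matrix is a hypothetical helper written for this example, the data are synthetic, and the intercept is omitted; the point is only to show what D looks like and that the normal equation solves the stated objective.

```python
import numpy as np


def difference_matrix(k, order=1):
    """Finite-difference operator D with shape (k - order, k)."""
    return np.diff(np.eye(k), n=order, axis=0)


rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = X @ np.linspace(1.0, 0.2, 5) + 0.1 * rng.normal(size=60)

alpha = 5.0
D = difference_matrix(5, order=1)  # each row is e_{j+1} - e_j

# Normal equation: (X'X + alpha D'D) beta = X'y
beta = np.linalg.solve(X.T @ X + alpha * D.T @ D, X.T @ y)

# Gradient of ||y - X beta||^2 + alpha ||D beta||^2 at the solution.
grad = -2 * X.T @ (y - X @ beta) + 2 * alpha * D.T @ D @ beta
```

Because D annihilates constant vectors, the penalty leaves the overall coefficient level free and only discourages differences between neighboring coefficients.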
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | Strength of the smoothness penalty. |
| difference_order | 1 | fixed by preset | Finite-difference order applied to coefficients. |
| mean_equality | False | fixed by preset | Constrain fitted and observed sums to match. |
| nonneg | False | fixed by preset | Constrain coefficients to be non-negative. |
| fit_intercept | True | fixed by preset | Fit an intercept unless mean_equality=True. |
| max_iter | 1000 | fixed by preset | SLSQP iteration cap. |
| tol | 1e-9 | fixed by preset | SLSQP tolerance. |
Default model-selection method: cv_path.
Assemblage / Supervised Aggregation#
This family is derived from Goulet Coulombe, Klieber, Barrette, and Goebel,
Maximally Forward-Looking Core Inflation, and the R package assemblage.
The package splits the paper model into generic reusable primitives plus thin
inflation-specific wrappers.
The generic problem is:
given components X_t and future aggregate target y_t,h,
learn weights w so X_t w predicts y_t,h
This is not ordinary ridge in disguise. The weights can be constrained to be nonnegative, sum to one, match the target mean, shrink toward reference basket weights, or vary smoothly across ranks. For inflation, those weights form an Albacore core-inflation measure. Outside inflation, the same functions can aggregate sectors, states, industries, survey items, or regional indicators.
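To make the constrained problem concrete, here is a rough sketch of a nonnegative, sum-to-one basket fitted with SciPy SLSQP. The data, the plain ridge penalty, and the alpha value are illustrative assumptions; the package's own objective also handles penalty scaling and reference weights, which are left out here.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
X = rng.normal(size=(120, 3))          # component series X_t
y = X @ np.array([0.6, 0.3, 0.1])      # aggregate target built from known weights
alpha = 0.1


def objective(w):
    # Sum of squared errors plus a small ridge penalty on the weights.
    return np.sum((y - X @ w) ** 2) + alpha * np.sum(w ** 2)


k = X.shape[1]
res = minimize(
    objective,
    x0=np.full(k, 1.0 / k),            # start from equal weights
    method="SLSQP",
    bounds=[(0.0, None)] * k,          # nonnegative weights
    constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],  # sum to one
)
weights = res.x
```

The bounds and the equality constraint are what separate this from ordinary ridge: the solution is a basket of weights rather than a free coefficient vector.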
supervised_aggregation#
macroforecast.models.supervised_aggregation(
    X,
    y,
    *,
    space="component",
    penalty="ridge",
    alpha=1.0,
    reference_weights=None,
    nonneg=True,
    simplex=False,
    mean_match=False,
    difference_order=1,
    fit_intercept=False,
    penalty_scale="feature_std",
    max_iter=1000,
    tol=1e-9,
)
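| Parameter | Default | Choices | Meaning |
|---|---|---|---|
| space | "component" |  | Use named components or row-wise sorted order statistics. |
| penalty | "ridge" |  | Coefficient penalty family. |
| alpha | 1.0 | nonnegative float | Penalty strength; tune with model selection. |
| reference_weights | None | mapping, sequence, or None | Target weights for target shrinkage. |
| nonneg | True | bool | Enforce nonnegative weights. |
| simplex | False | bool | Enforce weights that sum to one. |
| mean_match | False | bool | Enforce target-mean matching. |
| difference_order | 1 | positive int | Difference order for fused rank weights. |
| fit_intercept | False | bool | Fit an intercept outside the aggregation weights when no equality constraint is active. |
| penalty_scale | "feature_std" |  | Match the R assemblage convention by scaling penalties with feature standard deviations. |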
Output: ModelFit. The fitted estimator exposes coef_, weights_, and,
for rank space, rank_weight_curve_. Diagnostics include fitted values,
residuals, metrics, and coefficient weights.
component_aggregation#
macroforecast.models.component_aggregation(
    X,
    y,
    *,
    alpha=1.0,
    reference_weights=None,
    penalty=None,
    simplex=True,
    nonneg=True,
    penalty_scale="feature_std",
    max_iter=1000,
    tol=1e-9,
)
Component-space aggregation estimates weights on named columns. With
reference_weights supplied, penalty=None selects target_shrinkage, making
this the generic version of Albacore_comps. Without reference weights, it is a
nonnegative simplex ridge basket.
R source cue: nonneg.ridge.sum1 in assemblage_v240228.R.
rank_aggregation#
macroforecast.models.rank_aggregation(
    X,
    y,
    *,
    alpha=1.0,
    penalty="fused_difference",
    mean_match=True,
    nonneg=True,
    difference_order=1,
    penalty_scale="feature_std",
    max_iter=1000,
    tol=1e-9,
)
Rank-space aggregation sorts each row of X before fitting, then learns
weights on rank_1, rank_2, … rather than on named components. This is a
generic supervised trimmed-mean model. The fitted object stores
estimator.rank_weight_curve_, a table with rank, percentile, and weight.
R source cue: x.transformation plus nonneg.ridge.meanD.
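The rank-space transformation itself is small enough to show directly. In this sketch the ascending sort direction and the hand-picked weights are assumptions for illustration; in the model the weights are estimated, not fixed.

```python
import numpy as np

X = np.array([
    [0.3, -0.1, 0.8],
    [1.2, 0.4, 0.1],
])

# Sort each row so columns mean rank_1 (smallest) ... rank_3 (largest).
ranked = np.sort(X, axis=1)

# Putting all weight on the middle rank reproduces a row-wise median,
# which is the extreme case of a trimmed mean.
weights = np.array([0.0, 1.0, 0.0])
aggregate = ranked @ weights
```

After sorting, a column no longer refers to a named component, so a weight of zero on the outer ranks trims whichever components happen to be extreme in that period.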
assemblage_regression#
macroforecast.models.assemblage_regression(
    X,
    y,
    *,
    space="component",
    alpha=1.0,
    reference_weights=None,
    penalty=None,
    max_iter=1000,
    tol=1e-9,
)
Convenience wrapper over component_aggregation() and rank_aggregation().
Use it when the model family is known to be assemblage-style but the final
choice between component and rank space is part of the experiment design.
albacore_components#
macroforecast.models.albacore_components(
    X,
    y,
    *,
    reference_weights=None,
    alpha=1.0,
    max_iter=1000,
    tol=1e-9,
)
Inflation-specific wrapper for component-space Albacore. X should be a panel
of price-component changes, y should be the forward average headline
inflation target, and reference_weights should be official basket or
expenditure weights when available. The wrapper sets nonneg=True,
simplex=True, penalty="target_shrinkage", and fit_intercept=False.
albacore_ranks#
macroforecast.models.albacore_ranks(
    X,
    y,
    *,
    alpha=1.0,
    difference_order=1,
    max_iter=1000,
    tol=1e-9,
)
Inflation-specific wrapper for rank-space Albacore. X should be price
component changes and y should be the forward average headline inflation
target. The wrapper sorts components row by row, estimates nonnegative fused
rank weights, and enforces the Albacore_ranks mean-matching constraint.
Low-Level Solver Helpers#
These return a weight Series rather than a full ModelFit:
| Function | Meaning |
|---|---|
|  | R |
|  | Nonnegative weights constrained to sum to one. |
|  | R |
|  | R |
|  | R |
These helpers are deliberately not inflation-specific. They exist so users can compose custom supervised aggregation models without taking the Albacore wrappers.
random_walk_ridge#
macroforecast.models.random_walk_ridge(
    X,
    y,
    *,
    alpha=1.0,
    initial_alpha=1.0,
    fit_intercept=True,
)
Fits a time-varying coefficient path with a random-walk penalty:
sum_t (y_t - x_t beta_t)^2
+ initial_alpha * ||beta_1||^2
+ alpha * sum_t ||beta_t - beta_{t-1}||^2
Predictions use the final estimated coefficient vector. The full fitted path is
stored on the estimator as coef_path_, and standard ModelFit diagnostics
record the final coefficients, fitted values, and residuals.
Backend: package-native expanded design solved by numpy.linalg.lstsq.
R comparison: walker::walker_rw1 is the closest R source. It treats
coefficients as random-walk state variables and estimates a Bayesian posterior
with Stan / state-space smoothing. random_walk_ridge() keeps the same RW1
prior idea but solves the penalized least-squares MAP-style objective as one
augmented linear system over the full coefficient path:
min_{beta_1,...,beta_T}
sum_t (y_t - x_t beta_t)^2
+ initial_alpha ||beta_1||^2
+ alpha sum_t ||beta_t - beta_{t-1}||^2
The fitted coef_path_ is the estimated path. predict() uses only the final
coefficient vector, because this callable is a deterministic forecasting model,
not a posterior simulation or Kalman-smoothing interface. fit_intercept=True
centers the fit and recovers a static intercept from the final coefficient
vector.
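The "one augmented linear system" can be sketched with NumPy as below. random_walk_path is a hypothetical function written for this example; it omits the intercept handling and is not the package's implementation, only an illustration of how the objective maps onto a stacked least-squares problem.

```python
import numpy as np


def random_walk_path(X, y, alpha=1.0, initial_alpha=1.0):
    """Solve the RW1-penalized objective as one stacked least-squares system."""
    T, k = X.shape
    # Data rows: observation t only touches the block for beta_t.
    data = np.zeros((T, T * k))
    for t in range(T):
        data[t, t * k:(t + 1) * k] = X[t]
    # Penalty on the first coefficient vector.
    init = np.zeros((k, T * k))
    init[:, :k] = np.sqrt(initial_alpha) * np.eye(k)
    # Penalty rows for beta_t - beta_{t-1}, one block per adjacent pair.
    diff = np.sqrt(alpha) * np.kron(np.diff(np.eye(T), axis=0), np.eye(k))
    design = np.vstack([data, init, diff])
    target = np.concatenate([y, np.zeros(k + (T - 1) * k)])
    theta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return theta.reshape(T, k)


rng = np.random.default_rng(3)
X = rng.normal(size=(20, 2))
b = np.array([1.0, -0.5])
y = X @ b  # noise-free data with constant coefficients
path = random_walk_path(X, y, alpha=1.0, initial_alpha=1e-8)
final_coef = path[-1]  # predictions would use only this last row
```

With constant true coefficients and a negligible initial penalty, every row of the path stays close to the true vector; on drifting data the rows would differ and only the last one would drive forecasts.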
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | Penalty on adjacent coefficient changes. |
| initial_alpha | 1.0 | fixed by preset | Penalty on the first coefficient vector. |
| fit_intercept | True | fixed by preset | Fit an intercept outside the time-varying coefficient path. |
Default model-selection method: cv_path.
tvp_ridge#
macroforecast.models.tvp_ridge(
    X,
    y,
    *,
    lambda_candidates=None,
    oosX=None,
    lambda2=0.1,
    kfold=5,
    cv_plot=False,
    cv_2srr=True,
    sig_u_param=0.75,
    sig_eps_param=0.75,
    ols_prior=False,
    random_state=1071,
    use_garch=True,
)
Fits Philippe Goulet Coulombe’s TVP ridge / two-step ridge regression
estimator from Time-varying parameters as ridge regressions
(International Journal of Forecasting, DOI
https://doi.org/10.1016/j.ijforecast.2024.08.006). The implementation is a
Python port of the R package TVPRidge, source file
R/MV2SRR_v210407.R, local snapshot:
wiki/raw/paper_code/coulombe_site_github_20260530/tvpridge/R/MV2SRR_v210407.R
This is not a thin wrapper around random_walk_ridge(). Both models use a
random-walk coefficient idea, but tvp_ridge() ports the paper’s full
estimator:
Stage |
R function |
Python implementation |
|---|---|---|
Basis expansion |
|
|
Generalized ridge solve |
|
|
Initial lambda CV |
|
|
Second-step lambda CV |
|
|
Dropout correction |
|
|
Public callable |
|
|
The estimated path solves the paper’s time-varying parameter ridge problem:
min_{beta_1,...,beta_T}
sum_t (y_t - x_t beta_t)^2
+ lambda * sum_t ||beta_t - beta_{t-1}||^2
+ lambda2 * ||beta_0||^2
lambda controls the amount of time variation. Large values force smoother
coefficient paths; small values allow more movement. lambda2 is the soft
penalty on the starting coefficient values. The R code standardizes by sample
standard deviation without centering; macroforecast follows that convention and
rescales coefficient paths and fitted values back to the original data scale.
The 2SRR step follows the R package logic. First, the homogeneous ridge TVP is
estimated. Then coefficient innovations are used to build coefficient-specific
variance weights, and residual volatility weights are optionally estimated by a
GARCH(1,1) backend. The model is refit with those weights. If the Python package
arch is unavailable or the GARCH fit fails, residual-volatility weights fall
back to ones and the reason is recorded in
fit.estimator.diagnostics_["garch_status"]; the ridge/2SRR fit still runs.
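The fallback behavior can be illustrated with a small sketch. It is not the package source: the function name, the status strings, and the mean-one normalization are assumptions made for illustration; only the idea (optional arch import, GARCH(1,1) fit, fall back to ones with a recorded reason) comes from the text above.

```python
import numpy as np


def residual_volatility_weights(residuals, use_garch=True):
    """Return (weights, status); weights fall back to ones if GARCH is unusable."""
    resid = np.asarray(residuals, dtype=float)
    ones = np.ones_like(resid)
    if not use_garch:
        return ones, "disabled"  # hypothetical status label
    try:
        from arch import arch_model  # optional dependency
    except ImportError:
        return ones, "arch_unavailable"  # hypothetical status label
    try:
        fitted = arch_model(resid, mean="Zero", vol="GARCH", p=1, q=1).fit(disp="off")
        sigma = np.asarray(fitted.conditional_volatility, dtype=float)
        if not np.all(np.isfinite(sigma)) or np.any(sigma <= 0):
            raise ValueError("non-finite or non-positive conditional volatility")
    except Exception as exc:  # any fit failure triggers the fallback
        return ones, f"garch_failed: {exc}"
    # Assumed normalization: rescale to mean one so the overall penalty scale is unchanged.
    return sigma / sigma.mean(), "ok"
```

Returning the status alongside the weights mirrors the idea of recording the reason in a diagnostics entry, so a caller can tell a genuine GARCH fit from a fallback.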
Input:
Argument |
Required |
Expected object |
Meaning |
|---|---|---|---|
|
yes |
pandas DataFrame, NumPy array, or |
Predictor matrix with shape |
|
yes unless |
pandas Series or one-column DataFrame for the public |
Target series aligned to |
|
no |
sequence of positive floats or |
Candidate values for the time-variation penalty. |
|
no |
one predictor vector of length |
Optional one-step forecast using the final coefficient vector. |
Output:
Object |
Type |
Contents |
|---|---|---|
return value |
|
Standard macroforecast fitted model wrapper. |
|
NumPy array, shape |
First-step ridge TVP coefficient paths, original scale. |
|
NumPy array, shape |
2SRR coefficient paths, original scale. |
|
NumPy array |
Initial CV lambda for each target. |
|
NumPy array |
Second-step lambda used after reweighting. |
|
DataFrame |
In-sample first-step fitted values. |
|
DataFrame |
In-sample 2SRR fitted values. |
|
DataFrame |
Normalized residual-volatility weights. |
|
NumPy array |
Optional forecast when |
|
DataFrame |
Final 2SRR path for the first target, excluding intercept. |
|
DataFrame |
MultiIndex coefficient path including intercept and target names. |
Prediction rule:
Call |
Behavior |
|---|---|
|
Returns the time-varying in-sample |
|
Uses the final estimated coefficient vector |
Default parameters:
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
R default candidate grid for the time-variation penalty. |
|
|
fixed by preset |
Soft penalty on starting coefficient values. |
|
|
fixed by preset |
Number of random CV folds. |
|
|
fixed by preset |
Re-run lambda CV after variance reweighting. |
|
|
fixed by preset |
Shrinkage exponent for coefficient-innovation variance weights. |
|
|
fixed by preset |
Shrinkage exponent for residual-volatility weights. |
|
|
fixed by preset |
Shrink starting coefficients toward OLS rather than zero. |
|
|
fixed by preset |
Fold seed matching the R source’s |
|
|
fixed by preset |
Use optional Python |
R parity notes:
Topic |
Status |
|---|---|
Standardization |
Matches R: divide |
Basis columns |
Matches R |
Dual/primal solve |
Matches R |
CV folds |
Same random-fold design and default seed, but NumPy’s RNG is not bit-identical to R’s |
GARCH volatility |
Uses optional Python |
Multivariate |
Estimator internals preserve |
Default model-selection method: cv_path.
lasso#
macroforecast.models.lasso(
X,
y,
*,
alpha=1.0,
max_iter=20000,
standardize=False,
)
Fits lasso regression with an L1 penalty. There is no lasso_path() model
callable; to choose alpha, use get_model("lasso") with model_selection.select_params().
Backend: sklearn.linear_model.Lasso.
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
L1 penalty strength. |
|
|
fixed by preset |
Optimization iteration cap. |
|
|
fixed by preset |
Standardize predictors inside the fitted estimator. Defaults to |
Preset |
|
|---|---|
|
|
|
|
|
|
Default model-selection method: cv_path.
elastic_net#
macroforecast.models.elastic_net(
X,
y,
*,
alpha=1.0,
l1_ratio=0.5,
max_iter=20000,
standardize=False,
)
Fits elastic net regression.
Backend: sklearn.linear_model.ElasticNet.
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Overall penalty strength. |
|
|
yes |
L1 share of the elastic-net penalty. |
|
|
fixed by preset |
Optimization iteration cap. |
|
|
fixed by preset |
Standardize predictors inside the fitted estimator. Defaults to |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
adaptive_lasso#
macroforecast.models.adaptive_lasso(
X,
y,
*,
alpha=1.0,
gamma=1.0,
initial="ridge",
initial_alpha=1.0,
eps=1e-4,
normalize_weights=True,
max_iter=20000,
tol=1e-4,
random_state=None,
)
Fits adaptive lasso. The model first estimates initial coefficients with
initial="ridge" or initial="ols", builds feature weights
1 / (abs(beta_init) + eps) ** gamma, and fits lasso on weighted standardized
predictors. Predictions are mapped back to the original target scale.
Backend: macroforecast adaptive-weight construction plus final
sklearn.linear_model.Lasso.
R/glmnet comparison: glmnet accepts the same idea through penalty.factor.
It internally rescales penalty factors to sum to the number of predictors, so
macroforecast defaults to normalize_weights=True, which rescales adaptive
weights to mean one before fitting the final lasso. Set
normalize_weights=False only when the absolute weight scale should change the
effective penalty strength.
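The weighting scheme can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the package source: the function name is hypothetical, only the ridge initializer is shown, and the weighted L1 penalty is implemented by the standard column-rescaling trick (penalizing w_j * |b_j| is equivalent to a plain lasso on column j divided by w_j).

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge


def adaptive_lasso_sketch(X, y, alpha=0.1, gamma=1.0, initial_alpha=1.0,
                          eps=1e-4, normalize_weights=True):
    """Adaptive lasso via ridge-based weights and a rescaled-design lasso."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sd  # standardized predictors

    # Step 1: initial coefficients from ridge.
    beta_init = Ridge(alpha=initial_alpha).fit(Xs, y).coef_

    # Step 2: adaptive weights, optionally rescaled to mean one.
    w = 1.0 / (np.abs(beta_init) + eps) ** gamma
    if normalize_weights:
        w = w / w.mean()

    # Step 3: plain lasso on the rescaled design, then undo the rescaling.
    lasso = Lasso(alpha=alpha, max_iter=20000).fit(Xs / w, y)
    coef_std = lasso.coef_ / w          # coefficients on standardized X
    coef = coef_std / sd                # coefficients on original X
    intercept = lasso.intercept_ - coef @ mu
    return coef, intercept, w
```

With normalize_weights=False the absolute size of the weights multiplies the effective penalty, which is why the mean-one rescaling makes alpha comparable across initial fits.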
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Final adaptive lasso penalty strength. |
|
|
yes |
Exponent applied to initial coefficient weights. |
|
|
manual |
Initial model: |
|
|
fixed by preset |
Initial ridge penalty. |
|
|
fixed by preset |
Small denominator floor for adaptive weights. |
|
|
fixed by preset |
Rescale adaptive weights to mean one, matching |
|
|
fixed by preset |
Final solver iteration cap. |
|
|
fixed by preset |
Final solver convergence tolerance. |
|
|
fixed by preset |
Final solver random seed. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
adaptive_elastic_net#
macroforecast.models.adaptive_elastic_net(
X,
y,
*,
alpha=1.0,
l1_ratio=0.5,
gamma=1.0,
initial="ridge",
initial_alpha=1.0,
eps=1e-4,
normalize_weights=True,
max_iter=20000,
tol=1e-4,
random_state=None,
)
Fits an adaptive elastic-net variant with the same initial coefficient weights
as adaptive_lasso, followed by an elastic-net fit on weighted standardized
predictors.
Backend: macroforecast adaptive-weight construction plus final
sklearn.linear_model.ElasticNet.
R/glmnet comparison: this is the elastic-net analogue of adaptive lasso.
normalize_weights=True gives the same mean-one penalty-factor convention as
glmnet; the remaining difference is solver style, because macroforecast fits
one selected alpha while glmnet usually estimates a regularization path.
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Final adaptive elastic-net penalty strength. |
|
|
yes |
L1 share of the final elastic-net penalty. |
|
|
yes |
Exponent applied to initial coefficient weights. |
|
|
manual |
Initial model: |
|
|
fixed by preset |
Initial ridge penalty. |
|
|
fixed by preset |
Small denominator floor for adaptive weights. |
|
|
fixed by preset |
Rescale adaptive weights to mean one, matching |
|
|
fixed by preset |
Final solver iteration cap. |
|
|
fixed by preset |
Final solver convergence tolerance. |
|
|
fixed by preset |
Final solver random seed. |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
group_lasso#
macroforecast.models.group_lasso(
X,
y,
*,
groups=None,
alpha=1.0,
group_weights=None,
max_iter=5000,
tol=1e-5,
scale=True,
)
Fits group lasso with a package-native proximal-gradient solver. groups
must contain one label per predictor column. If groups=None, each predictor
is treated as its own group.
Backend: package-native proximal-gradient solver.
R comparison: this follows the Gaussian group-lasso objective used by
grpreg::grpreg(..., penalty = "grLasso"): standardized predictors,
group-level L2 shrinkage, and default group weights proportional to
sqrt(group_size). macroforecast fits one selected alpha and does not
reproduce grpreg’s full path solver, GLM families, C backend, or within-group
orthogonalization step.
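A proximal-gradient group lasso can be sketched in a few lines of NumPy. This is an illustration rather than the package solver: the function names are hypothetical, the loss scaling 1/(2n) is an assumption, and predictors are only centered here (the scale=True standardization step is omitted for brevity).

```python
import numpy as np


def group_soft_threshold(v, t):
    """Shrink the whole group toward zero; drop it if its norm is below t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v


def group_lasso_sketch(X, y, groups, alpha=0.1, max_iter=5000, tol=1e-8):
    """Minimize (1/2n)||y - Xb||^2 + alpha * sum_g sqrt(|g|) * ||b_g||_2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Step size 1/L, where L is the Lipschitz constant of the smooth part.
    step = n / np.linalg.norm(Xc, 2) ** 2
    beta = np.zeros(p)
    labels = np.unique(groups)
    for _ in range(max_iter):
        grad = -Xc.T @ (yc - Xc @ beta) / n
        z = beta - step * grad
        new = np.empty(p)
        for g in labels:
            idx = groups == g
            weight = np.sqrt(idx.sum())  # default sqrt(group_size) weight
            new[idx] = group_soft_threshold(z[idx], step * alpha * weight)
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    return beta
```

Because the threshold acts on the group norm, a group is either removed entirely or kept with all of its members shrunk together.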
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
manual |
One group label per predictor. |
|
|
yes |
Group penalty strength. |
|
|
manual |
Optional group penalty weights; default is |
|
|
fixed by preset |
Proximal-gradient iteration cap. |
|
|
fixed by preset |
Proximal-gradient convergence tolerance. |
|
|
fixed by preset |
Whether to standardize predictors inside the model. |
Preset |
|
|---|---|
|
|
|
|
|
|
sparse_group_lasso#
macroforecast.models.sparse_group_lasso(
X,
y,
*,
groups=None,
alpha=1.0,
l1_ratio=0.5,
group_weights=None,
max_iter=5000,
tol=1e-5,
scale=True,
)
Fits sparse group lasso. l1_ratio controls the feature-level L1 share; the
remaining penalty share is applied at the group level.
Backend: package-native proximal-gradient solver.
R comparison: this follows the sparse-group penalty decomposition used by
sparsegl::sparsegl: a feature-level L1 part plus a group L2 part with default
sqrt(group_size) group weights. macroforecast fits one selected alpha and
l1_ratio; it does not reproduce sparsegl’s full lambda path, bounds, GLM
families, or C++ backend.
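The penalty decomposition translates into a two-stage proximal step for one group: elementwise soft-thresholding for the L1 share, followed by group-norm shrinkage for the remaining share. The sketch below shows only that step; the function name and argument layout are hypothetical and not taken from the package.

```python
import numpy as np


def sparse_group_prox(v, step, alpha, l1_ratio, group_weight):
    """Proximal step for alpha * (l1_ratio * ||b||_1 + (1 - l1_ratio) * w_g * ||b||_2)."""
    v = np.asarray(v, dtype=float)
    t1 = step * alpha * l1_ratio                         # feature-level threshold
    t2 = step * alpha * (1.0 - l1_ratio) * group_weight  # group-level threshold
    # Stage 1: elementwise soft-thresholding.
    s = np.sign(v) * np.maximum(np.abs(v) - t1, 0.0)
    # Stage 2: shrink the surviving group vector by its Euclidean norm.
    norm = np.linalg.norm(s)
    if norm <= t2:
        return np.zeros_like(s)
    return (1.0 - t2 / norm) * s
```

Setting l1_ratio=0 reduces this to the group-lasso step, and l1_ratio=1 reduces it to the ordinary lasso step.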
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
manual |
One group label per predictor. |
|
|
yes |
Total sparse-group penalty strength. |
|
|
yes |
Feature-level L1 share. |
|
|
manual |
Optional group penalty weights; default is |
|
|
fixed by preset |
Proximal-gradient iteration cap. |
|
|
fixed by preset |
Proximal-gradient convergence tolerance. |
|
|
fixed by preset |
Whether to standardize predictors inside the model. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
bayesian_ridge#
macroforecast.models.bayesian_ridge(X, y)
Fits sklearn empirical-Bayes Bayesian ridge.
Item |
Value |
|---|---|
Input |
|
Output |
|
Backend |
|
Default params |
sklearn defaults |
Tunable params |
none in the clean preset catalog |
Preset search spaces |
none |
huber#
macroforecast.models.huber(X, y, *, epsilon=1.35, max_iter=1000)
Fits robust Huber regression.
Backend: sklearn.linear_model.HuberRegressor.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| epsilon | 1.35 | yes | Huber loss transition threshold. |
| max_iter | 1000 | fixed by preset | Optimization iteration cap. |
Preset |
|
|---|---|
|
|
|
|
|
|
Kernel And Nonparametric Models#
These models are external sklearn wrappers, not package-native numerical
solvers. They live in macroforecast.models.nonparametric and are re-exported
from macroforecast.models and top-level macroforecast.
kernel_ridge#
macroforecast.models.kernel_ridge(
X,
y,
*,
alpha=1.0,
kernel="linear",
gamma=None,
degree=3,
coef0=1.0,
)
Fits sklearn kernel ridge regression. This model is scale-sensitive for
nonlinear kernels, so standardize predictors before rbf, poly, or
sigmoid kernels.
Backend: sklearn.kernel_ridge.KernelRidge.
R parity is intentionally not claimed for this callable. It is a thin sklearn
backend wrapper; macroforecast owns only the pandas X, y contract, ModelFit
metadata/diagnostics, and search-space registration.
Item |
Value |
|---|---|
Input |
|
Output |
|
Internal scaling |
none |
|
|
Default model-selection method |
|
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 1.0 | yes | Ridge penalty strength. |
| kernel | "linear" | search option | Kernel name. |
| gamma | None | search option | Kernel coefficient. |
| degree | 3 | search option | Polynomial kernel degree. |
| coef0 | 1.0 | fixed by preset | Independent term for polynomial/sigmoid kernels. |
Preset |
|
|
extra searched params |
|---|---|---|---|
|
|
|
none |
|
|
|
|
|
|
|
|
knn#
macroforecast.models.knn(
X,
y,
*,
n_neighbors=5,
weights="uniform",
metric="minkowski",
p=2,
)
Fits sklearn k-nearest-neighbor regression. This is distance-based and should usually receive standardized predictors.
Backend: sklearn.neighbors.KNeighborsRegressor.
R parity is intentionally not claimed for this callable. It is a thin sklearn
backend wrapper; macroforecast owns only the pandas X, y contract, small-window
n_neighbors resolution, ModelFit metadata/diagnostics, and search-space
registration.
If the requested n_neighbors is larger than the fitted sample size,
macroforecast resolves it down to n_obs before constructing the sklearn
estimator. The fit metadata records the effective n_neighbors and, when
different, requested_n_neighbors. This avoids small-window forecasting runs
failing at prediction time.
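The resolution rule can be sketched directly on top of scikit-learn. The helper name is hypothetical and the returned dictionary only mimics the two metadata keys named above; it is not the package's ModelFit object.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor


def fit_knn_resolved(X, y, n_neighbors=5):
    """Cap n_neighbors at the sample size before building the estimator."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    effective = min(n_neighbors, len(X))
    model = KNeighborsRegressor(n_neighbors=effective).fit(X, y)
    metadata = {"n_neighbors": effective}
    if effective != n_neighbors:
        # Keep the original request so the downgrade is visible afterwards.
        metadata["requested_n_neighbors"] = n_neighbors
    return model, metadata
```

Without the cap, scikit-learn would accept the fit but raise when predict() asks for more neighbors than there are training rows.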
Item |
Value |
|---|---|
Input |
|
Output |
|
Internal scaling |
none |
|
|
Default model-selection method |
|
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Number of nearest neighbors. |
|
|
yes |
|
|
|
fixed by preset |
Distance metric. |
|
|
search option |
Minkowski distance order. |
Preset |
|
|
extra searched params |
|---|---|---|---|
|
|
|
none |
|
|
|
|
|
|
|
|
Linear Boosting#
glmboost#
macroforecast.models.glmboost(
X,
y,
*,
n_iter=100,
learning_rate=0.1,
center=True,
candidate_sampling="all",
candidate_count=None,
candidate_fraction=None,
candidate_cap=None,
candidate_min=1,
candidate_rounding="floor",
random_state=None,
)
Fits componentwise L2 boosting with linear base learners.
Backend: package-native componentwise L2 boosting loop. The R comparison target
is mboost::glmboost. macroforecast implements the matrix-input Gaussian path:
predictors are centered by default, each iteration selects the column with the
largest normalized correlation with the current residual, and the selected
least-squares coefficient is shrunk by learning_rate.
Candidate sampling is deliberately decomposed into separate arguments. For
Goulet Coulombe, Leroux, Stevanovic, and Surprenant (2021), Appendix A.6, use
candidate_sampling="random", candidate_fraction=1/3,
candidate_cap=200, and candidate_rounding="floor", which gives
m=min(200, floor(n_features / 3)) sampled predictors at each boosting step.
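The boosting loop and the candidate-count rule can be sketched as follows. This is an illustrative NumPy version, not the package source: the function name is hypothetical, only floor rounding is shown, and the selection score |x_j'r| / ||x_j|| is one reading of "largest normalized correlation".

```python
import math

import numpy as np


def glmboost_sketch(X, y, n_iter=100, learning_rate=0.1,
                    candidate_fraction=None, candidate_cap=None,
                    candidate_min=1, random_state=None):
    """Componentwise L2 boosting with optional per-step candidate sampling."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centered predictors
    col_ss = (Xc ** 2).sum(axis=0)
    col_ss[col_ss == 0] = np.inf         # constant columns are never selected
    intercept = y.mean()
    resid = y - intercept
    coef = np.zeros(p)
    rng = np.random.default_rng(random_state)

    # Resolve the number of sampled candidates, e.g. min(200, floor(p / 3)).
    m = p
    if candidate_fraction is not None:
        m = math.floor(p * candidate_fraction)
        if candidate_cap is not None:
            m = min(candidate_cap, m)
        m = max(candidate_min, min(m, p))

    for _ in range(n_iter):
        cand = np.arange(p) if m == p else rng.choice(p, size=m, replace=False)
        inner = Xc[:, cand].T @ resid
        j = cand[np.argmax(np.abs(inner) / np.sqrt(col_ss[cand]))]
        # Least-squares coefficient for the selected column, shrunk by the rate.
        update = learning_rate * (Xc[:, j] @ resid) / col_ss[j]
        coef[j] += update
        resid -= update * Xc[:, j]
    return coef, intercept - coef @ x_mean
```

Calling it with candidate_fraction=1/3 and candidate_cap=200 corresponds to the sampling setup described above; leaving candidate_fraction as None evaluates every predictor at each step.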
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Number of boosting iterations. |
|
|
yes |
Shrinkage applied to each update. |
|
|
no |
Center predictors before componentwise updates, matching |
|
|
fixed by preset |
|
|
|
fixed by preset |
Fixed sampled candidate count when |
|
|
fixed by preset |
Fraction of predictors sampled each step when |
|
|
fixed by preset |
Maximum sampled candidate count after resolving |
|
|
fixed by preset |
Minimum sampled candidate count. |
|
|
fixed by preset |
Rounding rule for |
|
|
fixed by preset |
Seed for per-step candidate feature sampling when |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
Support-Vector Models#
Support-vector models are sklearn-backed and live in the base dependency set.
They are useful when nonlinear margins or robust epsilon-insensitive losses
are preferred over a pure least-squares fit. The forecasting runner treats
them as ordinary supervised models: call model(X, y, **params), tune only
model-owned hyperparameters through model_selection, and let window decide the
train/validation/test dates.
Forecasting-runner example:
pre = macroforecast.preprocessing.preprocess_spec(
transform="none",
outliers="none",
impute="mean",
standardize="zscore",
standardize_columns="predictors",
)
features = macroforecast.feature_engineering.feature_spec(
target="y",
horizon=1,
predictors=["x1", "x2"],
lags=(0, 1),
)
result = macroforecast.forecasting.run(
panel,
"svr",
preprocessing=pre,
features=features,
window=macroforecast.window.last_block(validation_size=24),
model_selection=macroforecast.model_selection.grid({"C": [0.1, 1.0], "epsilon": [0.01, 0.1]}),
)
svr#
macroforecast.models.svr(
X,
y,
*,
kernel="rbf",
C=1.0,
epsilon=0.1,
gamma="scale",
degree=3,
coef0=0.0,
shrinking=True,
tol=1e-3,
cache_size=200.0,
max_iter=-1,
)
Fits sklearn SVR.
Backend: sklearn.svm.SVR.
kernel="precomputed" is intentionally not supported because macroforecast
ModelFit expects X to be a feature matrix with stable column names. Use
"linear", "poly", "rbf", or "sigmoid".
Item |
Value |
|---|---|
Input |
|
Output |
|
Internal scaling |
none |
|
|
Default model-selection method |
|
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
fixed by preset |
Kernel: |
|
|
yes |
Inverse regularization strength. |
|
|
yes |
Epsilon-insensitive tube width. |
|
|
yes |
Kernel coefficient for RBF/poly/sigmoid kernels. |
|
|
fixed by preset |
Polynomial kernel degree. |
|
|
fixed by preset |
Independent term for poly/sigmoid kernels. |
|
|
fixed by preset |
Whether to use the shrinking heuristic. |
|
|
fixed by preset |
Optimization tolerance. |
|
|
fixed by preset |
Kernel cache size in MB. |
|
|
fixed by preset |
Solver iteration cap; |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
linear_svr#
macroforecast.models.linear_svr(
X,
y,
*,
C=1.0,
epsilon=0.0,
loss="epsilon_insensitive",
tol=1e-4,
max_iter=10000,
random_state=0,
)
Fits sklearn LinearSVR. Use this when a linear support-vector loss is wanted
without kernel overhead.
Backend: sklearn.svm.LinearSVR.
Item |
Value |
|---|---|
Input |
|
Output |
|
Internal scaling |
none |
|
|
Default model-selection method |
|
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Inverse regularization strength. |
|
|
yes |
Epsilon-insensitive tube width. |
|
|
fixed by preset |
LinearSVR loss function. |
|
|
fixed by preset |
Optimization tolerance. |
|
|
fixed by preset |
Solver iteration cap. |
|
|
fixed by preset |
Random seed; can be |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
nu_svr#
macroforecast.models.nu_svr(
X,
y,
*,
kernel="rbf",
C=1.0,
nu=0.5,
gamma="scale",
degree=3,
coef0=0.0,
shrinking=True,
tol=1e-3,
cache_size=200.0,
max_iter=-1,
)
Fits sklearn NuSVR, where nu controls the admissible training-error and
support-vector fractions.
Backend: sklearn.svm.NuSVR.
kernel="precomputed" is intentionally not supported for the same feature-matrix
contract reason as svr().
Item |
Value |
|---|---|
Input |
|
Output |
|
Internal scaling |
none |
|
|
Default model-selection method |
|
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
fixed by preset |
Kernel: |
|
|
yes |
Inverse regularization strength. |
|
|
yes |
Upper/lower control for training-error and support-vector fractions. |
|
|
yes |
Kernel coefficient for RBF/poly/sigmoid kernels. |
|
|
fixed by preset |
Polynomial kernel degree. |
|
|
fixed by preset |
Independent term for poly/sigmoid kernels. |
|
|
fixed by preset |
Whether to use the shrinking heuristic. |
|
|
fixed by preset |
Optimization tolerance. |
|
|
fixed by preset |
Kernel cache size in MB. |
|
|
fixed by preset |
Solver iteration cap; |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Neural Models#
nn, lstm, gru, transformer, hemisphere_nn, and density_hnn are all torch-backed neural-network models and require
macroforecast[deep]. nn is the feed-forward neural network for tabular
feature matrices; lstm and gru are recurrent neural networks that consume
trailing row sequences; transformer is a compact Transformer encoder using
the same trailing-row sequence contract. hemisphere_nn is a compact bagged
dual-head network for mean and variance forecasts, while density_hnn follows
the Aionx/Paper DensityHNN procedure with prior-DNN OOB volatility emphasis and
OOB volatility recalibration. The deep extra is intentionally separate from
macroforecast[all] because torch is large and platform-sensitive.
Torch recurrent example:
result = macroforecast.forecasting.run(
panel,
"lstm",
features=features,
window=macroforecast.window.last_block(validation_size=24),
params={"lstm": {"sequence_length": 4, "hidden_size": 32, "device": "auto"}},
model_selection={"lstm": None},
)
nn#
macroforecast.models.nn(
X,
y,
*,
hidden_layer_sizes=(100,),
activation="relu",
dropout=0.0,
learning_rate=0.001,
max_epochs=100,
batch_size=32,
weight_decay=0.0,
optimizer="adam",
loss="mse",
random_state=0,
device="auto",
)
Fits a torch-backed feed-forward neural-network regressor. The estimator
standardizes X and y inside each fit window and maps predictions back to
target units. Use feature engineering for lagged, rolling, PCA, or MARX-style
inputs before fitting this model.
Forecasting-runner example:
result = macroforecast.forecasting.run(
panel,
"nn",
features=features,
window=macroforecast.window.last_block(validation_size=24),
params={"nn": {"max_epochs": 100, "device": "auto"}},
model_selection=macroforecast.model_selection.grid({
"hidden_layer_sizes": [(32,), (64,)],
"weight_decay": [0.0, 0.0001],
}),
)
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Feed-forward hidden layer widths. |
|
|
fixed by preset |
Activation: |
|
|
yes |
Dropout rate between hidden layers. |
|
|
yes |
Optimizer learning rate. |
|
|
fixed by preset |
Training epoch cap. |
|
|
fixed by preset |
Mini-batch size. |
|
|
yes |
L2 weight decay. |
|
|
fixed by preset |
Torch optimizer: |
|
|
fixed by preset |
Torch loss: |
|
|
fixed by preset |
Random seed. |
|
|
fixed by preset |
Torch device: |
Preset |
|
|
|
|
|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
lstm#
macroforecast.models.lstm(
X,
y,
*,
sequence_length=4,
hidden_size=32,
num_layers=1,
dropout=0.0,
learning_rate=0.001,
max_epochs=100,
batch_size=32,
random_state=0,
device="auto",
)
Fits a compact torch-backed LSTM regressor. sequence_length controls how
many trailing rows are passed to the recurrent network for each target date.
The fitted estimator stores the trailing training rows, so predict(X_test)
can create the first test sequences without the caller manually prepending
training history. The backend is a regular torch.nn.Module, switches to
train() during fitting and eval() during prediction, and uses device to
choose CPU or CUDA. The fit diagnostics include sequence_context, recording
sequence_length, fit_sample_size, train_tail_rows, and the
test_sequence_prefix policy. The prefix is always the last fitted rows only,
so the forecasting runner can pass the test feature block directly without
leaking future rows.
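The prefix policy can be illustrated with a small NumPy helper. The function name is hypothetical and the real estimator works on standardized tensors; the sketch only shows how stored trailing training rows let the first test rows form full-length sequences without using any later test row.

```python
import numpy as np


def build_test_sequences(train_tail, X_test, sequence_length):
    """Return an array of shape (n_test, sequence_length, n_features).

    Sequence i ends at test row i and is padded on the left with the
    last fitted training rows, never with later test rows.
    """
    train_tail = np.asarray(train_tail, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    prefix_rows = sequence_length - 1
    if len(train_tail) < prefix_rows:
        raise ValueError("not enough stored training rows for the prefix")
    # Slice from an explicit start index; train_tail[-0:] would return everything.
    prefix = train_tail[len(train_tail) - prefix_rows:]
    stacked = np.vstack([prefix, X_test])
    return np.stack([stacked[i:i + sequence_length] for i in range(len(X_test))])
```

Each returned sequence has the corresponding test row as its last element, so a forecast for that date depends only on rows observed up to and including it.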
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Trailing rows per recurrent sequence. |
|
|
yes |
Recurrent hidden-state width. |
|
|
fixed by preset |
Number of recurrent layers. |
|
|
fixed by preset |
Dropout between recurrent layers. |
|
|
yes |
Adam learning rate. |
|
|
fixed by preset |
Training epoch cap. |
|
|
fixed by preset |
Mini-batch size. |
|
|
fixed by preset |
Random seed. |
|
|
fixed by preset |
Torch device: |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
gru#
macroforecast.models.gru(
X,
y,
*,
sequence_length=4,
hidden_size=32,
num_layers=1,
dropout=0.0,
learning_rate=0.001,
max_epochs=100,
batch_size=32,
random_state=0,
device="auto",
)
Fits a compact torch-backed GRU regressor with the same input/output contract
as lstm.
transformer#
macroforecast.models.transformer(
X,
y,
*,
sequence_length=4,
hidden_size=32,
num_layers=1,
dropout=0.0,
learning_rate=0.001,
max_epochs=100,
batch_size=32,
random_state=0,
device="auto",
)
Fits a compact torch-backed Transformer encoder regressor. The input/output
contract matches lstm and gru: rows are standardized inside each fit
window, trailing sequences are built from the fitted sample, and predictions
for new rows are mapped back to target units. hidden_size is the
Transformer feed-forward width, not the input dimension; the encoder uses
d_model = n_features and nhead=1 to keep the public callable small and
stable for macro panels.
hemisphere_nn#
macroforecast.models.hemisphere_nn(
X,
y,
*,
lc=2,
lm=2,
lv=2,
neurons=64,
dropout=0.2,
learning_rate=0.001,
max_epochs=100,
n_estimators=100,
subsample=0.8,
nu=None,
variance_penalty=1.0,
patience=15,
validation_fraction=0.2,
random_state=0,
device="auto",
quantile_levels=(0.05, 0.5, 0.95),
)
Fits a compact Hemisphere neural network inspired by Goulet Coulombe, Frenette, and Klieber’s dual-head density-forecast architecture. The network has a shared common core, a mean head, and a positive variance head. The loss is Gaussian negative log likelihood plus a soft variance-emphasis penalty:
mean((y - h_m(X))^2 / h_v(X) + log h_v(X))
+ variance_penalty * (mean(h_v(X)) - nu * var(y))^2 / var(y)^2
predict() returns the ensemble mean forecast. The fitted estimator also
exposes predict_variance(X), predict_distribution(X), and
predict_quantiles(X, levels=None). The forecasting runner stores the variance
and quantile outputs in variance_prediction and quantile_predictions. The public
callable accepts legacy aliases lr, n_epochs, B, sub_rate,
lambda_emphasis, and val_frac; normalized metadata records them as
learning_rate, max_epochs, n_estimators, subsample,
variance_penalty, and validation_fraction.
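For readers who want to check the loss numerically, the formula can be evaluated with NumPy as below. The function name is hypothetical, nu is taken as an explicit number, and the population variance (ddof=0) for var(y) is an assumption; the package may use a different convention.

```python
import numpy as np


def hemisphere_loss(y, mean, variance, nu, variance_penalty=1.0):
    """Gaussian negative log likelihood plus the soft variance-emphasis penalty."""
    y = np.asarray(y, dtype=float)
    mean = np.asarray(mean, dtype=float)
    variance = np.asarray(variance, dtype=float)
    var_y = np.var(y)  # assumption: population variance
    nll = np.mean((y - mean) ** 2 / variance + np.log(variance))
    penalty = variance_penalty * (np.mean(variance) - nu * var_y) ** 2 / var_y ** 2
    return nll + penalty
```

The penalty term vanishes when the average predicted variance equals nu times the target variance, so it steers how much of the total variation the variance head is allowed to absorb.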
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
fixed by preset |
Shared common-core depth. |
|
|
fixed by preset |
Mean-head depth after the common core. |
|
|
fixed by preset |
Variance-head depth after the common core. |
|
|
yes |
Hidden width. |
|
|
fixed by preset |
Dropout rate. |
|
|
yes |
Adam learning rate. |
|
|
fixed by preset |
Training epoch cap. |
|
|
yes |
Number of blocked-subsample bags. |
|
|
fixed by preset |
Blocked-subsample fraction. |
|
|
fixed by preset |
Variance-emphasis target ratio; |
|
|
fixed by preset |
Soft penalty on the variance-emphasis target. |
|
|
fixed by preset |
Early-stopping patience. |
|
|
fixed by preset |
Chronological validation fraction. |
|
|
fixed by preset |
Torch device. |
|
|
fixed by preset |
Default normal-approximation quantile levels returned by |
density_hnn#
macroforecast.models.density_hnn(
X,
y,
*,
common_layers=2,
mean_layers=2,
volatility_layers=2,
prior_layers=3,
neurons=400,
dropout=0.2,
learning_rate=0.001,
max_epochs=100,
n_estimators=100,
prior_estimators=50,
subsample=0.8,
block_size=8,
volatility_emphasis=None,
rescale_volatility=True,
patience=15,
random_state=0,
device="auto",
quantile_levels=(0.05, 0.5, 0.95),
volatility_clip=0.05,
)
Fits the Density Hemisphere Neural Network from Goulet Coulombe, Frenette, and
Klieber, “From Reactive to Proactive Volatility Modeling with Hemisphere Neural
Networks” (Journal of Applied Econometrics, 2025). The implementation is a
torch-native port of the public Aionx DensityHNN logic, not a TensorFlow
dependency wrapper. It is included because the method is a macro density
forecast model: macroforecast uses it to produce conditional means,
conditional variances, volatility forecasts, and normal-approximation
quantiles. It does not create portfolio weights.
The Aionx source-code correspondence is:
Aionx source item |
|
|---|---|
|
|
|
|
|
OOB forecasts use the Aionx denominator formula |
|
shared common core plus mean and volatility hemispheres; the volatility head is positive and normalized to the volatility-emphasis value. |
|
OOB log squared residuals are regressed on log predicted volatility squared, then all volatility forecasts are rescaled. |
The callable consumes the standard supervised model contract density_hnn(X, y, **params). In Aionx, lags and trend terms are created inside
DensityHNN.run(...); in macroforecast, lags, MARX/MAF features, PCA,
trends, and seasonal/time features should be built explicitly with
macroforecast.feature_engineering before calling the model or through the
forecasting runner. This keeps the model callable small and lets the same
feature construction be reused by other models.
Fit sequence:
1. Standardize X and y inside the fit window.
2. Fit a prior DNN ensemble on blocked bootstrap samples.
3. Compute prior-DNN OOB mean squared error; this becomes the Aionx volatility_emphasis unless the user supplies an override.
4. Fit a DensityHNN ensemble with shared core, conditional-mean head, and conditional-volatility head.
5. Compute HNN OOB mean and volatility forecasts.
6. Recalibrate volatility using the Aionx log residual-square regression.
7. Return forecasts on the original target scale.
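The volatility recalibration step in the fit sequence can be sketched as a log-log regression. This is a simplified illustration, not the Aionx or package code: the function name and the small numerical floor are assumptions, and the additional scaler recorded in the diagnostics is not reproduced here.

```python
import numpy as np


def recalibrate_volatility(oob_residuals, oob_volatility, volatility, floor=1e-12):
    """Regress log squared OOB residuals on log squared OOB volatility, then rescale."""
    log_e2 = np.log(np.maximum(np.asarray(oob_residuals, dtype=float) ** 2, floor))
    log_s2 = np.log(np.maximum(np.asarray(oob_volatility, dtype=float) ** 2, floor))
    # np.polyfit returns the highest-degree coefficient first: slope, then intercept.
    slope, intercept = np.polyfit(log_s2, log_e2, deg=1)
    log_new = intercept + slope * np.log(
        np.maximum(np.asarray(volatility, dtype=float) ** 2, floor)
    )
    return np.sqrt(np.exp(log_new)), intercept, slope
```

A slope near one with a nonzero intercept indicates that the raw volatility forecasts had the right shape but the wrong level; the regression then shifts every forecast by the same factor.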
Output:
Method |
Output |
|---|---|
|
pandas Series of conditional mean forecasts through |
|
numpy array of conditional variance forecasts in target units squared. |
|
numpy array of conditional standard-deviation forecasts in target units. |
|
|
|
dictionary from quantile level to normal-approximation quantile forecast. |
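Normal-approximation quantiles follow from the conditional mean and variance alone. The sketch below shows the arithmetic with the standard library's NormalDist; the function name is hypothetical and the return type only mimics the "dictionary from quantile level to forecast" shape described above.

```python
from statistics import NormalDist

import numpy as np


def normal_quantiles(mean, variance, levels=(0.05, 0.5, 0.95)):
    """Map each quantile level to mean + z_level * standard deviation."""
    mean = np.asarray(mean, dtype=float)
    sd = np.sqrt(np.asarray(variance, dtype=float))
    standard_normal = NormalDist()
    return {level: mean + standard_normal.inv_cdf(level) * sd for level in levels}
```

Under this approximation the 0.05 and 0.95 quantiles are symmetric around the mean, so asymmetric predictive densities are not represented.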
Diagnostics:
Field |
Meaning |
|---|---|
|
Aionx volatility-emphasis value used by the HNN volatility head. |
|
Prior-DNN OOB mean squared error used when |
|
Intercept, slope, scaler, and OOB count for the log residual-square volatility recalibration. |
|
Fit-window OOB conditional mean, volatility, and variance table. |
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
fixed by preset |
Shared common-core depth. |
|
|
fixed by preset |
Conditional-mean hemisphere depth. |
|
|
fixed by preset |
Conditional-volatility hemisphere depth. |
|
|
fixed by preset |
Prior-DNN hidden depth. |
|
|
yes |
Hidden width. The paper/Aionx default is 400; smaller values are useful for smoke tests. |
|
|
fixed by preset |
Dropout rate. |
|
|
yes |
Adam learning rate. |
|
|
fixed by preset |
Training epoch cap. |
|
|
yes |
DensityHNN bootstrap ensemble size. |
|
|
yes |
Prior-DNN bootstrap ensemble size used to estimate volatility emphasis. |
|
|
fixed by preset |
Blocked bootstrap sampling rate. |
|
|
fixed by preset |
Time-series block size. The paper uses blocked subsampling to preserve temporal dependence. |
|
|
fixed by preset |
|
|
|
fixed by preset |
Apply the blocked-OOB volatility reality-check recalibration. |
|
|
fixed by preset |
Early-stopping patience. |
|
|
fixed by preset |
Torch device. |
|
|
fixed by preset |
Default normal-approximation quantile levels. |
|
|
fixed by preset |
Minimum volatility used in Gaussian negative log likelihood, matching Aionx’s numerical-stability clip. |
Factor And Time-Series Models#
pls#
macroforecast.models.pls(
X,
y,
*,
n_components=3,
scale=True,
max_iter=500,
tol=1e-6,
control_columns=None,
include_constant=True,
drop_control_columns=True,
quadratic_factors=False,
)
Fits partial least squares regression. Unlike unsupervised PCA, PLS uses the
target while constructing latent components, so it belongs in models rather
than preprocessing or feature_engineering.
n_components is treated as a requested upper bound. At fit time, the model
resolves it to min(requested, n_factor_predictors, n_observations) so the
default is safe for small feature sets. Metadata records both
requested_n_components and resolved_n_components; n_components stores the
resolved value used by sklearn.cross_decomposition.PLSRegression.
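The resolution rule amounts to a single min() call. The helper below is a hypothetical illustration that returns the three metadata entries named above; it is not the package function.

```python
def resolve_n_components(requested, n_factor_predictors, n_observations):
    """Treat the requested component count as an upper bound."""
    resolved = min(requested, n_factor_predictors, n_observations)
    if resolved < 1:
        raise ValueError("at least one component, predictor, and observation is required")
    return {
        "requested_n_components": requested,
        "resolved_n_components": resolved,
        "n_components": resolved,  # value that would be passed to the backend
    }
```

For example, requesting three components with only two factor predictors resolves to two, while the original request stays visible in the metadata.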
Implementation map:
Item |
Value |
|---|---|
Backend |
|
Paper-code comparison |
Hounyo-Li |
Control handling |
Optional |
Difference from MATLAB code |
MATLAB |
PC2 support |
Set |
Hounyo-Li PLS baseline:
alphawhat = y_insample * wt_insample' * inv(wt_insample * wt_insample')
y_resid = y_insample - alphawhat * wt_insample
[~,~,~,~,~,~,~,stats] = plsregress(X_insample', y_resid', K)
B = stats.W
Fhat = B' * X_insample
alphahat = y_resid * Fhat' * inv(Fhat * Fhat')
yhat = (alphahat * B') * X_out + alphawhat * wt_out
macroforecast.models.pls() mirrors this as:
MATLAB step |
|
|---|---|
|
|
residualize |
|
|
|
|
|
|
|
add |
|
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_components | 3 | yes | Requested maximum number of latent PLS components. |
| scale | True | fixed by preset | Whether to standardize predictors before PLS. |
| max_iter | 500 | fixed by preset | NIPALS iteration cap. |
| tol | 1e-6 | fixed by preset | NIPALS convergence tolerance. |
| control_columns | None | fixed by preset | Optional X columns used as forecasting controls. |
| include_constant | True | fixed by preset | Whether to include a constant in the control block. |
| drop_control_columns | True | fixed by preset | Whether controls are excluded from the PLS block. |
| quadratic_factors | False | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. |
Preset |
|
|---|---|
|
|
|
|
|
|
scaled_pca#
macroforecast.models.scaled_pca(
X,
y,
*,
n_components=3,
scale=True,
control_columns=None,
include_constant=True,
drop_control_columns=True,
winsorize_slopes=None,
quadratic_factors=False,
)
Fits Huang, Jiang, Li, Tong, and Zhou scaled PCA (sPCA) with a linear
forecast head. The factor extraction step follows the original spcaest.m
contract: standardize predictors, estimate one marginal predictive slope for
each predictor, scale each standardized predictor by that slope, then run PCA
on the scaled panel.
Mathematical contract:
Let \(X \in \mathbb{R}^{T \times N}\) be the model-window predictor matrix and
\(y \in \mathbb{R}^{T}\) be the target. With scale=True, each predictor is
standardized inside the active model window using MATLAB’s default sample
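standard deviation convention:
\[\tilde X_{tj} = \frac{X_{tj} - \bar X_j}{s_j}, \qquad s_j = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(X_{tj} - \bar X_j\right)^2}.\]
For each predictor, estimate the marginal predictive slope from an intercept regression:
\[y_t = a_j + \beta_j \tilde X_{tj} + e_{tj}, \qquad \hat\beta_j = \frac{\sum_{t=1}^{T}\left(\tilde X_{tj} - \bar{\tilde X}_j\right)\left(y_t - \bar y\right)}{\sum_{t=1}^{T}\left(\tilde X_{tj} - \bar{\tilde X}_j\right)^2}.\]
Build the scaled panel:
\[X^{\mathrm{scaled}}_{tj} = \hat\beta_j \tilde X_{tj}.\]
Then compute principal components with Huang’s normalization \(\hat F'\hat F/T = I\):
\[\hat F = \sqrt{T}\,[u_1, \ldots, u_K],\]
where \(u_1, \ldots, u_K\) are the eigenvectors of \(X^{\mathrm{scaled}} X^{\mathrm{scaled}\prime}\) associated with its \(K\) largest eigenvalues.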
For forecasting, macroforecast regresses the target residual after optional
controls on these factors, then projects new scaled observations into the
same factor space. This forecast head is the package wrapper; the factor
extraction itself matches Huang’s spcaest.m design.
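The factor-extraction steps described above (standardize, estimate one marginal slope per predictor, scale, run PCA) can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the package implementation: `spca_factors` is a hypothetical name, controls and slope winsorization are ignored, and the factors are taken as the leading left singular vectors times \(\sqrt{T}\) so that \(\hat F'\hat F/T = I\).

```python
import numpy as np


def spca_factors(X, y, n_components=1):
    """Illustrative scaled-PCA factor extraction (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs = X.shape[0]
    # Standardize each predictor with the sample (ddof=1) standard deviation.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Marginal slope of y on each standardized predictor, intercept included.
    slopes = X_std.T @ (y - y.mean()) / (X_std ** 2).sum(axis=0)
    # Scale every standardized column by its own slope.
    X_scaled = X_std * slopes
    # Leading left singular vectors times sqrt(T) satisfy F'F/T = I.
    U, _, _ = np.linalg.svd(X_scaled, full_matrices=False)
    factors = np.sqrt(n_obs) * U[:, :n_components]
    return factors, slopes, X_scaled


# Synthetic demo data (assumption: any aligned X, y would do).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))
y_demo = X_demo[:, 0] - 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=60)
factors, slopes, X_scaled = spca_factors(X_demo, y_demo, n_components=2)
```

Predictors with larger marginal slopes get more weight in the PCA step, which is the point of the scaling: the panel is tilted toward target-relevant variation before factors are extracted.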
Set quadratic_factors=True to reproduce the Hounyo-Li scaledPCA_PC2.m
forecast head:
For multiple factors, the squared term is applied componentwise, matching the
MATLAB code’s alphahat2 * ((leftvector * scaleXs_outofsample).^2) contract.
Original-code match:
Huang |
|
|---|---|
|
model-window predictor standardization with |
|
|
|
closed-form marginal slope stored in |
|
|
|
|
|
|
Scaling note: Huang’s spcaest.m standardizes only X before estimating the
marginal slopes, while the target stays in its raw units. Therefore
scaling_slopes_ in scaled_pca is in target-scale units. This differs from
the Hounyo-Li macro SsPCA code below, where both X and the target are
standardized before the slope step. The two slope vectors can differ by the
target scale, but that is a global scalar difference for the factor/forecast
structure; the practical scaling logic is the same once predictions are
mapped back to the target units.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_components | 3 | yes | Number of Huang scaled-PCA factors. |
| scale | True | fixed by preset | Whether to standardize predictors inside the model. |
| control_columns | None | fixed by preset | Optional X columns used as forecasting controls. |
| include_constant | True | fixed by preset | Whether to include a constant in the control block. |
| drop_control_columns | True | fixed by preset | Whether controls are excluded from the PCA block. |
| winsorize_slopes | None | fixed by preset | Optional percentile winsorization for scaling slopes. |
| quadratic_factors | False | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. |
The presets tune n_components. Inspect exact candidate lists with
describe_model("scaled_pca").
supervised_pca#
macroforecast.models.supervised_pca(
X,
y,
*,
n_components=3,
n_selected=50,
min_abs_corr=0.0,
scale=True,
control_columns=None,
include_constant=True,
drop_control_columns=True,
preselect="none",
t_threshold=1.28,
elastic_net_alpha=0.0002,
elastic_net_l1_ratio=0.5,
quadratic_factors=False,
random_state=0,
)
Fits original-style supervised PCA (SPCA). The implementation follows the
MATLAB reproducibility code structure from Hounyo and Li’s IJF weak-factor
package: residualize the target on optional controls, rank the current
predictors by absolute correlation with the current target residual, select
n_selected predictors, extract one SVD factor, project both target and
predictor residuals, then repeat for n_components.
This is different from feature_engineering.pca_features(), which is
unsupervised and belongs in macroforecast.feature_engineering. Use this model when
the target is intentionally allowed to guide component construction inside
each model fit window.
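The recursion described above (rank by absolute correlation, select, extract one SVD factor, project, repeat) can be sketched as follows. This is a simplified illustration, not the package or MATLAB code: `spca_recursion_sketch` is a hypothetical name, controls, `min_abs_corr`, and pre-selection are omitted, and the factor is assumed to be the first left singular vector of the selected block times its singular value.

```python
import numpy as np


def spca_recursion_sketch(X, y, n_components=2, n_selected=3):
    """Illustrative supervised-PCA loop without controls (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize predictors and target inside the window (ddof=1).
    resid_X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    resid_y = (y - y.mean()) / y.std(ddof=1)
    factors = []
    for _ in range(n_components):
        # Rank current predictor residuals by |corr| with the target residual.
        corr = np.array([np.corrcoef(col, resid_y)[0, 1] for col in resid_X.T])
        selected = np.argsort(-np.abs(corr))[:n_selected]
        # One SVD factor from the selected block.
        U, s, _ = np.linalg.svd(resid_X[:, selected], full_matrices=False)
        factor = U[:, 0] * s[0]
        # Project the factor out of both the predictors and the target.
        scale = factor @ factor
        resid_X = resid_X - np.outer(factor, factor @ resid_X / scale)
        resid_y = resid_y - factor * (factor @ resid_y / scale)
        factors.append(factor)
    return np.column_stack(factors), resid_y


rng = np.random.default_rng(1)
X_demo = rng.normal(size=(80, 6))
y_demo = X_demo[:, 0] + X_demo[:, 1] + rng.normal(scale=0.2, size=80)
F_demo, y_resid = spca_recursion_sketch(X_demo, y_demo)
```

Because each new factor is built from predictor residuals that are already orthogonal to the earlier factors, the extracted factors are mutually orthogonal and the final target residual is orthogonal to all of them.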
Mathematical contract:
Let \(X \in \mathbb{R}^{T \times N}\) be the model-window predictor matrix,
\(y \in \mathbb{R}^{T}\) be the target, and \(W \in \mathbb{R}^{T \times c}\) be
the optional control block. With scale=True, each predictor and the target
are standardized inside the active model window using the same sample standard
deviation convention as MATLAB std(...,0,dim).
First residualize the target on controls. The paper code writes this with an
ordinary inverse; macroforecast uses the Moore-Penrose inverse for numerical
stability when the control block is singular or nearly singular.
For component \(k = 1,\ldots,K\), compute residual correlations, select a subset \(I_k\), extract one SVD loading, and project:
Prediction for a new row \(x_*\) and controls \(w_*\) is:
Set quadratic_factors=True for the Hounyo-Li SPCA_PC2.m variant. In that
case each extraction step also estimates
and updates the target residual as
Original-code match:
MATLAB variable / step |
|
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Verification status: unit tests include a compact MATLAB-style reference
recursion for both SPCA and SsPCA and compare generated predictions against
models.supervised_pca() and models.supervised_scaled_pca().
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_components | 3 | yes | Number of sequential supervised components. |
| n_selected | 50 | yes | Predictors selected at each SPCA step. |
| min_abs_corr | 0.0 | yes | Minimum absolute residual correlation retained before PCA. |
| scale | True | fixed by preset | Whether to standardize predictors and target inside the model. |
| control_columns | None | fixed by preset | Optional X columns used as forecasting controls. |
| include_constant | True | fixed by preset | Whether to include a constant in the control block. |
| drop_control_columns | True | fixed by preset | Whether controls are excluded from the PCA block. |
| preselect | "none" | fixed by preset | Optional pre-selection: |
| t_threshold | 1.28 | fixed by preset | Hard t-stat pre-selection threshold. |
| elastic_net_alpha | 0.0002 | fixed by preset | Elastic-net pre-selection penalty. |
| elastic_net_l1_ratio | 0.5 | fixed by preset | Elastic-net pre-selection L1 ratio. |
| quadratic_factors | False | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. |
| random_state | 0 | fixed by preset | Elastic-net pre-selection random seed. |
The presets tune n_components, n_selected, and min_abs_corr.
Inspect exact candidate lists with describe_model("supervised_pca").
supervised_scaled_pca#
macroforecast.models.supervised_scaled_pca(
X,
y,
*,
n_components=3,
n_selected=50,
min_abs_corr=0.0,
scale=True,
control_columns=None,
include_constant=True,
drop_control_columns=True,
preselect="none",
t_threshold=1.28,
elastic_net_alpha=0.0002,
elastic_net_l1_ratio=0.5,
quadratic_factors=False,
random_state=0,
)
Fits Hounyo-Li supervised scaled PCA (SsPCA). This adds the paper’s
predictive-slope scaling step before the SPCA loop: each standardized
predictor is first multiplied by its marginal predictive slope for the target.
The scaled panel is then passed through the same iterative supervised
selection, SVD factor extraction, and projection loop as supervised_pca.
Mathematical contract:
After the within-window standardization used above, estimate one marginal predictive slope per predictor:
Build the supervised-scaled panel
Then run the same SPCA recursion as supervised_pca with
\(R_X^{(0)} = X^{\mathrm{scaled}}\). The forecast is therefore
This corresponds to Hounyo-Li SsPCA as implemented in the local MATLAB
package: scaledPCA_emp002.m supplies the predictive-slope scaling idea,
SPCA_emp002.m supplies the supervised selection/projection recursion, and
SsPCA_emp002.m combines the two by applying SPCA to scaleXs.
Source checked: the local MATLAB reproducibility package for Hounyo and Li,
SsPCA_emp002.m, SsPCA_tune.m, SPCA_emp002.m, scaledPCA_emp002.m, and
inflation_linear_tune.m. The Python implementation is a clean port of the
algorithmic contract, not copied MATLAB code.
Set quadratic_factors=True for the SsPCA_PC2.m variant. This keeps the same
predictive-slope scaling and supervised selection recursion, then adds the
componentwise squared-factor forecast head used in the paper’s PC2 scripts.
Original-code match for the scaling step:
MATLAB variable / step |
|
|---|---|
|
|
|
closed-form marginal slope |
|
|
|
|
|
|
Target scaling note: the Hounyo-Li macro code standardizes the target and
predictors before computing beta_scaled. Huang’s spcaest.m standardizes
only predictors and keeps the target raw. Consequently, supervised_scaled_pca
stores standardized-target slopes when scale=True, while scaled_pca stores
raw-target slopes. These stored slope magnitudes are not directly comparable
without the target standard deviation. For factor construction and forecast
generation, however, the difference is a global target-scale multiplier rather
than a different screening or projection rule.
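The scale relationship in the note above is easy to check numerically. The sketch below uses only the standard library and made-up data; `marginal_slope` and `standardize` are illustrative helpers, not package functions. It shows that the slope on a raw target equals the slope on a standardized target times the target's sample standard deviation.

```python
from statistics import mean, stdev


def marginal_slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardize(values):
    """Center and scale with the sample (n - 1) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Made-up predictor and target values for illustration.
x = [0.2, 1.1, -0.4, 0.9, 1.7, -1.2, 0.3, 0.8]
y = [1.0, 3.2, 0.1, 2.5, 4.4, -1.9, 1.2, 2.6]

x_std = standardize(x)
slope_raw_target = marginal_slope(x_std, y)               # scaled_pca convention
slope_std_target = marginal_slope(x_std, standardize(y))  # supervised_scaled_pca convention
ratio = slope_raw_target / slope_std_target
```

Here `ratio` equals `stdev(y)`: every slope is rescaled by the same constant, so the ranking of predictors and the direction of the scaled panel are unchanged.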
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_components | 3 | yes | Number of sequential SsPCA components. |
| n_selected | 50 | yes | Predictors selected at each SPCA step after slope scaling. |
| min_abs_corr | 0.0 | yes | Minimum absolute residual correlation retained before PCA. |
| scale | True | fixed by preset | Whether to standardize predictors and target inside the model. |
| control_columns | None | fixed by preset | Optional X columns used as forecasting controls. |
| include_constant | True | fixed by preset | Whether to include a constant in the control block. |
| drop_control_columns | True | fixed by preset | Whether controls are excluded from the PCA block. |
| preselect | "none" | fixed by preset | Optional pre-selection: |
| t_threshold | 1.28 | fixed by preset | Hard t-stat pre-selection threshold. |
| elastic_net_alpha | 0.0002 | fixed by preset | Elastic-net pre-selection penalty. |
| elastic_net_l1_ratio | 0.5 | fixed by preset | Elastic-net pre-selection L1 ratio. |
| quadratic_factors | False | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. |
| random_state | 0 | fixed by preset | Elastic-net pre-selection random seed. |
The original empirical MATLAB code uses lagged target plus constant controls.
In macroforecast, pass the lagged target as an X column and list it in
control_columns when that exact control block is needed.
ar#
macroforecast.models.ar(y, *, n_lag=1)
Fits a univariate autoregression on the target series.
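For intuition, the default `n_lag=1` case can be sketched in plain Python. This is an assumption-laden illustration, not the package estimator: it assumes an OLS fit with an intercept and a recursive multi-step forecast, and `fit_ar1` and `forecast_ar1` are hypothetical names.

```python
def fit_ar1(y):
    """OLS fit of y_t = c + phi * y_{t-1} (illustrative AR(1) only)."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    cov = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    var = sum((a - mean_lag) ** 2 for a in lagged)
    phi = cov / var
    c = mean_cur - phi * mean_lag
    return c, phi


def forecast_ar1(c, phi, last_value, steps):
    """Roll the fitted recursion forward from the last observed value."""
    path = []
    for _ in range(steps):
        last_value = c + phi * last_value
        path.append(last_value)
    return path


# A noiseless series generated by y_t = 1 + 0.5 * y_{t-1}.
series = [0.0, 1.0, 1.5, 1.75, 1.875, 1.9375]
c_hat, phi_hat = fit_ar1(series)
path = forecast_ar1(c_hat, phi_hat, series[-1], steps=2)
```

On this noiseless series the fit recovers the generating coefficients, and each forecast step feeds the previous forecast back in as the lag.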
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_lag | 1 | yes | Autoregressive lag order. |
Preset |
|
|---|---|
|
|
|
|
|
|
stlf#
macroforecast.models.stlf(y, *, period=None, sa_method="ets")
STL decomposition forecaster (R forecast::stlf). Seasonally adjusts the target
with STL, forecasts the seasonally-adjusted series (additive-trend exponential
smoothing, random-walk-drift fallback), and adds back the last seasonal cycle.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| period | None | no | Seasonal period; inferred from the index frequency if omitted. |
| sa_method | "ets" | no | Forecaster for the seasonally-adjusted series. |
naive#
macroforecast.models.naive(y)
Random-walk baseline (R forecast::naive). Carries the last observed target
value forward, so the h-step path is constant at y_T. Target-only.
seasonal_naive#
macroforecast.models.seasonal_naive(y, *, period=None)
Seasonal-naive baseline (R forecast::snaive). Repeats the last full seasonal
cycle of length period, so step k returns the value from one season earlier.
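The repeat-the-last-cycle rule can be written directly. This is a minimal sketch with a hypothetical helper name; it takes `period` explicitly and does not attempt the index-based inference a real callable might perform.

```python
def seasonal_naive_forecast(y, period, steps):
    """Repeat the last full seasonal cycle for the requested number of steps."""
    if period < 1 or len(y) < period:
        raise ValueError("need at least one full seasonal cycle")
    last_cycle = y[-period:]
    # Step k (0-based) reuses the observation from the same season of the last cycle.
    return [last_cycle[k % period] for k in range(steps)]


# Two quarterly cycles; forecast six quarters ahead.
history = [1, 2, 3, 4, 5, 6, 7, 8]
forecast = seasonal_naive_forecast(history, period=4, steps=6)
```

With `period=4` the six-step path is the last four values followed by the first two of them again.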
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| period | None | no | Seasonal period |
random_walk_drift#
macroforecast.models.random_walk_drift(y)
Random-walk-with-drift baseline (R forecast::rwf(drift=TRUE)). Extrapolates the
last value by the average historical change: y_T + h * (y_T - y_1) / (T - 1).
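The drift formula above translates directly into code. The helper name below is hypothetical and the sketch works on a plain list rather than a pandas series.

```python
def random_walk_drift_forecast(y, steps):
    """Forecast y_T + h * (y_T - y_1) / (T - 1) for h = 1..steps."""
    n_obs = len(y)
    if n_obs < 2:
        raise ValueError("need at least two observations to estimate drift")
    # Average historical change per period.
    drift = (y[-1] - y[0]) / (n_obs - 1)
    return [y[-1] + h * drift for h in range(1, steps + 1)]


history = [10.0, 12.0, 11.0, 16.0]
forecast = random_walk_drift_forecast(history, steps=3)
```

Only the first and last observations enter the drift estimate, so intermediate values change nothing. Setting the drift to zero gives the constant path of the naive baseline.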
var#
macroforecast.models.var(panel, *, target=None, n_lag=1, type="const", season=None)
Fits a VAR on a multivariate panel. target chooses the forecast output
column; if omitted, the first column is used. The callable uses an internal
OLS implementation aligned with R vars::VAR and predict.varest: lagged
endogenous variables are stacked in lag order, deterministic terms are controlled
by the R-style type argument, and predict() recursively rolls the VAR state
forward for point forecasts.
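The stacking and recursive roll-forward can be sketched with NumPy for the constant-only case. This is an illustration under assumptions, not the package code: `fit_var_ols` and `forecast_var` are hypothetical names, the regressor order is lags first and constant last, and trend or seasonal terms are not handled.

```python
import numpy as np


def fit_var_ols(panel, n_lag=1):
    """Equation-by-equation OLS with regressors [y_{t-1}, ..., y_{t-p}, 1]."""
    Y = np.asarray(panel, dtype=float)
    rows = [
        np.concatenate([Y[t - lag] for lag in range(1, n_lag + 1)] + [np.ones(1)])
        for t in range(n_lag, Y.shape[0])
    ]
    Z = np.vstack(rows)
    coef, *_ = np.linalg.lstsq(Z, Y[n_lag:], rcond=None)
    return coef  # shape (k * n_lag + 1, k)


def forecast_var(coef, history, n_lag, steps):
    """Feed each forecast back into the lag state to get a multi-step path."""
    state = [row for row in np.asarray(history, dtype=float)[-n_lag:]]
    path = []
    for _ in range(steps):
        z = np.concatenate([state[-lag] for lag in range(1, n_lag + 1)] + [np.ones(1)])
        nxt = z @ coef
        path.append(nxt)
        state.append(nxt)
    return np.vstack(path)


# Noiseless VAR(1) data so the fit can be checked against known coefficients.
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
c_true = np.array([1.0, 2.0])
rows = [np.zeros(2)]
for _ in range(9):
    rows.append(A_true @ rows[-1] + c_true)
panel = np.vstack(rows)
coef = fit_var_ols(panel, n_lag=1)
path = forecast_var(coef, panel, n_lag=1, steps=2)
```

Selecting one column of `path` corresponds to choosing a target: the whole system is rolled forward, and only the requested series is reported.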
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| target | None | fixed by preset | Target column in the panel. |
| n_lag | 1 | yes | VAR lag order. |
| type | "const" | fixed by preset | R |
| season | None | fixed by preset | Optional centered seasonal dummies, matching |
Preset |
|
|---|---|
|
|
|
|
|
|
bvar_minnesota#
macroforecast.models.bvar_minnesota(
panel,
*,
target=None,
n_lag=1,
kappa0=2.0,
kappa1=0.5,
nu0=0.0,
s0=0.0,
iter=10000,
burnin=5000,
random_state=0,
)
Fits a Bayesian VAR posterior sampler with the Minnesota prior variance logic
used by R FAVAR::BVAR and bvartools::minnesota_prior. Saved posterior
coefficient and covariance draws are available in diagnostics; predict() uses
posterior-mean VAR coefficients for recursive point forecasts. BVAR forecasting
is not a macroforecast-only extension: CRAN BVAR exposes predict.bvar, while
this callable is macroforecast’s R-aligned ModelFit surface for the same class
of BVAR forecast object.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| target | None | fixed by preset | Target column in the panel. |
| n_lag | 1 | yes | VAR lag order. |
| kappa0 | 2.0 | yes | Minnesota own-lag prior scale. |
| kappa1 | 0.5 | yes | Minnesota lag-decay exponent. |
| nu0 | 0.0 | fixed by preset | Inverse-Wishart degrees-of-freedom prior parameter. |
| s0 | 0.0 | fixed by preset | Inverse-Wishart scale prior parameter. |
| iter | 10000 | fixed by preset | Total Gibbs iterations. |
| burnin | 5000 | fixed by preset | Burn-in iterations removed before summaries. |
| random_state | 0 | fixed by preset | Random seed for posterior draws. |
bvar_normal_inverse_wishart#
macroforecast.models.bvar_normal_inverse_wishart(
panel,
*,
target=None,
n_lag=1,
b0=0.0,
vb0=0.0,
nu0=0.0,
s0=0.0,
iter=10000,
burnin=5000,
random_state=0,
)
Fits the same FAVAR-style Bayesian VAR posterior sampler with direct controls for coefficient prior mean/variance and inverse-Wishart covariance prior terms. Saved diagnostics include coefficient posterior mean, standard deviation, credible interval bounds, and posterior mean covariance.
ets#
macroforecast.models.ets(
y,
*,
error="add",
trend=None,
seasonal=None,
seasonal_periods=None,
damped_trend=False,
)
Fits a target-only statsmodels exponential-smoothing model through ETS-style
arguments. In forecasting.run(...), this model ignores X and fits on the
stage target vector.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| error | "add" | fixed by preset | Error component; currently additive by default. |
| trend | None | fixed by preset | Optional trend component such as |
| seasonal | None | fixed by preset | Optional seasonal component such as |
| seasonal_periods | None | fixed by preset | Seasonal period length. |
| damped_trend | False | fixed by preset | Whether to damp the trend component. |
Output is ModelFit; predict(X_future) uses only the number of requested
future rows and preserves the provided index.
holt_winters#
macroforecast.models.holt_winters(
y,
*,
trend="add",
seasonal=None,
seasonal_periods=None,
damped_trend=False,
)
Fits a target-only Holt-Winters exponential-smoothing model. In the forecasting runner it is a target-input model: feature matrices are used only to provide the forecast index and horizon length.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| trend | "add" | fixed by preset | Trend component. |
| seasonal | None | fixed by preset | Optional seasonal component. |
| seasonal_periods | None | fixed by preset | Seasonal period length. |
| damped_trend | False | fixed by preset | Whether to damp the trend component. |
Output is ModelFit; predictions are indexed like the supplied future frame.
theta_method#
macroforecast.models.theta_method(
y,
*,
period=None,
deseasonalize=True,
use_test=True,
)
Fits statsmodels’ target-only Theta method wrapper. Use it as a benchmark univariate model; it does not consume predictor columns.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| period | None | fixed by preset | Seasonal period passed to statsmodels. |
| deseasonalize | True | fixed by preset | Whether statsmodels deseasonalizes before fitting. |
| use_test | True | fixed by preset | Whether statsmodels uses its internal seasonality test. |
Output is ModelFit; predict(X_future) returns a point forecast series.
dfm_mixed_mariano_murasawa#
macroforecast.models.dfm_mixed_mariano_murasawa(
panel,
*,
target=None,
metadata=None,
monthly_columns=None,
quarterly_columns=None,
unsupported="raise",
n_factors=1,
factor_order=1,
idiosyncratic_ar1=True,
standardize=True,
maxiter=500,
tolerance=1e-6,
)
Fits a monthly/quarterly dynamic factor model through
statsmodels.tsa.statespace.dynamic_factor_mq.DynamicFactorMQ. The callable
uses the Mariano-Murasawa state-space aggregation for quarterly variables by
ordering monthly columns first, quarterly columns second, and passing
k_endog_monthly to statsmodels.
R comparison: this is a backend-wrapper analogue of the mixed-frequency DFM
contract used by dfms::DFM(X, quarterly.vars=...) and archived
nowcasting::nowcast(method="EM"). Those R implementations require quarterly
series to be positioned after monthly series and impose the
Mariano-Murasawa [1, 2, 3, 2, 1] temporal aggregation restriction for
quarterly growth/flow variables. macroforecast delegates the Kalman/EM
likelihood to statsmodels rather than reimplementing the R/C++ filter code.
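The [1, 2, 3, 2, 1] restriction follows from writing a quarterly growth rate in terms of monthly growth rates. The sketch below checks that identity on made-up log levels. It assumes the quarterly log level is the average of the three monthly log levels, which is where the 1/3 factor comes from; other conventions drop that factor. The helper name is hypothetical.

```python
MM_WEIGHTS = (1, 2, 3, 2, 1)


def quarterly_growth_from_monthly(monthly_growth):
    """Weighted sum of the five most recent monthly growth rates, divided by 3."""
    if len(monthly_growth) < 5:
        raise ValueError("need five monthly growth rates")
    # Order as growth at t, t-1, ..., t-4 to line up with the weights.
    recent = monthly_growth[-5:][::-1]
    return sum(w * g for w, g in zip(MM_WEIGHTS, recent)) / 3.0


# Six made-up monthly log levels covering two adjacent quarters.
log_levels = [0.0, 0.1, 0.3, 0.35, 0.5, 0.9]
monthly_growth = [b - a for a, b in zip(log_levels, log_levels[1:])]

# Direct calculation: difference of the two quarterly average log levels.
direct = sum(log_levels[3:]) / 3.0 - sum(log_levels[:3]) / 3.0
weighted = quarterly_growth_from_monthly(monthly_growth)
```

Both routes give the same quarterly growth rate, which is why a state-space model can tie an observed quarterly series to five lags of a latent monthly one.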
The preferred input is a native mixed-frequency bundle:
mixed = mf.data.combine(monthly_bundle, quarterly_bundle, frequency="native")
fit = mf.models.dfm_mixed_mariano_murasawa(mixed, target="GDPC1")
metadata["native_frequency_by_column"] is used to split monthly and quarterly
columns. If metadata are absent, the function infers frequencies from observed
date spacing. Explicit monthly_columns and quarterly_columns override
metadata. Unsupported frequencies raise by default; set unsupported="drop" to
drop those columns before fitting.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| target | None | fixed by preset | Forecasted panel column. Defaults to first quarterly column, otherwise first column. |
| metadata | None | fixed by preset | Metadata with |
| monthly_columns | None | fixed by preset | Explicit monthly columns. |
| quarterly_columns | None | fixed by preset | Explicit quarterly columns. |
| unsupported | "raise" | fixed by preset | Unsupported frequency policy: |
| n_factors | 1 | yes | Number of dynamic factors. |
| factor_order | 1 | yes | VAR order for factor dynamics. |
| idiosyncratic_ar1 | True | fixed by preset | Model idiosyncratic components as AR(1). |
| standardize | True | fixed by preset | Let statsmodels standardize observed variables before fitting. |
| maxiter | 500 | fixed by preset | EM iteration cap. |
| tolerance | 1e-6 | fixed by preset | EM convergence tolerance. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
Diagnostics include filtered factors, fitted target values when available, target residuals, likelihood, and fitted parameter estimates.
dfm_unrestricted_midas#
macroforecast.models.dfm_unrestricted_midas(
panel,
*,
target,
metadata=None,
lag_columns=None,
lags=(0, 1, 2),
factor_lags=(0,),
target_frequency="quarterly",
anchor_position="period_end",
n_factors=1,
factor_order=1,
idiosyncratic_ar1=True,
standardize=True,
maxiter=500,
tolerance=1e-6,
alpha=0.0,
fit_intercept=True,
drop_missing=True,
)
Fits a composite mixed-frequency model:
1. Fit dfm_mixed_mariano_murasawa(...) on the native mixed-frequency panel.
2. Extract filtered DFM factors at the target anchor dates.
3. Add optional observed lag blocks from mixed_frequency_lags(...).
4. Fit unrestricted_midas(...) as the forecast head.
This is a convenience composite, not a new state-space likelihood. The returned
fit’s predict() method accepts a prepared feature matrix with the same
columns as fit.estimator.design_. The lower-level
fit.estimator.predict_from_panel(...) method rebuilds the composite design
from a native mixed-frequency panel. forecasting.run(...) uses that method:
it fits the MIDAS head on the training panel, masks the test target values, then
refits the DFM on the available native panel so current monthly information can
enter the test-origin factor design without using the held-out target.
R comparison: this is the explicit callable version of a two-stage workflow,
not a single R estimator. The first stage follows the DFM contract above. The
forecast head is aligned with midasr::midas_u when alpha=0; alpha>0 is a
macroforecast ridge extension.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| target | required | fixed by preset | Forecasted target column. |
| metadata | None | fixed by preset | Metadata with native frequencies; normally supplied by |
| lag_columns | None | fixed by preset | Observed columns added as unrestricted MIDAS lags. |
| lags | (0, 1, 2) | yes | Native-frequency lags for observed columns. |
| factor_lags | (0,) | yes | Monthly lags of filtered DFM factors. |
| target_frequency | "quarterly" | fixed by preset | Frequency used to position target anchor dates. |
| anchor_position | "period_end" | fixed by preset | Anchor positioning; useful for FRED-QD quarter-start dates. |
| n_factors | 1 | yes | Number of DynamicFactorMQ factors. |
| factor_order | 1 | yes | VAR order for factor dynamics. |
| idiosyncratic_ar1 | True | fixed by preset | Model DFM idiosyncratic components as AR(1). |
| standardize | True | fixed by preset | Let DynamicFactorMQ standardize observed variables. |
| maxiter | 500 | fixed by preset | DFM EM iteration cap. |
| tolerance | 1e-6 | fixed by preset | DFM EM convergence tolerance. |
| alpha | 0.0 | yes | Ridge penalty on the unrestricted MIDAS head. |
| fit_intercept | True | fixed by preset | Whether the unrestricted MIDAS head includes an intercept. |
| drop_missing | True | fixed by preset | Drop incomplete composite design rows before fitting the head. |
midas_almon#
macroforecast.models.midas_almon(
X,
y,
*,
polynomial_order=2,
theta=None,
alpha=0.0,
fit_intercept=True,
)
Fits a MIDAS regression where each lag group is compressed with normalized exponential Almon weights before a linear or ridge head is fit.
R comparison: midasr::midas_r(..., nealmon) jointly estimates the aggregate
scale and Almon shape by nonlinear least squares. macroforecast keeps the shape
fixed as a hyperparameter and estimates only the aggregate regression
coefficient in a linear/ridge head. The weight shape matches the scale-free part
of midasr::nealmon.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| polynomial_order | 2 | yes | Degree of the Almon polynomial. |
| theta | None | fixed by preset | Shape coefficients for the scale-free part of |
| alpha | 0.0 | yes | Ridge penalty on the regression head. |
| fit_intercept | True | fixed by preset | Whether the regression head includes an intercept. |
Weight formula:
h_j = j, j = 1, ..., d
w_j = exp(theta_1 h_j + ... + theta_p h_j^p) / sum_j exp(...)
If theta is supplied, it must contain polynomial_order values. The aggregate
coefficient scale is estimated by the regression head, corresponding to the
first scale parameter in midasr::nealmon.
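The weight formula above can be evaluated with the standard library. `almon_weights` is a hypothetical helper written for this illustration, not the package function; it subtracts the largest exponent before exponentiating, which leaves the normalized weights unchanged and avoids overflow.

```python
import math


def almon_weights(theta, n_lags):
    """Normalized exponential Almon weights for lags j = 1..n_lags."""
    # Exponent for lag j: theta_1 * j + theta_2 * j**2 + ... + theta_p * j**p.
    exponents = [
        sum(coef * j ** (power + 1) for power, coef in enumerate(theta))
        for j in range(1, n_lags + 1)
    ]
    peak = max(exponents)
    raw = [math.exp(e - peak) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


flat = almon_weights([0.0, 0.0], n_lags=4)        # all-zero shape gives equal weights
decaying = almon_weights([0.0, -0.1], n_lags=6)   # negative quadratic term decays
```

The weights always sum to one, so the regression head's coefficient on the aggregated column carries the overall scale.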
midas_beta#
macroforecast.models.midas_beta(
X,
y,
*,
beta_params=(1.0, 1.0),
alpha=0.0,
fit_intercept=True,
)
Fits a MIDAS regression where each lag group is compressed with normalized beta weights before a linear or ridge head is fit.
R comparison: this uses the scale-free form of midasr::nbetaMT with
p=(1, a, b, 0): endpoints are shifted by machine epsilon, the beta density is
normalized, and the aggregate scale is estimated by the regression head.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| beta_params | (1.0, 1.0) | yes | Positive beta-shape parameters |
| alpha | 0.0 | yes | Ridge penalty on the regression head. |
| fit_intercept | True | fixed by preset | Whether the regression head includes an intercept. |
Weight formula:
z_j = (j - 1) / (d - 1), with endpoint epsilon adjustment
w_j = z_j^(a-1) (1-z_j)^(b-1) / sum_j z_j^(a-1) (1-z_j)^(b-1)
Both beta parameters must be strictly positive.
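A standard-library sketch of the beta weight formula follows. `beta_weights` is a hypothetical helper, and clipping the grid to `[eps, 1 - eps]` is this sketch's way of applying the endpoint epsilon adjustment.

```python
import sys


def beta_weights(a, b, n_lags):
    """Normalized beta-shape weights on an evenly spaced grid of n_lags points."""
    if a <= 0 or b <= 0:
        raise ValueError("both beta parameters must be strictly positive")
    if n_lags == 1:
        return [1.0]
    eps = sys.float_info.epsilon
    raw = []
    for j in range(1, n_lags + 1):
        z = (j - 1) / (n_lags - 1)
        # Keep the endpoints strictly inside (0, 1).
        z = min(max(z, eps), 1.0 - eps)
        raw.append(z ** (a - 1) * (1.0 - z) ** (b - 1))
    total = sum(raw)
    return [r / total for r in raw]


flat = beta_weights(1.0, 1.0, n_lags=4)      # the default shape is equal weighting
decaying = beta_weights(1.0, 2.0, n_lags=4)  # b > a = 1 puts more weight on recent lags
```

The default `beta_params=(1.0, 1.0)` therefore reduces to an equal-weight average of each lag group.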
midas_step#
macroforecast.models.midas_step(
X,
y,
*,
n_steps=3,
step_bounds=None,
step_weights=None,
alpha=0.0,
fit_intercept=True,
)
Fits a MIDAS regression where lags are grouped into piecewise-constant step
blocks. If step_bounds and step_weights are omitted, the lag range is split
into n_steps blocks with equal raw step heights, then normalized to a
scale-free weight vector.
R comparison: midasr::polystep(p, d, m, a) repeats raw step coefficients
between interior cut points. macroforecast exposes the same idea through
step_bounds=a and step_weights=p, then normalizes the resulting shape
because the aggregate scale is estimated by the regression head.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_steps | 3 | yes | Number of lag buckets when |
| step_bounds | None | fixed by preset | Optional interior cut points, matching |
| step_weights | None | fixed by preset | Optional raw step heights, one per bucket. |
| alpha | 0.0 | yes | Ridge penalty on the regression head. |
| fit_intercept | True | fixed by preset | Whether the regression head includes an intercept. |
n_steps must be positive. If supplied, step_bounds must be strictly
increasing and smaller than the number of lag columns; step_weights must
contain one value per resulting bucket.
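The step shape can be sketched as follows. This is an illustration under assumptions rather than the package logic: `step_shape` is a hypothetical helper, cut points are treated as 0-based positions where a new bucket starts, and the default split uses rounded near-equal bucket edges.

```python
def step_shape(n_lags, n_steps=3, step_bounds=None, step_weights=None):
    """Piecewise-constant lag weights, normalized to sum to one."""
    if step_bounds is None:
        if n_steps < 1:
            raise ValueError("n_steps must be positive")
        # Near-equal bucket edges over the lag range (sketch assumption).
        edges = [round(i * n_lags / n_steps) for i in range(n_steps + 1)]
    else:
        bounds = list(step_bounds)
        increasing = all(lo < hi for lo, hi in zip(bounds, bounds[1:]))
        if not increasing or (bounds and not 0 < bounds[0] <= bounds[-1] < n_lags):
            raise ValueError("step_bounds must be strictly increasing interior cut points")
        edges = [0] + bounds + [n_lags]
    n_buckets = len(edges) - 1
    heights = [1.0] * n_buckets if step_weights is None else list(step_weights)
    if len(heights) != n_buckets:
        raise ValueError("step_weights needs one value per bucket")
    raw = []
    for height, lo, hi in zip(heights, edges, edges[1:]):
        raw.extend([height] * (hi - lo))
    total = sum(raw)
    if total == 0:
        raise ValueError("step heights must not sum to zero")
    return [value / total for value in raw]


equal = step_shape(6)
custom = step_shape(6, step_bounds=[2, 4], step_weights=[3, 2, 1])
```

Equal raw heights collapse to a flat average, while explicit heights give a staircase whose overall scale is still left to the regression head.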
restricted_midas#
macroforecast.models.restricted_midas(
X,
y,
*,
weighting="almon",
polynomial_order=2,
start_params=None,
n_steps=3,
step_bounds=None,
fit_intercept=True,
maxiter=1000,
tolerance=1e-8,
)
Fits a nonlinear restricted MIDAS regression over an explicit lag matrix. This
is the direct callable counterpart to midasr::midas_r when the formula has
already been expanded into columns such as PAYEMS_lag0, PAYEMS_lag1, and
PAYEMS_lag2.
R comparison: midasr::midas_r maps each low-dimensional restriction parameter
vector into full lag coefficients and minimizes the nonlinear least-squares
objective. restricted_midas() uses the same objective and the same nealmon,
nbetaMT, or polystep coefficient maps. It uses SciPy least_squares
instead of R’s default optim(method="BFGS"), so optimizer traces are not
bit-identical, but the restricted regression equation is the same. Formula
parsing, AR* common-factor terms, HAC covariance, model tables, and S3 forecast
utilities are not reproduced here.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| weighting | "almon" | yes | Restriction map: |
| polynomial_order | 2 | yes | Number of Almon shape terms after the aggregate scale parameter. |
| start_params | None | fixed by preset | Starting values. Pass one sequence for all lag groups, or a mapping from group name to sequence. |
| n_steps | 3 | yes | Number of step buckets when |
| step_bounds | None | fixed by preset | Interior cut points for |
| fit_intercept | True | fixed by preset | Whether to estimate an intercept outside the restricted lag coefficients. |
| maxiter | 1000 | fixed by preset | Maximum SciPy least-squares function evaluations. |
| tolerance | 1e-8 | fixed by preset | Shared |
Outputs include fitted values, residuals, unrestricted effective lag coefficients, the optimized restricted parameter vector, convergence metadata, and the lag-group metadata used to expand coefficients.
unrestricted_midas#
macroforecast.models.unrestricted_midas(
X,
y,
*,
alpha=0.0,
fit_intercept=True,
)
Fits an unrestricted MIDAS regression. Unlike the weighted variants, it does not collapse lag groups; each supplied lag column receives its own coefficient.
R comparison: this matches midasr::midas_u when alpha=0, because every lag
coefficient is free and the regression is ordinary least squares. alpha>0 is
a macroforecast ridge extension for high-dimensional lag matrices.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| alpha | 0.0 | yes | Ridge penalty. |
| fit_intercept | True | fixed by preset | Whether the regression head includes an intercept. |
MIDAS Input Contract#
The MIDAS callables expect lag-grouped predictor columns, typically names like
PAYEMS_lag0, PAYEMS_lag1, and PAYEMS_lag2. Columns sharing the same
prefix before _lag# are collapsed into one weighted aggregate before a
linear or ridge regression is fit. This keeps mixed-frequency weighting as a
model choice while leaving calendar alignment and lag construction in
data, preprocessing, and feature_engineering.
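The naming convention above implies a simple grouping step. The sketch below shows one way to recover lag groups from column names. `group_lag_columns` and its regular expression are hypothetical, and raising on names that do not match is an assumption of this sketch rather than documented package behavior.

```python
import re
from collections import defaultdict

# Matches names such as PAYEMS_lag0: a prefix, the literal "_lag", then digits.
LAG_PATTERN = re.compile(r"^(?P<prefix>.+)_lag(?P<lag>\d+)$")


def group_lag_columns(columns):
    """Map each prefix to its lag columns, ordered by lag number."""
    groups = defaultdict(list)
    for name in columns:
        match = LAG_PATTERN.match(name)
        if match is None:
            raise ValueError(f"column {name!r} does not follow the <prefix>_lag<k> pattern")
        groups[match["prefix"]].append((int(match["lag"]), name))
    return {prefix: [name for _, name in sorted(items)] for prefix, items in groups.items()}


columns = ["PAYEMS_lag1", "PAYEMS_lag0", "INDPRO_lag0", "PAYEMS_lag2"]
lag_groups = group_lag_columns(columns)
```

Each group then receives one weight vector of matching length, so the number of weighted aggregates equals the number of distinct prefixes.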
These callables are small model functions, not workflow recipes. They do not
infer target anchors, release calendars, or future design matrices. Build the
lag matrix explicitly with mixed_frequency_lags(...), align X and y, then
call the model.
The weighted MIDAS callables midas_almon(), midas_beta(), and
midas_step() treat shape parameters as fixed or selected hyperparameters.
Use restricted_midas() when the shape and scale parameters should be
estimated jointly by nonlinear least squares, matching the midasr::midas_r
estimation target.
For unrestricted_midas(), build its input matrix with
mf.feature_engineering.mixed_frequency_lags(...) when the source data are
native mixed-frequency panels.
All MIDAS callables preserve lag metadata. Weighted variants record lag groups,
resolved weights, weighted aggregate column names, aggregate coefficients, and
effective lag coefficients. unrestricted_midas() records the original lag
groups and per-lag coefficients.
X_midas = mf.feature_engineering.mixed_frequency_lags(
mixed,
target="GDPC1",
columns=["PAYEMS", "INDPRO"],
lags=range(0, 12),
target_frequency="quarterly",
anchor_position="period_end",
drop_missing=True,
)
y = mixed.panel["GDPC1"].dropna()
y.index = y.index.to_period("Q").asfreq("M", how="end").to_timestamp()
aligned = X_midas.join(y.rename("GDPC1")).dropna()
fit = mf.models.midas_beta(
aligned.drop(columns="GDPC1"),
aligned["GDPC1"],
beta_params=(1.0, 2.0),
alpha=0.1,
)
fit.metadata["weights"]
fit.diagnostics["effective_lag_coefficients"]
far#
macroforecast.models.far(
X,
y,
*,
n_factors=3,
n_lag=1,
random_state=0,
)
Fits factor-augmented autoregression: PCA factors from X plus AR lags of
y.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_factors | 3 | yes | Number of PCA factors. |
| n_lag | 1 | yes | Autoregressive lag order. |
| random_state | 0 | fixed by preset | PCA random seed. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
favar#
macroforecast.models.favar(
X,
y,
*,
n_factors=2,
n_lag=2,
fctmethod="BBE",
slowcode=None,
factorprior=None,
varprior=None,
nburn=5000,
nrep=15000,
standardize=True,
random_state=0,
)
Fits a Bayesian FAVAR aligned with CRAN FAVAR::FAVAR: optional R-style
standardization, ExtrPC() factor extraction, BBE facrot() or BGM factor
identification, conjugate loading-equation draws, and the internal
FAVAR::BVAR posterior sampler for the [factors, y] VAR block.
Important boundary: BVAR forecasting is standard, and CRAN BVAR has
predict.bvar. The macroforecast-specific extension here is narrower:
CRAN FAVAR exposes summaries, coefficients, and impulse responses for favar
objects, but not predict.favar. Therefore macroforecast.models.favar(...).predict(...)
is a ModelFit forecast wrapper over the fitted FAVAR posterior VAR state using
posterior-mean coefficients.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| n_factors | 2 | yes | Number of latent factors. |
| n_lag | 2 | yes | VAR lag order on the target plus factors. |
| fctmethod | "BBE" | fixed by preset | Factor identification method: |
| slowcode | None | fixed by preset | Boolean slow-variable mask required by BBE. |
| factorprior | None | fixed by preset | Factor loading prior controls. |
| varprior | None | fixed by preset | BVAR prior controls for the factor VAR block. |
| nburn | 5000 | fixed by preset | Burn-in iterations for posterior draws. |
| nrep | 15000 | fixed by preset | Saved posterior draw count. |
| standardize | True | fixed by preset | Use R |
| random_state | 0 | fixed by preset | Random seed for posterior draws. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
Tree And Machine-Learning Models#
Tree Implementation Map#
Tree callables use backend wrappers, hybrid wrappers, or package-native code. Fit-time model ensembles such as bagging, subagging, stacking, Super Learner, and Booging live in Model Ensemble.
Model |
Implementation class |
Runtime backend |
|---|---|---|
|
backend wrapper |
|
|
backend wrapper |
|
|
backend wrapper |
|
|
backend wrapper |
|
|
optional backend wrapper |
|
|
optional backend wrapper |
|
|
package-native hybrid |
LGB+ competition algorithm aligned to |
|
package-native hybrid |
LGB^A+ alternating algorithm aligned to |
|
optional backend wrapper |
|
|
hybrid |
sklearn forest plus macroforecast leaf-target quantile store |
|
hybrid adapter |
vendored |
backend wrapper means the statistical estimator is delegated to the named
package and macroforecast standardizes callable input, metadata, diagnostics,
and persistence. hybrid means macroforecast owns part of the algorithmic
contract, such as resampling, leaf-distribution storage, feature augmentation,
or pandas-to-reference-package adaptation. package-native means the estimator
logic itself is implemented inside macroforecast.
decision_tree#
macroforecast.models.decision_tree(
X,
y,
*,
max_depth=None,
min_samples_leaf=1,
random_state=0,
)
Fits sklearn CART regression.
R parity is intentionally not claimed for the sklearn tree wrappers. The named
backend owns the estimator; macroforecast owns the pandas X, y contract,
metadata, diagnostics, and search-space registration.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `max_depth` | `None` | yes | Maximum tree depth. |
| `min_samples_leaf` | `1` | yes | Minimum samples per terminal leaf. |
| `random_state` | `0` | fixed by preset | Tree random seed. |
Preset |
|
|
|---|---|---|
|
|
|
|
|
|
|
|
|
random_forest#
macroforecast.models.random_forest(
X,
y,
*,
n_estimators=200,
max_depth=None,
min_samples_leaf=1,
random_state=0,
n_jobs=1,
)
Fits sklearn random forest regression.
The fitted wrapper exposes sklearn feature importances through
fit.diagnostics["feature_importance"].
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `n_estimators` | `200` | yes | Number of trees. |
| `max_depth` | `None` | yes | Maximum depth per tree. |
| `min_samples_leaf` | `1` | yes | Minimum samples per terminal leaf. |
| `random_state` | `0` | fixed by preset | Forest random seed. |
| `n_jobs` | `1` | fixed by preset | Parallel worker count. |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Default model-selection method: random.
extra_trees#
macroforecast.models.extra_trees(
X,
y,
*,
n_estimators=200,
max_depth=None,
min_samples_leaf=1,
random_state=0,
n_jobs=1,
)
Fits sklearn extremely randomized trees. Parameters and presets match
random_forest.
Default model-selection method: random.
gradient_boosting#
macroforecast.models.gradient_boosting(
X,
y,
*,
n_estimators=200,
learning_rate=0.1,
max_depth=3,
random_state=0,
)
Fits sklearn gradient-boosted regression trees.
This callable fits a single boosting estimator. Fit-time boosting ensembles such as Booging live in Model Ensemble.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `n_estimators` | `200` | yes | Number of boosting stages. |
| `learning_rate` | `0.1` | yes | Shrinkage per stage. |
| `max_depth` | `3` | yes | Maximum tree depth. |
| `random_state` | `0` | fixed by preset | Boosting random seed. |
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Default model-selection method: random.
mars#
macroforecast.models.mars(
X,
y,
*,
max_terms=20,
max_degree=1,
n_knots=10,
min_improvement=1e-6,
penalty=2.0,
prune=True,
)
Fits a package-native MARS-style hinge-basis regression. It uses forward
insertion of hinge basis pairs and optional backward pruning by generalized
cross-validation. This avoids the unmaintained pyearth dependency; it is a
clean internal implementation and does not claim bit-level equivalence to
other MARS backends.
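The two building blocks named above, mirrored hinge pairs and a generalized cross-validation (GCV) score, can be sketched in a few lines. This is an illustration only: the helper names `hinge_pair` and `gcv` are hypothetical, and the effective-parameter count `n_terms + penalty * (n_terms - 1) / 2` is one common convention (the form documented for R's `earth` package). Whether macroforecast uses exactly this count is an assumption.

```python
def hinge_pair(x, knot):
    """Return the mirrored hinge pair (max(0, x - knot), max(0, knot - x))."""
    return max(0.0, x - knot), max(0.0, knot - x)


def gcv(rss, n_obs, n_terms, penalty=2.0):
    """GCV score for a model with n_terms basis terms (intercept included).

    Assumed effective-parameter count: n_terms + penalty * (n_terms - 1) / 2.
    """
    effective = n_terms + penalty * (n_terms - 1) / 2.0
    if effective >= n_obs:
        return float("inf")  # more effective parameters than observations
    return rss / (n_obs * (1.0 - effective / n_obs) ** 2)


# A hinge pair is active on exactly one side of the knot.
above = hinge_pair(3.0, knot=1.0)
below = hinge_pair(0.0, knot=1.0)

# With the same RSS, a larger basis receives a worse (higher) GCV score.
small_model = gcv(rss=10.0, n_obs=100, n_terms=1)
large_model = gcv(rss=10.0, n_obs=100, n_terms=5)
```

Backward pruning then amounts to dropping a term only when the drop lowers the score, so a larger `penalty` favors smaller bases.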
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `max_terms` | `20` | yes | Maximum number of basis terms including intercept. |
| `max_degree` | `1` | yes | Maximum interaction degree. |
| `n_knots` | `10` | yes | Candidate quantile knots per predictor. |
| `min_improvement` | `1e-6` | fixed by preset | Forward-step relative RSS improvement floor. |
| `penalty` | `2.0` | fixed by preset | GCV pruning complexity penalty. |
| `prune` | `True` | fixed by preset | Whether to prune terms by GCV. |
Default model-selection method: random.
xgboost#
macroforecast.models.xgboost(
X,
y,
*,
n_estimators=300,
learning_rate=0.1,
max_depth=6,
subsample=1.0,
random_state=0,
**kwargs,
)
Fits xgboost.XGBRegressor. Requires macroforecast[xgboost].
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `n_estimators` | `300` | yes | Number of boosting stages. |
| `learning_rate` | `0.1` | yes | Shrinkage per stage. |
| `max_depth` | `6` | yes | Maximum tree depth. |
| `subsample` | `1.0` | yes | Row subsample share. |
| `random_state` | `0` | fixed by preset | Boosting random seed. |
Preset spaces match gradient_boosting plus subsample=(0.6, 0.8, 1.0).
Default model-selection method: random.
lightgbm#
macroforecast.models.lightgbm(
X,
y,
*,
n_estimators=300,
learning_rate=0.1,
max_depth=-1,
num_leaves=31,
random_state=0,
**kwargs,
)
Fits lightgbm.LGBMRegressor. Requires macroforecast[lightgbm].
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Number of boosting stages. |
|
|
yes |
Shrinkage per stage. |
|
|
yes |
Maximum tree depth; |
|
|
yes |
Maximum leaves per tree. |
|
|
fixed by preset |
Boosting random seed. |
Preset |
|
|
|
|
|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Default model-selection method: random.
lgb_plus#
Paper Citation And Source#
Primary paper:
Goulet Coulombe, Philippe. 2026. “LGB+: A Macroeconomic Forecasting Road Test.” Draft dated March 18, 2026. SSRN abstract 6439178. DOI:
10.2139/ssrn.6439178. Paper page: https://ssrn.com/abstract=6439178.
Reference implementation:
Source |
Role in |
|---|---|
Original R/Python implementation repository. |
|
|
Competition algorithm, |
|
Competition estimator class with in-class ensemble storage and step histories. |
|
Alternating algorithm and |
|
Alternating estimator class. |
Paper Motivation#
The paper targets a macro forecasting problem that appears repeatedly in small and medium macro samples. Tree boosting is strong for nonlinearities and interactions, but a large share of macro predictive content can be simple: autoregressive persistence, slowly moving accounting-like relationships, or near-mechanical links such as claims before unemployment and permits before housing starts. A standard tree booster can approximate those linear slopes, but it spends splits and boosting capacity to do work that a one-variable linear update could do cheaply.
LGB+ expands the boosting basis from “trees only” to “trees plus greedy linear updates.” The forecast remains additive:
yhat(x) = intercept + tree_component(x) + linear_component(x)
That additive form is not a cosmetic detail. It lets the user inspect the forecast through two channels:
| Channel | Intended role | Caveat |
|---|---|---|
| Linear | Persistence, autoregressive slopes, near-accounting links, and other simple one-variable residual corrections. | The split is operational, not metaphysical; a tree can still learn linear-looking structure. |
| Tree | Nonlinear states, interactions, thresholds, and regime-dependent effects. | Tree gains can include simple structures if the linear candidate loses the competition. |
The paper emphasizes that the linear/nonlinear split should be read as an algorithmic decomposition generated by the boosting path, not as proof that the data-generating process is literally separated into pure linear and pure nonlinear blocks.
Paper Empirical Design#
The empirical road test uses transformed quarterly U.S. macro data from FRED-QD. The review file summarized the design as six targets: headline CPI inflation, GDP growth, unemployment, housing starts growth, industrial production, and the term spread. Predictors include FRED-QD transformations, four lags, MARX moving-average features, and principal components from the transformed panel. The out-of-sample design is expanding-window forecasting, with a pre-COVID evaluation period and a post-COVID stress period.
This matters for using macroforecast:
Paper object |
|
|---|---|
FRED-QD transformed panel |
|
Four lags |
|
MARX moving-average features |
|
Principal components |
|
Expanding OOS design |
|
Re-estimation schedule |
|
LGB+ model |
|
LGB^A+ model |
|
Linear/tree forecast decomposition |
|
Method In The Paper#
The paper has two closely related estimators.
LGB+ is the competition version. At each boosting step:
Step |
Operation |
|---|---|
1 |
Start from the current fitted value. |
2 |
Compute residuals on the training sample. |
3 |
Draw a row subsample. |
4 |
Fit one small LightGBM residual tree on the subsample. |
5 |
Fit one greedy univariate linear residual update on the same subsample. |
6 |
Evaluate both candidate updates using |
7 |
Accept only the lower-loss candidate. |
8 |
Record which channel won, the selected linear feature when relevant, and the candidate losses. |
LGB^A+ is the alternating version. It does not run a per-step competition.
Instead, each cycle applies a block of residual trees and then a greedy
univariate linear correction. This is computationally simpler and can be more
stable when the OOB judge is noisy in macro-sized samples.
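The competition step described above can be sketched in pure Python. This is a minimal illustration under stated assumptions, not the macroforecast or reference implementation: a depth-1 stump stands in for the small LightGBM tree, the judge is out-of-bag mean squared error, and every function name (`fit_linear`, `fit_stump`, `competition_step`) is hypothetical.

```python
import random


def mse(values):
    return sum(v * v for v in values) / len(values)


def fit_linear(X, resid, rows, features):
    """Greedy no-intercept univariate slope; keeps the feature with the lowest in-bag SSE."""
    best = None
    for j in features:
        sxx = sum(X[i][j] ** 2 for i in rows)
        if sxx == 0.0:
            continue
        slope = sum(X[i][j] * resid[i] for i in rows) / sxx
        sse = sum((resid[i] - slope * X[i][j]) ** 2 for i in rows)
        if best is None or sse < best[0]:
            best = (sse, j, slope)
    if best is None:
        return (lambda row: 0.0), None
    _, j, slope = best
    return (lambda row: slope * row[j]), j


def fit_stump(X, resid, rows):
    """Depth-1 regression stump: a stand-in for the small LightGBM tree candidate."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({X[i][j] for i in rows})[:-1]:
            left = [resid[i] for i in rows if X[i][j] <= t]
            right = [resid[i] for i in rows if X[i][j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:
        return lambda row: 0.0
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr


def competition_step(X, y, fitted, rng, learning_rate=0.05, subsample=0.7,
                     linear_candidate_fraction=0.5):
    """One LGB+-style step: fit both candidates in-bag, judge on OOB rows, keep the winner."""
    n, k = len(X), len(X[0])
    resid = [y[i] - fitted[i] for i in range(n)]
    in_bag = rng.sample(range(n), max(2, int(subsample * n)))
    in_bag_set = set(in_bag)
    oob = [i for i in range(n) if i not in in_bag_set] or in_bag
    features = rng.sample(range(k), max(1, int(linear_candidate_fraction * k)))

    tree = fit_stump(X, resid, in_bag)
    linear, feature = fit_linear(X, resid, in_bag, features)

    def judged_loss(update):
        return mse([resid[i] - learning_rate * update(X[i]) for i in oob])

    tree_loss, linear_loss = judged_loss(tree), judged_loss(linear)
    # Ties go to the tree: the linear candidate must be strictly better to win.
    winner, update = ("tree", tree) if tree_loss <= linear_loss else ("linear", linear)
    new_fitted = [fitted[i] + learning_rate * update(X[i]) for i in range(n)]
    record = {"winner": winner, "tree_loss": tree_loss, "linear_loss": linear_loss,
              "linear_feature": feature if winner == "linear" else None}
    return new_fitted, record


# Synthetic, mostly linear target: y depends on the first column only.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
y = [2.0 * row[0] + 0.1 * rng.gauss(0, 1) for row in X]
fitted = [sum(y) / len(y)] * len(y)
initial_mse = mse([yi - fi for yi, fi in zip(y, fitted)])
history = []
for _ in range(40):
    fitted, record = competition_step(X, y, fitted, rng)
    history.append(record)
final_mse = mse([yi - fi for yi, fi in zip(y, fitted)])
```

The `history` list plays the role of the winner path: counting `"linear"` entries per feature gives a crude version of the channel accounting described in this section.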
Main Paper Findings To Keep In Mind#
The paper’s simulations and empirical road test support the following working interpretation:
Finding |
Practical implication |
|---|---|
In mostly linear designs, the linear channel can absorb much of the signal and avoid forcing trees to approximate simple slopes. |
Include autoregressive and near-accounting predictors explicitly; then inspect the linear channel. |
In nonlinear designs, the tree channel remains active and the hybrid does not have to behave like a linear model. |
LGB+ is a flexible hybrid, not a linear model with tree residuals fixed in advance. |
The linear channel is often useful for short-horizon unemployment and industrial production before COVID. |
Channel diagnostics can reveal whether gains come from persistence-like relations or nonlinear state recognition. |
In post-COVID stress periods, the linear channel can become harmful for some real-activity targets. |
Report channel-specific diagnostics; do not rely only on total RMSE. |
Forecasts can be decomposed natively into tree and linear pieces. |
Use |
What macroforecast Implements#
macroforecast.models.lgb_plus implements the competition estimator as a
package-native hybrid model. LightGBM supplies the residual tree candidate, but
the step loop, linear candidate, OOB/validation/training selection, ensemble
aggregation, channel accounting, and pandas metadata are implemented inside
macroforecast.
The implementation is deliberately not just a thin LGBMRegressor wrapper:
Feature |
Implemented? |
Notes |
|---|---|---|
Tree candidate per step |
yes |
Uses |
Greedy univariate linear candidate |
yes |
Uses the same no-intercept residual slope as the competition reference code. |
|
yes |
Kept from the R implementation. |
|
yes |
Default; requires |
|
yes |
Uses a fixed random validation split inside the current fit window. |
|
yes |
Available for reference parity, but not recommended for macro evaluation. |
Ensemble members |
yes |
Controlled by |
Linear component prediction |
yes |
|
Tree component prediction |
yes |
|
Channel diagnostics |
yes |
|
AXIL historical weights |
no |
This belongs in interpretation/forecast analysis later, not inside the estimator. |
Full paper table replication |
no |
The callable model is implemented; full empirical replication should be a separate replication package. |
macroforecast.models.lgb_plus(
X,
y,
*,
n_ensemble=10,
n_steps=200,
learning_rate=0.05,
subsample=0.7,
num_leaves=5,
min_data_in_leaf=20,
lambda_l2=0.1,
linear_candidate_fraction=0.5,
selection_method="oob",
val_fraction=0.2,
early_stop_patience=50,
aggregation="mean",
random_state=0,
verbose=False,
**kwargs,
)
Fits LGB+ competition boosting from Goulet Coulombe’s
philgoucou/lgbplus reference code.
This is not ordinary lightgbm with extra linear features. At every boosting
step the estimator builds two residual updates:
Candidate |
Reference-code operation |
Accepted when |
|---|---|---|
Tree |
Fit one small |
Candidate loss is no larger than the linear candidate. |
Linear |
Sample |
Candidate loss is lower than the tree candidate. |
The R reference file R/lgb_plus.R includes linear_candidate_fraction; the
Python reference file python/lgb_plus.py embeds the ensemble in the estimator
but does not expose that candidate-fraction argument. macroforecast combines
those two reference surfaces: n_ensemble controls independent runs and
linear_candidate_fraction controls greedy linear candidate subsampling.
Input is the standard supervised model contract:
Input |
Required shape |
Meaning |
|---|---|---|
|
|
Predictor matrix. DataFrame column names are preserved in diagnostics. |
|
|
Forecast target for the current fit window. |
Output is a ModelFit with model="lgb_plus". Use
fit.predict(X_new) for the total prediction. The estimator also exposes:
Method or diagnostic |
Output |
Meaning |
|---|---|---|
|
DataFrame |
|
|
ndarray |
One total prediction path per ensemble member. |
|
DataFrame |
Tree gain, linear selection count, and absolute linear update by feature. |
|
dict |
Total tree and linear steps plus per-member counts. |
|
dict |
Per-step candidate losses and selected channel metadata. |
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Independent boosting runs. Predictions are aggregated across runs. |
|
|
yes |
Maximum tree/linear competition steps per run. |
|
|
yes |
Shared shrinkage for accepted tree or linear updates. |
|
|
yes |
Row subsample share per step. |
|
|
yes |
Maximum leaves for the one-step LightGBM tree candidate. |
|
|
yes |
Minimum rows per LightGBM leaf. |
|
|
fixed by preset |
L2 penalty for LightGBM tree candidates. |
|
|
yes |
Fraction of predictors sampled before greedy linear selection. |
|
|
fixed by preset |
Candidate judge: |
|
|
fixed by preset |
Fixed validation share when |
|
|
fixed by preset |
Stop after this many non-improving selected losses; |
|
|
fixed by preset |
Ensemble aggregation: |
|
|
fixed by preset |
Base random seed. Each ensemble member increments it. |
|
none |
fixed by caller |
Additional |
Preset |
Main search dimensions |
|---|---|
|
|
|
Same dimensions with 5-10 members, 100-400 steps, and candidate fractions |
|
Same dimensions with up to 20 members, 600 steps, lower learning rates, and broader subsampling. |
Default model-selection method: random.
lgba_plus#
Paper Link#
lgba_plus is the macroforecast callable for the paper’s alternating
variant, LGB^A+. It uses the same paper and source references as lgb_plus:
Goulet Coulombe (2026), “LGB+: A Macroeconomic Forecasting Road Test,”
SSRN 6439178, DOI
10.2139/ssrn.6439178, and the
philgoucou/lgbplus source
repository.
What Changes Relative To lgb_plus#
The alternating version is easier to read and usually cheaper to fit:
Dimension |
|
|
|---|---|---|
Update schedule |
Tree and linear candidates compete; one winner advances. |
Every cycle applies both a tree block and one linear correction. |
Main count parameter |
|
|
Tree learning rate |
Shared |
|
Linear learning rate |
Shared |
|
Linear update |
No-intercept residual slope in the competition reference code. |
Intercept plus slope, matching the alternating reference code. |
Ensemble control |
|
|
Main diagnostic |
Winner path and candidate losses. |
Cycle path and selected linear feature after each tree block. |
Use lgba_plus when the goal is a stable hybrid path rather than estimating the
best tree/linear mix at every individual step. Use lgb_plus when the winner
sequence itself is part of the analysis.
What macroforecast Implements#
Feature |
Implemented? |
Notes |
|---|---|---|
Residual tree blocks |
yes |
Uses |
Greedy linear correction after every tree block |
yes |
Selects the largest absolute residual correlation. |
Separate tree and linear learning rates |
yes |
|
|
yes |
Folds the R |
Component prediction |
yes |
Same |
Channel importance |
yes |
Tree gain, linear selection count, and absolute linear update. |
Full AXIL dual interpretation |
no |
Planned for interpretation/forecast analysis rather than model fitting. |
macroforecast.models.lgba_plus(
X,
y,
*,
n_runs=1,
n_cycles=25,
trees_per_cycle=10,
lr_tree=0.02,
lr_linear=0.1,
num_leaves=15,
min_data_in_leaf=20,
subsample=1.0,
random_state=0,
verbose=False,
**kwargs,
)
Fits LGB^A+, the alternating variant from
philgoucou/lgbplus. Each cycle first
fits a block of LightGBM residual trees, then fits one greedy univariate linear
residual update with an intercept. Unlike lgb_plus, there is no per-step
winner selection: both channels are updated every cycle.
The R reference file R/lgb_plus_A.R also provides an ensemble helper
lgb_plus_A_ensemble. macroforecast folds that helper into this estimator via
n_runs, so the same callable can represent both a single alternating model and
an averaged alternating ensemble.
Input and output follow the same supervised contract as lgb_plus.
fit.estimator.predict_components(X_new) returns the total, intercept, tree,
and linear channels; fit.estimator.channel_importance() reports tree gain and
linear update frequency by feature.
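The linear half of an alternating cycle can be sketched as follows. The sketch assumes the rule stated in this section (select the feature with the largest absolute residual correlation, then fit intercept plus slope and shrink by `lr_linear`); the function name `greedy_linear_correction` is hypothetical and the tree block is omitted.

```python
from statistics import mean


def greedy_linear_correction(X, resid, lr_linear=0.1):
    """Pick the feature with the largest |corr(x_j, resid)| and return its shrunken update."""
    n, k = len(X), len(X[0])
    r_bar = mean(resid)
    srr = sum((r - r_bar) ** 2 for r in resid)
    best = None
    for j in range(k):
        col = [row[j] for row in X]
        x_bar = mean(col)
        sxx = sum((x - x_bar) ** 2 for x in col)
        if sxx == 0.0 or srr == 0.0:
            continue  # constant column or constant residual: no usable correlation
        sxr = sum((x - x_bar) * (r - r_bar) for x, r in zip(col, resid))
        corr = sxr / (sxx * srr) ** 0.5
        if best is None or abs(corr) > abs(best[0]):
            slope = sxr / sxx
            best = (corr, j, r_bar - slope * x_bar, slope)
    if best is None:
        return None, [0.0] * n
    _, j, intercept, slope = best
    return j, [lr_linear * (intercept + slope * row[j]) for row in X]


# The residual is exactly 1 + 3 * x1, so column 1 must be selected.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0], [0.0, 3.0]]
resid = [1.0, 4.0, 7.0, 10.0]
selected, update = greedy_linear_correction(X, resid)
```

Adding `update` to the current fitted values and recomputing residuals completes the cycle before the next tree block.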
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Independent alternating runs. |
|
|
yes |
Tree-block plus linear-update cycles per run. |
|
|
yes |
Residual LightGBM trees per cycle. |
|
|
yes |
Shrinkage for tree-block predictions. |
|
|
yes |
Shrinkage for linear residual updates. |
|
|
yes |
Maximum leaves for each residual tree. |
|
|
yes |
Minimum rows per LightGBM leaf. |
|
|
yes |
LightGBM bagging fraction for tree blocks. |
|
|
fixed by preset |
Base random seed. Each run increments it. |
|
none |
fixed by caller |
Additional |
The linear slope is computed by the centered OLS identity
sum((x - mean(x)) * (r - mean(r))) / sum((x - mean(x))^2). This is
algebraically identical to the R code's cov(x, residual) / var(x), where both
terms use the same divisor, and it avoids the denominator mismatch in the
reference Python expression that combines np.cov(...) (divisor n - 1 by
default) with x.var() (divisor n by default).
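The size of that mismatch is easy to check numerically. The sketch below uses small hand-written `cov` and `var` helpers (hypothetical names) with an explicit `ddof` argument so that it runs without NumPy; `ddof=1` mirrors the `np.cov` default and `ddof=0` mirrors the `ndarray.var` default.

```python
def cov(x, y, ddof):
    """Covariance with divisor n - ddof."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - ddof)


def var(x, ddof):
    """Variance with divisor n - ddof."""
    return cov(x, x, ddof)


x = [1.0, 2.0, 4.0, 7.0]
r = [0.5, 1.5, 2.0, 5.0]
n = len(x)

x_bar, r_bar = sum(x) / n, sum(r) / n
slope_centered = (
    sum((a - x_bar) * (b - r_bar) for a, b in zip(x, r))
    / sum((a - x_bar) ** 2 for a in x)
)

# Same divisor in numerator and denominator: identical to the centered form.
slope_consistent = cov(x, r, ddof=1) / var(x, ddof=1)

# Mixed divisors (n - 1 on top, n below): inflated by the factor n / (n - 1).
slope_mismatched = cov(x, r, ddof=1) / var(x, ddof=0)
```

With four observations the mismatched slope is larger by a factor of 4/3, which is material in short macro samples and vanishes only as n grows.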
Preset |
Main search dimensions |
|---|---|
|
Short alternating runs for smoke or narrow-window use. |
|
1 or 5 runs, 10 or 25 cycles, and tree/linear learning-rate grids. |
|
Up to 10 runs, 50 cycles, broader tree-block size and learning-rate ranges. |
Default model-selection method: random.
catboost#
macroforecast.models.catboost(
X,
y,
*,
n_estimators=300,
learning_rate=0.1,
max_depth=6,
random_state=0,
verbose=False,
**kwargs,
)
Fits catboost.CatBoostRegressor. Requires macroforecast[catboost].
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `n_estimators` | `300` | yes | Number of boosting stages. |
| `learning_rate` | `0.1` | yes | Shrinkage per stage. |
| `max_depth` | `6` | yes | Tree depth. |
| `random_state` | `0` | fixed by preset | Boosting random seed. |
| `verbose` | `False` | fixed by preset | CatBoost console output flag. |
Preset spaces match gradient_boosting. Default model-selection method:
random.
Macro-Specific Tree Models#
quantile_regression_forest#
macroforecast.models.quantile_regression_forest(
X,
y,
*,
n_estimators=200,
max_depth=None,
min_samples_leaf=1,
random_state=0,
quantile_levels=(0.05, 0.5, 0.95),
)
Fits a random forest and stores per-leaf training-target distributions. The
underlying estimator exposes predict_quantiles(X, levels=None). The
forecasting runner stores those outputs in the quantile_predictions column
as per-row dictionaries keyed by quantile level.
The fitted wrapper also exposes sklearn forest feature importances through
fit.diagnostics["feature_importance"].
Quantiles use tree-equal leaf weighting: within each tree, all training observations that share the test row’s terminal leaf receive equal weight, and each tree contributes the same total weight. This avoids letting large leaves dominate the empirical quantile solely because they contain more observations.
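A small sketch makes the weighting concrete. It assumes a lower inverse-CDF quantile rule, which the text does not specify, and takes as input one list of training targets per tree (the targets sharing the test row's leaf). The function name `tree_equal_quantile` is hypothetical.

```python
def tree_equal_quantile(leaf_targets_per_tree, level):
    """Weighted empirical quantile in which every tree carries total weight 1 / n_trees."""
    n_trees = len(leaf_targets_per_tree)
    weighted = []
    for targets in leaf_targets_per_tree:
        weight = 1.0 / (n_trees * len(targets))  # equal weight within the leaf
        weighted.extend((value, weight) for value in targets)
    weighted.sort(key=lambda pair: pair[0])
    cumulative = 0.0
    for value, weight in weighted:
        cumulative += weight
        if cumulative >= level - 1e-12:  # small tolerance for float accumulation
            return value
    return weighted[-1][0]


# Tree A puts the test row in a one-observation leaf, tree B in a four-observation leaf.
leaves = [[10.0], [1.0, 2.0, 3.0, 4.0]]
q50 = tree_equal_quantile(leaves, 0.50)
q75 = tree_equal_quantile(leaves, 0.75)

# Pooling all five observations with equal weight instead lets the large leaf dominate.
pooled_q75 = tree_equal_quantile([[10.0, 1.0, 2.0, 3.0, 4.0]], 0.75)
```

Here the single observation from tree A carries half the total weight, so the 0.75 quantile is 10.0 under tree-equal weighting but 4.0 under pooled weighting.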
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
Number of trees. |
|
|
yes |
Maximum depth per tree. |
|
|
yes |
Minimum samples per terminal leaf. |
|
|
fixed by preset |
Forest random seed. |
|
|
fixed by preset |
Default levels returned by |
Preset spaces match random_forest. Default model-selection method: random.
macro_random_forest#
macroforecast.models.macro_random_forest(
X,
y,
*,
x_columns=None,
S_columns=None,
x_pos=None,
S_pos=None,
y_pos=0,
B=50,
minsize=10,
mtry_frac=1/3,
min_leaf_frac_of_x=1.0,
VI=False,
ERT=False,
quantile_rate=None,
S_priority_vec=None,
random_x=False,
trend_push=1,
howmany_random_x=1,
howmany_keep_best_VI=20,
cheap_look_at_GTVPs=True,
prior_var=None,
prior_mean=None,
subsampling_rate=0.75,
rw_regul=0.75,
keep_forest=False,
block_size=12,
fast_rw=True,
ridge_lambda=0.1,
HRW=0,
resampling_opt=2,
print_b=False,
parallelise=False,
n_cores=1,
**kwargs,
)
Adapter for Ryan Lucas’s MacroRandomForest reference backend. The reference
implementation is vendored from MacroRandomForest 1.0.6 under the MIT
license, with source attribution in
macroforecast.models._mrf_reference. Install the optional runtime
dependencies with macroforecast[macro_random_forest]. The adapter fits on
the in-sample X/y and calls the reference _ensemble_loop() during
predict(X_test). Repeated calls to predict() with the same test matrix
reuse the previous reference-backend output, so repeated result materialization
does not rerun the expensive forest loop.
If the reference backend returns multiple prediction columns for the requested test rows, the adapter averages them row by row. If the backend returns no recognized prediction field or fewer than the requested number of predictions, the adapter raises a runtime error instead of silently returning a misaligned forecast vector.
By default all columns in X are used both as the time-varying linear equation
variables (x_columns) and the forest state variables (S_columns). Pass
x_columns and S_columns when those sets should differ.
The reference backend distinguishes two predictor sets:
| Argument | Role |
|---|---|
| `x_columns` | Columns in the local linear forecasting equation. These are the variables whose coefficients are allowed to vary over time. |
| `S_columns` | State variables used by the forest to split the sample and estimate those local coefficients. |
Use either column names or reference-style positions for each role. Passing both
x_columns and x_pos, or both S_columns and S_pos, raises an error rather
than silently prioritizing one selector.
For example, a compact MRF can use a small local-linear equation but a wider state vector for the tree:
fit = macroforecast.models.macro_random_forest(
X_train,
y_train,
x_columns=["INDPRO_lag0", "UNRATE_lag0"],
S_columns=[
"INDPRO_lag0",
"UNRATE_lag0",
"CPIAUCSL_lag0",
"FEDFUNDS_lag0",
"S&P500_lag0",
],
B=50,
minsize=10,
mtry_frac=1.0,
ridge_lambda=0.1,
rw_regul=0.75,
parallelise=False,
print_b=False,
)
pred = fit.predict(X_test)
With the forecasting runner, pass model parameters through the model-keyed
params mapping. If you want fixed parameters rather than model-owned tuning,
also disable model selection for this model:
features = macroforecast.feature_engineering.feature_spec(
target="INDPRO",
horizon=1,
predictors=["UNRATE", "CPIAUCSL", "FEDFUNDS", "S&P500"],
lags=(0, 1),
)
window = macroforecast.window.spec(
estimation=macroforecast.window.estimation_expanding(min_size=120),
val=macroforecast.window.val_last_block(size=24),
test=macroforecast.window.test_origins(horizon=1, step=1),
)
result = macroforecast.forecasting.run(
panel,
"macro_random_forest",
window=window,
features=features,
params={
"macro_random_forest": {
"x_columns": ["UNRATE_lag0", "FEDFUNDS_lag0"],
"S_columns": [
"UNRATE_lag0",
"UNRATE_lag1",
"CPIAUCSL_lag0",
"FEDFUNDS_lag0",
"S&P500_lag0",
],
"B": 100,
"minsize": 10,
"mtry_frac": 1.0,
"parallelise": False,
"print_b": False,
}
},
model_selection={"macro_random_forest": None},
)
The reference implementation is sensitive to panel shape. Use numeric,
non-missing features after preprocessing and feature engineering. Keep at
least one x_columns variable, and prefer at least five S_columns variables;
with very small state sets, set mtry_frac=1.0 so at least one state variable
is considered at each split. Small training samples can also fail when
minsize is too large relative to the number of local-linear variables.
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `x_columns` | `None` | fixed by preset | Feature columns in the time-varying linear equation. |
| `S_columns` | `None` | fixed by preset | Feature columns used as forest state variables. |
| `x_pos` | `None` | fixed by preset | Reference-package predictor positions after the target column. |
| `S_pos` | `None` | fixed by preset | Reference-package state positions after the target column. |
| `y_pos` | `0` | fixed by preset | Fixed target position for the separated |
| `B` | `50` | yes | Number of MRF trees. |
| `minsize` | `10` | yes | Minimum node size before split attempts. |
| `mtry_frac` | `1/3` | yes | Fraction of state variables considered at each split. |
| `min_leaf_frac_of_x` | `1.0` | yes | Minimum leaf-size multiplier relative to local x dimension. |
| `VI` | `False` | fixed by preset | Enable variable-importance split search mode. |
| `ERT` | `False` | fixed by preset | Enable extremely randomized tree split mode. |
| `quantile_rate` | `None` | fixed by preset | Optional quantile rate for quantile-oriented output. |
| `S_priority_vec` | `None` | fixed by preset | Optional priority weights over state variables. |
| `random_x` | `False` | fixed by preset | Use random subsets of local-linear predictors. |
| `trend_push` | `1` | fixed by preset | Reference-package trend-push option. |
| `howmany_random_x` | `1` | fixed by preset | Number of random local-linear predictor draws. |
| `howmany_keep_best_VI` | `20` | fixed by preset | Number of best VI candidates retained. |
| `cheap_look_at_GTVPs` | `True` | fixed by preset | Use the reference package's cheaper GTVP inspection. |
| `prior_var` | `None` | fixed by preset | Optional prior variances for local coefficients. |
| `prior_mean` | `None` | fixed by preset | Optional prior means for local coefficients. |
| `subsampling_rate` | `0.75` | yes | Subsample share used by each tree. |
| `rw_regul` | `0.75` | yes | Random-walk shrinkage strength. |
| `keep_forest` | `False` | fixed by preset | Keep full reference forest object in memory. |
| `block_size` | `12` | fixed by preset | Reference-package block size for time-series resampling. |
| `fast_rw` | `True` | fixed by preset | Use fast random-walk regularization path. |
| `ridge_lambda` | `0.1` | yes | Ridge penalty for local linear fits. |
| `HRW` | `0` | fixed by preset | Reference-package hierarchical random-walk option. |
| `resampling_opt` | `2` | yes | Reference MRF resampling option. |
| `parallelise` | `False` | fixed by preset | Whether to use reference-package parallel execution. |
| `n_cores` | `1` | fixed by preset | Worker count for the reference package. |
| `print_b` | `False` | fixed by preset | Reference-package progress printing. |
The MRF presets tune B, minsize, mtry_frac,
min_leaf_frac_of_x, subsampling_rate, rw_regul, ridge_lambda, and
resampling_opt; inspect the exact candidate lists with
describe_model("macro_random_forest").
Volatility Models#
Volatility model fits return VolatilityFit. In addition to
predict_variance(horizon=...), their diagnostics include fitted parameter
estimates under params and the in-sample conditional_volatility path when
the backend exposes it.
These models are for volatility/variance forecasting, not ordinary conditional
mean macro forecasting. They accept a univariate return-like series y and
return both a point-mean prediction interface and a variance forecast interface.
Function |
Implementation |
R/source comparison |
Boundary |
|---|---|---|---|
|
Optional Python |
Same surface as |
Backend controls solver details, distribution aliases, convergence behavior, and forecast internals. |
|
Optional Python |
Same surface as |
Backend controls solver details, distribution aliases, convergence behavior, and forecast internals. |
|
Internal compact Gaussian log-linear MLE. |
Aligned with the p=q=1 Hansen-Huang-Shek / |
Not a full |
Common Output#
fit = macroforecast.models.garch11(y)
variance = fit.predict_variance(horizon=12)
sigma = fit.conditional_volatility
metadata = fit.to_metadata()
Output |
Type |
Meaning |
|---|---|---|
|
|
Fitted wrapper with |
|
|
Conditional mean prediction. For these models this is usually a constant mean from the volatility backend. |
|
|
Variance forecast indexed from |
|
|
In-sample conditional volatility path if available. |
|
|
Fitted parameters. Names depend on the backend/model. |
|
|
Same path as |
garch11#
macroforecast.models.garch11(
y,
*,
X=None,
p=1,
q=1,
mean_model="constant",
dist="normal",
rescale=False,
)
Fits GARCH using the optional arch package. Requires macroforecast[arch].
The default is GARCH(1,1):
$$\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$$
For higher p/q, the lag orders are passed directly to
arch.arch_model(vol="GARCH", p=p, q=q).
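For intuition, the GARCH(1,1) recursion, in which next-period variance equals `omega + alpha * eps**2 + beta * sigma2`, can be run by hand. This sketch is independent of the `arch` backend: the function names are hypothetical, and starting the in-sample path at the unconditional variance is an assumption made here for simplicity, not the backend's initialization.

```python
def garch11_variance_path(residuals, omega, alpha, beta):
    """In-sample conditional variances, started at the unconditional variance."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


def garch11_forecast(last_resid, last_sigma2, omega, alpha, beta, horizon):
    """Variance forecasts for steps 1..horizon from the last residual and variance."""
    forecasts = [omega + alpha * last_resid ** 2 + beta * last_sigma2]
    for _ in range(horizon - 1):
        # Beyond one step, E[eps^2] equals the forecast variance itself.
        forecasts.append(omega + (alpha + beta) * forecasts[-1])
    return forecasts


path = garch11_variance_path([1.0, 2.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
forecast = garch11_forecast(last_resid=0.0, last_sigma2=1.0,
                            omega=0.1, alpha=0.1, beta=0.8, horizon=3)
```

The forecasts move geometrically toward the unconditional variance `omega / (1 - alpha - beta)` at rate `alpha + beta`, which is why long-horizon variance forecasts flatten out.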
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
GARCH innovation lag order. |
|
|
yes |
GARCH variance lag order. |
|
|
manual |
Conditional mean model. |
|
|
yes |
Innovation distribution. |
|
|
fixed by preset |
|
Preset |
|
|
|
|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Implementation notes:
Item |
Value |
|---|---|
Backend |
|
Required extra |
|
R comparison |
|
Internal likelihood? |
No. macroforecast validates orders, passes inputs to |
Minimum data |
30 non-missing observations. |
egarch#
macroforecast.models.egarch(
y,
*,
X=None,
p=1,
o=0,
q=1,
mean_model="constant",
dist="normal",
rescale=False,
)
Fits EGARCH using the optional arch package. Requires macroforecast[arch].
The backend receives:
arch.arch_model(y, vol="EGARCH", p=p, o=o, q=q, mean=mean_model, dist=dist)
For EGARCH(1,1), the log-variance structure is backend-defined by arch;
conceptually it is the exponential GARCH family where log variance reacts to
standardized shock magnitude and asymmetry terms rather than modeling variance
directly in levels.
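A one-step sketch of the EGARCH(1,1) log-variance update illustrates the two terms mentioned above. It assumes the form documented for the `arch` package, where the magnitude term is centered by `sqrt(2 / pi)` (the expected absolute value of a standard normal shock) and `gamma` is the asymmetry coefficient that is only present when `o > 0`. The function name is hypothetical and the backend remains the authority on the exact specification.

```python
import math


def egarch11_log_variance_step(log_sigma2_prev, std_resid_prev, omega, alpha, beta, gamma=0.0):
    """Next-period log variance from the previous log variance and standardized shock."""
    magnitude = alpha * (abs(std_resid_prev) - math.sqrt(2.0 / math.pi))
    asymmetry = gamma * std_resid_prev  # zero when o=0, the default in this callable
    return omega + magnitude + asymmetry + beta * log_sigma2_prev


# A shock of exactly average magnitude leaves only omega + beta * previous log variance.
neutral = egarch11_log_variance_step(0.0, math.sqrt(2.0 / math.pi),
                                     omega=0.1, alpha=0.2, beta=0.9)

# With a negative gamma, a negative shock raises log variance more than a positive one.
after_negative = egarch11_log_variance_step(0.0, -1.0, omega=0.1, alpha=0.2, beta=0.9, gamma=-0.1)
after_positive = egarch11_log_variance_step(0.0, 1.0, omega=0.1, alpha=0.2, beta=0.9, gamma=-0.1)
```

Because the recursion is in logs, the implied variance `exp(log_sigma2)` stays positive without parameter sign constraints.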
Parameter |
Default |
Tunable |
Meaning |
|---|---|---|---|
|
|
yes |
EGARCH innovation lag order. |
|
|
yes |
Asymmetric innovation lag order. |
|
|
yes |
EGARCH variance lag order. |
|
|
manual |
Conditional mean model. |
|
|
yes |
Innovation distribution. |
|
|
fixed by preset |
|
Preset |
|
|
|
|
|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Implementation notes:
Item |
Value |
|---|---|
Backend |
|
Required extra |
|
R comparison |
|
Internal likelihood? |
No. macroforecast validates orders, passes inputs to |
Minimum data |
30 non-missing observations. |
realized_garch#
macroforecast.models.realized_garch(
y,
*,
X=None,
rv=None,
realized_variance=None,
max_iter=2000,
n_starts=5,
random_state=0,
)
Fits a compact p=q=1 Gaussian log-linear realized-GARCH joint likelihood.
Provide rv directly or set realized_variance to the column in X
containing the realized measure. If neither is supplied, macroforecast uses
y ** 2 as an explicit rv_proxy so the caller can still inspect the model
contract. For empirical realized-GARCH work, pass a true realized variance or
realized volatility measure.
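The implemented state and measurement equations are:
$$\log h_t = \omega + \beta \log h_{t-1} + \alpha \log x_{t-1}$$
$$\log x_t = \xi + \delta \log h_t + \tau(z_t) + u_t, \qquad \tau(z_t) = \eta_1 z_t + \eta_2 (z_t^2 - 1)$$
where $h_t$ is the latent conditional variance, $x_t$ is the realized measure, $z_t$ is the standardized return shock, $u_t$ is Gaussian measurement noise, and the state intercept is written $\omega$ here.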
This matches the compact rugarch realGARCH recursion in
rugarch/src/filters.c::realgarchfilter() for the p=q=1 case:
lagged log realized volatility enters through alpha, lagged log latent
variance enters through beta, and the measurement equation uses
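xi, delta, eta1, and eta2. The stationarity-style persistence diagnostic
is:
$$\beta + \alpha\,\delta$$
which is the coefficient on lagged log latent variance after the expected measurement equation is substituted into the state equation.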
The multi-step variance forecast uses the conditional expectation recursion:
the first step uses the latest observed realized measure, then future
$\tau(z_t)$ and measurement shocks are set to zero so
$E[\log x_t \mid h_t] = \xi + \delta \log h_t$.
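The forecast recursion can be sketched directly from that description. The sketch assumes the log-linear state equation `log h_t = omega + beta * log h_{t-1} + alpha * log x_{t-1}`, where the intercept name `omega` and the function name are assumptions. Exponentiating the expected log variance is a simplification; whether macroforecast applies any further adjustment is not stated here.

```python
import math


def realized_garch_variance_forecast(log_h_last, log_x_last, omega, alpha, beta,
                                     xi, delta, horizon):
    """Variance forecasts for steps 1..horizon from the last latent and realized values."""
    # Step 1 uses the latest observed realized measure.
    log_h = omega + beta * log_h_last + alpha * log_x_last
    path = [log_h]
    for _ in range(horizon - 1):
        # Later steps replace the unobserved measure with its conditional expectation.
        expected_log_x = xi + delta * log_h
        log_h = omega + beta * log_h + alpha * expected_log_x
        path.append(log_h)
    return [math.exp(value) for value in path]


forecast = realized_garch_variance_forecast(
    log_h_last=0.0, log_x_last=1.0,
    omega=0.0, alpha=0.3, beta=0.6, xi=0.0, delta=1.0, horizon=3,
)
```

After the first step the log variance follows a first-order recursion with slope `beta + alpha * delta` (0.9 in this example), the same quantity used as the persistence diagnostic.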
| Parameter | Default | Tunable | Meaning |
|---|---|---|---|
| `realized_variance` | `None` | manual | Column name for realized variance. |
| `max_iter` | `2000` | fixed by preset | Optimizer iteration cap. |
| `n_starts` | `5` | yes | Number of optimizer starting points. |
| `random_state` | `0` | fixed by preset | Optimizer random seed. |
Preset |
|
|---|---|
|
|
|
|
|
|
Implementation notes:
Item |
Value |
|---|---|
Backend |
Internal SciPy |
Required extra |
None beyond base SciPy stack. |
R comparison |
Compact p=q=1 version of |
Parameter names |
|
Restrictions |
|
Minimum data |
30 aligned observations of |
Example:
fit = macroforecast.models.realized_garch(
returns,
rv=realized_variance,
max_iter=2000,
n_starts=5,
random_state=0,
)
fit.diagnostics["params"]
fit.predict_variance(horizon=12)
Omitted From The Clean Model API#
Legacy name |
Decision |
|---|---|
|
Removed. Use |
|
Removed. Use |
- var_select_order: VAR lag-order selection by AIC/BIC/HQ/FPE (vars::VARselect), via statsmodels VAR.select_order.
- gjr_garch: GJR-GARCH (Glosten-Jagannathan-Runkle) asymmetric/leverage volatility (arch GARCH, o>0; rugarch gjrGARCH).
- tgarch: Threshold GARCH (TGARCH/Zakoian), absolute-value (power=1) asymmetric volatility.
- risk_forecast: Value-at-Risk and Expected Shortfall forecast from a fitted volatility model (Normal / standardized-t).
- value_at_risk: lower-tail VaR return quantile(s) from a fitted volatility model.
- expected_shortfall: Expected Shortfall (mean return below VaR) from a fitted volatility model.
- news_impact_curve: Engle-Ng (1993) news impact curve: conditional variance vs lagged shock for a fitted GARCH-family model.
- garch_roll: rolling one-step volatility / VaR backtest with periodic refit and coverage summary (rugarch::ugarchroll).
- var_roots: VAR stability check: moduli of the companion-matrix eigenvalues, spectral radius, and is_stable (vars::roots).
- var_restrict: restricted VAR by sequential elimination of insignificant regressors with restriction matrix (vars::restrict).
- arima: (seasonal) ARIMA model via statsmodels, order (p,d,q) and seasonal_order (P,D,Q,m).
- auto_arima: automatic (seasonal) ARIMA order selection (forecast::auto.arima): KPSS-based d, AICc grid over (p,q[,P,Q]).
arima#
macroforecast.models.arima(y, *, order=(1, 0, 0), seasonal_order=(0, 0, 0, 0), trend=None)
auto_arima#
macroforecast.models.auto_arima(y, *, max_p=5, max_q=5, max_d=2, seasonal=False, m=1, ic="aicc", trend=None)
gjr_garch#
macroforecast.models.gjr_garch(y, *, X=None, p=1, o=1, q=1, mean_model="constant", dist="normal", rescale=False)
tgarch#
macroforecast.models.tgarch(y, *, X=None, p=1, o=1, q=1, mean_model="constant", dist="normal", rescale=False)