# macroforecast.models

[Back to reference](index.md)

`macroforecast.models` contains direct callable model fits. Each function accepts pandas data, fits immediately, and returns a fitted result object with `predict()`.

`lasso_path` is intentionally not a public model family. Use `lasso()` with a chosen `alpha`, or use `get_model("lasso")` with `model_selection.select_params()` to choose `alpha` from the lasso-owned search space.

## Return Objects

Model functions return `ModelFit`.

```python
macroforecast.models.ModelFit(
    estimator,
    model,
    feature_names=(),
    target_name=None,
    metadata={},
    diagnostics={},
)
```

| Attribute | Type | Description |
| --- | --- | --- |
| `estimator` | object | Fitted underlying estimator. |
| `model` | str | Canonical model name. |
| `feature_names` | tuple[str, ...] | Feature columns used at fit time. |
| `target_name` | str or `None` | Target name when available. |
| `metadata` | dict | Fit metadata such as `n_obs`, `alpha`, or tree budget. |
| `diagnostics` | dict | Model-specific fitted diagnostics that can be collected safely. |

`ModelFit.predict(X)` returns a `pandas.Series` named `"prediction"` and keeps the index of the provided `X` when `X` is a DataFrame.

For custom fitted objects used in `forecasting.run(...)`, `predict(X_test)` may return an array-like object or a pandas `Series`/single-column `DataFrame`. Array-like output is treated positionally. Pandas output must either be indexed exactly like `X_test.index` or use the default positional index `RangeIndex(len(X_test))`. Any other index raises an error rather than silently creating missing forecasts. The same DataFrame index rule applies to `predict_quantiles(X_test)` when a fitted object exposes quantile forecasts.

`ModelFit.to_dict()` returns JSON-ready fit metadata. It records the canonical model name, the underlying estimator class name, the fitted feature names, the target name, `n_features`, the fit `metadata`, and JSON-ready `diagnostics`.
It does not serialize the fitted estimator itself.

```python
fit = macroforecast.models.ridge(X, y, alpha=0.5)
fit.to_dict()
```

Output shape:

```python
{
    "model": "ridge",
    "estimator": "sklearn.linear_model._ridge.Ridge",
    "feature_names": ["x1", "x2"],
    "target_name": "y",
    "n_features": 2,
    "metadata": {"n_obs": 120, "alpha": 0.5},
    "diagnostics": {
        "fitted_values": {"name": "fitted", "index": [...], "data": [...]},
        "residuals": {"name": "residual", "index": [...], "data": [...]},
        "metrics": {"n": 120, "mae": 0.04, "mse": 0.003, "rmse": 0.055},
        "coefficients": {"name": "coefficient", "index": ["x1", "x2"], "data": [...]},
        "selected_features": ["x1", "x2"],
    },
}
```

`ModelFit.to_metadata()` wraps the same block under `{"model": ..., "fit": ...}` for downstream forecasting and result records.

Volatility functions return `VolatilityFit`, which extends `ModelFit` with `predict_variance(horizon=1)` and `conditional_volatility`.

Some model families expose backend estimator classes as public symbols. These are useful when users need estimator-native attributes, custom wrappers, or type checks; the normal user entry point remains the lowercase fit function.

| Estimator class | Fit function | Meaning |
| --- | --- | --- |
| `MARSRegressor` | `mars(...)` | Internal MARS-style spline regressor. |
| `LGBPlusRegressor` | `lgb_plus(...)` | Competition-based LGB+ hybrid tree/linear boosting backend. |
| `LGBAPlusRegressor` | `lgba_plus(...)` | Alternating LGB^A+ hybrid tree/linear boosting backend. |
| `QuantileRegressionForestRegressor` | `quantile_regression_forest(...)` | Quantile forest backend. |
| `MacroRandomForestRegressor` | `macro_random_forest(...)` | Macro Random Forest backend. |
| `SupervisedPCARegressor` | `supervised_pca(...)` | Supervised PCA backend. |
| `SupervisedScaledPCARegressor` | `supervised_scaled_pca(...)` | Supervised scaled-PCA backend. |
| `GARCHEstimator` | `garch11(...)` and `egarch(...)` | ARCH/GARCH volatility backend. |
| `RealizedGARCHEstimator` | `realized_garch(...)` | Realized-GARCH backend. |
| `SupervisedAggregationRegressor` | `supervised_aggregation(...)` and wrappers | Generic assemblage/Albacore-style constrained aggregation backend. |

### Fit Persistence

`models` owns the low-level persistence format for fitted model objects. Forecasting runners decide which fitted object should be saved and which window, selection, and parameter metadata should be attached.

```python
saved = macroforecast.models.save_fit(
    fit,
    "trained_model/ridge/origin_0_h1_20000131.pkl",
    metadata={
        "alias": "ridge",
        "params": {"alpha": 0.1},
        "model_selection": selection_metadata,
    },
)
loaded = macroforecast.models.load_fit(saved.model_path)
```

| Function | Input | Output | Meaning |
| --- | --- | --- | --- |
| `save_fit(fit, model_path, metadata_path=None, metadata=None)` | Fitted model object and output paths. | `SavedModel` | Writes pickle plus JSON sidecar. |
| `load_fit(model_path)` | Pickle path. | fitted object | Loads a saved fit. |

`SavedModel.to_dict()` returns `model_path`, `metadata_path`, and `save_error`. If a custom/local model cannot be pickled, `model_path` is `None`, `save_error` records the failure, and the JSON sidecar is still written. The sidecar always includes the available `fit.to_metadata()` block when the fit exposes it.

### Fit Diagnostics

Diagnostics are collected on a best-effort basis. A model only records values that the fitted backend exposes and that can be computed without changing the fit. Missing keys mean the model does not expose that diagnostic, not that the fit failed.

Common keys:

| Key | Recorded when | Meaning |
| --- | --- | --- |
| `fitted_values` | Estimator exposes `predict()` on the training matrix. | In-sample fitted values indexed like the aligned target. |
| `residuals` | `fitted_values` is available. | Training residuals, `y - fitted`. |
| `metrics` | `residuals` is available. | Residual count, mean, standard deviation, MAE, MSE, and RMSE. |
| `coefficients` | Estimator exposes `coef_`. | Coefficients indexed by feature name when possible. |
| `intercept` | Estimator exposes `intercept_`. | Scalar or list intercept. |
| `selected_features` | Nonzero coefficients or estimator selection metadata is available. | Selected feature names. |
| `feature_importance` | Estimator exposes `feature_importances_`. | Tree-style importances sorted descending. |
| `factor_loadings` | Estimator exposes factor loadings, loadings, or components. | Factor/PCA loading matrix when available. |
| `component_selected_features` | Estimator exposes component-level selection metadata. | Selected source features for supervised PCA-style components. |
| `training_history` | Estimator records iterative training history. | Epoch loss or backend-specific training trace. |
| `conditional_volatility` | Volatility estimator exposes fitted conditional volatility. | In-sample conditional volatility path. |
| `params` | Volatility estimator exposes fitted parameter estimates. | Fitted volatility-model parameters. |

Example:

```python
fit = macroforecast.models.random_forest(X, y, n_estimators=200)
fit.diagnostics["feature_importance"].head()
```

## Model Specs And Hyperparameter Spaces

Model functions fit immediately. Model specs are the model-selection objects: they keep the fit callable together with model-owned defaults, tunable parameters, and preset search spaces.
```python
model = macroforecast.models.get_model("lasso", preset="standard")
result = macroforecast.model_selection.select_params(
    model,
    X,
    y,
    window=macroforecast.window.expanding(min_train_size=120),
    metric=macroforecast.metrics.rmse,
)
fit = model(X, y, **result.best_params)
```

### ModelSpec

```python
macroforecast.models.ModelSpec(
    name,
    family,
    fit_func,
    default_params={},
    parameters=(),
    search_spaces={},
    default_search_method="grid",
    default_preset="standard",
    input_kind="supervised",
    preset="standard",
    params={},
    backend="internal",
    requires_extra=None,
    requires_scaling=False,
    recommended_preprocessing=(),
)
```

| Attribute | Type | Meaning |
| --- | --- | --- |
| `name` | str | Canonical model name. |
| `family` | str | Model family such as `linear`, `tree`, `factor`, or `volatility`. |
| `fit_func` | callable | Underlying fit function. |
| `default_params` | dict | Model-owned default keyword arguments. |
| `parameters` | tuple | `ModelParameter` descriptions. |
| `search_spaces` | dict | Preset-specific hyperparameter candidates. |
| `default_search_method` | str | Search method normally used for the model. |
| `default_preset` | str | Default hyperparameter preset. |
| `input_kind` | str | Input convention: `supervised`, `target`, `panel`, or `volatility`. |
| `preset` | str | Active search-space preset. |
| `params` | dict | User-fixed model parameters. |
| `backend` | str | Implementation backend, for example `sklearn.svm.SVR` or `torch.nn.LSTM`. |
| `requires_extra` | str or `None` | Optional dependency extra required to fit the model. |
| `requires_scaling` | bool | Whether the model is scale-sensitive and expects explicit preprocessing. |
| `recommended_preprocessing` | tuple[str, ...] | Short preprocessing notes attached to metadata. |

`ModelSpec` is callable:

```python
model = macroforecast.models.get_model("ridge", params={"alpha": 0.5})
fit = model(X, y)
```

`ModelSpec.to_dict()` returns a detailed JSON-ready specification including defaults, fixed params, parameter descriptions, and all preset search spaces.

`ModelSpec.to_metadata()` returns the compact runner-facing block:

```python
{
    "model": "ridge",
    "model_family": "linear",
    "model_preset": "small",
    "input_kind": "supervised",
    "backend": "sklearn.linear_model.Ridge",
    "requires_extra": None,
    "requires_scaling": False,
    "recommended_preprocessing": [],
    "default_search_method": "cv_path",
    "default_params": {"alpha": 1.0},
    "params": {"alpha": 0.5},
    "search_space": {"alpha": [0.01, 0.1, 1.0]},
}
```

### get_model

```python
macroforecast.models.get_model(model, *, preset=None, params=None)
```

| Input | Type | Meaning |
| --- | --- | --- |
| `model` | str, callable, or `ModelSpec` | Model name, registered model callable, or existing spec. |
| `preset` | str or `None` | Search-space preset to attach. |
| `params` | dict or `None` | Fixed model parameters to attach. |

| Output | Type | Meaning |
| --- | --- | --- |
| return | `ModelSpec` | Callable model spec with model-owned defaults and spaces. |

### custom_model

Build a user-owned `ModelSpec` without registering a package model.

```python
macroforecast.models.custom_model(
    name: str,
    fit_func,
    *,
    family: str = "custom",
    default_params: Mapping[str, object] | None = None,
    parameters: tuple[ModelParameter, ...] = (),
    search_spaces: dict[str, dict[str, tuple[object, ...]]] | None = None,
    default_search_method: str = "grid",
    default_preset: str = "standard",
    input_kind: str = "supervised",
    backend: str = "custom",
    requires_extra: str | None = None,
    requires_scaling: bool = False,
    recommended_preprocessing: tuple[str, ...] = (),
    description: str | None = None,
) -> ModelSpec
```

### Callable Contract

The default supervised contract is:

```python
fit_func(X: pandas.DataFrame, y: pandas.Series, **params) -> fitted_object
```

The fitted object must expose:

```python
fitted_object.predict(X_test)
```

`predict(X_test)` may return a pandas `Series`, a single-column `DataFrame`, or an array-like object with length `len(X_test)`. Pandas output must either use `X_test.index` or `RangeIndex(len(X_test))`; any other index is rejected by `forecasting.run(...)`.

Set `input_kind` when the custom model follows another convention:

| `input_kind` | Fit callable receives | Use case |
| --- | --- | --- |
| `"supervised"` | `fit_func(X, y, **params)` | Regression-style models. |
| `"target"` | `fit_func(y, **params)` | Target-only time-series models. |
| `"panel"` | `fit_func(panel, **params)` | Panel-input models. |
| `"volatility"` | `fit_func(y, X=None, **params)` | Volatility or density models. |

`search_spaces` uses the same model-owned preset contract as registered models:

```python
model = mf.models.custom_model(
    "mean_model",
    mean_model,
    default_params={"offset": 0.0},
    search_spaces={
        "small": {"offset": (-0.1, 0.0, 0.1)},
        "standard": {"offset": (-0.5, 0.0, 0.5)},
    },
)
result = mf.forecasting.run(
    panel,
    {"mean": model},
    window=window,
    features=features,
    preset={"mean": "small"},
)
```

`custom_model()` does not mutate the global registry. Pass the returned `ModelSpec` directly to `forecasting.run(...)`, `model_selection.select_params(...)`, or `model_search_space(...)`.

### list_model_specs

```python
macroforecast.models.list_model_specs(family=None)
```

Returns a DataFrame with one row per registered model: `name`, `family`, `input_kind`, `backend`, `requires_extra`, `requires_scaling`, `recommended_preprocessing`, `default_search_method`, `default_preset`, available `presets`, and `n_tunable`.
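
Because the result is an ordinary DataFrame, it can be filtered and grouped with standard pandas operations. The sketch below does not call the library: `specs` is a hand-built stand-in that reproduces only four of the documented columns, and its rows are illustrative values rather than captured output.

```python
import pandas as pd

# Hand-built stand-in for the frame returned by list_model_specs().
# Only a subset of the documented columns is reproduced, and the rows are
# illustrative values, not output captured from the library.
specs = pd.DataFrame(
    [
        {"name": "ridge", "family": "linear", "input_kind": "supervised", "requires_scaling": False},
        {"name": "svr", "family": "support_vector", "input_kind": "supervised", "requires_scaling": True},
        {"name": "linear_svr", "family": "support_vector", "input_kind": "supervised", "requires_scaling": True},
        {"name": "ar", "family": "timeseries", "input_kind": "target", "requires_scaling": False},
    ]
)

# Names of models that expect explicit scaling before fitting.
needs_scaling = specs.loc[specs["requires_scaling"], "name"].tolist()

# Number of models per input convention.
by_input_kind = specs.groupby("input_kind")["name"].count().to_dict()

print(needs_scaling)
print(by_input_kind)
```

The same boolean-mask and `groupby` patterns should carry over to the real frame as long as it exposes the column names listed above.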
### describe_model

```python
macroforecast.models.describe_model(model)
```

Returns a DataFrame with parameter-level documentation and preset search spaces.

Example:

| parameter | default | tunable | small_space | standard_space |
| --- | --- | --- | --- | --- |
| `alpha` | `1.0` | `True` | `(0.01, 0.1, 1.0)` | `(0.001, 0.01, 0.1, 1.0, 10.0)` |
| `max_iter` | `20000` | `False` | `None` | `None` |

### model_search_space

```python
macroforecast.models.model_search_space(model, *, preset=None)
```

Returns the model-owned candidate dictionary for the selected preset. `MODEL_SPECS` is the public registry backing `get_model(...)`, `list_model_specs()`, `describe_model(...)`, and `model_search_space(...)`.

```python
macroforecast.models.model_search_space("random_forest", preset="small")
```

Output:

```python
{
    "n_estimators": (50, 100),
    "max_depth": (3, 5, None),
    "min_samples_leaf": (1, 3),
}
```

### Presets

| Preset | Purpose |
| --- | --- |
| `small` | Fast smoke tests and short interactive checks. |
| `standard` | Default analysis-scale search space. |
| `wide` | Larger search space for more expensive runs. |

## Input And Output Conventions

| Input kind | Callable shape | Use case |
| --- | --- | --- |
| `supervised` | `model(X, y, **params)` | Most regression, factor, and tree models. Fit-time ensembles use the same shape in `macroforecast.model_ensemble`. |
| `target` | `model(y, **params)` | Univariate target-only models such as `ar`. |
| `panel` | `model(panel, **params)` | Multivariate time-series models such as `var`. |
| `volatility` | `model(y, X=None, **params)` | Return and volatility models. |

For supervised models, `X` may be a pandas DataFrame, a 2-D array, or a `FeatureSet`. `y` may be a Series, 1-D array, or one-column DataFrame. If `X` is a `FeatureSet`, `y` can be omitted.

All non-volatility model functions return `ModelFit`. Volatility functions return `VolatilityFit`.
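
As a minimal sketch of the `supervised` shape in the table above, the toy fit callable below takes `(X, y, **params)` and returns an object whose `predict()` keeps the index of the frame it receives. The name `mean_model` echoes the `custom_model` example earlier on this page, but the body, the `MeanFit` class, and the `offset` semantics are assumptions made for illustration, not the library's implementation.

```python
import numpy as np
import pandas as pd


class MeanFit:
    """Hypothetical fitted object: predicts one constant level for every row."""

    def __init__(self, level: float):
        self.level = level

    def predict(self, X_test: pd.DataFrame) -> pd.Series:
        # Reuse X_test.index so forecasts align with the requested rows.
        return pd.Series(self.level, index=X_test.index, name="prediction")


def mean_model(X: pd.DataFrame, y: pd.Series, *, offset: float = 0.0) -> MeanFit:
    """Supervised-shape fit callable: fit_func(X, y, **params)."""
    # X is accepted to satisfy the contract but ignored by this toy model.
    return MeanFit(float(np.mean(y)) + offset)


index = pd.Index(["2000-01", "2000-02", "2000-03", "2000-04"], name="date")
X = pd.DataFrame({"x1": [1.0, 2.0, 3.0, 4.0]}, index=index)
y = pd.Series([1.0, 2.0, 3.0, 6.0], index=index, name="y")

# Fit on the first three rows, forecast the last one.
fit = mean_model(X.iloc[:3], y.iloc[:3], offset=0.5)
pred = fit.predict(X.iloc[3:])
print(pred)
```

Returning a Series indexed like `X_test` is the simplest way to satisfy the index rule; a plain NumPy array of length `len(X_test)` would be treated positionally instead.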
## Scaling Policy

The clean model API does not silently standardize predictors for models that are traditionally scale-sensitive. Instead, those models advertise `requires_scaling=True` through `ModelSpec`, `list_model_specs()`, `model_search_space()`, and `select_params()` metadata. `lasso(..., standardize=True)` and `elastic_net(..., standardize=True)` are explicit opt-in replication helpers; the default remains `False`.

There are two different scaling locations:

- `macroforecast.preprocessing.standardize_panel()` standardizes a panel before model fitting. If it is run on the full sample outside the forecasting runner, it uses full-sample moments. In a leakage-safe run, use runner preprocessing specs and window policies so the scaling state is fitted only on allowed rows.
- `model(..., standardize=True)` standardizes inside that model's own fit call. It is useful when only selected models need model-local scaling, or when a lasso/elastic-net replication requires the penalty grid to be defined on window-local standardized predictors.

For broader model-specific transformations beyond scaling, run separate model pipelines or use a model-pipeline runner layer rather than hiding those transformations inside a single estimator.

Current scale-sensitive callable models:

| Model | Backend | Scaling policy |
| --- | --- | --- |
| `svr` | `sklearn.svm.SVR` | Standardize predictors with `preprocessing.standardize_panel()` or a runner preprocessing spec before fitting. |
| `linear_svr` | `sklearn.svm.LinearSVR` | Standardize predictors before fitting. |
| `nu_svr` | `sklearn.svm.NuSVR` | Standardize predictors before fitting. |
| `nn` | `torch.nn.Sequential` | Standardizes `X` and `y` inside each fit window and maps predictions back to target units. |
| `transformer` | `torch.nn.TransformerEncoder` | Standardizes `X` and `y` inside each fit window and maps predictions back to target units. |
| `hemisphere_nn` | torch dual-head dense network | Standardizes `X` inside each fit window, fits mean and variance heads, and returns point, variance, and normal-approximation quantile forecasts. |
| `density_hnn` | torch-native Aionx DensityHNN port | Standardizes `X` and `y`, estimates prior-DNN OOB volatility emphasis, fits a density HNN ensemble, and returns point, variance, volatility, and quantile forecasts. |

`nn`, `lstm`, `gru`, `transformer`, and `density_hnn` standardize `X` and `y` inside each fit window and map predictions back to the target scale. `hemisphere_nn` standardizes `X` and keeps the target in original units because its variance head is a compact density-forecast object. Because these neural models scale internally, their metadata records `requires_extra="deep"` and `requires_scaling=False`.

## Registered Model Catalog

| Model | Family | Input kind | Default search | Presets |
| --- | --- | --- | --- | --- |
| `ols` | linear | supervised | `grid` | none |
| `ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `nonneg_ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `shrink_to_target_ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `fused_difference_ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `supervised_aggregation` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `component_aggregation` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `rank_aggregation` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `assemblage_regression` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `albacore_components` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `albacore_ranks` | assemblage | supervised | `cv_path` | `small`, `standard`, `wide` |
| `random_walk_ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `tvp_ridge` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `lasso` | linear | supervised | `cv_path` | `small`, `standard`, `wide` |
| `elastic_net` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `adaptive_lasso` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `adaptive_elastic_net` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `group_lasso` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `sparse_group_lasso` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `bayesian_ridge` | linear | supervised | `grid` | none |
| `huber` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `kernel_ridge` | nonparametric | supervised | `random` | `small`, `standard`, `wide` |
| `knn` | nonparametric | supervised | `random` | `small`, `standard`, `wide` |
| `glmboost` | linear | supervised | `grid` | `small`, `standard`, `wide` |
| `svr` | support_vector | supervised | `random` | `small`, `standard`, `wide` |
| `linear_svr` | support_vector | supervised | `random` | `small`, `standard`, `wide` |
| `nu_svr` | support_vector | supervised | `random` | `small`, `standard`, `wide` |
| `nn` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `lstm` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `gru` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `transformer` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `hemisphere_nn` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `density_hnn` | neural | supervised | `random` | `small`, `standard`, `wide` |
| `pls` | composite | supervised | `grid` | `small`, `standard`, `wide` |
| `scaled_pca` | composite | supervised | `grid` | `small`, `standard`, `wide` |
| `supervised_pca` | composite | supervised | `grid` | `small`, `standard`, `wide` |
| `supervised_scaled_pca` | composite | supervised | `grid` | `small`, `standard`, `wide` |
| `ar` | timeseries | target | `grid` | `small`, `standard`, `wide` |
| `var` | timeseries | panel | `grid` | `small`, `standard`, `wide` |
| `bvar_minnesota` | timeseries | panel | `grid` | `small`, `standard`, `wide` |
| `bvar_normal_inverse_wishart` | timeseries | panel | `grid` | `small`, `standard`, `wide` |
| `ets` | timeseries | target | `grid` | none |
| `holt_winters` | timeseries | target | `grid` | none |
| `theta_method` | timeseries | target | `grid` | none |
| `dfm_mixed_mariano_murasawa` | mixed_frequency | panel | `grid` | `small`, `standard`, `wide` |
| `midas_almon` | mixed_frequency | supervised | `grid` | `small`, `standard`, `wide` |
| `midas_beta` | mixed_frequency | supervised | `grid` | `small`, `standard`, `wide` |
| `midas_step` | mixed_frequency | supervised | `grid` | `small`, `standard`, `wide` |
| `restricted_midas` | mixed_frequency | supervised | `grid` | `small`, `standard`, `wide` |
| `unrestricted_midas` | mixed_frequency | supervised | `grid` | `small`, `standard`, `wide` |
| `dfm_unrestricted_midas` | mixed_frequency | panel | `grid` | `small`, `standard`, `wide` |
| `far` | factor | supervised | `grid` | `small`, `standard`, `wide` |
| `favar` | factor | supervised | `grid` | `small`, `standard`, `wide` |
| `decision_tree` | tree | supervised | `grid` | `small`, `standard`, `wide` |
| `random_forest` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `extra_trees` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `gradient_boosting` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `mars` | spline | supervised | `random` | `small`, `standard`, `wide` |
| `xgboost` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `lightgbm` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `lgb_plus` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `lgba_plus` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `catboost` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `quantile_regression_forest` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `macro_random_forest` | tree | supervised | `random` | `small`, `standard`, `wide` |
| `garch11` | volatility | volatility | `grid` | `small`, `standard`, `wide` |
| `egarch` | volatility | volatility | `grid` | `small`, `standard`, `wide` |
| `realized_garch` | volatility | volatility | `grid` | `small`, `standard`, `wide` |

## Linear Models

### Linear implementation map

The `linear` family mixes thin sklearn wrappers, hybrid macroforecast code, and package-native solvers. `backend` in `ModelSpec` records that distinction so metadata exported by `describe_model()`, `list_model_specs()`, and saved `ModelFit` objects is inspectable.

| Model | Implementation class | Runtime backend |
| --- | --- | --- |
| `ols` | external wrapper | `sklearn.linear_model.LinearRegression` |
| `ridge` | external wrapper | `sklearn.linear_model.Ridge` |
| `lasso` | external wrapper | `sklearn.linear_model.Lasso` |
| `elastic_net` | external wrapper | `sklearn.linear_model.ElasticNet` |
| `bayesian_ridge` | external wrapper | `sklearn.linear_model.BayesianRidge` |
| `huber` | external wrapper | `sklearn.linear_model.HuberRegressor` |
| `adaptive_lasso` | hybrid | macroforecast adaptive weights, final `sklearn.linear_model.Lasso` |
| `adaptive_elastic_net` | hybrid | macroforecast adaptive weights, final `sklearn.linear_model.ElasticNet` |
| `nonneg_ridge` | package-native | augmented ridge design solved by `scipy.optimize.nnls` |
| `shrink_to_target_ridge` | package-native | custom objective solved by `scipy.optimize.minimize(method="SLSQP")` |
| `fused_difference_ridge` | package-native | custom difference-penalty objective solved by SLSQP |
| `supervised_aggregation`, `component_aggregation`, `rank_aggregation`, `assemblage_regression`, `albacore_components`, `albacore_ranks` | package-native | Albacore/assemblage-derived constrained aggregation objectives solved by SLSQP |
| `random_walk_ridge` | package-native | expanded time-varying design solved by `numpy.linalg.lstsq` |
| `tvp_ridge` | package-native | Python port of `TVPRidge` R `tvp.ridge`, `Zfun`, `dualGRR`, and CV helpers |
| `group_lasso` | package-native | proximal-gradient group-lasso solver |
| `sparse_group_lasso` | package-native | proximal-gradient sparse-group-lasso solver |
| `glmboost` | package-native | componentwise L2 boosting loop |

`external wrapper` means the statistical estimator is delegated to an external package and macroforecast only standardizes the callable contract, metadata, diagnostics, and persistence. `hybrid` means macroforecast owns the macro-level algorithmic transformation and delegates the final convex solver. `package-native` means the objective, iteration, or coefficient path logic is implemented inside macroforecast, using NumPy/SciPy only for basic numerical linear algebra or generic optimization.

### R source comparison map

The following R sources are the comparison surface for the linear models where macroforecast owns nontrivial logic. These sources are not vendored into the package. They are used as independent algorithm references; the Python code keeps short source cues in comments and implements the corresponding mathematical objective in macroforecast's callable API.

| macroforecast model | R package/source to inspect | Comparison target | Current equivalence status |
| --- | --- | --- | --- |
| `adaptive_lasso` | `glmnet`, `R/glmnet.R`: | Adaptive lasso is expressible in R by computing initial weights and passing them as `penalty.factor` to lasso. | Same fixed-weight objective after macroforecast standardizes `X`, rescales columns, and normalizes weights to mean one; macroforecast fits one chosen `alpha`, while `glmnet` usually builds a lambda path. |
| `adaptive_elastic_net` | `glmnet`, `R/glmnet.R`: | Adaptive elastic net uses the same adaptive weights with elastic-net mixing. | Same fixed-weight idea with mean-one penalty weights; macroforecast delegates the final fit to sklearn `ElasticNet` rather than glmnet's path solver. |
| `nonneg_ridge` | `nnls`, `R/nnls.R`: ; also `glmnet` lower bounds | NNLS solves least squares under coefficient non-negativity. | Equivalent to NNLS on the augmented design `[X; sqrt(alpha) I]` and response `[y; 0]`, after optional centering. |
| `shrink_to_target_ridge` | `penalized`: ; target ridge family in `rags2ridges`: | Compare target-shrinkage/Tikhonov logic, not an identical regression API. | No exact same R regression callable found. macroforecast solves `||y-Xb||^2 + alpha ||b-b0||^2` with optional simplex/nonnegative constraints. |
| `fused_difference_ridge` | fused L2 ridge family in `rags2ridgesFused.R`: | Compare L2 fusion/smoothness penalty structure. | Not identical domain: R source is primarily fused ridge for precision matrices; macroforecast applies an L2 finite-difference penalty directly to regression coefficients. |
| `component_aggregation`, `albacore_components` | `assemblage`, `R/assemblage_v240228.R`: `nonneg.ridge.sum1` | Nonnegative component weights, optional target-weight shrinkage, sum-to-one basket constraint. | Same fixed-alpha objective family as the R CVXR fit: SSE plus feature-std-scaled target shrinkage, `w >= 0`, and `sum(w)=1`. R owns block CV for lambda; macroforecast delegates alpha selection to model selection/forecasting. |
| `rank_aggregation`, `albacore_ranks` | `assemblage`, `R/assemblage_v240228.R`: `x.transformation`, `nonneg.ridge.meanD` | Sort components into rank space, estimate nonnegative smooth rank weights with a mean-matching constraint. | Same fixed-alpha rank objective family: row sorting, fused difference penalty on scaled rank weights, `w >= 0`, and `mean(Xw)=mean(y)`. |
| `supervised_aggregation`, `assemblage_regression` | `assemblage`, `R/assemblage_v240228.R`: `assemblage`, `nonneg.ridge`, `nonneg.ridge.mean`, `nonneg.ridge.sum1`, `nonneg.ridge.meanD` | Generic component/rank supervised aggregation. | Exposes the reusable primitives without requiring inflation data. Paper-specific inflation semantics live in the `albacore_*` wrappers. |
| `random_walk_ridge` | `walker`, `R/walker.R`: | Random-walk coefficients in a time-varying regression. | Same modeling prior idea, different inference: `walker` is Bayesian/state-space via Stan; macroforecast computes the penalized least-squares MAP-style coefficient path and predicts with the final vector. |
| `tvp_ridge` | `TVPRidge`, local source `wiki/raw/paper_code/coulombe_site_github_20260530/tvpridge/R/MV2SRR_v210407.R`; upstream | Goulet Coulombe TVP ridge / two-step ridge regression. | Direct Python port of the R `tvp.ridge` pipeline: `Zfun` expansion, `dualGRR` dual/primal generalized ridge, R-style random-fold CV helpers, 2SRR coefficient-innovation reweighting, residual-volatility reweighting, and R return fields. |
| `group_lasso` | `grpreg`, `R/grpreg.R`: | Group penalty over coefficient blocks. | Same group-lasso penalty family for Gaussian loss; macroforecast uses a single-alpha proximal-gradient solver rather than a full regularization path. |
| `sparse_group_lasso` | `sparsegl`, `R/sparsegl.R`: | Sparse group lasso objective with group and feature-level penalties. | Same penalty decomposition; macroforecast uses one selected `alpha` and `l1_ratio`, while `sparsegl` is a full path solver with additional bounds/families. |
| `glmboost` | `mboost`, `R/mboost.R`: ; component learner in `R/bolscw.R`; Goulet Coulombe et al. (2021) Appendix A.6 for random candidate sampling | Componentwise gradient/L2 boosting. | Same Gaussian componentwise L2 update: center predictors by default, select the base learner by normalized correlation, and apply shrinkage. The paper's per-step random candidate rule is expressed as `candidate_sampling="random"`, `candidate_fraction=1/3`, `candidate_cap=200`, `candidate_rounding="floor"`, not as a hidden preset. macroforecast omits formula handling, weights, families, hat values, and stopping machinery. |

The direct implementations should be reviewed against the objective, scaling, intercept handling, penalty normalization, and solver stopping rule separately. Matching an R package name is not enough: several R implementations solve a path problem, a Bayesian state-space problem, or a matrix-estimation problem, whereas macroforecast exposes a single callable forecasting estimator.

### ols

```python
macroforecast.models.ols(X, y)
```

Fits ordinary least squares.

| Item | Value |
| --- | --- |
| Input | `X`, `y` |
| Output | `ModelFit` |
| Backend | `sklearn.linear_model.LinearRegression` |
| Default params | none |
| Tunable params | none |
| Preset search spaces | none |

### ridge

```python
macroforecast.models.ridge(X, y, *, alpha=1.0)
```

Fits ridge regression with an L2 penalty.

Backend: `sklearn.linear_model.Ridge`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | L2 penalty strength. |

| Preset | `alpha` |
| --- | --- |
| `small` | `(0.01, 0.1, 1.0)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0)` |

Default model-selection method: `cv_path`.

### nonneg_ridge

```python
macroforecast.models.nonneg_ridge(X, y, *, alpha=1.0, fit_intercept=True)
```

Fits ridge regression with coefficients constrained to be non-negative. This uses SciPy NNLS on an augmented ridge design, so it does not require `cvxpy`.

Backend: package-native augmented ridge design plus `scipy.optimize.nnls`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | L2 penalty strength. |
| `fit_intercept` | `True` | fixed by preset | Fit an intercept outside the constrained coefficients. |

Default model-selection method: `cv_path`.

### shrink_to_target_ridge

```python
macroforecast.models.shrink_to_target_ridge(
    X,
    y,
    *,
    alpha=1.0,
    prior_target=None,
    simplex=False,
    nonneg=False,
    fit_intercept=True,
    max_iter=1000,
    tol=1e-9,
)
```

Fits a ridge-type model where coefficients are shrunk toward a user-specified target vector. `prior_target` can be a scalar, a sequence ordered like `X` columns, or a mapping from column name to target coefficient. If `prior_target=None`, the target is zero, except under `simplex=True`, where the target is a uniform coefficient vector. `simplex=True` constrains coefficients to sum to one and uses no intercept; `nonneg=True` also enforces non-negative coefficients. The solver is SciPy SLSQP.

Backend: package-native objective plus `scipy.optimize.minimize(method="SLSQP")`.

R comparison: this is a regression analogue of target-ridge/Tikhonov shrinkage. `rags2ridges` uses target ridge for covariance and precision matrices, not a direct `X, y` regression callable, but the same target idea is present: shrink an estimated parameter object toward a target rather than toward zero. In the unconstrained regression case, macroforecast solves

```text
min_beta ||y - X beta||^2 + alpha ||beta - beta0||^2
```

with closed-form normal equation

```text
(X'X + alpha I) beta = X'y + alpha beta0
```

after optional centering for the intercept. `simplex=True` changes the problem into a forecast-combination form: coefficients must sum to one, no intercept is fit, and `prior_target=None` means a uniform target vector.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Strength of shrinkage toward `prior_target`. |
| `prior_target` | `None` | fixed by preset | Scalar, sequence, mapping, or `None`. |
| `simplex` | `False` | fixed by preset | Constrain coefficients to sum to one. |
| `nonneg` | `False` | fixed by preset | Constrain coefficients to be non-negative. |
| `fit_intercept` | `True` | fixed by preset | Fit an intercept unless `simplex=True`. |
| `max_iter` | `1000` | fixed by preset | SLSQP iteration cap. |
| `tol` | `1e-9` | fixed by preset | SLSQP tolerance. |

Default model-selection method: `cv_path`.

### fused_difference_ridge

```python
macroforecast.models.fused_difference_ridge(
    X,
    y,
    *,
    alpha=1.0,
    difference_order=1,
    mean_equality=False,
    nonneg=False,
    fit_intercept=True,
    max_iter=1000,
    tol=1e-9,
)
```

Fits ridge regression with a finite-difference penalty on adjacent coefficients. This is useful when columns have an ordered meaning such as lag age, maturity, or horizon and neighboring coefficients should vary smoothly. `difference_order=1` penalizes first differences; larger orders penalize higher-order coefficient curvature. `mean_equality=True` adds a conservation-style constraint that the fitted and observed sums match and uses no intercept.

Backend: package-native finite-difference objective plus SLSQP.

R comparison: `rags2ridges::ridgeP.fused` uses a fused L2 penalty for related precision matrices. `fused_difference_ridge()` uses the same penalty idea on a single ordered regression-coefficient vector. With no sign or equality constraints, macroforecast solves

```text
min_beta ||y - X beta||^2 + alpha ||D beta||^2
```

where `D` is the finite-difference matrix over adjacent coefficients. The closed-form normal equation is

```text
(X'X + alpha D'D) beta = X'y
```

after optional centering for the intercept. `mean_equality=True` is a macro-forecasting conservation variant; it constrains fitted and observed sums to match and is intentionally outside the rags2ridges precision-matrix API.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Strength of the smoothness penalty. |
| `difference_order` | `1` | fixed by preset | Finite-difference order applied to coefficients. 
| | `mean_equality` | `False` | fixed by preset | Constrain fitted and observed sums to match. | | `nonneg` | `False` | fixed by preset | Constrain coefficients to be non-negative. | | `fit_intercept` | `True` | fixed by preset | Fit an intercept unless `mean_equality=True`. | | `max_iter` | `1000` | fixed by preset | SLSQP iteration cap. | | `tol` | `1e-9` | fixed by preset | SLSQP tolerance. | Default model-selection method: `cv_path`. ## Assemblage / Supervised Aggregation This family is derived from Goulet Coulombe, Klieber, Barrette, and Goebel, *Maximally Forward-Looking Core Inflation*, and the R package `assemblage`. The package splits the paper model into generic reusable primitives plus thin inflation-specific wrappers. The generic problem is: ```text given components X_t and future aggregate target y_t,h, learn weights w so X_t w predicts y_t,h ``` This is not ordinary ridge in disguise. The weights can be constrained to be nonnegative, sum to one, match the target mean, shrink toward reference basket weights, or vary smoothly across ranks. For inflation, those weights form an Albacore core-inflation measure. Outside inflation, the same functions can aggregate sectors, states, industries, survey items, or regional indicators. ### supervised_aggregation ```python macroforecast.models.supervised_aggregation( X, y, *, space="component", penalty="ridge", alpha=1.0, reference_weights=None, nonneg=True, simplex=False, mean_match=False, difference_order=1, fit_intercept=False, penalty_scale="feature_std", max_iter=1000, tol=1e-9, ) ``` | Parameter | Default | Choices | Meaning | | --- | --- | --- | --- | | `space` | `"component"` | `"component"`, `"rank"` | Use named components or row-wise sorted order statistics. | | `penalty` | `"ridge"` | `"ridge"`, `"target_shrinkage"`, `"fused_difference"` | Coefficient penalty family. | | `alpha` | `1.0` | nonnegative float | Penalty strength; tune with `model_selection` or `forecasting`. 
| | `reference_weights` | `None` | mapping, sequence, `Series`, or `None` | Target weights for `target_shrinkage`. | | `nonneg` | `True` | bool | Enforce `w_j >= 0`. | | `simplex` | `False` | bool | Enforce `sum(w)=1`. | | `mean_match` | `False` | bool | Enforce `mean(Xw)=mean(y)`. | | `difference_order` | `1` | positive int | Difference order for fused rank weights. | | `fit_intercept` | `False` | bool | Fit an intercept outside the aggregation weights when no equality constraint is active. | | `penalty_scale` | `"feature_std"` | `"feature_std"`, `"none"` | Match the R assemblage convention by scaling penalties with feature standard deviations. | Output: `ModelFit`. The fitted estimator exposes `coef_`, `weights_`, and, for rank space, `rank_weight_curve_`. Diagnostics include fitted values, residuals, metrics, and coefficient weights. ### component_aggregation ```python macroforecast.models.component_aggregation( X, y, *, alpha=1.0, reference_weights=None, penalty=None, simplex=True, nonneg=True, penalty_scale="feature_std", max_iter=1000, tol=1e-9, ) ``` Component-space aggregation estimates weights on named columns. With `reference_weights` supplied, `penalty=None` selects `target_shrinkage`, making this the generic version of Albacorecomps. Without reference weights, it is a nonnegative simplex ridge basket. R source cue: `nonneg.ridge.sum1` in `assemblage_v240228.R`. ### rank_aggregation ```python macroforecast.models.rank_aggregation( X, y, *, alpha=1.0, penalty="fused_difference", mean_match=True, nonneg=True, difference_order=1, penalty_scale="feature_std", max_iter=1000, tol=1e-9, ) ``` Rank-space aggregation sorts each row of `X` before fitting, then learns weights on `rank_1`, `rank_2`, ... rather than on named components. This is a generic supervised trimmed-mean model. The fitted object stores `estimator.rank_weight_curve_`, a table with rank, percentile, and weight. R source cue: `x.transformation` plus `nonneg.ridge.meanD`. 
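
The rank-space transformation described above can be illustrated with plain NumPy. This is a sketch for intuition, not macroforecast's implementation: `to_rank_space` and `trimmed_mean_weights` are hypothetical helper names, and the toy numbers are made up. It shows that a fixed trimmed mean is one particular nonnegative weight vector on the sorted columns, which is the object rank-space aggregation estimates from data instead of fixing in advance.

```python
import numpy as np


def to_rank_space(X):
    """Sort each row ascending so column j holds the j-th smallest value."""
    return np.sort(np.asarray(X, dtype=float), axis=1)


def trimmed_mean_weights(n_components, n_trim):
    """Rank weights that drop `n_trim` values from each tail and average the rest."""
    w = np.zeros(n_components)
    w[n_trim:n_components - n_trim] = 1.0 / (n_components - 2 * n_trim)
    return w


# Toy panel: 3 periods, 5 components (illustrative numbers only).
X = np.array([
    [0.2, 5.0, 0.1, 0.3, -4.0],
    [1.0, 0.9, 1.1, 9.0, 0.8],
    [0.0, 0.5, -7.0, 0.4, 0.6],
])

ranks = to_rank_space(X)               # columns play the role of rank_1 ... rank_5
w = trimmed_mean_weights(5, n_trim=1)  # nonnegative and sums to one
core = ranks @ w                       # trimmed mean of each row
```

Here `core` is the per-row average of the three middle order statistics. A learned rank-weight curve replaces the hard zero/constant pattern in `w` with weights chosen to predict the target.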
### assemblage_regression

```python
macroforecast.models.assemblage_regression(
    X,
    y,
    *,
    space="component",
    alpha=1.0,
    reference_weights=None,
    penalty=None,
    max_iter=1000,
    tol=1e-9,
)
```

Convenience wrapper over `component_aggregation()` and `rank_aggregation()`. Use it when the model family is known to be assemblage-style but the final choice between component and rank space is part of the experiment design.

### albacore_components

```python
macroforecast.models.albacore_components(
    X,
    y,
    *,
    reference_weights=None,
    alpha=1.0,
    max_iter=1000,
    tol=1e-9,
)
```

Inflation-specific wrapper for component-space Albacore. `X` should be a panel of price-component changes, `y` should be the forward average headline inflation target, and `reference_weights` should be official basket or expenditure weights when available. The wrapper sets `nonneg=True`, `simplex=True`, `penalty="target_shrinkage"`, and `fit_intercept=False`.

### albacore_ranks

```python
macroforecast.models.albacore_ranks(
    X,
    y,
    *,
    alpha=1.0,
    difference_order=1,
    max_iter=1000,
    tol=1e-9,
)
```

Inflation-specific wrapper for rank-space Albacore. `X` should be price component changes and `y` should be the forward average headline inflation target. The wrapper sorts components row by row, estimates nonnegative fused rank weights, and enforces the Albacoreranks mean-matching constraint.

### Low-Level Solver Helpers

These return a weight `Series` rather than a full `ModelFit`:

| Function | Meaning |
| --- | --- |
| `solve_nonnegative_ridge(X, y, alpha=...)` | R `nonneg.ridge`-style nonnegative ridge weights. |
| `solve_simplex_ridge(X, y, alpha=...)` | Nonnegative weights constrained to sum to one. |
| `solve_target_shrinkage_ridge(X, y, reference_weights=..., alpha=...)` | R `nonneg.ridge.sum1`-style component basket weights. |
| `solve_mean_aligned_ridge(X, y, alpha=...)` | R `nonneg.ridge.mean`-style nonnegative mean-aligned weights. |
| `solve_fused_difference_ridge(X, y, alpha=...)` | R `nonneg.ridge.meanD`-style fused rank-weight primitive. |

These helpers are deliberately not inflation-specific. They exist so users can compose custom supervised aggregation models without taking the Albacore wrappers.

### random_walk_ridge

```python
macroforecast.models.random_walk_ridge(
    X,
    y,
    *,
    alpha=1.0,
    initial_alpha=1.0,
    fit_intercept=True,
)
```

Fits a time-varying coefficient path with a random-walk penalty:

```text
sum_t (y_t - x_t beta_t)^2
+ initial_alpha * ||beta_1||^2
+ alpha * sum_t ||beta_t - beta_{t-1}||^2
```

Predictions use the final estimated coefficient vector. The full fitted path is stored on the estimator as `coef_path_`, and standard `ModelFit` diagnostics record the final coefficients, fitted values, and residuals.

Backend: package-native expanded design solved by `numpy.linalg.lstsq`.

R comparison: `walker::walker_rw1` is the closest R source. It treats coefficients as random-walk state variables and estimates a Bayesian posterior with Stan / state-space smoothing. `random_walk_ridge()` keeps the same RW1 prior idea but solves the penalized least-squares MAP-style objective as one augmented linear system over the full coefficient path:

```text
min_{beta_1,...,beta_T}
  sum_t (y_t - x_t beta_t)^2
  + initial_alpha ||beta_1||^2
  + alpha sum_t ||beta_t - beta_{t-1}||^2
```

The fitted `coef_path_` is the estimated path. `predict()` uses only the final coefficient vector, because this callable is a deterministic forecasting model, not a posterior simulation or Kalman-smoothing interface. `fit_intercept=True` centers the fit and recovers a static intercept from the final coefficient vector.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Penalty on adjacent coefficient changes. |
| `initial_alpha` | `1.0` | fixed by preset | Penalty on the first coefficient vector. |
| `fit_intercept` | `True` | fixed by preset | Fit an intercept outside the time-varying coefficient path. |

Default model-selection method: `cv_path`.

### tvp_ridge

```python
macroforecast.models.tvp_ridge(
    X,
    y,
    *,
    lambda_candidates=None,
    oosX=None,
    lambda2=0.1,
    kfold=5,
    cv_plot=False,
    cv_2srr=True,
    sig_u_param=0.75,
    sig_eps_param=0.75,
    ols_prior=False,
    random_state=1071,
    use_garch=True,
)
```

Fits Philippe Goulet Coulombe's TVP ridge / two-step ridge regression estimator from **Time-varying parameters as ridge regressions** (International Journal of Forecasting, DOI ). The implementation is a Python port of the R package `TVPRidge`, source file `R/MV2SRR_v210407.R`, local snapshot:

```text
wiki/raw/paper_code/coulombe_site_github_20260530/tvpridge/R/MV2SRR_v210407.R
```

This is not a thin wrapper around `random_walk_ridge()`. Both models use a random-walk coefficient idea, but `tvp_ridge()` ports the paper's full estimator:

| Stage | R function | Python implementation |
| --- | --- | --- |
| Basis expansion | `Zfun` | `_tvp_z_basis` |
| Generalized ridge solve | `dualGRR` | `_dual_generalized_ridge` |
| Initial lambda CV | `CV.KF.MV` | `_cv_kfold_multivariate` |
| Second-step lambda CV | `cv.univariate` | `_cv_univariate` |
| Dropout correction | `cumul_zeros` | `_cumul_zeros` |
| Public callable | `tvp.ridge` | `tvp_ridge` / `TVPRidgeRegressor` |

The estimated path solves the paper's time-varying parameter ridge problem:

```text
min_{beta_1,...,beta_T}
  sum_t (y_t - x_t beta_t)^2
  + lambda * sum_t ||beta_t - beta_{t-1}||^2
  + lambda2 * ||beta_0||^2
```

`lambda` controls the amount of time variation. Large values force smoother coefficient paths; small values allow more movement. `lambda2` is the soft penalty on the starting coefficient values. The R code standardizes by sample standard deviation without centering; macroforecast follows that convention and rescales coefficient paths and fitted values back to the original data scale.
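
The objective above turns into an ordinary ridge problem once the path is written as cumulative innovations, `beta_t = beta_0 + u_1 + ... + u_t`. The following is a minimal single-predictor sketch of that reparameterization. It is not the package's `Zfun`/`dualGRR` port: it skips standardization, cross-validation, and the 2SRR reweighting, and `tvp_ridge_path_1d` is a hypothetical name used only here.

```python
import numpy as np


def tvp_ridge_path_1d(x, y, lam, lam2):
    """Solve the TVP ridge objective for one predictor as a plain ridge problem.

    With beta_t = beta_0 + u_1 + ... + u_t the objective becomes
    ||y - Z theta||^2 + lam2 * beta_0^2 + lam * sum_t u_t^2,
    where theta = (beta_0, u_1, ..., u_T).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    T = x.shape[0]
    # Row t of the innovation block carries x_t in columns 1..t (lower triangle).
    innovations = np.tril(np.ones((T, T))) * x[:, None]
    Z = np.column_stack([x, innovations])
    penalty = np.diag(np.r_[lam2, np.full(T, lam)])
    theta = np.linalg.solve(Z.T @ Z + penalty, Z.T @ y)
    # Cumulative sums map (beta_0, u) back to beta_0, beta_1, ..., beta_T.
    return np.cumsum(theta)


rng = np.random.default_rng(0)
T = 60
x = rng.normal(size=T)
true_beta = np.linspace(0.5, 1.5, T)  # slowly drifting coefficient
y = x * true_beta + 0.1 * rng.normal(size=T)

smooth = tvp_ridge_path_1d(x, y, lam=1e6, lam2=0.1)    # nearly constant path
flexible = tvp_ridge_path_1d(x, y, lam=1.0, lam2=0.1)  # path is free to drift
```

A large `lam` pushes the innovations toward zero, so `smooth` is close to a constant-coefficient fit, while the smaller `lam` lets `flexible` follow the drift. The returned array has `T + 1` entries because it includes the starting value `beta_0`.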
The 2SRR step follows the R package logic. First, the homogeneous ridge TVP is estimated. Then coefficient innovations are used to build coefficient-specific variance weights, and residual volatility weights are optionally estimated by a GARCH(1,1) backend. The model is refit with those weights. If the Python package `arch` is unavailable or the GARCH fit fails, residual-volatility weights fall back to ones and the reason is recorded in `fit.estimator.diagnostics_["garch_status"]`; the ridge/2SRR fit still runs.

Input:

| Argument | Required | Expected object | Meaning |
| --- | --- | --- | --- |
| `X` | yes | pandas DataFrame, NumPy array, or `FeatureSet` | Predictor matrix with shape `T x K`. |
| `y` | yes unless `X` is a `FeatureSet` | pandas Series or one-column DataFrame for the public `ModelFit` wrapper | Target series aligned to `X`. |
| `lambda_candidates` | no | sequence of positive floats or `None` | Candidate values for the time-variation penalty. `None` uses the R default grid. |
| `oosX` | no | one predictor vector of length `K` | Optional one-step forecast using the final coefficient vector. |

Output:

| Object | Type | Contents |
| --- | --- | --- |
| return value | `ModelFit` | Standard macroforecast fitted model wrapper. |
| `fit.estimator.betas_rr_` | NumPy array, shape `M x (K+1) x T` | First-step ridge TVP coefficient paths, original scale. |
| `fit.estimator.betas_2srr_` | NumPy array, shape `M x (K+1) x T` | 2SRR coefficient paths, original scale. |
| `fit.estimator.lambdas_` | NumPy array | Initial CV lambda for each target. |
| `fit.estimator.lambda_step2_` | NumPy array | Second-step lambda used after reweighting. |
| `fit.estimator.yhat_rr_` | DataFrame | In-sample first-step fitted values. |
| `fit.estimator.yhat_2srr_` | DataFrame | In-sample 2SRR fitted values. |
| `fit.estimator.sig_eps_` | DataFrame | Normalized residual-volatility weights. |
| `fit.estimator.forecast_` | NumPy array | Optional forecast when `oosX` is supplied. |
| `fit.estimator.coef_path_` | DataFrame | Final 2SRR path for the first target, excluding intercept. |
| `fit.estimator.coef_path_full_` | DataFrame | MultiIndex coefficient path including intercept and target names. |

Prediction rule:

| Call | Behavior |
| --- | --- |
| `fit.predict(X_train)` with the original training index | Returns the time-varying in-sample `yhat_2srr_` path. |
| `fit.predict(X_new)` with new rows | Uses the final estimated coefficient vector `beta_T`. |

Default parameters:

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `lambda_candidates` | `exp(linspace(-6, 20, 15))` | yes | R default candidate grid for the time-variation penalty. |
| `lambda2` | `0.1` | fixed by preset | Soft penalty on starting coefficient values. |
| `kfold` | `5` | fixed by preset | Number of random CV folds. |
| `cv_2srr` | `True` | fixed by preset | Re-run lambda CV after variance reweighting. |
| `sig_u_param` | `0.75` | fixed by preset | Shrinkage exponent for coefficient-innovation variance weights. |
| `sig_eps_param` | `0.75` | fixed by preset | Shrinkage exponent for residual-volatility weights. |
| `ols_prior` | `False` | fixed by preset | Shrink starting coefficients toward OLS rather than zero. |
| `random_state` | `1071` | fixed by preset | Fold seed matching the R source's `set.seed(1071)` convention. |
| `use_garch` | `True` | fixed by preset | Use optional Python `arch` GARCH(1,1) for residual-volatility weights. |

R parity notes:

| Topic | Status |
| --- | --- |
| Standardization | Matches R: divide `X` and `Y` by sample standard deviation, no centering. |
| Basis columns | Matches R `Zfun`: innovation blocks by coefficient, then static intercept/predictor block. |
| Dual/primal solve | Matches R `dualGRR` algebra, with `numpy.linalg.solve` and pseudo-inverse fallback for singular systems. |
| CV folds | Same random-fold design and default seed, but NumPy's RNG is not bit-identical to R's `sample()`. |
| GARCH volatility | Uses optional Python `arch` backend; if unavailable, records fallback and continues with homogeneous residual weights. |
| Multivariate `Y` | Estimator internals preserve `M x (K+1) x T` arrays; the public `ModelFit` wrapper is optimized for one target, consistent with macroforecast's standard supervised callable. |

Default model-selection method: `cv_path`.

### lasso

```python
macroforecast.models.lasso(
    X,
    y,
    *,
    alpha=1.0,
    max_iter=20000,
    standardize=False,
)
```

Fits lasso regression with an L1 penalty. There is no `lasso_path()` model callable; use `get_model("lasso")` and `model_selection.select_params()`.

Backend: `sklearn.linear_model.Lasso`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | L1 penalty strength. |
| `max_iter` | `20000` | fixed by preset | Optimization iteration cap. |
| `standardize` | `False` | fixed by preset | Standardize predictors inside the fitted estimator. Defaults to `False` because preprocessing/window policy usually owns scaling. Set `True` for lasso-style replications where scaling must be fit inside each model window. |

| Preset | `alpha` |
| --- | --- |
| `small` | `(0.01, 0.1, 1.0)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0)` |

Default model-selection method: `cv_path`.

### elastic_net

```python
macroforecast.models.elastic_net(
    X,
    y,
    *,
    alpha=1.0,
    l1_ratio=0.5,
    max_iter=20000,
    standardize=False,
)
```

Fits elastic net regression.

Backend: `sklearn.linear_model.ElasticNet`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Overall penalty strength. |
| `l1_ratio` | `0.5` | yes | L1 share of the elastic-net penalty. |
| `max_iter` | `20000` | fixed by preset | Optimization iteration cap. |
| `standardize` | `False` | fixed by preset | Standardize predictors inside the fitted estimator. Defaults to `False`; set `True` for glmnet/MATLAB-style elastic-net replications whose penalty grid assumes window-local standardized predictors. |

| Preset | `alpha` | `l1_ratio` |
| --- | --- | --- |
| `small` | `(0.01, 0.1, 1.0)` | `(0.25, 0.5, 0.75)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.1, 0.25, 0.5, 0.75, 0.9)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0)` | `(0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95)` |

### adaptive_lasso

```python
macroforecast.models.adaptive_lasso(
    X,
    y,
    *,
    alpha=1.0,
    gamma=1.0,
    initial="ridge",
    initial_alpha=1.0,
    eps=1e-4,
    normalize_weights=True,
    max_iter=20000,
    tol=1e-4,
    random_state=None,
)
```

Fits adaptive lasso. The model first estimates initial coefficients with `initial="ridge"` or `initial="ols"`, builds feature weights `1 / (abs(beta_init) + eps) ** gamma`, and fits lasso on weighted standardized predictors. Predictions are mapped back to the original target scale.

Backend: macroforecast adaptive-weight construction plus final `sklearn.linear_model.Lasso`.

R/glmnet comparison: `glmnet` accepts the same idea through `penalty.factor`. It internally rescales penalty factors to sum to the number of predictors, so macroforecast defaults to `normalize_weights=True`, which rescales adaptive weights to mean one before fitting the final lasso. Set `normalize_weights=False` only when the absolute weight scale should change the effective penalty strength.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Final adaptive lasso penalty strength. |
| `gamma` | `1.0` | yes | Exponent applied to initial coefficient weights. |
| `initial` | `"ridge"` | manual | Initial model: `"ridge"` or `"ols"`. |
| `initial_alpha` | `1.0` | fixed by preset | Initial ridge penalty. |
| `eps` | `1e-4` | fixed by preset | Small denominator floor for adaptive weights. |
| `normalize_weights` | `True` | fixed by preset | Rescale adaptive weights to mean one, matching `glmnet` penalty-factor scaling. |
| `max_iter` | `20000` | fixed by preset | Final solver iteration cap. |
| `tol` | `1e-4` | fixed by preset | Final solver convergence tolerance. |
| `random_state` | `None` | fixed by preset | Final solver random seed. |

| Preset | `alpha` | `gamma` |
| --- | --- | --- |
| `small` | `(0.01, 0.1, 1.0)` | `(1.0,)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.5, 1.0, 2.0)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.5, 1.0, 1.5, 2.0)` |

### adaptive_elastic_net

```python
macroforecast.models.adaptive_elastic_net(
    X,
    y,
    *,
    alpha=1.0,
    l1_ratio=0.5,
    gamma=1.0,
    initial="ridge",
    initial_alpha=1.0,
    eps=1e-4,
    normalize_weights=True,
    max_iter=20000,
    tol=1e-4,
    random_state=None,
)
```

Fits an adaptive elastic-net variant with the same initial coefficient weights as `adaptive_lasso`, followed by an elastic-net fit on weighted standardized predictors.

Backend: macroforecast adaptive-weight construction plus final `sklearn.linear_model.ElasticNet`.

R/glmnet comparison: this is the elastic-net analogue of adaptive lasso. `normalize_weights=True` gives the same mean-one penalty-factor convention as `glmnet`; the remaining difference is solver style, because macroforecast fits one selected `alpha` while `glmnet` usually estimates a regularization path.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `alpha` | `1.0` | yes | Final adaptive elastic-net penalty strength. |
| `l1_ratio` | `0.5` | yes | L1 share of the final elastic-net penalty. |
| `gamma` | `1.0` | yes | Exponent applied to initial coefficient weights. |
| `initial` | `"ridge"` | manual | Initial model: `"ridge"` or `"ols"`. |
| `initial_alpha` | `1.0` | fixed by preset | Initial ridge penalty. |
| `eps` | `1e-4` | fixed by preset | Small denominator floor for adaptive weights. |
| `normalize_weights` | `True` | fixed by preset | Rescale adaptive weights to mean one, matching `glmnet` penalty-factor scaling. |
| `max_iter` | `20000` | fixed by preset | Final solver iteration cap. |
| `tol` | `1e-4` | fixed by preset | Final solver convergence tolerance. |
| `random_state` | `None` | fixed by preset | Final solver random seed. |

| Preset | `alpha` | `l1_ratio` | `gamma` |
| --- | --- | --- | --- |
| `small` | `(0.01, 0.1, 1.0)` | `(0.25, 0.5, 0.75)` | `(1.0,)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.1, 0.25, 0.5, 0.75, 0.9)` | `(0.5, 1.0, 2.0)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.05, 0.1, 0.25, 0.5, 0.75, 0.9)` | `(0.5, 1.0, 1.5, 2.0)` |

### group_lasso

```python
macroforecast.models.group_lasso(
    X,
    y,
    *,
    groups=None,
    alpha=1.0,
    group_weights=None,
    max_iter=5000,
    tol=1e-5,
    scale=True,
)
```

Fits group lasso with a package-native proximal-gradient solver. `groups` must contain one label per predictor column. If `groups=None`, each predictor is treated as its own group.

Backend: package-native proximal-gradient solver.

R comparison: this follows the Gaussian group-lasso objective used by `grpreg::grpreg(..., penalty = "grLasso")`: standardized predictors, group-level L2 shrinkage, and default group weights proportional to `sqrt(group_size)`. macroforecast fits one selected `alpha` and does not reproduce `grpreg`'s full path solver, GLM families, C backend, or within-group orthogonalization step.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `groups` | `None` | manual | One group label per predictor. |
| `alpha` | `1.0` | yes | Group penalty strength. |
| `group_weights` | `None` | manual | Optional group penalty weights; default is `sqrt(group_size)`. |
| `max_iter` | `5000` | fixed by preset | Proximal-gradient iteration cap. |
| `tol` | `1e-5` | fixed by preset | Proximal-gradient convergence tolerance. |
| `scale` | `True` | fixed by preset | Whether to standardize predictors inside the model. |

| Preset | `alpha` |
| --- | --- |
| `small` | `(0.01, 0.1, 1.0)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0)` |

### sparse_group_lasso

```python
macroforecast.models.sparse_group_lasso(
    X,
    y,
    *,
    groups=None,
    alpha=1.0,
    l1_ratio=0.5,
    group_weights=None,
    max_iter=5000,
    tol=1e-5,
    scale=True,
)
```

Fits sparse group lasso. `l1_ratio` controls the feature-level L1 share; the remaining penalty share is applied at the group level.

Backend: package-native proximal-gradient solver.

R comparison: this follows the sparse-group penalty decomposition used by `sparsegl::sparsegl`: a feature-level L1 part plus a group L2 part with default `sqrt(group_size)` group weights. macroforecast fits one selected `alpha` and `l1_ratio`; it does not reproduce `sparsegl`'s full lambda path, bounds, GLM families, or C++ backend.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `groups` | `None` | manual | One group label per predictor. |
| `alpha` | `1.0` | yes | Total sparse-group penalty strength. |
| `l1_ratio` | `0.5` | yes | Feature-level L1 share. |
| `group_weights` | `None` | manual | Optional group penalty weights; default is `sqrt(group_size)`. |
| `max_iter` | `5000` | fixed by preset | Proximal-gradient iteration cap. |
| `tol` | `1e-5` | fixed by preset | Proximal-gradient convergence tolerance. |
| `scale` | `True` | fixed by preset | Whether to standardize predictors inside the model. |

| Preset | `alpha` | `l1_ratio` |
| --- | --- | --- |
| `small` | `(0.01, 0.1, 1.0)` | `(0.25, 0.5, 0.75)` |
| `standard` | `(0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.1, 0.25, 0.5, 0.75, 0.9)` |
| `wide` | `(0.0001, 0.001, 0.01, 0.1, 1.0, 10.0)` | `(0.05, 0.1, 0.25, 0.5, 0.75, 0.9)` |

### bayesian_ridge

```python
macroforecast.models.bayesian_ridge(X, y)
```

Fits sklearn empirical-Bayes Bayesian ridge.
| Item | Value | | --- | --- | | Input | `X`, `y` | | Output | `ModelFit` | | Backend | `sklearn.linear_model.BayesianRidge` | | Default params | sklearn defaults | | Tunable params | none in the clean preset catalog | | Preset search spaces | none | ### huber ```python macroforecast.models.huber(X, y, *, epsilon=1.35, max_iter=1000) ``` Fits robust Huber regression. Backend: `sklearn.linear_model.HuberRegressor`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `epsilon` | `1.35` | yes | Huber loss transition threshold. | | `max_iter` | `1000` | fixed by preset | Optimization iteration cap. | | Preset | `epsilon` | | --- | --- | | `small` | `(1.1, 1.35, 1.75)` | | `standard` | `(1.1, 1.35, 1.5, 1.75, 2.0)` | | `wide` | `(1.01, 1.1, 1.35, 1.5, 1.75, 2.0, 2.5)` | ## Kernel And Nonparametric Models These models are external sklearn wrappers, not package-native numerical solvers. They live in `macroforecast.models.nonparametric` and are re-exported from `macroforecast.models` and top-level `macroforecast`. ### kernel_ridge ```python macroforecast.models.kernel_ridge( X, y, *, alpha=1.0, kernel="linear", gamma=None, degree=3, coef0=1.0, ) ``` Fits sklearn kernel ridge regression. This model is scale-sensitive for nonlinear kernels, so standardize predictors before `rbf`, `poly`, or `sigmoid` kernels. Backend: `sklearn.kernel_ridge.KernelRidge`. R parity is intentionally not claimed for this callable. It is a thin sklearn backend wrapper; macroforecast owns only the pandas `X, y` contract, `ModelFit` metadata/diagnostics, and search-space registration. | Item | Value | | --- | --- | | Input | `X`, `y` | | Output | `ModelFit` | | Internal scaling | none | | `ModelSpec.requires_scaling` | `True` | | Default model-selection method | `random` | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `alpha` | `1.0` | yes | Ridge penalty strength. | | `kernel` | `"linear"` | search option | Kernel name. 
| | `gamma` | `None` | search option | Kernel coefficient. | | `degree` | `3` | search option | Polynomial kernel degree. | | `coef0` | `1.0` | fixed by preset | Independent term for polynomial/sigmoid kernels. | | Preset | `alpha` | `kernel` | extra searched params | | --- | --- | --- | --- | | `small` | `(0.1, 1.0, 10.0)` | `("linear", "rbf")` | none | | `standard` | `(0.01, 0.1, 1.0, 10.0)` | `("linear", "rbf", "poly")` | `gamma=(None, 0.01, 0.1)` | | `wide` | `(0.001, 0.01, 0.1, 1.0, 10.0, 100.0)` | `("linear", "rbf", "poly", "sigmoid")` | `gamma=(None, 0.001, 0.01, 0.1, 1.0)`, `degree=(2, 3, 4)` | ### knn ```python macroforecast.models.knn( X, y, *, n_neighbors=5, weights="uniform", metric="minkowski", p=2, ) ``` Fits sklearn k-nearest-neighbor regression. This is distance-based and should usually receive standardized predictors. Backend: `sklearn.neighbors.KNeighborsRegressor`. R parity is intentionally not claimed for this callable. It is a thin sklearn backend wrapper; macroforecast owns only the pandas `X, y` contract, small-window `n_neighbors` resolution, `ModelFit` metadata/diagnostics, and search-space registration. If the requested `n_neighbors` is larger than the fitted sample size, macroforecast resolves it down to `n_obs` before constructing the sklearn estimator. The fit metadata records the effective `n_neighbors` and, when different, `requested_n_neighbors`. This avoids small-window forecasting runs failing at prediction time. | Item | Value | | --- | --- | | Input | `X`, `y` | | Output | `ModelFit` | | Internal scaling | none | | `ModelSpec.requires_scaling` | `True` | | Default model-selection method | `random` | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_neighbors` | `5` | yes | Number of nearest neighbors. | | `weights` | `"uniform"` | yes | `"uniform"` or `"distance"`. | | `metric` | `"minkowski"` | fixed by preset | Distance metric. | | `p` | `2` | search option | Minkowski distance order. 
| | Preset | `n_neighbors` | `weights` | extra searched params | | --- | --- | --- | --- | | `small` | `(3, 5, 10)` | `("uniform", "distance")` | none | | `standard` | `(3, 5, 10, 20)` | `("uniform", "distance")` | `p=(1, 2)` | | `wide` | `(1, 3, 5, 10, 20, 40)` | `("uniform", "distance")` | `p=(1, 2)` | ## Linear Boosting ### glmboost ```python macroforecast.models.glmboost( X, y, *, n_iter=100, learning_rate=0.1, center=True, candidate_sampling="all", candidate_count=None, candidate_fraction=None, candidate_cap=None, candidate_min=1, candidate_rounding="floor", random_state=None, ) ``` Fits componentwise L2 boosting with linear base learners. Backend: package-native componentwise L2 boosting loop. The R comparison target is `mboost::glmboost`. macroforecast implements the matrix-input Gaussian path: predictors are centered by default, each iteration selects the column with the largest normalized correlation with the current residual, and the selected least-squares coefficient is shrunk by `learning_rate`. Candidate sampling is deliberately decomposed into separate arguments. For Goulet Coulombe, Leroux, Stevanovic, and Surprenant (2021), Appendix A.6, use `candidate_sampling="random"`, `candidate_fraction=1/3`, `candidate_cap=200`, and `candidate_rounding="floor"`, which gives `m=min(200, floor(n_features / 3))` sampled predictors at each boosting step. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_iter` | `100` | yes | Number of boosting iterations. | | `learning_rate` | `0.1` | yes | Shrinkage applied to each update. | | `center` | `True` | no | Center predictors before componentwise updates, matching `mboost::glmboost(..., center = TRUE)`. | | `candidate_sampling` | `"all"` | fixed by preset | `"all"` searches every usable predictor each step; `"random"` samples a candidate subset before selecting the best base learner. 
| | `candidate_count` | `None` | fixed by preset | Fixed sampled candidate count when `candidate_sampling="random"`. Mutually exclusive with `candidate_fraction`. | | `candidate_fraction` | `None` | fixed by preset | Fraction of predictors sampled each step when `candidate_sampling="random"`. | | `candidate_cap` | `None` | fixed by preset | Maximum sampled candidate count after resolving `candidate_count` or `candidate_fraction`. | | `candidate_min` | `1` | fixed by preset | Minimum sampled candidate count. | | `candidate_rounding` | `"floor"` | fixed by preset | Rounding rule for `candidate_fraction`: `"floor"`, `"ceil"`, or `"round"`. | | `random_state` | `None` | fixed by preset | Seed for per-step candidate feature sampling when `candidate_sampling="random"`. | | Preset | `n_iter` | `learning_rate` | | --- | --- | --- | | `small` | `(50, 100)` | `(0.05, 0.1)` | | `standard` | `(50, 100, 200, 500)` | `(0.01, 0.05, 0.1)` | | `wide` | `(50, 100, 200, 500, 1000)` | `(0.005, 0.01, 0.05, 0.1, 0.2)` | ## Support-Vector Models Support-vector models are sklearn-backed and live in the base dependency set. They are useful when nonlinear margins or robust epsilon-insensitive losses are preferred over a pure least-squares fit. The forecasting runner treats them as ordinary supervised models: call `model(X, y, **params)`, tune only model-owned hyperparameters through `model_selection`, and let `window` decide the train/validation/test dates. 
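
To make the contrast with a least-squares fit concrete, the epsilon-insensitive loss can be sketched in a few lines of plain Python. This only illustrates the shape of the loss; it is not macroforecast or sklearn code, and the function names `epsilon_insensitive_loss` and `squared_loss` are hypothetical.

```python
def epsilon_insensitive_loss(residual: float, epsilon: float) -> float:
    """Zero inside the tube |residual| <= epsilon, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


def squared_loss(residual: float) -> float:
    """Least-squares loss, shown only for comparison."""
    return residual ** 2


residuals = [0.05, -0.1, 0.5, -3.0]
epsilon = 0.1

eps_losses = [epsilon_insensitive_loss(r, epsilon) for r in residuals]
sq_losses = [squared_loss(r) for r in residuals]

for r, e, s in zip(residuals, eps_losses, sq_losses):
    print(f"residual={r:+.2f}  epsilon-insensitive={e:.3f}  squared={s:.3f}")
```

Residuals inside the tube cost nothing, and a large residual grows the loss linearly rather than quadratically, which is why these models react less strongly to a few extreme observations.
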
Forecasting-runner example: ```python pre = macroforecast.preprocessing.preprocess_spec( transform="none", outliers="none", impute="mean", standardize="zscore", standardize_columns="predictors", ) features = macroforecast.feature_engineering.feature_spec( target="y", horizon=1, predictors=["x1", "x2"], lags=(0, 1), ) result = macroforecast.forecasting.run( panel, "svr", preprocessing=pre, features=features, window=macroforecast.window.last_block(validation_size=24), model_selection=macroforecast.model_selection.grid({"C": [0.1, 1.0], "epsilon": [0.01, 0.1]}), ) ``` ### svr ```python macroforecast.models.svr( X, y, *, kernel="rbf", C=1.0, epsilon=0.1, gamma="scale", degree=3, coef0=0.0, shrinking=True, tol=1e-3, cache_size=200.0, max_iter=-1, ) ``` Fits sklearn `SVR`. Backend: `sklearn.svm.SVR`. `kernel="precomputed"` is intentionally not supported because macroforecast `ModelFit` expects `X` to be a feature matrix with stable column names. Use `"linear"`, `"poly"`, `"rbf"`, or `"sigmoid"`. | Item | Value | | --- | --- | | Input | `X`, `y` | | Output | `ModelFit` | | Internal scaling | none | | `ModelSpec.requires_scaling` | `True` | | Default model-selection method | `random` | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `kernel` | `"rbf"` | fixed by preset | Kernel: `"linear"`, `"poly"`, `"rbf"`, or `"sigmoid"`. | | `C` | `1.0` | yes | Inverse regularization strength. | | `epsilon` | `0.1` | yes | Epsilon-insensitive tube width. | | `gamma` | `"scale"` | yes | Kernel coefficient for RBF/poly/sigmoid kernels. | | `degree` | `3` | fixed by preset | Polynomial kernel degree. | | `coef0` | `0.0` | fixed by preset | Independent term for poly/sigmoid kernels. | | `shrinking` | `True` | fixed by preset | Whether to use the shrinking heuristic. | | `tol` | `1e-3` | fixed by preset | Optimization tolerance. | | `cache_size` | `200.0` | fixed by preset | Kernel cache size in MB. 
| | `max_iter` | `-1` | fixed by preset | Solver iteration cap; `-1` means no cap. | | Preset | `C` | `epsilon` | `gamma` | | --- | --- | --- | --- | | `small` | `(0.1, 1.0)` | `(0.01, 0.1)` | `("scale",)` | | `standard` | `(0.1, 1.0, 10.0)` | `(0.01, 0.1, 0.2)` | `("scale", "auto")` | | `wide` | `(0.01, 0.1, 1.0, 10.0, 100.0)` | `(0.001, 0.01, 0.1, 0.2)` | `("scale", "auto")` | ### linear_svr ```python macroforecast.models.linear_svr( X, y, *, C=1.0, epsilon=0.0, loss="epsilon_insensitive", tol=1e-4, max_iter=10000, random_state=0, ) ``` Fits sklearn `LinearSVR`. Use this when a linear support-vector loss is wanted without kernel overhead. Backend: `sklearn.svm.LinearSVR`. | Item | Value | | --- | --- | | Input | `X`, `y` | | Output | `ModelFit` | | Internal scaling | none | | `ModelSpec.requires_scaling` | `True` | | Default model-selection method | `random` | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `C` | `1.0` | yes | Inverse regularization strength. | | `epsilon` | `0.0` | yes | Epsilon-insensitive tube width. | | `loss` | `"epsilon_insensitive"` | fixed by preset | LinearSVR loss function. | | `tol` | `1e-4` | fixed by preset | Optimization tolerance. | | `max_iter` | `10000` | fixed by preset | Solver iteration cap. | | `random_state` | `0` | fixed by preset | Random seed; can be `None`. | | Preset | `C` | `epsilon` | | --- | --- | --- | | `small` | `(0.1, 1.0)` | `(0.0, 0.1)` | | `standard` | `(0.01, 0.1, 1.0, 10.0)` | `(0.0, 0.01, 0.1)` | | `wide` | `(0.001, 0.01, 0.1, 1.0, 10.0, 100.0)` | `(0.0, 0.001, 0.01, 0.1, 0.2)` | ### nu_svr ```python macroforecast.models.nu_svr( X, y, *, kernel="rbf", C=1.0, nu=0.5, gamma="scale", degree=3, coef0=0.0, shrinking=True, tol=1e-3, cache_size=200.0, max_iter=-1, ) ``` Fits sklearn `NuSVR`, where `nu` controls the admissible training-error and support-vector fractions. Backend: `sklearn.svm.NuSVR`. 
`kernel="precomputed"` is intentionally not supported, for the same reason as in `svr()`: `ModelFit` expects `X` to be a feature matrix with stable column names.

| Item | Value |
| --- | --- |
| Input | `X`, `y` |
| Output | `ModelFit` |
| Internal scaling | none |
| `ModelSpec.requires_scaling` | `True` |
| Default model-selection method | `random` |

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `kernel` | `"rbf"` | fixed by preset | Kernel: `"linear"`, `"poly"`, `"rbf"`, or `"sigmoid"`. |
| `C` | `1.0` | yes | Inverse regularization strength. |
| `nu` | `0.5` | yes | Upper bound on the fraction of training errors and lower bound on the fraction of support vectors; must lie in `(0, 1]`. |
| `gamma` | `"scale"` | yes | Kernel coefficient for RBF/poly/sigmoid kernels. |
| `degree` | `3` | fixed by preset | Polynomial kernel degree. |
| `coef0` | `0.0` | fixed by preset | Independent term for poly/sigmoid kernels. |
| `shrinking` | `True` | fixed by preset | Whether to use the shrinking heuristic. |
| `tol` | `1e-3` | fixed by preset | Optimization tolerance. |
| `cache_size` | `200.0` | fixed by preset | Kernel cache size in MB. |
| `max_iter` | `-1` | fixed by preset | Solver iteration cap; `-1` means no cap. |

| Preset | `C` | `nu` | `gamma` |
| --- | --- | --- | --- |
| `small` | `(0.1, 1.0)` | `(0.25, 0.5)` | `("scale",)` |
| `standard` | `(0.1, 1.0, 10.0)` | `(0.25, 0.5, 0.75)` | `("scale", "auto")` |
| `wide` | `(0.01, 0.1, 1.0, 10.0, 100.0)` | `(0.1, 0.25, 0.5, 0.75, 0.9)` | `("scale", "auto")` |

## Neural Models

`nn`, `lstm`, `gru`, `transformer`, `hemisphere_nn`, and `density_hnn` are all torch-backed neural-network models and require `macroforecast[deep]`. `nn` is the feed-forward neural network for tabular feature matrices; `lstm` and `gru` are recurrent neural networks that consume trailing row sequences; `transformer` is a compact Transformer encoder using the same trailing-row sequence contract.
`hemisphere_nn` is a compact bagged dual-head network for mean and variance forecasts, while `density_hnn` follows the Aionx/Paper DensityHNN procedure with prior-DNN OOB volatility emphasis and OOB volatility recalibration. The `deep` extra is intentionally separate from `macroforecast[all]` because torch is large and platform-sensitive. Torch recurrent example: ```python result = macroforecast.forecasting.run( panel, "lstm", features=features, window=macroforecast.window.last_block(validation_size=24), params={"lstm": {"sequence_length": 4, "hidden_size": 32, "device": "auto"}}, model_selection={"lstm": None}, ) ``` ### nn ```python macroforecast.models.nn( X, y, *, hidden_layer_sizes=(100,), activation="relu", dropout=0.0, learning_rate=0.001, max_epochs=100, batch_size=32, weight_decay=0.0, optimizer="adam", loss="mse", random_state=0, device="auto", ) ``` Fits a torch-backed feed-forward neural-network regressor. The estimator standardizes `X` and `y` inside each fit window and maps predictions back to target units. Use feature engineering for lagged, rolling, PCA, or MARX-style inputs before fitting this model. Forecasting-runner example: ```python result = macroforecast.forecasting.run( panel, "nn", features=features, window=macroforecast.window.last_block(validation_size=24), params={"nn": {"max_epochs": 100, "device": "auto"}}, model_selection=macroforecast.model_selection.grid({ "hidden_layer_sizes": [(32,), (64,)], "weight_decay": [0.0, 0.0001], }), ) ``` | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `hidden_layer_sizes` | `(100,)` | yes | Feed-forward hidden layer widths. | | `activation` | `"relu"` | fixed by preset | Activation: `"identity"`, `"logistic"`, `"sigmoid"`, `"tanh"`, `"relu"`, or `"gelu"`. | | `dropout` | `0.0` | yes | Dropout rate between hidden layers. | | `learning_rate` | `0.001` | yes | Optimizer learning rate. | | `max_epochs` | `100` | fixed by preset | Training epoch cap. 
| | `batch_size` | `32` | fixed by preset | Mini-batch size. | | `weight_decay` | `0.0` | yes | L2 weight decay. | | `optimizer` | `"adam"` | fixed by preset | Torch optimizer: `"adam"`, `"sgd"`, or `"rmsprop"`. | | `loss` | `"mse"` | fixed by preset | Torch loss: `"mse"` or `"huber"`. | | `random_state` | `0` | fixed by preset | Random seed. | | `device` | `"auto"` | fixed by preset | Torch device: `"auto"`, `"cpu"`, or `"cuda"`. | | Preset | `hidden_layer_sizes` | `dropout` | `learning_rate` | `weight_decay` | | --- | --- | --- | --- | --- | | `small` | `((32,), (64,))` | `(0.0,)` | `(0.001,)` | `(0.0, 0.0001)` | | `standard` | `((64,), (100,), (64, 32))` | `(0.0, 0.1)` | `(0.0005, 0.001)` | `(0.0, 0.0001, 0.001)` | | `wide` | `((32,), (64,), (100,), (128,), (100, 50), (128, 64))` | `(0.0, 0.1, 0.25)` | `(0.0001, 0.0005, 0.001, 0.005)` | `(0.0, 0.00001, 0.0001, 0.001, 0.01)` | ### lstm ```python macroforecast.models.lstm( X, y, *, sequence_length=4, hidden_size=32, num_layers=1, dropout=0.0, learning_rate=0.001, max_epochs=100, batch_size=32, random_state=0, device="auto", ) ``` Fits a compact torch-backed LSTM regressor. `sequence_length` controls how many trailing rows are passed to the recurrent network for each target date. The fitted estimator stores the trailing training rows, so `predict(X_test)` can create the first test sequences without the caller manually prepending training history. The backend is a regular `torch.nn.Module` that switches to `train()` mode during fitting and `eval()` mode during prediction; `device` selects CPU or CUDA. The fit diagnostics include `sequence_context`, which records `sequence_length`, `fit_sample_size`, `train_tail_rows`, and the `test_sequence_prefix` policy. The prefix always consists of the last fitted rows only, so the forecasting runner can pass the test feature block directly without leaking future rows.
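
The prefix policy described above can be sketched with plain Python lists. This is an illustration under stated assumptions, not the package's internal code: the helper name `build_test_sequences` is hypothetical, and the sketch assumes that exactly `sequence_length - 1` trailing training rows are kept as the prefix.

```python
from typing import List, Sequence

Row = Sequence[float]


def build_test_sequences(
    train_rows: List[Row],
    test_rows: List[Row],
    sequence_length: int,
) -> List[List[Row]]:
    """Build one trailing window per test row, prefixed by fitted rows only."""
    if sequence_length < 1:
        raise ValueError("sequence_length must be at least 1")
    # Assumption: keep the last (sequence_length - 1) fitted rows as prefix.
    n_prefix = sequence_length - 1
    prefix = list(train_rows[-n_prefix:]) if n_prefix > 0 else []
    history = prefix + list(test_rows)
    offset = len(prefix)
    windows = []
    for i in range(len(test_rows)):
        end = offset + i + 1          # window ends at test row i (inclusive)
        start = end - sequence_length
        if start < 0:
            raise ValueError("not enough training rows for the first window")
        windows.append(history[start:end])
    return windows


train_rows = [[1.0], [2.0], [3.0], [4.0]]
test_rows = [[5.0], [6.0]]
windows = build_test_sequences(train_rows, test_rows, sequence_length=3)
print(windows)
```

Each window ends at its own test row and reaches back only into earlier rows, so no window contains rows dated after the row being forecast.
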

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `sequence_length` | `4` | yes | Trailing rows per recurrent sequence. |
| `hidden_size` | `32` | yes | Recurrent hidden-state width. |
| `num_layers` | `1` | fixed by preset | Number of recurrent layers. |
| `dropout` | `0.0` | fixed by preset | Dropout between recurrent layers. |
| `learning_rate` | `0.001` | yes | Adam learning rate. |
| `max_epochs` | `100` | fixed by preset | Training epoch cap. |
| `batch_size` | `32` | fixed by preset | Mini-batch size. |
| `random_state` | `0` | fixed by preset | Random seed. |
| `device` | `"auto"` | fixed by preset | Torch device: `"auto"`, `"cpu"`, or `"cuda"`. |

| Preset | `sequence_length` | `hidden_size` | `learning_rate` |
| --- | --- | --- | --- |
| `small` | `(2, 4)` | `(16, 32)` | `(0.001,)` |
| `standard` | `(2, 4, 8)` | `(16, 32, 64)` | `(0.0005, 0.001)` |
| `wide` | `(2, 4, 8, 12)` | `(16, 32, 64, 128)` | `(0.0001, 0.0005, 0.001, 0.005)` |

### gru

```python
macroforecast.models.gru(
    X,
    y,
    *,
    sequence_length=4,
    hidden_size=32,
    num_layers=1,
    dropout=0.0,
    learning_rate=0.001,
    max_epochs=100,
    batch_size=32,
    random_state=0,
    device="auto",
)
```

Fits a compact torch-backed GRU regressor with the same input/output contract as `lstm`.

### transformer

```python
macroforecast.models.transformer(
    X,
    y,
    *,
    sequence_length=4,
    hidden_size=32,
    num_layers=1,
    dropout=0.0,
    learning_rate=0.001,
    max_epochs=100,
    batch_size=32,
    random_state=0,
    device="auto",
)
```

Fits a compact torch-backed Transformer encoder regressor. The input/output contract matches `lstm` and `gru`: rows are standardized inside each fit window, trailing sequences are built from the fitted sample, and predictions for new rows are mapped back to target units. `hidden_size` is the Transformer feed-forward width, not the input dimension; the encoder uses `d_model = n_features` and `nhead=1` to keep the public callable small and stable for macro panels.
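
The fit-window standardization and back-mapping shared by the neural estimators can be sketched as follows. This is a minimal illustration, not the package's code: the helper names are hypothetical, the "network output" is a made-up number, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the convention used by the torch-backed models is not stated here.

```python
from statistics import mean, stdev


def fit_scaler(values):
    """Estimate (center, spread) from fit-window values only."""
    # Sample standard deviation (ddof=1) is an assumption for this sketch.
    return mean(values), stdev(values)


def to_scaled(values, center, spread):
    """Map values into standardized units using fit-window statistics."""
    return [(v - center) / spread for v in values]


def to_target_units(values, center, spread):
    """Invert the standardization so forecasts are in target units."""
    return [v * spread + center for v in values]


y_train = [2.0, 4.0, 6.0, 8.0]
center, spread = fit_scaler(y_train)
y_train_scaled = to_scaled(y_train, center, spread)

# Stand-in for a network output in standardized units (hypothetical value).
prediction_scaled = [0.5]
prediction = to_target_units(prediction_scaled, center, spread)
round_trip = to_target_units(y_train_scaled, center, spread)
print(prediction)
```

Because the statistics come from the fit window alone, applying the same `center` and `spread` to later rows does not let test-period information enter the scaling.
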
### hemisphere_nn ```python macroforecast.models.hemisphere_nn( X, y, *, lc=2, lm=2, lv=2, neurons=64, dropout=0.2, learning_rate=0.001, max_epochs=100, n_estimators=100, subsample=0.8, nu=None, variance_penalty=1.0, patience=15, validation_fraction=0.2, random_state=0, device="auto", quantile_levels=(0.05, 0.5, 0.95), ) ``` Fits a compact Hemisphere neural network inspired by Goulet Coulombe, Frenette, and Klieber's dual-head density-forecast architecture. The network has a shared common core, a mean head, and a positive variance head. The loss is Gaussian negative log likelihood plus a soft variance-emphasis penalty: ```text mean((y - h_m(X))^2 / h_v(X) + log h_v(X)) + variance_penalty * (mean(h_v(X)) - nu * var(y))^2 / var(y)^2 ``` `predict()` returns the ensemble mean forecast. The fitted estimator also exposes `predict_variance(X)`, `predict_distribution(X)`, and `predict_quantiles(X, levels=None)`. The forecasting runner stores the variance and quantile outputs in `variance_prediction` and `quantile_predictions`. The public callable accepts legacy aliases `lr`, `n_epochs`, `B`, `sub_rate`, `lambda_emphasis`, and `val_frac`; normalized metadata records them as `learning_rate`, `max_epochs`, `n_estimators`, `subsample`, `variance_penalty`, and `validation_fraction`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `lc` | `2` | fixed by preset | Shared common-core depth. | | `lm` | `2` | fixed by preset | Mean-head depth after the common core. | | `lv` | `2` | fixed by preset | Variance-head depth after the common core. | | `neurons` | `64` | yes | Hidden width. | | `dropout` | `0.2` | fixed by preset | Dropout rate. | | `learning_rate` | `0.001` | yes | Adam learning rate. | | `max_epochs` | `100` | fixed by preset | Training epoch cap. | | `n_estimators` | `100` | yes | Number of blocked-subsample bags. | | `subsample` | `0.8` | fixed by preset | Blocked-subsample fraction. 
| | `nu` | `None` | fixed by preset | Variance-emphasis target ratio; `None` uses `0.5`. | | `variance_penalty` | `1.0` | fixed by preset | Soft penalty on the variance-emphasis target. | | `patience` | `15` | fixed by preset | Early-stopping patience. | | `validation_fraction` | `0.2` | fixed by preset | Chronological validation fraction. | | `device` | `"auto"` | fixed by preset | Torch device. | | `quantile_levels` | `(0.05, 0.5, 0.95)` | fixed by preset | Default normal-approximation quantile levels returned by `predict_quantiles()`. | ### density_hnn ```python macroforecast.models.density_hnn( X, y, *, common_layers=2, mean_layers=2, volatility_layers=2, prior_layers=3, neurons=400, dropout=0.2, learning_rate=0.001, max_epochs=100, n_estimators=100, prior_estimators=50, subsample=0.8, block_size=8, volatility_emphasis=None, rescale_volatility=True, patience=15, random_state=0, device="auto", quantile_levels=(0.05, 0.5, 0.95), volatility_clip=0.05, ) ``` Fits the Density Hemisphere Neural Network from Goulet Coulombe, Frenette, and Klieber, "From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks" (Journal of Applied Econometrics, 2025). The implementation is a torch-native port of the public Aionx `DensityHNN` logic, not a TensorFlow dependency wrapper. It is included because the method is a macro density forecast model: `macroforecast` uses it to produce conditional means, conditional variances, volatility forecasts, and normal-approximation quantiles. It does not create portfolio weights. The Aionx source-code correspondence is: | Aionx source item | `macroforecast` implementation | | --- | --- | | `aionx.models.DensityHNN.prior_dnn_architecture` | `prior_estimators` plain DNN ensemble fitted before the HNN. | | `aionx.bootstrap.TimeSeriesBlockBootstrap` | `block_size` and `subsample` create time-series block bootstrap samples. 
| | `aionx.kerasnn.ensemble.OutOfBagPredictor` | OOB forecasts use the Aionx denominator formula `sum(oob forecast) / ((1 - subsample) * n_estimators)`. | | `aionx.models.DensityHNN.base_architecture` | shared common core plus mean and volatility hemispheres; the volatility head is positive and normalized to the volatility-emphasis value. | | `aionx.models.DensityHNN.volatility_rescaling_algorithm` | OOB log squared residuals are regressed on log predicted volatility squared, then all volatility forecasts are rescaled. | The callable consumes the standard supervised model contract `density_hnn(X, y, **params)`. In Aionx, lags and trend terms are created inside `DensityHNN.run(...)`; in `macroforecast`, lags, MARX/MAF features, PCA, trends, and seasonal/time features should be built explicitly with `macroforecast.feature_engineering` before calling the model or through the forecasting runner. This keeps the model callable small and lets the same feature construction be reused by other models. Fit sequence: 1. Standardize `X` and `y` inside the fit window. 2. Fit a prior DNN ensemble on blocked bootstrap samples. 3. Compute prior-DNN OOB mean squared error; this becomes the Aionx `volatility_emphasis` unless the user supplies an override. 4. Fit a DensityHNN ensemble with shared core, conditional-mean head, and conditional-volatility head. 5. Compute HNN OOB mean and volatility forecasts. 6. Recalibrate volatility using the Aionx log residual-square regression. 7. Return forecasts on the original target scale. Output: | Method | Output | | --- | --- | | `predict(X)` | pandas Series of conditional mean forecasts through `ModelFit`. | | `predict_variance(X)` | numpy array of conditional variance forecasts in target units squared. | | `predict_volatility(X)` | numpy array of conditional standard-deviation forecasts in target units. | | `predict_distribution(X)` | `(mean, variance)` arrays in target units. 
| | `predict_quantiles(X, levels=None)` | dictionary from quantile level to normal-approximation quantile forecast. | Diagnostics: | Field | Meaning | | --- | --- | | `fit.diagnostics["density"]["volatility_emphasis"]` | Aionx volatility-emphasis value used by the HNN volatility head. | | `fit.diagnostics["density"]["prior_oob_mse"]` | Prior-DNN OOB mean squared error used when `volatility_emphasis=None`. | | `fit.diagnostics["density"]["oob_rescaling"]` | Intercept, slope, scaler, and OOB count for the log residual-square volatility recalibration. | | `fit.estimator.oob_prediction_` | Fit-window OOB conditional mean, volatility, and variance table. | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `common_layers` | `2` | fixed by preset | Shared common-core depth. | | `mean_layers` | `2` | fixed by preset | Conditional-mean hemisphere depth. | | `volatility_layers` | `2` | fixed by preset | Conditional-volatility hemisphere depth. | | `prior_layers` | `3` | fixed by preset | Prior-DNN hidden depth. | | `neurons` | `400` | yes | Hidden width. The paper/Aionx default is 400; smaller values are useful for smoke tests. | | `dropout` | `0.2` | fixed by preset | Dropout rate. | | `learning_rate` | `0.001` | yes | Adam learning rate. | | `max_epochs` | `100` | fixed by preset | Training epoch cap. | | `n_estimators` | `100` | yes | DensityHNN bootstrap ensemble size. | | `prior_estimators` | `50` | yes | Prior-DNN bootstrap ensemble size used to estimate volatility emphasis. | | `subsample` | `0.8` | fixed by preset | Blocked bootstrap sampling rate. | | `block_size` | `8` | fixed by preset | Time-series block size. The paper uses blocked subsampling to preserve temporal dependence. | | `volatility_emphasis` | `None` | fixed by preset | `None` estimates the value from prior-DNN OOB MSE. Passing a float overrides it. Values outside Aionx's `[0.01, 1.0]` range are mapped to `0.99`, following the source code. 
| | `rescale_volatility` | `True` | fixed by preset | Apply the blocked-OOB volatility reality-check recalibration. | | `patience` | `15` | fixed by preset | Early-stopping patience. | | `device` | `"auto"` | fixed by preset | Torch device. | | `quantile_levels` | `(0.05, 0.5, 0.95)` | fixed by preset | Default normal-approximation quantile levels. | | `volatility_clip` | `0.05` | fixed by preset | Minimum volatility used in Gaussian negative log likelihood, matching Aionx's numerical-stability clip. | ## Factor And Time-Series Models ### pls ```python macroforecast.models.pls( X, y, *, n_components=3, scale=True, max_iter=500, tol=1e-6, control_columns=None, include_constant=True, drop_control_columns=True, quadratic_factors=False, ) ``` Fits partial least squares regression. Unlike unsupervised PCA, PLS uses the target while constructing latent components, so it belongs in `models` rather than `preprocessing` or `feature_engineering`. `n_components` is treated as a requested upper bound. At fit time, the model resolves it to `min(requested, n_factor_predictors, n_observations)` so the default is safe for small feature sets. Metadata records both `requested_n_components` and `resolved_n_components`; `n_components` stores the resolved value used by `sklearn.cross_decomposition.PLSRegression`. Implementation map: | Item | Value | | --- | --- | | Backend | `sklearn.cross_decomposition.PLSRegression` | | Paper-code comparison | Hounyo-Li `PLS_emp002.m` and `PLS_tune.m` | | Control handling | Optional `control_columns` block is fitted first, target residuals are passed to PLS, and the control forecast is added back. | | Difference from MATLAB code | MATLAB `plsregress` exposes `stats.W`; macroforecast uses sklearn PLS latent scores and records factor loadings/metadata. The forecasting contract is the same control-residualized PLS factor regression. | | PC2 support | Set `quadratic_factors=True` to add the `PLS_PC2.m` squared-factor forecast head. 
| Hounyo-Li PLS baseline: ```text alphawhat = y_insample * wt_insample' * inv(wt_insample * wt_insample') y_resid = y_insample - alphawhat * wt_insample [~,~,~,~,~,~,~,stats] = plsregress(X_insample', y_resid', K) B = stats.W Fhat = B' * X_insample alphahat = y_resid * Fhat' * inv(Fhat * Fhat') yhat = (alphahat * B') * X_out + alphawhat * wt_out ``` `macroforecast.models.pls()` mirrors this as: | MATLAB step | `macroforecast` implementation | | --- | --- | | `wt` | `control_columns` plus optional constant | | residualize `ytplush` on `wt` | `_PLSCompositeRegressor.control_coef_` and target residual | | `plsregress(..., K)` | `sklearn.cross_decomposition.PLSRegression` | | `Fhat` PLS scores | `transform(...)` / `x_scores_` | | `alphahat` on PLS scores | `factor_coefs_` | | add `alphawhat * wt_out` | `predict()` control block addition | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_components` | `3` | yes | Requested maximum number of latent PLS components. | | `scale` | `True` | fixed by preset | Whether to standardize predictors before PLS. | | `max_iter` | `500` | fixed by preset | NIPALS iteration cap. | | `tol` | `1e-6` | fixed by preset | NIPALS convergence tolerance. | | `control_columns` | `None` | fixed by preset | Optional X columns used as forecasting controls. | | `include_constant` | `True` | fixed by preset | Whether to include a constant in the control block. | | `drop_control_columns` | `True` | fixed by preset | Whether controls are excluded from the PLS block. | | `quadratic_factors` | `False` | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. 
| | Preset | `n_components` | | --- | --- | | `small` | `(1, 2, 3)` | | `standard` | `(1, 2, 3, 5, 8)` | | `wide` | `(1, 2, 3, 5, 8, 10, 12, 20)` | ### scaled_pca ```python macroforecast.models.scaled_pca( X, y, *, n_components=3, scale=True, control_columns=None, include_constant=True, drop_control_columns=True, winsorize_slopes=None, quadratic_factors=False, ) ``` Fits Huang, Jiang, Li, Tong, and Zhou scaled PCA (sPCA) with a linear forecast head. The factor extraction step follows the original `spcaest.m` contract: standardize predictors, estimate one marginal predictive slope for each predictor, scale each standardized predictor by that slope, then run PCA on the scaled panel. Mathematical contract: Let $X \in \mathbb{R}^{T \times N}$ be the model-window predictor matrix and $y \in \mathbb{R}^{T}$ be the target. With `scale=True`, each predictor is standardized inside the active model window using MATLAB's default sample standard deviation convention: $$ X^s_{tj} = \frac{X_{tj}-\bar X_j}{s_j}. $$ For each predictor, estimate the marginal predictive slope from an intercept regression: $$ \hat\beta_j = \frac{\sum_{t=1}^{T}(X^s_{tj}-\bar X^s_j)(y_t-\bar y)} {\sum_{t=1}^{T}(X^s_{tj}-\bar X^s_j)^2}. $$ Build the scaled panel: $$ X^{\mathrm{sPCA}}_{tj} = X^s_{tj}\hat\beta_j. $$ Then compute principal components with Huang's normalization $\hat F'\hat F/T = I$: $$ X^{\mathrm{sPCA}}X^{\mathrm{sPCA}\prime} = UDU', \qquad \hat F = \sqrt{T}\,U_{[:,1:K]}. $$ For forecasting, `macroforecast` regresses the target residual after optional controls on these factors, then projects new scaled observations into the same factor space. This forecast head is the package wrapper; the factor extraction itself matches Huang's `spcaest.m` design. Set `quadratic_factors=True` to reproduce the Hounyo-Li `scaledPCA_PC2.m` forecast head: $$ \hat y_* = w_*\hat a_W + \hat\alpha_1 \hat f_* + \hat\alpha_2 \hat f_*^2. 
$$ For multiple factors, the squared term is applied componentwise, matching the MATLAB code's `alphahat2 * ((leftvector * scaleXs_outofsample).^2)` contract. Original-code match: | Huang `spcaest.m` step | `macroforecast` implementation | | --- | --- | | `Xs = standard(X)` | model-window predictor standardization with `ddof=1` | | `xvar = [ones(T,1) Xs(:,j)]` | `_marginal_slopes(factor_values, y_values)` | | `parm = xvar\target` and `beta(j)=parm(2)` | closed-form marginal slope stored in `scaling_slopes_` | | `scaleXs(:,j)=Xs(:,j)*beta(j)` | `scaled_values = factor_values * slopes` | | `pc_T(scaleXs,nfac)` | `_huang_scaled_pca_state(...)` | | `fhat=Fhat0(:,1:nfac)*sqrt(T)` | `factor_scores_` with `F'F/T=I` | Scaling note: Huang's `spcaest.m` standardizes only `X` before estimating the marginal slopes, while the target stays in its raw units. Therefore `scaling_slopes_` in `scaled_pca` is in target-scale units. This differs from the Hounyo-Li macro SsPCA code below, where both `X` and the target are standardized before the slope step. The two slope vectors can differ by the target scale, but that is a global scalar difference for the factor/forecast structure; the practical scaling logic is the same once predictions are mapped back to the target units. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_components` | `3` | yes | Number of Huang scaled-PCA factors. | | `scale` | `True` | fixed by preset | Whether to standardize predictors inside the model. | | `control_columns` | `None` | fixed by preset | Optional X columns used as forecasting controls. | | `include_constant` | `True` | fixed by preset | Whether to include a constant in the control block. | | `drop_control_columns` | `True` | fixed by preset | Whether controls are excluded from the PCA block. | | `winsorize_slopes` | `None` | fixed by preset | Optional percentile winsorization for scaling slopes. 
| | `quadratic_factors` | `False` | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. | The presets tune `n_components`. Inspect exact candidate lists with `describe_model("scaled_pca")`. ### supervised_pca ```python macroforecast.models.supervised_pca( X, y, *, n_components=3, n_selected=50, min_abs_corr=0.0, scale=True, control_columns=None, include_constant=True, drop_control_columns=True, preselect="none", t_threshold=1.28, elastic_net_alpha=0.0002, elastic_net_l1_ratio=0.5, quadratic_factors=False, random_state=0, ) ``` Fits original-style supervised PCA (SPCA). The implementation follows the MATLAB reproducibility code structure from Hounyo and Li's IJF weak-factor package: residualize the target on optional controls, rank the current predictors by absolute correlation with the current target residual, select `n_selected` predictors, extract one SVD factor, project both target and predictor residuals, then repeat for `n_components`. This is different from `feature_engineering.pca_features()`, which is unsupervised and belongs in `macroforecast.feature_engineering`. Use this model when the target is intentionally allowed to guide component construction inside each model fit window. Mathematical contract: Let $X \in \mathbb{R}^{T \times N}$ be the model-window predictor matrix, $y \in \mathbb{R}^{T}$ be the target, and $W \in \mathbb{R}^{T \times c}$ be the optional control block. With `scale=True`, each predictor and the target are standardized inside the active model window using the same sample standard deviation convention as MATLAB `std(...,0,dim)`. First residualize the target on controls. The paper code writes this with an ordinary inverse; `macroforecast` uses the Moore-Penrose inverse for numerical stability when the control block is singular or nearly singular. $$ \hat a_W = W^{+}y, \qquad r_y^{(0)} = y - W\hat a_W, \qquad R_X^{(0)} = X. 
$$ For component $k = 1,\ldots,K$, compute residual correlations, select the subset $I_k$ of the $q$ most correlated predictors (`n_selected`), extract one SVD loading, and project: $$ c_j^{(k)} = \left|\operatorname{corr}\left(R_{X,j}^{(k-1)}, r_y^{(k-1)}\right)\right|, \qquad I_k = \operatorname{top}_{q}\{c_j^{(k)}\}_{j=1}^{N}. $$ $$ u_k = \operatorname{first\ left\ singular\ vector} \left(R_{X,I_k}^{(k-1)\prime}\right), \qquad \ell_{k,I_k}=u_k, \qquad \ell_{k,j}=0\ \text{for}\ j \notin I_k. $$ $$ f_k = R_X^{(k-1)}\ell_k,\qquad \hat\alpha_k = \frac{r_y^{(k-1)\prime}f_k}{f_k'f_k},\qquad \hat\lambda_k = \frac{R_X^{(k-1)\prime}f_k}{f_k'f_k}. $$ $$ r_y^{(k)} = r_y^{(k-1)}-\hat\alpha_k f_k,\qquad R_X^{(k)} = R_X^{(k-1)}-f_k\hat\lambda_k'. $$ Prediction for a new row $x_*$ and controls $w_*$ is: $$ \hat y_* = w_*\hat a_W + x_*\left(\sum_{k=1}^{K}\hat\alpha_k\ell_k'\right). $$ Set `quadratic_factors=True` for the Hounyo-Li `SPCA_PC2.m` variant. In that case each extraction step also estimates $$ \hat\alpha_{2,k} = \frac{r_y^{(k-1)\prime}(f_k^2)}{(f_k^2)'(f_k^2)} $$ (with $f_k^2$ the elementwise square of $f_k$) and, writing $\hat\alpha_{1,k}$ for the linear coefficient $\hat\alpha_k$ above, updates the target residual as $$ r_y^{(k)} = r_y^{(k-1)} -\hat\alpha_{1,k}f_k -\hat\alpha_{2,k}f_k^2. 
$$ Original-code match: | MATLAB variable / step | `macroforecast` implementation | | --- | --- | | `alphawhat = yt_insample*wt_insample'*inv(wt_insample*wt_insample')` | `_least_squares_coef(control_values, y_values)` | | `COR = abs(corr(xt0', ytplush0'))` | `_absolute_correlations(work_x, work_y)` | | `idx_sorted(1:N1)` | `_selected_indices(..., n_selected=q)` | | `[U,~,~] = svds(xt0(II,:),1)` | `np.linalg.svd(work_x[:, selected], ...)` and first right singular vector | | `Fhat = leftvector * xt0` | `factor = work_x @ loading` | | `alphahat = ytplush0*Fhat' / (Fhat*Fhat')` | `alpha = work_y @ factor / (factor @ factor)` | | `lambdahat = xt0*Fhat' / (Fhat*Fhat')` | `lambdas = work_x.T @ factor / (factor @ factor)` | | `ytplush0 = ytplush0 - alphahat*Fhat` | `work_y = work_y - alpha * factor` | | `xt0 = xt0 - lambdahat * Fhat` | `work_x = work_x - np.outer(factor, lambdas)` | Verification status: unit tests include a compact MATLAB-style reference recursion for both SPCA and SsPCA and compare generated predictions against `models.supervised_pca()` and `models.supervised_scaled_pca()`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_components` | `3` | yes | Number of sequential supervised components. | | `n_selected` | `50` | yes | Predictors selected at each SPCA step. | | `min_abs_corr` | `0.0` | yes | Minimum absolute residual correlation retained before PCA. | | `scale` | `True` | fixed by preset | Whether to standardize predictors and target inside the model. | | `control_columns` | `None` | fixed by preset | Optional X columns used as forecasting controls. | | `include_constant` | `True` | fixed by preset | Whether to include a constant in the control block. | | `drop_control_columns` | `True` | fixed by preset | Whether controls are excluded from the PCA block. | | `preselect` | `"none"` | fixed by preset | Optional pre-selection: `"none"`, `"hard_tstat"`, or `"elastic_net"`. 
| | `t_threshold` | `1.28` | fixed by preset | Hard t-stat pre-selection threshold. | | `elastic_net_alpha` | `0.0002` | fixed by preset | Elastic-net pre-selection penalty. | | `elastic_net_l1_ratio` | `0.5` | fixed by preset | Elastic-net pre-selection L1 ratio. | | `quadratic_factors` | `False` | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. | | `random_state` | `0` | fixed by preset | Elastic-net pre-selection random seed. | The presets tune `n_components`, `n_selected`, and `min_abs_corr`. Inspect exact candidate lists with `describe_model("supervised_pca")`. ### supervised_scaled_pca ```python macroforecast.models.supervised_scaled_pca( X, y, *, n_components=3, n_selected=50, min_abs_corr=0.0, scale=True, control_columns=None, include_constant=True, drop_control_columns=True, preselect="none", t_threshold=1.28, elastic_net_alpha=0.0002, elastic_net_l1_ratio=0.5, quadratic_factors=False, random_state=0, ) ``` Fits Hounyo-Li supervised scaled PCA (SsPCA). This adds the paper's predictive-slope scaling step before the SPCA loop: each standardized predictor is first multiplied by its marginal predictive slope for the target. The scaled panel is then passed through the same iterative supervised selection, SVD factor extraction, and projection loop as `supervised_pca`. Mathematical contract: After the within-window standardization used above, estimate one marginal predictive slope per predictor: $$ \hat\gamma_j = \frac{\sum_{t=1}^{T}(x_{tj}-\bar x_j)(y_t-\bar y)} {\sum_{t=1}^{T}(x_{tj}-\bar x_j)^2}. $$ Build the supervised-scaled panel $$ X^{\mathrm{scaled}}_j = \hat\gamma_j X_j, \qquad j=1,\ldots,N. $$ Then run the same SPCA recursion as `supervised_pca` with $R_X^{(0)} = X^{\mathrm{scaled}}$. The forecast is therefore $$ \hat y_* = w_*\hat a_W + x_*^{\mathrm{scaled}} \left(\sum_{k=1}^{K}\hat\alpha_k\ell_k'\right). 
$$ This corresponds to Hounyo-Li SsPCA as implemented in the local MATLAB package: `scaledPCA_emp002.m` supplies the predictive-slope scaling idea, `SPCA_emp002.m` supplies the supervised selection/projection recursion, and `SsPCA_emp002.m` combines the two by applying SPCA to `scaleXs`. Source checked: the local MATLAB reproducibility package for Hounyo and Li, `SsPCA_emp002.m`, `SsPCA_tune.m`, `SPCA_emp002.m`, `scaledPCA_emp002.m`, and `inflation_linear_tune.m`. The Python implementation is a clean port of the algorithmic contract, not copied MATLAB code. Set `quadratic_factors=True` for the `SsPCA_PC2.m` variant. This keeps the same predictive-slope scaling and supervised selection recursion, then adds the componentwise squared-factor forecast head used in the paper's PC2 scripts. Original-code match for the scaling step: | MATLAB variable / step | `macroforecast` implementation | | --- | --- | | `xvar = [ones(1,T); xt_standardized(j,:)]` | `_marginal_slopes(factor_values, y_values)` uses an intercept-equivalent centered OLS slope | | `parm = ytplush*xvar'*inv(xvar*xvar')` | closed-form marginal slope | | `beta_scaled(j) = parm(2)` | `scaling_slopes_` | | `scaleXs(j,:) = xt_standardized(j,:) * beta_scaled(j)` | `factor_values = factor_values * slopes` | | `SsPCA_emp002(scaleXs, ytplush, wt, Khat, number)` | `SupervisedScaledPCARegressor` then calls the same SPCA extraction path | Target scaling note: the Hounyo-Li macro code standardizes the target and predictors before computing `beta_scaled`. Huang's `spcaest.m` standardizes only predictors and keeps the target raw. Consequently, `supervised_scaled_pca` stores standardized-target slopes when `scale=True`, while `scaled_pca` stores raw-target slopes. These stored slope magnitudes are not directly comparable without the target standard deviation. For factor construction and forecast generation, however, the difference is a global target-scale multiplier rather than a different screening or projection rule. 
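The target-scaling note can be checked numerically. The sketch below uses NumPy only: `marginal_slopes` is a hypothetical helper written for this example (it is not the package's `_marginal_slopes`), the data are simulated, and the choice of `ddof=1` for the standard deviations is an assumption. It shows that slopes computed against a standardized target differ from raw-target slopes by one common factor, the target standard deviation.

```python
import numpy as np


def marginal_slopes(x, y):
    """Centered OLS slope of y on each column of x (same as OLS with an intercept)."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    return (xc * yc[:, None]).sum(axis=0) / (xc ** 2).sum(axis=0)


# Simulated window: T observations, N predictors (illustrative values only).
rng = np.random.default_rng(0)
T, N = 120, 5
x_raw = rng.normal(size=(T, N))
y_raw = x_raw @ np.array([1.0, -0.5, 0.0, 0.25, 2.0]) + rng.normal(size=T)

# Within-window standardization of predictors and target.
x_std = (x_raw - x_raw.mean(axis=0)) / x_raw.std(axis=0, ddof=1)
y_sd = y_raw.std(ddof=1)
y_std = (y_raw - y_raw.mean()) / y_sd

# Slopes against the standardized target versus the raw target.
slopes_std_target = marginal_slopes(x_std, y_std)
slopes_raw_target = marginal_slopes(x_std, y_raw)

# Supervised-scaled panel: column j is multiplied by its own slope.
x_scaled = x_std * slopes_std_target

# The two slope vectors differ by a single global multiplier, sd(y).
ratio = slopes_raw_target / slopes_std_target
print(np.allclose(ratio, y_sd))
```

Because the ratio is the same for every predictor, the relative column scaling, and therefore the screening order inside the SPCA loop, is unaffected by which target convention is used.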
| Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_components` | `3` | yes | Number of sequential SsPCA components. | | `n_selected` | `50` | yes | Predictors selected at each SPCA step after slope scaling. | | `min_abs_corr` | `0.0` | yes | Minimum absolute residual correlation retained before PCA. | | `scale` | `True` | fixed by preset | Whether to standardize predictors and target inside the model. | | `control_columns` | `None` | fixed by preset | Optional X columns used as forecasting controls. | | `include_constant` | `True` | fixed by preset | Whether to include a constant in the control block. | | `drop_control_columns` | `True` | fixed by preset | Whether controls are excluded from the PCA block. | | `preselect` | `"none"` | fixed by preset | Optional pre-selection: `"none"`, `"hard_tstat"`, or `"elastic_net"`. | | `t_threshold` | `1.28` | fixed by preset | Hard t-stat pre-selection threshold. | | `elastic_net_alpha` | `0.0002` | fixed by preset | Elastic-net pre-selection penalty. | | `elastic_net_l1_ratio` | `0.5` | fixed by preset | Elastic-net pre-selection L1 ratio. | | `quadratic_factors` | `False` | fixed by preset | Whether to add the Hounyo-Li PC2 squared-factor forecast head. | | `random_state` | `0` | fixed by preset | Elastic-net pre-selection random seed. | The original empirical MATLAB code uses lagged target plus constant controls. In `macroforecast`, pass the lagged target as an X column and list it in `control_columns` when that exact control block is needed. ### ar ```python macroforecast.models.ar(y, *, n_lag=1) ``` Fits a univariate autoregression on the target series. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_lag` | `1` | yes | Autoregressive lag order. 
| | Preset | `n_lag` | | --- | --- | | `small` | `(1, 2, 4)` | | `standard` | `(1, 2, 4, 6, 12)` | | `wide` | `(1, 2, 3, 4, 6, 9, 12, 18, 24)` | ### stlf ```python macroforecast.models.stlf(y, *, period=None, sa_method="ets") ``` STL decomposition forecaster (R `forecast::stlf`). Seasonally adjusts the target with STL, forecasts the seasonally-adjusted series (additive-trend exponential smoothing, random-walk-drift fallback), and adds back the last seasonal cycle. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `period` | `None` | no | Seasonal period; inferred from the index frequency if omitted. | | `sa_method` | `"ets"` | no | Forecaster for the seasonally-adjusted series. | ### naive ```python macroforecast.models.naive(y) ``` Random-walk baseline (R `forecast::naive`). Carries the last observed target value forward, so the h-step path is constant at `y_T`. Target-only. ### seasonal_naive ```python macroforecast.models.seasonal_naive(y, *, period=None) ``` Seasonal-naive baseline (R `forecast::snaive`). Repeats the last full seasonal cycle of length `period`, so step `k` returns the value from one season earlier. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `period` | `None` | no | Seasonal period `m`; defaults to 1 (plain naive). | ### random_walk_drift ```python macroforecast.models.random_walk_drift(y) ``` Random-walk-with-drift baseline (R `forecast::rwf(drift=TRUE)`). Extrapolates the last value by the average historical change: `y_T + h * (y_T - y_1) / (T - 1)`. ### var ```python macroforecast.models.var(panel, *, target=None, n_lag=1, type="const", season=None) ``` Fits a VAR on a multivariate panel. `target` chooses the forecast output column. If omitted, the first column is used. 
The callable now uses an internal OLS implementation aligned with R `vars::VAR` and `predict.varest`: lagged endogenous variables are stacked in lag order, deterministic terms are controlled by R-style `type`, and `predict()` recursively rolls the VAR state forward for point forecasts. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `target` | `None` | fixed by preset | Target column in the panel. | | `n_lag` | `1` | yes | VAR lag order. | | `type` | `"const"` | fixed by preset | R `vars::VAR` deterministic terms: `"const"`, `"trend"`, `"both"`, or `"none"`. Short aliases `"c"`, `"t"`, `"ct"`, and `"n"` are accepted. | | `season` | `None` | fixed by preset | Optional centered seasonal dummies, matching `vars::VAR(season=...)`. | | Preset | `n_lag` | | --- | --- | | `small` | `(1, 2, 4)` | | `standard` | `(1, 2, 4, 6, 12)` | | `wide` | `(1, 2, 3, 4, 6, 9, 12, 18, 24)` | ### bvar_minnesota ```python macroforecast.models.bvar_minnesota( panel, *, target=None, n_lag=1, kappa0=2.0, kappa1=0.5, nu0=0.0, s0=0.0, iter=10000, burnin=5000, random_state=0, ) ``` Fits a Bayesian VAR posterior sampler with the Minnesota prior variance logic used by R `FAVAR::BVAR` and `bvartools::minnesota_prior`. Saved posterior coefficient and covariance draws are available in diagnostics; `predict()` uses posterior-mean VAR coefficients for recursive point forecasts. BVAR forecasting is not a macroforecast-only extension: CRAN `BVAR` exposes `predict.bvar`, while this callable is macroforecast's R-aligned ModelFit surface for the same class of BVAR forecast object. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `target` | `None` | fixed by preset | Target column in the panel. | | `n_lag` | `1` | yes | VAR lag order. | | `kappa0` | `2.0` | yes | Minnesota own-lag prior scale. | | `kappa1` | `0.5` | yes | Minnesota lag-decay exponent. | | `nu0` | `0.0` | fixed by preset | Inverse-Wishart degrees-of-freedom prior parameter. 
| | `s0` | `0.0` | fixed by preset | Inverse-Wishart scale prior parameter. | | `iter` | `10000` | fixed by preset | Total Gibbs iterations. | | `burnin` | `5000` | fixed by preset | Burn-in iterations removed before summaries. | | `random_state` | `0` | fixed by preset | Random seed for posterior draws. | ### bvar_normal_inverse_wishart ```python macroforecast.models.bvar_normal_inverse_wishart( panel, *, target=None, n_lag=1, b0=0.0, vb0=0.0, nu0=0.0, s0=0.0, iter=10000, burnin=5000, random_state=0, ) ``` Fits the same FAVAR-style Bayesian VAR posterior sampler with direct controls for coefficient prior mean/variance and inverse-Wishart covariance prior terms. Saved diagnostics include coefficient posterior mean, standard deviation, credible interval bounds, and posterior mean covariance. ### ets ```python macroforecast.models.ets( y, *, error="add", trend=None, seasonal=None, seasonal_periods=None, damped_trend=False, ) ``` Fits a target-only statsmodels exponential-smoothing model through ETS-style arguments. In `forecasting.run(...)`, this model ignores `X` and fits on the stage target vector. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `error` | `"add"` | fixed by preset | Error component; currently additive by default. | | `trend` | `None` | fixed by preset | Optional trend component such as `"add"`. | | `seasonal` | `None` | fixed by preset | Optional seasonal component such as `"add"`. | | `seasonal_periods` | `None` | fixed by preset | Seasonal period length. | | `damped_trend` | `False` | fixed by preset | Whether to damp the trend component. | Output is `ModelFit`; `predict(X_future)` uses only the number of requested future rows and preserves the provided index. ### holt_winters ```python macroforecast.models.holt_winters( y, *, trend="add", seasonal=None, seasonal_periods=None, damped_trend=False, ) ``` Fits a target-only Holt-Winters exponential-smoothing model. 
In the forecasting runner it is a target-input model: feature matrices are used only to provide the forecast index and horizon length. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `trend` | `"add"` | fixed by preset | Trend component. | | `seasonal` | `None` | fixed by preset | Optional seasonal component. | | `seasonal_periods` | `None` | fixed by preset | Seasonal period length. | | `damped_trend` | `False` | fixed by preset | Whether to damp the trend component. | Output is `ModelFit`; predictions are indexed like the supplied future frame. ### theta_method ```python macroforecast.models.theta_method( y, *, period=None, deseasonalize=True, use_test=True, ) ``` Fits statsmodels' target-only Theta method wrapper. Use it as a benchmark univariate model; it does not consume predictor columns. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `period` | `None` | fixed by preset | Seasonal period passed to statsmodels. | | `deseasonalize` | `True` | fixed by preset | Whether statsmodels deseasonalizes before fitting. | | `use_test` | `True` | fixed by preset | Whether statsmodels uses its internal seasonality test. | Output is `ModelFit`; `predict(X_future)` returns a point forecast series. ### dfm_mixed_mariano_murasawa ```python macroforecast.models.dfm_mixed_mariano_murasawa( panel, *, target=None, metadata=None, monthly_columns=None, quarterly_columns=None, unsupported="raise", n_factors=1, factor_order=1, idiosyncratic_ar1=True, standardize=True, maxiter=500, tolerance=1e-6, ) ``` Fits a monthly/quarterly dynamic factor model through `statsmodels.tsa.statespace.dynamic_factor_mq.DynamicFactorMQ`. The callable uses the Mariano-Murasawa state-space aggregation for quarterly variables by ordering monthly columns first, quarterly columns second, and passing `k_endog_monthly` to statsmodels. 
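The Mariano-Murasawa aggregation can be illustrated without any state-space machinery. The sketch below is a worked check of the restriction itself, not of macroforecast or statsmodels internals. It assumes the usual geometric-mean approximation, under which the quarterly log level is the average of three monthly log levels, and confirms that the resulting quarterly growth rate equals a `[1, 2, 3, 2, 1]` weighted sum of monthly growth rates divided by 3. The monthly series is made up for the example.

```python
import math

MM_WEIGHTS = (1, 2, 3, 2, 1)


def quarterly_growth_from_monthly(monthly_growth, t):
    """Weighted sum of monthly growth at t, t-1, ..., t-4, divided by 3."""
    return sum(w * monthly_growth[t - i] for i, w in enumerate(MM_WEIGHTS)) / 3.0


# Hypothetical monthly log levels for 12 months.
log_level = [math.log(100.0 + 0.7 * m + 3.0 * math.sin(m)) for m in range(12)]

# Monthly growth rates; index 0 has no predecessor and is never used below.
monthly_growth = [float("nan")] + [
    log_level[m] - log_level[m - 1] for m in range(1, 12)
]


def quarterly_log_level(t):
    """Average of the three monthly log levels in the quarter ending at month t."""
    return (log_level[t] + log_level[t - 1] + log_level[t - 2]) / 3.0


t = 11
direct = quarterly_log_level(t) - quarterly_log_level(t - 3)
restricted = quarterly_growth_from_monthly(monthly_growth, t)
print(abs(direct - restricted) < 1e-12)
```

This is why a quarterly growth series loads on the current and four lagged monthly factor values once monthly and quarterly columns are ordered and declared to the backend.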
R comparison: this is a backend-wrapper analogue of the mixed-frequency DFM contract used by `dfms::DFM(X, quarterly.vars=...)` and archived `nowcasting::nowcast(method="EM")`. Those R implementations require quarterly series to be positioned after monthly series and impose the Mariano-Murasawa `[1, 2, 3, 2, 1]` temporal aggregation restriction for quarterly growth/flow variables. macroforecast delegates the Kalman/EM likelihood to statsmodels rather than reimplementing the R/C++ filter code. The preferred input is a native mixed-frequency bundle: ```python mixed = mf.data.combine(monthly_bundle, quarterly_bundle, frequency="native") fit = mf.models.dfm_mixed_mariano_murasawa(mixed, target="GDPC1") ``` `metadata["native_frequency_by_column"]` is used to split monthly and quarterly columns. If metadata are absent, the function infers frequencies from observed date spacing. Explicit `monthly_columns` and `quarterly_columns` override metadata. Unsupported frequencies raise by default; set `unsupported="drop"` to drop those columns before fitting. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `target` | `None` | fixed by preset | Forecasted panel column. Defaults to first quarterly column, otherwise first column. | | `metadata` | `None` | fixed by preset | Metadata with `native_frequency_by_column`; normally supplied by `DataBundle`. | | `monthly_columns` | `None` | fixed by preset | Explicit monthly columns. | | `quarterly_columns` | `None` | fixed by preset | Explicit quarterly columns. | | `unsupported` | `"raise"` | fixed by preset | Unsupported frequency policy: `"raise"` or `"drop"`. | | `n_factors` | `1` | yes | Number of dynamic factors. | | `factor_order` | `1` | yes | VAR order for factor dynamics. | | `idiosyncratic_ar1` | `True` | fixed by preset | Model idiosyncratic components as AR(1). | | `standardize` | `True` | fixed by preset | Let statsmodels standardize observed variables before fitting. 
| | `maxiter` | `500` | fixed by preset | EM iteration cap. | | `tolerance` | `1e-6` | fixed by preset | EM convergence tolerance. | | Preset | `n_factors` | `factor_order` | | --- | --- | --- | | `small` | `(1,)` | `(1,)` | | `standard` | `(1, 2)` | `(1, 2)` | | `wide` | `(1, 2, 3)` | `(1, 2, 3)` | Diagnostics include filtered factors, fitted target values when available, target residuals, likelihood, and fitted parameter estimates. ### dfm_unrestricted_midas ```python macroforecast.models.dfm_unrestricted_midas( panel, *, target, metadata=None, lag_columns=None, lags=(0, 1, 2), factor_lags=(0,), target_frequency="quarterly", anchor_position="period_end", n_factors=1, factor_order=1, idiosyncratic_ar1=True, standardize=True, maxiter=500, tolerance=1e-6, alpha=0.0, fit_intercept=True, drop_missing=True, ) ``` Fits a composite mixed-frequency model: 1. Fit `dfm_mixed_mariano_murasawa(...)` on the native mixed-frequency panel. 2. Extract filtered DFM factors at the target anchor dates. 3. Add optional observed lag blocks from `mixed_frequency_lags(...)`. 4. Fit `unrestricted_midas(...)` as the forecast head. This is a convenience composite, not a new state-space likelihood. The returned fit's `predict()` method accepts a prepared feature matrix with the same columns as `fit.estimator.design_`. The lower-level `fit.estimator.predict_from_panel(...)` method rebuilds the composite design from a native mixed-frequency panel. `forecasting.run(...)` uses that method: it fits the MIDAS head on the training panel, masks the test target values, then refits the DFM on the available native panel so current monthly information can enter the test-origin factor design without using the held-out target. R comparison: this is the explicit callable version of a two-stage workflow, not a single R estimator. The first stage follows the DFM contract above. The forecast head is aligned with `midasr::midas_u` when `alpha=0`; `alpha>0` is a macroforecast ridge extension. 
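To make the role of `alpha` in the forecast head concrete, here is a minimal NumPy sketch of a linear head with an optional ridge penalty. `fit_linear_head` is a hypothetical function for illustration. Leaving the intercept unpenalized and applying the penalty to unscaled coefficients are assumptions, and macroforecast's actual head may differ in those details. With `alpha=0.0` the solution coincides with ordinary least squares, which is the `midasr::midas_u` case described in the text.

```python
import numpy as np


def fit_linear_head(x, y, alpha=0.0, fit_intercept=True):
    """Closed-form OLS (alpha=0) or ridge (alpha>0) with an unpenalized intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if fit_intercept:
        x_mean, y_mean = x.mean(axis=0), y.mean()
    else:
        x_mean, y_mean = np.zeros(x.shape[1]), 0.0
    xc, yc = x - x_mean, y - y_mean
    gram = xc.T @ xc + alpha * np.eye(x.shape[1])
    coef = np.linalg.solve(gram, xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return intercept, coef


# Simulated composite design: 40 anchor dates, 6 factor/lag columns.
rng = np.random.default_rng(0)
x = rng.normal(size=(40, 6))
y = 0.5 + x @ np.array([1.0, 0.5, 0.25, 0.0, 0.0, -0.5]) + rng.normal(scale=0.1, size=40)

ols_intercept, ols_coef = fit_linear_head(x, y, alpha=0.0)
ridge_intercept, ridge_coef = fit_linear_head(x, y, alpha=10.0)

# Ridge shrinks the coefficient vector relative to OLS.
print(np.linalg.norm(ridge_coef) < np.linalg.norm(ols_coef))
```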
| Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `target` | required | fixed by preset | Forecasted target column. | | `metadata` | `None` | fixed by preset | Metadata with native frequencies; normally supplied by `DataBundle`. | | `lag_columns` | `None` | fixed by preset | Observed columns added as unrestricted MIDAS lags. | | `lags` | `(0, 1, 2)` | yes | Native-frequency lags for observed columns. | | `factor_lags` | `(0,)` | yes | Monthly lags of filtered DFM factors. | | `target_frequency` | `"quarterly"` | fixed by preset | Frequency used to position target anchor dates. | | `anchor_position` | `"period_end"` | fixed by preset | Anchor positioning; useful for FRED-QD quarter-start dates. | | `n_factors` | `1` | yes | Number of DynamicFactorMQ factors. | | `factor_order` | `1` | yes | VAR order for factor dynamics. | | `idiosyncratic_ar1` | `True` | fixed by preset | Model DFM idiosyncratic components as AR(1). | | `standardize` | `True` | fixed by preset | Let DynamicFactorMQ standardize observed variables. | | `maxiter` | `500` | fixed by preset | DFM EM iteration cap. | | `tolerance` | `1e-6` | fixed by preset | DFM EM convergence tolerance. | | `alpha` | `0.0` | yes | Ridge penalty on the unrestricted MIDAS head. | | `fit_intercept` | `True` | fixed by preset | Whether the unrestricted MIDAS head includes an intercept. | | `drop_missing` | `True` | fixed by preset | Drop incomplete composite design rows before fitting the head. | ### midas_almon ```python macroforecast.models.midas_almon( X, y, *, polynomial_order=2, theta=None, alpha=0.0, fit_intercept=True, ) ``` Fits a MIDAS regression where each lag group is compressed with normalized exponential Almon weights before a linear or ridge head is fit. R comparison: `midasr::midas_r(..., nealmon)` jointly estimates the aggregate scale and Almon shape by nonlinear least squares. 
macroforecast keeps the shape fixed as a hyperparameter and estimates only the aggregate regression coefficient in a linear/ridge head. The weight shape matches the scale-free part of `midasr::nealmon`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `polynomial_order` | `2` | yes | Degree of the Almon polynomial. |
| `theta` | `None` | fixed by preset | Shape coefficients for the scale-free part of `midasr::nealmon`. `None` gives equal weights. |
| `alpha` | `0.0` | yes | Ridge penalty on the regression head. |
| `fit_intercept` | `True` | fixed by preset | Whether the regression head includes an intercept. |

Weight formula:

```text
h_j = j, j = 1, ..., d
w_j = exp(theta_1 h_j + ... + theta_p h_j^p) / sum_i exp(theta_1 h_i + ... + theta_p h_i^p)
```

If `theta` is supplied, it must contain `polynomial_order` values. The aggregate coefficient scale is estimated by the regression head, corresponding to the first scale parameter in `midasr::nealmon`.

### midas_beta

```python
macroforecast.models.midas_beta(
    X,
    y,
    *,
    beta_params=(1.0, 1.0),
    alpha=0.0,
    fit_intercept=True,
)
```

Fits a MIDAS regression where each lag group is compressed with normalized beta weights before a linear or ridge head is fit.

R comparison: this uses the scale-free form of `midasr::nbetaMT` with `p=(1, a, b, 0)`: endpoints are shifted by machine epsilon, the beta density is normalized, and the aggregate scale is estimated by the regression head.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `beta_params` | `(1.0, 1.0)` | yes | Positive beta-shape parameters `(a, b)`. |
| `alpha` | `0.0` | yes | Ridge penalty on the regression head. |
| `fit_intercept` | `True` | fixed by preset | Whether the regression head includes an intercept. |

Weight formula:

```text
z_j = (j - 1) / (d - 1), with endpoint epsilon adjustment
w_j = z_j^(a-1) (1-z_j)^(b-1) / sum_i z_i^(a-1) (1-z_i)^(b-1)
```

Both beta parameters must be strictly positive.
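The two weight formulas can be written out in a few lines of standard-library Python. This is a sketch of the formulas as stated in this section, not the package's implementation: `almon_weights` and `beta_weights` are hypothetical names, and the handling of a single-lag group is an assumption.

```python
import math
import sys


def almon_weights(n_lags, theta=None):
    """Normalized exponential Almon weights; theta=None gives equal weights."""
    if theta is None:
        return [1.0 / n_lags] * n_lags
    scores = [
        sum(th * j ** (p + 1) for p, th in enumerate(theta))
        for j in range(1, n_lags + 1)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


def beta_weights(n_lags, a, b):
    """Normalized beta weights with the endpoint epsilon adjustment."""
    if a <= 0 or b <= 0:
        raise ValueError("beta parameters must be strictly positive")
    if n_lags == 1:
        return [1.0]  # assumption: a single lag carries all the weight
    eps = sys.float_info.epsilon
    z = [(j - 1) / (n_lags - 1) for j in range(1, n_lags + 1)]
    z[0] += eps
    z[-1] -= eps
    raw = [zj ** (a - 1) * (1 - zj) ** (b - 1) for zj in z]
    total = sum(raw)
    return [r / total for r in raw]


flat_almon = almon_weights(4)
decaying_almon = almon_weights(4, theta=(-0.5, 0.0))
flat_beta = beta_weights(4, 1.0, 1.0)
decaying_beta = beta_weights(4, 1.0, 2.0)
print(decaying_beta)
```

With `beta_params=(1.0, 1.0)` every lag receives the same weight, while `(1.0, 2.0)` produces weights that decline linearly toward the most distant lag.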
### midas_step

```python
macroforecast.models.midas_step(
    X,
    y,
    *,
    n_steps=3,
    step_bounds=None,
    step_weights=None,
    alpha=0.0,
    fit_intercept=True,
)
```

Fits a MIDAS regression where lags are grouped into piecewise-constant step blocks. If `step_bounds` and `step_weights` are omitted, the lag range is split into `n_steps` blocks with equal raw step heights, then normalized to a scale-free weight vector.

R comparison: `midasr::polystep(p, d, m, a)` repeats raw step coefficients between interior cut points. macroforecast exposes the same idea through `step_bounds=a` and `step_weights=p`, then normalizes the resulting shape because the aggregate scale is estimated by the regression head.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `n_steps` | `3` | yes | Number of lag buckets when `step_bounds` is omitted. |
| `step_bounds` | `None` | fixed by preset | Optional interior cut points, matching `midasr::polystep(..., a=...)`. |
| `step_weights` | `None` | fixed by preset | Optional raw step heights, one per bucket. |
| `alpha` | `0.0` | yes | Ridge penalty on the regression head. |
| `fit_intercept` | `True` | fixed by preset | Whether the regression head includes an intercept. |

`n_steps` must be positive. If supplied, `step_bounds` must be strictly increasing and smaller than the number of lag columns; `step_weights` must contain one value per resulting bucket.

### restricted_midas

```python
macroforecast.models.restricted_midas(
    X,
    y,
    *,
    weighting="almon",
    polynomial_order=2,
    start_params=None,
    n_steps=3,
    step_bounds=None,
    fit_intercept=True,
    maxiter=1000,
    tolerance=1e-8,
)
```

Fits a nonlinear restricted MIDAS regression over an explicit lag matrix. This is the direct callable counterpart to `midasr::midas_r` when the formula has already been expanded into columns such as `PAYEMS_lag0`, `PAYEMS_lag1`, and `PAYEMS_lag2`.
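As an illustration of what "already expanded into columns" implies, the sketch below groups lag columns by their prefix and expands a low-dimensional `nealmon`-style parameter vector (scale first, then shape terms) into one coefficient per lag. Both helpers are hypothetical. The regular expression encodes an assumed naming rule inferred from the example column names, and the real parser and coefficient map inside macroforecast may differ.

```python
import math
import re
from collections import defaultdict

# Assumed naming rule: "<group>_lag<integer>".
LAG_PATTERN = re.compile(r"^(?P<group>.+)_lag(?P<lag>\d+)$")


def group_lag_columns(columns):
    """Map each group prefix to its columns, ordered by lag number."""
    groups = defaultdict(list)
    for col in columns:
        match = LAG_PATTERN.match(col)
        if match is None:
            raise ValueError(f"not a lag column: {col!r}")
        groups[match["group"]].append((int(match["lag"]), col))
    return {name: [col for _, col in sorted(items)] for name, items in groups.items()}


def nealmon_coefficients(params, n_lags):
    """Expand (scale, theta_1, ..., theta_p) into one coefficient per lag."""
    scale, *shape = params
    scores = [
        sum(th * j ** (p + 1) for p, th in enumerate(shape))
        for j in range(1, n_lags + 1)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [scale * e / total for e in expo]


columns = ["PAYEMS_lag0", "PAYEMS_lag1", "PAYEMS_lag2", "INDPRO_lag1", "INDPRO_lag0"]
groups = group_lag_columns(columns)
coefs = nealmon_coefficients((2.0, -0.3, 0.0), n_lags=len(groups["PAYEMS"]))
print(groups)
print(coefs)
```

The expanded coefficients sum to the scale parameter, which is why the restricted parameter vector stays low-dimensional no matter how many lag columns a group contains.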
R comparison: `midasr::midas_r` maps each low-dimensional restriction parameter vector into full lag coefficients and minimizes the nonlinear least-squares objective. `restricted_midas()` uses the same objective and the same `nealmon`, `nbetaMT`, or `polystep` coefficient maps. It uses SciPy `least_squares` instead of R's default `optim(method="BFGS")`, so optimizer traces are not bit-identical, but the restricted regression equation is the same. Formula parsing, AR* common-factor terms, HAC covariance, model tables, and S3 forecast utilities are not reproduced here. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `weighting` | `"almon"` | yes | Restriction map: `"almon"`/`"nealmon"`, `"beta"`/`"nbetaMT"`, or `"step"`/`"polystep"`. | | `polynomial_order` | `2` | yes | Number of Almon shape terms after the aggregate scale parameter. | | `start_params` | `None` | fixed by preset | Starting values. Pass one sequence for all lag groups, or a mapping from group name to sequence. | | `n_steps` | `3` | yes | Number of step buckets when `weighting="step"` and `step_bounds` is omitted. | | `step_bounds` | `None` | fixed by preset | Interior cut points for `polystep`-style step coefficients. | | `fit_intercept` | `True` | fixed by preset | Whether to estimate an intercept outside the restricted lag coefficients. | | `maxiter` | `1000` | fixed by preset | Maximum SciPy least-squares function evaluations. | | `tolerance` | `1e-8` | fixed by preset | Shared `xtol`, `ftol`, and `gtol` stopping tolerance. | Outputs include fitted values, residuals, unrestricted effective lag coefficients, the optimized restricted parameter vector, convergence metadata, and the lag-group metadata used to expand coefficients. ### unrestricted_midas ```python macroforecast.models.unrestricted_midas( X, y, *, alpha=0.0, fit_intercept=True, ) ``` Fits an unrestricted MIDAS regression. 
Unlike the weighted variants, it does not collapse lag groups; each supplied lag column receives its own coefficient. R comparison: this matches `midasr::midas_u` when `alpha=0`, because every lag coefficient is free and the regression is ordinary least squares. `alpha>0` is a macroforecast ridge extension for high-dimensional lag matrices. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `alpha` | `0.0` | yes | Ridge penalty. `0.0` gives an ordinary linear head. | | `fit_intercept` | `True` | fixed by preset | Whether the regression head includes an intercept. | ### MIDAS Input Contract The MIDAS callables expect lag-grouped predictor columns, typically names like `PAYEMS_lag0`, `PAYEMS_lag1`, and `PAYEMS_lag2`. Columns sharing the same prefix before `_lag#` are collapsed into one weighted aggregate before a linear or ridge regression is fit. This keeps mixed-frequency weighting as a model choice while leaving calendar alignment and lag construction in `data`, `preprocessing`, and `feature_engineering`. These callables are small model functions, not workflow recipes. They do not infer target anchors, release calendars, or future design matrices. Build the lag matrix explicitly with `mixed_frequency_lags(...)`, align `X` and `y`, then call the model. The weighted MIDAS callables `midas_almon()`, `midas_beta()`, and `midas_step()` treat shape parameters as fixed or selected hyperparameters. Use `restricted_midas()` when the shape and scale parameters should be estimated jointly by nonlinear least squares, matching the `midasr::midas_r` estimation target. For `unrestricted_midas()`, build its input matrix with `mf.feature_engineering.mixed_frequency_lags(...)` when the source data are native mixed-frequency panels. All MIDAS callables preserve lag metadata. Weighted variants record lag groups, resolved weights, weighted aggregate column names, aggregate coefficients, and effective lag coefficients. 
`unrestricted_midas()` records the original lag groups and per-lag coefficients. ```python X_midas = mf.feature_engineering.mixed_frequency_lags( mixed, target="GDPC1", columns=["PAYEMS", "INDPRO"], lags=range(0, 12), target_frequency="quarterly", anchor_position="period_end", drop_missing=True, ) y = mixed.panel["GDPC1"].dropna() y.index = y.index.to_period("Q").asfreq("M", how="end").to_timestamp() aligned = X_midas.join(y.rename("GDPC1")).dropna() fit = mf.models.midas_beta( aligned.drop(columns="GDPC1"), aligned["GDPC1"], beta_params=(1.0, 2.0), alpha=0.1, ) fit.metadata["weights"] fit.diagnostics["effective_lag_coefficients"] ``` ### far ```python macroforecast.models.far( X, y, *, n_factors=3, n_lag=1, random_state=0, ) ``` Fits factor-augmented autoregression: PCA factors from `X` plus AR lags of `y`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_factors` | `3` | yes | Number of PCA factors. | | `n_lag` | `1` | yes | Autoregressive lag order. | | `random_state` | `0` | fixed by preset | PCA random seed. | | Preset | `n_factors` | `n_lag` | | --- | --- | --- | | `small` | `(1, 2, 3)` | `(1, 2, 4)` | | `standard` | `(1, 2, 3, 5, 8)` | `(1, 2, 4, 6, 12)` | | `wide` | `(1, 2, 3, 5, 8, 10, 12)` | `(1, 2, 3, 4, 6, 9, 12, 18, 24)` | ### favar ```python macroforecast.models.favar( X, y, *, n_factors=2, n_lag=2, fctmethod="BBE", slowcode=None, factorprior=None, varprior=None, nburn=5000, nrep=15000, standardize=True, random_state=0, ) ``` Fits a Bayesian FAVAR aligned with CRAN `FAVAR::FAVAR`: optional R-style standardization, `ExtrPC()` factor extraction, BBE `facrot()` or BGM factor identification, conjugate loading-equation draws, and the internal `FAVAR::BVAR` posterior sampler for the `[factors, y]` VAR block. Important boundary: BVAR forecasting is standard, and CRAN `BVAR` has `predict.bvar`. 
The macroforecast-specific extension here is narrower: CRAN `FAVAR` exposes summaries, coefficients, and impulse responses for `favar` objects, but not `predict.favar`. Therefore `macroforecast.models.favar(...).predict(...)` is a ModelFit forecast wrapper over the fitted FAVAR posterior VAR state using posterior-mean coefficients. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_factors` | `2` | yes | Number of latent factors. | | `n_lag` | `2` | yes | VAR lag order on the target plus factors. | | `fctmethod` | `"BBE"` | fixed by preset | Factor identification method: `"BBE"` or `"BGM"`. | | `slowcode` | `None` | fixed by preset | Boolean slow-variable mask required by BBE. | | `factorprior` | `None` | fixed by preset | Factor loading prior controls. | | `varprior` | `None` | fixed by preset | BVAR prior controls for the factor VAR block. | | `nburn` | `5000` | fixed by preset | Burn-in iterations for posterior draws. | | `nrep` | `15000` | fixed by preset | Saved posterior draw count. | | `standardize` | `True` | fixed by preset | Use R `scale()` semantics for X and y before factor extraction. | | `random_state` | `0` | fixed by preset | Random seed for posterior draws. | | Preset | `n_factors` | `n_lag` | | --- | --- | --- | | `small` | `(1, 2, 3)` | `(1, 2, 4)` | | `standard` | `(1, 2, 3, 5, 8)` | `(1, 2, 4, 6, 12)` | | `wide` | `(1, 2, 3, 5, 8, 10, 12)` | `(1, 2, 3, 4, 6, 9, 12, 18, 24)` | ## Tree And Machine-Learning Models ### Tree Implementation Map Tree callables use backend wrappers, hybrid wrappers, or package-native code. Fit-time model ensembles such as bagging, subagging, stacking, Super Learner, and Booging live in [Model Ensemble](model_ensemble.md). 
| Model | Implementation class | Runtime backend |
| --- | --- | --- |
| `decision_tree` | backend wrapper | `sklearn.tree.DecisionTreeRegressor` |
| `random_forest` | backend wrapper | `sklearn.ensemble.RandomForestRegressor` |
| `extra_trees` | backend wrapper | `sklearn.ensemble.ExtraTreesRegressor` |
| `gradient_boosting` | backend wrapper | `sklearn.ensemble.GradientBoostingRegressor` |
| `xgboost` | optional backend wrapper | `xgboost.XGBRegressor` |
| `lightgbm` | optional backend wrapper | `lightgbm.LGBMRegressor` |
| `lgb_plus` | package-native hybrid | LGB+ competition algorithm aligned to `philgoucou/lgbplus`, using `lightgbm.train` for residual tree candidates |
| `lgba_plus` | package-native hybrid | LGB^A+ alternating algorithm aligned to `philgoucou/lgbplus`, using `lightgbm.train` for residual tree blocks |
| `catboost` | optional backend wrapper | `catboost.CatBoostRegressor` |
| `quantile_regression_forest` | hybrid | sklearn forest plus macroforecast leaf-target quantile store |
| `macro_random_forest` | hybrid adapter | vendored `macroforecast.models._mrf_reference.MacroRandomForest` |

`backend wrapper` means the statistical estimator is delegated to the named package and macroforecast standardizes callable input, metadata, diagnostics, and persistence. `hybrid` means macroforecast owns part of the algorithmic contract, such as resampling, leaf-distribution storage, feature augmentation, or pandas-to-reference-package adaptation. `package-native` means the estimator logic itself is implemented inside macroforecast.

### decision_tree

```python
macroforecast.models.decision_tree(
    X,
    y,
    *,
    max_depth=None,
    min_samples_leaf=1,
    random_state=0,
)
```

Fits sklearn CART regression.

R parity is intentionally not claimed for the sklearn tree wrappers. The named backend owns the estimator; macroforecast owns the pandas `X, y` contract, metadata, diagnostics, and search-space registration.
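The division of labor in a backend wrapper can be sketched as follows. This is an illustrative stand-in, not macroforecast code: `TreeFitSketch` and `decision_tree_sketch` are hypothetical names, and the metadata keys are assumptions. The estimator is plain `sklearn.tree.DecisionTreeRegressor`; the wrapper only handles the pandas contract and returns predictions as a `Series` named `"prediction"` on the index of the supplied frame.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor


class TreeFitSketch:
    """Hypothetical fitted-result object; not macroforecast's ModelFit."""

    def __init__(self, estimator, feature_names, target_name, metadata):
        self.estimator = estimator
        self.feature_names = feature_names
        self.target_name = target_name
        self.metadata = metadata

    def predict(self, X):
        # Reuse the fit-time column order and keep the caller's index.
        values = self.estimator.predict(X[list(self.feature_names)])
        return pd.Series(values, index=X.index, name="prediction")


def decision_tree_sketch(X, y, *, max_depth=None, min_samples_leaf=1, random_state=0):
    """Delegate estimation to sklearn; record pandas-side metadata."""
    estimator = DecisionTreeRegressor(
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=random_state,
    )
    estimator.fit(X, y)
    metadata = {"n_obs": len(X), "max_depth": max_depth}
    return TreeFitSketch(estimator, tuple(X.columns), y.name, metadata)


# Simulated monthly data (illustrative only).
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=60, freq="MS")
X = pd.DataFrame(rng.normal(size=(60, 2)), index=index, columns=["x1", "x2"])
y = pd.Series(2.0 * X["x1"] + rng.normal(scale=0.1, size=60), name="y")

fit = decision_tree_sketch(X.iloc[:48], y.iloc[:48], max_depth=3)
pred = fit.predict(X.iloc[48:])
print(pred.head())
```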
| Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `max_depth` | `None` | yes | Maximum tree depth. | | `min_samples_leaf` | `1` | yes | Minimum samples per terminal leaf. | | `random_state` | `0` | fixed by preset | Tree random seed. | | Preset | `max_depth` | `min_samples_leaf` | | --- | --- | --- | | `small` | `(3, 5, None)` | `(1, 3)` | | `standard` | `(3, 5, 10, None)` | `(1, 3, 5)` | | `wide` | `(2, 3, 5, 10, 20, None)` | `(1, 2, 3, 5, 10)` | ### random_forest ```python macroforecast.models.random_forest( X, y, *, n_estimators=200, max_depth=None, min_samples_leaf=1, random_state=0, n_jobs=1, ) ``` Fits sklearn random forest regression. The fitted wrapper exposes sklearn feature importances through `fit.diagnostics["feature_importance"]`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `200` | yes | Number of trees. | | `max_depth` | `None` | yes | Maximum depth per tree. | | `min_samples_leaf` | `1` | yes | Minimum samples per terminal leaf. | | `random_state` | `0` | fixed by preset | Forest random seed. | | `n_jobs` | `1` | fixed by preset | Parallel worker count. | | Preset | `n_estimators` | `max_depth` | `min_samples_leaf` | | --- | --- | --- | --- | | `small` | `(50, 100)` | `(3, 5, None)` | `(1, 3)` | | `standard` | `(100, 200, 500)` | `(3, 5, 10, None)` | `(1, 3, 5)` | | `wide` | `(100, 200, 500, 1000)` | `(3, 5, 10, 20, None)` | `(1, 2, 3, 5, 10)` | Default model-selection method: `random`. ### extra_trees ```python macroforecast.models.extra_trees( X, y, *, n_estimators=200, max_depth=None, min_samples_leaf=1, random_state=0, n_jobs=1, ) ``` Fits sklearn extremely randomized trees. Parameters and presets match `random_forest`. Default model-selection method: `random`. ### gradient_boosting ```python macroforecast.models.gradient_boosting( X, y, *, n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0, ) ``` Fits sklearn gradient-boosted regression trees. 
This is one boosting estimator. Fit-time boosting ensembles such as Booging live in [Model Ensemble](model_ensemble.md). | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `200` | yes | Number of boosting stages. | | `learning_rate` | `0.1` | yes | Shrinkage per stage. | | `max_depth` | `3` | yes | Maximum tree depth. | | `random_state` | `0` | fixed by preset | Boosting random seed. | | Preset | `n_estimators` | `learning_rate` | `max_depth` | | --- | --- | --- | --- | | `small` | `(50, 100)` | `(0.05, 0.1)` | `(2, 3)` | | `standard` | `(100, 200, 500)` | `(0.03, 0.05, 0.1)` | `(2, 3, 5)` | | `wide` | `(100, 200, 500, 1000)` | `(0.01, 0.03, 0.05, 0.1)` | `(2, 3, 5, 8)` | Default model-selection method: `random`. ### mars ```python macroforecast.models.mars( X, y, *, max_terms=20, max_degree=1, n_knots=10, min_improvement=1e-6, penalty=2.0, prune=True, ) ``` Fits a package-native MARS-style hinge-basis regression. It uses forward insertion of hinge basis pairs and optional backward pruning by generalized cross-validation. This avoids the unmaintained `pyearth` dependency; it is a clean internal implementation and does not claim bit-level equivalence to other MARS backends. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `max_terms` | `20` | yes | Maximum number of basis terms including intercept. | | `max_degree` | `1` | yes | Maximum interaction degree. | | `n_knots` | `10` | yes | Candidate quantile knots per predictor. | | `min_improvement` | `1e-6` | fixed by preset | Forward-step relative RSS improvement floor. | | `penalty` | `2.0` | fixed by preset | GCV pruning complexity penalty. | | `prune` | `True` | fixed by preset | Whether to prune terms by GCV. | Default model-selection method: `random`. ### xgboost ```python macroforecast.models.xgboost( X, y, *, n_estimators=300, learning_rate=0.1, max_depth=6, subsample=1.0, random_state=0, **kwargs, ) ``` Fits `xgboost.XGBRegressor`. 
Requires `macroforecast[xgboost]`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `300` | yes | Number of boosting stages. | | `learning_rate` | `0.1` | yes | Shrinkage per stage. | | `max_depth` | `6` | yes | Maximum tree depth. | | `subsample` | `1.0` | yes | Row subsample share. | | `random_state` | `0` | fixed by preset | Boosting random seed. | Preset spaces match `gradient_boosting` plus `subsample=(0.6, 0.8, 1.0)`. Default model-selection method: `random`. ### lightgbm ```python macroforecast.models.lightgbm( X, y, *, n_estimators=300, learning_rate=0.1, max_depth=-1, num_leaves=31, random_state=0, **kwargs, ) ``` Fits `lightgbm.LGBMRegressor`. Requires `macroforecast[lightgbm]`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `300` | yes | Number of boosting stages. | | `learning_rate` | `0.1` | yes | Shrinkage per stage. | | `max_depth` | `-1` | yes | Maximum tree depth; `-1` means no limit. | | `num_leaves` | `31` | yes | Maximum leaves per tree. | | `random_state` | `0` | fixed by preset | Boosting random seed. | | Preset | `n_estimators` | `learning_rate` | `max_depth` | `num_leaves` | | --- | --- | --- | --- | --- | | `small` | `(50, 100)` | `(0.05, 0.1)` | `(-1, 3, 5)` | `(15, 31)` | | `standard` | `(100, 200, 500)` | `(0.03, 0.05, 0.1)` | `(-1, 3, 5, 10)` | `(15, 31, 63)` | | `wide` | `(100, 200, 500, 1000)` | `(0.01, 0.03, 0.05, 0.1)` | `(-1, 3, 5, 10, 20)` | `(15, 31, 63, 127)` | Default model-selection method: `random`. ### lgb_plus #### Paper Citation And Source Primary paper: > Goulet Coulombe, Philippe. 2026. "LGB+: A Macroeconomic Forecasting Road > Test." Draft dated March 18, 2026. SSRN abstract 6439178. > DOI: [`10.2139/ssrn.6439178`](https://doi.org/10.2139/ssrn.6439178). > Paper page: . 
Reference implementation: | Source | Role in `macroforecast` | | --- | --- | | [`philgoucou/lgbplus`](https://github.com/philgoucou/lgbplus) | Original R/Python implementation repository. | | `R/lgb_plus.R` | Competition algorithm, `linear_candidate_fraction`, component prediction helpers, and ensemble helper. | | `python/lgb_plus.py` | Competition estimator class with in-class ensemble storage and step histories. | | `R/lgb_plus_A.R` | Alternating algorithm and `lgb_plus_A_ensemble` helper. | | `python/lgb_plus_A.py` | Alternating estimator class. | #### Paper Motivation The paper targets a macro forecasting problem that appears repeatedly in small and medium macro samples. Tree boosting is strong for nonlinearities and interactions, but a large share of macro predictive content can be simple: autoregressive persistence, slowly moving accounting-like relationships, or near-mechanical links such as claims before unemployment and permits before housing starts. A standard tree booster can approximate those linear slopes, but it spends splits and boosting capacity to do work that a one-variable linear update could do cheaply. LGB+ expands the boosting basis from "trees only" to "trees plus greedy linear updates." The forecast remains additive: ```text yhat(x) = intercept + tree_component(x) + linear_component(x) ``` That additive form is not a cosmetic detail. It lets the user inspect the forecast through two channels: | Channel | Intended role | Caveat | | --- | --- | --- | | Linear | Persistence, autoregressive slopes, near-accounting links, and other simple one-variable residual corrections. | The split is operational, not metaphysical; a tree can still learn linear-looking structure. | | Tree | Nonlinear states, interactions, thresholds, and regime-dependent effects. | Tree gains can include simple structures if the linear candidate loses the competition. 
| The paper emphasizes that the linear/nonlinear split should be read as an algorithmic decomposition generated by the boosting path, not as proof that the data-generating process is literally separated into pure linear and pure nonlinear blocks. #### Paper Empirical Design The empirical road test uses transformed quarterly U.S. macro data from FRED-QD. The review file summarized the design as six targets: headline CPI inflation, GDP growth, unemployment, housing starts growth, industrial production, and the term spread. Predictors include FRED-QD transformations, four lags, MARX moving-average features, and principal components from the transformed panel. The out-of-sample design is expanding-window forecasting, with a pre-COVID evaluation period and a post-COVID stress period. This matters for using `macroforecast`: | Paper object | `macroforecast` stage | | --- | --- | | FRED-QD transformed panel | `macroforecast.data.load_fred_qd(...)` then `macroforecast.preprocessing.reprocess(...)`. | | Four lags | `macroforecast.feature_engineering.lag(...)` or runner `feature_spec(...)`. | | MARX moving-average features | `macroforecast.feature_engineering.moving_average_ladder(...)` or `marx_step(...)`. | | Principal components | `macroforecast.feature_engineering.pca_features(...)` or preprocessing/factor features when fit-window-safe execution is needed. | | Expanding OOS design | `macroforecast.window.estimation_expanding(...)` plus `test_origins(...)`. | | Re-estimation schedule | `macroforecast.window.estimation_expanding(..., retrain_every=...)`. | | LGB+ model | `macroforecast.models.lgb_plus(...)`. | | LGB^A+ model | `macroforecast.models.lgba_plus(...)`. | | Linear/tree forecast decomposition | `fit.estimator.predict_components(...)` and diagnostics in `ModelFit`. | #### Method In The Paper The paper has two closely related estimators. `LGB+` is the competition version. 
At each boosting step: | Step | Operation | | --- | --- | | 1 | Start from the current fitted value. | | 2 | Compute residuals on the training sample. | | 3 | Draw a row subsample. | | 4 | Fit one small LightGBM residual tree on the subsample. | | 5 | Fit one greedy univariate linear residual update on the same subsample. | | 6 | Evaluate both candidate updates using `oob`, fixed `validation`, or `training` loss. | | 7 | Accept only the lower-loss candidate. | | 8 | Record which channel won, the selected linear feature when relevant, and the candidate losses. | `LGB^A+` is the alternating version. It does not run a per-step competition. Instead, each cycle applies a block of residual trees and then a greedy univariate linear correction. This is computationally simpler and can be more stable when the OOB judge is noisy in macro-sized samples. #### Main Paper Findings To Keep In Mind The paper's simulations and empirical road test support the following working interpretation: | Finding | Practical implication | | --- | --- | | In mostly linear designs, the linear channel can absorb much of the signal and avoid forcing trees to approximate simple slopes. | Include autoregressive and near-accounting predictors explicitly; then inspect the linear channel. | | In nonlinear designs, the tree channel remains active and the hybrid does not have to behave like a linear model. | LGB+ is a flexible hybrid, not a linear model with tree residuals fixed in advance. | | The linear channel is often useful for short-horizon unemployment and industrial production before COVID. | Channel diagnostics can reveal whether gains come from persistence-like relations or nonlinear state recognition. | | In post-COVID stress periods, the linear channel can become harmful for some real-activity targets. | Report channel-specific diagnostics; do not rely only on total RMSE. | | Forecasts can be decomposed natively into tree and linear pieces. 
| Use `predict_components()` and store `ModelFit.diagnostics` when writing replication outputs. | #### What `macroforecast` Implements `macroforecast.models.lgb_plus` implements the competition estimator as a package-native hybrid model. LightGBM supplies the residual tree candidate, but the step loop, linear candidate, OOB/validation/training selection, ensemble aggregation, channel accounting, and pandas metadata are implemented inside `macroforecast`. The implementation is deliberately not just a thin `LGBMRegressor` wrapper: | Feature | Implemented? | Notes | | --- | --- | --- | | Tree candidate per step | yes | Uses `lightgbm.train(..., num_boost_round=1)`. | | Greedy univariate linear candidate | yes | Uses the same no-intercept residual slope as the competition reference code. | | `linear_candidate_fraction` | yes | Kept from the R implementation. | | `selection_method="oob"` | yes | Default; requires `subsample < 1`. | | `selection_method="validation"` | yes | Uses a fixed random validation split inside the current fit window. | | `selection_method="training"` | yes | Available for reference parity, but not recommended for macro evaluation. | | Ensemble members | yes | Controlled by `n_ensemble`; predictions aggregate by mean or median. | | Linear component prediction | yes | `predict_components(...).prediction_linear`. | | Tree component prediction | yes | `predict_components(...).prediction_tree`. | | Channel diagnostics | yes | `channel_summary`, `channel_importance`, and `training_history`. | | AXIL historical weights | no | This belongs in interpretation/forecast analysis later, not inside the estimator. | | Full paper table replication | no | The callable model is implemented; full empirical replication should be a separate replication package. 
| ```python macroforecast.models.lgb_plus( X, y, *, n_ensemble=10, n_steps=200, learning_rate=0.05, subsample=0.7, num_leaves=5, min_data_in_leaf=20, lambda_l2=0.1, linear_candidate_fraction=0.5, selection_method="oob", val_fraction=0.2, early_stop_patience=50, aggregation="mean", random_state=0, verbose=False, **kwargs, ) ``` Fits LGB+ competition boosting from Goulet Coulombe's [`philgoucou/lgbplus`](https://github.com/philgoucou/lgbplus) reference code. This is not ordinary `lightgbm` with extra linear features. At every boosting step the estimator builds two residual updates: | Candidate | Reference-code operation | Accepted when | | --- | --- | --- | | Tree | Fit one small `lightgbm.train(...)` residual tree with manual shrinkage. | Candidate loss is no larger than the linear candidate. | | Linear | Sample `linear_candidate_fraction` of features, choose the largest absolute residual correlation, and fit one no-intercept residual slope. | Candidate loss is lower than the tree candidate. | The R reference file `R/lgb_plus.R` includes `linear_candidate_fraction`; the Python reference file `python/lgb_plus.py` embeds the ensemble in the estimator but does not expose that candidate-fraction argument. `macroforecast` combines those two reference surfaces: `n_ensemble` controls independent runs and `linear_candidate_fraction` controls greedy linear candidate subsampling. Input is the standard supervised model contract: | Input | Required shape | Meaning | | --- | --- | --- | | `X` | `pandas.DataFrame`, `FeatureSet`, or array-like with shape `(n_obs, n_features)` | Predictor matrix. DataFrame column names are preserved in diagnostics. | | `y` | `pandas.Series` or array-like with length `n_obs` | Forecast target for the current fit window. | Output is a `ModelFit` with `model="lgb_plus"`. Use `fit.predict(X_new)` for the total prediction. 
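The linear candidate in the table above can be illustrated with a small pure-Python sketch. It is not the `macroforecast` or `philgoucou/lgbplus` code: the function names, the dict-of-lists data layout, the rule for rounding the number of sampled features, and the handling of constant columns are all assumptions made for illustration.

```python
import math
import random


def abs_correlation(x, r):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    mx, mr = sum(x) / len(x), sum(r) / len(r)
    sxy = sum((a - mx) * (b - mr) for a, b in zip(x, r))
    sxx = sum((a - mx) ** 2 for a in x)
    srr = sum((b - mr) ** 2 for b in r)
    if sxx == 0.0 or srr == 0.0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * srr)


def greedy_linear_candidate(features, residual, candidate_fraction=0.5, seed=0):
    """Hypothetical helper: sample features, pick one, fit a no-intercept slope."""
    rng = random.Random(seed)
    names = sorted(features)
    # Assumption: truncate the product and always keep at least one candidate.
    n_candidates = max(1, int(candidate_fraction * len(names)))
    sampled = rng.sample(names, n_candidates)
    # Greedy choice: largest absolute correlation with the current residual.
    best = max(sampled, key=lambda name: abs_correlation(features[name], residual))
    x = features[best]
    sum_xx = sum(a * a for a in x)
    # No-intercept residual slope: sum(x * r) / sum(x * x).
    slope = 0.0 if sum_xx == 0.0 else sum(a * b for a, b in zip(x, residual)) / sum_xx
    return best, slope


features = {"lag1": [1.0, 2.0, 3.0, 4.0], "noise": [1.0, -1.0, 1.0, -1.0]}
residual = [2.0, 4.0, 6.0, 8.0]
feature, slope = greedy_linear_candidate(features, residual, candidate_fraction=1.0)

# The accepted update is shrunk by the shared learning rate.
learning_rate = 0.05
update = [learning_rate * slope * value for value in features[feature]]
```

In the competition estimator this update would be added to the fitted values only if its judged loss were lower than the tree candidate's loss.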
The estimator also exposes: | Method or diagnostic | Output | Meaning | | --- | --- | --- | | `fit.estimator.predict_components(X_new)` | DataFrame | `prediction_total`, `prediction_init`, `prediction_tree`, `prediction_linear`. | | `fit.estimator.predict_individual(X_new)` | ndarray | One total prediction path per ensemble member. | | `fit.estimator.channel_importance()` | DataFrame | Tree gain, linear selection count, and absolute linear update by feature. | | `fit.diagnostics["channel_summary"]` | dict | Total tree and linear steps plus per-member counts. | | `fit.diagnostics["training_history"]` | dict | Per-step candidate losses and selected channel metadata. | | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_ensemble` | `10` | yes | Independent boosting runs. Predictions are aggregated across runs. | | `n_steps` | `200` | yes | Maximum tree/linear competition steps per run. | | `learning_rate` | `0.05` | yes | Shared shrinkage for accepted tree or linear updates. | | `subsample` | `0.7` | yes | Row subsample share per step. `selection_method="oob"` requires `< 1`. | | `num_leaves` | `5` | yes | Maximum leaves for the one-step LightGBM tree candidate. | | `min_data_in_leaf` | `20` | yes | Minimum rows per LightGBM leaf. | | `lambda_l2` | `0.1` | fixed by preset | L2 penalty for LightGBM tree candidates. | | `linear_candidate_fraction` | `0.5` | yes | Fraction of predictors sampled before greedy linear selection. | | `selection_method` | `"oob"` | fixed by preset | Candidate judge: `"oob"`, `"validation"`, or `"training"`. | | `val_fraction` | `0.2` | fixed by preset | Fixed validation share when `selection_method="validation"`. | | `early_stop_patience` | `50` | fixed by preset | Stop after this many non-improving selected losses; `None` disables. | | `aggregation` | `"mean"` | fixed by preset | Ensemble aggregation: `"mean"` or `"median"`. | | `random_state` | `0` | fixed by preset | Base random seed. 
Each ensemble member increments it. | | `**kwargs` | none | fixed by caller | Additional `lightgbm.train` parameters for residual tree candidates. | | Preset | Main search dimensions | | --- | --- | | `small` | `n_ensemble`, `n_steps`, `learning_rate`, `subsample`, `num_leaves`, `min_data_in_leaf`, `linear_candidate_fraction` over narrow ranges. | | `standard` | Same dimensions with 5-10 members, 100-400 steps, and candidate fractions `(0.33, 0.5, 1.0)`. | | `wide` | Same dimensions with up to 20 members, 600 steps, lower learning rates, and broader subsampling. | Default model-selection method: `random`. ### lgba_plus #### Paper Link `lgba_plus` is the `macroforecast` callable for the paper's alternating variant, LGB^A+. It uses the same paper and source references as `lgb_plus`: Goulet Coulombe (2026), "LGB+: A Macroeconomic Forecasting Road Test," SSRN 6439178, DOI [`10.2139/ssrn.6439178`](https://doi.org/10.2139/ssrn.6439178), and the [`philgoucou/lgbplus`](https://github.com/philgoucou/lgbplus) source repository. #### What Changes Relative To `lgb_plus` The alternating version is easier to read and usually cheaper to fit: | Dimension | `lgb_plus` | `lgba_plus` | | --- | --- | --- | | Update schedule | Tree and linear candidates compete; one winner advances. | Every cycle applies both a tree block and one linear correction. | | Main count parameter | `n_steps` per ensemble member. | `n_cycles` and `trees_per_cycle`. | | Tree learning rate | Shared `learning_rate`. | `lr_tree`. | | Linear learning rate | Shared `learning_rate`. | `lr_linear`. | | Linear update | No-intercept residual slope in the competition reference code. | Intercept plus slope, matching the alternating reference code. | | Ensemble control | `n_ensemble`. | `n_runs`. | | Main diagnostic | Winner path and candidate losses. | Cycle path and selected linear feature after each tree block. 
| Use `lgba_plus` when the goal is a stable hybrid path rather than estimating the best tree/linear mix at every individual step. Use `lgb_plus` when the winner sequence itself is part of the analysis. #### What `macroforecast` Implements | Feature | Implemented? | Notes | | --- | --- | --- | | Residual tree blocks | yes | Uses `lightgbm.train(..., num_boost_round=trees_per_cycle)`. | | Greedy linear correction after every tree block | yes | Selects the largest absolute residual correlation. | | Separate tree and linear learning rates | yes | `lr_tree`, `lr_linear`. | | `n_runs` alternating ensemble | yes | Folds the R `lgb_plus_A_ensemble` helper into the estimator API. | | Component prediction | yes | Same `predict_components()` output columns as `lgb_plus`. | | Channel importance | yes | Tree gain, linear selection count, and absolute linear update. | | Full AXIL dual interpretation | no | Planned for interpretation/forecast analysis rather than model fitting. | ```python macroforecast.models.lgba_plus( X, y, *, n_runs=1, n_cycles=25, trees_per_cycle=10, lr_tree=0.02, lr_linear=0.1, num_leaves=15, min_data_in_leaf=20, subsample=1.0, random_state=0, verbose=False, **kwargs, ) ``` Fits LGB^A+, the alternating variant from [`philgoucou/lgbplus`](https://github.com/philgoucou/lgbplus). Each cycle first fits a block of LightGBM residual trees, then fits one greedy univariate linear residual update with an intercept. Unlike `lgb_plus`, there is no per-step winner selection: both channels are updated every cycle. The R reference file `R/lgb_plus_A.R` also provides an ensemble helper `lgb_plus_A_ensemble`. `macroforecast` folds that helper into this estimator via `n_runs`, so the same callable can represent both a single alternating model and an averaged alternating ensemble. Input and output follow the same supervised contract as `lgb_plus`. 
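As a rough illustration of the linear half of one alternating cycle, the sketch below fits a univariate residual correction with an intercept and applies `lr_linear` shrinkage. The function name is hypothetical, and the intercept formula is the standard OLS one, assumed here rather than taken from the reference code.

```python
from statistics import mean


def linear_update_with_intercept(x, residual, lr_linear=0.1):
    """Hypothetical helper: return (intercept, slope, shrunken update)."""
    x_bar, r_bar = mean(x), mean(residual)
    sxx = sum((a - x_bar) ** 2 for a in x)
    if sxx == 0.0:
        # A constant predictor carries no slope information.
        slope = 0.0
    else:
        # Centered OLS slope of the residual on x.
        slope = sum((a - x_bar) * (b - r_bar) for a, b in zip(x, residual)) / sxx
    # Assumption: standard OLS intercept for a univariate regression.
    intercept = r_bar - slope * x_bar
    update = [lr_linear * (intercept + slope * a) for a in x]
    return intercept, slope, update


x = [1.0, 2.0, 3.0, 4.0]
residual = [3.0, 5.0, 7.0, 9.0]  # exactly 1 + 2 * x
intercept, slope, update = linear_update_with_intercept(x, residual, lr_linear=0.1)
```

In a full cycle, `residual` would be recomputed after the tree block and `x` would be the predictor with the largest absolute residual correlation.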
`fit.estimator.predict_components(X_new)` returns the total, intercept, tree, and linear channels; `fit.estimator.channel_importance()` reports tree gain and linear update frequency by feature. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_runs` | `1` | yes | Independent alternating runs. | | `n_cycles` | `25` | yes | Tree-block plus linear-update cycles per run. | | `trees_per_cycle` | `10` | yes | Residual LightGBM trees per cycle. | | `lr_tree` | `0.02` | yes | Shrinkage for tree-block predictions. | | `lr_linear` | `0.1` | yes | Shrinkage for linear residual updates. | | `num_leaves` | `15` | yes | Maximum leaves for each residual tree. | | `min_data_in_leaf` | `20` | yes | Minimum rows per LightGBM leaf. | | `subsample` | `1.0` | yes | LightGBM bagging fraction for tree blocks. | | `random_state` | `0` | fixed by preset | Base random seed. Each run increments it. | | `**kwargs` | none | fixed by caller | Additional `lightgbm.train` parameters. | The linear slope is computed by the centered OLS identity `sum((x - mean(x)) * (r - mean(r))) / sum((x - mean(x))^2)`. This is algebraically identical to the R code's `cov(x, residual) / var(x)`, because the `n - 1` normalizers in the sample covariance and sample variance cancel. It also avoids the denominator mismatch in the reference Python expression that combines `np.cov(...)` with `x.var()`: for NumPy arrays, `np.cov` divides by `n - 1` by default while `x.var()` divides by `n`. | Preset | Main search dimensions | | --- | --- | | `small` | Short alternating runs for smoke tests or narrow-window use. | | `standard` | 1 or 5 runs, 10 or 25 cycles, and tree/linear learning-rate grids. | | `wide` | Up to 10 runs, 50 cycles, broader tree-block size and learning-rate ranges. | Default model-selection method: `random`. ### catboost ```python macroforecast.models.catboost( X, y, *, n_estimators=300, learning_rate=0.1, max_depth=6, random_state=0, verbose=False, **kwargs, ) ``` Fits `catboost.CatBoostRegressor`. Requires `macroforecast[catboost]`. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `300` | yes | Number of boosting stages.
| | `learning_rate` | `0.1` | yes | Shrinkage per stage. | | `max_depth` | `6` | yes | Tree depth. | | `random_state` | `0` | fixed by preset | Boosting random seed. | | `verbose` | `False` | fixed by preset | CatBoost console output flag. | Preset spaces match `gradient_boosting`. Default model-selection method: `random`. ## Macro-Specific Tree Models ### quantile_regression_forest ```python macroforecast.models.quantile_regression_forest( X, y, *, n_estimators=200, max_depth=None, min_samples_leaf=1, random_state=0, quantile_levels=(0.05, 0.5, 0.95), ) ``` Fits a random forest and stores per-leaf training-target distributions. The underlying estimator exposes `predict_quantiles(X, levels=None)`. The forecasting runner stores those outputs in the `quantile_predictions` column as per-row dictionaries keyed by quantile level. The fitted wrapper also exposes sklearn forest feature importances through `fit.diagnostics["feature_importance"]`. Quantiles use tree-equal leaf weighting: within each tree, all training observations that share the test row's terminal leaf receive equal weight, and each tree contributes the same total weight. This avoids letting large leaves dominate the empirical quantile solely because they contain more observations. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `n_estimators` | `200` | yes | Number of trees. | | `max_depth` | `None` | yes | Maximum depth per tree. | | `min_samples_leaf` | `1` | yes | Minimum samples per terminal leaf. | | `random_state` | `0` | fixed by preset | Forest random seed. | | `quantile_levels` | `(0.05, 0.5, 0.95)` | fixed by preset | Default levels returned by `predict_quantiles()`. | Preset spaces match `random_forest`. Default model-selection method: `random`. 
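The tree-equal leaf weighting used by `quantile_regression_forest` can be sketched in plain Python. This is an illustration under stated assumptions, not the package's leaf-target store: the helper names and the nested-list layout of leaf ids are hypothetical, and the quantile is taken as the inverse of the weighted empirical CDF, which may differ from the interpolation rule the package uses.

```python
def leaf_weights(train_leaves, test_leaves):
    """Weight training rows for one test row.

    train_leaves[i][t] is the leaf id of training row i in tree t and
    test_leaves[t] is the test row's leaf id in tree t (hypothetical layout).
    """
    n_trees = len(test_leaves)
    weights = [0.0] * len(train_leaves)
    for t, leaf in enumerate(test_leaves):
        members = [i for i, row in enumerate(train_leaves) if row[t] == leaf]
        if not members:
            continue
        # Equal weight inside the leaf; every tree contributes 1 / n_trees in total.
        for i in members:
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights


def weighted_quantile(values, weights, level):
    """Smallest value whose cumulative weight reaches `level`."""
    pairs = sorted(zip(values, weights))
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= level - 1e-12:
            return value
    return pairs[-1][0]


y_train = [1.0, 2.0, 3.0, 4.0]
train_leaves = [[0, 0], [0, 1], [1, 1], [1, 1]]  # 4 rows, 2 trees
test_leaves = [0, 1]

weights = leaf_weights(train_leaves, test_leaves)
quantiles = {q: weighted_quantile(y_train, weights, q) for q in (0.05, 0.5, 0.95)}
```

In this toy case the two-row leaf of the first tree and the three-row leaf of the second tree each contribute a total weight of one half, so the larger leaf does not dominate.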
### macro_random_forest ```python macroforecast.models.macro_random_forest( X, y, *, x_columns=None, S_columns=None, x_pos=None, S_pos=None, y_pos=0, B=50, minsize=10, mtry_frac=1/3, min_leaf_frac_of_x=1.0, VI=False, ERT=False, quantile_rate=None, S_priority_vec=None, random_x=False, trend_push=1, howmany_random_x=1, howmany_keep_best_VI=20, cheap_look_at_GTVPs=True, prior_var=None, prior_mean=None, subsampling_rate=0.75, rw_regul=0.75, keep_forest=False, block_size=12, fast_rw=True, ridge_lambda=0.1, HRW=0, resampling_opt=2, print_b=False, parallelise=False, n_cores=1, **kwargs, ) ``` Adapter for Ryan Lucas's `MacroRandomForest` reference backend. The reference implementation is vendored from `MacroRandomForest` 1.0.6 under the MIT license, with source attribution in `macroforecast.models._mrf_reference`. Install the optional runtime dependencies with `macroforecast[macro_random_forest]`. The adapter fits on the in-sample `X/y` and calls the reference `_ensemble_loop()` during `predict(X_test)`. Repeated calls to `predict()` with the same test matrix reuse the previous reference-backend output, so repeated result materialization does not rerun the expensive forest loop. If the reference backend returns multiple prediction columns for the requested test rows, the adapter averages them row by row. If the backend returns no recognized prediction field or fewer than the requested number of predictions, the adapter raises a runtime error instead of silently returning a misaligned forecast vector. By default all columns in `X` are used both as the time-varying linear equation variables (`x_columns`) and the forest state variables (`S_columns`). Pass `x_columns` and `S_columns` when those sets should differ. The reference backend distinguishes two predictor sets: | Argument | Role | | --- | --- | | `x_columns` | Columns in the local linear forecasting equation. These are the variables whose coefficients are allowed to vary over time. 
| | `S_columns` | State variables used by the forest to split the sample and estimate those local coefficients. | Use either column names or reference-style positions for each role. Passing both `x_columns` and `x_pos`, or both `S_columns` and `S_pos`, raises an error rather than silently prioritizing one selector. For example, a compact MRF can use a small local-linear equation but a wider state vector for the tree: ```python fit = macroforecast.models.macro_random_forest( X_train, y_train, x_columns=["INDPRO_lag0", "UNRATE_lag0"], S_columns=[ "INDPRO_lag0", "UNRATE_lag0", "CPIAUCSL_lag0", "FEDFUNDS_lag0", "S&P500_lag0", ], B=50, minsize=10, mtry_frac=1.0, ridge_lambda=0.1, rw_regul=0.75, parallelise=False, print_b=False, ) pred = fit.predict(X_test) ``` With the forecasting runner, pass model parameters through the model-keyed `params` mapping. If you want fixed parameters rather than model-owned tuning, also disable model selection for this model: ```python features = macroforecast.feature_engineering.feature_spec( target="INDPRO", horizon=1, predictors=["UNRATE", "CPIAUCSL", "FEDFUNDS", "S&P500"], lags=(0, 1), ) window = macroforecast.window.spec( estimation=macroforecast.window.estimation_expanding(min_size=120), val=macroforecast.window.val_last_block(size=24), test=macroforecast.window.test_origins(horizon=1, step=1), ) result = macroforecast.forecasting.run( panel, "macro_random_forest", window=window, features=features, params={ "macro_random_forest": { "x_columns": ["UNRATE_lag0", "FEDFUNDS_lag0"], "S_columns": [ "UNRATE_lag0", "UNRATE_lag1", "CPIAUCSL_lag0", "FEDFUNDS_lag0", "S&P500_lag0", ], "B": 100, "minsize": 10, "mtry_frac": 1.0, "parallelise": False, "print_b": False, } }, model_selection={"macro_random_forest": None}, ) ``` The reference implementation is sensitive to panel shape. Use numeric, non-missing features after preprocessing and feature engineering. 
Keep at least one `x_columns` variable, and prefer at least five `S_columns` variables; with very small state sets, set `mtry_frac=1.0` so at least one state variable is considered at each split. Small training samples can also fail when `minsize` is too large relative to the number of local-linear variables. | Parameter | Default | Tunable | Meaning | | --- | --- | --- | --- | | `x_columns` | `None` | fixed by preset | Feature columns in the time-varying linear equation. | | `S_columns` | `None` | fixed by preset | Feature columns used as forest state variables. | | `x_pos` | `None` | fixed by preset | Reference-package predictor positions after the target column. | | `S_pos` | `None` | fixed by preset | Reference-package state positions after the target column. | | `y_pos` | `0` | fixed by preset | Fixed target position for the separated `X/y` adapter; must remain `0`. | | `B` | `50` | yes | Number of MRF trees. | | `minsize` | `10` | yes | Minimum node size before split attempts. | | `mtry_frac` | `1/3` | yes | Fraction of state variables considered at each split. | | `min_leaf_frac_of_x` | `1.0` | yes | Minimum leaf-size multiplier relative to local x dimension. | | `VI` | `False` | fixed by preset | Enable variable-importance split search mode. | | `ERT` | `False` | fixed by preset | Enable extremely randomized tree split mode. | | `quantile_rate` | `None` | fixed by preset | Optional quantile rate for quantile-oriented output. | | `S_priority_vec` | `None` | fixed by preset | Optional priority weights over state variables. | | `random_x` | `False` | fixed by preset | Use random subsets of local-linear predictors. | | `trend_push` | `1` | fixed by preset | Reference-package trend-push option. | | `howmany_random_x` | `1` | fixed by preset | Number of random local-linear predictor draws. | | `howmany_keep_best_VI` | `20` | fixed by preset | Number of best VI candidates retained. 
| | `cheap_look_at_GTVPs` | `True` | fixed by preset | Use the reference package's cheaper GTVP inspection. | | `prior_var` | `None` | fixed by preset | Optional prior variances for local coefficients. | | `prior_mean` | `None` | fixed by preset | Optional prior means for local coefficients. | | `subsampling_rate` | `0.75` | yes | Subsample share used by each tree. | | `rw_regul` | `0.75` | yes | Random-walk shrinkage strength. | | `keep_forest` | `False` | fixed by preset | Keep full reference forest object in memory. | | `block_size` | `12` | fixed by preset | Reference-package block size for time-series resampling. | | `fast_rw` | `True` | fixed by preset | Use fast random-walk regularization path. | | `ridge_lambda` | `0.1` | yes | Ridge penalty for local linear fits. | | `HRW` | `0` | fixed by preset | Reference-package hierarchical random-walk option. | | `resampling_opt` | `2` | yes | Reference MRF resampling option. | | `parallelise` | `False` | fixed by preset | Whether to use reference-package parallel execution. | | `n_cores` | `1` | fixed by preset | Worker count for the reference package. | | `print_b` | `False` | fixed by preset | Reference-package progress printing. | The MRF presets tune `B`, `minsize`, `mtry_frac`, `min_leaf_frac_of_x`, `subsampling_rate`, `rw_regul`, `ridge_lambda`, and `resampling_opt`; inspect the exact candidate lists with `describe_model("macro_random_forest")`. ## Volatility Models Volatility model fits return `VolatilityFit`. In addition to `predict_variance(horizon=...)`, their diagnostics include fitted parameter estimates under `params` and the in-sample `conditional_volatility` path when the backend exposes it. These models are for volatility/variance forecasting, not ordinary conditional mean macro forecasting. They accept a univariate return-like series `y` and return both a point-mean prediction interface and a variance forecast interface. 
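To make the variance forecast interface concrete, the sketch below iterates the textbook GARCH(1,1) recursion, in which the one-step variance is `omega + alpha * eps**2 + beta * sigma2` and later steps replace the squared shock by its expected value. The function name and the parameter values are hypothetical, and this is not how the `arch` backend computes forecasts internally.

```python
def garch11_variance_forecast(omega, alpha, beta, last_resid, last_variance, horizon):
    """Hypothetical helper: multi-step GARCH(1,1) variance forecasts as a list."""
    if horizon < 1:
        raise ValueError("horizon must be positive")
    # One step ahead uses the last observed shock and conditional variance.
    forecast = [omega + alpha * last_resid ** 2 + beta * last_variance]
    # Further ahead, E[eps^2] equals the forecast variance, so the recursion
    # reduces to omega + (alpha + beta) * previous forecast.
    for _ in range(1, horizon):
        forecast.append(omega + (alpha + beta) * forecast[-1])
    return forecast


# Illustrative parameter values only; they are not fitted estimates.
forecast = garch11_variance_forecast(
    omega=0.1, alpha=0.1, beta=0.8, last_resid=2.0, last_variance=1.0, horizon=3
)
long_run_variance = 0.1 / (1.0 - 0.1 - 0.8)
```

Position 0 of the returned list is the one-step forecast. When `alpha + beta < 1`, the path decays toward `omega / (1 - alpha - beta)`.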

| Function | Implementation | R/source comparison | Boundary |
| --- | --- | --- | --- |
| `garch11` | Optional Python `arch.arch_model` backend. | Same surface as `rugarch::ugarchspec(variance.model=list(model="sGARCH"))` plus `ugarchfit()`, but the likelihood is delegated to `arch`, not reimplemented. | Backend controls solver details, distribution aliases, convergence behavior, and forecast internals. |
| `egarch` | Optional Python `arch.arch_model` backend. | Same surface as `rugarch::ugarchspec(variance.model=list(model="eGARCH"))` plus `ugarchfit()`, but the likelihood is delegated to `arch`, not reimplemented. | Backend controls solver details, distribution aliases, convergence behavior, and forecast internals. |
| `realized_garch` | Internal compact Gaussian log-linear MLE. | Aligned with the p=q=1 Hansen-Huang-Shek / `rugarch` `realGARCH` state and measurement equations. | Not a full `rugarch` clone: no ARMA/ARFIMA mean, variance regressors, non-Gaussian distributions, fixed-parameter SE machinery, simulation/path/roll helpers, or xts-specific checks. |

### Common Output

```python
fit = macroforecast.models.garch11(y)
variance = fit.predict_variance(horizon=12)
sigma = fit.conditional_volatility
metadata = fit.to_metadata()
```

| Output | Type | Meaning |
| --- | --- | --- |
| `fit` | `VolatilityFit` | Fitted wrapper with `predict()`, `predict_variance()`, `summary()`, `to_dict()`, and `to_metadata()`. |
| `fit.predict(X)` | `pandas.Series` | Conditional mean prediction. For these models this is usually a constant mean from the volatility backend. |
| `fit.predict_variance(horizon)` | `pandas.Series` | Variance forecast indexed from `0` to `horizon - 1`. `horizon` must be positive. |
| `fit.conditional_volatility` | `pandas.Series` or `None` | In-sample conditional volatility path if available. |
| `fit.diagnostics["params"]` | `dict` | Fitted parameters. Names depend on the backend/model. |
| `fit.diagnostics["conditional_volatility"]` | `pandas.Series` | Same path as `fit.conditional_volatility`, stored for metadata export. |

### garch11

```python
macroforecast.models.garch11(
    y,
    *,
    X=None,
    p=1,
    q=1,
    mean_model="constant",
    dist="normal",
    rescale=False,
)
```

Fits GARCH using the optional `arch` package. Requires `macroforecast[arch]`.

The default is GARCH(1,1):

$$
r_t = \mu + \epsilon_t,\qquad \epsilon_t = \sigma_t z_t
$$

$$
\sigma_t^2 = \omega + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2.
$$

For higher `p`/`q`, the lag orders are passed directly to `arch.arch_model(vol="GARCH", p=p, q=q)`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `p` | `1` | yes | GARCH innovation lag order. |
| `q` | `1` | yes | GARCH variance lag order. |
| `mean_model` | `"constant"` | manual | Conditional mean model. |
| `dist` | `"normal"` | yes | Innovation distribution. |
| `rescale` | `False` | fixed by preset | `arch` package rescale option. |

| Preset | `p` | `q` | `dist` |
| --- | --- | --- | --- |
| `small` | `(1,)` | `(1,)` | `("normal", "t")` |
| `standard` | `(1, 2)` | `(1, 2)` | `("normal", "t")` |
| `wide` | `(1, 2, 3)` | `(1, 2, 3)` | `("normal", "t", "skewt")` |

Implementation notes:

| Item | Value |
| --- | --- |
| Backend | `arch.arch_model` |
| Required extra | `macroforecast[arch]` |
| R comparison | `rugarch::ugarchspec(variance.model=list(model="sGARCH", garchOrder=c(p, q)))` |
| Internal likelihood? | No. macroforecast validates orders, passes inputs to `arch`, and records metadata/diagnostics. |
| Minimum data | 30 non-missing observations. |

### egarch

```python
macroforecast.models.egarch(
    y,
    *,
    X=None,
    p=1,
    o=0,
    q=1,
    mean_model="constant",
    dist="normal",
    rescale=False,
)
```

Fits EGARCH using the optional `arch` package. Requires `macroforecast[arch]`.
The backend receives:

```python
arch.arch_model(y, vol="EGARCH", p=p, o=o, q=q, mean=mean_model, dist=dist)
```

For EGARCH(1,1), the log-variance structure is backend-defined by `arch`; conceptually it is the exponential GARCH family where log variance reacts to standardized shock magnitude and asymmetry terms rather than modeling variance directly in levels.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `p` | `1` | yes | EGARCH innovation lag order. |
| `o` | `0` | yes | Asymmetric innovation lag order. |
| `q` | `1` | yes | EGARCH variance lag order. |
| `mean_model` | `"constant"` | manual | Conditional mean model. |
| `dist` | `"normal"` | yes | Innovation distribution. |
| `rescale` | `False` | fixed by preset | `arch` package rescale option. |

| Preset | `p` | `o` | `q` | `dist` |
| --- | --- | --- | --- | --- |
| `small` | `(1,)` | `(0, 1)` | `(1,)` | `("normal", "t")` |
| `standard` | `(1, 2)` | `(0, 1)` | `(1, 2)` | `("normal", "t")` |
| `wide` | `(1, 2, 3)` | `(0, 1, 2)` | `(1, 2, 3)` | `("normal", "t", "skewt")` |

Implementation notes:

| Item | Value |
| --- | --- |
| Backend | `arch.arch_model` |
| Required extra | `macroforecast[arch]` |
| R comparison | `rugarch::ugarchspec(variance.model=list(model="eGARCH", garchOrder=c(p, q)))` |
| Internal likelihood? | No. macroforecast validates orders, passes inputs to `arch`, and records metadata/diagnostics. |
| Minimum data | 30 non-missing observations. |

### realized_garch

```python
macroforecast.models.realized_garch(
    y,
    *,
    X=None,
    rv=None,
    realized_variance=None,
    max_iter=2000,
    n_starts=5,
    random_state=0,
)
```

Fits a compact p=q=1 Gaussian log-linear realized-GARCH joint likelihood. Provide `rv` directly or set `realized_variance` to the column in `X` containing the realized measure. If neither is supplied, macroforecast uses `y ** 2` as an explicit `rv_proxy` so the caller can still inspect the model contract.
For empirical realized-GARCH work, pass a true realized variance or realized volatility measure.

The implemented state and measurement equations are:

$$
\log h_t = \omega + \alpha \log x_{t-1} + \beta \log h_{t-1},
$$

$$
z_t = \frac{r_t - \mu}{\sqrt{h_t}}, \qquad \tau(z_t) = \eta_1 z_t + \eta_2(z_t^2 - 1),
$$

$$
\log x_t = \xi + \delta \log h_t + \tau(z_t) + u_t, \qquad u_t \sim N(0, \sigma_u^2).
$$

This matches the compact `rugarch` realGARCH recursion in `rugarch/src/filters.c::realgarchfilter()` for the p=q=1 case: lagged log realized volatility enters through `alpha`, lagged log latent variance enters through `beta`, and the measurement equation uses `xi`, `delta`, `eta1`, and `eta2`.

The stationarity-style persistence diagnostic is:

$$
\text{persistence} = \beta + \delta \alpha.
$$

The multi-step variance forecast uses the conditional expectation recursion: the first step uses the latest observed realized measure, then future `tau(z_t)` and measurement shocks are set to zero so `E[\log x_t \mid h_t] = \xi + \delta \log h_t`.

| Parameter | Default | Tunable | Meaning |
| --- | --- | --- | --- |
| `realized_variance` | `None` | manual | Column name for realized variance. |
| `max_iter` | `2000` | fixed by preset | Optimizer iteration cap. |
| `n_starts` | `5` | yes | Number of optimizer starting points. |
| `random_state` | `0` | fixed by preset | Optimizer random seed. |

| Preset | `n_starts` |
| --- | --- |
| `small` | `(3, 5)` |
| `standard` | `(3, 5, 10)` |
| `wide` | `(3, 5, 10, 20)` |

Implementation notes:

| Item | Value |
| --- | --- |
| Backend | Internal SciPy `L-BFGS-B` optimizer. |
| Required extra | None beyond base SciPy stack. |
| R comparison | Compact p=q=1 version of `rugarch::ugarchspec(variance.model=list(model="realGARCH"))` with `realizedVol`. |
| Parameter names | `mu`, `omega`, `alpha`, `beta`, `xi`, `delta`, `eta_1`, `eta_2`, `log_sigma_u`, `persistence`. |
| Restrictions | `alpha > 0`, `beta > 0`, `delta > 0`, and `beta + delta * alpha < 1` during optimization. |
| Minimum data | 30 aligned observations of `y` and realized measure. |

Example:

```python
fit = macroforecast.models.realized_garch(
    returns,
    rv=realized_variance,
    max_iter=2000,
    n_starts=5,
    random_state=0,
)
fit.diagnostics["params"]
fit.predict_variance(horizon=12)
```

## Omitted From The Clean Model API

| Legacy name | Decision |
| --- | --- |
| `lasso_path` | Removed. Use `get_model("lasso")` and `model_selection.select_params()`. |
| `pcr` | Removed. Use `feature_engineering.feature_spec(pca_components=...)` with a regression model. |

- `var_select_order` -- VAR lag-order selection by AIC/BIC/HQ/FPE (vars::VARselect), via statsmodels `VAR.select_order`.
- `gjr_garch` -- GJR-GARCH (Glosten-Jagannathan-Runkle) asymmetric/leverage volatility (arch GARCH, o>0; rugarch gjrGARCH).
- `tgarch` -- Threshold GARCH (TGARCH/Zakoian), absolute-value (power=1) asymmetric volatility.
- `risk_forecast` -- Value-at-Risk and Expected Shortfall forecast from a fitted volatility model (Normal / standardized-t).
- `value_at_risk` -- lower-tail VaR return quantile(s) from a fitted volatility model.
- `expected_shortfall` -- Expected Shortfall (mean return below VaR) from a fitted volatility model.
- `news_impact_curve` -- Engle-Ng (1993) news impact curve: conditional variance vs lagged shock for a fitted GARCH-family model.
- `garch_roll` -- rolling one-step volatility / VaR backtest with periodic refit and coverage summary (rugarch::ugarchroll).
- `var_roots` -- VAR stability check: moduli of the companion-matrix eigenvalues, spectral radius, and is_stable (vars::roots).
- `var_restrict` -- restricted VAR by sequential elimination of insignificant regressors with restriction matrix (vars::restrict).
- `arima` -- (seasonal) ARIMA model via statsmodels, order (p,d,q) and seasonal_order (P,D,Q,m).
- `auto_arima` -- automatic (seasonal) ARIMA order selection (forecast::auto.arima): KPSS-based d, AICc grid over (p,q[,P,Q]).

### arima

```python
macroforecast.models.arima(y, *, order=(1, 0, 0), seasonal_order=(0, 0, 0, 0), trend=None)
```

### auto_arima

```python
macroforecast.models.auto_arima(y, *, max_p=5, max_q=5, max_d=2, seasonal=False, m=1, ic="aicc", trend=None)
```

### gjr_garch

```python
macroforecast.models.gjr_garch(y, *, X=None, p=1, o=1, q=1, mean_model="constant", dist="normal", rescale=False)
```

### tgarch

```python
macroforecast.models.tgarch(y, *, X=None, p=1, o=1, q=1, mean_model="constant", dist="normal", rescale=False)
```
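
The asymmetry that separates `gjr_garch` and `tgarch` from plain GARCH can be seen in a one-step update. The sketch below is a pure-Python illustration under stated assumptions, not the `arch` backend or macroforecast code: it uses the textbook order-one GJR recursion on the variance scale and the corresponding threshold recursion on the volatility (power=1) scale. The function names and every parameter value are hypothetical.

```python
def gjr_next_variance(omega, alpha, gamma, beta, shock, variance):
    """One-step GJR-GARCH update on the variance scale (all orders equal to 1)."""
    # The leverage term is active only when the lagged shock is negative.
    leverage = gamma * shock**2 if shock < 0 else 0.0
    return omega + alpha * shock**2 + leverage + beta * variance


def tgarch_next_volatility(omega, alpha, gamma, beta, shock, volatility):
    """One-step threshold-GARCH update on the volatility scale (power=1)."""
    # Same indicator logic, but applied to absolute shocks and volatility.
    leverage = gamma * abs(shock) if shock < 0 else 0.0
    return omega + alpha * abs(shock) + leverage + beta * volatility


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=0.02, alpha=0.05, gamma=0.10, beta=0.85)

after_negative = gjr_next_variance(shock=-1.5, variance=1.0, **params)
after_positive = gjr_next_variance(shock=1.5, variance=1.0, **params)
vol_negative = tgarch_next_volatility(shock=-1.5, volatility=1.0, **params)
vol_positive = tgarch_next_volatility(shock=1.5, volatility=1.0, **params)

print(after_negative, after_positive)
print(vol_negative, vol_positive)
```

With a positive `gamma`, a negative shock raises next-period variance (or volatility) by more than a positive shock of the same size, which is the leverage effect these two models are meant to capture.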