# macroforecast.interpretation [Back to reference](index.md) `macroforecast.interpretation` owns post-fit interpretation helpers. These functions consume fitted models and feature matrices. They do not fit models, construct features, choose forecast windows, or run forecast-comparison tests. Use the namespace form: ```python import macroforecast as mf fit = mf.models.ridge(X_train, y_train) mf.interpretation.linear_coefficients(fit) ``` Every table returned by this module carries `attrs["macroforecast_metadata_schema"]` with `kind`, `version`, `method`, `model`, `n_features`, output `columns`, and function-specific metadata. ## Review Status `interpretation` functions are not all the same kind of evidence. Use this classification before treating a number as a paper table or structural claim. | Class | Functions | Reference implementation | Status | | --- | --- | --- | --- | | Native/backend extraction | `linear_coefficients`, `tree_importance`, `model_native_linear_coef`, `model_native_tree_importance` | scikit-learn-style `coef_` / `feature_importances_` estimator conventions | Direct attribute extraction. Validity depends on the fitted estimator. | | SHAP backend | `shap_values`, `shap_importance`, `shap_tree`, `shap_linear`, `shap_kernel`, `shap_deep` | Python `shap` package: `Explainer`, `TreeExplainer`, and `PermutationExplainer` | Backend call plus pandas reshaping. `shap_linear` and `shap_deep` are generic SHAP paths, not fixed `LinearExplainer` or `DeepExplainer` calls. | | Forecast-accuracy Shapley | `oshapley_pipeline`, `oshapley_from_forecast_result`, `oshapley_output`, `forecast_shapley_output`, `oshapley_vi`, `pbsv`, `performance_based_shapley_value`, `model_accordance_score`, `ishapley_vi`, `shapley_variable_importance`, `performance_shapley_value` | Python `anatomy` package and Borup, Goulet Coulombe, Rapach, Montes Schutte, and Schwenk-Nebbe (2022) | User-facing callables use oShapley/PBSV/MAS names. The optional backend is the Python `anatomy` package. | | Captum backend | `deep_lift` | Captum `DeepLift` | Direct Captum call. | | Standard model-agnostic diagnostics | `permutation_importance`, `lofo_importance(fit_func=...)`, `partial_dependence`, `individual_conditional_expectation`, `ice_curves`, `accumulated_local_effect`, `friedman_h_interaction` | scikit-learn permutation/PDP/ICE, R `pdp::partial`, R `ALEPlot`, R `iml`/`hstats` H-statistic conventions | Implements standard diagnostic definitions directly. Interpretation is predictive/associational, not causal. | | Approximate or fixed-model diagnostics | `permutation_importance_strobl`, `lofo_importance(fit_func=None)`, `cumulative_r2_contribution`, `rolling_recompute`, `bootstrap_jackknife`, non-linear `forecast_decomposition` fallback | Strobl conditional-permutation idea, LOFO refit idea, macroforecast fixed-model diagnostics | Metadata records approximation/fixed-model mode. Do not describe these as exact refit or additive decompositions. | | VAR/macroeconomic interpretation | `generalized_irf`, `orthogonalised_irf`, `fevd`, `historical_decomposition` | statsmodels VAR result API when available; Pesaran-Shin GIRF formula; internal statsmodels-like adapter for macroforecast VAR | `fevd` now falls back to manual orthogonalized FEVD rather than IRF. `historical_decomposition` is reduced-form, not structural identification. | | Neural manual attribution | `saliency_map`, `integrated_gradients`, `gradient_shap`, `lstm_hidden_state` | Captum-style torch-gradient methods | Manual torch autograd implementation, except `deep_lift`. Integrated gradients uses a straight-line Riemann approximation, not Captum's default Gauss-Legendre backend. | | OLS as attention | `ols_attention_weights`, `ridge_attention_weights`, `ols_attention_embedding`, `ols_attention_equivalence`, `attention_weights`, `dual_decomposition` | Goulet Coulombe (2026), "Ordinary Least Squares as an Attention Mechanism" | Closed-form linear algebra. `ols_attention_weights` uses exact OLS by default; ridge variants are stabilizing extensions, not new forecasting models. | | Dual interpretation | `dual`, `DualInterpretationResult`, `dual_interpretation`, `dual_from_forecast_result`, `observation_weights`, `observation_contributions`, `forecast_diagnostics`, `top_observations`, `group_observation_weights`, plus backward-compatible aliases `outcome_contributions`, `data_portfolio_diagnostics`, `top_episodes`, `episode_group_weights` | Goulet Coulombe, Goebel, and Klieber (2024), plus local `dual_python` / `DualML_R` reference code | Episode-based explanation: which historical observations the model uses. Use [Dual Interpretation](interpretation_dual.md) for the full contract. Ridge/KRR/RF routes are implemented directly; boosted-tree AXIL, NN embedding, and classification log-odds routes are documented as future extensions. | | Aggregation/report plumbing | `group_aggregate`, `lineage_attribution`, `transformation_attribution`, `custom_interpretation` | No external fitting backend; table aggregation and user callables | Aggregates existing evidence. `transformation_attribution` is not component-level causality unless the input table was designed as a component-removal experiment. | Every returned table includes this review classification in `attrs["macroforecast_metadata_schema"]["reference"]`. ## Function Summary | Function | Input | Output | Meaning | | --- | --- | --- | --- | | `linear_coefficients(model, sort=True)` | `ModelFit` or estimator with `coef_` | `DataFrame` | Native linear coefficient table. | | `model_native_linear_coef(model, sort=True)` | `ModelFit` or estimator with `coef_` | `DataFrame` | Legacy-name alias for native linear coefficients. | | `tree_importance(model, sort=True)` | `ModelFit` or estimator with `feature_importances_` | `DataFrame` | Native tree feature importance table. | | `model_native_tree_importance(model, sort=True)` | `ModelFit` or estimator with `feature_importances_` | `DataFrame` | Legacy-name alias for native tree importance. | | `permutation_importance(model, X, y, metric="mse", n_repeats=5, random_state=None)` | fitted predictor, feature frame, target vector | `DataFrame` | Loss degradation after permuting each feature. | | `permutation_importance_strobl(model, X, y, metric="mse", n_repeats=5, n_bins=5, random_state=None)` | fitted predictor, feature frame, target vector | `DataFrame` | Conditional permutation within bins of correlated features. | | `lofo_importance(model, X, y, fit_func=None, metric="mse")` | fitted predictor or refit callable, feature frame, target vector | `DataFrame` | Leave-one-feature-out or prediction-drop importance. | | `partial_dependence(model, X, features, grid_size=20)` | fitted predictor and feature frame | `DataFrame` | One-way manual partial-dependence curves. | | `individual_conditional_expectation(model, X, features, grid_size=20, center=False)` | fitted predictor and feature frame | long `DataFrame` | One-way individual conditional expectation curves. | | `ice_curves(model, X, features, grid_size=20, center=False)` | fitted predictor and feature frame | long `DataFrame` | Alias for `individual_conditional_expectation`. | | `accumulated_local_effect(model, X, feature, bins=10)` | fitted predictor and feature frame | `DataFrame` | First-order accumulated local effect curve. | | `friedman_h_interaction(model, X, features=None, grid_size=10)` | fitted predictor and feature frame | `DataFrame` | Pairwise Friedman-Popescu H interaction statistics. | | `shap_values(model, X, background=None, explainer="auto", check_additivity=True, **kwargs)` | fitted predictor and feature frame | long `DataFrame` | SHAP attribution values using optional `shap` backend. | | `shap_importance(model, X, ...)` | fitted predictor and feature frame | `DataFrame` | Mean absolute SHAP feature importance. | | `shap_tree`, `shap_linear`, `shap_kernel`, `shap_deep` | fitted predictor and feature frame | `DataFrame` | Convenience SHAP-importance wrappers. | | `oshapley_pipeline(X, y, models, window=None, ...)` | aligned feature matrix, target, model specs | `ForecastShapleyResult` | Convenience path that precomputes the backend object and returns forecast explanation, oShapley-VI, and PBSV tables. | | `oshapley_from_forecast_result(result, X, y, models, window=..., ...)` | `ForecastResult` plus explicit aligned feature matrix/target/model/window inputs | `ForecastResult` or `ForecastShapleyResult` | Build an oShapley/PBSV sidecar for a completed runner result. | | `oshapley_output(value, output="oshapley", loss="rmse", sidecar_name=None, ...)` | `ForecastResult` sidecar, `ForecastShapleyResult`, or precomputed backend object | `DataFrame`, `dict`, or summary object | Select one oShapley/PBSV output table without manually opening the sidecar. | | `forecast_shapley_output(...)` | same as `oshapley_output(...)` | same as `oshapley_output(...)` | Descriptive alias for `oshapley_output(...)`. | | `oshapley_provider(X, y, models, window=None, ...)` | aligned feature matrix, target, model specs | `AnatomyModelProvider` | Build the provider required by the optional backend precompute loop. | | `precompute_oshapley(X, y, models, window=None, ...)` | same as `oshapley_provider` | backend object | Build provider and run the optional backend precompute. | | `window_to_oshapley_subsets(window, index, train_source="fit")` | `WindowSpec` and index | backend subsets | Convert macroforecast train/test origins to exact backend subsets. | | `anatomy_output_transformer(output="forecast")` | raw forecast/loss output name or callable | `AnatomyModelOutputTransformer` | Build the optional backend transformer used only inside forecast-accuracy anatomy explanations. | | `anatomy_explain(anatomy, model_groups=None, metric="forecast", output="long")` | precomputed backend object or saved path | `DataFrame` | Low-level backend wrapper for raw Shapley explanations. | | `anatomy_model(model, feature_names=None)` | `ModelFit` or estimator with `predict` | `AnatomyModel` | Wrap a macroforecast fit as an anatomy prediction function. | | `window_to_anatomy_subsets(window, index, train_source="fit")` | `WindowSpec` and index | `AnatomySubsets` | Convert macroforecast train/test origins to exact anatomy subsets. | | `anatomy_provider(X, y, models, window=None, ...)` | aligned feature matrix, target, model specs | `AnatomyModelProvider` | Build the provider required by the upstream anatomy precompute loop. | | `precompute_anatomy(X, y, models, window=None, ...)` | same as `anatomy_provider` | `Anatomy` | Build provider and run `Anatomy(...).precompute(...)`. | | `anatomy_pipeline(X, y, models, window=None, ...)` | aligned feature matrix, target, model specs | `AnatomyPipelineResult` | Convenience path that precomputes anatomy and returns forecast explanation, oShapley-VI, and PBSV tables. | | `anatomy_from_forecast_result(result, X, y, models, window=..., ...)` | `ForecastResult` plus explicit aligned feature matrix/target/model/window inputs | `ForecastResult` or `AnatomyPipelineResult` | Build an anatomy sidecar for a completed runner result. | | `oshapley_vi(anatomy, ...)` | precomputed `anatomy.Anatomy` object or saved path | `DataFrame` | Backend oShapley-VI: mean absolute raw-forecast Shapley contribution. | | `pbsv(anatomy, loss="rmse", ...)` | precomputed `anatomy.Anatomy` object or saved path | `DataFrame` | Backend PBSV_p loss decomposition. | | `performance_based_shapley_value(...)` | same as `pbsv(...)` | `DataFrame` | Alias for backend PBSV_p. | | `ishapley_vi(contributions, ...)` | in-sample contribution table | `DataFrame` | Explicit iShapley-VI_p table adapter for MAS input. | | `shapley_variable_importance(contributions, ...)` | local contribution table | `DataFrame` | Mean absolute Shapley VI table, usable for in-sample or out-of-sample VI. | | `model_accordance_score(is_vi, oos_pbsv, ...)` | VI table/Series and PBSV table/Series | one-row `DataFrame` | Backend MAS and optional MAS p-value. | | `performance_shapley_value(contributions, y, loss="squared_error", ...)` | additive forecast-contribution table and aligned target | `DataFrame` | Package-native PBSV-style point-loss Shapley fallback from additive forecast contributions. | | `forecast_decomposition(model, X, row=-1)` | fitted predictor and feature frame | `DataFrame` | Linear contribution of each feature to one prediction. | | `cumulative_r2_contribution(model, X, y, feature_order=None)` | fitted predictor, feature frame, target vector | `DataFrame` | Sequential R-squared contribution under zero-filled inactive features. | | `rolling_recompute(model, X, y, window=None, step=None, method="permutation_importance")` | fitted predictor, feature frame, target vector | `DataFrame` | Recompute importance over rolling evaluation windows. | | `bootstrap_jackknife(model, X, y, fit_func=None, n_replications=50)` | fitted predictor or refit callable, feature frame, target vector | `DataFrame` | Bootstrap uncertainty summary for importance. | | `group_aggregate(table, groups=None, ...)` | feature-importance table | `DataFrame` | Aggregate feature importance by user or inferred groups. | | `lineage_attribution(table, lineage, level="pipeline_name")` | feature-importance table and lineage mapping | `DataFrame` | Aggregate importance by feature-engineering lineage metadata. | | `transformation_attribution(evaluation, ..., lower_is_better=True, baseline="worst")` | evaluation score table | `DataFrame` | Attribute score improvements to mutually exclusive preprocessing/feature pipelines. | | `attention_weights(X_train, X_test=None, ...)` | training and test feature frames | long `DataFrame` | OLS attention matrix weights. | | `ols_attention_weights(X_train, X_test=None, ...)` | training and test feature frames | long `DataFrame` | Exact OLS-as-attention weights from Goulet Coulombe (2026). | | `ridge_attention_weights(X_train, X_test=None, alpha=...)` | training and test feature frames | long `DataFrame` | Ridge-stabilized attention weights using the standard unpenalized-intercept ridge convention. | | `ols_attention_embedding(X_train, X_test=None, ...)` | training and test feature frames | long `DataFrame` | Whitened train/test embeddings whose inner products reconstruct attention weights. | | `ols_attention_equivalence(X_train, y_train, X_test=None, ...)` | train features/outcomes, optional test features, optional reference predictions | `DataFrame` | Audit that standard OLS/ridge predictions equal attention-weight predictions. | | `dual_decomposition(X_train, y_train, X_test=None, ...)` | train features/outcomes and optional test features | long `DataFrame` | OLS prediction as weighted training-outcome contributions. | | `dual.DualInterpretationResult` / `DualInterpretationResult` | output from `dual_interpretation(...)` | result object | Bundles observation weights, observation contributions, forecast diagnostics, top observations, group weights, and metadata. | | `dual.dual_interpretation(...)` / `dual_interpretation(...)` | model, train/test feature frames, train target | `DualInterpretationResult` | Run the dedicated dual interpretation path. Full contract: [Dual Interpretation](interpretation_dual.md). | | `dual.dual_from_forecast_result(...)` / `dual_from_forecast_result(...)` | completed `ForecastResult`, fitted model, explicit train/test feature frames, train target | `ForecastResult` or `DualInterpretationResult` | Build a dual sidecar for a completed runner result. | | `dual.observation_weights(model, X_train, X_test=None, method="auto", ...)` / `observation_weights(...)` | fitted model when needed, train/test feature frames | long `DataFrame` | DualML-style historical-observation weights for ridge, KRR, and random forest. | | `dual.observation_contributions(weights, y_train, center=False, include_base=False)` / `observation_contributions(...)` | observation-weight table and train target | long `DataFrame` | Convert observation weights into observation contributions whose rows sum to each forecast. | | `dual.forecast_diagnostics(weights, top_q=0.05)` / `forecast_diagnostics(...)` | observation-weight table | `DataFrame` | Forecast concentration, short position, leverage, gross leverage, and turnover. | | `dual.top_observations(weights, y_train=None, n=10, sort_by="abs_weight")` / `top_observations(...)` | observation weights or contributions | long `DataFrame` | Highest-weight historical observations per forecast date. | | `dual.group_observation_weights(weights, groups, y_train=None)` / `group_observation_weights(...)` | observation weights and user episode groups | `DataFrame` | Aggregate weights/contributions over regimes such as recessions or inflation episodes. | | `outcome_contributions`, `data_portfolio_diagnostics`, `top_episodes`, `episode_group_weights` | same as preferred dual names | `DataFrame` | Backward-compatible aliases. Prefer the paper-aligned names above in new code. | | `mrf_gtvp(model, X=None)` | fitted `macro_random_forest` after prediction | long `DataFrame` | Time-varying coefficient path from MacroRandomForest `betas`. | | `generalized_irf(model, n_periods=12, target=None)` | fitted VAR model | `DataFrame` | Pesaran-Shin generalized IRF importance. | | `orthogonalised_irf(model, n_periods=12, target=None)` | fitted VAR model | `DataFrame` | Cholesky orthogonalised IRF importance. | | `fevd(model, n_periods=12, target=None)` | fitted VAR model | `DataFrame` | Forecast error variance decomposition importance. | | `historical_decomposition(model, max_lag=12, target=None)` | fitted VAR model | `DataFrame` | Reduced-form MA residual contribution summary. | | `lasso_inclusion_frequency(model, X=None, y=None, fit_func=None, ...)` | lasso-style fit, optionally refit callable | `DataFrame` | Single-fit or bootstrap nonzero coefficient frequency. | | `lstm_hidden_state(model, X)` | fitted torch LSTM/GRU model and feature frame | `DataFrame` | Hidden-unit activation importance. | | `saliency_map`, `integrated_gradients`, `gradient_shap`, `deep_lift` | fitted torch model and feature frame | `DataFrame` | Gradient-based neural attribution. | | `custom_interpretation(model, X, func, y=None, name=None, **params)` | fitted predictor, feature frame, user callable | `DataFrame` | User-defined interpretation table with metadata attrs. | ## Native Model Interpretation ### linear_coefficients ```python macroforecast.interpretation.linear_coefficients(model, *, sort=True) ``` Input: a `ModelFit` or estimator exposing `coef_`. Output columns: | Column | Meaning | | --- | --- | | `feature` | Feature name. | | `coefficient` | Signed coefficient. | | `abs_coefficient` | Absolute coefficient used for sorting. | ### tree_importance ```python macroforecast.interpretation.tree_importance(model, *, sort=True) ``` Input: a `ModelFit` or estimator exposing `feature_importances_`. Output columns: | Column | Meaning | | --- | --- | | `feature` | Feature name. | | `importance` | Native estimator importance. | ## Model-Agnostic Importance ### permutation_importance ```python macroforecast.interpretation.permutation_importance( model, X, y, *, metric="mse", n_repeats=5, random_state=None, ) ``` Input: fitted predictor, feature `DataFrame`, and aligned target vector. Output columns: | Column | Meaning | | --- | --- | | `feature` | Permuted feature. | | `importance` | Mean loss increase after permutation. | | `std` | Standard deviation across repeats. | | `baseline_loss` | Loss before permutation. | | `n_repeats` | Repeat count. | Supported metric names: `mse`, `mae`. A custom callable can also be supplied. ### permutation_importance_strobl ```python macroforecast.interpretation.permutation_importance_strobl( model, X, y, *, metric="mse", n_repeats=5, n_bins=5, random_state=None, ) ``` Input: fitted predictor, feature `DataFrame`, and aligned target vector. Output columns: `feature`, `importance`, `std`, `baseline_loss`, `n_repeats`, `conditioning_feature`, and `n_bins`. This is a Strobl-style conditional-permutation approximation. For each feature, the function finds the most correlated companion feature and permutes inside quantile bins of that companion. The exact Strobl conditional permutation literature can condition on richer structures; this callable uses a single-companion quantile-bin rule and records `exact_reference_implementation=False` in metadata. ### lofo_importance ```python macroforecast.interpretation.lofo_importance( model, X, y, *, fit_func=None, metric="mse", sort=True, ) ``` Input: feature `DataFrame`, target vector, and either an already fitted model or a refit callable. If `fit_func` is supplied, it must have signature `fit_func(X_train, y_train) -> fitted_model`. Output columns: `feature`, `importance`, `baseline_loss`, `heldout_loss`, and `mode`. `mode="refit"` means true leave-one-feature-out refitting. `mode="prediction_drop"` means the fitted model is reused and the held-out feature is set to zero. The second mode is a fixed-model diagnostic, not an exact LOFO refit experiment. ## Effect Curves ### partial_dependence ```python macroforecast.interpretation.partial_dependence( model, X, *, features, grid_size=20, ) ``` Input: fitted predictor, feature `DataFrame`, and one feature or list of features. Output columns: `feature`, `value`, `prediction`. Grid strategy: `linear_min_max`. For each selected feature, macroforecast uses `grid_size` equally spaced values between the observed minimum and maximum in `X`. This is the same brute-force averaging definition as sklearn/R PDP, with a simpler explicit grid rule. ### individual_conditional_expectation / ice_curves ```python macroforecast.interpretation.individual_conditional_expectation( model, X, *, features, grid_size=20, center=False, ) macroforecast.interpretation.ice_curves( model, X, *, features, grid_size=20, center=False, ) ``` Input: fitted predictor, feature `DataFrame`, and one feature or list of features. Output columns: | Column | Meaning | | --- | --- | | `feature` | Feature varied along the grid. | | `row` | Original row position. | | `index` | Original pandas index value. | | `value` | Grid value assigned to the feature. | | `prediction` | Individual prediction for that row and grid value. | | `centered_prediction` | Prediction minus the first grid prediction when `center=True`; otherwise `NaN`. | This matches the brute-force ICE idea used by scikit-learn's partial dependence with `kind="individual"` and R `pdp::partial(..., ice=TRUE)`. `partial_dependence()` averages over rows. ICE keeps the row dimension. The grid strategy is also `linear_min_max`. ### accumulated_local_effect ```python macroforecast.interpretation.accumulated_local_effect( model, X, *, feature, bins=10, ) ``` Input: fitted predictor, feature `DataFrame`, and one feature name. Output columns: `feature`, `bin`, `center`, `ale`, `local_effect`. Binning strategy: empirical quantile bins. The returned `ale` curve is centered to mean zero after accumulating local finite differences. ### friedman_h_interaction ```python macroforecast.interpretation.friedman_h_interaction( model, X, *, features=None, grid_size=10, ) ``` Input: fitted predictor and feature `DataFrame`. `features=None` checks all feature pairs. Output columns: `feature_1`, `feature_2`, `h_statistic`, `joint_variance`, `interaction_variance`, and `grid_size`. The statistic is computed from manual one-way and two-way partial dependence. It is an interaction diagnostic, not a forecast-accuracy test. ## SHAP Values ### shap_values ```python macroforecast.interpretation.shap_values( model, X, *, background=None, explainer="auto", check_additivity=True, **kwargs, ) ``` Input: a fitted `ModelFit` or estimator plus a feature `DataFrame`. `background` defaults to `X`. For stable out-of-sample interpretation, pass a training or reference sample as `background`. Output: long `DataFrame`, one row per `(observation, feature)`. | Column | Meaning | | --- | --- | | `row` | Integer row position in `X`. | | `index` | Original pandas index value. | | `feature` | Feature name. | | `feature_value` | Observed feature value. | | `shap_value` | SHAP contribution. | | `base_value` | SHAP base value when provided by the backend. | Supported `explainer` values: | Value | Backend behavior | | --- | --- | | `auto` | Uses `shap.Explainer` with the model prediction function and background frame. | | `permutation` | Uses `shap.PermutationExplainer`. | | `tree` | Uses `shap.TreeExplainer` on the native estimator. | SHAP is optional: ```bash pip install "macroforecast[interpretation]" ``` ### shap_importance and SHAP wrappers ```python macroforecast.interpretation.shap_importance(model, X, background=None, explainer="auto") macroforecast.interpretation.shap_tree(model, X, background=None) macroforecast.interpretation.shap_linear(model, X, background=None) macroforecast.interpretation.shap_kernel(model, X, background=None) macroforecast.interpretation.shap_deep(model, X, background=None) ``` Input: fitted predictor and feature `DataFrame`. Output columns: `feature`, `importance`, `mean_shap`, and `std_shap`. `importance` is mean absolute SHAP value. `shap_tree` forces the tree explainer. `shap_kernel` uses the permutation SHAP path. `shap_linear` and `shap_deep` use the generic explainer path because the installed `shap` backend determines the exact implementation. ## Forecast Accuracy Anatomy This section covers the Shapley-based forecast-accuracy anatomy in Borup, Goulet Coulombe, Rapach, Montes Schutte, and Schwenk-Nebbe (2022), *The Anatomy of Out-of-Sample Forecasting Accuracy*, Federal Reserve Bank of Atlanta Working Paper 2022-16, DOI `10.29338/wp2022-16`, SSRN `4278745`. The paper separates several related quantities: | Quantity | Meaning | | --- | --- | | iShapley-VI | In-sample Shapley variable importance over a training/reference sample. The upstream `anatomy` package accepts this as a MAS input but does not compute a separate iShapley object from `Anatomy.explain(...)`; macroforecast standardizes user-supplied in-sample contribution tables with `ishapley_vi(...)`. | | oShapley-VI | Out-of-sample Shapley variable importance over the forecast sample. In the upstream README this is computed by anatomizing raw forecasts and averaging absolute local contributions. | | PBSV_p | Performance-based Shapley value for predictor `p`. This is not a separate `PBSVp` algorithm; the subscript `p` denotes the predictor. A negative value reduces a lower-is-better loss. | | MAS | Model-adaptation score comparing variable-importance and performance-based rankings. | Backend coverage in macroforecast: | Method | Callable | Backend status | | --- | --- | --- | | Raw anatomy explanations | `anatomy_explain(..., metric="forecast")` | Direct `anatomy.Anatomy.explain(...)`. | | oShapley-VI | `oshapley_vi(...)` | Direct backend raw-forecast explanation plus the upstream README aggregation rule. | | PBSV_p | `pbsv(...)` / `performance_based_shapley_value(...)` | Direct backend loss transformer through `anatomy.Anatomy.explain(...)`. | | MAS | `model_accordance_score(...)` | Direct `anatomy.MAS(...).compute(...)`. | | iShapley-VI input | `ishapley_vi(...)` / `shapley_variable_importance(...)` | Table adapter. Use `ishapley_vi(...)` on an in-sample SHAP/anatomy contribution table; upstream `MAS` then consumes the resulting VI. | | Additive fallback | `performance_shapley_value(...)` | macroforecast-native point-loss Shapley adapter for SHAP/decomposition tables; not the full `anatomy` provider. | ### Window support The upstream Python `anatomy` package supports both single-forecast and multi-window anatomy designs. | Design | How it is represented upstream | macroforecast wrapper behavior | | --- | --- | --- | | Single forecast | Provider with one period, or `explanation_subset` selecting one forecast date. | `anatomy_explain(..., explanation_subset=[date])`, `pbsv(..., loss="squared_error")`, and `pbsv(..., loss="absolute_error")` return local rows for that forecast date. | | Expanding window | `AnatomySubsets.generate(..., estimation_type=AnatomySubsets.EstimationType.EXPANDING, periods=..., gap=...)`. | macroforecast consumes the precomputed `Anatomy` and preserves the forecast-date index returned by the backend. | | Rolling window | `AnatomySubsets.generate(..., estimation_type=AnatomySubsets.EstimationType.ROLLING, periods=..., gap=...)`. | Same as expanding: the window scheme is fixed when the upstream `Anatomy` object is precomputed. | | Full OOS or subperiod summary | `Anatomy.explain(..., explanation_subset=None)` for all forecasts, or a subset index for a subperiod. Aggregating transformers such as RMSE/MAE/MSE return one global row per model set. | `pbsv(loss="rmse" | "mse" | "mae")` returns global or subperiod PBSV rows. | Important boundary: `macroforecast.interpretation` does not decide the rolling or expanding split. The `anatomy.Anatomy` object already contains the provider loop, fitted models, forecast periods, and masking logic. For an end-to-end macroforecast-native path, use `anatomy_provider(...)`, `precompute_anatomy(...)`, or `anatomy_pipeline(...)` with a `WindowSpec`. ### oShapley/PBSV bridge utilities The smooth user-facing path uses method names rather than backend package names: ```python result = mf.forecasting.run(feature_set, "ridge", window=window) result = mf.interpretation.oshapley_from_forecast_result( result, feature_set.X, feature_set.y.iloc[:, 0], {"ridge": "ridge"}, window=window, losses=("squared_error", "rmse"), ) mf.output.write_artifacts(result, "results/oshapley_run") ``` Once the sidecar is attached, choose the exact output from `macroforecast.interpretation`: ```python vi = mf.interpretation.oshapley_output(result, output="oshapley") loss = mf.interpretation.oshapley_output(result, output="pbsv", loss="rmse") local = mf.interpretation.oshapley_output(result, output="forecast") tables = mf.interpretation.oshapley_output(result, output="tables") meta = mf.interpretation.oshapley_output(result, output="metadata") ``` Supported `output` values: | Output | Meaning | | --- | --- | | `oshapley` | oShapley-VI table, equivalent aliases: `vi`, `variable_importance`, `importance`, `oshapley_vi`, `o_shapley_vi`. | | `pbsv` | PBSV/loss-decomposition table for `loss`, equivalent aliases: `performance`, `loss`, `loss_decomposition`. | | `forecast` | Raw forecast-explanation Shapley rows, equivalent aliases: `raw`, `prediction`, `explain`, `forecast_explanation`. | | `tables` | Dictionary of all CSV/JSON-ready output tables. `output="all"` is an alias. Use `table="oshapley_variable_importance"` to pick one named table. | | `metadata` | Flattened metadata table for the oShapley/PBSV run. | | `summary` | JSON-ready dictionary with metadata, forecast explanations, oShapley-VI, and PBSV tables. | If a `ForecastResult` has exactly one sidecar, the selector uses it. If multiple sidecars are attached and none has the preferred names `oshapley`, `forecast_shapley`, or `anatomy`, pass `sidecar_name=...`. The `anatomy_*` functions below are backend aliases. Use them only when the backend object itself is the object being handled. ### backend bridge utilities ```python macroforecast.interpretation.window_to_anatomy_subsets( window, index, *, train_source="fit", ) ``` The user-facing alias is: ```python macroforecast.interpretation.window_to_oshapley_subsets(...) ``` Input: a `WindowSpec`, method name, or `None`, plus the feature/target index. Output: upstream `anatomy.AnatomySubsets`. Unlike `AnatomySubsets.generate(...)`, this conversion follows the exact macroforecast `WindowSpec` plan, including rolling/expanding/fixed estimation, calendar or positional test steps, and retrain cadence. `train_source="fit"` uses the model-fit sample at each origin; `train_source="estimation"` uses the full estimation sample. ```python macroforecast.interpretation.anatomy_provider( X, y, models, *, window=None, params=None, target_name=None, train_source="fit", ) ``` The user-facing alias is: ```python macroforecast.interpretation.oshapley_provider(...) ``` Input: an already aligned forecast-row feature matrix `X` and target `y`. `models` can be a model name, `ModelSpec`, sequence, or alias mapping. The target must already represent the forecast object being explained, such as a direct `y[t+h]` target or a precomputed growth target. Panel-to-feature construction remains the job of `macroforecast.feature_engineering` or `macroforecast.forecasting`. Output: upstream `AnatomyModelProvider`. The provider fits the requested macroforecast model at each anatomy period and wraps the fit with `anatomy_model(...)`. ```python macroforecast.interpretation.precompute_anatomy( X, y, models, *, window=None, n_iterations=32, n_jobs=1, save_path=None, ) ``` The user-facing alias is: ```python macroforecast.interpretation.precompute_oshapley(...) ``` Output: precomputed upstream `anatomy.Anatomy` object. ```python macroforecast.interpretation.anatomy_pipeline( X, y, models, *, window=None, losses=("rmse",), n_iterations=32, n_jobs=1, ) ``` The user-facing alias is: ```python macroforecast.interpretation.oshapley_pipeline(...) ``` Output: `ForecastShapleyResult` (`AnatomyPipelineResult` backend alias) with: | Attribute | Meaning | | --- | --- | | `anatomy` | Precomputed backend object. | | `explanations["forecast"]` | Raw-forecast anatomy table from `anatomy_explain(...)`. | | `variable_importance` | oShapley-VI table from `oshapley_vi(...)`. | | `performance_values[loss]` | PBSV table for each requested loss. | | `metadata` | Window/model/loss settings used to build the provider. | `AnatomyPipelineResult.to_tables(prefix="anatomy")` returns named output tables for raw forecast explanations, oShapley-VI, PBSV/loss decompositions, and metadata. `macroforecast.output` uses this method when writing anatomy sidecars attached to a `ForecastResult`. Example: ```python window = mf.window.from_cutoffs( test_start=X.index[120], estimation_min_size=120, horizon=1, step=1, ) anat = mf.interpretation.anatomy_pipeline( X, y_direct, {"ridge": "ridge", "forest": "random_forest"}, window=window, losses=("squared_error", "rmse"), n_iterations=64, n_jobs=4, ) anat.variable_importance anat.performance_values["rmse"] ``` ```python macroforecast.interpretation.oshapley_from_forecast_result( result, X, y, models, *, window, attach=True, sidecar_name="anatomy", ) ``` Input: a completed `ForecastResult` plus the aligned `X`, `y`, model specs, and the exact `WindowSpec` used for the anatomy refit loop. The explicit `X/y` and `window` are required because a forecast table alone cannot reconstruct the feature matrix, train/test subsets, or model-provider path without risking look-ahead mistakes. Output: if `attach=True`, a copy of `ForecastResult` with an oShapley sidecar attached under `sidecar_name`. If `attach=False`, returns the standalone `ForecastShapleyResult`. ```python result = mf.forecasting.run(feature_set, "ridge", window=window) result = mf.interpretation.oshapley_from_forecast_result( result, feature_set.X, feature_set.y.iloc[:, 0], {"ridge": "ridge"}, window=window, losses=("squared_error", "rmse"), ) mf.output.write_artifacts(result, "results/anatomy_run") ``` Dual sidecars use the same completed-result pattern, but they do not refit the forecasting pipeline. The user passes the fitted model and exact train/test feature matrices: ```python result = mf.forecasting.run(feature_set, "ridge", window=window) result = mf.interpretation.dual_from_forecast_result( result, fit, X_train, y_train, X_test, method="ridge", ) mf.output.write_artifacts(result, "results/dual_run", layout="grouped") ``` ### anatomy_explain ```python macroforecast.interpretation.anatomy_explain( anatomy, *, model_groups=None, transformer=None, metric="forecast", explanation_subset=None, output="long", ) ``` Input: a precomputed `anatomy.Anatomy` object or a path saved by the Python `anatomy` package. The upstream package owns the expensive provider loop: rolling or expanding forecast periods, train/test extraction, model retrieval or fitting, background masking, model-combination handling, and Shapley precomputation. Output: | Column | Meaning | | --- | --- | | `model_set` | Model or model-combination name from the anatomy object. | | `index` | Forecast-date index for local explanations, or the global subset label for aggregated explanations. | | `feature` | `base_contribution` or predictor name. | | `contribution` | Shapley contribution returned by `anatomy.Anatomy.explain(...)`. | | `is_base` | Whether the row is the base contribution. | Set `output="wide"` to receive the backend's original wide table with `base_contribution` plus feature columns. Supported built-in `metric` values: | Metric | Explained value | | --- | --- | | `forecast` | Raw forecast, matching the backend default transformer. | | `squared_error` | Local point squared error `((y - y_hat) ** 2)`. | | `absolute_error` | Local point absolute error `abs(y - y_hat)`. | | `mse` | Aggregated mean squared error over the selected forecast subset. | | `rmse` | Aggregated root mean squared error over the selected forecast subset. | | `mae` | Aggregated mean absolute error over the selected forecast subset. | You can pass a custom `transformer` directly. If it is callable, macroforecast wraps it as `anatomy.AnatomyModelOutputTransformer`. The callable must follow the upstream API and use a named `y_hat` argument, optionally with `y`. Example: ```python table = mf.interpretation.anatomy_explain( "run_artifacts/anatomy.pkl", model_groups={"forest_plus_linear": ["rf", "ridge"]}, metric="squared_error", ) ``` This is the preferred route when the goal is to reproduce the paper's full forecast-anatomy design. It preserves the upstream provider's rolling or expanding training windows and masking logic. ### oshapley_vi ```python macroforecast.interpretation.oshapley_vi( anatomy, *, model_groups=None, explanation_subset=None, exclude_base=True, ) ``` Input: a precomputed `anatomy.Anatomy` object or saved path. Output columns: | Column | Meaning | | --- | --- | | `model_set` | Model or model-combination name. | | `feature` | Predictor name. | | `importance` | Mean absolute raw-forecast Shapley contribution. | | `mean_contribution` | Signed mean contribution. | | `std_contribution` | Standard deviation of local contributions. | | `n_rows` | Number of local contribution rows. | | `rank` | Rank within `model_set`. | This matches the upstream README recipe: ```python df_oshapley = anatomy.explain( model_sets=AnatomyModelCombination(groups=groups), transformer=AnatomyModelOutputTransformer(transform=lambda y_hat: y_hat), ) vi = df_oshapley.loc["model_name"].abs().mean(axis=0) ``` macroforecast performs that sequence through the backend and returns a tidy table. ### pbsv / performance_based_shapley_value ```python macroforecast.interpretation.pbsv( anatomy, *, model_groups=None, loss="rmse", transformer=None, explanation_subset=None, output="long", ) ``` Input: a precomputed `anatomy.Anatomy` object or saved path. Output: same shape as `anatomy_explain(...)`, with metadata kind `pbsv`. `performance_based_shapley_value(...)` is an alias for `pbsv(...)`. Built-in loss choices: | Loss | Scope | | --- | --- | | `rmse` | Global/subperiod PBSV_p for RMSE. | | `mse` | Global/subperiod PBSV_p for MSE. | | `mae` | Global/subperiod PBSV_p for MAE. | | `squared_error` | Local PBSV_p for each forecast row. | | `absolute_error` | Local PBSV_p for each forecast row. | For R-squared, gains, utility, custom benchmarks, or portfolio-style losses, pass `transformer=` directly. The callable must follow the backend API: `transform(y_hat, y)` or `transform(y_hat)`. ### ishapley_vi / shapley_variable_importance ```python macroforecast.interpretation.ishapley_vi( contributions, *, contribution_col=None, feature_col="feature", group_col=None, exclude_base=True, ) ``` Input: an in-sample/training Shapley contribution table with a feature column and a contribution column. This is the explicit macroforecast callable for iShapley-VI_p. Output: a mean-absolute in-sample Shapley VI table with metadata kind `ishapley_vi`. ```python macroforecast.interpretation.shapley_variable_importance( contributions, *, contribution_col=None, feature_col="feature", group_col=None, exclude_base=True, source="contribution_table", ) ``` Input: any local contribution table with a feature column and a contribution column. The function infers common contribution names such as `contribution`, `shap_value`, and `forecast_contribution`. Output: a mean-absolute Shapley VI table. Use it in two places: | Use | Example | | --- | --- | | iShapley-VI | Prefer `ishapley_vi(...)` on an in-sample/training contribution table, then pass to `model_accordance_score(...)`. | | oShapley-VI | Run on an out-of-sample raw-forecast contribution table. `oshapley_vi(...)` does this automatically for backend anatomy objects. | ### model_accordance_score ```python macroforecast.interpretation.model_accordance_score( is_vi, oos_pbsv, *, loss_type="lower_is_better", mas_type="importance_weighted", hypothesis_test=True, h0_alpha=0.5, n_samples=1000000, random_state=None, ) ``` Input: | Argument | Meaning | | --- | --- | | `is_vi` | Feature-indexed Series or VI table. This can be iShapley-VI or oShapley-VI. | | `oos_pbsv` | Feature-indexed Series or PBSV table. | | `loss_type` | `lower_is_better` for RMSE/MSE/MAE losses; `larger_is_better` for gains/scores. | | `mas_type` | `importance_weighted` or `equal_weighted`, matching upstream `anatomy.MAS`. | Output columns: `mas`, `mas_p_value`, `mas_type`, `loss_type`, `hypothesis_test`, `h0_alpha`, `n_samples`, and `n_features`. The function calls `anatomy.MAS(...).compute(...)` directly after converting macroforecast tables to the feature-indexed Series expected by the backend. When `random_state` is supplied, macroforecast temporarily seeds NumPy around the backend hypothesis-test simulation and then restores the previous random state. ### performance_shapley_value ```python macroforecast.interpretation.performance_shapley_value( contributions, y, *, loss="squared_error", row_col=None, feature_col="feature", contribution_col=None, base_col="base_value", base_value=0.0, n_permutations=None, max_exact_features=8, random_state=0, return_local=False, ) ``` Input: an additive forecast-contribution table and an aligned target vector. The contribution table can be the output of `shap_values(...)`, `forecast_decomposition(...)`, or any custom table with: | Required item | Default column inference | | --- | --- | | Row identity | `index` when it matches `y.index`, otherwise `row`. | | Feature name | `feature`. | | Additive forecast contribution | `shap_value`, `forecast_contribution`, `contribution`, or `value`. | | Base forecast | `base_value`, otherwise scalar `base_value` argument. | If the table has no `row` or `index` column, macroforecast treats it as a single-forecast contribution table. This is useful for one-row output from `forecast_decomposition(...)`. For each forecast row, the function evaluates the cooperative game: ```text v(S) = L(y_t, base_t + sum_{j in S} contribution_{t,j}) ``` and assigns feature `j`: ```text PBSV_{t,j} = sum_{S subset N \ {j}} |S|! (p - |S| - 1)! / p! [v(S union {j}) - v(S)] ``` The base row is the empty-coalition loss `L(y_t, base_t)`. Therefore: ```text base_loss_t + sum_j PBSV_{t,j} = full_loss_t ``` A negative `pbsv` means the feature reduced the selected loss for that forecast. A positive `pbsv` means it increased the loss. This sign convention is the useful paper-table convention for forecast accuracy: beneficial variables are negative when the metric is a loss. Output with `return_local=False`: | Column | Meaning | | --- | --- | | `feature` | Feature name or `__base__`. | | `is_base` | Whether the row is the base loss. | | `pbsv` | Mean PBSV across forecast rows. | | `abs_pbsv` | Mean absolute PBSV. | | `n_rows` | Number of forecast rows. | | `baseline_loss` | Mean empty-coalition loss. | | `full_loss` | Mean full-contribution loss. | | `rank` | Display rank in the returned table. | Output with `return_local=True` keeps one row per forecast and feature, with `actual`, `base_prediction`, `full_prediction`, `baseline_loss`, `full_loss`, `shapley_mode`, and `effective_permutations`. Exact Shapley weights are used when the number of features in a forecast row is at most `max_exact_features` and `n_permutations=None`. For larger rows, the function samples random feature orderings. Metadata records `max_efficiency_error`, which should be near zero when the contribution table is additive and row alignment is correct. Example from SHAP output: ```python shap_table = mf.interpretation.shap_values( fit, X_test, background=X_train, explainer="tree", ) pbsv = mf.interpretation.performance_shapley_value( shap_table, y_test, contribution_col="shap_value", loss="squared_error", ) ``` This adapter is useful when you already have additive local contributions and want a PBSV-style loss view inside the macroforecast flow. It does not rebuild the full Borup et al. rolling provider, background replacement, or iShapley / oShapley machinery. Use `anatomy_explain(...)` for that full backend path. ## Forecast And Model Decomposition ### forecast_decomposition ```python macroforecast.interpretation.forecast_decomposition(model, X, row=-1) ``` Input: fitted model and feature `DataFrame`. `row` can be an integer position or an index label. Output columns for linear models: `feature`, `feature_value`, `coefficient`, `contribution`, and `abs_contribution`. For linear models, the contribution sum equals the selected row's prediction up to numerical precision. If the model has no coefficients but does expose tree importance, the function returns a non-additive fallback. In that case `contribution` is `NaN`, `status="tree_importance_fallback_not_additive"`, and metadata records `prediction_additivity=False`. Do not report that fallback as a forecast decomposition. ### cumulative_r2_contribution ```python macroforecast.interpretation.cumulative_r2_contribution( model, X, y, *, feature_order=None, ) ``` Input: fitted predictor, feature `DataFrame`, and target vector. Output columns: `step`, `feature`, `r2`, `incremental_r2`, and `cumulative_features`. Inactive features are set to zero. The default order is native coefficient ranking when available, otherwise one-repeat permutation importance. This is a fixed-model masking diagnostic; the model is not refit as features enter. ### rolling_recompute ```python macroforecast.interpretation.rolling_recompute( model, X, y, *, window=None, step=None, method="permutation_importance", n_repeats=1, random_state=None, ) ``` Input: fitted predictor, feature `DataFrame`, and target vector. Output: feature-importance rows with `window_id`, `window_start`, and `window_end` columns added. Supported methods: | Method | Meaning | | --- | --- | | `permutation_importance` | Vanilla loss-degradation permutation within each rolling window. | | `permutation_importance_strobl` | Conditional permutation within each rolling window. | This function does not refit the model. It asks whether the already fitted model relies on different variables in different evaluation windows. Metadata records `refits_model=False`. ### bootstrap_jackknife ```python macroforecast.interpretation.bootstrap_jackknife( model, X, y, *, fit_func=None, n_replications=50, random_state=None, ) ``` Input: fitted predictor, feature frame, target vector, and optionally a refit callable. With `fit_func`, each bootstrap sample is refit and native coefficient importance is summarized. Without it, each bootstrap sample uses fixed-model permutation importance. The current implementation is bootstrap with replacement; metadata records `jackknife=False`. Output columns: `feature`, `importance`, `std`, `lower`, `upper`, and `n_replications`. ### lasso_inclusion_frequency ```python macroforecast.interpretation.lasso_inclusion_frequency( model, X=None, y=None, *, fit_func=None, n_bootstraps=50, random_state=None, ) ``` Input: a lasso-style fitted model. With `X`, `y`, and `fit_func`, the function bootstraps the sample and refits to estimate nonzero coefficient frequency. Without those arguments, it reports the single fitted model's nonzero coefficient indicator. Output columns include `feature`, `inclusion_frequency`, and `importance`. ## Group And Pipeline Attribution ### group_aggregate ```python macroforecast.interpretation.group_aggregate( table, *, groups=None, group_column=None, value_column=None, aggregation="sum", ) ``` Input: a feature-importance table with a `feature` column. `groups` can be either `{feature: group}` or `{group: [features...]}`. Output columns: `group` and `importance`. Supported aggregations: `sum`, `mean`, `max_abs`, and `signed_sum`. ### lineage_attribution ```python macroforecast.interpretation.lineage_attribution( table, lineage, *, level="pipeline_name", value_column=None, aggregation="sum", ) ``` Input: a feature-importance table and a lineage mapping such as `{"pc1": {"pipeline_name": "pca_block"}}`. Output columns: the selected lineage level and `importance`. ### transformation_attribution ```python macroforecast.interpretation.transformation_attribution( evaluation, *, pipeline_column=None, metric=None, method="shapley_over_pipelines", target_columns=("target", "horizon"), lower_is_better=True, baseline="worst", ) ``` Input: an evaluation table with a pipeline/model column and a loss metric column. If not supplied, the function infers common names such as `model`, `model_id`, `pipeline`, `mse`, `rmse`, and `mae`. Output columns: grouping columns when present, `pipeline`, `loss`, `utility`, `contribution`, `baseline`, `method`, `metric`, and `lower_is_better`. For the default loss setting, utility is: ```text utility_i = baseline_loss - loss_i ``` where `baseline="worst"` uses the worst observed pipeline in each target/horizon group. `method="shapley_over_pipelines"` applies exact Shapley weights to the utility game whose coalition value is average utility. This is a coherent way to summarize mutually exclusive pipeline alternatives, but it is not a causal component decomposition unless the evaluation table itself was constructed from component-removal experiments. Supported methods: | Method | Meaning | | --- | --- | | `shapley_over_pipelines` | Exact Shapley contribution over average-utility coalitions. | | `marginal_addition` | Pipeline utility relative to the selected baseline. | | `leave_one_out_pipeline` | Average-utility change after removing one pipeline. | ## Dual Interpretation This block covers the dual interpretation route in Goulet Coulombe, Goebel, and Klieber (2024), "Dual Interpretation of Machine Learning Forecasts" (`arXiv:2412.13076`). Variable-importance tools ask which predictors matter. Dual interpretation asks which historical observations the model is using as analogues for the current forecast. The preferred public surface is the dedicated namespace `macroforecast.interpretation.dual`. The short aliases in `macroforecast.interpretation` remain available for compatibility. See [Dual Interpretation](interpretation_dual.md) for the full input/output contract, `DualInterpretationResult`, and grouped output layout. The core object is a data-portfolio weight: ```text yhat_new = sum_i w_i(new) y_i ``` Positive weights mean the model borrows from similar historical outcomes. Negative weights mean the model uses an observation by contrast or symmetry. Concentrated weights mean the model relies on a few episodes. High leverage means the forecast is extrapolative rather than a simple average. High turnover means the historical analogies change quickly over time. The local reference code reviewed for this implementation is: | Reference path | Used for | | --- | --- | | `wiki/raw/paper_code/coulombe_site_github_20260530/dual_python/auxiliaries.py` | Python RR/KRR/RF weight formulas and DualML diagnostics. | | `wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/DualML.R` | R `FC_CR`, `FSP`, `FT` diagnostic definitions and RF leaf-weight logic. | | `wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/README.md` | Model-route inventory: OLS, RF, LGB, RR, KRR, NN. | Implemented routes: | Route | Formula / logic | macroforecast callable | | --- | --- | --- | | Ridge / OLS | `W = X_test (X_train' X_train + n lambda I)^-1 X_train'` by default. Set `ridge_penalty_scale="none"` for `lambda I`. | `dual.observation_weights(..., method="ridge")` | | Kernel ridge | `W = K_test (K_train + lambda I)^-1`; kernels: `linear`, `gaussian`/`rbf`, `laplace`/`laplacian`. | `dual.observation_weights(..., method="krr", kernel=...)` | | Random forest | For each tree, assign test and train rows to leaves; train rows in the same leaf share weight; average across trees. For sklearn forests, bootstrap sample counts are used when recoverable. | `dual.observation_weights(forest, ..., method="random_forest")` | Not yet implemented: | Route | Status | | --- | --- | | Boosted trees / LightGBM AXIL | Needs a validated AXIL-style path and model-specific tree-channel storage. | | LGB+ / LGBA+ channel-specific weights | Should wait until the LGB+ estimator stores separate linear/tree channels. | | Neural networks | Needs penultimate-layer embeddings and ridge-on-embedding approximation error reporting. | | Classification | Should default to log-odds-space contributions; probability-scale decomposition is order-dependent. | ### dual.observation_weights ```python macroforecast.interpretation.dual.observation_weights( model, X_train, X_test=None, *, method="auto", lambda_=1e-8, kernel="linear", sigma=1.0, add_intercept=False, ridge_penalty_scale="n_train", normalize=False, ) ``` Input: | Argument | Meaning | | --- | --- | | `model` | Required for `method="random_forest"`; can be `None` for ridge/KRR. | | `X_train` | Training feature `DataFrame`. Its index becomes `train_index`. | | `X_test` | Forecast-row feature `DataFrame`. If omitted, train rows are explained against themselves. | | `method` | `auto`, `ridge`, `ols`, `krr`, `kernel_ridge`, `random_forest`, `rf`. | | `lambda_` | Ridge/KRR regularization. Ridge defaults to the DualML convention `n_train * lambda_`; KRR uses `lambda_ I`. | | `kernel` | `linear`, `gaussian`/`rbf`, or `laplace`/`laplacian`. | | `sigma` | Kernel bandwidth convention used by the reference code: `exp(-sigma * distance)`. | | `add_intercept` | Adds an unpenalized intercept for ridge/OLS. The paper/reference code usually assumes standardized no-intercept design. | | `normalize` | Re-normalize row weights to sum to one. Default is `False` because negative weights and leverage are meaningful. | Output columns: | Column | Meaning | | --- | --- | | `test_row`, `test_index` | Forecast row position and index. | | `train_row`, `train_index` | Historical observation position and index. | | `weight`, `abs_weight` | Data-portfolio weight and absolute weight. | | `channel` | Implemented route: `ridge`, `krr`, or `random_forest`. | The full dense matrix is also attached at `attrs["weight_matrix"]` with shape `(n_test, n_train)`. Example: ```python weights = mf.interpretation.observation_weights( None, X_train, X_test, method="krr", kernel="laplace", sigma=1e-4, lambda_=0.1, ) ``` ### dual.observation_contributions ```python macroforecast.interpretation.dual.observation_contributions( weights, y_train, *, center=False, include_base=False, ) ``` Input: an `observation_weights(...)` table and the aligned in-sample target. If `y_train` is a `Series`, it is reindexed to `train_index` when possible. Output columns add: | Column | Meaning | | --- | --- | | `train_y` | Realized historical outcome. | | `centered_train_y` | `train_y - mean(y_train)` when `center=True`; otherwise `train_y`. | | `contribution` | `weight * train_y` by default, or `weight * centered_train_y` with optional base row. | | `prediction` | Sum of contributions for the forecast row. | | `channel` | `episode`, or `base` for the optional centered base row. | Default `center=False` preserves the exact identity `prediction = weights @ y_train`. Set `center=True` for plots that resemble the cumulative centered paths in the DualML reference code. ### dual.forecast_diagnostics ```python macroforecast.interpretation.dual.forecast_diagnostics( weights, *, top_q=0.05, ) ``` Output columns: | Column | Meaning | | --- | --- | | `concentration` | Sum of the top `top_q` share of absolute weights divided by total absolute weight. This matches the R `FC_CR` definition. | | `short_position` | Signed sum of negative weights. This matches the R `FSP` definition. | | `short_position_abs` | Absolute value version of the short side. | | `leverage` | Sum of signed weights. | | `gross_leverage` | Sum of absolute weights. | | `turnover` | Sum of absolute weight changes from the previous forecast row. The first row is `NaN`. | | `top_q`, `top_k`, `n_train` | Diagnostic settings. | Negative weights are not automatically errors. In this paper they can mean symmetry-based inference or extrapolation. The caution is economic: macro shocks are often asymmetric, so a mirror-image historical analogy can be misleading. ### dual.top_observations ```python macroforecast.interpretation.dual.top_observations( weights, *, y_train=None, n=10, sort_by="abs_weight", ) ``` Input: an observation-weight table or contribution table. If `y_train` is supplied and the input has no `contribution` column, contributions are built first. Supported `sort_by` values: `abs_weight`, `weight`, `contribution`, and `abs_contribution`. Output: top historical observations per forecast row, with a `rank` column. This is the most direct table for "which past episodes drove this forecast?" ### dual.group_observation_weights ```python macroforecast.interpretation.dual.group_observation_weights( weights, groups, *, y_train=None, ) ``` Input: observation weights or contributions plus a mapping from group names to training-index labels, for example: ```python groups = { "volcker": pd.period_range("1979Q3", "1982Q4", freq="Q").to_timestamp("Q"), "gfc": pd.period_range("2007Q4", "2009Q2", freq="Q").to_timestamp("Q"), "covid": pd.period_range("2020Q1", "2021Q2", freq="Q").to_timestamp("Q"), } ``` Output columns: `test_row`, `test_index`, `episode_group`, `weight`, `abs_weight`, `n_episodes`, and, when available, `contribution` and `abs_contribution`. Use this when a paper table needs weights assigned to recessions, inflation surges, tightening cycles, COVID episodes, or user-defined regimes. ## OLS As Attention This block implements the algebra in Philippe Goulet Coulombe (2026), "Ordinary Least Squares as an Attention Mechanism." It is not a separate forecasting model. Standard OLS already produces the forecast. These functions make the same forecast inspectable as a query-key-value style weighted average of historical outcomes. The paper identity is: ```text beta_hat = (X_train' X_train)^-1 X_train' y_train y_hat_test = X_test beta_hat = X_test (X_train' X_train)^-1 X_train' y_train = A y_train ``` where: | Object | Meaning | | --- | --- | | `X_train` | training feature matrix, already preprocessed and feature-engineered | | `X_test` | forecast-row feature matrix with the same columns | | `y_train` | realized training outcomes | | `A` | test-by-train attention matrix | | `A[j, i]` | signed weight placed on training observation `i` when predicting test row `j` | Positive weights borrow from a historical outcome. Negative weights use that historical outcome by contrast. The row weights do not need to be non-negative, so this is not softmax transformer attention. It is the linear OLS/ridge attention operator implied by the regression geometry. Recommended use: ```python fit = mf.models.ols(X_train, y_train) weights = mf.interpretation.ols_attention_weights(X_train, X_test) audit = mf.interpretation.ols_attention_equivalence( X_train, y_train, X_test, reference_predictions=fit.predict(X_test), ) embedding = mf.interpretation.ols_attention_embedding(X_train, X_test) ``` For high-dimensional macro panels, use the ridge version: ```python fit = mf.models.ridge(X_train, y_train, alpha=0.1) weights = mf.interpretation.ridge_attention_weights(X_train, X_test, alpha=0.1) ``` With `add_intercept=True`, the intercept column is included in the design but is not penalized. This matches standard ridge practice in sklearn/R-style linear regression and keeps the affine identity for intercept models. ### ols_attention_weights ```python macroforecast.interpretation.ols_attention_weights( X_train, X_test=None, *, add_intercept=True, ) ``` Input: | Argument | Type | Default | Meaning | | --- | --- | --- | --- | | `X_train` | pandas `DataFrame` | required | Training feature matrix. Index labels become `train_index`. | | `X_test` | pandas `DataFrame` or `None` | `None` | Forecast-row feature matrix. If omitted, training rows are explained against themselves. Missing columns are filled with zero after reindexing to `X_train.columns`. | | `add_intercept` | bool | `True` | Add an intercept column before forming the attention matrix. | Output: long pandas `DataFrame`. | Column | Meaning | | --- | --- | | `test_row`, `test_index` | forecast-row position and label | | `train_row`, `train_index` | training-row position and label | | `weight` | attention weight `A[j, i]` | The dense matrix `A` is attached at `attrs["attention_matrix"]`. ### ridge_attention_weights ```python macroforecast.interpretation.ridge_attention_weights( X_train, X_test=None, *, alpha=1.0, add_intercept=True, ) ``` Input is the same as `ols_attention_weights`, plus: | Argument | Type | Default | Meaning | | --- | --- | --- | --- | | `alpha` | float | `1.0` | Ridge penalty in `X_train'X_train + alpha I`. Must be non-negative. | Output: same long attention-weight table. This function is useful when the feature matrix is nearly singular or when the fitted forecasting model is `mf.models.ridge(..., alpha=alpha)`. ### ols_attention_embedding ```python macroforecast.interpretation.ols_attention_embedding( X_train, X_test=None, *, add_intercept=True, ridge=0.0, tol=1e-12, ) ``` This returns the whitened embedding view emphasized in the paper. In the full-rank OLS case: ```text (X_train'X_train)^-1 = U Lambda^-1 U' F_train = X_train U Lambda^-1/2 F_test = X_test U Lambda^-1/2 A = F_test F_train' ``` For ridge or rank-deficient cases, macroforecast uses the symmetric pseudoinverse precision matrix and keeps positive precision components above `tol`. Output: long pandas `DataFrame`. | Column | Meaning | | --- | --- | | `sample` | `train` or `test` | | `row`, `index` | row position and label | | `component` | whitened embedding component | | `value` | embedded coordinate | | `precision_eigenvalue` | eigenvalue of the precision matrix used for that component | Attached arrays: | Attr key | Meaning | | --- | --- | | `train_embedding` | dense `F_train` matrix | | `test_embedding` | dense `F_test` matrix | | `attention_matrix` | reconstructed `F_test F_train'` matrix | | `precision_matrix` | symmetric inverse or ridge inverse of the design Gram matrix | | `precision_eigenvalues` | retained positive precision eigenvalues | ### ols_attention_equivalence ```python macroforecast.interpretation.ols_attention_equivalence( X_train, y_train, X_test=None, *, reference_predictions=None, add_intercept=True, ridge=0.0, ) ``` Input: | Argument | Type | Default | Meaning | | --- | --- | --- | --- | | `X_train` | pandas `DataFrame` | required | Training feature matrix. | | `y_train` | pandas `Series` or array | required | Training outcomes. If a `Series`, it is aligned to `X_train.index`. | | `X_test` | pandas `DataFrame` or `None` | `None` | Forecast rows. | | `reference_predictions` | sequence or `None` | `None` | Optional predictions from an already fitted model. If omitted, macroforecast computes the closed-form normal-equation prediction. | | `add_intercept` | bool | `True` | Same convention as the weight functions. | | `ridge` | float | `0.0` | Set to the ridge model's `alpha` when auditing a ridge forecast. | Output: one row per test observation. | Column | Meaning | | --- | --- | | `attention_prediction` | `A @ y_train` | | `reference_prediction` | closed-form or user-supplied prediction | | `equivalence_error` | signed difference | | `abs_equivalence_error` | absolute difference | Use this as the auditable paper check: if OLS/Ridge and attention are aligned, `abs_equivalence_error` should be near machine precision. ### attention_weights ```python macroforecast.interpretation.attention_weights( X_train, X_test=None, *, add_intercept=True, ridge=1e-8, ) ``` Input: training feature frame and optional test feature frame. Output columns: `test_row`, `test_index`, `train_row`, `train_index`, and `weight`. The full attention matrix is also attached at `attrs["attention_matrix"]`. The implemented formula is: ```text Omega = X_test (X_train' X_train)^-1 X_train' ``` with a small ridge term for numerical stability. When `add_intercept=True`, the ridge term is not applied to the intercept column, matching the standard regression convention. Prefer `ols_attention_weights()` for exact paper-style OLS and `ridge_attention_weights()` for the ridge-stabilized variant. The older `attention_weights()` helper remains available as the generic numerically stabilized route. ### dual_decomposition ```python macroforecast.interpretation.dual_decomposition( X_train, y_train, X_test=None, *, add_intercept=True, ridge=1e-8, ) ``` Input: training features, training target, and optional test features. Output columns add `train_y` and `contribution` to the attention-weight table. The per-test prediction summary is attached at `attrs["prediction_summary"]`. ## Macroeconomic Random Forest GTVP ### mrf_gtvp ```python macroforecast.interpretation.mrf_gtvp(model, X=None) ``` Input: a fitted `mf.models.macro_random_forest(...)` result. The MRF reference backend only creates time-varying coefficients after prediction, so call `fit.predict(X_test)` first or pass `X` to `mrf_gtvp()` so it can trigger prediction. Output columns: `row`, `index`, `feature`, `coefficient`, `abs_coefficient`, and `importance`. `feature="__intercept__"` is the time-varying intercept. Other features are the coefficient columns returned by the MacroRandomForest reference backend's `betas` output. A compact feature-level summary is attached at `attrs["summary"]`. ## VAR Interpretation ### generalized_irf ```python macroforecast.interpretation.generalized_irf(model, n_periods=12, target=None) ``` Input: fitted VAR-style model, including `mf.models.var(...)`. Output columns: `feature`, `importance`, `coefficient`, and `status`. The function implements the Pesaran-Shin generalized impulse response: ```text GIRF_h(j) = sigma_jj^(-1/2) A_h Sigma e_j ``` `importance` is the sum of absolute target responses across horizons. ### orthogonalised_irf ```python macroforecast.interpretation.orthogonalised_irf(model, n_periods=12, target=None) ``` Input and output shape match `generalized_irf`, but responses come from the Cholesky orthogonalised IRF path. ### fevd ```python macroforecast.interpretation.fevd(model, n_periods=12, target=None) ``` Input: fitted VAR-style model. Output columns: `feature`, `importance`, `coefficient`, and `status`. `importance` is the summed forecast-error variance contribution over the requested horizon range. ### historical_decomposition ```python macroforecast.interpretation.historical_decomposition(model, max_lag=12, target=None) ``` Input: fitted VAR-style model. Output columns: `feature`, `importance`, `mean_contribution`, and `max_abs_contribution`. The implementation uses reduced-form MA coefficients and fitted residuals to summarize historical shock contributions. ## Deep-Model Hidden States ### lstm_hidden_state ```python macroforecast.interpretation.lstm_hidden_state(model, X) ``` Input: a fitted torch-backed LSTM/GRU model from `mf.models.lstm(...)` or `mf.models.gru(...)`, plus a feature frame. Output columns: `feature`, `importance`, and `coefficient`, where each feature is a hidden unit. This requires the `deep` extra. Transformer models are rejected because the recurrent hidden-state interpretation does not apply. ### Gradient attribution ```python macroforecast.interpretation.saliency_map(model, X) macroforecast.interpretation.gradient_attribution(model, X, method="saliency") macroforecast.interpretation.integrated_gradients(model, X, baseline=None, n_steps=50) macroforecast.interpretation.gradient_shap(model, X, baseline=None, n_samples=20) macroforecast.interpretation.deep_lift(model, X, baseline=None) ``` Input: a fitted torch-backed model from `mf.models.nn(...)`, `mf.models.lstm(...)`, `mf.models.gru(...)`, or `mf.models.transformer(...)`, plus a feature frame. Output columns: `feature`, `importance`, `mean_attribution`, `std_attribution`, and `method`. `gradient_attribution` is the method-dispatching helper. `saliency_map`, `integrated_gradients`, and `gradient_shap` are implemented directly with torch autograd. `deep_lift` uses Captum and therefore requires the `deep` extra. For recurrent models, attribution is averaged across sequence positions before feature-level aggregation. If a baseline is supplied as a pandas `DataFrame` for a standardized torch-backed model, macroforecast transforms the baseline into the fitted model's input scale before attribution. ## Custom Interpretation ```python macroforecast.interpretation.custom_interpretation( model, X, func, *, y=None, name=None, metadata=None, **params, ) ``` Input: a fitted `ModelFit` or predictor, feature `DataFrame`, optional target, and a user callable. Callable signature: ```python func(model, X, *, y=None, metadata=None, **params) ``` Accepted callable outputs are `DataFrame`, `Series`, mapping, or a sequence convertible to a `DataFrame`. The wrapper attaches: | Attr | Meaning | | --- | --- | | `macroforecast_metadata_schema.kind` | Always `custom_interpretation`. | | `macroforecast_metadata_schema.method` | `name` or callable name. | | `macroforecast_metadata_schema.metadata.params` | User parameters passed to the callable. | | `macroforecast_metadata_schema.metadata.user_metadata` | User-supplied metadata mapping. | Example: ```python def signed_mean_effect(model, X, *, y=None, metadata=None, scale=1.0): pred = model.predict(X) return {"signed_mean_prediction": float(pred.mean() * scale)} custom = mf.interpretation.custom_interpretation( fit, X_test, signed_mean_effect, name="signed_mean_effect", scale=100.0, ) ``` ## Examples ```python fit = mf.models.ridge(X_train, y_train) coef = mf.interpretation.linear_coefficients(fit) perm = mf.interpretation.permutation_importance( fit, X_test, y_test, n_repeats=20, random_state=123, ) pdp = mf.interpretation.partial_dependence(fit, X_test, features=["PAYEMS"]) ice = mf.interpretation.ice_curves(fit, X_test, features=["PAYEMS"]) ale = mf.interpretation.accumulated_local_effect(fit, X_test, feature="PAYEMS") ``` ```python tree = mf.models.random_forest(X_train, y_train, random_state=123) shap_table = mf.interpretation.shap_values( tree, X_test, background=X_train, explainer="tree", ) ``` - `var_impulse_response` -- VAR impulse responses with Monte-Carlo bootstrap confidence bands (statsmodels errband_mc; vars::irf), as a tidy horizon/impulse/response table. - `accumulated_local_effect_2d` -- second-order (two-feature) accumulated local effect: the pure interaction ALE surface (Apley-Zhu 2020; ALEPlot 2D).