# `op`

[Back to L7](../index.md) | [Browse all axes](../../browse_by_axis.md) | [Browse all options](../../browse_by_option.md)

> Axis ``op`` on sub-layer ``L7_A_importance_dag_body`` (layer ``l7``).

## Sub-layer

**L7_A_importance_dag_body**

## Axis metadata

- Default: `'permutation_importance'`
- Sweepable: False
- Status: operational

## Operational status summary

- Operational: 34 option(s)
- Future: 6 option(s)

## Options

### `accumulated_local_effect`  --  operational

Apley & Zhu (2020) accumulated local effects -- PDP alternative robust to correlation.

See [accumulated_local_effect function page](../op/accumulated_local_effect.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.ale_importance``.

### `attention_weights`  --  operational

OLS-as-attention closed-form attention matrix (Goulet Coulombe 2026).

Goulet Coulombe (2026) 'OLS as an Attention Mechanism' Eq. 3 closed form: ``Ω = X_test · (X'_train · X_train)⁻¹ · X'_train``. The ``(n_test, n_train)`` matrix encodes how strongly each test point attends to each training point under an OLS / ridge fit, identical to the representer expansion of the dual ridge solution. Output table carries one row per training observation (per-test-point weight aggregates) plus the full attention matrix and representer-identity diagnostics inline via ``frame.attrs``.

Promoted from ``future`` to ``operational`` in Phase B-10 (paper-10 replication). Compatible with linear-family L4 models (``ols`` / ``ridge`` / ``lasso`` / ``elastic_net`` / ``bayesian_ridge`` / ``huber``).

**When to use**

Linear-family attribution as a kernel-attention map; pedagogical / replication of paper-10 Coulombe (2026).

**When NOT to use**

Non-linear models (the closed form requires a linear estimator).

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Goulet Coulombe (2026) 'OLS as an Attention Mechanism', working paper -- Eq. 3 closed-form attention matrix.

**Related options**: [`dual_decomposition`](#dual-decomposition), [`model_native_linear_coef`](#model-native-linear-coef), [`shap_linear`](#shap-linear)

_Last reviewed 2026-05-05 by macroforecast author._

### `bootstrap_jackknife`  --  operational

Bootstrap / jackknife confidence bands around any importance score.

Wraps another importance op and re-runs it on ``B`` stationary-bootstrap (Politis-White 2004) or jackknife resamples. Emits ``(score_mean, score_p2.5, score_p97.5)`` per feature; pair with the ``boxplot`` figure type.

**When to use**

Reporting confidence-banded importance rankings.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Politis & White (2004) 'Automatic Block-Length Selection for the Dependent Bootstrap', Econometric Reviews 23(1): 53-70.

**Related options**: [`rolling_recompute`](#rolling-recompute)

_Last reviewed 2026-05-05 by macroforecast author._

### `boruta_selection`  --  future

_(no schema description for `boruta_selection`)_

> TBD: option doc not yet authored for this value. The encyclopedia falls back to the bare schema description above. PRs adding a full ``OptionDoc`` entry under ``macroforecast/scaffold/option_docs/l7.py`` are welcome.

### `bvar_pip`  --  operational

Posterior inclusion probabilities for BVAR / Bayesian linear models.

For each predictor ``j``, returns ``P(β_j ≠ 0 | data)`` -- the posterior probability that the variable enters the model with non-zero effect. Compatible with ``bvar_minnesota`` / ``bvar_normal_inverse_wishart`` / ``bayesian_ridge``.

**When to use**

Bayesian model selection; comparing variable importance under posterior uncertainty.

**When NOT to use**

Frequentist models -- use ``lasso_inclusion_frequency`` for an analogous stability score.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Koop & Korobilis (2010) 'Bayesian Multivariate Time Series Methods for Empirical Macroeconomics', Foundations and Trends in Econometrics 3(4): 267-358.

**Related options**: [`lasso_inclusion_frequency`](#lasso-inclusion-frequency)

_Last reviewed 2026-05-05 by macroforecast author._

### `cumulative_r2_contribution`  --  operational

Cumulative R² gain from adding features one at a time (forward-selection ranking).

Re-fits the L4 estimator with features added in descending order of marginal contribution; each step records the cumulative OOS-R² achieved. Pair with the ``lineplot`` figure type to visualise the marginal information value of each predictor.

**When to use**

Quantifying how many predictors the model actually needs to reach a target R².

**When NOT to use**

Highly correlated features -- the order is sensitive to entry rules.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Stock & Watson (2012) 'Generalized Shrinkage Methods for Forecasting using Many Predictors', JBES 30(4): 481-493.

**Related options**: [`lasso_inclusion_frequency`](#lasso-inclusion-frequency), [`lofo`](#lofo)

_Last reviewed 2026-05-05 by macroforecast author._

### `deep_lift`  --  operational

DeepLIFT (Shrikumar 2017) -- difference-from-reference attribution.

Decomposes the difference ``f(x) - f(x')`` into per-feature contributions using rescaled-difference / reveal-cancel rules for non-linear activations. Faster than integrated gradients but with less rigorous axiomatic backing.

**When to use**

NN attribution where integrated-gradients runtime is too high.

**When NOT to use**

When the completeness / sensitivity axioms matter -- prefer integrated gradients.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Shrikumar, Greenside & Kundaje (2017) 'Learning Important Features Through Propagating Activation Differences', ICML.

**Related options**: [`integrated_gradients`](#integrated-gradients), [`gradient_shap`](#gradient-shap), [`saliency_map`](#saliency-map), [`shap_deep`](#shap-deep)

_Last reviewed 2026-05-05 by macroforecast author._

### `dual_decomposition`  --  operational

Forecast-as-weighted-training-targets via the representer theorem (Coulombe et al. 2024); equivalently a restricted attention module (Goulet Coulombe 2026).

Goulet Coulombe / Goebel / Klieber (2024) 'Dual Interpretation of ML Forecasts'. Surfaces each prediction as a weighted combination of historical training targets; weights recovered through the representer theorem applied to the fitted model. Atomic L7 primitive: SHAP-family ops decompose by feature contribution, this op decomposes by training-row contribution, the natively interpretable view for small-sample temporally-ordered macro panels.

**Linear families** (operational v0.8.9): ridge / OLS / lasso via closed-form ``w(xₜ) = X(X'X + αI)⁻¹xₜ``.

**Tree-bagging ensembles** (operational v0.9.1 dev-stage v0.9.0B-5): RandomForestRegressor / ExtraTreesRegressor via the leaf-co-occurrence kernel ``wⱼ(xₜ) = (1/B) Σ_b 1[j ∈ B_b] · 1[leaf_b(xₜ) == leaf_b(xⱼ)] / leaf_size_b(xⱼ)`` where ``B_b`` is tree b's bootstrap subset (sklearn ``estimators_samples_``). Reproduces ``forest.predict`` to machine precision (~4e-16). Helper ``_rf_leaf_cooccurrence_weights`` in core/runtime.py.

Output frame layout: rows = training row labels, columns = ``mean_weight``, ``abs_mean_weight``, ``max_abs_weight``. Full ``(n_test × n_train)`` weight matrix attached as ``frame.attrs['dual_weights']``; ``frame.attrs['method']`` carries ``'linear_closed_form'`` or ``'rf_leaf_cooccurrence_kernel'`` for downstream renderers.

**Inline portfolio diagnostics.** The output artifact also carries the four portfolio metrics from the same paper (HHI = ``Σwⱼ²``, short = ``Σ max(0,-wⱼ)``, turnover = ``‖wₜ - wₜ₋₁‖₁``, leverage = ``‖w‖₁``) at ``frame.attrs['portfolio_metrics']``. These are trivial numpy reductions on the primary dual weights and do not warrant their own L7 op (decomposition discipline).

**OLS-as-attention equivalence.** Goulet Coulombe (2026) 'Ordinary Least Squares as an Attention Mechanism' (SSRN 5200864) shows that the same dual representation ``ŷ_test = F_test F_train' y_train`` (eq. 7) coincides with a *restricted attention module*: queries ``Q = X_test W``, keys ``K = X_train W`` with ``W = U Λ^{-½}``, values ``V = y``, and the softmax replaced by the identity (eqs. 17-19). The training-row weights ``ωⱼᵢ = ⟨Fⱼ, Fᵢ⟩`` surfaced by this op are exactly the (restricted) attention weights of that paper. Same compute, different vocabulary -- no separate runtime needed.

Boosted-tree (gradient_boosting / xgboost / lightgbm) and NN extensions are deferred: residual-bagging and learned non-linear models do not admit a clean sum-of-training-targets dual representation.

**When to use**

Decomposing macro forecasts into training-target contributions; explaining ML predictions to econometric audiences; bridging classical OLS to transformer-attention literature; per-prediction provenance for tree ensembles.

**When NOT to use**

Boosted-tree / NN families (gradient_boosting, xgboost, lightgbm, mlp, lstm, etc.) -- raises NotImplementedError; the residual-bagging structure does not factor into a sum-of-training-targets representation.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Goulet Coulombe, Goebel & Klieber (2024) 'Dual Interpretation of Machine Learning Forecasts', arXiv:2412.13076.
* Goulet Coulombe (2026) 'Ordinary Least Squares as an Attention Mechanism', SSRN 5200864 -- shows OLS predictions ŷ_test = F_test F_train' y_train (eq. 7) coincide with a restricted attention module (eqs. 17-19, identity activation, tied W_Q W_K' = (X_train' X_train)^{-1}). The dual_decomposition op already implements the same compute via the closed-form ridge representer; no separate runtime needed.

**Related options**: [`permutation_importance`](#permutation-importance), [`shap_kernel`](#shap-kernel)

_Last reviewed 2026-05-05 by macroforecast author._

### `fevd`  --  operational

Forecast error variance decomposition (Sims 1980).

For a fitted VAR (``var`` / ``factor_augmented_var`` / ``bvar_*``), decomposes the h-step-ahead forecast error variance into shares attributable to each orthogonalised shock. Default Cholesky orthogonalisation; ordering is set by the column order of the VAR. statsmodels ``fevd`` backend.

**When to use**

Standard VAR analysis; interpreting how shocks propagate across variables.

**When NOT to use**

Non-VAR models -- use ``permutation_importance`` instead.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Sims (1980) 'Macroeconomics and Reality', Econometrica 48(1): 1-48.

**Related options**: [`historical_decomposition`](#historical-decomposition), [`generalized_irf`](#generalized-irf), [`forecast_decomposition`](#forecast-decomposition)

_Last reviewed 2026-05-05 by macroforecast author._

### `forecast_decomposition`  --  operational

Decompose a single forecast into per-feature contributions.

For a single (cell, target, horizon) forecast, returns a table ``(feature → contribution)`` summing to ``forecast - benchmark``. Linear models: ``β_j x_j``. Trees: Tree SHAP. NN: gradient SHAP. Universal entry point unified across families -- delegates to the appropriate family-specific op.

**When to use**

Reporting feature contributions for a specific forecast (e.g. 'why is the model bullish on Q3 GDP').

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'

**Related options**: [`shap_tree`](#shap-tree), [`shap_linear`](#shap-linear), [`shap_deep`](#shap-deep)

_Last reviewed 2026-05-05 by macroforecast author._

### `friedman_h_interaction`  --  operational

Friedman & Popescu (2008) H-statistic for two-way feature interactions.

For feature pair ``(j, k)``, computes ``H²_{jk} = Σ[PD_{jk}(x_j, x_k) - PD_j(x_j) - PD_k(x_k)]² / Σ PD²_{jk}``. ``H² ∈ [0, 1]``; the share of the joint partial-dependence variance attributable to non-additive structure.

**When to use**

Identifying which feature pairs the model treats non-additively.

**When NOT to use**

Wide panels -- the M² PDP grid grows expensive.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Friedman & Popescu (2008) 'Predictive Learning via Rule Ensembles', Annals of Applied Statistics 2(3): 916-954.

**Related options**: [`shap_interaction`](#shap-interaction), [`partial_dependence`](#partial-dependence)

_Last reviewed 2026-05-05 by macroforecast author._

### `generalized_irf`  --  future

Pesaran-Shin (1998) generalized impulse-response function (future, v0.9.x).

Order-invariant IRF where each shock is constructed as the multivariate-normal projection of all residuals onto the j-th canonical direction. Distinct from Cholesky orthogonalised IRFs (which use a recursive lower-triangular rotation). **Future** -- the runtime currently raises NotImplementedError. For the Cholesky variant operational since v0.2, use ``orthogonalised_irf``.

**When to use**

VAR analysis where the variable ordering has no theoretical motivation -- order-invariance is the desired property.

**When NOT to use**

When a recursive identification IS theoretically motivated -- use ``orthogonalised_irf`` instead.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Pesaran & Shin (1998) 'Generalized impulse response analysis in linear multivariate models', Economics Letters 58(1): 17-29.

**Related options**: [`fevd`](#fevd), [`historical_decomposition`](#historical-decomposition), [`orthogonalised_irf`](#orthogonalised-irf)

_Last reviewed 2026-05-05 by macroforecast author._

### `gradient_shap`  --  operational

Gradient SHAP -- expectation-of-gradient SHAP approximation (Lundberg-Lee 2017).

Approximates SHAP values via expected gradients at random interpolations between input and a baseline distribution. Captum-backed; requires the ``macroforecast[deep]`` extra.

**When to use**

Differentiable models (NN families) where exact SHAP is too expensive.

**When NOT to use**

Non-NN models.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Lundberg & Lee (2017) 'A Unified Approach to Interpreting Model Predictions', NeurIPS 30: 4765-4774.

**Related options**: [`shap_deep`](#shap-deep), [`integrated_gradients`](#integrated-gradients), [`saliency_map`](#saliency-map), [`deep_lift`](#deep-lift)

_Last reviewed 2026-05-05 by macroforecast author._

### `group_aggregate`  --  operational

Aggregate per-feature importance into pre-defined block sums (FRED-SD blocks, theme blocks).

Sums (or means) per-feature importance scores over groups defined by a user-supplied or built-in mapping table. v0.25 ships 8 built-in blocks: 8-group FRED-MD + 14-group FRED-QD + 50-state FRED-SD grids.

Required input for the FRED-SD ``us_state_choropleth`` figure.

**When to use**

FRED-MD / -QD / -SD analyses where per-series importance should roll up to thematic / geographic blocks.

**When NOT to use**

Custom panels lacking a meaningful grouping.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* McCracken & Ng (2016) 'FRED-MD: A Monthly Database for Macroeconomic Research', JBES 34(4): 574-589.

**Related options**: [`lineage_attribution`](#lineage-attribution), [`transformation_attribution`](#transformation-attribution)

_Last reviewed 2026-05-05 by macroforecast author._

### `historical_decomposition`  --  operational

Historical decomposition (Burbidge-Harrison 1985) of the realised series into structural shocks.

Reconstructs each variable's realised path as the convolution of orthogonalised IRF coefficients (Cholesky-rotated structural form) with the time series of structural shocks recovered from the reduced-form residuals. Returns the per-shock cumulative absolute contribution to the target variable's realised fluctuations; the row labels match the VAR variable ordering.

**When to use**

Telling the historical narrative -- which shocks drove specific recessions / expansions.

**When NOT to use**

Non-VAR models.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Burbidge & Harrison (1985) 'A historical decomposition of the great depression to determine the role of money', JME 16(1): 45-54.

**Related options**: [`fevd`](#fevd), [`orthogonalised_irf`](#orthogonalised-irf)

_Last reviewed 2026-05-05 by macroforecast author._

### `integrated_gradients`  --  operational

Integrated gradients (Sundararajan 2017) -- path-integral attribution.

Computes ``(x_j - x'_j) · ∫₀¹ ∂f(x' + α(x - x')) / ∂x_j dα`` for a baseline ``x'`` (default zero). Satisfies the completeness axiom (sum of attributions equals ``f(x) - f(x')``). Captum-backed.

**When to use**

Axiomatically-grounded NN attribution (Sundararajan completeness + sensitivity properties).

**When NOT to use**

Non-NN models; pathological models where integration along the linear path is misleading.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Sundararajan, Taly & Yan (2017) 'Axiomatic Attribution for Deep Networks', ICML.

**Related options**: [`gradient_shap`](#gradient-shap), [`saliency_map`](#saliency-map), [`deep_lift`](#deep-lift)

_Last reviewed 2026-05-05 by macroforecast author._

### `lasso_inclusion_frequency`  --  operational

Bootstrap inclusion frequency for Lasso-selected features (Bach 2008).

For each feature ``j``, computes the share of ``B`` Lasso fits (on bootstrap or rolling-window resamples) for which ``β̂_j ≠ 0``. Returns a stability score in ``[0, 1]``. v0.25 supports ``sampling = bootstrap | rolling | both`` (via leaf_config).

**When to use**

Feature-selection stability audit for Lasso / Lasso-Path / Elastic Net.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Bach (2008) 'Bolasso: model consistent Lasso estimation through the bootstrap', ICML.
* Meinshausen & Bühlmann (2010) 'Stability selection', JRSS Series B 72(4): 417-473.

**Related options**: [`model_native_linear_coef`](#model-native-linear-coef), [`bootstrap_jackknife`](#bootstrap-jackknife)

_Last reviewed 2026-05-05 by macroforecast author._

### `lasso_path_selection`  --  future

_(no schema description for `lasso_path_selection`)_

> TBD: option doc not yet authored for this value. The encyclopedia falls back to the bare schema description above. PRs adding a full ``OptionDoc`` entry under ``macroforecast/scaffold/option_docs/l7.py`` are welcome.

### `lineage_attribution`  --  operational

Trace importance back through L3 feature lineage to the L1 raw source.

For each L3 feature, walks the L3.metadata ``column_lineage`` graph to identify the chain of transforms that produced it; attributes the L7 importance score back to the L1 raw column at the head of the lineage chain.

Solves the 'PCA factors are most important; what does that mean in terms of original variables?' problem.

**When to use**

Pipelines with PCA / factor / dimensionality-reduction stages where downstream importance must be traced back to raw inputs.

**When NOT to use**

Pipelines with only direct-input features (no L3 transforms).

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'

**Related options**: [`group_aggregate`](#group-aggregate), [`transformation_attribution`](#transformation-attribution)

_Last reviewed 2026-05-05 by macroforecast author._

### `lofo`  --  operational

Leave-one-feature-out (LOFO) refit importance.

For each predictor ``j``, refits the L4 estimator on the panel with column ``j`` removed and reports the OOS-loss delta. More expensive than permutation importance (one extra fit per feature) but free from the permutation-and-correlation interaction.

Compatible with every L4 family; runtime scales as ``n_features × cost_per_fit``.

**When to use**

Small / medium feature panels (< 100) where N-extra fits are affordable.

**When NOT to use**

Wide panels (n_features > 200) -- prohibitive runtime.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Lemaître, Aridas & Nogueira (2018) 'imbalanced-learn', JMLR 18(17): 1-5 -- LOFO popularised; pre-dating refit-importance traditions in econometrics.

**Related options**: [`permutation_importance`](#permutation-importance)

_Last reviewed 2026-05-05 by macroforecast author._

### `lstm_hidden_state`  --  future

_(no schema description for `lstm_hidden_state`)_

> TBD: option doc not yet authored for this value. The encyclopedia falls back to the bare schema description above. PRs adding a full ``OptionDoc`` entry under ``macroforecast/scaffold/option_docs/l7.py`` are welcome.

### `model_native_linear_coef`  --  operational

Standardised regression coefficients from a fitted linear model.

See [model_native_linear_coef function page](../op/model_native_linear_coef.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.model_native_linear_coef_importance``.

### `model_native_tree_importance`  --  operational

Mean-decrease-impurity importance from a fitted tree ensemble.

See [model_native_tree_importance function page](../op/model_native_tree_importance.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.model_native_tree_importance``.

### `mrf_gtvp`  --  operational

Macroeconomic Random Forest GTVP -- per-leaf time-varying coefficients (Coulombe 2024).

Compatible only with the ``macroeconomic_random_forest`` L4 family. For each leaf ``ℓ`` and predictor ``j``, returns the leaf-local linear coefficient ``β̂_{j, ℓ}``; the full output is an ``(n_leaves × n_features)`` GTVP (Generalised Time-Varying Parameter) panel.

**When to use**

Coulombe (2024) MRF interpretation; spotting non-linearity captured by the leaf partition.

**When NOT to use**

Non-MRF models.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Coulombe (2024) 'The Macroeconomic Random Forest', Journal of Applied Econometrics 39(7): 1190-1209.

**Related options**: [`rolling_recompute`](#rolling-recompute), [`model_native_tree_importance`](#model-native-tree-importance)

_Last reviewed 2026-05-05 by macroforecast author._

### `orthogonalised_irf`  --  operational

Cholesky-orthogonalised impulse-response function (Sims 1980).

Standard structural-VAR IRF: residual covariance Σᵤ is Cholesky-decomposed P P' = Σᵤ; the structural shocks P⁻¹ u_t are orthogonalised by construction. ``orth_irfs[s, i, j]`` is the response of variable ``i`` at horizon ``s`` to a unit structural shock to variable ``j`` at time 0. **Order-dependent**: the variable ordering in the recipe determines the recursive causal scheme imposed.

**When to use**

VAR analysis with a theoretically motivated recursive identification (e.g. monetary policy ordered last; supply ordered first).

**When NOT to use**

When the variable ordering is arbitrary -- file a v0.9.x request for ``generalized_irf`` (Pesaran-Shin 1998 order-invariant variant, currently future-gated).

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Sims (1980) 'Macroeconomics and Reality', Econometrica 48(1): 1-48.

**Related options**: [`fevd`](#fevd), [`historical_decomposition`](#historical-decomposition)

_Last reviewed 2026-05-05 by macroforecast author._

### `oshapley_vi`  --  operational

Out-of-sample SHAP-style variable importance (Borup et al. 2022) [schema; runtime via anatomy package].

Borup, Goulet Coulombe, Montes-Rojas, Schutte & Veiga (2022) 'Anatomy of Out-of-Sample Forecasting Accuracy'. Recomputes Shapley-style feature contributions on the *out-of-sample* loss rather than in-sample fit, addressing the distribution-shift mismatch where in-sample SHAP misranks features that matter for OOS accuracy.

Atomic primitive -- existing in-sample ``shap_*`` ops do not compose into oShapley-VI. Runtime delegates to the Borup et al. ``anatomy`` Python package as an optional dep (``pip install macroforecast[anatomy]``). Schema-only in v0.9.0; operational promotion lands once the anatomy integration is wired.

**When to use**

OOS-aware variable importance for macro forecast audits; replicating Borup et al. (2022).

**When NOT to use**

Pre-promotion. Without the anatomy extra installed.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Borup, Goulet Coulombe, Montes-Rojas, Schutte & Veiga (2022) 'Anatomy of Out-of-Sample Forecasting Accuracy', SSRN 4278745.

**Related options**: [`shap_kernel`](#shap-kernel), [`shap_tree`](#shap-tree), [`permutation_importance`](#permutation-importance), [`pbsv`](#pbsv)

_Last reviewed 2026-05-05 by macroforecast author._

### `partial_dependence`  --  operational

Friedman (2001) partial dependence plot.

See [partial_dependence function page](../op/partial_dependence.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.partial_dependence_importance``.

### `pbsv`  --  operational

Performance-Based Shapley Value (Borup et al. 2022) [schema; runtime via anatomy package].

OOS accuracy decomposition: Shapley-attributes the forecast performance improvement over a benchmark to each feature coalition's contribution. Differs from ``oshapley_vi`` in decomposing the *accuracy gain* rather than the OOS loss; they are companion ops covering the two faces of OOS Shapley.

Runtime delegates to ``anatomy`` package. Schema-only in v0.9.0.

**When to use**

Decomposing OOS forecast skill by feature; benchmark-relative interpretation studies.

**When NOT to use**

Pre-promotion. Without the anatomy extra installed.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Borup, Goulet Coulombe, Montes-Rojas, Schutte & Veiga (2022) 'Anatomy of Out-of-Sample Forecasting Accuracy', SSRN 4278745.

**Related options**: [`oshapley_vi`](#oshapley-vi), [`permutation_importance`](#permutation-importance)

_Last reviewed 2026-05-05 by macroforecast author._

### `permutation_importance`  --  operational

Breiman-Fisher-Rudin (2019) model-agnostic permutation importance.

See [permutation_importance function page](../op/permutation_importance.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.permutation_importance``.

### `permutation_importance_strobl`  --  operational

Strobl (2008) conditional permutation importance.

See [permutation_importance_strobl function page](../op/permutation_importance_strobl.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.cond_permutation_importance``.

### `recursive_feature_elimination`  --  future

_(no schema description for `recursive_feature_elimination`)_

> TBD: option doc not yet authored for this value. The encyclopedia falls back to the bare schema description above. PRs adding a full ``OptionDoc`` entry under ``macroforecast/scaffold/option_docs/l7.py`` are welcome.

### `rolling_recompute`  --  operational

Re-compute any importance score on a rolling-window basis.

Applies an inner importance op (e.g. ``permutation_importance``) on each of K rolling-window subsamples; emits a ``(K × n_features)`` matrix tracking how importance evolves over time. Pair with the ``heatmap`` or ``lineplot`` figure type.

**When to use**

Detecting time-varying feature importance; structural-stability audits.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'

**Related options**: [`bootstrap_jackknife`](#bootstrap-jackknife), [`mrf_gtvp`](#mrf-gtvp)

_Last reviewed 2026-05-05 by macroforecast author._

### `saliency_map`  --  operational

Saliency map (Simonyan 2014) -- absolute gradient at the input.

Returns ``|∂f / ∂x_j|`` evaluated at the input. The earliest and simplest gradient-based attribution; useful as a baseline but susceptible to gradient-saturation issues that integrated gradients address.

**When to use**

Quick NN attribution baseline; sanity-check vs more elaborate methods.

**When NOT to use**

Production attribution -- prefer integrated gradients or SHAP.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Simonyan, Vedaldi & Zisserman (2014) 'Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps', ICLR Workshops.

**Related options**: [`integrated_gradients`](#integrated-gradients), [`gradient_shap`](#gradient-shap), [`deep_lift`](#deep-lift)

_Last reviewed 2026-05-05 by macroforecast author._

### `shap_deep`  --  operational

Deep SHAP -- DeepLIFT-based SHAP for neural networks.

DeepLIFT (Shrikumar 2017) interpreted as Shapley-value approximation. Compatible with the ``mlp`` / ``lstm`` / ``gru`` / ``transformer`` L4 families when the ``macroforecast[deep]`` extra is installed (captum backend).

**When to use**

Neural-network forecasters (LSTM / GRU / Transformer / MLP).

**When NOT to use**

Non-NN models -- use ``shap_tree`` / ``shap_linear`` / ``shap_kernel`` instead.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Lundberg & Lee (2017) 'A Unified Approach to Interpreting Model Predictions', NeurIPS 30: 4765-4774.
* Shrikumar, Greenside & Kundaje (2017) 'Learning Important Features Through Propagating Activation Differences', ICML.

**Related options**: [`shap_tree`](#shap-tree), [`shap_kernel`](#shap-kernel), [`deep_lift`](#deep-lift), [`gradient_shap`](#gradient-shap), [`integrated_gradients`](#integrated-gradients)

_Last reviewed 2026-05-05 by macroforecast author._

### `shap_interaction`  --  operational

SHAP interaction values -- pairwise feature-interaction Shapley.

Lundberg-Erion-Lee (2020) extension that decomposes each SHAP value into a main-effect term plus pairwise interaction terms. Available for tree ensembles via the same polynomial-time algorithm as ``shap_tree``.

Output is an ``(n × M × M)`` tensor; pair with the ``heatmap`` figure type for visualisation.

**When to use**

Identifying which feature pairs drive the model's non-additive structure.

**When NOT to use**

Wide feature panels -- the ``M²`` storage cost grows quickly.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Lundberg & Lee (2017) 'A Unified Approach to Interpreting Model Predictions', NeurIPS 30: 4765-4774.
* Lundberg, Erion & Lee (2020) 'From local explanations to global understanding with explainable AI for trees', Nature Machine Intelligence 2: 56-67.

**Related options**: [`shap_tree`](#shap-tree), [`friedman_h_interaction`](#friedman-h-interaction)

_Last reviewed 2026-05-05 by macroforecast author._

### `shap_kernel`  --  operational

Kernel SHAP -- model-agnostic Shapley value approximation.

Lundberg-Lee (2017) weighted-LIME estimator that approximates Shapley values for any model via local linear regression on perturbed inputs. Slow (O(2^M) coalitions sampled) but universally applicable.

**When to use**

Non-tree, non-linear, non-deep models (SVM, kNN, custom callables).

**When NOT to use**

Trees (use ``shap_tree``) or linear models (use ``shap_linear``) -- both are dramatically faster.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Lundberg & Lee (2017) 'A Unified Approach to Interpreting Model Predictions', NeurIPS 30: 4765-4774.

**Related options**: [`shap_tree`](#shap-tree), [`shap_linear`](#shap-linear)

_Last reviewed 2026-05-05 by macroforecast author._

### `shap_linear`  --  operational

Linear SHAP -- closed-form Shapley values for linear models.

See [shap_linear function page](../op/shap_linear.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.shap_linear_importance``.

### `shap_tree`  --  operational

Tree SHAP -- exact polynomial-time Shapley values for tree ensembles.

See [shap_tree function page](../op/shap_tree.md) for full documentation + parameters + standalone usage. Standalone: ``mf.functions.shap_tree_importance``.

### `stability_selection`  --  future

_(no schema description for `stability_selection`)_

> TBD: option doc not yet authored for this value. The encyclopedia falls back to the bare schema description above. PRs adding a full ``OptionDoc`` entry under ``macroforecast/scaffold/option_docs/l7.py`` are welcome.

### `transformation_attribution`  --  operational

Shapley over pipelines -- decompose forecast skill across alternative L3 transforms.

Multi-cell sweep aggregator: given multiple pipelines that differ in their L3 transform choices, computes the Shapley share of each transform's contribution to the metric improvement. v0.25 uses the Castro-Gómez-Tejada (2009) permutation-Shapley sampler when ``n_pipelines > 8``.

**When to use**

Interpreting horse-race sweeps -- which L3 transform delivers the win?

**When NOT to use**

Single-pipeline studies; sweeps with fewer than 3 alternative pipelines.

**References**

* macroforecast design Part 3, L7: 'every importance op produces (table, figure) pairs; the L7.B sub-layer governs export shape.'
* Castro, Gómez & Tejada (2009) 'Polynomial calculation of the Shapley value based on sampling', Computers & Operations Research 36(5): 1726-1730.

**Related options**: [`lineage_attribution`](#lineage-attribution), [`group_aggregate`](#group-aggregate)

_Last reviewed 2026-05-05 by macroforecast author._