# macroforecast.interpretation.dual [Back to interpretation](interpretation.md) `macroforecast.interpretation.dual` is the dedicated namespace for the dual interpretation route in Goulet Coulombe, Goebel, and Klieber (2024), "Dual Interpretation of Machine Learning Forecasts" (`arXiv:2412.13076`). Standard variable-importance tools ask which predictor columns matter. Dual interpretation asks which historical training observations matter for a forecast. The central identity is: ```text yhat_new = sum_i w_i(new) y_i ``` The weights `w_i(new)` are observation weights, also called data-portfolio weights in the paper/code. A positive weight means the model borrows from that historical outcome. A negative weight means the model uses that observation by contrast. A concentrated weight vector means the forecast relies on a small number of episodes. A high short position or high gross leverage means the forecast is extrapolative rather than a simple local average. Relation to Goulet Coulombe (2026), "Ordinary Least Squares as an Attention Mechanism": OLS-as-attention is the exact linear algebra route `X_test (X_train'X_train)^-1 X_train'`. It is available through `macroforecast.interpretation.ols_attention_weights()`, `ridge_attention_weights()`, `ols_attention_embedding()`, and `ols_attention_equivalence()`. This `dual` namespace is broader: it uses the same historical-observation idea for ridge/OLS, kernel ridge, and random forest data-portfolio weights, plus contribution, diagnostic, top-observation, and group tables. ## Reference Sources | Source | Used for | | --- | --- | | Goulet Coulombe, Goebel, and Klieber (2024), "Dual Interpretation of Machine Learning Forecasts" | Paper terminology and interpretation target. | | Goulet Coulombe (2026), "Ordinary Least Squares as an Attention Mechanism" | Exact OLS/ridge attention identity and whitened embedding interpretation. | | `wiki/raw/paper_code/coulombe_site_github_20260530/dual_python/auxiliaries.py` | Ridge, kernel-ridge, and random-forest observation-weight formulas. | | `wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/DualML.R` | Forecast concentration, forecast short position, forecast leverage, and forecast turnover definitions. | | `wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/README.md` | Original model-route inventory: OLS, RF, LGB, RR, KRR, and NN. | Implemented now: ridge/OLS, kernel ridge, and sklearn-style random forest. Deferred routes: boosted-tree AXIL, LGB+/LGBA+ channel-specific weights, neural embedding-ridge approximation, and classification log-odds decomposition. ## Public Functions | Function | Input | Output | Purpose | | --- | --- | --- | --- | | `macroforecast.interpretation.dual.dual_interpretation()` | model, train features, train target, optional test features | `DualInterpretationResult` | Run the paper-aligned ridge/KRR/RF path and return all dual tables together. | | `macroforecast.interpretation.dual.dual_from_forecast_result()` | completed `ForecastResult`, model, train features, train target, optional test features | `ForecastResult` or `DualInterpretationResult` | Build a dual sidecar for a completed runner result. | | `macroforecast.interpretation.dual.observation_weights()` | model, `X_train`, optional `X_test` | long `DataFrame` | Compute historical observation/data-portfolio weights. | | `macroforecast.interpretation.dual.observation_contributions()` | weights and `y_train` | long `DataFrame` | Multiply observation weights by historical outcomes. | | `macroforecast.interpretation.dual.forecast_diagnostics()` | weights | `DataFrame` | Compute concentration, short position, leverage, gross leverage, and turnover. | | `macroforecast.interpretation.dual.top_observations()` | weights or contributions | long `DataFrame` | Return the largest historical observations for each forecast. | | `macroforecast.interpretation.dual.group_observation_weights()` | weights/contributions and a group mapping | `DataFrame` | Aggregate observation weights over user-defined regimes or episodes. | | `DualInterpretationResult.to_tables()` | result object | dict of `DataFrame` | Expand the result for `macroforecast.output`. | Backward-compatible aliases are still available: | Alias | Preferred name | | --- | --- | | `outcome_contributions` | `observation_contributions` | | `data_portfolio_diagnostics` | `forecast_diagnostics` | | `top_episodes` | `top_observations` | | `episode_group_weights` | `group_observation_weights` | ## Public Flow ```python import macroforecast as mf dual = mf.interpretation.dual.dual_interpretation( model, X_train, y_train, X_test, method="random_forest", top_n=10, groups={ "gfc": gfc_train_dates, "covid": covid_train_dates, }, ) tables = dual.to_tables(prefix="inflation") ``` For completed forecast runs, attach the same result as a sidecar: ```python result = mf.forecasting.run(feature_set, "ridge", window=window) result = mf.interpretation.dual.dual_from_forecast_result( result, fit, X_train, y_train, X_test, method="ridge", ) # Equivalent method form: result = result.with_dual(fit, X_train, y_train, X_test, method="ridge") ``` `forecasting.run()` does not compute dual interpretation automatically. The completed forecast table does not contain the exact fitted estimator, training-feature matrix, training target, or forecast-row feature matrix. Those objects must be passed explicitly to avoid silent look-ahead or stale-design errors. For a ridge/KRR route, `model` can be `None`: ```python dual = mf.interpretation.dual.dual_interpretation( None, X_train, y_train, X_test, method="krr", kernel="laplace", sigma=1e-4, lambda_=0.1, ) ``` ## dual_interpretation ```python macroforecast.interpretation.dual.dual_interpretation( model, X_train, y_train, X_test=None, *, method="auto", lambda_=1e-8, kernel="linear", sigma=1.0, add_intercept=False, ridge_penalty_scale="n_train", normalize=False, center=False, include_base=False, top_n=10, top_sort_by="abs_weight", top_q=0.05, groups=None, include_contributions=True, include_diagnostics=True, include_top_observations=True, include_group_weights=None, ) ``` Input: | Argument | Type | Default | Meaning | | --- | --- | --- | --- | | `model` | fitted model or `None` | required | Required for random-forest weights. Optional for ridge/KRR because weights are closed-form from `X_train` and `X_test`. | | `X_train` | pandas `DataFrame` | required | Training feature matrix. Its index becomes `train_index`. | | `y_train` | pandas `Series` or sequence | required | Training target aligned to `X_train`. If it is a `Series`, the index is aligned to `train_index`. | | `X_test` | pandas `DataFrame` or `None` | `None` | Forecast-row feature matrix. If omitted, each training row is explained against the training panel. | | `method` | string | `auto` | `auto`, `ridge`, `ols`, `krr`, `kernel_ridge`, `random_forest`, or `rf`. | | `lambda_` | float | `1e-8` | Ridge/KRR regularization. | | `kernel` | string | `linear` | KRR kernel: `linear`, `gaussian`, `rbf`, `laplace`, or `laplacian`. | | `sigma` | float | `1.0` | Kernel bandwidth convention used by the reviewed code: `exp(-sigma * distance)`. | | `add_intercept` | bool | `False` | Adds an unpenalized intercept for ridge/OLS. The paper code usually works with standardized no-intercept matrices. | | `ridge_penalty_scale` | string | `n_train` | Ridge penalty convention. `n_train` uses `n_train * lambda_`; `none` uses `lambda_`. | | `normalize` | bool | `False` | Re-normalize row weights to sum to one. Default is false because leverage and negative weights are meaningful diagnostics. | | `center` | bool | `False` | Center `y_train` before contribution calculation. | | `include_base` | bool | `False` | With `center=True`, add an explicit base-row contribution. | | `top_n` | int | `10` | Number of top observations returned per forecast row. | | `top_sort_by` | string | `abs_weight` | `abs_weight`, `weight`, `contribution`, or `abs_contribution`. | | `top_q` | float | `0.05` | Share of observations used in concentration. Values above `1` are treated as `1`. | | `groups` | mapping or `None` | `None` | Named historical episode groups, mapping group name to training-index labels. | | `include_*` | bool | varies | Include or skip contribution, diagnostic, top-observation, and group tables. | Output: `DualInterpretationResult`. | Field | Type | Meaning | | --- | --- | --- | | `weights` | `DataFrame` | Observation/data-portfolio weights. | | `contributions` | `DataFrame` or `None` | Observation-level forecast contributions. | | `diagnostics` | `DataFrame` or `None` | Forecast concentration, short position, leverage, gross leverage, and turnover. | | `top_observations` | `DataFrame` or `None` | Largest historical observations per forecast. | | `group_weights` | `DataFrame` or `None` | Group-level observation weights and contributions. | | `metadata` | dict | Paper route, implemented/deferred routes, and options used. | ## dual_from_forecast_result ```python macroforecast.interpretation.dual.dual_from_forecast_result( result, model, X_train, y_train, X_test=None, *, attach=True, sidecar_name="dual", **dual_options, ) ``` Input: | Argument | Type | Default | Meaning | | --- | --- | --- | --- | | `result` | `ForecastResult` | required | Completed forecast runner output. | | `model` | fitted model or `None` | required | Same model argument passed to `dual_interpretation(...)`. | | `X_train`, `y_train`, `X_test` | pandas objects | required except `X_test` | Exact design matrices used for the dual explanation. | | `attach` | bool | `True` | If true, return a copy of `ForecastResult` with the sidecar attached. If false, return the standalone `DualInterpretationResult`. | | `sidecar_name` | str | `dual` | Name used in `ForecastResult.sidecars` and output artifact names. | | `**dual_options` | keyword args | none | Forwarded to `dual_interpretation(...)`, such as `method`, `lambda_`, `kernel`, `groups`, and `top_n`. | Output: with `attach=True`, a new `ForecastResult`; with `attach=False`, a standalone `DualInterpretationResult`. ## observation_weights ```python macroforecast.interpretation.dual.observation_weights( model, X_train, X_test=None, *, method="auto", lambda_=1e-8, kernel="linear", sigma=1.0, add_intercept=False, ridge_penalty_scale="n_train", normalize=False, ) ``` Implemented routes: | Route | Formula / logic | Notes | | --- | --- | --- | | Ridge / OLS | `W = X_test (X_train' X_train + n lambda I)^-1 X_train'` by default | Set `ridge_penalty_scale="none"` for `lambda I`. `add_intercept=True` adds an unpenalized intercept. | | Kernel ridge | `W = K_test (K_train + lambda I)^-1` | Kernels: `linear`, `gaussian`/`rbf`, `laplace`/`laplacian`. | | Random forest | For each tree, assign test and train rows to leaves; train rows in the same leaf share weight; average across trees | For sklearn forests, bootstrap sample counts are used when recoverable. | Output columns: | Column | Meaning | | --- | --- | | `test_row`, `test_index` | Forecast-row position and index. | | `train_row`, `train_index` | Historical observation position and index. | | `weight`, `abs_weight` | Signed and absolute observation weight. | | `channel` | Implemented route: `ridge`, `krr`, or `random_forest`. | The dense matrix is attached as `attrs["weight_matrix"]` with shape `(n_test, n_train)`. ## observation_contributions ```python macroforecast.interpretation.dual.observation_contributions( weights, y_train, *, center=False, include_base=False, ) ``` Input: an observation-weight table and the aligned training target. Output columns add: | Column | Meaning | | --- | --- | | `train_y` | Realized historical outcome. | | `centered_train_y` | `train_y - mean(y_train)` when `center=True`; otherwise `train_y`. | | `contribution` | `weight * train_y` by default. | | `prediction` | Sum of contributions for the forecast row. | | `channel` | `episode`, or `base` when `center=True` and `include_base=True`. | Default `center=False` preserves the exact identity `prediction = weights @ y_train`. Centering is useful for plots but changes the table into a base-plus-centered-contribution decomposition. ## forecast_diagnostics ```python macroforecast.interpretation.dual.forecast_diagnostics(weights, *, top_q=0.05) ``` Output: | Column | Paper/code meaning | | --- | --- | | `concentration` | Forecast concentration: sum of top absolute weights divided by total absolute weight. | | `short_position` | Forecast short position: signed sum of negative weights. | | `short_position_abs` | Absolute short-side exposure. | | `leverage` | Signed weight sum. | | `gross_leverage` | Sum of absolute weights. | | `turnover` | Sum of absolute weight changes relative to the previous forecast row. | | `top_q`, `top_k`, `n_train` | Diagnostic settings. | Negative weights are not automatically errors. In this paper they identify contrast-based use of historical observations. The caution is economic: macroeconomic shocks are often asymmetric, so a mirror-image historical analogy may be a weak explanation even if the model uses it. ## top_observations ```python macroforecast.interpretation.dual.top_observations( weights, *, y_train=None, n=10, sort_by="abs_weight", ) ``` Input: observation weights or observation contributions. If `y_train` is provided and the table lacks `contribution`, contributions are computed first. Output: top historical observations per forecast row with a `rank` column. Supported `sort_by` values: `abs_weight`, `weight`, `contribution`, and `abs_contribution`. ## group_observation_weights ```python macroforecast.interpretation.dual.group_observation_weights( weights, groups, *, y_train=None, ) ``` Input: | Argument | Meaning | | --- | --- | | `weights` | Observation-weight or contribution table. | | `groups` | Mapping from group name to training-index labels. | | `y_train` | Optional training target used to create contributions before grouping. | Example: ```python groups = { "gfc": pd.period_range("2007Q4", "2009Q2", freq="Q").to_timestamp("Q"), "covid": pd.period_range("2020Q1", "2021Q2", freq="Q").to_timestamp("Q"), } grouped = mf.interpretation.dual.group_observation_weights( dual.weights, groups, y_train=y_train, ) ``` Output columns: `test_row`, `test_index`, `episode_group`, `weight`, `abs_weight`, `n_episodes`, and, when available, `contribution` and `abs_contribution`. ## Output Integration `DualInterpretationResult.to_tables(prefix="dual")` returns: | Table key | Meaning | | --- | --- | | `dual_observation_weights` | Long observation-weight table. | | `dual_observation_contributions` | Long contribution table, when requested. | | `dual_forecast_diagnostics` | Concentration, short-position, leverage, gross-leverage, and turnover table. | | `dual_top_observations` | Top historical observations per forecast row. | | `dual_group_observation_weights` | Group-level weights/contributions, when groups are provided. | | `dual_metadata` | Result metadata as key/value rows. | The output module recognizes this result directly: ```python bundle = mf.output.bundle_outputs( forecasts=result, interpretation={"dual": dual}, metadata={"study": "inflation_dual"}, ) manifest = mf.output.write_artifacts( bundle, "results/inflation_dual", layout="grouped", ) ``` With `layout="grouped"`, dual tables are written under: ```text interpretation/dual/ ``` The same grouped path is used when a `ForecastResult` contains a dual sidecar: ```python result = result.with_dual(fit, X_train, y_train, X_test, method="ridge") mf.output.write_artifacts(result, "results/dual_run", layout="grouped") ``` This keeps DualML observation-based explanations separate from SHAP, oShapley/PBSV, PDP/ICE/ALE, and other feature-based interpretation outputs.