macroforecast.interpretation.dual

macroforecast.interpretation.dual is the dedicated namespace for the dual interpretation route in Goulet Coulombe, Goebel, and Klieber (2024), “Dual Interpretation of Machine Learning Forecasts” (arXiv:2412.13076). Standard variable-importance tools ask which predictor columns matter. Dual interpretation asks which historical training observations matter for a forecast.

The central identity is:

yhat_new = sum_i w_i(new) y_i

The weights w_i(new) are observation weights, also called data-portfolio weights in the paper and code. A positive weight means the forecast borrows directly from that historical outcome. A negative weight means the observation enters by contrast: its outcome pushes the forecast in the opposite direction. A concentrated weight vector means the forecast relies on a small number of episodes. A large short position (the sum of negative weights) or high gross leverage (the sum of absolute weights) means the forecast is extrapolative rather than a simple local average.
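The identity can be checked by hand for plain OLS, where the weights have the closed form x_new (X_train' X_train)^-1 X_train'. The following is a minimal NumPy sketch on simulated data; it does not call macroforecast, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_features = 40, 3
X_train = rng.normal(size=(n_train, n_features))
y_train = X_train @ np.array([0.5, -1.0, 0.25]) + 0.1 * rng.normal(size=n_train)
x_new = rng.normal(size=(1, n_features))

# Primal route: estimate coefficients, then predict.
gram = X_train.T @ X_train
beta = np.linalg.solve(gram, X_train.T @ y_train)
yhat_primal = (x_new @ beta).item()

# Dual route: one weight per training observation, then a weighted sum of outcomes.
w = (x_new @ np.linalg.solve(gram, X_train.T)).ravel()
yhat_dual = float(w @ y_train)

assert np.isclose(yhat_primal, yhat_dual)

# Simple readings of the weight vector.
n_negative = int((w < 0).sum())        # observations used "by contrast"
gross_leverage = float(np.abs(w).sum())
print(f"forecast={yhat_dual:.4f}, negative weights={n_negative}, gross leverage={gross_leverage:.3f}")
```

Both routes give the same number; the dual route simply reorganizes the computation so that each training outcome has an explicit weight.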

Relation to Goulet Coulombe (2026), “Ordinary Least Squares as an Attention Mechanism”: OLS-as-attention is the exact linear algebra route X_test (X_train'X_train)^-1 X_train'. It is available through macroforecast.interpretation.ols_attention_weights(), ridge_attention_weights(), ols_attention_embedding(), and ols_attention_equivalence(). This dual namespace is broader: it uses the same historical-observation idea for ridge/OLS, kernel ridge, and random forest data-portfolio weights, plus contribution, diagnostic, top-observation, and group tables.

Reference Sources

| Source | Used for |
| --- | --- |
| Goulet Coulombe, Goebel, and Klieber (2024), “Dual Interpretation of Machine Learning Forecasts” | Paper terminology and interpretation target. |
| Goulet Coulombe (2026), “Ordinary Least Squares as an Attention Mechanism” | Exact OLS/ridge attention identity and whitened embedding interpretation. |
| wiki/raw/paper_code/coulombe_site_github_20260530/dual_python/auxiliaries.py | Ridge, kernel-ridge, and random-forest observation-weight formulas. |
| wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/DualML.R | Forecast concentration, forecast short position, forecast leverage, and forecast turnover definitions. |
| wiki/raw/paper_code/coulombe_site_github_20260530/DualML_R/README.md | Original model-route inventory: OLS, RF, LGB, RR, KRR, and NN. |

Implemented now: ridge/OLS, kernel ridge, and sklearn-style random forest. Deferred routes: boosted-tree AXIL, LGB+/LGBA+ channel-specific weights, neural embedding-ridge approximation, and classification log-odds decomposition.

Public Functions

| Function | Input | Output | Purpose |
| --- | --- | --- | --- |
| macroforecast.interpretation.dual.dual_interpretation() | model, train features, train target, optional test features | DualInterpretationResult | Run the paper-aligned ridge/KRR/RF path and return all dual tables together. |
| macroforecast.interpretation.dual.dual_from_forecast_result() | completed ForecastResult, model, train features, train target, optional test features | ForecastResult or DualInterpretationResult | Build a dual sidecar for a completed runner result. |
| macroforecast.interpretation.dual.observation_weights() | model, X_train, optional X_test | long DataFrame | Compute historical observation/data-portfolio weights. |
| macroforecast.interpretation.dual.observation_contributions() | weights and y_train | long DataFrame | Multiply observation weights by historical outcomes. |
| macroforecast.interpretation.dual.forecast_diagnostics() | weights | DataFrame | Compute concentration, short position, leverage, gross leverage, and turnover. |
| macroforecast.interpretation.dual.top_observations() | weights or contributions | long DataFrame | Return the largest historical observations for each forecast. |
| macroforecast.interpretation.dual.group_observation_weights() | weights/contributions and a group mapping | DataFrame | Aggregate observation weights over user-defined regimes or episodes. |
| DualInterpretationResult.to_tables() | result object | dict of DataFrame | Expand the result for macroforecast.output. |

Backward-compatible aliases are still available:

| Alias | Preferred name |
| --- | --- |
| outcome_contributions | observation_contributions |
| data_portfolio_diagnostics | forecast_diagnostics |
| top_episodes | top_observations |
| episode_group_weights | group_observation_weights |

Public Flow

import macroforecast as mf

dual = mf.interpretation.dual.dual_interpretation(
    model,
    X_train,
    y_train,
    X_test,
    method="random_forest",
    top_n=10,
    groups={
        "gfc": gfc_train_dates,
        "covid": covid_train_dates,
    },
)

tables = dual.to_tables(prefix="inflation")

For completed forecast runs, attach the same result as a sidecar:

result = mf.forecasting.run(feature_set, "ridge", window=window)

result = mf.interpretation.dual.dual_from_forecast_result(
    result,
    fit,
    X_train,
    y_train,
    X_test,
    method="ridge",
)

# Equivalent method form:
result = result.with_dual(fit, X_train, y_train, X_test, method="ridge")

forecasting.run() does not compute dual interpretation automatically. The completed forecast table does not contain the exact fitted estimator, training-feature matrix, training target, or forecast-row feature matrix. Those objects must be passed explicitly to avoid silent look-ahead or stale-design errors.

For a ridge/KRR route, model can be None:

dual = mf.interpretation.dual.dual_interpretation(
    None,
    X_train,
    y_train,
    X_test,
    method="krr",
    kernel="laplace",
    sigma=1e-4,
    lambda_=0.1,
)
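The model can be omitted for ridge and KRR because the weights depend only on X_train, X_test, and the hyperparameters. The NumPy sketch below illustrates the two closed forms documented on this page, including the n_train * lambda_ penalty convention and the exp(-sigma * distance) kernel convention. It is not the library's implementation: the function names are hypothetical, and the use of an L1 distance for the Laplace kernel is an assumption.

```python
import numpy as np

def ridge_weights(X_train, X_test, lambda_=1e-8, penalty_scale="n_train"):
    """W = X_test (X'X + penalty * I)^-1 X', with penalty = n * lambda_ or lambda_."""
    n, p = X_train.shape
    penalty = n * lambda_ if penalty_scale == "n_train" else lambda_
    gram = X_train.T @ X_train + penalty * np.eye(p)
    return X_test @ np.linalg.solve(gram, X_train.T)

def laplace_kernel(A, B, sigma):
    """exp(-sigma * distance); the L1 distance used here is an assumption."""
    dist = np.abs(A[:, None, :] - B[None, :, :]).sum(axis=2)
    return np.exp(-sigma * dist)

def krr_weights(X_train, X_test, lambda_, sigma):
    """W = K_test (K_train + lambda_ * I)^-1, solved without an explicit inverse."""
    K_train = laplace_kernel(X_train, X_train, sigma)
    K_test = laplace_kernel(X_test, X_train, sigma)
    A = K_train + lambda_ * np.eye(len(X_train))
    # A is symmetric, so K_test A^-1 equals (A^-1 K_test')'.
    return np.linalg.solve(A, K_test.T).T

rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 4))
X_test = rng.normal(size=(5, 4))
y_train = rng.normal(size=30)

# Ridge: dual weights reproduce the primal ridge forecast.
W_ridge = ridge_weights(X_train, X_test, lambda_=0.01)
beta = np.linalg.solve(X_train.T @ X_train + 30 * 0.01 * np.eye(4), X_train.T @ y_train)
assert np.allclose(W_ridge @ y_train, X_test @ beta)

# KRR: dual weights reproduce the usual K_test @ alpha forecast.
W_krr = krr_weights(X_train, X_test, lambda_=0.1, sigma=0.5)
K_train = laplace_kernel(X_train, X_train, 0.5)
alpha = np.linalg.solve(K_train + 0.1 * np.eye(30), y_train)
krr_forecast = laplace_kernel(X_test, X_train, 0.5) @ alpha
assert np.allclose(W_krr @ y_train, krr_forecast)
```

Neither weight matrix uses y_train or a fitted estimator, which is why only the design matrices and options need to be supplied for these routes.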

dual_interpretation

macroforecast.interpretation.dual.dual_interpretation(
    model,
    X_train,
    y_train,
    X_test=None,
    *,
    method="auto",
    lambda_=1e-8,
    kernel="linear",
    sigma=1.0,
    add_intercept=False,
    ridge_penalty_scale="n_train",
    normalize=False,
    center=False,
    include_base=False,
    top_n=10,
    top_sort_by="abs_weight",
    top_q=0.05,
    groups=None,
    include_contributions=True,
    include_diagnostics=True,
    include_top_observations=True,
    include_group_weights=None,
)

Input:

| Argument | Type | Default | Meaning |
| --- | --- | --- | --- |
| model | fitted model or None | required | Required for random-forest weights. Optional for ridge/KRR because weights are closed-form from X_train and X_test. |
| X_train | pandas DataFrame | required | Training feature matrix. Its index becomes train_index. |
| y_train | pandas Series or sequence | required | Training target aligned to X_train. If it is a Series, the index is aligned to train_index. |
| X_test | pandas DataFrame or None | None | Forecast-row feature matrix. If omitted, each training row is explained against the training panel. |
| method | string | auto | auto, ridge, ols, krr, kernel_ridge, random_forest, or rf. |
| lambda_ | float | 1e-8 | Ridge/KRR regularization. |
| kernel | string | linear | KRR kernel: linear, gaussian, rbf, laplace, or laplacian. |
| sigma | float | 1.0 | Kernel bandwidth convention used by the reviewed code: exp(-sigma * distance). |
| add_intercept | bool | False | Adds an unpenalized intercept for ridge/OLS. The paper code usually works with standardized no-intercept matrices. |
| ridge_penalty_scale | string | n_train | Ridge penalty convention. n_train uses n_train * lambda_; none uses lambda_. |
| normalize | bool | False | Re-normalize row weights to sum to one. Default is false because leverage and negative weights are meaningful diagnostics. |
| center | bool | False | Center y_train before contribution calculation. |
| include_base | bool | False | With center=True, add an explicit base-row contribution. |
| top_n | int | 10 | Number of top observations returned per forecast row. |
| top_sort_by | string | abs_weight | abs_weight, weight, contribution, or abs_contribution. |
| top_q | float | 0.05 | Share of observations used in concentration. Values above 1 are treated as 1. |
| groups | mapping or None | None | Named historical episode groups, mapping group name to training-index labels. |
| include_* | bool | True, except include_group_weights=None | Include or skip contribution, diagnostic, top-observation, and group tables. |

Output: DualInterpretationResult.

| Field | Type | Meaning |
| --- | --- | --- |
| weights | DataFrame | Observation/data-portfolio weights. |
| contributions | DataFrame or None | Observation-level forecast contributions. |
| diagnostics | DataFrame or None | Forecast concentration, short position, leverage, gross leverage, and turnover. |
| top_observations | DataFrame or None | Largest historical observations per forecast. |
| group_weights | DataFrame or None | Group-level observation weights and contributions. |
| metadata | dict | Paper route, implemented/deferred routes, and options used. |

dual_from_forecast_result

macroforecast.interpretation.dual.dual_from_forecast_result(
    result,
    model,
    X_train,
    y_train,
    X_test=None,
    *,
    attach=True,
    sidecar_name="dual",
    **dual_options,
)

Input:

| Argument | Type | Default | Meaning |
| --- | --- | --- | --- |
| result | ForecastResult | required | Completed forecast runner output. |
| model | fitted model or None | required | Same model argument passed to dual_interpretation(...). |
| X_train, y_train, X_test | pandas objects | required except X_test | Exact design matrices used for the dual explanation. |
| attach | bool | True | If true, return a copy of ForecastResult with the sidecar attached. If false, return the standalone DualInterpretationResult. |
| sidecar_name | str | dual | Name used in ForecastResult.sidecars and output artifact names. |
| **dual_options | keyword args | none | Forwarded to dual_interpretation(...), such as method, lambda_, kernel, groups, and top_n. |

Output: with attach=True, a new ForecastResult; with attach=False, a standalone DualInterpretationResult.

observation_weights

macroforecast.interpretation.dual.observation_weights(
    model,
    X_train,
    X_test=None,
    *,
    method="auto",
    lambda_=1e-8,
    kernel="linear",
    sigma=1.0,
    add_intercept=False,
    ridge_penalty_scale="n_train",
    normalize=False,
)

Implemented routes:

| Route | Formula / logic | Notes |
| --- | --- | --- |
| Ridge / OLS | W = X_test (X_train' X_train + n lambda I)^-1 X_train' by default | Set ridge_penalty_scale="none" for lambda I. add_intercept=True adds an unpenalized intercept. |
| Kernel ridge | W = K_test (K_train + lambda I)^-1 | Kernels: linear, gaussian/rbf, laplace/laplacian. |
| Random forest | For each tree, assign test and train rows to leaves; train rows in the same leaf share weight; average across trees | For sklearn forests, bootstrap sample counts are used when recoverable. |
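The random-forest logic above can be sketched from leaf assignments alone. The example below takes hand-written leaf-id arrays of shape (n_samples, n_trees), the layout that scikit-learn's forest.apply(X) returns, and averages per-tree leaf-sharing weights. The function name and the optional sample_counts argument are hypothetical; how the library recovers bootstrap counts is not shown here.

```python
import numpy as np

def forest_weights(train_leaves, test_leaves, sample_counts=None):
    """Average over trees of the within-leaf weight shares.

    train_leaves: (n_train, n_trees) leaf ids for training rows.
    test_leaves:  (n_test, n_trees) leaf ids for forecast rows.
    sample_counts: optional (n_train, n_trees) bootstrap multiplicities;
                   defaults to one per row (an assumption for this sketch).
    """
    n_train, n_trees = train_leaves.shape
    n_test = test_leaves.shape[0]
    if sample_counts is None:
        sample_counts = np.ones((n_train, n_trees))
    W = np.zeros((n_test, n_train))
    for t in range(n_trees):
        # same_leaf[i, j] is True when test row i and train row j share a leaf in tree t.
        same_leaf = test_leaves[:, [t]] == train_leaves[:, t][None, :]
        mass = same_leaf * sample_counts[:, t][None, :]
        totals = mass.sum(axis=1, keepdims=True)
        W += np.divide(mass, totals, out=np.zeros_like(mass, dtype=float), where=totals > 0)
    return W / n_trees

# Two trees, four training rows, one forecast row.
train_leaves = np.array([[1, 1], [1, 2], [2, 2], [2, 1]])
test_leaves = np.array([[1, 2]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])

W = forest_weights(train_leaves, test_leaves)
prediction = W @ y_train

# Tree 0 averages rows 0 and 1 (mean 2.0); tree 1 averages rows 1 and 2 (mean 4.0).
assert np.allclose(W, [[0.25, 0.5, 0.25, 0.0]])
assert np.isclose(prediction[0], 3.0)
```

Because every tree contributes a within-leaf average, these weights are non-negative and each row sums to one, unlike the ridge and KRR weights.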

Output columns:

| Column | Meaning |
| --- | --- |
| test_row, test_index | Forecast-row position and index. |
| train_row, train_index | Historical observation position and index. |
| weight, abs_weight | Signed and absolute observation weight. |
| channel | Implemented route: ridge, krr, or random_forest. |

The dense matrix is attached as attrs["weight_matrix"] with shape (n_test, n_train).
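To make the relationship between the long table and the dense matrix concrete, the sketch below builds a long table with the documented column names from a small (n_test, n_train) matrix and stores the matrix in DataFrame.attrs. It only mimics the documented layout with pandas; the index labels and the "ridge" channel value are made up for illustration.

```python
import numpy as np
import pandas as pd

train_index = pd.Index(["2019Q1", "2019Q2", "2019Q3", "2019Q4"])
test_index = pd.Index(["2020Q1", "2020Q2"])
W = np.array([
    [0.6, 0.5, -0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
n_test, n_train = W.shape

# One row per (forecast row, training observation) pair, in row-major order.
long = pd.DataFrame({
    "test_row": np.repeat(np.arange(n_test), n_train),
    "test_index": test_index.repeat(n_train),
    "train_row": np.tile(np.arange(n_train), n_test),
    "train_index": np.tile(train_index.to_numpy(), n_test),
    "weight": W.ravel(),
})
long["abs_weight"] = long["weight"].abs()
long["channel"] = "ridge"
long.attrs["weight_matrix"] = W  # dense (n_test, n_train) copy travels with the table

# The long table can be pivoted back to the dense layout.
recovered = long.pivot(index="test_row", columns="train_row", values="weight").to_numpy()
assert np.array_equal(recovered, W)
```

Keeping the dense matrix alongside the long table avoids a pivot when a downstream step needs matrix algebra such as W @ y_train.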

observation_contributions

macroforecast.interpretation.dual.observation_contributions(
    weights,
    y_train,
    *,
    center=False,
    include_base=False,
)

Input: an observation-weight table and the aligned training target.

Output columns add:

| Column | Meaning |
| --- | --- |
| train_y | Realized historical outcome. |
| centered_train_y | train_y - mean(y_train) when center=True; otherwise train_y. |
| contribution | weight * train_y by default. |
| prediction | Sum of contributions for the forecast row. |
| channel | episode, or base when center=True and include_base=True. |

Default center=False preserves the exact identity prediction = weights @ y_train. Centering is useful for plots but changes the table into a base-plus-centered-contribution decomposition.
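The effect of centering can be seen with four observations. In the sketch below the uncentered contributions sum exactly to the prediction, while the centered contributions need a base term to recover it. The base used here, sum(w) * mean(y_train), is an assumption chosen so the decomposition stays exact; the page does not state how the library defines its base row.

```python
import numpy as np

w = np.array([0.6, 0.5, -0.2, 0.3])       # weights for one forecast row (sum to 1.2)
y_train = np.array([2.0, 1.0, 4.0, 3.0])

# center=False: contributions are weight * outcome and sum to the prediction.
contribution = w * y_train
prediction = contribution.sum()

# center=True: contributions use deviations from the training mean.
y_bar = y_train.mean()
centered_contribution = w * (y_train - y_bar)

# Assumed base term that restores the identity when weights do not sum to one.
base = w.sum() * y_bar

assert np.isclose(prediction, 1.8)
assert np.isclose(centered_contribution.sum() + base, prediction)
```

When the weights sum to one, as in the random-forest route, this base reduces to the training mean itself.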

forecast_diagnostics

macroforecast.interpretation.dual.forecast_diagnostics(weights, *, top_q=0.05)

Output:

| Column | Paper/code meaning |
| --- | --- |
| concentration | Forecast concentration: sum of top absolute weights divided by total absolute weight. |
| short_position | Forecast short position: signed sum of negative weights. |
| short_position_abs | Absolute short-side exposure. |
| leverage | Signed weight sum. |
| gross_leverage | Sum of absolute weights. |
| turnover | Sum of absolute weight changes relative to the previous forecast row. |
| top_q, top_k, n_train | Diagnostic settings. |
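The column definitions above translate directly into row-wise operations on the dense weight matrix. The following NumPy sketch is one possible reading, not the library's code: rounding top_k up with a minimum of one and leaving the first row's turnover as NaN are both assumptions.

```python
import numpy as np

def diagnostics_sketch(W, top_q=0.05):
    W = np.asarray(W, dtype=float)
    n_test, n_train = W.shape
    # Assumed rule: share of observations, capped at 1, rounded up, at least one.
    top_k = max(1, int(np.ceil(min(top_q, 1.0) * n_train)))
    abs_w = np.abs(W)
    gross = abs_w.sum(axis=1)
    top_mass = np.sort(abs_w, axis=1)[:, -top_k:].sum(axis=1)
    concentration = np.divide(top_mass, gross, out=np.full(n_test, np.nan), where=gross > 0)
    short = np.minimum(W, 0.0).sum(axis=1)       # signed sum of negative weights
    turnover = np.full(n_test, np.nan)           # no previous row for the first forecast
    turnover[1:] = np.abs(np.diff(W, axis=0)).sum(axis=1)
    return {
        "concentration": concentration,
        "short_position": short,
        "short_position_abs": np.abs(short),
        "leverage": W.sum(axis=1),
        "gross_leverage": gross,
        "turnover": turnover,
        "top_k": top_k,
    }

W = np.array([
    [0.6, 0.5, -0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
diag = diagnostics_sketch(W, top_q=0.25)  # top_k = 1 with four training rows
```

In this example the first row has a short position of -0.2 and gross leverage of 1.4, while the second row is a plain convex average with gross leverage of 1.0.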

Negative weights are not automatically errors. In the paper they identify contrast-based use of historical observations. The caution is economic: macroeconomic shocks are often asymmetric, so a mirror-image historical analogy may be a weak explanation even if the model uses it.

top_observations

macroforecast.interpretation.dual.top_observations(
    weights,
    *,
    y_train=None,
    n=10,
    sort_by="abs_weight",
)

Input: observation weights or observation contributions. If y_train is provided and the table lacks contribution, contributions are computed first.

Output: top historical observations per forecast row with a rank column. Supported sort_by values: abs_weight, weight, contribution, and abs_contribution.
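The selection logic can be illustrated on a dense weight matrix. The sketch below ranks training rows per forecast row for each of the four documented sort_by values. The function name is hypothetical, and two details are assumptions: ranks start at 1, and signed keys are sorted in descending order.

```python
import numpy as np

def top_rows_sketch(W, n=10, sort_by="abs_weight", y_train=None):
    """Return (test_row, train_row, rank, weight) tuples, best-ranked first."""
    W = np.asarray(W, dtype=float)
    if sort_by in ("contribution", "abs_contribution"):
        if y_train is None:
            raise ValueError("y_train is needed to rank by contribution")
        score = W * np.asarray(y_train, dtype=float)[None, :]
    else:
        score = W
    if sort_by.startswith("abs_"):
        score = np.abs(score)
    n = min(n, W.shape[1])
    order = np.argsort(-score, axis=1, kind="stable")[:, :n]
    rows = []
    for test_row, cols in enumerate(order):
        for rank, train_row in enumerate(cols, start=1):
            rows.append((test_row, int(train_row), rank, float(W[test_row, train_row])))
    return rows

W = np.array([
    [0.6, 0.5, -0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
top2 = top_rows_sketch(W, n=2)
by_contribution = top_rows_sketch(W, n=1, sort_by="abs_contribution", y_train=[1.0, 1.0, 10.0, 1.0])
```

Ranking by abs_contribution can surface a different observation than abs_weight: in the first row, the small negative weight on the third observation dominates once it is multiplied by a large outcome.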

group_observation_weights

macroforecast.interpretation.dual.group_observation_weights(
    weights,
    groups,
    *,
    y_train=None,
)

Input:

| Argument | Meaning |
| --- | --- |
| weights | Observation-weight or contribution table. |
| groups | Mapping from group name to training-index labels. |
| y_train | Optional training target used to create contributions before grouping. |

Example:

groups = {
    "gfc": pd.period_range("2007Q4", "2009Q2", freq="Q").to_timestamp("Q"),
    "covid": pd.period_range("2020Q1", "2021Q2", freq="Q").to_timestamp("Q"),
}

grouped = mf.interpretation.dual.group_observation_weights(
    dual.weights,
    groups,
    y_train=y_train,
)

Output columns: test_row, test_index, episode_group, weight, abs_weight, n_episodes, and, when available, contribution and abs_contribution.
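The aggregation itself is a group-by over the long weight table. The pandas sketch below reproduces the documented output columns on a toy table with string index labels. It is illustrative only: dropping observations that belong to no group and counting member observations as n_episodes are assumptions about the library's behavior.

```python
import pandas as pd

weights = pd.DataFrame({
    "test_row": [0, 0, 0, 0],
    "test_index": ["2023Q1"] * 4,
    "train_index": ["2008Q3", "2008Q4", "2020Q1", "2015Q2"],
    "weight": [0.6, 0.5, -0.2, 0.1],
})
y_train = pd.Series([-2.0, -3.0, -5.0, 1.0], index=["2008Q3", "2008Q4", "2020Q1", "2015Q2"])
groups = {
    "gfc": ["2008Q3", "2008Q4"],
    "covid": ["2020Q1"],
}

# Invert the mapping: training label -> group name.
label_to_group = {label: name for name, labels in groups.items() for label in labels}

table = weights.assign(
    abs_weight=weights["weight"].abs(),
    contribution=weights["weight"] * weights["train_index"].map(y_train),
    episode_group=weights["train_index"].map(label_to_group),
)
table["abs_contribution"] = table["contribution"].abs()

grouped = (
    table.dropna(subset=["episode_group"])  # ungrouped observations are dropped (assumption)
    .groupby(["test_row", "test_index", "episode_group"], as_index=False)
    .agg(
        weight=("weight", "sum"),
        abs_weight=("abs_weight", "sum"),
        n_episodes=("train_index", "count"),
        contribution=("contribution", "sum"),
        abs_contribution=("abs_contribution", "sum"),
    )
)
print(grouped)
```

Here the two GFC quarters carry a combined weight of 1.1, while the single COVID quarter enters with a negative weight of -0.2 and therefore a positive contribution.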

Output Integration

DualInterpretationResult.to_tables(prefix="dual") returns:

| Table key | Meaning |
| --- | --- |
| dual_observation_weights | Long observation-weight table. |
| dual_observation_contributions | Long contribution table, when requested. |
| dual_forecast_diagnostics | Concentration, short-position, leverage, gross-leverage, and turnover table. |
| dual_top_observations | Top historical observations per forecast row. |
| dual_group_observation_weights | Group-level weights/contributions, when groups are provided. |
| dual_metadata | Result metadata as key/value rows. |

The output module recognizes this result directly:

bundle = mf.output.bundle_outputs(
    forecasts=result,
    interpretation={"dual": dual},
    metadata={"study": "inflation_dual"},
)

manifest = mf.output.write_artifacts(
    bundle,
    "results/inflation_dual",
    layout="grouped",
)

With layout="grouped", dual tables are written under:

interpretation/dual/

The same grouped path is used when a ForecastResult contains a dual sidecar:

result = result.with_dual(fit, X_train, y_train, X_test, method="ridge")
mf.output.write_artifacts(result, "results/dual_run", layout="grouped")

This keeps DualML observation-based explanations separate from SHAP, oShapley/PBSV, PDP/ICE/ALE, and other feature-based interpretation outputs.