macroforecast.forecast_analysis#
macroforecast.forecast_analysis inspects outputs from
macroforecast.forecasting.run(...). It does not refit models and does not
change forecasts. It reads two sources:
| Source | Used for |
|---|---|
| Forecast table in the runner result | Forecast-vs-actual rows, residuals, rolling loss, model-selection metadata, and combination rows. |
| Saved model sidecar JSON | Coefficients, intercepts, hyperparameters, and fit diagnostics recorded in the sidecar. |
macroforecast.forecast_diagnostic remains available as a compatibility alias.
This module corresponds to the old generator/model diagnostic role, but the new API is callable-first. No YAML, recipe graph, or runtime materialization is involved.
Public Flow#
import macroforecast as mf

result = mf.forecasting.run(
    processed_panel,
    ["ridge", "random_forest"],
    window=window_spec,
    features=feature_spec,
    model_selection=mf.model_selection.grid({"alpha": [0.01, 0.1]}),
    combination=["mean", "inverse_mspe"],
)

analysis = mf.forecast_analysis.diagnose_forecasts(
    result,
    rolling_window=12,
    include_residual_acf=True,
    include_residual_qq=True,
)
diagnose_forecasts#
macroforecast.forecast_analysis.diagnose_forecasts(
forecasts,
*,
include_fitted: bool = True,
include_residuals: bool = True,
include_residual_acf: bool = False,
include_residual_qq: bool = False,
include_rolling_loss: bool = True,
rolling_window: int = 12,
rolling_metric: str = "rmse",
include_forecast_scale: bool = False,
levels=None,
scale_view: str = "both_overlay",
back_transform=None,
include_training_loss: bool = False,
include_rolling_training_loss: bool = False,
training_loss_metric: str = "rmse",
include_first_vs_last: bool = False,
include_dfm_idiosyncratic_acf: bool = False,
include_dfm_factor_stability: bool = False,
include_coefficients: bool = True,
include_parameter_stability: bool = True,
include_tuning: bool = True,
include_tuning_objective: bool = True,
include_hyperparameters: bool = True,
include_tuning_scores: bool = True,
include_ensemble_weights: bool = True,
include_ensemble_concentration: bool = True,
include_member_contribution: bool = False,
include_stage_updates: bool = True,
include_combined: bool = True,
) -> ForecastDiagnosticReport
Input#
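| Name | Type | Default | Choices / meaning |
|---|---|---|---|
| forecasts | | required | Runner result or forecast table. |
| include_fitted | bool | True | Include row-level fitted-vs-actual table. |
| include_residuals | bool | True | Include grouped residual summary. |
| include_residual_acf | bool | False | Include residual autocorrelation by model/horizon. |
| include_residual_qq | bool | False | Include normal QQ reference points for residuals. |
| include_rolling_loss | bool | True | Include rolling OOS loss table. |
| rolling_window | positive int | 12 | Rolling window length in forecast rows within each group. |
| rolling_metric | str | "rmse" | |
| include_forecast_scale | bool | False | Include transformed and/or back-transformed forecast rows. |
| levels | Series, DataFrame, or None | None | Original-level target series used for the forecast-scale view. |
| scale_view | str | "both_overlay" | |
| back_transform | callable or None | None | Optional custom row-level back-transform function. |
| include_training_loss | bool | False | Read saved model sidecar fit metrics. |
| include_rolling_training_loss | bool | False | Add rolling traces of saved fit metrics. |
| training_loss_metric | str | "rmse" | Metric name from model sidecar diagnostics. |
| include_first_vs_last | bool | False | Include first-versus-last forecast comparison by model/horizon. |
| include_dfm_idiosyncratic_acf | bool | False | Include ACF of DFM residual diagnostics when sidecars contain DFM residuals. |
| include_dfm_factor_stability | bool | False | Include filtered-factor stability summaries when sidecars contain DFM factors. |
| include_coefficients | bool | True | Read saved model sidecar JSON and return coefficient paths when available. |
| include_parameter_stability | bool | True | Summarize coefficient stability over origins/windows. |
| include_tuning | bool | True | Include per-forecast model-selection metadata. |
| include_tuning_objective | bool | True | Extract the selected objective and best score from tuning metadata. |
| include_hyperparameters | bool | True | Return selected hyperparameter values over time. |
| include_tuning_scores | bool | True | Summarize tuning best-score distributions. |
| include_ensemble_weights | bool | True | Reconstruct weights for identifiable combination methods. |
| include_ensemble_concentration | bool | True | Summarize ensemble-weight concentration by combined forecast row. |
| include_member_contribution | bool | False | Attach member prediction, weight, and contribution rows where weights are identifiable. |
| include_stage_updates | bool | True | Include preprocessing/feature stage update trace from result metadata. |
| include_combined | bool | True | Include combined forecast rows in fitted/residual/loss tables. |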
Output#
Returns ForecastDiagnosticReport.
| Field | Type | Meaning |
|---|---|---|
| | | Forecast count, model/horizon counts, date range, missing prediction/actual counts, stored-model count, model-selection count, uncertainty count. |
| | | Forecast rows plus residual, absolute error, squared error, and percent error. |
| | | Grouped residual diagnostics by model and horizon. |
| | | Residual autocorrelation table by model, horizon, and lag. |
| | | Residual quantiles paired with standard-normal theoretical quantiles. |
| | | Rolling OOS loss by model and horizon. |
| | | Transformed/back-transformed forecast rows. |
| | | Sidecar fit metrics by origin/model/horizon. |
| | | Rolling sidecar fit metrics. |
| | | First and last forecast rows in each group plus changes. |
| | | Coefficient path from saved model sidecars when available. |
| | | Coefficient stability summary across origins/windows. |
| | | Model-selection event trace from forecast table metadata. |
| | | Selected objective, best score, retune flag, and trial counts by forecast row. |
| | | Long-form selected hyperparameter values over forecast rows. |
| | | Tuning-score distribution summary by model/horizon/method. |
| | | Reconstructed combination weights for supported methods. |
| | | Herfindahl/effective-number summary of ensemble weights. |
| | | Member prediction and weighted contribution rows. |
| | | DFM idiosyncratic residual ACF from saved fit diagnostics. |
| | | Filtered-factor mean, variance, drift, and autocorrelation summaries. |
| | | Runner stage update trace. |
| | | Input metadata plus the compact analysis stage described under Metadata. |
Returned tables carry attrs["macroforecast_metadata"] == report.metadata.
Metadata#
diagnose_forecasts(...) attaches one compact stage:
analysis.metadata["forecast_analysis"]
The stage records:
| Key | Meaning |
|---|---|
| | Compact counts: forecasts, models, combined rows, stored-model rows, and model-selection rows. |
| | Residual, rolling-loss, coefficient, tuning, ensemble, and stage-update choices. |
| | Number of rows generated by each analysis table. |
Helper Functions#
forecast_overview#
macroforecast.forecast_analysis.forecast_overview(forecasts) -> dict
Returns counts and coverage for one forecast table:
| Key | Meaning |
|---|---|
| | Forecast-table shape. |
| | Forecast date range. |
| | Missingness in forecast/actual columns. |
| | Combination versus base model rows. |
| | Rows with saved model metadata. |
| | Rows with model-selection metadata and rows that retuned. |
| | Uncertainty output coverage. |
fitted_vs_actual#
macroforecast.forecast_analysis.fitted_vs_actual(
forecasts,
*,
include_combined: bool = True,
drop_missing_actual: bool = True,
) -> pandas.DataFrame
Returns row-level diagnostics:
| Column | Meaning |
|---|---|
| | Forecast and realized target from the runner. |
| | Residual. |
| | Absolute residual. |
| | Squared residual. |
| | Percent error. |
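The error columns can be illustrated with a small library-independent sketch. It assumes the residual is actual minus prediction and that percent error is 100 times the residual divided by the actual value; neither convention is confirmed by this page, and `error_columns` is a hypothetical helper, not part of macroforecast.

```python
def error_columns(prediction, actual):
    """Row-level error measures for one forecast (illustrative only)."""
    residual = actual - prediction  # assumed sign convention
    return {
        "residual": residual,
        "abs_error": abs(residual),
        "squared_error": residual ** 2,
        # Percent error is undefined when the realized value is zero.
        "percent_error": 100.0 * residual / actual if actual != 0 else float("nan"),
    }

row = error_columns(prediction=2.0, actual=2.5)
```

Under these assumptions a forecast of 2.0 against a realized 2.5 gives a residual of 0.5 and a percent error of 20.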
residual_report#
macroforecast.forecast_analysis.residual_report(
forecasts,
*,
group_by: Sequence[str] = ("model", "horizon"),
include_combined: bool = True,
) -> pandas.DataFrame
Default grouping is model by horizon. Output columns include n, bias,
mae, mse, rmse, residual_sd, residual_autocorr1, mean_actual,
mean_prediction, first_date, and last_date. The lag-1 autocorrelation is
computed after sorting each group by origin_pos and date; input row order
does not affect the result.
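A rough sketch of a grouped residual summary, written in plain Python rather than with the library, shows why sorting each group first makes the result independent of input row order. The function name `residual_summary`, the residual sign (actual minus prediction), and the definition of bias as the mean residual are assumptions made for illustration.

```python
import math
from collections import defaultdict

def residual_summary(rows, group_by=("model", "horizon")):
    """Summarize residuals per group (illustrative, not the library code)."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in group_by)].append(row)

    summary = {}
    for key, grp in groups.items():
        # Sort first so order-dependent statistics ignore input row order.
        grp = sorted(grp, key=lambda r: (r["origin_pos"], r["date"]))
        resid = [r["actual"] - r["prediction"] for r in grp]  # assumed sign
        n = len(resid)
        mse = sum(e * e for e in resid) / n
        summary[key] = {
            "n": n,
            "bias": sum(resid) / n,
            "mae": sum(abs(e) for e in resid) / n,
            "mse": mse,
            "rmse": math.sqrt(mse),
            "first_date": grp[0]["date"],
            "last_date": grp[-1]["date"],
        }
    return summary

# Rows are deliberately out of order.
rows = [
    {"model": "ridge", "horizon": 1, "origin_pos": 2, "date": "2020-03", "prediction": 1.0, "actual": 4.0},
    {"model": "ridge", "horizon": 1, "origin_pos": 0, "date": "2020-01", "prediction": 1.0, "actual": 2.0},
    {"model": "ridge", "horizon": 1, "origin_pos": 3, "date": "2020-04", "prediction": 1.0, "actual": 2.0},
    {"model": "ridge", "horizon": 1, "origin_pos": 1, "date": "2020-02", "prediction": 1.0, "actual": 0.0},
]
report = residual_summary(rows)
```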
residual_autocorrelation#
macroforecast.forecast_analysis.residual_autocorrelation(
forecasts,
*,
max_lag: int = 12,
group_by: Sequence[str] = ("model", "horizon"),
include_combined: bool = True,
) -> pandas.DataFrame
Returns one row per model/horizon/lag with residual ACF values. This supports forecast-residual correlogram checks without rerunning the model.
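For readers who want to check a value by hand, the following is a minimal sketch of a sample autocorrelation function for one group's residuals. It uses the common estimator that divides the lagged cross-product by the full-sample sum of squared deviations; whether the library uses exactly this estimator is an assumption, and `sample_acf` is a hypothetical name.

```python
def sample_acf(residuals, max_lag=12):
    """Sample autocorrelation for lags 1..max_lag (illustrative only)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [x - mean for x in residuals]
    denom = sum(d * d for d in dev)
    acf = {}
    for lag in range(1, max_lag + 1):
        if lag >= n or denom == 0:
            # Not enough observations, or a constant series.
            acf[lag] = float("nan")
        else:
            acf[lag] = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / denom
    return acf

# Residuals must already be in time order within the group.
acf = sample_acf([1.0, -1.0, 1.0, -1.0], max_lag=4)
```

An alternating residual series like this one produces a strongly negative lag-1 value, the pattern a correlogram check is meant to reveal.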
residual_qq#
macroforecast.forecast_analysis.residual_qq(
forecasts,
*,
n_quantiles: int = 21,
group_by: Sequence[str] = ("model", "horizon"),
include_combined: bool = True,
) -> pandas.DataFrame
Returns empirical residual quantiles and matching standard-normal theoretical quantiles. Use it for QQ plots or tail-shape checks. It does not run a normality test.
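The pairing of empirical and theoretical quantiles can be sketched with the standard library alone. The plotting positions `(i + 0.5) / n_quantiles` and the linear interpolation of empirical quantiles are assumptions chosen for this illustration; the library may use different positions. `qq_points` is a hypothetical helper.

```python
from statistics import NormalDist

def qq_points(residuals, n_quantiles=21):
    """Return (theoretical, empirical) quantile pairs (illustrative only)."""
    data = sorted(residuals)
    n = len(data)
    normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    points = []
    for i in range(n_quantiles):
        p = (i + 0.5) / n_quantiles  # assumed positions, strictly inside (0, 1)
        pos = p * (n - 1)            # fractional index into the sorted data
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        empirical = data[lo] + (pos - lo) * (data[hi] - data[lo])
        points.append((normal.inv_cdf(p), empirical))
    return points

pts = qq_points([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Plotting the pairs against a 45-degree line gives the usual QQ view; as the text notes, this is a visual check and not a normality test.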
rolling_loss#
macroforecast.forecast_analysis.rolling_loss(
forecasts,
*,
metric: str = "rmse",
window: int = 12,
min_periods: int | None = None,
group_by: Sequence[str] = ("model", "horizon"),
include_combined: bool = True,
) -> pandas.DataFrame
Computes rolling OOS loss within each group. For rmse, squared errors are
averaged over the rolling window and the square root is taken afterwards.
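The order of operations matters for rolling RMSE: the square root is applied to the window mean of squared errors, not to each error before averaging. A minimal sketch follows, assuming that `min_periods` defaults to the window length when it is `None` (the library's actual default is not stated here); `rolling_rmse` is a hypothetical helper.

```python
import math

def rolling_rmse(errors, window=12, min_periods=None):
    """Rolling RMSE over a list of forecast errors (illustrative only)."""
    if min_periods is None:
        min_periods = window  # assumption, mirrors the usual pandas behaviour
    out = []
    for i in range(len(errors)):
        chunk = errors[max(0, i - window + 1): i + 1]
        if len(chunk) < min_periods:
            out.append(float("nan"))
        else:
            # Average the squared errors first, then take the square root.
            out.append(math.sqrt(sum(e * e for e in chunk) / len(chunk)))
    return out

rmse_path = rolling_rmse([3.0, 4.0, 0.0], window=2)
```

For the window containing errors 3 and 4 the result is the square root of 12.5, about 3.54, which differs from the mean absolute error of 3.5.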
forecast_scale_view#
macroforecast.forecast_analysis.forecast_scale_view(
forecasts,
*,
levels=None,
target=None,
transform=None,
view="both_overlay",
back_transform=None,
include_combined=True,
) -> pandas.DataFrame
Returns forecast rows on the transformed scale, the original level scale, or
both. The runner records target_transform when available. Built-in
back-transform support covers level, one-step change, growth, and
log_growth. For path-average targets or custom transforms, pass
back_transform, a callable that returns either a mapping with prediction
and actual or a two-value sequence.
Output columns include date, origin, horizon, model, scale,
target_transform, prediction, actual, residual, and
back_transform_available.
select_forecast_origins#
macroforecast.forecast_analysis.select_forecast_origins(
forecasts,
*,
view="all_origins",
every_n=12,
include_last=True,
include_combined=True,
) -> pandas.DataFrame
Filters the forecast table to one of three origin views:
"all_origins", "last_origin_only", or "every_n_origins". The last
origin is retained by default in "every_n_origins" so the final test window
is visible even when it does not land exactly on the spacing.
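The "every_n_origins" behaviour described above can be sketched as follows. The sketch assumes the spacing starts at the first origin, which this page does not confirm, and `every_n_origins` is a hypothetical stand-alone function rather than the library implementation.

```python
def every_n_origins(origins, every_n=12, include_last=True):
    """Keep every n-th distinct origin, optionally adding the last one."""
    ordered = sorted(set(origins))
    keep = ordered[::every_n]  # assumed to start counting at the first origin
    if include_last and ordered and ordered[-1] not in keep:
        # The final origin does not land on the spacing, so add it explicitly.
        keep.append(ordered[-1])
    return keep

kept = every_n_origins(range(30), every_n=12)
```

With 30 origins and a spacing of 12, the selection is origins 0, 12, and 24, plus origin 29 because `include_last` is true.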
first_vs_last_forecast#
macroforecast.forecast_analysis.first_vs_last_forecast(
forecasts,
*,
group_by=("model", "horizon"),
include_combined=True,
) -> pandas.DataFrame
Compares the first and last forecast row inside each group. Output includes first/last dates, origins, predictions, actuals, residuals, and changes. This is the callable equivalent of the old first-window versus last-window view.
training_loss_trace#
macroforecast.forecast_analysis.training_loss_trace(
forecasts,
*,
load_pickle=False,
) -> pandas.DataFrame
Reads saved model sidecar JSON and returns fit metrics recorded by
macroforecast.models, usually n, mean, std, mae, mse, and rmse.
It does not unpickle models by default. Path-average forecasts can save one
fit per step; those rows use fit_step.
rolling_training_loss#
macroforecast.forecast_analysis.rolling_training_loss(
forecasts_or_trace,
*,
metric="rmse",
window=12,
min_periods=None,
group_by=("model", "horizon"),
load_pickle=False,
) -> pandas.DataFrame
Computes a rolling average of sidecar training metrics. It accepts either a
runner ForecastResult/forecast table or the output of
training_loss_trace(...).
dfm_idiosyncratic_acf#
macroforecast.forecast_analysis.dfm_idiosyncratic_acf(
source,
*,
max_lag=12,
load_pickle=False,
) -> pandas.DataFrame
Reads DFM residual diagnostics from a ModelFit, ForecastResult, or forecast
table with saved model sidecars. Output columns include model context,
residual name, lag, ACF, and observation count.
dfm_factor_stability#
macroforecast.forecast_analysis.dfm_factor_stability(
source,
*,
load_pickle=False,
) -> pandas.DataFrame
Reads filtered DFM factor diagnostics from a ModelFit, ForecastResult, or
forecast table with saved model sidecars. Output includes factor name, count,
mean, standard deviation, variance, first/last values, drift, and lag-1
autocorrelation.
coefficient_trace#
macroforecast.forecast_analysis.coefficient_trace(
forecasts,
*,
include_intercept: bool = True,
load_pickle: bool = False,
models: Iterable[str] | None = None,
) -> pandas.DataFrame
Reads stored_model["metadata_path"] sidecar JSON for each forecast row and
extracts fit.fit.diagnostics.coefficients. It does not unpickle model objects
by default. Set load_pickle=True only for trusted local artifacts when a
sidecar is missing.
Returned rows include date, origin, origin_pos, horizon, model,
fit_step, feature, coefficient, component, and stored artifact paths.
parameter_stability#
macroforecast.forecast_analysis.parameter_stability(
forecasts,
*,
include_intercept: bool = True,
load_pickle: bool = False,
group_by: Sequence[str] = ("model", "horizon", "feature"),
models: Iterable[str] | None = None,
) -> pandas.DataFrame
Summarizes coefficient traces across forecast origins. Output includes count, mean, standard deviation, min/max, first/last coefficient, and sign-change count. It is available only when model sidecars contain coefficient metadata.
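As an illustration of what a stability summary for one coefficient path might contain, the sketch below computes the listed statistics in plain Python. Two choices are assumptions: the standard deviation is the sample version, and exact zeros are skipped when counting sign changes. `stability_summary` is a hypothetical name.

```python
import statistics

def stability_summary(coefs):
    """Summarize one coefficient path across origins (illustrative only)."""
    # Ignore exact zeros so a pass through zero counts as one change at most.
    signs = [1 if c > 0 else -1 for c in coefs if c != 0]
    sign_changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return {
        "count": len(coefs),
        "mean": statistics.fmean(coefs),
        "std": statistics.stdev(coefs) if len(coefs) > 1 else float("nan"),
        "min": min(coefs),
        "max": max(coefs),
        "first": coefs[0],
        "last": coefs[-1],
        "sign_changes": sign_changes,
    }

# Coefficient values ordered by forecast origin.
summary = stability_summary([0.5, 0.25, -0.25, 0.0, 0.5])
```

A coefficient that flips sign repeatedly across origins is usually a sign of weak identification, which is the kind of pattern this summary is meant to surface.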
tuning_trace#
macroforecast.forecast_analysis.tuning_trace(forecasts) -> pandas.DataFrame
Returns one row per forecast row with model-selection metadata. It records method, metric, validation window, retune flag, best score, best params, trial counts, and policy. The current runner stores selection-event summaries, not full per-trial histories.
tuning_objective_trace#
macroforecast.forecast_analysis.tuning_objective_trace(forecasts) -> pandas.DataFrame
Extracts the objective-facing part of tuning_trace: model, horizon, origin,
method, metric, validation window, retune flag, best score, trial counts, and
policy. Use this when the question is whether model selection itself behaved
consistently.
Output columns are date, origin, origin_pos, horizon, model,
model_spec, method, metric, window, best_score, retuned,
n_trials, n_successful, n_failed, and policy.
hyperparameter_path#
macroforecast.forecast_analysis.hyperparameter_path(forecasts) -> pandas.DataFrame
Returns selected hyperparameters in long form: one row per forecast row and parameter. This is the callable table behind hyperparameter-over-time plots.
tuning_score_distribution#
macroforecast.forecast_analysis.tuning_score_distribution(
forecasts,
*,
group_by: Sequence[str] = ("model", "horizon", "method"),
) -> pandas.DataFrame
Summarizes selected tuning scores by group. Output includes count, mean, standard deviation, min, max, and median best score.
ensemble_weights_over_time#
macroforecast.forecast_analysis.ensemble_weights_over_time(
forecasts,
*,
unsupported: str = "skip",
) -> pandas.DataFrame
Reconstructs weights when the combination method has identifiable weights.
| Method | Weight support |
|---|---|
| | Equal weights. |
| | Recomputed from historical squared forecast errors using the same discount/min-weight parameters. |
| | Equal weights over historically best models. |
| | No unique model weights; skipped by default. |
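To make the idea of recomputing weights from historical squared errors concrete, here is a simplified sketch of inverse-MSPE weighting. It deliberately omits the discount and minimum-weight parameters mentioned in the table, so it is a plain textbook version rather than the library's reconstruction; `inverse_mspe_weights` is a hypothetical helper.

```python
def inverse_mspe_weights(past_errors):
    """Weights proportional to 1 / mean squared past error (simplified)."""
    mspe = {}
    for model, errors in past_errors.items():
        value = sum(e * e for e in errors) / len(errors)
        if value == 0:
            raise ValueError(f"{model} has zero MSPE; inverse weight is undefined")
        mspe[model] = value
    inverse = {model: 1.0 / value for model, value in mspe.items()}
    total = sum(inverse.values())
    # Normalize so the weights sum to one.
    return {model: w / total for model, w in inverse.items()}

weights = inverse_mspe_weights({
    "ridge": [1.0, -1.0],          # MSPE = 1
    "random_forest": [2.0, -2.0],  # MSPE = 4
})
```

Here the model with a quarter of the squared error receives four times the weight, giving 0.8 and 0.2.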
unsupported controls non-identifiable methods: "skip", "nan", or
"raise".
ensemble_weight_concentration#
macroforecast.forecast_analysis.ensemble_weight_concentration(forecasts) -> pandas.DataFrame
Summarizes reconstructed weights for each combined forecast row. Output includes member count, min/max weight, Herfindahl index, effective number of members, and entropy. These are concentration diagnostics for identifiable weights; median, trimmed-mean, and custom combinations can be non-identifiable.
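The concentration measures named above have standard definitions, sketched here for one combined forecast row. The sketch assumes the effective number of members is the reciprocal of the Herfindahl index and that entropy uses the natural logarithm; `weight_concentration` is a hypothetical name.

```python
import math

def weight_concentration(weights):
    """Concentration diagnostics for one set of weights (illustrative only)."""
    herfindahl = sum(w * w for w in weights)
    # Zero weights contribute nothing to entropy (limit of w * log w is 0).
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return {
        "n_members": len(weights),
        "min_weight": min(weights),
        "max_weight": max(weights),
        "herfindahl": herfindahl,
        "effective_n": 1.0 / herfindahl,  # assumed definition
        "entropy": entropy,               # natural log assumed
    }

conc = weight_concentration([0.25, 0.25, 0.25, 0.25])
```

Equal weights over four members give a Herfindahl index of 0.25 and an effective number of 4; a single dominant member pushes the effective number toward 1.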
ensemble_member_contribution#
macroforecast.forecast_analysis.ensemble_member_contribution(forecasts) -> pandas.DataFrame
Returns long-form member contribution rows when weights are identifiable: member model, member prediction, weight, contribution, and combined prediction.
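For a linear combination with known weights, each member's contribution is its weight times its prediction, and the contributions sum to the combined prediction. A minimal sketch under that assumption, with the hypothetical helper `member_contributions`:

```python
def member_contributions(predictions, weights):
    """Long-form contribution rows for one combined forecast (illustrative)."""
    rows = [
        {
            "member": model,
            "member_prediction": pred,
            "weight": weights[model],
            "contribution": weights[model] * pred,
        }
        for model, pred in predictions.items()
    ]
    combined = sum(r["contribution"] for r in rows)
    for r in rows:
        # Every member row carries the same combined prediction.
        r["combined_prediction"] = combined
    return rows

contrib = member_contributions(
    predictions={"ridge": 2.0, "random_forest": 4.0},
    weights={"ridge": 0.75, "random_forest": 0.25},
)
```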
stage_update_trace#
macroforecast.forecast_analysis.stage_update_trace(forecasts) -> pandas.DataFrame
Returns preprocessing and feature-engineering stage update records stored in
ForecastResult.metadata["stages"]. This is empty for direct FeatureSet
inputs because the runner does not refit preprocessing/features in that path.
custom_forecast_diagnostic#
macroforecast.forecast_analysis.custom_forecast_diagnostic(
forecasts,
func,
*,
name=None,
metadata=None,
**params,
) -> pandas.DataFrame
Runs one user diagnostic on a runner ForecastResult or forecast table. This
is for inspection only; it does not refit models, recompute model selection, or
change forecast rows.
Callable signature:
func(forecasts, *, metadata=None, **params)
The forecasts argument passed to the callable is a copy of the forecast
DataFrame. Accepted callable outputs are DataFrame, Series, mapping, or a
sequence convertible to a DataFrame.
The returned table carries:
| Attr | Meaning |
|---|---|
| | Always … |
| | |
| | Input metadata plus a … |
Example:
def tail_errors(forecasts, *, metadata=None, q=0.95):
    err = (forecasts["actual"] - forecasts["prediction"]).abs()
    return {"q": q, "abs_error": float(err.quantile(q))}

diag = mf.forecast_analysis.custom_forecast_diagnostic(
    result,
    tail_errors,
    name="tail_errors",
    q=0.9,
)
Boundary#
| Question | Use |
|---|---|
| Run windowed forecasts and combinations | |
| Score and rank forecasts | |
| Run forecast-comparison tests | |
| Inspect forecast rows, residuals, tuning, coefficients, weights, and stage updates | |