macroforecast.preprocessing#
Purpose#
macroforecast.preprocessing turns a canonical pandas panel from
macroforecast.data into a processed panel plus metadata. It accepts
a DataSpec, DataBundle, (panel, metadata) tuple, or pandas.DataFrame,
then returns a PreprocessedData object. The preferred input is a
DataBundle or DataSpec produced by macroforecast.data; if preprocessing
receives a plain panel without data-generated metadata, it emits a warning.
The default reprocess() path follows the public McCracken-Ng FRED-MD Matlab
workflow for FRED-MD/FRED-QD style panels. FRED-SD has no official t-code map,
so the user must explicitly choose transform="none" or pass custom codes.
Preprocessing fails closed on transformation metadata. If
transform="official" is selected but no t-code map is available from
transform_codes, metadata["transform_codes"], or
panel.attrs["macroforecast_transform_codes"], reprocess() raises
ValueError. If explicit transform-code keys do not match panel columns, it
also raises. This prevents accidental no-op preprocessing.
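The fail-closed rule can be illustrated with a small stand-alone sketch. The helper name `resolve_transform_codes` is hypothetical, and the lookup order (explicit argument, then metadata, then panel attrs) is an assumption taken from the order in which the sources are listed above; the package's real resolution logic may differ.

```python
def resolve_transform_codes(columns, transform_codes=None, metadata=None, attrs=None):
    """Return a t-code map for the panel columns, or raise instead of doing nothing."""
    metadata = metadata or {}
    attrs = attrs or {}

    if transform_codes:
        # Explicit codes are strict: every key must name a panel column.
        unknown = sorted(set(transform_codes) - set(columns))
        if unknown:
            raise ValueError(f"transform-code keys not in panel: {unknown}")
        return dict(transform_codes)

    # Assumed fallback order: metadata first, then panel attrs.
    codes = metadata.get("transform_codes") or attrs.get("macroforecast_transform_codes")
    if not codes:
        raise ValueError("transform='official' needs a t-code map")

    # Metadata-only codes for absent columns are ignored rather than rejected.
    return {name: code for name, code in codes.items() if name in columns}


columns = ["INDPRO", "UNRATE"]
official = resolve_transform_codes(
    columns, metadata={"transform_codes": {"INDPRO": 5, "OTHER": 2}}
)
```

The point of raising in both branches is that a run can never finish with untransformed data while the caller believes official transforms were applied.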
Public Functions#
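| Function | Purpose | Output |
|---|---|---|
| reprocess | Run the full-sample preprocessing sequence. | PreprocessedData |
| preprocess_spec | Store preprocessing choices for runner-fitted execution. | PreprocessSpec |
| custom_preprocess | Apply one user callable directly to data. | PreprocessedData |
| custom_preprocess_step | Build a custom step for preprocess_spec(custom_steps=[...]). | dict |
| plan | Validate and summarize configured preprocessing choices without changing data. | dict |
| report | Summarize a completed preprocessing result. | dict |
| apply_transform_codes | Apply McCracken-Ng t-code formulas to matching panel columns. | pandas.DataFrame |
| fred_sd_transform_codes | Expand FRED-SD variable/state t-code choices and suggestions. | dict[str, int], or (dict, pandas.DataFrame) with return_table=True |
| handle_tcode_lag | Keep or remove transform-induced leading missing rows. | pandas.DataFrame |
| handle_outliers | Apply one outlier rule. | pandas.DataFrame |
| impute_missing | Fill missing panel values with one imputation rule. | pandas.DataFrame |
| standardize_panel | Fit and apply one full-panel scaling rule. | pandas.DataFrame |
| handle_frame_edges | Keep, truncate, drop, or fill remaining unbalanced edges. | pandas.DataFrame |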
Low-level clean helpers are also public for exact single-operation use. They are listed in Low-Level Clean Helpers.
Public Flow#
import macroforecast as mf
bundle = mf.data.load_fred_md()
data_spec = mf.data.spec(bundle, target="INDPRO", horizons=[1, 3, 6, 12])
processed = mf.preprocessing.reprocess(data_spec)
panel = processed.panel
metadata = processed.metadata
Public Classes And Values#
Symbol |
Meaning |
|---|---|
|
Output object returned by |
|
Accepted direct preprocessing input type: |
|
Runner-compatible fit/transform preprocessing contract. |
|
Fitted preprocessing state used by the runner for fit-window or fixed-reference policies. |
|
High-confidence package t-code suggestions for FRED-SD variables with national analogs. |
|
Broader provisional FRED-SD t-code suggestions. |
PreprocessedData#
macroforecast.preprocessing.PreprocessedData(
panel: pandas.DataFrame,
metadata: dict,
target: str | None = None,
targets: tuple[str, ...] = (),
horizons: tuple[int, ...] = (),
start: str | None = None,
end: str | None = None,
predictors = "all",
steps: tuple[dict, ...] = (),
)
Output Schema#
| Field | Type | Meaning |
|---|---|---|
| panel | pandas.DataFrame | Processed canonical date-indexed panel. |
| metadata | dict | Input metadata plus preprocessing stages and transform/standardization state. |
| target, targets, horizons, start, end, predictors | copied from the input | Run-level data choices preserved for downstream stages. |
| steps | tuple[dict, ...] | Ordered preprocessing step log. |
Methods#
Method |
Input |
Output |
Meaning |
|---|---|---|---|
|
|
|
Return a new object with one metadata stage added. |
PreprocessedData also supports tuple unpacking:
panel, metadata = processed
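One common way to get this behaviour in Python is an `__iter__` method on a dataclass. The class below is a hypothetical stand-in named `Result`, not the package's `PreprocessedData`, and it only sketches how a result object can expose named fields while still unpacking as a two-tuple.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Result:
    """Hypothetical result object that also unpacks as (panel, metadata)."""

    panel: object
    metadata: dict = field(default_factory=dict)
    steps: tuple = ()

    def __iter__(self):
        # Only the first two fields take part in tuple unpacking.
        yield self.panel
        yield self.metadata


processed = Result(panel=[[1.0, 2.0]], metadata={"dataset": "toy"})
panel, metadata = processed
```

Attribute access (`processed.steps`) remains available for the fields that are not part of the unpacked pair.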
Default Order#
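| Step | Default | Meaning |
|---|---|---|
| 1. Frequency | frequency="keep" | Keep the input frequency unless the user asks for monthly/quarterly alignment. |
| 2. Transform | transform="official" | Apply official t-code transforms from FRED-MD/FRED-QD metadata. |
| 3. T-code lag | tcode_lag="drop" | Remove leading rows implied by the largest t-code lag. This is two rows for full FRED-MD. |
| 4. Outliers | outliers="iqr" | Flag observations with abs(x - median) > iqr_threshold * IQR as NaN (iqr_threshold=10.0, outlier_action="flag_as_nan"). |
| 5. Imputation | impute="em_factor" | Run FRED-MD style PCA-EM with Bai-Ng factor selection (em_factor_selection="baing_p2", at most em_n_factors=8 factors). |
| 6. Standardize | standardize="none" | Optional column-wise scaling after imputation. Choices are "none", "zscore", "robust", and "minmax". |
| 7. Frame | frame="keep" | Keep the post-EM frame. No final balanced-panel truncation is applied by default. |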
Set transform_order="before_frequency" when a mixed-frequency panel should be
transformed in each native frequency before monthly or quarterly alignment. The
default is transform_order="after_frequency", which first aligns frequency and
then applies t-codes.
T-Code Formulas#
The official FRED-MD/FRED-QD t-code map uses these formulas for a raw series
x_t.
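| T-code | Formula | Leading missing values | Log-domain rule |
|---|---|---|---|
| 1 | x_t | 0 | none |
| 2 | x_t - x_{t-1} | 1 | none |
| 3 | (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}) | 2 | none |
| 4 | log(x_t) | 0 | defined only if x_t > 0 |
| 5 | log(x_t) - log(x_{t-1}) | 1 | requires x_t > 0 |
| 6 | (log(x_t) - log(x_{t-1})) - (log(x_{t-1}) - log(x_{t-2})) | 2 | requires x_t > 0 |
| 7 | (x_t / x_{t-1} - 1) - (x_{t-1} / x_{t-2} - 1) | 2 | none |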
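The seven McCracken-Ng transformation codes can be written out in plain Python as a sanity check on the formulas and on the number of leading missing values each one produces. This is a stand-alone sketch on lists of floats; `apply_tcode` is a hypothetical name, and raising `ValueError` for non-positive input to the log codes is an assumption, not necessarily what the package does.

```python
import math

NAN = math.nan


def _diff(values):
    """First difference with one leading NaN."""
    if not values:
        return []
    return [NAN] + [values[t] - values[t - 1] for t in range(1, len(values))]


def _log(values):
    """Natural log; assumed to reject non-positive input."""
    if any(v <= 0 for v in values):
        raise ValueError("log t-codes need strictly positive values")
    return [math.log(v) for v in values]


def apply_tcode(values, code):
    """Apply one t-code (1 to 7) to a raw series x_t."""
    x = [float(v) for v in values]
    if code == 1:
        return x                      # level
    if code == 2:
        return _diff(x)               # first difference
    if code == 3:
        return _diff(_diff(x))        # second difference
    if code == 4:
        return _log(x)                # log level
    if code == 5:
        return _diff(_log(x))         # log first difference
    if code == 6:
        return _diff(_diff(_log(x)))  # log second difference
    if code == 7:
        growth = [NAN] + [x[t] / x[t - 1] - 1.0 for t in range(1, len(x))]
        return _diff(growth)          # change in the growth rate
    raise ValueError(f"unknown t-code: {code}")


series = [100.0, 110.0, 121.0, 133.1]
log_growth = apply_tcode(series, 5)   # one leading NaN, then log(1.1) each period
```

Codes 3, 6, and 7 each consume two observations, which is why the default lag handling drops two leading rows for a full FRED-MD panel.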
There is no preprocess(...) compatibility alias in the clean public API. Use
reprocess(...) for full-sample preprocessing and preprocess_spec(...) for a
runner-fitted preprocessing contract.
Most empirical macro papers preprocess the full panel once before fitting
models. That is supported by reprocess(...). For a real-time forecast design,
where each origin should only use information available at that origin, use
preprocess_spec(...) inside macroforecast.forecasting.run(...).
preprocess_spec(...) only stores what preprocessing should do; the runner
receives preprocessing_policy=mf.window.stage_policy(...) and decides where
the spec may fit.
Common runner policies:
Policy scope |
Meaning |
|---|---|
|
Fit preprocessing once on the full panel. This is useful for retrospective replication designs. |
|
Re-run preprocessing on observations available at each origin plus requested test rows. This supports EM imputation on variables observed by that origin. |
|
Fit outlier, imputation, and standardization state on the model fit window, then apply that state to validation/test rows. It currently supports |
|
Fit supported preprocessing state on a fixed reference period, then apply that state to later windows. |
import macroforecast as mf

# panel, features, and window are assumed to be defined earlier.
pre = mf.preprocessing.preprocess_spec(
    transform="official",
    outliers="iqr",
    impute="em_factor",
    frame="keep",
)
result = mf.forecasting.run(
    panel,
    "ridge",
    preprocessing=pre,
    # The runner, not the spec, decides where preprocessing may fit.
    preprocessing_policy=mf.window.stage_policy("origin_available"),
    features=features,
    window=window,
)
reprocess#
macroforecast.preprocessing.reprocess(
data,
*,
metadata: Mapping[str, object] | None = None,
frequency: str = "keep",
quarterly_to_monthly: str = "step_backward",
weekly_to_monthly: str = "mean",
monthly_to_quarterly: str = "quarterly_average",
weekly_to_quarterly: str = "mean",
transform_order: str = "after_frequency",
transform: str = "official",
transform_codes: Mapping[str, int] | None = None,
transform_code_overrides: Mapping[str, int] | None = None,
tcode_lag: str = "drop",
outliers: str = "iqr",
outlier_action: str = "flag_as_nan",
iqr_threshold: float = 10.0,
zscore_threshold: float = 3.0,
winsorize_quantiles: tuple[float, float] = (0.01, 0.99),
impute: str = "em_factor",
em_n_factors: int = 8,
em_factor_selection: str = "baing_p2",
em_demean: int = 2,
em_max_iter: int = 50,
em_tolerance: float = 1e-6,
standardize: str = "none",
standardize_columns: str | Sequence[str] = "all",
standardize_ddof: int = 0,
frame: str = "keep",
warn_metadata: bool = True,
) -> PreprocessedData
Input#
Name |
Type |
Default |
Choices |
|---|---|---|---|
|
|
required |
Canonical data input. |
|
mapping or |
|
Extra metadata to merge before preprocessing. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
mapping or |
from metadata |
Full t-code map. Required for |
|
mapping or |
|
Per-series override applied on top of official or custom codes. Override keys must match panel columns. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Degrees of freedom used by z-score scaling. |
|
|
|
|
|
|
|
Warn when plain panels lack metadata from |
Output#
Returns PreprocessedData.
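| Field | Type | Meaning |
|---|---|---|
| panel | pandas.DataFrame | Processed canonical date-indexed panel. |
| metadata | dict | Original data metadata plus a preprocessing stage. |
| target, targets, horizons, start, end, predictors | copied from the input | Run-level data choices preserved for downstream stages. |
| steps | tuple[dict, ...] | Ordered preprocessing log. |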
metadata["preprocessing"]["transform_state"] stores inverse-transform support
metadata for every transformed series: t-code, log-domain requirement, lag
count, and the last observed raw values/dates available before transformation.
metadata["preprocessing"]["standardization_state"] stores the fitted center
and scale values when standardize != "none".
When transforms are applied, the final post-override t-code map is also stored
in metadata["transform_codes_applied"] and
processed.panel.attrs["macroforecast_transform_codes"]. This is the map that
actually ran, not just the raw loader metadata.
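To see why the t-code and the last observed raw value are enough to undo a transform, consider t-code 5 (log first difference). The sketch below is a hypothetical helper, `invert_log_diff`, and not the package's inverse routine; it assumes the stored raw value is the level immediately before the first transformed observation.

```python
import math


def invert_log_diff(transformed, last_raw_value):
    """Rebuild levels from t-code 5 values using x_t = x_{t-1} * exp(y_t)."""
    levels = []
    previous = last_raw_value
    for y in transformed:
        previous = previous * math.exp(y)
        levels.append(previous)
    return levels


raw = [100.0, 102.0, 101.0, 105.0]
# Forward transform: y_t = log(x_t) - log(x_{t-1}).
y = [math.log(raw[t]) - math.log(raw[t - 1]) for t in range(1, len(raw))]
recovered = invert_log_diff(y, last_raw_value=raw[0])
```

Second-difference codes need two stored raw values rather than one, which is consistent with the state recording a lag count per series.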
Error Conditions#
Condition |
Result |
|---|---|
Plain |
|
|
|
|
|
Explicit transform-code or override key not in the panel |
|
FRED-SD with default |
|
Frequency inference finds sparse unknown columns during alignment |
|
EM imputation sees an all-missing row or column |
|
Standardization sees a zero-variance numeric column |
|
PreprocessedData supports tuple unpacking:
panel, metadata = processed
preprocess_spec#
preprocess_spec(...) stores the same preprocessing options accepted by
reprocess(...), excluding input-only arguments such as data and metadata.
It rejects unknown options immediately, so stage timing options must be passed
to forecasting.run(..., preprocessing_policy=...), not hidden inside the
preprocessing spec.
macroforecast.preprocessing.preprocess_spec(
**options,
) -> PreprocessSpec
Input#
**options may include any reprocess(...) option except data and
metadata. It also accepts:
Name |
Type |
Default |
Meaning |
|---|---|---|---|
|
sequence or omitted |
omitted |
Custom preprocessing steps created by |
|
|
|
Whether to warn when input lacks |
Do not pass window timing, stage scope, or split choices here. Those belong to
forecasting.run(..., preprocessing_policy=...).
Output#
Returns PreprocessSpec.
Method |
Input |
Output |
Meaning |
|---|---|---|---|
|
preprocessing input |
|
Fit preprocessing choices on a training/history panel. |
|
preprocessing input |
|
Fit and return the processed training panel. |
|
none |
|
JSON-ready preprocessing options. |
|
none |
|
Compact runner metadata. |
FittedPreprocessor.transform(data, metadata=None, history=None, policy=None)
returns PreprocessedData for new rows. policy="origin_available" replays
preprocessing on history + data; policy="fit_window" applies state fitted
on the training window where supported.
pre = mf.preprocessing.preprocess_spec(
transform="official",
outliers="iqr",
impute="em_factor",
standardize="zscore",
frame="keep",
)
For direct advanced use:
fitted = pre.fit(train_panel, policy="origin_available")
processed_test = fitted.transform(test_panel, history=train_panel)
The fitted and transformed metadata records fit_period, history_period,
transform_period, and output_period. policy="fit_window" applies
fit-window outlier, imputation, and standardization state; it currently supports
impute="none", "mean", and "forward_fill".
preprocess_spec(...) also accepts custom_steps=[...]. These steps run after
the built-in preprocessing options. Inside forecasting.run(...), the custom
steps are fitted or applied inside the same stage policy as the rest of the
preprocessing spec.
def add_spread(panel, *, metadata=None, scale=1.0):
out = panel.copy()
out["spread"] = (out["long_rate"] - out["short_rate"]) * scale
return out
pre = mf.preprocessing.preprocess_spec(
transform="none",
impute="mean",
custom_steps=[
mf.preprocessing.custom_preprocess_step("spread", add_spread, scale=100.0),
],
)
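Before wiring a callable into a spec, it can help to run it on a toy panel and confirm that it returns a new frame and leaves its input alone. The sketch below uses only pandas and the `add_spread` function from above; the column values are made up for illustration.

```python
import pandas as pd


def add_spread(panel, *, metadata=None, scale=1.0):
    out = panel.copy()  # never mutate the caller's panel
    out["spread"] = (out["long_rate"] - out["short_rate"]) * scale
    return out


toy_panel = pd.DataFrame(
    {"long_rate": [0.040, 0.042], "short_rate": [0.030, 0.035]},
    index=pd.to_datetime(["2024-01-01", "2024-02-01"]),
)
with_spread = add_spread(toy_panel, metadata={}, scale=100.0)
```

With scale=100.0 the spread is expressed in percentage points.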
custom_preprocess#
Apply one user-supplied preprocessing callable directly to a panel or bundle.
macroforecast.preprocessing.custom_preprocess(
data,
func,
*,
metadata: Mapping[str, object] | None = None,
name: str | None = None,
**params,
) -> PreprocessedData
Callable Contract#
The callable receives:
func(panel: pandas.DataFrame, *, metadata: dict, **params)
It must return one of:
Return type |
Meaning |
|---|---|
|
New canonical or normalizable panel. Existing |
|
Panel plus metadata to continue with. |
|
Full preprocessing object to continue with. |
|
Explicit panel and metadata pair. |
Output#
Returns PreprocessedData. Metadata gains metadata["custom_preprocess"],
including callable name, parameters, input panel summary, and output panel
summary. The output panel also carries
panel.attrs["macroforecast_metadata"].
custom_preprocess_step#
Create a runner-compatible preprocessing step for
preprocess_spec(custom_steps=[...]).
macroforecast.preprocessing.custom_preprocess_step(
name: str,
func,
**params,
) -> dict
Input |
Meaning |
|---|---|
|
Stable step name stored in metadata. |
|
Callable following the |
|
JSON-ready parameters passed to |
The returned dictionary keeps the callable for Python execution, but
PreprocessSpec.to_dict() records only the callable name so runner metadata is
JSON-ready.
Step Helpers#
These helpers return pandas.DataFrame unless noted.
Function |
Input |
Output |
Meaning |
|---|---|---|---|
|
DataFrame/bundle/spec |
|
Dry-run summary of configured choices, transform codes, metadata warning, and detected native frequencies. |
|
|
|
Compact report from a completed preprocessing result. |
|
DataFrame/bundle/spec and callable |
|
Apply one custom preprocessing function directly. |
|
name and callable |
|
Build a custom step for |
|
DataFrame, t-code map |
DataFrame |
Apply McCracken-Ng t-code formulas. |
|
FRED-SD panel/bundle/spec |
|
Build FRED-SD state-series t-codes from user choices and optional national-analog suggestions. |
|
DataFrame |
DataFrame |
Handle missing rows introduced by t-code transforms. |
|
DataFrame |
DataFrame |
Apply one outlier policy. |
|
DataFrame |
DataFrame |
Fill missing values. |
|
DataFrame |
DataFrame |
Apply one full-panel standardization policy. |
|
DataFrame |
DataFrame |
Keep/drop/truncate/fill remaining unbalanced edges. |
Low-level callable variants are public for users who want one exact operation
without the full reprocess(...) sequence.
Low-Level Clean Helpers#
These helpers accept a pandas.DataFrame and return a new pandas.DataFrame
unless the output column says otherwise.
Function |
Key options |
Output |
Meaning |
|---|---|---|---|
|
|
DataFrame |
IQR outlier rule used by |
|
|
DataFrame |
Z-score outlier rule used by |
|
quantile bounds |
DataFrame |
Winsorization rule used by |
|
EM factor controls |
DataFrame |
PCA-EM imputation used by |
|
EM controls |
DataFrame |
Multivariate EM imputation used by |
|
none |
DataFrame |
Column-mean imputation. |
|
none |
DataFrame |
Forward-fill imputation. |
|
none |
DataFrame |
Time interpolation imputation. |
|
none |
DataFrame |
Keep the largest balanced sample. |
|
none |
DataFrame |
Drop series that keep unbalanced sample edges. |
|
none |
DataFrame |
Fill leading missing values with zero. |
|
scaling method |
|
Fit reusable scaling state. |
|
fitted state |
DataFrame |
Apply previously fitted scaling state. |
|
scaling method |
DataFrame |
One-shot panel standardization. |
|
t-code map |
DataFrame |
Apply McCracken-Ng t-code formulas to matching panel columns. |
|
column list, rule |
DataFrame |
Low-level quarterly-to-monthly alignment helper. |
|
column list, rule |
DataFrame |
Low-level monthly-to-quarterly alignment helper. |
plan#
macroforecast.preprocessing.plan(
data,
*,
metadata: Mapping[str, object] | None = None,
frequency: str = "keep",
transform_order: str = "after_frequency",
transform: str = "official",
transform_codes: Mapping[str, int] | None = None,
transform_code_overrides: Mapping[str, int] | None = None,
tcode_lag: str = "drop",
outliers: str = "iqr",
impute: str = "em_factor",
standardize: str = "none",
standardize_columns: str | Sequence[str] = "all",
standardize_ddof: int = 0,
frame: str = "keep",
) -> dict
Input#
Same data input contract as reprocess(). plan() validates the panel and
normalizes choices, but it does not transform, impute, or mutate the panel.
Output#
Key |
Meaning |
|---|---|
|
Shape, date range, columns, missing count, and inferred index frequency. |
|
Warning text that would matter for a panel without data-generated metadata, or |
|
Ordered step names implied by |
|
Requested frequency policy plus native-frequency map and metadata source. |
|
Native-frequency inference concerns such as sparse |
|
Transform method, applied t-code map, ignored metadata-only codes, and any no-code/no-match error note. |
|
Normalized choice values. |
report#
macroforecast.preprocessing.report(processed: PreprocessedData) -> dict
Input#
processed must be the object returned by reprocess().
Output#
Key |
Meaning |
|---|---|
|
Panel summary before preprocessing. |
|
Panel summary after preprocessing. |
|
Ordered execution log with input/output shapes where relevant. |
|
Final normalized preprocessing choices. |
|
Inverse-transform support metadata saved during the transform step. |
|
Fitted scaling metadata saved during the standardization step. |
apply_transform_codes#
macroforecast.preprocessing.apply_transform_codes(
panel: pandas.DataFrame,
codes: Mapping[str, int],
) -> pandas.DataFrame
Input#
Name |
Type |
Required |
Choices |
|---|---|---|---|
|
|
yes |
Canonical date-indexed numeric panel. |
|
mapping from column name to integer |
yes |
T-codes |
Output#
Returns a new pandas.DataFrame with matching columns transformed by the
McCracken-Ng formulas above. Columns without a matching t-code are copied
unchanged. Leading missing values are not removed here; call
handle_tcode_lag() or use reprocess(tcode_lag=...).
Note the distinction between this low-level helper and reprocess().
apply_transform_codes() ignores absent code keys for convenience when used
interactively. reprocess() is stricter: explicit transform-code keys must
match panel columns so a production run cannot silently miss a requested
series.
handle_tcode_lag#
macroforecast.preprocessing.handle_tcode_lag(
panel: pandas.DataFrame,
*,
method: str = "drop",
codes: Mapping[str, int] | None = None,
) -> pandas.DataFrame
Input#
|
Meaning |
|---|---|
|
Drop the first |
|
Keep all rows, including transform-induced leading missing values. |
|
Drop only rows where every column is missing. |
|
Drop every row with at least one missing value. This is strict and often removes too much data. |
Output#
Returns a new pandas.DataFrame. The function does not impute; it only handles
missing rows introduced by transformations.
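As an illustration of the drop rule described under Default Order (remove the leading rows implied by the largest t-code lag), here is a minimal pandas sketch. `drop_tcode_lag` and the `TCODE_LAG` table are hypothetical names; the lag counts are the ones implied by the t-code formulas.

```python
import pandas as pd

# Observations consumed by each t-code before the first valid value.
TCODE_LAG = {1: 0, 2: 1, 3: 2, 4: 0, 5: 1, 6: 2, 7: 2}


def drop_tcode_lag(panel, codes):
    """Drop as many leading rows as the largest lag among the applied codes."""
    n_drop = max(
        (TCODE_LAG[codes[col]] for col in panel.columns if col in codes),
        default=0,
    )
    return panel.iloc[n_drop:].copy()


lag_panel = pd.DataFrame(
    {"a": [None, 0.1, 0.2, 0.3, 0.4], "b": [None, None, 0.0, 0.1, 0.2]},
    index=pd.date_range("2020-01-01", periods=5, freq="MS"),
).astype(float)
trimmed = drop_tcode_lag(lag_panel, {"a": 5, "b": 6})
```

Dropping by lag count rather than by "any NaN" keeps genuinely missing interior observations in place for the imputation step.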
handle_outliers#
macroforecast.preprocessing.handle_outliers(
panel: pandas.DataFrame,
*,
method: str = "iqr",
action: str = "flag_as_nan",
iqr_threshold: float = 10.0,
zscore_threshold: float = 3.0,
winsorize_quantiles: tuple[float, float] = (0.01, 0.99),
) -> pandas.DataFrame
Input#
Name |
Default |
Choices |
|---|---|---|
|
|
|
|
|
|
|
|
Positive float. McCracken-Ng default is |
|
|
Positive float. |
|
|
Lower and upper quantiles for winsorization. |
Output#
Returns a new pandas.DataFrame. The default marks IQR outliers as NaN, so
the next imputation step can fill them.
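The McCracken-Ng outlier rule flags an observation when it deviates from the column median by more than a multiple of the interquartile range. The pandas sketch below shows that rule with the default multiple of 10; `flag_iqr_outliers` is a hypothetical name, and details such as the quantile interpolation method are assumptions that may differ from the package.

```python
import pandas as pd


def flag_iqr_outliers(panel, threshold=10.0):
    """Replace values with NaN where abs(x - median) > threshold * IQR."""
    median = panel.median()
    iqr = panel.quantile(0.75) - panel.quantile(0.25)
    is_outlier = (panel - median).abs() > threshold * iqr
    return panel.mask(is_outlier)  # returns a new frame


raw_panel = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]})
cleaned = flag_iqr_outliers(raw_panel)
```

In this toy column the median is 3.5 and the IQR is 2.5, so only the value 1000.0 exceeds the cutoff of 25.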
impute_missing#
macroforecast.preprocessing.impute_missing(
panel: pandas.DataFrame,
*,
method: str = "em_factor",
em_n_factors: int = 8,
em_factor_selection: str = "baing_p2",
em_demean: int = 2,
em_max_iter: int = 50,
em_tolerance: float = 1e-6,
) -> pandas.DataFrame
Input#
Name |
Default |
Choices |
|---|---|---|
|
|
|
|
|
Maximum factor count for |
|
|
|
|
|
|
|
|
Positive integer. |
|
|
Positive float. |
Output#
Returns a new pandas.DataFrame. The default em_factor path uses the
FRED-MD-style PCA-EM algorithm. It raises if the panel contains an all-missing
row or all-missing column; use handle_tcode_lag() before this step for the
usual FRED-MD transform-induced leading missing rows.
method="linear" fills only interior missing values bracketed by observed
data. It does not extrapolate leading or trailing missing values, because those
edges usually encode unavailable source observations.
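Interior-only filling corresponds to what pandas calls `limit_area="inside"`. The sketch below uses `DataFrame.interpolate` directly to show the behaviour; whether the package delegates to this exact call is an assumption.

```python
import pandas as pd

gappy = pd.DataFrame(
    {"x": [float("nan"), 1.0, float("nan"), 3.0, float("nan")]},
    index=pd.date_range("2020-01-01", periods=5, freq="MS"),
)
# Fill only NaNs that have observed values on both sides.
filled = gappy.interpolate(method="linear", limit_area="inside")
```

The first and last rows stay missing, while the interior gap becomes 2.0.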
method="em_multivariate" uses the same all-missing row/column guard as
em_factor.
standardize_panel#
macroforecast.preprocessing.standardize_panel(
panel: pandas.DataFrame,
*,
method: str = "zscore",
ddof: int = 0,
) -> pandas.DataFrame
Input#
Name |
Default |
Choices |
|---|---|---|
|
|
|
|
|
Non-negative integer used only for z-score standardization. |
Output#
Returns a new pandas.DataFrame with numeric columns scaled. zscore uses
column means and standard deviations, robust uses median and IQR, and
minmax uses minimum and range. The helper fits scaling parameters on the full
panel supplied to it.
For forecasting experiments that require origin-by-origin information sets,
prefer preprocess_spec(standardize=...) through the forecasting runner. In
that path, scaling parameters are fitted on the train window and reused for the
test rows.
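The difference between full-panel scaling and train-window scaling comes down to where the center and scale are estimated. The following pandas sketch separates the two steps for z-score scaling; `fit_zscore` and `apply_zscore` are hypothetical names, and raising on a zero-variance column is an assumption.

```python
import pandas as pd


def fit_zscore(train, ddof=0):
    """Estimate column means and standard deviations on the training rows only."""
    scale = train.std(ddof=ddof)
    if (scale == 0).any():
        raise ValueError("zero-variance column cannot be z-scored")
    return {"center": train.mean(), "scale": scale}


def apply_zscore(panel, state):
    """Scale any rows with previously fitted state."""
    return (panel - state["center"]) / state["scale"]


train = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
test = pd.DataFrame({"x": [4.0]})
state = fit_zscore(train)
test_scaled = apply_zscore(test, state)
```

Because the test row never enters `fit_zscore`, its scaled value reflects only information available in the training window.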
Inside reprocess(...), use standardize_columns="predictors" when a
DataSpec should scale predictor columns while leaving the target in its
post-transform units.
handle_frame_edges#
macroforecast.preprocessing.handle_frame_edges(
panel: pandas.DataFrame,
*,
method: str = "keep",
) -> pandas.DataFrame
Input#
|
Meaning |
|---|---|
|
Keep the panel as-is. This is the default after EM imputation. |
|
Truncate to the largest balanced sample. |
|
Drop columns that keep unbalanced edges. |
|
Fill leading missing values with zero. |
Output#
Returns a new pandas.DataFrame.
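One plausible reading of "largest balanced sample" is the date range over which every column has started and none has ended. The sketch below implements that reading with pandas; `truncate_to_balanced` is a hypothetical name, the definition is an assumption, and it expects every column to contain at least one observation.

```python
import pandas as pd


def truncate_to_balanced(panel):
    """Keep rows between the latest first observation and the earliest last one."""
    start = max(panel[col].first_valid_index() for col in panel.columns)
    end = min(panel[col].last_valid_index() for col in panel.columns)
    return panel.loc[start:end]


ragged = pd.DataFrame(
    {
        "a": [float("nan"), 1.0, 2.0, 3.0, 4.0],
        "b": [1.0, 2.0, 3.0, 4.0, float("nan")],
    },
    index=pd.date_range("2020-01-01", periods=5, freq="MS"),
)
balanced = truncate_to_balanced(ragged)
```

Only ragged edges are removed; interior gaps, if any, would remain for an imputation step.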
FRED-SD#
FRED-SD does not provide official t-codes, so reprocess(fred_sd_bundle) with
the default transform="official" raises an error. The user must either pass
transform="none" or supply a custom code map with transform="custom".
Package suggestion tables are exposed as constants for inspection:
Symbol |
Meaning |
|---|---|
|
High-confidence t-code suggestions based on national FRED-MD/FRED-QD analogs. |
|
Broader provisional t-code suggestions; opt in with |
fred_sd_transform_codes#
macroforecast.preprocessing.fred_sd_transform_codes(
data,
*,
variable_codes: Mapping[str, int] | None = None,
state_series_codes: Mapping[str, int] | None = None,
use_national_analog_suggestions: bool = True,
include_medium_confidence: bool = False,
return_table: bool = False,
) -> dict[str, int] | tuple[dict[str, int], pandas.DataFrame]
Input#
Name |
Type |
Default |
Meaning |
|---|---|---|---|
|
|
required |
FRED-SD wide state-series panel. |
|
mapping or |
|
User t-code choices by FRED-SD variable, such as |
|
mapping or |
|
User t-code choices by exact column, such as |
|
|
|
Include high-confidence package suggestions based on national FRED-MD/FRED-QD analogs. |
|
|
|
Include broader provisional suggestions. |
|
|
|
Return a provenance table with the expanded code map. |
Output#
By default, returns dict[str, int] mapping FRED-SD state-series columns to
t-codes. With return_table=True, returns (codes, table). The table columns
are column, sd_variable, state, tcode, source, and
suggestion_confidence.
suggestion_confidence is not a statistical confidence interval. It records
whether the t-code came from a user state-series override, user variable-level
choice, high-confidence package suggestion, medium-confidence package
suggestion, or no assignment.
No transform:
processed = mf.preprocessing.reprocess(fred_sd_bundle, transform="none")
Variable-level t-codes expanded to all state series:
codes = mf.preprocessing.fred_sd_transform_codes(
fred_sd_bundle,
variable_codes={"UR": 2, "ICLAIMS": 5},
)
processed = mf.preprocessing.reprocess(
fred_sd_bundle,
frequency="monthly",
transform="custom",
transform_codes=codes,
)
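The expansion from variable-level choices to state-series columns can be pictured with plain dictionaries. Everything in this sketch is an assumption made for illustration: the helper name `expand_variable_codes`, the `VARIABLE_STATE` column naming, the suggestion values, and the precedence order (state-series override, then variable-level choice, then suggestion).

```python
def expand_variable_codes(columns, variable_codes=None, state_series_codes=None,
                          suggestions=None):
    """Map each state-series column to a t-code, most specific source first."""
    variable_codes = variable_codes or {}
    state_series_codes = state_series_codes or {}
    suggestions = suggestions or {}

    codes = {}
    for column in columns:
        variable = column.rsplit("_", 1)[0]  # assumed "VARIABLE_STATE" naming
        if column in state_series_codes:
            codes[column] = state_series_codes[column]
        elif variable in variable_codes:
            codes[column] = variable_codes[variable]
        elif variable in suggestions:
            codes[column] = suggestions[variable]
        # Columns with no source stay unassigned.
    return codes


sd_columns = ["UR_CA", "UR_TX", "ICLAIMS_CA", "OTHER_CA"]
sd_codes = expand_variable_codes(
    sd_columns,
    variable_codes={"UR": 2},
    state_series_codes={"UR_TX": 1},
    suggestions={"ICLAIMS": 5},  # illustrative value, not a package constant
)
```

Leaving unmatched columns out of the result makes the gap visible to the caller instead of silently assigning a default code.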
Built-in national-analog suggestions are offered for high-confidence FRED-SD
variables such as UR, PARTRATE, ICLAIMS, LF, NA, and major employment
sector variables. These are suggestions, not official FRED-SD metadata. Pass
include_medium_confidence=True to also include broader output, housing, trade,
and income analogs.
To inspect provenance:
codes, table = mf.preprocessing.fred_sd_transform_codes(
fred_sd_bundle,
variable_codes={"UR": 2},
return_table=True,
)
In table, the source column distinguishes user state-series overrides, user
variable-level choices, high- or medium-confidence national-analog suggestions,
and unassigned columns. suggestion_confidence is a provenance label for
non-official package suggestions, not a statistical confidence interval.
For FRED-SD frequency alignment, preprocessing reads the data-generated
fred_sd_series_metadata report first. Observed-date inference is only a
fallback. FRED-SD is mixed monthly/quarterly data; combined dataset frequency
alignment belongs in macroforecast.data, not in preprocessing.
FRED-QD and Dataset Combination#
mf.data.load_fred_qd() returns a quarterly panel with
metadata["frequency"] == "quarterly" and official FRED-QD t-codes. FRED-QD is
not mixed-frequency in the same sense as FRED-SD.
Combinations such as FRED-MD + FRED-SD or FRED-QD + FRED-SD should be built in
macroforecast.data, not in preprocessing. Dataset composition decides which
sources to load, how to align indices before a run, how to merge metadata, and
how to record frequency-conversion provenance. Preprocessing then operates on
the combined canonical panel it receives.
Use:
monthly_bundle = mf.data.load_fred_md_sd(states=["CA"], variables=["UR"])
quarterly_bundle = mf.data.load_fred_qd_sd(states=["CA"], variables=["UR"])
Source#
The FRED-MD/FRED-QD defaults are based on the public FRED-Databases Matlab code
linked from the St. Louis Fed FRED-MD/FRED-QD page, specifically
fredfactors.m, prepare_missing.m, remove_outliers.m, and factors_em.m.
box_cox_lambda: select a Box-Cox lambda for one series ('loglik' MLE or 'guerrero'; forecast::BoxCox.lambda).
box_cox_clean: apply a Box-Cox variance-stabilising transform per numeric column (lambda selected or supplied).
inverse_box_cox: invert a Box-Cox transform given lambda.