maf_per_variable_pca – Per-variable MAF via PCA on lag-panels – Coulombe et al. (2021 IJF) Eq. (7).#
Back to op axis | Back to L3 | Browse all options
Operational op under axis
op, sub-layerL3_A_step_op, layerl3. Standalone callable:mf.functions.maf_per_variable_pca_transform.
Function signature#
mf.functions.maf_per_variable_pca_transform(
panel: pd.DataFrame,
n_lags: int,
n_components_per_var: int,
) -> pd.DataFrame
Parameters#
name |
type |
default |
constraint |
description |
|---|---|---|---|---|
|
|
— |
— |
Input panel. Each column is a variable; rows are time periods. Series is promoted to a single-column DataFrame internally. |
|
|
|
>= 1 |
Number of lags in the per-variable lag-panel. Paper default: 12 (monthly data). |
|
|
|
>= 1 |
Number of PCA components per variable. Paper default: 2 (footnote 11). |
Returns#
pd.DataFrame — scalar result.
Behavior#
Implements the paper-exact Moving Average Factor (MAF) construction from Coulombe, Leroux, Stevanovic & Surprenant (2021 IJF) §2.2 Eq. (7). For each variable k = 1..K in the input panel:
Build the
T × (n_lags + 1)lag-panel[X_{t,k}, L X_{t,k}, ..., L^{n_lags} X_{t,k}].Run PCA retaining
n_components_per_varcomponents (paper default: 2).Append the resulting factor columns to the output.
Output shape: (T, K · n_components_per_var). With defaults n_lags=12, n_components_per_var=2 the output is (T, 2K) – paper footnote 11: ‘We keep two MAFs for each series and they are obtained by PCA.’
Distinction from existing ma_increasing_order → pca(4) path: the existing stacked-PCA MAF cell runs a single PCA over all MA columns at once (stacked, 4 global components). This op runs separate PCA per variable, yielding 2K locally-structured factors rather than 4 global ones. Use this op when paper-Eq.7-exact replication is required.
First n_lags rows per variable are NaN (lag-panel boundary). temporal_rule is required; full_sample_once is rejected to enforce walk-forward boundaries.
Operational from v0.9.0 (phase-f16).
When to use
Paper-exact replication of Coulombe et al. (2021 IJF) MAF construction; when per-variable PCA factors are preferred over global stacked-PCA (4-component) path.
When NOT to use
When the 16-cell horse-race stacked-PCA MAF cell is sufficient (use ma_increasing_order → pca instead); or when K is large and 2K output columns would exceed T.
In recipe context#
Set params.op = "maf_per_variable_pca" in the relevant layer to activate this op within a recipe:
# Layer L3 recipe fragment
params:
op: maf_per_variable_pca
References#
macroforecast design Part 2, L3: ‘feature engineering is a DAG of typed transforms; cascade-depth bounds the longest chain at cascade_max_depth.’
Coulombe, Leroux, Stevanovic & Surprenant (2021) ‘Macroeconomic Data Transformations Matter’, International Journal of Forecasting 37(4): 1338-1354. https://doi.org/10.1016/j.ijforecast.2021.05.005