feature_selection – Filter columns by variance / correlation / lasso pre-screen.
Operational op under axis op, sub-layer L3_A_step_op, layer l3. Standalone callable: mf.functions.feature_selection_transform.
Function signature
mf.functions.feature_selection_transform(
    panel: pd.DataFrame,
    target: pd.Series | None,
    n_features: int | float,
    method: str,
) -> pd.DataFrame
Parameters
| name | type | default | constraint | description |
|---|---|---|---|---|
| `panel` | `pd.DataFrame` | — | — | Input panel. Each column is a variable; rows are time periods. Series is promoted to a single-column DataFrame internally. |
| `target` | `pd.Series \| None` | | — | |
| `n_features` | `int \| float` | | int >= 1 or float in (0, 1] | |
| `method` | `str` | | "variance", "correlation", "lasso" | |
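The constraint on n_features (int >= 1 or float in (0, 1]) suggests two reading modes. The sketch below shows one plausible interpretation, assuming an int is an absolute column count and a float is a fraction of the available columns. The helper name `resolve_n_features`, the floor rounding, and the capping at the column count are all assumptions for illustration, not the library's documented behavior.

```python
from __future__ import annotations


def resolve_n_features(n_features: int | float, n_columns: int) -> int:
    """Hypothetical helper: turn an int count or float fraction into a column count."""
    if isinstance(n_features, bool):
        # bool is a subclass of int, so reject it explicitly.
        raise TypeError("n_features must be an int or a float, not a bool")
    if isinstance(n_features, int):
        if n_features < 1:
            raise ValueError("an int n_features must be >= 1")
        count = n_features
    elif isinstance(n_features, float):
        if not 0.0 < n_features <= 1.0:
            raise ValueError("a float n_features must lie in (0, 1]")
        # Assumption: round the fraction down, but always keep at least one column.
        count = max(1, int(n_features * n_columns))
    else:
        raise TypeError("n_features must be an int or a float")
    # Assumption: never request more columns than the panel has.
    return min(count, n_columns)


print(resolve_n_features(3, 10))    # 3
print(resolve_n_features(0.5, 8))   # 4
print(resolve_n_features(20, 10))   # 10
```

Under this reading, `n_features=1` (int) keeps a single column while `n_features=1.0` (float) keeps all of them, which is why the two types are checked separately.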
Returns
pd.DataFrame containing only the columns of panel that pass the selected filter.
Behavior
Drops columns failing one of three criteria configured via params.method:
- variance – drop columns with variance below params.threshold.
- correlation – drop columns with pairwise correlation above params.threshold (keeps the first).
- lasso – fit a Lasso pre-screen and keep columns with non-zero coefficients.
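The three criteria above can be sketched with pandas and scikit-learn. This is a minimal illustration, not the library's implementation: the function names, the use of absolute correlation, the standardization before the Lasso fit, and the `alpha` penalty are assumptions made for the example.

```python
import numpy as np
import pandas as pd


def drop_low_variance(panel: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Keep columns whose sample variance is at least `threshold`."""
    variances = panel.var()  # column-wise sample variance (ddof=1)
    return panel.loc[:, variances >= threshold]


def drop_correlated(panel: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Walk columns left to right; drop one if it correlates too strongly with a kept one."""
    corr = panel.corr().abs()  # assumption: compare absolute Pearson correlation
    kept = []
    for col in panel.columns:
        # The first column of a correlated group is kept, later ones are dropped.
        if not any(corr.loc[col, k] > threshold for k in kept):
            kept.append(col)
    return panel[kept]


def lasso_prescreen(panel: pd.DataFrame, target: pd.Series, alpha: float = 0.1) -> pd.DataFrame:
    """Keep columns with a non-zero Lasso coefficient (`alpha` is a hypothetical knob)."""
    # Imported lazily so the two filters above work without scikit-learn.
    from sklearn.linear_model import Lasso
    from sklearn.preprocessing import StandardScaler

    # Assumption: standardize so the L1 penalty treats all columns on the same scale.
    X = StandardScaler().fit_transform(panel.to_numpy())
    model = Lasso(alpha=alpha).fit(X, target.to_numpy())
    return panel.loc[:, model.coef_ != 0]


# Synthetic panel: one informative column, a near-duplicate, pure noise, and a constant.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
panel = pd.DataFrame({
    "signal": x1,
    "signal_copy": x1 + 0.01 * rng.normal(size=n),
    "noise": rng.normal(size=n),
    "flat": np.full(n, 3.0),
})
target = 2.0 * panel["signal"] + 0.1 * rng.normal(size=n)

step1 = drop_low_variance(panel, threshold=1e-8)   # removes "flat"
step2 = drop_correlated(step1, threshold=0.95)     # removes "signal_copy"
print(list(step1.columns))
print(list(step2.columns))
```

Calling `lasso_prescreen(step2, target)` then applies the third criterion to the already-trimmed panel. Chaining the filters in this order is a choice made for the example; the op itself applies the single criterion selected by params.method.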
When to use
Trimming the panel before expensive downstream estimators (neural networks, SVMs, kernel methods) when high-dimensional noise dominates.
When NOT to use
Tree models – they handle irrelevant features natively.
In recipe context
Set params.op = "feature_selection" in the relevant layer to activate this op within a recipe:
# Layer L3 recipe fragment
params:
  op: feature_selection
References
macroforecast design Part 2, L3: ‘feature engineering is a DAG of typed transforms; cascade-depth bounds the longest chain at cascade_max_depth.’