feature_selection – Filter columns by variance / correlation / lasso pre-screen.


Operational op under axis op, sub-layer L3_A_step_op, layer l3. Standalone callable: mf.functions.feature_selection_transform.

Function signature

mf.functions.feature_selection_transform(
    panel: pd.DataFrame,
    target: pd.Series | None,
    n_features: int | float,
    method: str,
) -> pd.DataFrame

Parameters

| name | type | default | constraint | description |
|---|---|---|---|---|
| panel | pd.DataFrame | | | Input panel. Each column is a variable; rows are time periods. A Series is promoted to a single-column DataFrame internally. |
| target | `pd.Series \| None` | None | | |
| n_features | `int \| float` | 0.5 | int >= 1 or float in (0, 1] | |
| method | str | "variance" | "variance", "correlation", or "lasso" | |
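The n_features constraint (int >= 1 or float in (0, 1]) can be checked with a small helper. The sketch below is illustrative only: `resolve_n_features` is a hypothetical name, not part of macroforecast, and it assumes that a float is read as a fraction of the available columns (rounded up) while an int is read as an absolute count capped at the panel width. The source does not state how the library interprets either form.

```python
import math


def resolve_n_features(n_features, n_columns):
    """Turn an n_features spec into a column count (hypothetical helper)."""
    # bool is a subclass of int, so reject it before the int branch.
    if isinstance(n_features, bool):
        raise TypeError("n_features must be int or float, not bool")
    if isinstance(n_features, int):
        if n_features < 1:
            raise ValueError("int n_features must be >= 1")
        # Assumption: an absolute count is capped at the panel width.
        return min(n_features, n_columns)
    if isinstance(n_features, float):
        if not 0.0 < n_features <= 1.0:
            raise ValueError("float n_features must be in (0, 1]")
        # Assumption: a float is a fraction of the columns, rounded up.
        return max(1, math.ceil(n_features * n_columns))
    raise TypeError("n_features must be int or float")
```

Under these assumptions the documented default of 0.5 would keep half of the columns, and 1.0 would keep all of them.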

Returns

pd.DataFrame: the input panel restricted to the columns that pass the selected filter.

Behavior

Drops columns that fail the criterion selected by params.method:

  • variance – drop columns whose variance is below params.threshold.

  • correlation – for each pair of columns whose correlation exceeds params.threshold, drop the later column and keep the first.

  • lasso – fit a Lasso pre-screen and keep columns with non-zero coefficients.
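The three criteria can be sketched with pandas and scikit-learn. This is a rough illustration of the idea, not the macroforecast implementation: the function names are hypothetical, the correlation filter assumes absolute Pearson correlation, and the Lasso step assumes standardised inputs and an arbitrary penalty `alpha=0.1`, none of which the source specifies.

```python
import pandas as pd
from sklearn.linear_model import Lasso


def drop_low_variance(panel: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Keep columns whose sample variance is at least `threshold`."""
    variances = panel.var()  # pandas uses ddof=1 by default
    return panel.loc[:, variances >= threshold]


def drop_correlated(panel: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Walk columns left to right; drop one if it is too correlated with a kept one."""
    corr = panel.corr().abs()  # assumption: absolute Pearson correlation
    kept = []
    for col in panel.columns:
        # `not (x > t)` keeps columns whose correlation is undefined (NaN).
        if all(not (corr.loc[col, k] > threshold) for k in kept):
            kept.append(col)
    return panel[kept]


def lasso_prescreen(panel: pd.DataFrame, target: pd.Series, alpha: float = 0.1) -> pd.DataFrame:
    """Keep columns that receive a non-zero Lasso coefficient."""
    # Standardise so the L1 penalty treats all columns on a common scale.
    # Assumes no constant columns (run the variance filter first).
    X = (panel - panel.mean()) / panel.std(ddof=0)
    model = Lasso(alpha=alpha).fit(X.to_numpy(), target.to_numpy())
    return panel.loc[:, model.coef_ != 0]


# Small deterministic panel: `flat` is constant, `a_copy` duplicates `a`,
# and `b` is orthogonal to `a`.
a = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
b = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
panel = pd.DataFrame({"a": a, "b": b, "a_copy": [2 * v for v in a], "flat": [5.0] * 8})
target = pd.Series([3 * v for v in a])  # depends on `a` only

screened = drop_low_variance(panel, threshold=0.01)      # removes `flat`
decorrelated = drop_correlated(screened, threshold=0.95)  # removes `a_copy`
selected = lasso_prescreen(decorrelated, target)          # removes `b`
print(list(selected.columns))
```

The filters are chained here only to show each one removing a different kind of column; in the op itself a single criterion is chosen through params.method.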

When to use

Trimming the panel before expensive downstream estimators (neural networks, SVMs, kernel methods) when high-dimensional noise dominates.

When NOT to use

Tree models – they handle irrelevant features natively.

In recipe context

Set params.op = "feature_selection" in the relevant layer to activate this op within a recipe:

# Layer L3 recipe fragment
params:
  op: feature_selection

References

  • macroforecast design Part 2, L3: ‘feature engineering is a DAG of typed transforms; cascade-depth bounds the longest chain at cascade_max_depth.’