`extra_trees` – Extremely randomized trees (sklearn).

Operational op under axis `family`, sub-layer `L4_A_model_selection`, layer `l4`. Standalone callable: `mf.functions.extra_trees_fit`.

Function signature

mf.functions.extra_trees_fit(
    X: np.ndarray | pd.DataFrame,
    y: np.ndarray | pd.Series,
) -> ExtraTreesFitResult

name	type	default	constraint	description
`X`	`np.ndarray | pd.DataFrame`	—	—
`y`	`np.ndarray | pd.Series`	—	—

ExtraTreesFitResult — frozen dataclass with fit results.

Attribute	Type	Description
`.feature_importances_`	`np.ndarray`	Mean decrease in impurity per feature, shape (n_features,). Sums to 1.0.
`.n_estimators_used`	`int`	Number of trees grown (= n_estimators parameter).
`.predict(X)`	`np.ndarray`	Predictions for new data X, shape (n_samples,).
`.summary()`	`str`	Human-readable table of fit results including top-3 feature importances.
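
The attributes above mirror quantities that scikit-learn exposes on a fitted extra-trees model. The sketch below calls `sklearn.ensemble.ExtraTreesRegressor` directly rather than `mf.functions.extra_trees_fit`, and it assumes a regression target; the synthetic data and the variable names are illustrative only and are not part of the macroforecast API.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic data (assumption): feature 0 carries most of the signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

model = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X, y)

importances = model.feature_importances_  # mean decrease in impurity, shape (n_features,)
n_trees = len(model.estimators_)          # number of trees actually grown
preds = model.predict(X[:10])             # predictions, shape (n_samples,)

# Indices of the three largest importances, largest first.
top3 = np.argsort(importances)[::-1][:3]

print(importances.shape, float(importances.sum()), n_trees, preds.shape, top3)
```

The importances are normalized by scikit-learn, which is why they sum to 1.0 up to floating-point error.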

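Like a random forest (RF), but each candidate feature is split at a randomly drawn threshold instead of a greedily optimized one; the best of these random candidates is kept. Skipping the threshold search usually makes training faster than RF, and the extra randomization can lower variance at the cost of some additional bias.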
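
The difference between the two split rules can be sketched for a single feature in plain Python. This is a simplified illustration, not the scikit-learn implementation: the helper names (`sse`, `split_cost`, `greedy_threshold`, `random_threshold`) are hypothetical, and the cost is the summed squared error of the two child nodes.

```python
import random

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_cost(x, y, threshold):
    """Total SSE of the two children produced by splitting at `threshold`."""
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return sse(left) + sse(right)

def greedy_threshold(x, y):
    """RF-style rule: evaluate every midpoint between sorted unique values."""
    xs = sorted(set(x))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: split_cost(x, y, t))

def random_threshold(x, rng):
    """Extra-trees-style rule: one uniform draw between the feature's min and max."""
    return rng.uniform(min(x), max(x))

# Toy data (assumption): a step function with its jump at x = 0.6.
rng = random.Random(0)
x = [rng.random() for _ in range(200)]
y = [1.0 if xi > 0.6 else 0.0 for xi in x]

t_greedy = greedy_threshold(x, y)
t_random = random_threshold(x, rng)
cost_greedy = split_cost(x, y, t_greedy)
cost_random = split_cost(x, y, t_random)

print(f"greedy: t={t_greedy:.3f}, cost={cost_greedy:.3f}")
print(f"random: t={t_random:.3f}, cost={cost_random:.3f}")
```

The greedy rule evaluates one candidate per gap between observed values, while the random rule evaluates a single draw. A single random split is never better than the greedy one on the training data, which is the trade that buys the speed and the extra decorrelation between trees.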

v0.9 sub-axis:

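`params.max_features` – number of predictors considered at each split. `"sqrt"` is the default (in sklearn this is the default for `ExtraTreesClassifier`; `ExtraTreesRegressor` defaults to 1.0, i.e. all features). `1` (operational, v0.9) implements the Perfectly Random Forest (PRF) baseline of Coulombe (2024), ‘To Bag is to Prune’: one random feature per split, so the tree structure is fully random.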
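
As a rough illustration of what the two `max_features` settings mean at the scikit-learn level, the sketch below fits both variants on synthetic data. It assumes a regression target and uses `ExtraTreesRegressor` directly; how macroforecast forwards `params.max_features` to scikit-learn is not shown in this page, so the mapping here is an assumption.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic data (assumption): 9 predictors, only the first two are informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 9))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# "sqrt": int(sqrt(9)) = 3 candidate features per split.
et_sqrt = ExtraTreesRegressor(
    n_estimators=100, max_features="sqrt", random_state=0
).fit(X_train, y_train)

# 1 (int): a single randomly chosen feature per split (PRF-style baseline).
et_prf = ExtraTreesRegressor(
    n_estimators=100, max_features=1, random_state=0
).fit(X_train, y_train)

# Each fitted tree records the resolved number of candidate features.
k_sqrt = et_sqrt.estimators_[0].max_features_
k_prf = et_prf.estimators_[0].max_features_

print("candidates per split:", k_sqrt, k_prf)
print("test R^2:", et_sqrt.score(X_test, y_test), et_prf.score(X_test, y_test))
```

With a single candidate feature and a random threshold, the choice of each split no longer depends on the target, which is what makes the resulting structure fully random.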

When to use

Quick non-linear baseline; large ensemble experiments; PRF baseline (max_features=1).

Set params.family = "extra_trees" in the relevant layer to activate this op within a recipe:

# Layer L4 recipe fragment
params:
  family: extra_trees

macroforecast design Part 2, L4: ‘forecasting model is the layer where every authoring iteration ends – pick family, tune, repeat.’
Geurts, Ernst & Wehenkel (2006) ‘Extremely randomized trees’, Machine Learning 63(1).