# Custom Function Quickstart

macroforecast provides three first-class Python extension points for custom functions. Register once in Python, reference by name in a recipe or experiment.

For full contract details see [Custom Extensions](custom_hooks.md).
For target-transformer runtime rules see [Target transformer](target_transformer.md).

## When to use which hook

| You want to... | Use |
|---|---|
| Replace the forecasting estimator (custom loss, recurrent rule, ML model) | `custom_model` |
| Post-process the feature matrix after Layer 2 construction | `custom_preprocessor` |
| Transform the training target (then inverse-transform predictions) | `target_transformer` |

All three hooks are decorator APIs from the `macroforecast.custom` module, re-exported at the top-level `mf` namespace.

## Register a custom model

```python
import macroforecast as mf

@mf.custom_model("my_ridge")
def my_ridge(X_train, y_train, X_test, context):
    """Minimal custom ridge-like estimator."""
    from sklearn.linear_model import Ridge
    model = Ridge(alpha=1.0)
    model.fit(X_train, y_train)
    return float(model.predict(X_test)[0])
```

Contract:

```python
fn(X_train, y_train, X_test, context) -> scalar or one-element sequence
```

- `X_train`: `(n_train, n_features)` array -- training features
- `y_train`: `(n_train,)` array -- training target (already split by the runtime)
- `X_test`: `(1, n_features)` array -- one test row per forecast origin
- `context`: dict with `model_name`, `target`, `horizon`, `feature_names`, and other runtime metadata

Rules:

- Fit only on `X_train` / `y_train`. Never read future rows or full-sample statistics.
- Return a scalar or a one-element array/sequence.
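
The return rule above (a scalar or a one-element array/sequence) can be checked locally before a function is registered. The following is a minimal sketch, not part of macroforecast: `coerce_forecast` and `last_value_model` are hypothetical names, the `context` dict holds only an illustrative subset of keys, and how the runtime itself validates return values is not specified here. Plain Python lists stand in for the arrays the runtime passes, so the sketch has no dependencies.

```python
from numbers import Real


def coerce_forecast(value):
    """Return `value` as a float if it is a scalar or a one-element sequence.

    Hypothetical helper for local testing; not a macroforecast API.
    """
    if isinstance(value, Real):
        return float(value)
    try:
        n = len(value)
    except TypeError:
        raise TypeError(
            f"forecast must be a scalar or a sequence, got {type(value).__name__}"
        ) from None
    if n != 1:
        raise ValueError(f"forecast must have exactly one element, got {n}")
    return float(value[0])


def last_value_model(X_train, y_train, X_test, context):
    """Naive baseline that follows the documented four-argument signature."""
    return [y_train[-1]]  # a one-element sequence is allowed by the contract


# Toy inputs: three training rows, one test row (lists stand in for arrays).
X_train = [[0.1, 1.0], [0.2, 0.9], [0.3, 0.8]]
y_train = [1.5, 1.7, 2.0]
X_test = [[0.4, 0.7]]
context = {"model_name": "last_value", "horizon": 1}  # illustrative keys only

forecast = coerce_forecast(last_value_model(X_train, y_train, X_test, context))
print(forecast)
```

Running a check like this in a unit test catches the "wrong shape" failure before the function is ever called by the runtime.
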

Use in a recipe (YAML):

```yaml
4_forecasting_model:
  nodes:
    - id: fit_custom
      type: step
      op: fit_model
      params: {family: my_ridge, min_train_size: 24, forecast_strategy: direct,
               training_start_rule: expanding, refit_policy: every_origin,
               search_algorithm: none}
      inputs: [src_X, src_y]
```

> **YAML + Python**: YAML selects the registered name via `family: my_ridge`.
> The Python file that registers `my_ridge` must be imported **before**
> `mf.run()` is called. YAML does not import Python modules automatically.

Use with `Experiment`:

```python
result = (
    mf.Experiment(
        dataset="fred_md",
        target="CPIAUCSL",
        start="1990-01",
        end="2019-12",
        horizons=[1, 3, 6],
        model_family="my_ridge",
    )
    .run()
)
```

## Register a custom preprocessor

```python
import macroforecast as mf
import numpy as np

@mf.custom_preprocessor("demean")
def demean(X_train, y_train, X_test, context):
    """Remove column means fit on training data only."""
    col_means = X_train.mean(axis=0)
    return X_train - col_means, X_test - col_means
```

Contract:

```python
fn(X_train, y_train, X_test, context) -> (X_train_new, X_test_new)
```

- Fit preprocessing decisions (e.g., column means, PCA components) on `X_train` only -- never on `X_test`.
- `y_train` is read-only context; do not transform it here.
- Return arrays with the same row count as the inputs.
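
The three preprocessor rules above can be illustrated without the runtime. The sketch below is an assumption-laden stand-in rather than a registered hook: `standardize_columns` is a hypothetical name, it is not decorated with `@mf.custom_preprocessor`, and plain lists of rows replace the arrays the runtime passes. It computes column statistics from `X_train` only, reuses them for `X_test`, leaves `y_train` untouched, and preserves row counts.

```python
from statistics import mean, pstdev


def standardize_columns(X_train, y_train, X_test, context):
    """Scale each column with mean/std computed from X_train only."""
    columns = list(zip(*X_train))  # transpose rows into columns
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        # Same training-window statistics are applied to any set of rows.
        return [
            [(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in rows
        ]

    # y_train is read-only context and is deliberately not modified.
    return scale(X_train), scale(X_test)


# Toy inputs: second column is constant in the training window.
X_train = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
y_train = [0.5, 0.6, 0.7]
X_test = [[4.0, 12.0]]

X_train_new, X_test_new = standardize_columns(X_train, y_train, X_test, context={})
print(len(X_train_new), len(X_test_new))
```

Because the test row is scaled with training statistics, its values can fall well outside the training range; that is expected and is what keeps the transformation free of look-ahead.
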

Use with `Experiment`:

```python
result = (
    mf.Experiment(
        dataset="fred_md",
        target="CPIAUCSL",
        start="1990-01",
        end="2019-12",
        horizons=[1, 3, 6],
        model_family="ridge",
    )
    .use_preprocessor("demean")
    .run()
)
```

Use in a recipe (YAML):

```yaml
4_forecasting_model:
  nodes:
    - id: preprocess
      type: step
      op: apply_preprocessor
      params: {name: demean}
      inputs: [src_X, src_y]
```

## Register a target transformer

```python
import macroforecast as mf
import numpy as np

@mf.target_transformer("standardize_target")
class StandardizeTarget:
    """Standardize the training target; inverse-transform predictions."""

    def fit(self, target_train, context):
        self._mean = float(np.mean(target_train))
        self._std = float(np.std(target_train, ddof=1)) or 1.0
        return self

    def transform(self, target, context):
        return (target - self._mean) / self._std

    def inverse_transform_prediction(self, target_pred, context):
        return target_pred * self._std + self._mean
```

Contract: the transformer class must implement three methods:

| Method | Signature | Purpose |
|---|---|---|
| `fit` | `(target_train, context) -> self` | Fit on training window only |
| `transform` | `(target, context) -> target_transformed` | Applied to training target before model fitting |
| `inverse_transform_prediction` | `(target_pred, context) -> target_raw` | Restores predictions to raw scale |

Scale rules (enforced by the runtime):

- Model is trained on the **transformed** target.
- All reported forecasts and metrics are on the **raw** target scale.
- Benchmarks always remain on the raw scale for comparability.

**Current runtime gate**: `target_transformer` is executable only for target-lag and raw-panel feature runtimes with `model_family` in `ols`, `ridge`, `lasso`, `elasticnet`, or a registered `custom_model`. Other feature runtimes reject non-`none` transformer values until their scale contracts are designed.
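
Since reported forecasts must come back on the raw scale, a useful local check for any transformer is a round trip: fit, transform, then inverse-transform, and confirm the original values are recovered. The sketch below assumes nothing about the runtime. `StandardizeTargetSketch` is a hypothetical, unregistered, dependency-free stand-in for the three-method contract, and it assumes targets and predictions arrive as sequences of floats, which this page does not specify.

```python
from statistics import mean, stdev


class StandardizeTargetSketch:
    """Unregistered stand-in that follows the three-method contract."""

    def fit(self, target_train, context):
        # Statistics come from the training window only.
        self._mean = mean(target_train)
        # Sample standard deviation (matches ddof=1); needs >= 2 points.
        self._std = stdev(target_train) or 1.0
        return self

    def transform(self, target, context):
        return [(y - self._mean) / self._std for y in target]

    def inverse_transform_prediction(self, target_pred, context):
        return [p * self._std + self._mean for p in target_pred]


target_train = [2.0, 4.0, 6.0, 8.0]
t = StandardizeTargetSketch().fit(target_train, context={})
transformed = t.transform(target_train, context={})
restored = t.inverse_transform_prediction(transformed, context={})

# The round trip should recover the raw scale up to floating-point error.
max_error = max(abs(a - b) for a, b in zip(restored, target_train))
print(max_error < 1e-9)
```

A transformer that fails this check would produce forecasts and metrics on the wrong scale, so it is worth asserting in a test before wiring the class into a recipe.
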

Use in a recipe (YAML):

```yaml
1_data:
  leaf_config:
    target_transformer: standardize_target
```

> **Note**: `Experiment.use_target_transformer()` is not available in the
> current Python API. Use the YAML recipe path or mutate `to_recipe_dict()`
> before calling `mf.run()`:
>
> ```python
> recipe = exp.to_recipe_dict()
> recipe["1_data"]["leaf_config"]["target_transformer"] = "standardize_target"
> # then run via mf.run() with the dict or write it to YAML
> ```

## Using a custom function with YAML recipes

YAML recipes reference registered names as strings. Python registration must happen in the same process that calls `mf.run()`.

Recommended pattern: keep your custom functions in a dedicated module and import it at the top of your script.

```python
# my_study.py
import custom_functions  # registers @mf.custom_model, @mf.custom_preprocessor, etc.
import macroforecast as mf

result = mf.run("my_study.yaml", output_directory="output/")
```

```python
# custom_functions.py
import macroforecast as mf

@mf.custom_model("my_model")
def my_model(X_train, y_train, X_test, context):
    return float(y_train[-1])  # naive last-value baseline
```

Check what is registered:

```python
print(mf.list_custom_models())
print(mf.list_custom_preprocessors())
print(mf.list_custom_target_transformers())
```

## Common pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| `KeyError: custom model 'my_model' is not registered` | The Python file that calls `@mf.custom_model(...)` was not imported before `mf.run()`. | Import the registration module before `mf.run()`. |
| Custom model returns wrong shape | Return value has more than one element. | Return a scalar or a one-element array: `return float(pred[0])`. |
| Preprocessor causes row count mismatch | `X_train` and `y_train` row counts diverge after transformation. | Never add or remove rows in a preprocessor -- only transform feature values. |
| `target_transformer` rejected at runtime | The feature runtime or model family is outside the supported allowlist (target-lag or raw-panel features with `ols`/`ridge`/`lasso`/`elasticnet`/custom model). | Switch to a supported feature runtime and model family, or set `target_transformer: none`. |
| `name must not start with '_'` | Custom name begins with underscore. | Use a name that starts with a letter: `"my_model"` not `"_my_model"`. |
| Registry survives across test runs | Module-level dicts persist in the same process. | Call `mf.clear_custom_extensions()` in test teardown or notebook re-runs. |

## See also

- [Custom Extensions](custom_hooks.md) -- full five-hook reference with all contract details
- [Target transformer](target_transformer.md) -- scale rules, runtime gate, inverse-transform contract
- [Bring your own data](../for_researchers/user_data_workflow.md) -- CSV/Parquet data format and loading
- [Simple API](../for_researchers/simple_api/index.md) -- `Experiment` and `mf.forecast(...)` walkthrough