custom_dataset#

Back to custom extensions

Use custom data functions when the panel does not come from FRED-MD, FRED-QD, or FRED-SD. The output must still be a canonical DataBundle: a date-indexed numeric panel plus metadata that later stages can read.

Function Choices#

Function	Input	Output	Use case
`mf.data.custom_dataset(...)`	in-memory `DataFrame`	`DataBundle`	User code already loaded the data.
`mf.data.load_custom_csv(...)`	CSV path	`DataBundle`	File source is CSV.
`mf.data.load_custom_parquet(...)`	Parquet path	`DataBundle`	File source is Parquet.

custom_dataset#

mf.data.custom_dataset(
    data,
    *,
    date=None,
    columns=None,
    dataset="custom",
    frequency=None,
    transform_codes=None,
    metadata=None,
) -> mf.data.DataBundle

Input#

Name	Type	Meaning
`data`	`pandas.DataFrame`	User panel. It can contain a date column or already have a `DatetimeIndex`.
`date`	str or `None`	Date column to move into the index. Use `None` when the index is already dates.
`columns`	sequence or `None`	Optional variable subset after date handling.
`dataset`	str	Dataset label stored in metadata.
`frequency`	str or mapping or `None`	Panel frequency such as `monthly`, `quarterly`, or per-column frequency metadata.
`transform_codes`	mapping or `None`	Optional FRED-style transformation code metadata by column.
`metadata`	mapping or `None`	User metadata to merge into the bundle metadata.

Output#

Field	Contract
`bundle.panel`	Numeric `DataFrame`, sorted `DatetimeIndex` named `date`, no duplicate dates.
`bundle.metadata["dataset"]`	Dataset name.
`bundle.metadata["frequency"]`	Dataset-level or column-level frequency information when supplied.
`bundle.metadata["transform_codes"]`	Transform-code metadata when supplied.
`bundle.metadata["panel_normalization"]`	Normalization report for date/index/column conversion.

Flow#

bundle = mf.data.custom_dataset(
    frame,
    date="date",
    dataset="local_macro",
    frequency="monthly",
    transform_codes={"target": 1, "x": 1},
)

processed = mf.preprocessing.reprocess(bundle, transform="none")

File Loaders#

mf.data.load_custom_csv(path, *, date, columns=None, dataset="custom", ...)
mf.data.load_custom_parquet(path, *, date, columns=None, dataset="custom", ...)

The file loaders normalize the same panel contract as custom_dataset(). Use them when the file path itself should appear in the loader metadata.

Validation#

Problem	Behavior
Missing date column	raises loader/normalization error.
Duplicate dates	raises unless permissive mode is explicitly requested by the loader.
Non-numeric selected variables	coerced or reported by panel normalization.
Missing frequency metadata	allowed, but frequency-aware downstream logic may need explicit `set_frequencies(...)`.