API reference¶

Full reference for the calcfidata Python package. Generated from in-source docstrings using mkdocstrings.

`calcfidata` ¶

calcfidata — Python client for CalcFi Open Data.

34 free CC-BY financial and macroeconomic time series mirrored from primary sources (FRED, BLS, Freddie Mac, US Treasury, BEA, World Bank). Lazy-loaded from the Hugging Face dataset mirror so users only download what they need.

Quick start:

>>> import calcfidata as cf
>>> df = cf.load("30-year-fixed")
>>> df.tail()
            date  value     unit
2607  2026-05-15   6.95  percent

>>> series = cf.list_series()
>>> series.head()
                      slug                          title  row_count
0          30-year-fixed   30-Year Fixed Mortgage Rate       2607
1          15-year-fixed   15-Year Fixed Mortgage Rate       1812

Citation: Salmisto, J. (2026). CalcFi Open Data: 34 Free CC-BY Financial and Macro Time Series Mirrored from Primary Sources [Dataset]. Figshare. https://doi.org/10.6084/m9.figshare.32332290

`load(slug)` `cached` ¶

Load a single CalcFi Open Data series by slug.

Parameters:

Name	Type	Description	Default
`slug`	`str`	Series identifier (e.g. `"30-year-fixed"`, `"cpi"`, `"usd-eur"`). See `list_series()` for the full list.	required

Returns:

Type	Description
`DataFrame`	Columns: `date` (datetime), `value` (float), `unit` (str). Comment-header lines from the source CSV are stripped.

Raises:

Type	Description
`HTTPError`	If the slug is unknown (the underlying CSV returns 404).

Examples:

>>> df = load("30-year-fixed")
>>> df.dtypes
date     datetime64[ns]
value           float64
unit             object

Source code in python/calcfidata/client.py

@lru_cache(maxsize=64)
def load(slug: str) -> pd.DataFrame:
    """
    Load a single CalcFi Open Data series by slug.

    Parameters
    ----------
    slug : str
        Series identifier (e.g. ``"30-year-fixed"``, ``"cpi"``, ``"usd-eur"``).
        See :func:`list_series` for the full list.

    Returns
    -------
    pandas.DataFrame
        Columns: ``date`` (datetime), ``value`` (float), ``unit`` (str).
        Comment-header lines from the source CSV are stripped.

    Raises
    ------
    requests.HTTPError
        If the slug is unknown (the underlying CSV returns 404).

    Examples
    --------
    >>> df = load("30-year-fixed")
    >>> df.dtypes
    date     datetime64[ns]
    value           float64
    unit             object
    """
    url = f"{HF_BASE}/{slug}/data.csv"
    raw = _get(url)
    df = pd.read_csv(io.BytesIO(raw), comment="#")
    if "date" in df.columns:
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
    if "value" in df.columns:
        df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.dropna(subset=[c for c in ("date", "value") if c in df.columns])
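
Because `load` is wrapped in `lru_cache`, repeated calls with the same slug return the same object rather than a fresh copy. The sketch below illustrates that behavior with a hypothetical stand-in, `load_stub`, which returns a plain list instead of a DataFrame so it runs without network access; the sample row is taken from the quick start output.

```python
from functools import lru_cache


@lru_cache(maxsize=64)
def load_stub(slug):
    """Hypothetical stand-in for cf.load(); returns a mutable object."""
    return [{"date": "2026-05-15", "value": 6.95, "unit": "percent"}]


first = load_stub("30-year-fixed")
second = load_stub("30-year-fixed")
print(first is second)  # True: both names point at the one cached object

# A mutation through one reference is visible to every later caller.
first[0]["value"] = 0.0
print(load_stub("30-year-fixed")[0]["value"])  # 0.0

# cache_clear() drops the stored results, so the next call recomputes.
load_stub.cache_clear()
print(load_stub("30-year-fixed")[0]["value"])  # 6.95
```

With the real package, the same reasoning suggests calling `cf.load(slug).copy()` before modifying a frame in place, and `cf.load.cache_clear()` when a fresh download is wanted.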

`list_series()` `cached` ¶

Return the full catalog of available series.

Returns:

Type	Description
`DataFrame`	Columns: `slug`, `title`, `row_count`, `source`, `primary_url`, `unit`.

Examples:

>>> cat = list_series()
>>> cat.query("row_count > 1000").shape[0]
13

Source code in python/calcfidata/client.py

@lru_cache(maxsize=1)
def list_series() -> pd.DataFrame:
    """
    Return the full catalog of available series.

    Returns
    -------
    pandas.DataFrame
        Columns: ``slug``, ``title``, ``row_count``, ``source``,
        ``primary_url``, ``unit``.

    Examples
    --------
    >>> cat = list_series()
    >>> cat.query("row_count > 1000").shape[0]
    13
    """
    index_url = f"{HF_BASE}/INDEX.json"
    raw = _get(index_url)
    index = json.loads(raw)
    rows = []
    for item in index["datasets"]:
        slug = item["slug"]
        try:
            meta = metadata(slug)
        except Exception:
            meta = {}
        sources = meta.get("sources") or [{}]
        rows.append(
            {
                "slug": slug,
                "title": meta.get("title", slug),
                "row_count": item.get("rows", 0),
                "source": sources[0].get("title", ""),
                "primary_url": sources[0].get("path", ""),
                "unit": (meta.get("schema") or {}).get("fields", [{}, {}])[1].get("unit", "")
                if (meta.get("schema") or {}).get("fields") else "",
            }
        )
    return pd.DataFrame(rows)
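
The row-building step above can be read in isolation. The following is a minimal sketch of it as a standalone function, assuming the same key names the source reads (`rows`, `sources`, `schema`, `fields`). The helper name `catalog_row` and the sample values are hypothetical, and unlike the source it also guards against a schema with fewer than two fields.

```python
def catalog_row(item, meta):
    """Build one catalog row from an INDEX.json entry and its metadata dict."""
    sources = meta.get("sources") or [{}]
    fields = (meta.get("schema") or {}).get("fields") or []
    # The unit is read from the second field (the value column) when present.
    unit = fields[1].get("unit", "") if len(fields) > 1 else ""
    return {
        "slug": item["slug"],
        "title": meta.get("title", item["slug"]),
        "row_count": item.get("rows", 0),
        "source": sources[0].get("title", ""),
        "primary_url": sources[0].get("path", ""),
        "unit": unit,
    }


# Hypothetical sample data, shaped like the structures the source code reads.
item = {"slug": "cpi", "rows": 950}
meta = {
    "title": "Consumer Price Index",
    "sources": [{"title": "BLS via FRED (CPIAUCSL)", "path": "https://example.invalid/cpi"}],
    "schema": {"fields": [{"name": "date"}, {"name": "value", "unit": "index"}]},
}

full = catalog_row(item, meta)
bare = catalog_row(item, {})  # what a row looks like when metadata() failed
print(full["unit"], repr(bare["title"]))
```

When the metadata lookup fails, the row falls back to the slug as its title and empty strings elsewhere, which is why some catalog rows may have blank `source` or `unit` values.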

`metadata(slug)` `cached` ¶

Return the Frictionless datapackage.json metadata for a series.

Includes the primary-source URL, the canonical CalcFi page, the license, keywords, and the schema for the CSV.

Examples:

>>> meta = metadata("cpi")
>>> meta["sources"][0]["title"]
'BLS via FRED (CPIAUCSL)'

Source code in python/calcfidata/client.py

@lru_cache(maxsize=64)
def metadata(slug: str) -> dict:
    """
    Return the Frictionless ``datapackage.json`` metadata for a series.

    Includes the primary-source URL, the canonical CalcFi page, the license,
    keywords, and the schema for the CSV.

    Examples
    --------
    >>> meta = metadata("cpi")
    >>> meta["sources"][0]["title"]
    'BLS via FRED (CPIAUCSL)'
    """
    url = f"{HF_BASE}/{slug}/datapackage.json"
    raw = _get(url)
    return json.loads(raw)

`multi(slugs, align_on='outer')` ¶

Load multiple series and join them into a single wide DataFrame.

Parameters:

Name	Type	Description	Default
`slugs`	`iterable of str`	Series identifiers.	required
`align_on`	`{"outer", "inner", "left"}`	Join type, passed as `how` to `DataFrame.join`. `"outer"` keeps all dates (NaN where a series has no observation), `"inner"` keeps only dates where every series has a value, and `"left"` keeps the dates of the first slug.	`"outer"`

Returns:

Type	Description
`DataFrame`	Indexed on `date`; one column per slug containing that slug's value.

Examples:

>>> df = multi(["cpi", "pce"], align_on="inner")
>>> df.columns.tolist()
['cpi', 'pce']

Source code in python/calcfidata/client.py

def multi(slugs: Iterable[str], align_on: str = "outer") -> pd.DataFrame:
    """
    Load multiple series and join them into a single wide DataFrame.

    Parameters
    ----------
    slugs : iterable of str
        Series identifiers.
    align_on : {"outer", "inner", "left"}
        Join type, passed as ``how`` to ``DataFrame.join``. ``"outer"`` keeps
        all dates (NaN where a series has no observation), ``"inner"`` keeps
        only dates where every series has a value, and ``"left"`` keeps the
        dates of the first slug.

    Returns
    -------
    pandas.DataFrame
        Indexed on ``date``; one column per slug containing that slug's value.

    Examples
    --------
    >>> df = multi(["cpi", "pce"], align_on="inner")
    >>> df.columns.tolist()
    ['cpi', 'pce']
    """
    if align_on not in {"outer", "inner", "left"}:
        raise ValueError(f"align_on must be outer|inner|left, got {align_on!r}")
    frames = []
    for slug in slugs:
        df = load(slug)[["date", "value"]].rename(columns={"value": slug})
        frames.append(df.set_index("date"))
    if not frames:
        return pd.DataFrame()
    result = frames[0]
    for df in frames[1:]:
        result = result.join(df, how=align_on)
    return result.sort_index()
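
To make the three `align_on` modes concrete, here is a small pure-Python sketch of the date alignment, using dictionaries in place of DataFrames and `None` in place of NaN. The function `align` and the sample values are hypothetical and only mirror the join semantics; they are not part of the package.

```python
def align(series, how="outer"):
    """Align {slug: {date: value}} on dates; returns {date: {slug: value}}."""
    if how not in {"outer", "inner", "left"}:
        raise ValueError(f"how must be outer|inner|left, got {how!r}")
    slugs = list(series)
    if not slugs:
        return {}
    dates = set(series[slugs[0]])
    for slug in slugs[1:]:
        other = set(series[slug])
        if how == "outer":
            dates |= other  # union: keep every date seen in any series
        elif how == "inner":
            dates &= other  # intersection: keep dates present in all series
        # "left": keep only the first series' dates, so nothing to do here
    # ISO date strings sort chronologically, matching sort_index().
    return {d: {s: series[s].get(d) for s in slugs} for d in sorted(dates)}


# Hypothetical observations, not real data.
sample = {
    "cpi": {"2026-01-01": 1.0, "2026-02-01": 2.0},
    "pce": {"2026-02-01": 20.0, "2026-03-01": 30.0},
}

print(list(align(sample, "outer")))  # ['2026-01-01', '2026-02-01', '2026-03-01']
print(list(align(sample, "inner")))  # ['2026-02-01']
print(align(sample, "left")["2026-01-01"])  # {'cpi': 1.0, 'pce': None}
```

The choice matters most when mixing frequencies: joining a weekly series with a monthly one using `"inner"` keeps only the dates the two happen to share.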

Module constants¶

The package exposes a few module-level constants, so you can build URLs and report version or citation details without hard-coding them:

Name	Description
`calcfidata.DATASET_URL`	Canonical Hugging Face dataset page
`calcfidata.HF_BASE`	Base URL for resolving series CSV/JSON files
`calcfidata.__version__`	Installed package version
`calcfidata.__doi__`	Permanent Figshare DOI of the underlying dataset
`calcfidata.__orcid__`	Author ORCID iD
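
As a rough illustration, the file URLs for a series can be assembled from the base URL using the same path layout the source code uses (`{slug}/data.csv`, `{slug}/datapackage.json`, and `INDEX.json`). The helper `series_urls` is hypothetical, and the base URL below is a placeholder rather than the real value of `calcfidata.HF_BASE`.

```python
def series_urls(base, slug):
    """Return the file URLs for one series under the given base URL."""
    base = base.rstrip("/")  # tolerate a trailing slash on the base
    return {
        "csv": f"{base}/{slug}/data.csv",
        "metadata": f"{base}/{slug}/datapackage.json",
        "index": f"{base}/INDEX.json",
    }


# Placeholder base; in real use pass calcfidata.HF_BASE instead.
PLACEHOLDER_BASE = "https://example.invalid/calcfi/"

urls = series_urls(PLACEHOLDER_BASE, "cpi")
print(urls["csv"])  # https://example.invalid/calcfi/cpi/data.csv
```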

Error handling¶

`calcfidata.load(slug)` raises `requests.HTTPError` if the slug is unknown, because the underlying CSV request returns a 404 from Hugging Face. Wrap the call in `try`/`except` when slugs come from user input:

import requests
import calcfidata as cf

user_input = "30-year-fxed"  # example of a mistyped, user-supplied slug

try:
    df = cf.load(user_input)
except requests.HTTPError:
    print(f"Series '{user_input}' not found. See cf.list_series() for valid slugs.")
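
For a whole list of user-supplied slugs, one possible pattern is to collect the failures instead of stopping at the first one. The sketch below takes the loader and the exception type as arguments so it can run offline; `load_available` and `fake_load` are hypothetical names, and `LookupError` stands in for `requests.HTTPError`.

```python
def load_available(slugs, loader, not_found):
    """Load every slug that resolves; return (loaded, missing)."""
    loaded, missing = {}, []
    for slug in slugs:
        try:
            loaded[slug] = loader(slug)
        except not_found:
            missing.append(slug)  # remember the bad slug and keep going
    return loaded, missing


def fake_load(slug):
    """Hypothetical stand-in for cf.load(); knows two slugs only."""
    known = {"cpi": [1.0, 2.0], "pce": [3.0]}
    if slug not in known:
        raise LookupError(slug)
    return known[slug]


loaded, missing = load_available(["cpi", "typo", "pce"], fake_load, LookupError)
print(sorted(loaded), missing)  # ['cpi', 'pce'] ['typo']
```

With the real package the call would look like `load_available(slugs, cf.load, requests.HTTPError)`.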

Threading¶

The internal `requests.get` calls are not protected by a lock, but `functools.lru_cache` is thread-safe in CPython, so you can call `cf.load()` concurrently from multiple threads. If the cache is cold, two threads asking for the same slug may each issue a network request, but both receive equivalent data and the cache ends up with a single entry per slug.
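
A minimal sketch of concurrent loading with a thread pool follows. It uses a hypothetical cached stand-in, `load_stub`, so it runs without network access; in real use the same pattern would apply with `cf.load` passed to `pool.map`.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=64)
def load_stub(slug):
    """Hypothetical stand-in for cf.load(); no network access."""
    return {"slug": slug, "rows": len(slug)}


slugs = ["30-year-fixed", "15-year-fixed", "cpi"]

# pool.map yields results in input order, so zip pairs each slug with its result.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = dict(zip(slugs, pool.map(load_stub, slugs)))

print(sorted(results))  # ['15-year-fixed', '30-year-fixed', 'cpi']
```

Since the work is network-bound, threads are a reasonable fit here; the pool size of 3 is an arbitrary choice for the example.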

Versioning¶

calcfidata follows Semantic Versioning. The data schema is part of the contract: changes to column names or types require a major version bump. New series are additive (minor bump). Bug fixes and provenance-comment improvements are patch bumps.

See the changelog for release notes.
