API reference

Full reference for the calcfidata Python package. Generated from in-source docstrings using mkdocstrings.

calcfidata

calcfidata — Python client for CalcFi Open Data.

34 free CC-BY financial and macroeconomic time series mirrored from primary sources (FRED, BLS, Freddie Mac, US Treasury, BEA, World Bank). Lazy-loaded from the Hugging Face dataset mirror so users only download what they need.

Quick start:

>>> import calcfidata as cf
>>> df = cf.load("30-year-fixed")
>>> df.tail()
            date  value     unit
2607  2026-05-15   6.95  percent

>>> series = cf.list_series()
>>> series.head()
                      slug                          title  row_count
0          30-year-fixed   30-Year Fixed Mortgage Rate       2607
1          15-year-fixed   15-Year Fixed Mortgage Rate       1812

Citation: Salmisto, J. (2026). CalcFi Open Data: 34 Free CC-BY Financial and Macro Time Series Mirrored from Primary Sources [Dataset]. Figshare. https://doi.org/10.6084/m9.figshare.32332290

load(slug) cached

Load a single CalcFi Open Data series by slug.

Parameters:

Name Type Description Default
slug str Series identifier (e.g. "30-year-fixed", "cpi", "usd-eur"). See list_series() for the full list. required

Returns:

Type Description
pandas.DataFrame Columns: date (datetime), value (float), unit (str). Comment-header lines from the source CSV are stripped.

Raises:

Type Description
requests.HTTPError If the slug is unknown (the underlying CSV returns 404).

Examples:

>>> df = load("30-year-fixed")
>>> df.dtypes
date     datetime64[ns]
value           float64
unit             object
Source code in python/calcfidata/client.py
@lru_cache(maxsize=64)
def load(slug: str) -> pd.DataFrame:
    """
    Load a single CalcFi Open Data series by slug.

    Parameters
    ----------
    slug : str
        Series identifier (e.g. ``"30-year-fixed"``, ``"cpi"``, ``"usd-eur"``).
        See :func:`list_series` for the full list.

    Returns
    -------
    pandas.DataFrame
        Columns: ``date`` (datetime), ``value`` (float), ``unit`` (str).
        Comment-header lines from the source CSV are stripped.

    Raises
    ------
    requests.HTTPError
        If the slug is unknown (the underlying CSV returns 404).

    Examples
    --------
    >>> df = load("30-year-fixed")
    >>> df.dtypes
    date     datetime64[ns]
    value           float64
    unit             object
    """
    url = f"{HF_BASE}/{slug}/data.csv"
    raw = _get(url)
    df = pd.read_csv(io.BytesIO(raw), comment="#")
    if "date" in df.columns:
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
    if "value" in df.columns:
        df["value"] = pd.to_numeric(df["value"], errors="coerce")
    return df.dropna(subset=[c for c in ("date", "value") if c in df.columns])
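
The parsing steps in load() can be tried offline. The following is a minimal sketch, assuming pandas is installed; the function name parse_series and the sample CSV are made up for illustration and are not part of calcfidata. It shows how comment="#" skips the provenance header and how errors="coerce" followed by dropna removes rows whose date or value cannot be parsed.

```python
import io

import pandas as pd

# Made-up sample in the shape load() expects: "#" comment lines first,
# then date,value,unit rows. The values are illustrative, not real data.
SAMPLE_CSV = b"""# source: example provenance comment
# license: CC-BY
date,value,unit
2024-01-04,6.62,percent
2024-01-11,not-a-number,percent
not-a-date,6.60,percent
2024-01-25,6.69,percent
"""


def parse_series(raw: bytes) -> pd.DataFrame:
    """Hypothetical helper mirroring the parsing steps of load()."""
    # Lines that start with "#" are skipped before the header is read.
    df = pd.read_csv(io.BytesIO(raw), comment="#")
    # Unparseable dates become NaT and unparseable values become NaN ...
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    df["value"] = pd.to_numeric(df["value"], errors="coerce")
    # ... and those rows are then dropped.
    return df.dropna(subset=["date", "value"])


df = parse_series(SAMPLE_CSV)
print(len(df))  # 2
```

Only the two fully valid rows survive, which is why a loaded series can contain fewer rows than its source CSV.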

list_series() cached

Return the full catalog of available series.

Returns:

Type Description
pandas.DataFrame Columns: slug, title, row_count, source, primary_url, unit.

Examples:

>>> cat = list_series()
>>> cat.query("row_count > 1000").shape[0]
13
Source code in python/calcfidata/client.py
@lru_cache(maxsize=1)
def list_series() -> pd.DataFrame:
    """
    Return the full catalog of available series.

    Returns
    -------
    pandas.DataFrame
        Columns: ``slug``, ``title``, ``row_count``, ``source``,
        ``primary_url``, ``unit``.

    Examples
    --------
    >>> cat = list_series()
    >>> cat.query("row_count > 1000").shape[0]
    13
    """
    index_url = f"{HF_BASE}/INDEX.json"
    raw = _get(index_url)
    index = json.loads(raw)
    rows = []
    for item in index["datasets"]:
        slug = item["slug"]
        try:
            meta = metadata(slug)
        except Exception:
            meta = {}
        sources = meta.get("sources") or [{}]
        rows.append(
            {
                "slug": slug,
                "title": meta.get("title", slug),
                "row_count": item.get("rows", 0),
                "source": sources[0].get("title", ""),
                "primary_url": sources[0].get("path", ""),
                "unit": (meta.get("schema") or {}).get("fields", [{}, {}])[1].get("unit", "")
                if (meta.get("schema") or {}).get("fields") else "",
            }
        )
    return pd.DataFrame(rows)
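
How one catalog row is assembled from an INDEX.json entry and a datapackage dict can be sketched without network access. This is an illustration only: catalog_row and both sample dicts are hypothetical, and the explicit length check on the schema fields is an addition of this sketch rather than something the source above does.

```python
from typing import Any


def catalog_row(slug: str, index_item: dict[str, Any], meta: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: build one catalog row from index and metadata dicts."""
    sources = meta.get("sources") or [{}]
    fields = (meta.get("schema") or {}).get("fields") or []
    # The unit is read from the second schema field (the value column).
    # The length check is this sketch's own guard for short schemas.
    unit = fields[1].get("unit", "") if len(fields) > 1 else ""
    return {
        "slug": slug,
        "title": meta.get("title", slug),
        "row_count": index_item.get("rows", 0),
        "source": sources[0].get("title", ""),
        "primary_url": sources[0].get("path", ""),
        "unit": unit,
    }


# Made-up inputs for illustration; not real catalog content.
sample_meta = {
    "title": "Example Series",
    "sources": [{"title": "Example Source", "path": "https://example.invalid/src"}],
    "schema": {"fields": [{"name": "date"}, {"name": "value", "unit": "percent"}]},
}
full_row = catalog_row("example", {"rows": 12}, sample_meta)
bare_row = catalog_row("example", {}, {})  # metadata lookup failed

print(full_row["unit"], full_row["row_count"])
print(bare_row["title"], bare_row["row_count"])
```

When the metadata is empty, as happens in list_series() if the metadata request fails, the row falls back to the slug as its title and to empty strings elsewhere.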

metadata(slug) cached

Return the Frictionless datapackage.json metadata for a series.

Includes the primary-source URL, the canonical CalcFi page, the license, keywords, and the schema for the CSV.

Examples:

>>> meta = metadata("cpi")
>>> meta["sources"][0]["title"]
'BLS via FRED (CPIAUCSL)'
Source code in python/calcfidata/client.py
@lru_cache(maxsize=64)
def metadata(slug: str) -> dict:
    """
    Return the Frictionless ``datapackage.json`` metadata for a series.

    Includes the primary-source URL, the canonical CalcFi page, the license,
    keywords, and the schema for the CSV.

    Examples
    --------
    >>> meta = metadata("cpi")
    >>> meta["sources"][0]["title"]
    'BLS via FRED (CPIAUCSL)'
    """
    url = f"{HF_BASE}/{slug}/datapackage.json"
    raw = _get(url)
    return json.loads(raw)
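
Because metadata() is wrapped in lru_cache, repeated calls with the same slug return the same dict object, so in-place edits would be visible to every later caller. The sketch below demonstrates this with a hypothetical stand-in function (fake_metadata and its return value are made up) and shows copying before modifying.

```python
import copy
from functools import lru_cache


@lru_cache(maxsize=64)
def fake_metadata(slug: str) -> dict:
    """Hypothetical stand-in for metadata(); returns a made-up dict."""
    return {"name": slug, "keywords": ["example"]}


first = fake_metadata("cpi")
second = fake_metadata("cpi")
print(first is second)  # True: a cache hit returns the same object

# Work on a deep copy so local edits do not leak into the cached value.
mine = copy.deepcopy(first)
mine["keywords"].append("local-note")
print(fake_metadata("cpi")["keywords"])  # ['example']
```

The same consideration applies to the DataFrames returned by the cached load() and list_series(); calling .copy() on the result before modifying it in place is one way to handle it.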

multi(slugs, align_on='outer')

Load multiple series and join them into a single wide DataFrame.

Parameters:

Name Type Description Default
slugs iterable of str Series identifiers. required
align_on {"outer", "inner", "left"} How the series are joined on date (passed as the how argument of pandas DataFrame.join). "outer" keeps all dates (NaN where a series has no observation), "inner" keeps only dates where every series has a value, and "left" keeps the dates of the first series in slugs. "outer"

Returns:

Type Description
pandas.DataFrame Indexed on date; one column per slug containing that slug's value.

Examples:

>>> df = multi(["cpi", "pce"], align_on="inner")
>>> df.columns.tolist()
['cpi', 'pce']
Source code in python/calcfidata/client.py
def multi(slugs: Iterable[str], align_on: str = "outer") -> pd.DataFrame:
    """
    Load multiple series and join them into a single wide DataFrame.

    Parameters
    ----------
    slugs : iterable of str
        Series identifiers.
    align_on : {"outer", "inner", "left"}
        Pandas merge ``how`` argument. ``"outer"`` keeps all dates (NaN where a
        series doesn't have an observation), ``"inner"`` keeps only dates where
        every series has a value.

    Returns
    -------
    pandas.DataFrame
        Indexed on ``date``; one column per slug containing that slug's value.

    Examples
    --------
    >>> df = multi(["cpi", "pce"], align_on="inner")
    >>> df.columns.tolist()
    ['cpi', 'pce']
    """
    if align_on not in {"outer", "inner", "left"}:
        raise ValueError(f"align_on must be outer|inner|left, got {align_on!r}")
    frames = []
    for slug in slugs:
        df = load(slug)[["date", "value"]].rename(columns={"value": slug})
        frames.append(df.set_index("date"))
    if not frames:
        return pd.DataFrame()
    result = frames[0]
    for df in frames[1:]:
        result = result.join(df, how=align_on)
    return result.sort_index()
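
The effect of the three align_on values can be seen on two small frames. This is a sketch assuming pandas is installed; the frames a and b and their values are made up and stand in for two date-indexed series as multi() prepares them.

```python
import pandas as pd

# Made-up values; "a" and "b" stand in for two series slugs.
a = pd.DataFrame(
    {"date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
     "a": [1.0, 2.0, 3.0]}
).set_index("date")
b = pd.DataFrame(
    {"date": pd.to_datetime(["2024-02-01", "2024-03-01", "2024-04-01"]),
     "b": [20.0, 30.0, 40.0]}
).set_index("date")

outer = a.join(b, how="outer").sort_index()  # union of dates, NaN where missing
inner = a.join(b, how="inner").sort_index()  # only dates present in both
left = a.join(b, how="left").sort_index()    # dates of the first frame only

print(len(outer), len(inner), len(left))  # 4 2 3
```

With "left", the order of slugs matters, since the first series determines which dates are kept.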

Module constants

The package exposes a few module-level constants you can use to construct URLs without hard-coding:

Name Description
calcfidata.DATASET_URL Canonical Hugging Face dataset page
calcfidata.HF_BASE Base URL for resolving series CSV/JSON files
calcfidata.__version__ Installed package version
calcfidata.__doi__ Permanent Figshare DOI of the underlying dataset
calcfidata.__orcid__ Author ORCID iD
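
The URL layout used in client.py ({HF_BASE}/{slug}/data.csv, {HF_BASE}/{slug}/datapackage.json, and {HF_BASE}/INDEX.json) can be reproduced with a small helper. The sketch below is illustrative: series_urls is a hypothetical name, the base URL is a placeholder rather than the real value of HF_BASE, and stripping a trailing slash is this sketch's own choice.

```python
def series_urls(base: str, slug: str) -> dict[str, str]:
    """Hypothetical helper: build the file URLs for one series from a base URL."""
    base = base.rstrip("/")  # tolerate a trailing slash (sketch's own choice)
    return {
        "csv": f"{base}/{slug}/data.csv",
        "metadata": f"{base}/{slug}/datapackage.json",
        "index": f"{base}/INDEX.json",
    }


# Placeholder base URL for illustration; not the real HF_BASE value.
urls = series_urls("https://example.invalid/calcfi/", "cpi")
print(urls["csv"])  # https://example.invalid/calcfi/cpi/data.csv
```

With the package installed, passing calcfidata.HF_BASE as the base avoids hard-coding the host.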

Error handling

calcfidata.load(slug) raises requests.HTTPError if the slug is unknown, because the request for the underlying CSV returns a 404 from Hugging Face. Wrap calls in try/except when the slug comes from user input:

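import requests
import calcfidata as cf

user_input = "not-a-real-slug"  # placeholder for a slug supplied by the user

try:
    df = cf.load(user_input)
except requests.HTTPError:
    # Raised when the CSV for this slug cannot be fetched (e.g. a 404).
    print(f"Series '{user_input}' not found. See cf.list_series() for valid slugs.")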
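
For a whole list of user-supplied slugs, it can be convenient to keep going after a failure and report the bad slugs at the end. The following is one possible sketch, not part of calcfidata: load_available is a hypothetical helper that takes the loader and the exception type as parameters, and the fake loader exists only so the example runs offline.

```python
from typing import Any, Callable, Iterable


def load_available(
    slugs: Iterable[str],
    loader: Callable[[str], Any],
    not_found: type[BaseException],
) -> tuple[dict[str, Any], list[str]]:
    """Hypothetical helper: load every slug it can, collect those that fail."""
    loaded: dict[str, Any] = {}
    missing: list[str] = []
    for slug in slugs:
        try:
            loaded[slug] = loader(slug)
        except not_found:
            missing.append(slug)
    return loaded, missing


class FakeNotFound(Exception):
    """Stand-in for requests.HTTPError in this offline sketch."""


def fake_loader(slug: str) -> str:
    """Stand-in for cf.load(); knows only two made-up slugs."""
    if slug not in {"cpi", "pce"}:
        raise FakeNotFound(slug)
    return f"data for {slug}"


loaded, missing = load_available(["cpi", "typo-slug", "pce"], fake_loader, FakeNotFound)
print(sorted(loaded), missing)  # ['cpi', 'pce'] ['typo-slug']
```

With the real package this would be called as load_available(user_slugs, cf.load, requests.HTTPError).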

Threading

The internal requests.get calls are not protected by a lock, but functools.lru_cache is thread-safe in CPython, so cf.load() can be called concurrently from multiple threads. If the cache is cold, concurrent calls for the same slug may each fire their own network request, but every caller receives the same data.
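
The cold-cache behavior can be observed with a stand-in function. In this sketch, slow_fetch is hypothetical and simulates a network call with a short sleep; the counter and its lock exist only to measure how often the wrapped function actually runs.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

call_count = 0
count_lock = threading.Lock()


@lru_cache(maxsize=64)
def slow_fetch(slug: str) -> str:
    """Hypothetical stand-in for a cached network fetch."""
    global call_count
    with count_lock:
        call_count += 1
    time.sleep(0.05)  # simulate network latency
    return f"payload for {slug}"


# Eight threads request the same slug while the cache is still cold.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(slow_fetch, ["cpi"] * 8))

print(len(set(results)))  # 1: every caller got the same value
print(call_count)         # anywhere from 1 to 8, depending on timing

calls_before = call_count
slow_fetch("cpi")  # the cache is warm now, so this is a hit
print(call_count == calls_before)  # True
```

If duplicate requests matter, loading each slug once from a single thread before starting the workers is one simple way to warm the cache.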

Versioning

calcfidata follows Semantic Versioning. The data schema is part of the contract: changes to column names or types require a major version bump. New series are additive (minor bump). Bug fixes and provenance-comment improvements are patch bumps.
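
Since a schema change implies a major version bump, code that depends on the column layout can compare major versions at startup. This is a minimal sketch with hypothetical helper names and made-up version strings; note that Semantic Versioning gives no stability guarantee for 0.y.z releases, which this check does not special-case.

```python
def major_version(version: str) -> int:
    """Return the leading MAJOR component of a MAJOR.MINOR.PATCH string."""
    return int(version.split(".", 1)[0])


def schema_compatible(installed: str, tested: str) -> bool:
    """Hypothetical check: same major version means same data schema."""
    return major_version(installed) == major_version(tested)


# Made-up version strings for illustration.
print(schema_compatible("1.4.2", "1.0.0"))  # True
print(schema_compatible("2.0.0", "1.9.9"))  # False
```

With the package installed, calcfidata.__version__ would be passed as the installed version.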

See the changelog for release notes.