// LLM-powered data processing engine

Run any LLM on every row of your data.

Ondine is a batch processing engine for tabular data. Feed it a pandas or Polars DataFrame, pick any LLM, get structured columns back. Cost control, checkpointing, and anti-hallucination are built in.

10–100x
fewer API calls
via multi-row batching
40–50%
cost reduction
via prefix caching
99.9%
completion rate
with checkpointing
100+
LLM providers
via LiteLLM

Capabilities

Everything a production pipeline needs.

Not a toy wrapper. Ondine handles checkpointing, budget limits, structured output, and anti-hallucination out of the box.

Multi-row batching
Pack N DataFrame rows into one prompt. The LLM processes an entire batch per API call instead of one row at a time.
→ 10–100x API call reduction
Prefix caching
System prompt + schema cached once across all batches. You only pay for the variable row data each time.
→ 40–50% token cost savings
Checkpointing
Results saved after every batch. Resume interrupted jobs without reprocessing completed rows. Crash-safe by default.
→ 99.9% completion rate
Budget limits
Set a max spend before running. Ondine estimates cost upfront and halts when the limit is reached. No surprise invoices.
→ estimate() before run()
Structured output
Define a Pydantic model. Every row comes back parsed, validated, and typed. No string parsing.
→ Pydantic v2 native
Context Store
Anti-hallucination layer. Validates LLM outputs against known facts from your dataset before writing results.
→ Grounded, verifiable outputs
100+ LLM providers
GPT-5.4, Claude, Gemini, Mistral, local Ollama. All through LiteLLM. Switch with one line.
→ Zero vendor lock-in
Cost estimation
Per-batch, per-row, and total cost breakdowns before you spend a single token.
→ Full transparency
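The multi-row batching capability above can be sketched in a few lines of plain Python. This is an illustrative sketch, not Ondine's internals; `make_batches` and `build_prompt` are hypothetical names:

```python
# Illustrative sketch of multi-row batching (not Ondine's actual internals).
# N rows are rendered into one prompt; the model answers all N per API call.

def make_batches(rows, batch_size):
    """Split rows into chunks of at most batch_size."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

def build_prompt(instruction, batch):
    """Render a batch of rows as a numbered mini-table inside one prompt."""
    lines = [instruction, "", "Rows:"]
    for i, row in enumerate(batch, start=1):
        lines.append(f"{i}. {row}")
    lines.append("")
    lines.append("Answer with one line per row, in order.")
    return "\n".join(lines)

rows = [f"review {n}" for n in range(10_000)]
batches = make_batches(rows, 50)
print(len(batches))  # 200 API calls instead of 10,000
```

With a fixed instruction prefix and only the numbered rows varying per call, this shape is also what makes prefix caching effective.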

Usage

Three lines to start.

Drop into any pandas or Polars workflow. No schemas, no infrastructure, no boilerplate.

import polars as pl  # or: import pandas as pd
from ondine import Ondine

df = pl.read_csv("reviews.csv")  # 10,000 rows — pandas works too

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Classify sentiment: positive, neutral, or negative.",
    batch_size=50,             # 50 rows per API call → 200 calls, not 10,000
)

results = engine.run(df)
print(results[["review", "sentiment"]])

# --- Structured output ---
from pydantic import BaseModel
from ondine import Ondine

class ReviewAnalysis(BaseModel):
    sentiment: str          # "positive" | "neutral" | "negative"
    score: int              # 1–10
    key_topic: str

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Analyze this customer review.",
    output_model=ReviewAnalysis,  # fully typed, validated
    batch_size=50,
)

results = engine.run(df)
# results.sentiment, results.score, results.key_topic — all typed columns

# --- Budget limits & checkpointing ---
engine = Ondine(
    model="gpt-5.4",
    prompt="Summarize this support ticket.",
    batch_size=20,
    max_cost=5.00,             # hard budget limit in USD
)

# Always estimate first
est = engine.estimate(df)
print(f"Estimated: ${est.total_cost:.4f}")
print(f"Batches:   {est.total_batches}")

# Run — stops at $5.00, checkpoints progress
results = engine.run(df, checkpoint="tickets.ckpt.json")

# --- Context Store (anti-hallucination) ---
from ondine import Ondine, ContextStore

# Ground LLM outputs against known facts
store = ContextStore.from_dataframe(
    df,
    key_columns=["employee_id", "department"],
)

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Score employee performance 1–10.",
    context_store=store,          # validates outputs vs. known data
    batch_size=50,
)

results = engine.run(df)
# Hallucinated scores flagged — not silently written
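As a rough sketch of what a grounding check can look like (hypothetical names, not Ondine's actual API): each model output is compared against keys and value ranges known from the source data before it is accepted.

```python
# Hypothetical grounding check (illustrative only, not Ondine's internals):
# reject outputs that reference keys absent from the dataset or values
# outside the expected range, instead of silently writing them.

known_ids = {"E001", "E002", "E003"}  # e.g. employee_id values from df

def validate_output(record, known_ids, lo=1, hi=10):
    """Flag a result whose key is unknown or whose score is out of range."""
    if record["employee_id"] not in known_ids:
        return False, "unknown employee_id (possible hallucination)"
    if not (lo <= record["score"] <= hi):
        return False, "score outside expected range"
    return True, "ok"

ok, reason = validate_output({"employee_id": "E004", "score": 7}, known_ids)
# ok is False: E004 never appeared in the source data, so it is flagged
```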

How it works

How your data flows.

pandas or Polars DataFrame in, AI-enriched DataFrame out. Ondine handles the pipeline between.

01
Pack rows into batches
Instead of 1 row per API call, Ondine packs N rows into a single prompt. The LLM sees a mini-table and returns N results at once. Batch size is tunable per use case.
02
Cache the shared prefix
Your system prompt and output schema are sent once, then cached. Only the batch data varies between calls. This is where prefix caching cuts 40–50% of your token cost.
03
Checkpoint and reassemble
Results are written after every batch. If a job crashes mid-run, restart picks up exactly where it stopped. Final output reassembles into your original DataFrame shape.
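The checkpoint-and-reassemble step can be sketched as a loop that persists results after every batch and skips completed batches on restart. An illustrative sketch with hypothetical names (`run_with_checkpoint`, `process`), not Ondine's internals:

```python
# Illustrative checkpoint/resume loop (not Ondine's actual internals):
# results are persisted after each batch, so a restart skips work already done.
import json
import os

def run_with_checkpoint(batches, process, path="job.ckpt.json"):
    done = {}
    if os.path.exists(path):
        with open(path) as f:
            # JSON object keys are strings; restore integer batch indices
            done = {int(k): v for k, v in json.load(f).items()}
    for i, batch in enumerate(batches):
        if i in done:
            continue  # completed before the crash/restart — skip
        done[i] = process(batch)
        with open(path, "w") as f:
            json.dump(done, f)  # checkpoint after every batch
    # reassemble results in the original row order
    return [row for i in sorted(done) for row in done[i]]
```

A second invocation with the same checkpoint path reprocesses nothing, which is what makes a crashed job resumable at the last completed batch.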

Get started

One command.
LLM-powered data processing.

Works with pandas and Polars, any LLM provider, and your existing data workflow.

$ pip install ondine