// LLM-powered data processing engine

Run any LLM on every row of your data.

Ondine is a batch processing engine for tabular data. Feed it a pandas or Polars DataFrame, pick any LLM, get structured columns back. Cost control, checkpointing, and anti-hallucination are built in.

10–100x
fewer API calls
via multi-row batching
40–50%
cost reduction
via prefix caching
99.9%
completion rate
with checkpointing
100+
LLM providers
via LiteLLM

Capabilities

Everything a production pipeline needs.

Not a toy wrapper. Ondine handles checkpointing, budget limits, structured output, and anti-hallucination out of the box.

Multi-row batching
Pack N DataFrame rows into one prompt. The LLM processes an entire batch per API call instead of one row at a time.
→ 10–100x API call reduction
Prefix caching
System prompt + schema cached once across all batches. You only pay for the variable row data each time.
→ 40–50% token cost savings
Checkpointing
Results saved after every batch. Resume interrupted jobs without reprocessing completed rows. Crash-safe by default.
→ 99.9% completion rate
Budget limits
Set a max spend before running. Ondine estimates cost upfront and halts when the limit is reached. No surprise invoices.
→ estimate() before run()
Structured output
Define a Pydantic model. Every row comes back parsed, validated, and typed. No string parsing.
→ Pydantic v2 native
Context Store
Anti-hallucination layer. Validates LLM outputs against known facts from your dataset before writing results.
→ Grounded, verifiable outputs
100+ LLM providers
GPT-5.4, Claude, Gemini, Mistral, local Ollama. All through LiteLLM. Switch with one line.
→ Zero vendor lock-in
Cost estimation
Per-batch, per-row, and total cost breakdowns before you spend a single token.
→ Full transparency
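The multi-row batching capability above can be sketched in a few lines of plain Python. This is an illustrative sketch, not Ondine's internals; `make_batches` and `build_prompt` are hypothetical names:

```python
# Illustrative sketch of multi-row batching (not Ondine's actual internals).
# N rows are rendered into one prompt; the model answers all N per API call.

def make_batches(rows, batch_size):
    """Split rows into chunks of at most batch_size."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

def build_prompt(instruction, batch):
    """Render a batch of rows as a numbered mini-table inside one prompt."""
    lines = [instruction, "", "Rows:"]
    for i, row in enumerate(batch, start=1):
        lines.append(f"{i}. {row}")
    lines.append("")
    lines.append("Answer with one line per row, in order.")
    return "\n".join(lines)

rows = [f"review {n}" for n in range(10_000)]
batches = make_batches(rows, 50)
print(len(batches))  # 200 API calls instead of 10,000
```

With a fixed instruction prefix and only the numbered rows varying per call, this shape is also what makes prefix caching effective.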

Usage

Three lines to start.

Drop into any pandas or Polars workflow. No schemas, no infrastructure, no boilerplate.

import polars as pl  # or: import pandas as pd
from ondine import Ondine

df = pl.read_csv("reviews.csv")  # 10,000 rows — pandas works too

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Classify sentiment: positive, neutral, or negative.",
    batch_size=50,             # 50 rows per API call → 200 calls, not 10,000
)

results = engine.run(df)
print(results[["review", "sentiment"]])

# --- Structured output ---
from pydantic import BaseModel
from ondine import Ondine

class ReviewAnalysis(BaseModel):
    sentiment: str          # "positive" | "neutral" | "negative"
    score: int              # 1–10
    key_topic: str

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Analyze this customer review.",
    output_model=ReviewAnalysis,  # fully typed, validated
    batch_size=50,
)

results = engine.run(df)
# results.sentiment, results.score, results.key_topic — all typed columns

# --- Budget limits & checkpointing ---
engine = Ondine(
    model="gpt-5.4",
    prompt="Summarize this support ticket.",
    batch_size=20,
    max_cost=5.00,             # hard budget limit in USD
)

# Always estimate first
est = engine.estimate(df)
print(f"Estimated: ${est.total_cost:.4f}")
print(f"Batches:   {est.total_batches}")

# Run — stops at $5.00, checkpoints progress
results = engine.run(df, checkpoint="tickets.ckpt.json")

# --- Context Store (anti-hallucination) ---
from ondine import Ondine, ContextStore

# Ground LLM outputs against known facts
store = ContextStore.from_dataframe(
    df,
    key_columns=["employee_id", "department"],
)

engine = Ondine(
    model="gpt-5.4-mini",
    prompt="Score employee performance 1–10.",
    context_store=store,          # validates outputs vs. known data
    batch_size=50,
)

results = engine.run(df)
# Hallucinated scores flagged — not silently written
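As a rough sketch of what a grounding check can look like (hypothetical names, not Ondine's actual API): each model output is compared against keys and value ranges known from the source data before it is accepted.

```python
# Hypothetical grounding check (illustrative only, not Ondine's internals):
# reject outputs that reference keys absent from the dataset or values
# outside the expected range, instead of silently writing them.

known_ids = {"E001", "E002", "E003"}  # e.g. employee_id values from df

def validate_output(record, known_ids, lo=1, hi=10):
    """Flag a result whose key is unknown or whose score is out of range."""
    if record["employee_id"] not in known_ids:
        return False, "unknown employee_id (possible hallucination)"
    if not (lo <= record["score"] <= hi):
        return False, "score outside expected range"
    return True, "ok"

ok, reason = validate_output({"employee_id": "E004", "score": 7}, known_ids)
# ok is False: E004 never appeared in the source data, so it is flagged
```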

How it works

How your data flows.

pandas or Polars DataFrame in, AI-enriched DataFrame out. Ondine handles the pipeline between.

01
Pack rows into batches
Instead of 1 row per API call, Ondine packs N rows into a single prompt. The LLM sees a mini-table and returns N results at once. Batch size is tunable per use case.
02
Cache the shared prefix
Your system prompt and output schema are sent once, then cached. Only the batch data varies between calls. This is where prefix caching cuts 40–50% of your token cost.
03
Checkpoint and reassemble
Results are written after every batch. If a job crashes mid-run, restart picks up exactly where it stopped. Final output reassembles into your original DataFrame shape.
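The checkpoint-and-reassemble step can be sketched as a loop that persists results after every batch and skips completed batches on restart. An illustrative sketch with hypothetical names (`run_with_checkpoint`, `process`), not Ondine's internals:

```python
# Illustrative checkpoint/resume loop (not Ondine's actual internals):
# results are persisted after each batch, so a restart skips work already done.
import json
import os

def run_with_checkpoint(batches, process, path="job.ckpt.json"):
    done = {}
    if os.path.exists(path):
        with open(path) as f:
            # JSON object keys are strings; restore integer batch indices
            done = {int(k): v for k, v in json.load(f).items()}
    for i, batch in enumerate(batches):
        if i in done:
            continue  # completed before the crash/restart — skip
        done[i] = process(batch)
        with open(path, "w") as f:
            json.dump(done, f)  # checkpoint after every batch
    # reassemble results in the original row order
    return [row for i in sorted(done) for row in done[i]]
```

A second invocation with the same checkpoint path reprocesses nothing, which is what makes a crashed job resumable at the last completed batch.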

Get started

One command.
LLM-powered data processing.

Works with pandas and Polars, any LLM provider, and your existing data workflow.

$ pip install ondine