Field notes

Prompt optimization, evolutionary search, and what we're learning about measuring accuracy in production.

How to evaluate AI support handoff before it costs you customers

A practical checklist for testing whether an AI support agent should answer, ask a clarifying question, create a ticket, or hand off to a human.

2026-05-11 · 4 min read

Case study

Case study — evolving a cold email classifier from 60% to 87% accuracy

Real before-and-after results from our benchmark runs. A three-sentence personalization detector became a multi-criteria rubric. The lift came from structure, not wording.

2026-05-06 · 4 min read

Launch

Introducing Cambrian Lab — prompt optimization with measured accuracy

You paste a prompt, give us five examples, and we hand back a measurably better version. Here's how it works, what it costs, and what we learned building it.

2026-05-05 · 4 min read

Landscape

Prompt optimization in 2026 — a landscape analysis

Eight platforms doing prompt optimization, categorized by approach. What each does well, where each falls short, and what's genuinely missing in the category.

2026-04-28 · 6 min read