Blog

Notes on private AI & the chatbot business

Essays on RAG, BYOK strategy, open-weight models, flat-rate billing, and the unglamorous work of running a chatbot SaaS in 2026.

May 24, 2026

How RAG actually works (in 5 paragraphs, no jargon)

What actually happens between "user asks" and "model answers" in a doc-grounded chatbot.

May 23, 2026

Per-token billing taxes the moments your business is winning. Here's why we don't bill that way.

May 22, 2026

Bring-your-own-key vs. hosted open-weight: when each one is the right call.

May 21, 2026

Two techniques people confuse constantly. A quick guide to which one solves your actual problem.

May 20, 2026

Renting OpenAI vs. operating your own inference: the cost-curve trade-off behind chatbot pricing.

May 18, 2026

CSAT, deflection rate, escalation accuracy. What changes when 60-80% of tier-1 support goes to a bot.