Blog
Notes on private AI & the chatbot business
Essays on RAG, BYOK strategy, open-weight models, flat-rate billing, and the unglamorous work of running a chatbot SaaS in 2026.
May 24, 2026
How RAG actually works (in 5 paragraphs, no jargon)
What actually happens between "user asks" and "model answers" in a doc-grounded chatbot.
RAG, engineering, primer
May 23, 2026
Why we charge flat-rate for AI chatbots (and most competitors don't)
Per-token billing taxes the moments your business is winning. Here's why we don't.
pricing, business model, AI chatbots
May 22, 2026
BYOK vs. hosted models: when each one makes sense
Bring-your-own-key vs. hosted open-weight: when each one is the right call.
BYOK, pricing, engineering
May 21, 2026
RAG vs. fine-tuning for chatbots: which one are you actually doing?
Two techniques people confuse constantly. Quick guide to which one solves your actual problem.
RAG, fine-tuning, engineering
May 20, 2026
Own the inference stack, or rent it? The choice that shapes a chatbot SaaS
Renting OpenAI vs. operating your own inference: the cost-curve trade-off behind chatbot pricing.
infrastructure, pricing, strategy
May 18, 2026
The customer-support metrics that actually matter after you add AI
CSAT, deflection rate, escalation accuracy. What changes when 60-80% of tier-1 goes to a bot.
customer support, metrics, operations
