May 23, 2026 · pricing · business model

Why we charge flat-rate for AI chatbots (and most competitors don't)

Per-token billing is fine for developers. For everyone else, it's a meter that taxes the moments your business is winning.

If you've shopped for a chatbot SaaS in the last year, you've seen a familiar pattern: a free tier with 100 messages, a "Starter" plan at $19/mo with 500 messages, and then per-message overages that scale linearly with your traffic. It looks reasonable. It's designed to look reasonable.

Here's the math problem nobody warns you about: the moments your business does best are the moments your bill spikes. A viral page sends 10x your normal traffic. The chatbot answers 10x the questions. Your bill is 10x. A chatty customer asks 50 follow-up questions instead of 5. Your bill goes up. None of this is wrong, exactly. It's also not what you signed up for.
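To make the arithmetic concrete, here is a minimal sketch of the two billing shapes. The $19/mo base and 500 included messages come from the "Starter" example above, and $50/mo is the flat figure used later in this post. The overage rate and the traffic numbers are assumptions picked for illustration, not anyone's real price list.

```python
# Metered plan: base fee and quota from the "Starter" example in the text.
BASE_FEE = 19.00
INCLUDED_MESSAGES = 500
OVERAGE_PER_MESSAGE = 0.04  # ASSUMPTION: hypothetical overage rate, not a quoted price

# Flat plan: the $50/mo figure used elsewhere in this post.
FLAT_FEE = 50.00


def metered_bill(messages: int) -> float:
    """Monthly cost when every message past the included quota is billed."""
    overage = max(0, messages - INCLUDED_MESSAGES)
    return round(BASE_FEE + overage * OVERAGE_PER_MESSAGE, 2)


def flat_bill(messages: int) -> float:
    """Monthly cost on a flat plan: the same regardless of message count."""
    return FLAT_FEE


normal_month = 2_000             # ASSUMPTION: illustrative baseline traffic
viral_month = normal_month * 10  # the "10x" month described above

for label, messages in [("normal", normal_month), ("viral", viral_month)]:
    print(
        f"{label:>6}: metered ${metered_bill(messages):,.2f}"
        f"  flat ${flat_bill(messages):,.2f}"
    )
```

Under these assumed numbers the metered bill goes from $79 in the normal month to $799 in the viral one, while the flat bill stays at $50 in both. Change the overage rate and the gap moves, but the shape doesn't: one line tracks traffic and the other is a constant.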

What flat-rate actually costs us

Most chatbot SaaS providers route inference through OpenAI or Anthropic. They pay per token, you pay per token (with markup), and the meter is the meter. Ashh.ai runs open-weight models on dedicated infrastructure we operate. Marginal cost per token is electricity, not API spend. Within reasonable usage limits (10k–100k messages/month per tier), we can afford to not pass the meter along.

That doesn't make us better; it makes us different. If you need GPT-4o or Claude Sonnet 4.6, those are cloud models with real per-token costs, so you bring your own key (BYOK): your key, your bill. We don't mark up your provider relationship.

Where the bet pays off

Three places, in our experience:

Procurement. Enterprise buyers can't approve a contract with variable monthly cost. "Up to $50/mo" is a budget item. "Up to $X based on usage" is a research project.
Customer experience. When every chatbot answer gets expensive, the temptation is to throttle it. Lower the message cap. Add a "you've reached your free message limit" gate. The chatbot stops being useful at exactly the moment a user needs it most. Flat-rate removes that temptation.
Trust. You aren't watching a meter spin every time a customer asks a question.

What we give up

We're probably leaving money on the table at the top end. A customer doing 500k messages a month on a $50/mo Team plan is a loss leader for us. The bet is that they're also our best advocate, and that the next 100 customers they bring in pay full rate.

If that math stops working, we'll add a higher tier — not switch to per-token. Per-token billing is a tax on success, and we'd rather lose customers than tax their good days.
