The customer-support metrics that actually matter after you add AI
CSAT, deflection rate, escalation accuracy, time-to-handoff. What changes when 60-80% of your tier-1 tickets go to a chatbot.
When an AI chatbot deflects 60-80% of your tier-1 support tickets, the conventional metrics stop telling you what you need to know. Average response time was 8 hours — now it's 2 seconds. Tickets per agent dropped. CSAT… went up, but only because the bot answered the easy questions before they could ever frustrate a customer waiting for a human.
Here are the metrics that actually matter once an AI is in the loop, and what to watch for.
1. Deflection rate (with caveats)
The headline metric: the percentage of incoming conversations the bot fully resolves without a human ever touching them. It is easy to measure (conversation closed, no escalation flag) and easy to game (count every visitor who arrived and didn't escalate, regardless of whether they got a real answer).
The real question: what percent of genuinely answerable questions are getting answered? Pull a sample of bot conversations weekly. Have a human grade them on a 1-5 scale: did the bot actually solve what the visitor asked? Watch the trend, not the absolute number.
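A minimal sketch of that weekly check, assuming each conversation record carries an escalation flag and, for the sampled ones, a 1-5 human grade. The `Conversation` class, the function names, and the passing grade of 4 are illustrative assumptions, not part of any particular helpdesk API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    escalated: bool              # True if the bot handed off to a human
    grade: Optional[int] = None  # 1-5 human grade; None if not in the weekly sample


def naive_deflection_rate(convs):
    """Share of conversations closed without an escalation flag."""
    if not convs:
        return None
    return sum(not c.escalated for c in convs) / len(convs)


def graded_resolution_rate(convs, passing_grade=4):
    """Among graded bot-only conversations, the share graded at or above
    passing_grade (the cutoff of 4 is an assumption; pick your own)."""
    graded = [c for c in convs if not c.escalated and c.grade is not None]
    if not graded:
        return None
    return sum(c.grade >= passing_grade for c in graded) / len(graded)
```

Reporting the two numbers side by side shows how much of the headline deflection rate survives human review. Track the graded rate week over week rather than reading too much into a single sample.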
2. Escalation accuracy
When the bot says "let me get you to a human," is the human getting useful context? Or is the visitor restarting from scratch?
Two failure modes here: over-escalation (the bot punts on questions it should answer, taxing your human team) and under-escalation (the bot tries to handle complex issues and the visitor ends up emailing support a second time, frustrated). Track both. The first is a confidence-threshold tuning issue; the second is a knowledge-base completeness issue.
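One way to track both failure modes is to have a reviewer label a sample of conversations with whether escalation was the right call. The sketch below assumes each sample is a pair of booleans, `(escalated, should_have_escalated)`. That labeling scheme and the function name are assumptions for illustration.

```python
def escalation_error_rates(samples):
    """samples: iterable of (escalated, should_have_escalated) booleans,
    where the second value comes from human review."""
    samples = list(samples)
    escalated = [s for s in samples if s[0]]
    kept = [s for s in samples if not s[0]]

    # Over-escalation: the bot handed off a question it should have answered.
    over = sum(1 for _, should in escalated if not should)
    # Under-escalation: the bot held on to a question it should have handed off.
    under = sum(1 for _, should in kept if should)

    return {
        "over_escalation": over / len(escalated) if escalated else None,
        "under_escalation": under / len(kept) if kept else None,
    }
```

Each rate is measured against its own denominator (escalated versus bot-only conversations), so a change in overall escalation volume does not mask a change in either error rate.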
3. Time-to-handoff (when escalation happens)
Average time from "visitor asks question" to "human agent says hello" when the bot escalates. Count the bot's messages before the handoff as well: a well-tuned bot should hand off within 2-3 messages, not 10. Long handoffs mean the bot is wasting the visitor's time guessing before admitting it doesn't know.
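As a rough sketch, both the elapsed time and the number of bot messages before handoff can be computed from escalated-conversation records. The dictionary keys (`first_message_at`, `agent_joined_at`, `bot_turns`) are hypothetical field names, and the 3-turn cutoff simply mirrors the 2-3 message guideline above.

```python
from datetime import datetime
from statistics import mean


def handoff_stats(handoffs, max_turns=3):
    """handoffs: list of dicts with 'first_message_at' and 'agent_joined_at'
    (datetime objects) and 'bot_turns' (int), one dict per escalated conversation."""
    if not handoffs:
        return None
    waits = [
        (h["agent_joined_at"] - h["first_message_at"]).total_seconds()
        for h in handoffs
    ]
    turns = [h["bot_turns"] for h in handoffs]
    return {
        "mean_wait_seconds": mean(waits),
        "mean_bot_turns": mean(turns),
        # Share of handoffs where the bot took longer than the guideline.
        "share_over_max_turns": sum(t > max_turns for t in turns) / len(turns),
    }
```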
4. CSAT, split by path
Average CSAT usually goes up after AI. That's misleading. Split it: CSAT for visitors who got a bot-only resolution, CSAT for visitors who escalated to a human. The bot-only CSAT is usually high (instant answer = happy customer). The escalated CSAT is usually lower: those visitors started annoyed, and the bot may have made it worse. The aggregate looks great because the bot path dominates volume. Don't let it hide the escalated-path problem.
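A small sketch of the split, assuming each survey response is tagged with the path the visitor took. The `"bot"` and `"human"` labels and the 1-5 score scale are assumptions about how the survey data is stored.

```python
from collections import defaultdict


def csat_by_path(responses):
    """responses: list of (path, score) pairs, e.g. ("bot", 5) or ("human", 2).
    Returns the mean score per path plus the volume-weighted aggregate."""
    by_path = defaultdict(list)
    for path, score in responses:
        by_path[path].append(score)

    result = {path: sum(scores) / len(scores) for path, scores in by_path.items()}
    all_scores = [score for _, score in responses]
    result["aggregate"] = sum(all_scores) / len(all_scores) if all_scores else None
    return result
```

With made-up numbers, eight bot-only responses at 5 and two escalated responses at 2 give an aggregate of 4.4, which looks healthy while the escalated path sits at 2.0.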
5. Repeat-question rate
What percent of visitors who get a bot answer come back within 24 hours with a follow-up question? Some follow-up is normal. Excessive follow-up means the bot is giving partial answers that don't actually resolve the underlying need.
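One way to compute this, sketched under the assumption that bot-answered conversations are logged as `(visitor_id, timestamp)` pairs. The event shape is hypothetical, and a real implementation would also need a stable way to identify returning visitors.

```python
from datetime import datetime, timedelta


def repeat_question_rate(events, window=timedelta(hours=24)):
    """events: list of (visitor_id, timestamp) pairs, one per bot-answered
    conversation. Returns the share of visitors who started another
    conversation within `window` of a previous one."""
    by_visitor = {}
    for visitor_id, ts in events:
        by_visitor.setdefault(visitor_id, []).append(ts)
    if not by_visitor:
        return None

    repeats = 0
    for times in by_visitor.values():
        times.sort()
        # Compare each conversation with the visitor's next one.
        if any(later - earlier <= window for earlier, later in zip(times, times[1:])):
            repeats += 1
    return repeats / len(by_visitor)
```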
6. Knowledge-base gap density
For every bot conversation where the answer wasn't in the docs (and the bot correctly admitted it), file a "doc gap" ticket. Track the gap-fill rate: the share of filed gaps that have been closed with a doc update. If you're discovering gaps faster than you're filling them, the bot's real resolution quality will quietly degrade over time even while the deflection metric still looks fine.
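A minimal sketch of the bookkeeping, assuming you can count doc-gap tickets opened and closed each week. The input shape and function name are assumptions.

```python
def gap_backlog(weekly_counts):
    """weekly_counts: list of (opened, closed) doc-gap ticket counts per week.
    Returns (backlog after each week, overall gap-fill rate)."""
    backlog = []
    running = 0
    opened_total = 0
    closed_total = 0
    for opened, closed in weekly_counts:
        running += opened - closed
        backlog.append(running)
        opened_total += opened
        closed_total += closed
    fill_rate = closed_total / opened_total if opened_total else None
    return backlog, fill_rate
```

A backlog that grows every week is the early warning: gaps are being found faster than they are being filled, even if the fill rate itself looks reasonable.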
7. Cost per resolution
This is what your CFO actually cares about. Old metric: support team cost / monthly tickets. New metric: (support team cost + AI chatbot cost) / (deflected + escalated conversations). If you're on flat-rate AI billing, the marginal cost per deflected conversation approaches zero. If you're on per-resolution pricing (Fin, etc.), this is where your margin actually lives.
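The new formula is simple enough to sketch directly. All figures in the usage note are made-up illustrations, and the per-resolution price is a hypothetical input rather than any vendor's actual rate.

```python
def cost_per_resolution(team_cost, ai_cost, deflected, escalated):
    """(support team cost + AI chatbot cost) / (deflected + escalated conversations)."""
    total = deflected + escalated
    if total == 0:
        raise ValueError("no conversations in the period")
    return (team_cost + ai_cost) / total


def per_resolution_ai_cost(price_per_resolution, deflected):
    """AI cost under per-resolution pricing: you pay for each deflected conversation.
    price_per_resolution is a hypothetical figure you supply."""
    return price_per_resolution * deflected
```

For example, with a monthly team cost of 20,000, a flat AI fee of 500, 3,500 deflected and 1,500 escalated conversations, `cost_per_resolution(20000, 500, 3500, 1500)` gives 4.10. Under a hypothetical price of 1.00 per resolution, the AI cost becomes 3,500 and the same formula gives 4.70.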
The dashboard you want
One screen, six numbers, updated weekly:
- Deflection rate (with sample-graded accuracy)
- Escalation accuracy (over + under)
- Bot CSAT vs. human CSAT
- Repeat-question rate
- New doc gaps this week
- Cost per resolution
This is the difference between deploying an AI chatbot and actually running one. The first is a vendor purchase. The second is an operational discipline.