Docs · TEAM tier
Ashh.ai hosts the dashboard, RAG index, and chat plumbing. You host the GPU. Bots routed through your endpoint never send a single token to a third-party AI provider — your data stays on your hardware.
Pick whichever matches your network. Both work the same from the bot's perspective.
Outbound-only Go binary on the GPU box. No inbound ports, no DNS, no certs. Best for corporate firewalls, airgapped LANs, and anyone who doesn't want to operate a reverse proxy.
You front your Ollama with TLS + a bearer token. Ashh.ai calls your URL with the saved auth header. Best when you already operate a reverse proxy and want one less moving piece.
curl -fsSL https://ashh.ai/install.sh | sudo sh -s -- --pair lc_live_…The installer auto-detects your OS + CPU (Linux x86_64, Linux arm64, macOS Intel, macOS Apple Silicon), downloads the right binary, pairs, and on Linux installs a
ashh-connector systemd service that auto-starts on boot.Download the binary directly from /downloads/ashh-connector-<os>-<arch>,chmod +x it, then run ./ashh-connector --pair lc_live_… followed by ./ashh-connector. Same outcome, more steps. See connector/README.md for the systemd unit.
curl -fsSL https://ashh.ai/install.sh | sudo sh -s -- --uninstall
Pick whichever you're already comfortable operating. All three work identically — Ashh.ai calls https://your-url/api/chat with your auth header on every request.
Free, requires only a Cloudflare account + a domain on Cloudflare. Outbound-only from your GPU box; Cloudflare provides the public HTTPS endpoint.
cloudflared on your GPU box, run cloudflared tunnel login.cloudflared tunnel create ashh-gpucloudflared tunnel route dns ashh-gpu gpu.yourcompany.com~/.cloudflared/config.yml:tunnel: ashh-gpu
credentials-file: /home/you/.cloudflared/<id>.json
ingress:
- hostname: gpu.yourcompany.com
service: http://localhost:11434
originRequest:
httpHostHeader: localhost:11434
- service: http_status:404gpu.yourcompany.com with a service-token policy. Cloudflare gives you a CF-Access-Client-Id and CF-Access-Client-Secret pair.https://gpu.yourcompany.com and Auth header to CF-Access-Client-Id: <id>\nCF-Access-Client-Secret: <secret>(only one header per Ashh.ai field today — use a small Caddy or Cloudflare Worker if you need both).Simplest if you already use Tailscale. Funnel makes a tailnet service public over HTTPS without opening ports. No DNS to manage.
tailscale up.# /etc/caddy/Caddyfile
:9080 {
@authed header Authorization "Bearer your-secret-token"
handle @authed {
reverse_proxy localhost:11434
}
handle {
respond "Unauthorized" 401
}
}tailscale funnel --bg --https=443 9080https://gpu-box.tail-XXXX.ts.net and Auth header to Authorization: Bearer your-secret-token.Most full-control option. You own a domain, point an A record at your box, Caddy auto-provisions Let's Encrypt and proxies to Ollama.
# Caddyfile
gpu.yourcompany.com {
@authed header Authorization "Bearer your-secret-token"
handle @authed {
reverse_proxy localhost:11434
}
handle {
respond "Unauthorized" 401
}
}Then in Ashh.ai: Base URL https://gpu.yourcompany.com, Auth header Authorization: Bearer your-secret-token.
If your endpoint goes offline (connector crash, network down, reverse proxy 5xx), the bot falls back to the platform Ollama rather than 500-ing the chat — so visitors see some reply instead of an error. You'll see a yellow "stale" or red "error" pill on the endpoint detail page.
/bots/[id]. Per-bot retention policies coming soon.