RAG Pipeline

Enterprise-grade retrieval, tuned for support

Vector Embeddings

Voyage AI voyage-3 (1024 dimensions) by default. Switch to Ollama mxbai-embed-large for zero external API calls — no code changes, just an environment flag.
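
As a rough illustration of the environment-flag switch, the sketch below picks an embedding backend from a single variable. The variable name EMBEDDING_PROVIDER, the EmbeddingConfig type, and the embedding_config helper are assumptions made for this example, not Kiedo's actual configuration keys.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EmbeddingConfig:
    provider: str
    model: str
    dimensions: Optional[int]  # None where the text above states no size


# Hypothetical registry keyed by the value of the environment flag.
_PROVIDERS = {
    "voyage": EmbeddingConfig("voyage", "voyage-3", 1024),
    "ollama": EmbeddingConfig("ollama", "mxbai-embed-large", None),
}


def embedding_config(env: Optional[Mapping[str, str]] = None) -> EmbeddingConfig:
    """Resolve the embedding backend from EMBEDDING_PROVIDER (assumed name)."""
    env = os.environ if env is None else env
    name = env.get("EMBEDDING_PROVIDER", "voyage").strip().lower()
    if name not in _PROVIDERS:
        raise ValueError(f"unknown embedding provider: {name!r}")
    return _PROVIDERS[name]
```

Calling code asks for embedding_config() and never names a provider itself, which is what lets the backend change without a code change.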

PostgreSQL + pgvector

Vectors stored in PostgreSQL 16 with the pgvector extension. HNSW index for fast approximate nearest-neighbour search at scale. Cosine distance (<=>) operator used throughout.
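
A minimal sketch of what the index and query could look like. The hnsw index type, the vector_cosine_ops operator class, and the <=> operator come from pgvector; the table and column names (doc_chunks, embedding, content) and the psycopg-style named placeholders are assumptions for illustration only.

```python
# Hypothetical schema: doc_chunks(id, content, embedding vector(1024)).
CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS doc_chunks_embedding_hnsw
ON doc_chunks USING hnsw (embedding vector_cosine_ops);
"""

# <=> returns cosine distance, so similarity is 1 - distance.
NEAREST_SQL = """
SELECT id, content, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM doc_chunks
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def to_vector_literal(values):
    """Format a sequence of numbers in pgvector's text form, e.g. [1.0,0.5]."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"
```

Ordering by the raw distance expression, rather than by the computed similarity column, is what allows the planner to use the HNSW index.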

Retrieval & Reranking

Top-k=10 candidates retrieved, then reranked to top-n=5. Similarity threshold of 0.65 filters out low-quality matches before they reach the LLM — improving precision on narrow queries.
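
The retrieve, filter, rerank sequence can be sketched as below. This is an illustration under assumptions: the Candidate type and select_context function are hypothetical, the reranker is passed in as a plain scoring function, and applying the threshold before reranking is one plausible ordering, not a confirmed detail.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    chunk_id: str
    similarity: float  # cosine similarity from vector search


def select_context(
    candidates: List[Candidate],
    rerank_score: Callable[[Candidate], float],
    top_k: int = 10,
    top_n: int = 5,
    threshold: float = 0.65,
) -> List[Candidate]:
    # 1. Keep the k nearest candidates from vector search.
    nearest = sorted(candidates, key=lambda c: c.similarity, reverse=True)[:top_k]
    # 2. Drop matches below the similarity threshold.
    kept = [c for c in nearest if c.similarity >= threshold]
    # 3. Reorder by the reranker's score and keep the best n.
    return sorted(kept, key=rerank_score, reverse=True)[:top_n]
```

With this ordering, a narrow query can return fewer than five chunks, or none, rather than padding the prompt with weak matches.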

Smart Chunking

Recursive character splitter with 300-token chunks and 30-token overlap (both configurable). Preserves sentence boundaries for better semantic coherence across document types.
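
To make the chunking strategy concrete, here is a simplified sketch of a recursive splitter with overlap. It is not Kiedo's implementation: whitespace-separated words stand in for tokens, the separator list is an assumption, and the function names are hypothetical.

```python
SEPARATORS = ["\n\n", "\n", ". ", " "]  # coarse to fine (assumed order)


def count_tokens(text):
    # Assumption: whitespace-separated words approximate tokens.
    return len(text.split())


def split_recursive(text, max_tokens, separators=SEPARATORS):
    """Split on the coarsest separator first, recursing only into oversized parts."""
    if count_tokens(text) <= max_tokens or not separators:
        return [text] if text.strip() else []
    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    pieces = []
    for i, part in enumerate(parts):
        if sep == ". " and i < len(parts) - 1:
            part += "."  # keep the full stop with its sentence
        pieces.extend(split_recursive(part, max_tokens, finer))
    return pieces


def chunk(text, chunk_tokens=300, overlap_tokens=30):
    """Pack pieces into chunks of at most chunk_tokens, repeating a short tail."""
    if not 0 <= overlap_tokens < chunk_tokens:
        raise ValueError("overlap_tokens must be smaller than chunk_tokens")
    # Leave room for the overlap so a chunk never exceeds chunk_tokens.
    pieces = split_recursive(text, chunk_tokens - overlap_tokens)
    chunks, current = [], []
    for piece in pieces:
        words = piece.split()
        if current and len(current) + len(words) > chunk_tokens:
            chunks.append(" ".join(current))
            current = current[-overlap_tokens:] if overlap_tokens else []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Sentences are only broken apart when a single sentence exceeds the budget. The overlapping tail is counted in words, so in this sketch it may begin mid-sentence.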

LLM Flexibility

Defaults to Anthropic Claude Haiku 4.5, with an on-prem fallback to Ollama running phi4-mini, llama3.2:3b/1b, qwen3:4b, or ministral-3:3b. Swap any of them without touching business logic.

Document Ingestion

PDF (PyMuPDF), DOCX, TXT, Markdown, HTML, CSV — up to 50 MB per file. Live progress polling during reprocessing so operators always know when knowledge is fresh.

Channels

Three channels, one admin panel

Embeddable JS Widget

Loads from an async script tag. API keys use the format cb_live_*. Branding, personality, and language are configurable, and appointment booking is built in. Responses stream over WebSocket for a real-time feel.

WhatsApp via Meta Cloud API

Connect your existing WhatsApp Business number. Webhook-based message routing through Kiedo's escalation and RAG pipeline. Full sentiment analysis and operator handoff support.

Telegram via aiogram

Built on the aiogram async framework for reliable, high-throughput Telegram integration. Same RAG pipeline, same operator console, same analytics — just a different channel endpoint.

Operators & Escalation

When AI isn't enough, humans take over instantly

Confidence-Scored Escalation

Every bot response carries a confidence score. When it drops below the configured threshold, an <ESCALATE reason=...> tag triggers a real-time operator alert via /ws/admin. The reason is visible in the console.
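
A hedged sketch of how the escalation tag might be detected in a reply. The text above elides the exact attribute syntax, so the quoted form reason="..." matched here is an assumption, as are the function names; delivering the alert over /ws/admin is out of scope for the example.

```python
import re

# Assumed tag shape: <ESCALATE reason="some text">
ESCALATE_TAG = re.compile(r'<ESCALATE\s+reason="([^"]*)"\s*/?>')


def should_escalate(confidence, threshold):
    """True when the response's confidence is below the configured threshold."""
    return confidence < threshold


def extract_escalation(reply):
    """Return (visible_text, reason); reason is None when no tag is present."""
    match = ESCALATE_TAG.search(reply)
    if match is None:
        return reply, None
    visible = ESCALATE_TAG.sub("", reply).strip()
    return visible, match.group(1)
```

Stripping the tag before display keeps the control signal out of the customer-facing message while the reason goes to the operator console.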

Multi-Role User System

Three role tiers: superadmin (platform-wide access), tenant admin (manages their bot and documents), and operator (handles live conversations and escalations). Granular permissions at every level.
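
One way to picture the three tiers is as a role-to-permission map. The permission strings and the can helper below are hypothetical, chosen only to mirror the responsibilities listed above; the real permission model may be more granular.

```python
from enum import Enum


class Role(Enum):
    SUPERADMIN = "superadmin"
    TENANT_ADMIN = "tenant_admin"
    OPERATOR = "operator"


# Hypothetical permission names for illustration.
PERMISSIONS = {
    Role.TENANT_ADMIN: {"bot:manage", "documents:manage"},
    Role.OPERATOR: {"conversations:handle", "escalations:handle"},
}


def can(role, permission):
    """Superadmins have platform-wide access; other roles use the map."""
    if role is Role.SUPERADMIN:
        return True
    return permission in PERMISSIONS.get(role, set())
```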

Appointment & Booking Integration

The widget includes a booking flow natively. Customers can schedule calls or appointments directly in the chat — without leaving the conversation or navigating to a separate form.

Sentiment Analysis

Every conversation turn is scored for sentiment. Operators see the emotional arc of a conversation at a glance — so they can prioritise frustrated customers before escalation triggers.

Analytics

Observability as a first-class citizen

Grafana Dashboards — Provisioned

Grafana ships bundled with Kiedo and provisions its dashboards on deploy (port 3001). No manual setup. Conversation volumes, response times, escalation rates, and model latency — all wired up on first boot.

OpenTelemetry Tracing

OTel instrumentation ships built in — spans on embedding calls, RAG retrieval, bot engine processing, and all database calls. Connect to any OTel-compatible backend (Grafana Tempo, Jaeger, Datadog).

Per-Turn bot_traces Table

Every bot interaction writes a row to the bot_traces telemetry table — embedding latency, retrieved chunks, rerank scores, LLM response time, and final confidence score. Queryable with SQL.
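
As an example of the kind of question the table can answer, the sketch below computes average LLM latency and counts low-confidence turns. The column names (embedding_ms, llm_ms, confidence) are assumptions, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute(
    """CREATE TABLE bot_traces (
           id INTEGER PRIMARY KEY,
           embedding_ms REAL,   -- hypothetical column names
           llm_ms REAL,
           confidence REAL
       )"""
)
conn.executemany(
    "INSERT INTO bot_traces (embedding_ms, llm_ms, confidence) VALUES (?, ?, ?)",
    [(40.0, 900.0, 0.82), (60.0, 1100.0, 0.41), (50.0, 1000.0, 0.58)],
)

# Average LLM latency and number of turns below a confidence cut-off.
avg_llm_ms, low_confidence_turns = conn.execute(
    """SELECT AVG(llm_ms),
              SUM(CASE WHEN confidence < ? THEN 1 ELSE 0 END)
       FROM bot_traces""",
    (0.6,),
).fetchone()
```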

Admin & Multi-Tenancy

Built for agencies and platform builders

React 18 Admin Panel

Built with React 18, Vite, and Tailwind CSS. Manage tenants, documents, bots, and billing from a single interface. Responsive and accessible out of the box.

White-Label & Reseller

Per-tenant personality customisation and a global prompt library. Agencies can resell Kiedo under their own brand — tenant admins never see "Kiedo" unless you want them to.

Pluggable Billing

Stripe, crypto, manual, or simulated providers — configured at deploy time. Subscription plans, promo codes, and a superadmin billing management view included. Swap providers without touching business logic.

Webhook Support

Webhooks for Stripe, Telegram, and WhatsApp events are all handled at /api/v1/webhooks/*. The full REST API lives at /api/v1/*, with OpenAPI docs available at /api/docs.

Security

Enterprise-ready security, out of the box

Argon2 + JWT RS256

Passwords are hashed with Argon2id. JWTs are signed with RS256: access tokens last 15 minutes, refresh tokens last 7 days, and tokens are rotated on every refresh.
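
The two token lifetimes can be expressed as claims like this. The sketch only builds the payload: RS256 signing needs a JWT library and a private key and is left out. The sub, iat, and exp claims are standard JWT claims; the kind field and the token_claims helper are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TTL = timedelta(minutes=15)
REFRESH_TTL = timedelta(days=7)


def token_claims(user_id, kind, now=None):
    """Build an unsigned claims dict for an 'access' or 'refresh' token."""
    if kind not in ("access", "refresh"):
        raise ValueError(f"unknown token kind: {kind!r}")
    now = datetime.now(timezone.utc) if now is None else now
    ttl = ACCESS_TTL if kind == "access" else REFRESH_TTL
    return {
        "sub": user_id,
        "kind": kind,  # hypothetical claim distinguishing the two tokens
        "iat": int(now.timestamp()),
        "exp": int((now + ttl).timestamp()),
    }
```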

TOTP Two-Factor Auth

2FA via time-based OTP (TOTP) with QR code enrollment. Available for all user roles. Works with any standard authenticator app (Google Authenticator, Authy, 1Password).
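
For readers curious how authenticator codes are derived, here is a compact sketch of the standard TOTP algorithm (RFC 6238, HMAC-SHA1, 30-second steps) using only the Python standard library. It illustrates the algorithm in general; the function names and the one-step drift window are choices made for this example.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """Compute a TOTP code for a Base32-encoded shared secret."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret_b32, submitted, at=None, step=30, window=1):
    """Accept codes from the current step and `window` steps either side."""
    now = time.time() if at is None else at
    return any(
        hmac.compare_digest(totp(secret_b32, now + drift * step, step), submitted)
        for drift in range(-window, window + 1)
    )
```

The QR code shown at enrollment simply carries this shared secret to the authenticator app, which then runs the same computation.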

Encrypted Secrets at Rest

All tenant secrets (API keys, webhook tokens) are encrypted at rest using the configured encryption_key. No plaintext secrets in the database.

Rate Limiting & Lockout

Login rate-limiting and account lockout policies via Redis. Configurable thresholds per tenant. Protects against credential-stuffing and brute-force attacks without additional infrastructure.
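
The lockout policy can be sketched with a sliding window of failed attempts. To keep the example self-contained it stores counters in plain dictionaries instead of Redis, and the LoginGuard class and its default thresholds are assumptions, not Kiedo's shipped values.

```python
import time


class LoginGuard:
    """In-memory stand-in for a Redis-backed login limiter."""

    def __init__(self, max_attempts=5, window_s=300.0, lockout_s=900.0):
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.lockout_s = lockout_s
        self._failures = {}      # account -> recent failure timestamps
        self._locked_until = {}  # account -> time the lock expires

    def is_locked(self, account, now=None):
        now = time.time() if now is None else now
        return self._locked_until.get(account, 0.0) > now

    def record_failure(self, account, now=None):
        """Register a failed login; return True if the account is now locked."""
        now = time.time() if now is None else now
        recent = [t for t in self._failures.get(account, []) if now - t < self.window_s]
        recent.append(now)
        self._failures[account] = recent
        if len(recent) >= self.max_attempts:
            self._locked_until[account] = now + self.lockout_s
            self._failures[account] = []
        return self.is_locked(account, now)

    def record_success(self, account):
        """Clear the failure history after a successful login."""
        self._failures.pop(account, None)
```

A Redis version would typically replace the dictionaries with keys that carry an expiry, so stale counters clean themselves up.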

Full security overview →

See every feature in action

Our team will walk you through a live demo tailored to your use case — from RAG pipeline tuning to operator console setup.

Book a demo