Multi-LLM intelligent chat with RAG context, audio transcription, session management, and embeddable widgets
OpenRails AI Chat is the primary user-facing interface of the platform. It provides a rich, real-time conversational experience powered by multiple large language models, enhanced with retrieval-augmented generation (RAG) from your organization's knowledge base. Users can chat with AI that has full context of ingested documents, receive source-cited answers, transcribe audio, and interact through an embeddable widget on external sites.
Key Value: Unlike single-vendor chatbots, OpenRails AI Chat lets you route conversations to the best model for each task — use local self-hosted models for sensitive data, OpenAI for creative tasks, and Anthropic for analytical work — all within the same interface.
Support for multiple LLM providers including local self-hosted models, OpenAI (GPT-4o, GPT-4), Anthropic (Claude), and Google (Gemini). Switch models per conversation or per project. No code changes required when switching providers.
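To make the "no code changes when switching providers" claim concrete, here is a minimal sketch of what a unified provider interface with per-project routing could look like. The class and method names (`LLMProvider`, `ProviderRouter`, `complete`) are illustrative assumptions, not the platform's actual API.

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Unified interface; a concrete subclass wraps each vendor SDK."""
    @abstractmethod
    def complete(self, messages: list[dict]) -> str: ...

class LocalProvider(LLMProvider):
    """Stand-in for a self-hosted model; a real class would call its endpoint."""
    def complete(self, messages: list[dict]) -> str:
        return "local response"  # stubbed for illustration

class ProviderRouter:
    """Picks the provider configured for a project, falling back to a default."""
    def __init__(self, providers: dict[str, LLMProvider], default: str):
        self.providers = providers
        self.default = default

    def route(self, project_config: dict) -> LLMProvider:
        name = project_config.get("llm_provider", self.default)
        return self.providers[name]

# Per-project configuration selects the model; application code never changes.
router = ProviderRouter({"local": LocalProvider()}, default="local")
provider = router.route({"llm_provider": "local"})
```

Because every provider satisfies the same interface, swapping OpenAI for Anthropic (or a local model) is a configuration change, not a code change.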
Every response can be enriched with context from ingested documents. Dual retrieval using vector search and knowledge graph ensures accurate, relationship-aware answers with source citations.
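One way dual retrieval can be combined is to interleave vector-search hits with knowledge-graph hits, deduplicating by source so each citation appears once. This is a hedged sketch under that assumption; `merge_retrievals` and the `(source_id, text)` hit format are hypothetical, not the platform's actual data model.

```python
from itertools import zip_longest

def merge_retrievals(vector_hits, graph_hits, limit=5):
    """Interleave vector-search and knowledge-graph results, deduplicating
    by source id so the final context cites each document once.

    Hits are (source_id, text) pairs, assumed pre-ranked by relevance.
    """
    seen, merged = set(), []
    for pair in zip_longest(vector_hits, graph_hits):
        for hit in pair:
            if hit is not None and hit[0] not in seen:
                seen.add(hit[0])
                merged.append(hit)
            if len(merged) == limit:
                return merged
    return merged

# Vector search finds semantically similar text; the graph surfaces
# related entities the embedding may have missed.
vector = [("doc1", "quarterly revenue table"), ("doc2", "revenue definition")]
graph = [("doc2", "revenue definition"), ("doc3", "subsidiary ownership")]
context = merge_retrievals(vector, graph)
```

Interleaving keeps both retrieval signals represented in the context window instead of letting one dominate.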
Built-in speech-to-text integration for audio transcription. Users can send voice messages that are automatically transcribed and processed by the AI. Supports multiple languages and audio formats.
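The voice-message flow reduces to a simple branch: transcribe audio first, then treat the result as ordinary text input. The sketch below assumes a message dict and pluggable `transcriber`/`chat` callables; both are hypothetical stand-ins for the platform's components.

```python
def handle_message(msg: dict, transcriber, chat):
    """Route a chat message: audio is transcribed first (speech-to-text),
    then processed exactly like a typed message."""
    if msg["type"] == "audio":
        text = transcriber(msg["audio_bytes"])  # hypothetical STT callable
    else:
        text = msg["text"]
    return chat(text)

# Stand-ins for the real speech-to-text engine and chat pipeline:
stt = lambda audio: "what is our refund policy"
chat = lambda text: f"processed: {text}"
reply = handle_message({"type": "audio", "audio_bytes": b"\x00\x01"}, stt, chat)
```

Because transcription happens before context assembly, voice and text messages share the same downstream RAG and LLM path.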
Persistent conversation history with search, bookmarking, and export. Conversations are organized by project and can be shared with team members while respecting access controls.
Deploy AI chat on any website with a single script tag. The widget is JWT-authenticated, domain-whitelisted, and fully customizable with your brand colors and avatar. See the Widget Embedding feature sheet for full details.

Token-by-token streaming via WebSocket for instant responsiveness. Users see answers as they are generated, with progress indicators and the ability to stop generation mid-stream.
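The server side of token streaming can be modeled as a loop that forwards each generated token to the client and checks a stop signal between tokens, which is how a "stop generation" button can interrupt mid-stream. This is a simplified sketch; `stream_tokens` and its arguments are illustrative, not the platform's WebSocket handler.

```python
import threading

def stream_tokens(token_source, on_token, stop_event: threading.Event) -> str:
    """Forward tokens one at a time; a set stop_event halts generation
    mid-stream, returning only the partial answer produced so far."""
    sent = []
    for token in token_source:
        if stop_event.is_set():
            break  # user pressed "stop generation"
        on_token(token)  # in production: send over the WebSocket
        sent.append(token)
    return "".join(sent)

# Simulate a client that stops generation after two tokens:
received, stop = [], threading.Event()
def collect(token):
    received.append(token)
    if len(received) == 2:
        stop.set()

partial = stream_tokens(iter(["Hel", "lo", " wo", "rld"]), collect, stop)
```

Checking the stop flag between tokens (rather than after the full completion) is what makes interruption feel instant to the user.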
| Feature | Implementation | Details |
|---|---|---|
| LLM Abstraction | Unified provider interface | Swap models without application changes; per-project model configuration |
| Streaming | WebSocket token streaming | Token-by-token delivery with progress indicators and mid-stream stop |
| RAG Pipeline | Vector + Knowledge Graph | Dual retrieval for accurate, relevant answers |
| Audio | Speech-to-Text Engine | Multi-language transcription, WAV/MP3/M4A support |
| Context Window | Automatic management | Smart truncation and summarization for long conversations |
| File Attachments | Inline processing | Attach documents directly in chat for instant analysis |
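The "smart truncation and summarization" row above can be sketched as: keep the most recent messages that fit the token budget, and compress everything older into a single summary message. The function below is an illustrative sketch with hypothetical `count_tokens` and `summarize` callables standing in for the platform's tokenizer and summarizer.

```python
def truncate_history(messages, max_tokens, count_tokens, summarize):
    """Fit a long conversation into the model's context window: newest
    messages are kept verbatim; older overflow is summarized into one turn."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = count_tokens(msg["content"])
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    if older:
        kept.insert(0, {
            "role": "system",
            "content": "Summary of earlier turns: " + summarize(older),
        })
    return kept

# Toy tokenizer (word count) and summarizer for illustration:
count = lambda text: len(text.split())
summ = lambda msgs: f"{len(msgs)} earlier turns"
history = [
    {"role": "user", "content": "one two three"},
    {"role": "assistant", "content": "four five"},
    {"role": "user", "content": "six"},
]
window = truncate_history(history, max_tokens=3, count_tokens=count, summarize=summ)
```

Summarizing rather than silently dropping old turns preserves long-range context at a fraction of the token cost.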
- Employees ask questions against company documentation and get cited answers instantly
- Embed the widget on support pages for AI-powered self-service with escalation to agents
- New employees interact with a knowledge-rich AI trained on company procedures and policies
- Lawyers query case law and contracts with semantic search and relationship-aware retrieval
- Analysts ask questions about ingested reports and receive summarized insights with sources
- Leverage multi-language LLMs and speech-to-text transcription for global team collaboration
User Input → Context Assembly (RAG retrieval + conversation history) → LLM Router (model selection) → Streaming Response → Source Citations
Optional: Audio transcription (speech-to-text) before context assembly | PII de-identification before LLM call
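The flow above can be expressed as a short composition of pluggable stages. Every callable name here (`transcribe`, `retrieve`, `deidentify`, `llm`, `cite`) is a hypothetical stand-in for the corresponding platform component, wired in the order the diagram describes.

```python
def answer(user_input, transcribe, retrieve, deidentify, llm, cite):
    """End-to-end request flow: optional transcription, context assembly,
    optional PII de-identification, routed LLM call, source citations."""
    # Voice input is transcribed before context assembly (optional stage).
    text = transcribe(user_input) if isinstance(user_input, bytes) else user_input
    context = retrieve(text)          # RAG: vector + knowledge-graph retrieval
    safe_text = deidentify(text)      # PII scrubbed before the LLM call
    response = llm(safe_text, context)  # streamed token-by-token in practice
    return cite(response, context)    # attach source citations to the answer

# Toy stages for illustration:
result = answer(
    "ask Alice about revenue",
    transcribe=lambda audio: audio.decode(),
    retrieve=lambda text: ["report.pdf"],
    deidentify=lambda text: text.replace("Alice", "[NAME]"),
    llm=lambda text, ctx: f"answer to: {text}",
    cite=lambda resp, ctx: (resp, ctx),
)
```

Note that retrieval runs on the original text while the LLM sees only the de-identified version, matching the "PII de-identification before LLM call" step in the diagram.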