High-performance LLM chat and audio transcription powered by GPU. OpenAI-compatible endpoints, zero cloud dependency.
Chat with Qwen 3.5 models (4B, 9B, 27B). Supports streaming, reasoning control, and conversation history.
Transcribe audio files using Whisper models (tiny, small, medium, large-v3). Supports multiple formats and languages.
Translate audio from any language to English using Whisper's built-in translation capabilities.
API key authentication. All data stays on your server. Zero external API calls.