Self-Hosted AI API

High-performance LLM chat and audio transcription powered by GPU. OpenAI-compatible endpoints, zero cloud dependency.

Start Chatting Transcribe Audio
💬

LLM Chat

Chat with Qwen 3.5 models (4B, 9B, 27B). Supports streaming, reasoning control, and conversation history.

🎙️

Audio Transcription

Transcribe audio files using Whisper models (tiny, small, medium, large-v3). Supports multiple formats and languages.

🌐

Translation

Translate audio from any language to English using Whisper's built-in translation capabilities.

🔒

Secure

API key authentication. All data stays on your server. Zero external API calls.