OpenAI-compatible REST API for LLM chat and audio transcription.
All requests require an API key in the Authorization header:
List all available models.
| Model | Parameters | Type |
|---|---|---|
qwen3.5:27b-Q4_K_M | 27B | LLM |
qwen3.5:9B-Q4_K_M | 9B | LLM |
qwen3.5:4b-q4_K_M | 4B | LLM |
qwen2.5:0.5b | 0.5B | LLM |
| Model | Accuracy | Type |
|---|---|---|
whisper-large-v3 | Best | STT |
whisper-medium | Very Good | STT |
whisper-small | Good | STT |
whisper-tiny | Basic | STT |
Generate a chat completion using an LLM model.
| Parameter | Type | Required | Description |
|---|---|---|---|
model | string | Yes | LLM model ID |
messages | array | Yes | Conversation messages [{role, content}] |
stream | boolean | No | Stream response (default: false) |
reasoning_effort | string | No | none, low, medium, high (default: none) |
temperature | float | No | 0.0 to 1.0 (default: 0.7) |
max_tokens | integer | No | Maximum tokens to generate |
Transcribe audio to text using Whisper.
| Parameter | Type | Required | Description |
|---|---|---|---|
file | file | Yes | Audio file (wav, mp3, ogg, flac, webm, mp4) |
model | string | No | Whisper model (default: whisper-large-v3) |
language | string | No | ISO-639-1 code (en, fr, es, etc.) |
response_format | string | No | json, text, verbose_json, srt, vtt |
temperature | float | No | 0.0 to 1.0 (default: 0.0) |
| Format | Description |
|---|---|
json | {"text": "..."} |
text | Plain text only |
verbose_json | Full details with timestamps and segments |
srt | SubRip subtitle format |
vtt | WebVTT subtitle format |
Translate audio from any language to English.
Use the official OpenAI Python SDK with this API.