API Version 1.0

Nebula AI API

Nebula AI is an OpenRouter-style API for accessing AI models. It serves self-hosted models at a fraction of the cost and is compatible with OpenAI's Chat Completions API.

curl -X POST https://api.nebulahq.work/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3:latest", "messages": [{"role": "user", "content": "Hello"}]}'

Authentication

All API requests require an API key passed in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Get your API key from the dashboard.
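
As a minimal sketch of building this header in Python, the helper below reads the key from an environment variable. The variable name NEBULA_API_KEY and the function name auth_headers are assumptions made for this example; the API itself only requires the Authorization header shown above.

```python
import os


def auth_headers(api_key=None):
    """Build request headers that carry the API key as a Bearer token."""
    # NEBULA_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("NEBULA_API_KEY")
    if not key:
        raise ValueError("No API key: pass one in or set NEBULA_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code this way makes it less likely to end up in version control.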

Chat Completions

POST /v1/chat/completions

Send a chat conversation and receive a completion. The request and response formats are compatible with OpenAI's.

Request Body

{
  "model": "qwen3:latest",    // Required: model name
  "messages": [               // Required: conversation messages, in order
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,         // Optional: 0.0-2.0, default 0.7
  "max_tokens": 2048,         // Optional: maximum number of response tokens
  "stream": false             // Optional: set to true for a streaming response
}
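
The fields above could be assembled and checked on the client side before sending. The following is a sketch only: the function name build_chat_request is hypothetical, and the validation reflects the documented ranges rather than confirmed server behavior.

```python
def build_chat_request(model, messages, temperature=None, max_tokens=None, stream=None):
    """Assemble a request body, including optional fields only when given."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")

    body = {"model": model, "messages": list(messages)}

    if temperature is not None:
        # The documented range is 0.0-2.0; how the server treats values
        # outside it is not stated, so this sketch rejects them early.
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if stream is not None:
        body["stream"] = stream
    return body
```

Omitting unset optional fields leaves the server free to apply its own defaults, such as the documented temperature of 0.7.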

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712531200,
  "model": "qwen3:latest",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  }
}
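
A short sketch of pulling the useful fields out of a response with this shape. The helper name summarize_completion is hypothetical, and the sample data is copied from the response above.

```python
def summarize_completion(data):
    """Extract the reply text, finish reason, and token total from a response."""
    choice = data["choices"][0]      # the first (and here only) choice
    usage = data.get("usage", {})    # tolerate a missing usage block
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


sample = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1712531200,
    "model": "qwen3:latest",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 12, "total_tokens": 22},
}

print(summarize_completion(sample))
```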

Available Models

Model	Context (tokens)	Use Case	Input (per 1M tokens)	Output (per 1M tokens)
qwen3:latest	32K	General, reasoning	$0.20	$0.60
gemma3:1b	4K	Fast, lightweight	$0.05	$0.10
qwen3-vl:235b-cloud	32K	Vision, multimodal	$0.50	$1.50
deepseek-v3.1:671b-cloud	64K	Advanced reasoning	$0.35	$0.90
qwen/qwen3-coder:free	32K	Code generation (free)	Free	Free
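
To illustrate how per-token pricing works out, the sketch below estimates the cost of one request from the prices in the table. The function name estimate_cost is hypothetical, and the calculation is plain arithmetic on the listed rates; how these rates interact with subscription quotas is not described here, so treat the result as an estimate.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "qwen3:latest": (0.20, 0.60),
    "gemma3:1b": (0.05, 0.10),
    "qwen3-vl:235b-cloud": (0.50, 1.50),
    "deepseek-v3.1:671b-cloud": (0.35, 0.90),
    "qwen/qwen3-coder:free": (0.0, 0.0),
}


def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return the estimated cost in USD for one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000
```

For example, a qwen3:latest request with 1M input tokens and 1M output tokens would come to $0.20 + $0.60 = $0.80.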

Subscription Plans

Free

$0/mo

✓ 100K input tokens/mo
✓ 20K output tokens/mo
✓ Basic models
✓ Community support

Developer

$10/mo

✓ 2M input tokens/mo
✓ 500K output tokens/mo
✓ All Ollama models
✓ Email support

Pro

$50/mo

✓ 15M input tokens/mo
✓ 3M output tokens/mo
✓ Vision models
✓ Priority support

Enterprise

$200/mo

✓ 100M input tokens/mo
✓ 20M output tokens/mo
✓ Custom fine-tuning
✓ 24/7 support
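
As a rough way to compare the plans above, this sketch picks the cheapest one whose monthly token quotas cover an expected usage. The function name cheapest_plan is hypothetical, and the check looks only at token quotas; it ignores differences in model access and support level.

```python
# (name, USD per month, input tokens per month, output tokens per month),
# copied from the plan list above and ordered from cheapest to most expensive.
PLANS = [
    ("Free", 0, 100_000, 20_000),
    ("Developer", 10, 2_000_000, 500_000),
    ("Pro", 50, 15_000_000, 3_000_000),
    ("Enterprise", 200, 100_000_000, 20_000_000),
]


def cheapest_plan(input_tokens, output_tokens):
    """Return the name of the cheapest plan covering the usage, or None."""
    for name, _price, input_quota, output_quota in PLANS:
        if input_tokens <= input_quota and output_tokens <= output_quota:
            return name
    return None  # usage exceeds every listed plan
```

Note that either quota can be the limiting one: 1M input tokens fits the Developer plan, but pairing it with 600K output tokens pushes the usage to Pro.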

Code Examples

Python

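import requests

API_KEY = "YOUR_API_KEY"  # replace with the key from your dashboard
BASE_URL = "https://api.nebulahq.work"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

# Send a single-turn chat request.
response = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers=headers,
    json={
        "model": "qwen3:latest",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,  # avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # raise on 4xx/5xx rather than fail on a missing key below

# Print the assistant's reply from the first choice.
print(response.json()["choices"][0]["message"]["content"])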

JavaScript

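// Uses top-level await: run as an ES module, or wrap in an async function.
const response = await fetch('https://api.nebulahq.work/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'qwen3:latest',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});

// fetch does not reject on HTTP error statuses, so check explicitly.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
console.log(data.choices[0].message.content);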