API Version 1.0

Nebula AI API

An OpenRouter-style API for accessing AI models. Nebula serves self-hosted models at a fraction of the cost and is compatible with OpenAI's chat completions API.

curl -X POST https://api.nebulahq.work/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3:latest", "messages": [{"role": "user", "content": "Hello"}]}'

Authentication

All API requests require an API key passed in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Get your API key from the dashboard.
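
As a small illustration of the header format above, the following Python sketch builds the request headers from a key held in an environment variable. The variable name NEBULA_API_KEY and the helper build_headers are assumptions made for this example, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the headers each request needs: a Bearer token and a JSON content type."""
    if not api_key:
        raise ValueError("API key is empty; copy one from the dashboard first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# NEBULA_API_KEY is a hypothetical variable name chosen for this example.
headers = build_headers(os.environ.get("NEBULA_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.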

Chat Completions

POST /v1/chat/completions

Send a chat conversation and get a completion response. Compatible with OpenAI's format.

Request Body

{
  "model": "qwen3:latest",           // Required: model name
  "messages": [                      // Required: conversation
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,                // Optional: 0.0-2.0, default 0.7
  "max_tokens": 2048,                // Optional: max response tokens
  "stream": false                    // Optional: set to true to stream the response
}
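
The // annotations in the request body above are documentation only; JSON itself does not allow comments, so they must not be sent. A minimal Python sketch of assembling a valid body follows. The helper build_chat_request is hypothetical, and the client-side temperature check simply mirrors the 0.0-2.0 range stated above; how the server treats out-of-range values is not documented here.

```python
import json


def build_chat_request(model, messages, temperature=None, max_tokens=None, stream=False):
    """Assemble a request body, leaving out optional fields that were not set."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    body = {"model": model, "messages": messages}
    if temperature is not None:
        # Range taken from the field description above.
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if stream:
        body["stream"] = True
    return body


payload = build_chat_request(
    "qwen3:latest",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
    max_tokens=2048,
)
print(json.dumps(payload, indent=2))
```

Omitting unset optional fields lets the server apply its own defaults instead of hard-coding them in the client.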

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1712531200,
  "model": "qwen3:latest",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  }
}
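
To show how the fields of the response fit together, here is a short Python sketch that reads the sample response above. The helper name summarize is an assumption for this example. Note that in the sample, total_tokens is the sum of prompt_tokens and completion_tokens (10 + 12 = 22).

```python
# The sample response from this section, written as a Python dict.
completion = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1712531200,
    "model": "qwen3:latest",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 10, "completion_tokens": 12, "total_tokens": 22},
}


def summarize(completion: dict) -> dict:
    """Pull out the reply text, the finish reason, and the token counts."""
    choice = completion["choices"][0]
    usage = completion["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


summary = summarize(completion)
print(summary["text"])
print(summary["total_tokens"])
```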

Available Models

Model                     Context  Use Case                Input/1M tokens  Output/1M tokens
qwen3:latest              32K      General, reasoning      $0.20            $0.60
gemma3:1b                 4K       Fast, lightweight       $0.05            $0.10
qwen3-vl:235b-cloud       32K      Vision, multimodal      $0.50            $1.50
deepseek-v3.1:671b-cloud  64K      Advanced reasoning      $0.35            $0.90
qwen/qwen3-coder:free     32K      Code generation (free)  Free             Free
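
The per-million-token prices translate into a per-request cost as follows. This Python sketch assumes billing is linear in the token counts reported under usage; how pay-per-token pricing interacts with the subscription quotas is not stated here, so treat the result as an estimate. The names PRICES and estimate_cost are made up for this example.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "qwen3:latest": (Decimal("0.20"), Decimal("0.60")),
    "gemma3:1b": (Decimal("0.05"), Decimal("0.10")),
    "qwen3-vl:235b-cloud": (Decimal("0.50"), Decimal("1.50")),
    "deepseek-v3.1:671b-cloud": (Decimal("0.35"), Decimal("0.90")),
    "qwen/qwen3-coder:free": (Decimal("0"), Decimal("0")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    input_price, output_price = PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / TOKENS_PER_UNIT


# 1M input tokens and 250K output tokens on deepseek-v3.1:671b-cloud:
# 1.0 * $0.35 + 0.25 * $0.90
cost = estimate_cost("deepseek-v3.1:671b-cloud", 1_000_000, 250_000)
print(f"${cost:.4f}")
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding error.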

Subscription Plans

Free

$0/mo
  • ✓ 100K input tokens/mo
  • ✓ 20K output tokens/mo
  • ✓ Basic models
  • ✓ Community support

Developer

$10/mo
  • ✓ 2M input tokens/mo
  • ✓ 500K output tokens/mo
  • ✓ All Ollama models
  • ✓ Email support

Pro

$50/mo
  • ✓ 15M input tokens/mo
  • ✓ 3M output tokens/mo
  • ✓ Vision models
  • ✓ Priority support

Enterprise

$200/mo
  • ✓ 100M input tokens/mo
  • ✓ 20M output tokens/mo
  • ✓ Custom fine-tuning
  • ✓ 24/7 support
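
One way to compare the plans is to find the cheapest one whose monthly token quotas cover an expected workload. The Python sketch below does that with the figures listed above. It is a simplification: it ignores differences in model access and support between plans, and the names PLANS and cheapest_plan are assumptions for this example.

```python
# (name, USD per month, input tokens per month, output tokens per month),
# ordered from cheapest to most expensive.
PLANS = [
    ("Free", 0, 100_000, 20_000),
    ("Developer", 10, 2_000_000, 500_000),
    ("Pro", 50, 15_000_000, 3_000_000),
    ("Enterprise", 200, 100_000_000, 20_000_000),
]


def cheapest_plan(input_tokens: int, output_tokens: int):
    """Return the name of the cheapest plan covering both quotas, or None if none does."""
    for name, _price, input_quota, output_quota in PLANS:
        if input_tokens <= input_quota and output_tokens <= output_quota:
            return name
    return None


print(cheapest_plan(1_500_000, 300_000))
```

Both quotas have to fit: a workload of 50K input and 30K output tokens exceeds the Free plan's output quota even though its input usage is well within limits.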

Code Examples

Python

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.nebulahq.work"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

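response = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers=headers,
    json={
        "model": "qwen3:latest",
        "messages": [{"role": "user", "content": "Hello!"}]
    },
    timeout=60,  # avoid waiting forever if the server does not respond
)
response.raise_for_status()  # raise on 4xx/5xx instead of failing on a missing key below

# The assistant's reply is in the first choice.
print(response.json()["choices"][0]["message"]["content"])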
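
The request body also accepts a stream field. The streaming wire format is not described on this page; the Python sketch below assumes it matches OpenAI's server-sent events (lines of the form "data: {json}" carrying a delta object, ending with "data: [DONE]"). Treat that format, and the helper iter_stream_text, as assumptions to check against real responses.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines (assumed format)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            yield text


# Offline demonstration with hand-written lines in the assumed format.
sample_lines = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    b"",
    b'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    b'data: {"choices": [{"index": 0, "delta": {"content": "lo!"}}]}',
    b"data: [DONE]",
]
print("".join(iter_stream_text(sample_lines)))
```

With requests, the same function could be fed from a live call by adding "stream": True to the JSON body, passing stream=True to requests.post, and iterating over response.iter_lines().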

JavaScript

const response = await fetch('https://api.nebulahq.work/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'qwen3:latest',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});

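// Fail early on HTTP errors rather than on a missing field below.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
console.log(data.choices[0].message.content);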