One OpenAI-compatible API for every top model — predictable pricing, no rate caps. Run it on our cloud, your cloud, or on-prem in any country.
$ curl https://api.plugsky.com/v1/chat/completions \ -H "Authorization: Bearer sk-live-…" \ -d '{ "model": "plugsky-pro", "stream": true }' { "region": "auto", "latency_ms": 38, "content": "Hello 👋 Any model, your region, one API." }
Figures sourced from the Plugsky data room and pending independent validation before external distribution.
Backed by a world-class ecosystem — NVIDIA, Microsoft, OpenAI, AWS, Google, PayPal, Meta, Shopify — across enterprise AI, fintech, government and SaaS.
Call 31+ models through one OpenAI-compatible API — flat pricing from $5, no per-token bills, switch from lite to frontier in a single line.
White-label models with flat, predictable pricing — built-in routing, auto-fallback, and team seats that protect your margins as you scale.
Private, sovereign deployment in your own cloud or region — SSO/SAML, audit logs, GDPR & PDPL alignment, and SLAs up to 99.99%.
Global platforms impose rate caps, unpredictable token costs, and zero control over where your data runs. Plugsky removes those limits for any team, anywhere.
Throughput caps and queueing throttle production AI products at the worst time — peak demand.
Pure token billing makes margins impossible to plan. SaaS teams need fixed, high-volume packages.
Sending sensitive data to third-party clouds you don't control is a non-starter for regulated teams.
Six core products, one OpenAI-compatible API, one bill, one account. Each stands on its own — but they click together the moment you turn them on.
Call 31+ models from one OpenAI-compatible API — lite to frontier, switch in a single line. Free, paid and embedding models side by side.
View all 31 →Build and run agents with function-calling, memory, and orchestration. OpenAI Assistants-compatible for one-line migration.
Open Agents →Wire agents into your tools, APIs and data sources. Bring-your-own Python functions, webhooks, HTTP actions — no glue code.
Browse tools →Private RAG over your own docs with 4096-dim embeddings. Upload PDFs, code, knowledge bases — chat against them in one call.
Try RAG →Automatically blends cheaper and stronger models per task — fan out to N models in parallel, pick the best, save 60-80% vs always-frontier.
Open Fusion →Discover and install ready-made agents, tools and prompt packs in a click. Free with every plan.
Browse marketplace →Three first-party apps, all open-source under MIT / Apache-2.0, all pre-configured to talk to the Plugsky API. Pick your platform — install with a single command.
Fork of opencode, rebranded 100% as Plugsky. macOS, Linux, Windows. MIT-licensed.
Built on the open-source Jan project (Apache-2.0). Branded Plugsky with plugsky-pro as the default model.
Built on the open-source Open WebUI project. Docker or pip install. plugsky-fusion pre-listed.
All installers are MIT-licensed. See NOTICE for upstream attribution. Full software reference →
31 first-party and partner models behind one OpenAI-compatible endpoint. Free, paid, vision, reasoning, code, embedding — pick by capability, switch in a line. Hover the ⓘ icon for full details on any model.
High-volume APIs, simple migration, usage dashboards.
Reserved throughput & white-label AI to protect margins.
Private endpoints with full audit controls.
Compliance-ready RAG, SSO/RBAC, audit logs.
Wholesale API + per-client isolated deployments.
Low-latency regional endpoints for high-volume support.
Frontier labs win on raw intelligence. Plugsky wins on deployment control, economics & freedom from lock-in.
| Dimension | Global APIs | Plugsky |
|---|---|---|
| Deployment control | Limited | Your cloud, our cloud, or on-prem |
| Pricing | Per-token, rate-limit surprise bills | Flat-rate packages — unlimited usage within fair-use, no per-token bills |
| Data residency | Plan-dependent | Any region you choose — UAE, KSA, EU, US, on-prem |
| White-label | Limited | Core capability — rebrand models, resell under your own brand |
| GPU capacity | Centrally owned | Hybrid: owned data center + GPU Share Network + partner clouds |
| Models | 1-2 first-party + paid add-ons | 31 first-party + partner models behind one OpenAI-compatible API |
| Agent Cloud | Beta, separate billing | Built-in, OpenAI Assistants-compatible, free on every plan |
| RAG (Knowledge) | Add-on, separate vendor | Built-in, 4096-dim embeddings, private collections, included free |
| Model Fusion | Not available | First-class — fan out, vote, merge, save 60-80% on cost |
| Marketplace | Not available | 100+ ready-made agents, tools, prompt packs |
| Compliance | Plan-dependent | GDPR, PDPL, ISO 27001, SOC 2 Type II, HIPAA-ready, 99.99% SLA |
| Languages | English-first | 50+ languages, Arabic dialects native, RTL UI |
| Integrations | A few SDKs | 58+ — every IDE, CLI, desktop, framework, no-code tool |
| Onboarding | Read the docs, figure it out | $5 trial, change base_url, ship in 5 minutes |
“Ministry cut citizen service response from 48 hours to 4 minutes. 500K+ inquiries/month. Satisfaction jumped 67% → 91%. Zero IT headcount added.”
Head of AI · Government“FinTech switched from AWS to Azure in 24 hours with zero code changes. 43% cost reduction. Expanded to 3 countries instantly.”
COO · FinTech“Hospital deployed HIPAA-compliant, air-gapped AI in 5 days. 50K+ records/month. Physicians save 8 hrs/week. Zero breaches.”
Director of IT · Healthcare“We hit regulatory roadblocks expanding to Saudi on AWS. Plugsky deployed identical LLM infrastructure across AWS (UAE), Azure (Saudi) and on-prem (Egypt) instantly. This level of freedom is unprecedented in enterprise AI.”
Jordan Lee · CFO, BrightwaveStart with a 7-day trial for $5. Save 20% with annual billing. Need volume, compliance, or sovereign deployment? See enterprise packages →
|
Self-serve
Trial
7 days
|
Self-serve
Most popular
Starter
Solo devs & small projects
|
Self-serve
Builder
Teams in production
|
Self-serve
Scale
High-volume & scaling
|
Enterprise
Starter Enterprise
First enterprise pilot
|
Enterprise
Most popular
Growth Enterprise
Scaling AI internally
|
Enterprise
Enterprise
Regulated industries
|
Enterprise
Sovereign AI Cloud
Government & banks
|
|
|---|---|---|---|---|---|---|---|---|
| Price | $5 / 7 days | $20/mo | $60/mo | $120/mo | $15K – $25K/year | $50K – $100K/year | $150K – $500K+/year | $500K – $2M+/year |
| Billed annually (−20%) | — | $16/mo | $48/mo | $96/mo | — | — | — | — |
| Usage | Unlimited* — all self-serve plans | By deployment — annual contract | ||||||
| Best for | Trying Plugsky | Solo devs & small projects | Teams in production | High-volume & scaling | First enterprise pilot | Teams scaling AI internally | Regulated companies | Government & banks |
| Models | plugsky-micro only | up to plugsky-pro | up to plugsky-max | all + plugsky-frontier | up to plugsky-frontier | all models | all + private models | all + custom fine-tuning |
| Deployments | 1 | 1 | 1 | 1 | 1 | Up to 5 | Unlimited | Unlimited + sovereign |
| Seats | 1 | 1 | 5 | 25 | 5 | 25 | Unlimited | Unlimited |
| API keys | 2 | 5 | 20 | Unlimited | Unlimited | Unlimited | Unlimited | Unlimited |
| Rate limit* | 60 req/min | 120 req/min | 300 req/min | 1,000 req/min | Custom | Custom | Dedicated | Sovereign dedicated |
| Knowledge (RAG) | 100 docs | 1,000 docs | 20,000 docs | 200,000 docs | 5,000 docs | 50,000 docs | Unlimited | Unlimited + private |
| Support | Priority | Priority + onboarding | Email 48h SLA | Slack channel | Named engineer | 24/7 + dedicated CSM | ||
| SLA | — | — | — | 99.9% | 99.9% | 99.95% | 99.99% | Sovereign + custom |
| SSO / SAML | — | — | — | — | — | ✓ | ✓ | ✓ |
| Audit logs | — | — | — | ✓ | Basic | Full | Full + export | Full + SIEM export |
| On-prem / VPC | — | — | — | — | — | Optional | Optional | Air-gapped available |
| DPA & Compliance | — | — | — | — | DPA included | DPA + SOC 2 | DPA + SOC 2 + HIPAA | FedRAMP + sovereign |
| * Unlimited usage subject to fair-use rate limits. Enterprise packages are annual contracts, billed yearly. Custom configurations available — contact enterprise@plugsky.com for a tailored quote. | ||||||||
Setup, integration, RAG, fine-tuning, and migration support from our engineering team.
Plugsky operates the AI stack end-to-end — monitoring, optimization, model/runtime operations, and 24/7 support.
| Feature | Trial $5 / 7 days |
Starter $20 / mo |
Builder $60 / mo |
Scale $120 / mo |
|---|---|---|---|---|
| ⚡ Core platform | ||||
| All 31 AI models | ✓ | up to pro | up to max | all 31+ |
| OpenAI-compatible API | ✓ | ✓ | ✓ | ✓ |
| Streaming & function calling | ✓ | ✓ | ✓ | ✓ |
| Vision & multimodal | ✓ | ✓ | ✓ | ✓ |
| JSON mode & structured outputs | ✓ | ✓ | ✓ | ✓ |
| Embeddings (4096-dim) | ✓ | ✓ | ✓ | ✓ |
| Fair-use rate limit (req/min) | 60 | 120 | 300 | 1,000 |
| API keys | 1 | 5 | 20 | 100 |
| Usage dashboard & analytics | ✓ | ✓ | ✓ advanced | ✓ custom |
| 🤖 Agent Cloud | ||||
| OpenAI Assistants-compatible agents | ✓ | ✓ | ✓ | ✓ |
| Function calling & tools | ✓ | ✓ | ✓ | ✓ |
| Conversation memory | ✓ | ✓ | ✓ | ✓ |
| Code interpreter | — | ✓ | ✓ | ✓ |
| File search & vector retrieval | — | ✓ | ✓ | ✓ |
| Custom tools (BYO Python) | — | — | ✓ | ✓ |
| 📖 Knowledge (RAG) | ||||
| Document upload & chunking | ✓ | ✓ | ✓ | ✓ |
| Private collections | 10 | 1,000 | 20,000 | 200,000 |
| Multilingual embeddings (100+ langs) | ✓ | ✓ | ✓ | ✓ |
| Hybrid search (vector + keyword) | — | ✓ | ✓ | ✓ |
| Citations & source highlighting | — | ✓ | ✓ | ✓ |
| 🔀 Model Fusion | ||||
| Sequential chains | ✓ | ✓ | ✓ | ✓ |
| Parallel fan-out (up to 8 models) | — | ✓ | ✓ | ✓ |
| Cost-saver auto-escalation | — | ✓ | ✓ | ✓ |
| Majority vote & merge strategies | — | — | ✓ | ✓ |
| Custom routing rules | — | — | — | ✓ |
| 🛒 Marketplace & integrations | ||||
| 100+ ready-made agents & tools | — | ✓ | ✓ | ✓ |
| 58+ client integrations (IDEs, CLIs, desktop, frameworks) | ✓ | ✓ | ✓ | ✓ |
| plugsky CLI & Desktop (Jan, Open WebUI) | ✓ | ✓ | ✓ | ✓ |
| Bring-your-own models (custom fine-tunes) | — | — | — | ✓ |
| 👥 Team & security | ||||
| Team seats | 1 | 1 | 5 | 25 |
| Role-based access control (RBAC) | — | — | ✓ | ✓ |
| Audit logs | — | — | 30 days | unlimited |
| SSO / SAML | — | — | — | ✓ |
| GDPR / PDPL compliance | ✓ | ✓ | ✓ | ✓ |
| SOC 2 Type II / ISO 27001 | — | — | — | ✓ |
| Uptime SLA | — | — | 99.9% | 99.95% |
| Priority email support | — | — | ✓ | ✓ 24h |
| 💳 Billing | ||||
| Flat monthly fee (no per-token) | ✓ | ✓ | ✓ | ✓ |
| Annual billing (save 20%) | — | ✓ | ✓ | ✓ |
| No overage fees | ✓ | ✓ | ✓ | ✓ |
| Cancel anytime | ✓ | ✓ | ✓ | ✓ |
Need an enterprise or sovereign-cloud plan? Contact our team → — every enterprise contract is priced to your deployment and security requirements.
Data residency, compliance and the case for running inference where you choose.
EngineeringSwitch your base URL, keep your stack — OpenAI-compatible by design.
ProductBuild agents with tools, memory and private RAG on infrastructure you control.
The 20 questions customers ask most before signing up.
Yes. There are no per-token charges on any plan. Every plan is flat-rate, and usage is gated only by the fair-use rate limit for your tier (60, 120, 300, or 1,000 req/min). You can call a million times on a $20 plan and still pay $20.
Change your base_url to https://api.plugsky.com/v1, set your API key, and use model="plugsky-pro" instead of gpt-4o. Your existing code, SDK, and prompts work unchanged. Most teams migrate in 5 minutes.
31 first-party models (plugsky-micro through plugsky-frontier, plus Kimi, Qwen, DeepSeek, Llama, Mistral, NVIDIA Nemotron, Gemma) and the same OpenAI-compatible endpoint also supports partner models. Free, paid, vision, reasoning, code, embeddings — all behind one API.
Yes. Switch from AWS to Azure to Google Cloud to on-prem without rebuilding your infrastructure. That is the core benefit of Plugsky — no vendor lock-in, ever.
Yes — Enterprise includes on-premise and air-gapped deployment for maximum data control. Runs in your own data center, on your own hardware, with no outbound network calls.
Model Fusion lets you call multiple models in parallel and pick the best result. Set model="plugsky-fusion" and Plugsky runs your default chain (e.g. micro → pro → max) — fast and cheap, with auto-escalation on hard prompts. Saves 60-80% on production traffic.
Agent Cloud is our platform for building AI agents with function-calling, memory, tools, and orchestration. OpenAI Assistants-compatible for one-line migration. Free on every plan.
Private RAG over your own documents. Upload PDFs, code, or knowledge bases; Plugsky chunks them, embeds with 4096-dim vectors, and lets you chat against your data in one API call. No separate vector DB needed.
100+ ready-made agents, tools, and prompt packs. Browse in the dashboard, click "Deploy", and they're live in your workspace. Free with every plan.
58+ — every major IDE (VS Code, JetBrains, Cursor, Cline, Continue), CLI (plugsky CLI, Aider, Goose, OpenCode, Crush), desktop (Jan, Open WebUI, Chatbox, Jan, Msty, LM Studio), framework (LangChain, LlamaIndex, Haystack, CrewAI, PydanticAI, Semantic Kernel), and no-code (Flowise, Langflow, Dify, Activepieces, Pipedream, Retool).
ISO 27001, SOC 2 Type II, GDPR, PDPL-compliant, HIPAA-ready. FedRAMP in progress. Enterprise plans include a signed DPA, BAA, and full audit log export.
Self-serve plans: under 5 minutes. Enterprise private cloud: 1-3 days. On-premise: 3-7 days. White-glove migration with a named engineer: included in Enterprise.
Standard SLA 99.9%, Enhanced 99.95%, Custom up to 99.99% with financial penalties for downtime. The /status page is the public source of truth for live uptime.
Yes. Enterprise customers can negotiate custom terms, payment schedules, SLAs, regional deployment, and dedicated support engineers.
You pick the region — UAE-1, UAE-2, KSA, EU, US, or on-prem. Data never leaves the region you select. Full audit log of every API call is stored in the same region.
Yes. The Enterprise plan supports custom model uploads (fine-tuned or open-weights) served from your own infrastructure, behind the same OpenAI-compatible API. White-label option included.
Flat monthly or annual fee. No per-token charges, no per-request charges, no overage fees. You can upgrade, downgrade, or cancel anytime. Annual plans save 20%.
7 days of full access to all 31 models, Agent Cloud, RAG, Model Fusion, marketplace, and all integrations. No credit card required. At the end, pick a plan or your account goes read-only (we never auto-charge).
Verified operators contribute idle GPUs and earn 70% of the revenue Plugsky routes to them. Plugsky handles auth, metering, and routing. Non-sensitive workloads only — every payload is classified before routing.
Free: documentation, community Discord, GitHub issues. Pro: priority email, 24h response. Enterprise: named engineer, 4h SLA, Slack/Teams/phone. SLA: 99.9% uptime or you get credit.
Free to start. OpenAI-compatible. Live in minutes.