● Deploy anywhere
◈ Every model. One API. Your cloud or ours.

Ship AI products faster & cheaper — fully under your control

One OpenAI-compatible API for every top model — predictable pricing, no rate caps. Run it on our cloud, your cloud, or on-prem in any country.

OpenAI-compatibleData residencySSO / RBACAudit logsWhite-label99.9% SLA
plugsky · /v1/chat/completions
$ curl https://api.plugsky.com/v1/chat/completions \
  -H "Authorization: Bearer sk-live-…" \
  -d '{ "model": "plugsky-pro", "stream": true }'
 
{
  "region": "auto",
  "latency_ms": 38,
  "content": "Hello 👋 Any model, your region, one API."
} 
35M+
API calls / day
2M+
AI agents powered
50,000+
Developers Addicted
99.9%
Uptime SLA
31+
Powerful AI models
$5
7-day full-access trial

Figures sourced from the Plugsky data room and pending independent validation before external distribution.

Trusted by the leaders

Backed by a world-class ecosystem — NVIDIA, Microsoft, OpenAI, AWS, Google, PayPal, Meta, Shopify — across enterprise AI, fintech, government and SaaS.

🟢 Nvidia Ⓜ️ Microsoft 🅿️ PayPal ⓕ Facebook 🇬 Google 🅾 OpenAI 🅰️ AWS 🇬 Google Cloud 🛍️ Shopify 🟢 Nvidia Ⓜ️ Microsoft 🅿️ PayPal ⓕ Facebook 🇬 Google 🅾 OpenAI 🅰️ AWS 🇬 Google Cloud 🛍️ Shopify
👨‍💻

Built for developers

Call 31+ models through one OpenAI-compatible API — flat pricing from $5, no per-token bills, switch from lite to frontier in a single line.

🧩

Made for SaaS teams

White-label models with flat, predictable pricing — built-in routing, auto-fallback, and team seats that protect your margins as you scale.

🏛️

Enterprise & regulated

Private, sovereign deployment in your own cloud or region — SSO/SAML, audit logs, GDPR & PDPL alignment, and SLAs up to 99.99%.

The bottleneck

AI is becoming a utility. Access is the constraint.

Global platforms impose rate caps, unpredictable token costs, and zero control over where your data runs. Plugsky removes those limits for any team, anywhere.

Rate & token limits

Throughput caps and queueing throttle production AI products at the worst time — peak demand.

📈

Unpredictable pricing

Pure token billing makes margins impossible to plan. SaaS teams need fixed, high-volume packages.

🛡️

No control over data

Sending sensitive data to third-party clouds you don't control is a non-starter for regulated teams.

One platform, every product

Every product your AI team needs — already wired

Six core products, one OpenAI-compatible API, one bill, one account. Each stands on its own — but they click together the moment you turn them on.

🧠

Models

Call 31+ models from one OpenAI-compatible API — lite to frontier, switch in a single line. Free, paid and embedding models side by side.

View all 31 →
🤖

Agent Cloud

Build and run agents with function-calling, memory, and orchestration. OpenAI Assistants-compatible for one-line migration.

Open Agents →
🧩

Tools & Connectors

Wire agents into your tools, APIs and data sources. Bring-your-own Python functions, webhooks, HTTP actions — no glue code.

Browse tools →
📖

Knowledge (RAG)

Private RAG over your own docs with 4096-dim embeddings. Upload PDFs, code, knowledge bases — chat against them in one call.

Try RAG →
🔀

Model Fusion

Automatically blends cheaper and stronger models per task — fan out to N models in parallel, pick the best, save 60-80% vs always-frontier.

Open Fusion →
🛒

Marketplace

Discover and install ready-made agents, tools and prompt packs in a click. Free with every plan.

Browse marketplace →
Software & downloads

One-line install on every platform

Three first-party apps, all open-source under MIT / Apache-2.0, all pre-configured to talk to the Plugsky API. Pick your platform — install with a single command.

⌨️

Plugsky CLI

AI coding agent for your terminal

Fork of opencode, rebranded 100% as Plugsky. macOS, Linux, Windows. MIT-licensed.

curl -fsSL https://plugsky.com/install | bash
🖥️

Plugsky Desktop

Native chat client (Jan fork)

Built on the open-source Jan project (Apache-2.0). Branded Plugsky with plugsky-pro as the default model.

curl -fsSL https://plugsky.com/install-desktop | bash
🌐

Plugsky Web

Self-hosted chat UI (Open WebUI fork)

Built on the open-source Open WebUI project. Docker or pip install. plugsky-fusion pre-listed.

curl -fsSL https://plugsky.com/install-web | bash

All installers are MIT-licensed. See NOTICE for upstream attribution. Full software reference →

Model ladder

From micro to frontier — 31 models, one API

31 first-party and partner models behind one OpenAI-compatible endpoint. Free, paid, vision, reasoning, code, embedding — pick by capability, switch in a line. Hover the icon for full details on any model.

View full model reference →
Built for

Built for every team

👩‍💻

Developers

High-volume APIs, simple migration, usage dashboards.

🧩

SaaS platforms

Reserved throughput & white-label AI to protect margins.

🏛️

Government

Private endpoints with full audit controls.

🏦

Banking & fintech

Compliance-ready RAG, SSO/RBAC, audit logs.

🤝

Agencies & SIs

Wholesale API + per-client isolated deployments.

📊

Trading & markets

Low-latency regional endpoints for high-volume support.

Why Plugsky

Compete on infrastructure, not model hype

Frontier labs win on raw intelligence. Plugsky wins on deployment control, economics & freedom from lock-in.

DimensionGlobal APIsPlugsky
Deployment controlLimitedYour cloud, our cloud, or on-prem
PricingPer-token, rate-limit surprise billsFlat-rate packages — unlimited usage within fair-use, no per-token bills
Data residencyPlan-dependentAny region you choose — UAE, KSA, EU, US, on-prem
White-labelLimitedCore capability — rebrand models, resell under your own brand
GPU capacityCentrally ownedHybrid: owned data center + GPU Share Network + partner clouds
Models1-2 first-party + paid add-ons31 first-party + partner models behind one OpenAI-compatible API
Agent CloudBeta, separate billingBuilt-in, OpenAI Assistants-compatible, free on every plan
RAG (Knowledge)Add-on, separate vendorBuilt-in, 4096-dim embeddings, private collections, included free
Model FusionNot availableFirst-class — fan out, vote, merge, save 60-80% on cost
MarketplaceNot available100+ ready-made agents, tools, prompt packs
CompliancePlan-dependentGDPR, PDPL, ISO 27001, SOC 2 Type II, HIPAA-ready, 99.99% SLA
LanguagesEnglish-first50+ languages, Arabic dialects native, RTL UI
IntegrationsA few SDKs58+ — every IDE, CLI, desktop, framework, no-code tool
OnboardingRead the docs, figure it out$5 trial, change base_url, ship in 5 minutes
Customer stories

Real results, real outcomes

“Ministry cut citizen service response from 48 hours to 4 minutes. 500K+ inquiries/month. Satisfaction jumped 67% → 91%. Zero IT headcount added.”

Head of AI · Government

“FinTech switched from AWS to Azure in 24 hours with zero code changes. 43% cost reduction. Expanded to 3 countries instantly.”

COO · FinTech

“Hospital deployed HIPAA-compliant, air-gapped AI in 5 days. 50K+ records/month. Physicians save 8 hrs/week. Zero breaches.”

Director of IT · Healthcare

“We hit regulatory roadblocks expanding to Saudi on AWS. Plugsky deployed identical LLM infrastructure across AWS (UAE), Azure (Saudi) and on-prem (Egypt) instantly. This level of freedom is unprecedented in enterprise AI.”

Jordan Lee · CFO, Brightwave
Pricing

Simple pricing — self-serve and enterprise packages

Start with a 7-day trial for $5. Save 20% with annual billing. Need volume, compliance, or sovereign deployment? See enterprise packages →

Self-serve
Trial
7 days
Self-serve
Most popular
Starter
Solo devs & small projects
Self-serve
Builder
Teams in production
Self-serve
Scale
High-volume & scaling
Enterprise
Starter Enterprise
First enterprise pilot
Enterprise
Most popular
Growth Enterprise
Scaling AI internally
Enterprise
Enterprise
Regulated industries
Enterprise
Sovereign AI Cloud
Government & banks
Price $5 / 7 days $20/mo $60/mo $120/mo $15K – $25K/year $150K – $500K+/year $500K – $2M+/year
Billed annually (−20%) $16/mo $48/mo $96/mo
Usage Unlimited* — all self-serve plans By deployment — annual contract
Best for Trying Plugsky Solo devs & small projects Teams in production High-volume & scaling First enterprise pilot Regulated companies Government & banks
Models plugsky-micro only up to plugsky-pro up to plugsky-max all + plugsky-frontier up to plugsky-frontier all + private models all + custom fine-tuning
Deployments 1 1 1 1 1 Unlimited Unlimited + sovereign
Seats 1 1 5 25 5 Unlimited Unlimited
API keys 2 5 20 Unlimited Unlimited Unlimited Unlimited
Rate limit* 60 req/min 120 req/min 300 req/min 1,000 req/min Custom Dedicated Sovereign dedicated
Knowledge (RAG) 100 docs 1,000 docs 20,000 docs 200,000 docs 5,000 docs Unlimited Unlimited + private
Support Email Email Priority Priority + onboarding Email 48h SLA Named engineer 24/7 + dedicated CSM
SLA 99.9% 99.9% 99.99% Sovereign + custom
SSO / SAML
Audit logs Basic Full + export Full + SIEM export
On-prem / VPC Optional Air-gapped available
DPA & Compliance DPA included DPA + SOC 2 + HIPAA FedRAMP + sovereign
* Unlimited usage subject to fair-use rate limits. Enterprise packages are annual contracts, billed yearly. Custom configurations available — contact enterprise@plugsky.com for a tailored quote.
🛠

Professional Services Add-on

$50K – $500K

Setup, integration, RAG, fine-tuning, and migration support from our engineering team.

Implementation & onboarding Custom workflows RAG over your data Fine-tuning & evaluation
⚙️

Managed AI Ops Add-on

+30% uplift

Plugsky operates the AI stack end-to-end — monitoring, optimization, model/runtime operations, and 24/7 support.

24/7 monitoring + on-call Runtime optimization Model & prompt tuning Cost optimization
Need sovereign deployment? Private endpoints, dedicated capacity, on-prem/VPC, SSO/SAML, DPA, custom SLA — annual, priced by deployment.
FAQ

Everything you need to know

The 20 questions customers ask most before signing up.

Is usage really unlimited?

Yes. There are no per-token charges on any plan. Every plan is flat-rate, and usage is gated only by the fair-use rate limit for your tier (60, 120, 300, or 1,000 req/min). You can call a million times on a $20 plan and still pay $20.

How do I migrate from OpenAI?

Change your base_url to https://api.plugsky.com/v1, set your API key, and use model="plugsky-pro" instead of gpt-4o. Your existing code, SDK, and prompts work unchanged. Most teams migrate in 5 minutes.

What models do you support?

31 first-party models (plugsky-micro through plugsky-frontier, plus Kimi, Qwen, DeepSeek, Llama, Mistral, NVIDIA Nemotron, Gemma) and the same OpenAI-compatible endpoint also supports partner models. Free, paid, vision, reasoning, code, embeddings — all behind one API.

Can I switch clouds after deployment?

Yes. Switch from AWS to Azure to Google Cloud to on-prem without rebuilding your infrastructure. That is the core benefit of Plugsky — no vendor lock-in, ever.

Do you support on-premise deployment?

Yes — Enterprise includes on-premise and air-gapped deployment for maximum data control. Runs in your own data center, on your own hardware, with no outbound network calls.

What is Model Fusion?

Model Fusion lets you call multiple models in parallel and pick the best result. Set model="plugsky-fusion" and Plugsky runs your default chain (e.g. micro → pro → max) — fast and cheap, with auto-escalation on hard prompts. Saves 60-80% on production traffic.

What is Agent Cloud?

Agent Cloud is our platform for building AI agents with function-calling, memory, tools, and orchestration. OpenAI Assistants-compatible for one-line migration. Free on every plan.

What is the Knowledge (RAG) feature?

Private RAG over your own documents. Upload PDFs, code, or knowledge bases; Plugsky chunks them, embeds with 4096-dim vectors, and lets you chat against your data in one API call. No separate vector DB needed.

How does the marketplace work?

100+ ready-made agents, tools, and prompt packs. Browse in the dashboard, click "Deploy", and they're live in your workspace. Free with every plan.

What integrations do you support?

58+ — every major IDE (VS Code, JetBrains, Cursor, Cline, Continue), CLI (plugsky CLI, Aider, Goose, OpenCode, Crush), desktop (Jan, Open WebUI, Chatbox, Jan, Msty, LM Studio), framework (LangChain, LlamaIndex, Haystack, CrewAI, PydanticAI, Semantic Kernel), and no-code (Flowise, Langflow, Dify, Activepieces, Pipedream, Retool).

What compliance certifications do you have?

ISO 27001, SOC 2 Type II, GDPR, PDPL-compliant, HIPAA-ready. FedRAMP in progress. Enterprise plans include a signed DPA, BAA, and full audit log export.

How quickly can I deploy?

Self-serve plans: under 5 minutes. Enterprise private cloud: 1-3 days. On-premise: 3-7 days. White-glove migration with a named engineer: included in Enterprise.

What is your uptime guarantee?

Standard SLA 99.9%, Enhanced 99.95%, Custom up to 99.99% with financial penalties for downtime. The /status page is the public source of truth for live uptime.

Do you offer custom contracts?

Yes. Enterprise customers can negotiate custom terms, payment schedules, SLAs, regional deployment, and dedicated support engineers.

How is data residency handled?

You pick the region — UAE-1, UAE-2, KSA, EU, US, or on-prem. Data never leaves the region you select. Full audit log of every API call is stored in the same region.

Can I use my own models?

Yes. The Enterprise plan supports custom model uploads (fine-tuned or open-weights) served from your own infrastructure, behind the same OpenAI-compatible API. White-label option included.

How does billing work?

Flat monthly or annual fee. No per-token charges, no per-request charges, no overage fees. You can upgrade, downgrade, or cancel anytime. Annual plans save 20%.

What is the $5 trial?

7 days of full access to all 31 models, Agent Cloud, RAG, Model Fusion, marketplace, and all integrations. No credit card required. At the end, pick a plan or your account goes read-only (we never auto-charge).

How does the GPU Share Network work?

Verified operators contribute idle GPUs and earn 70% of the revenue Plugsky routes to them. Plugsky handles auth, metering, and routing. Non-sensitive workloads only — every payload is classified before routing.

Where can I get support?

Free: documentation, community Discord, GitHub issues. Pro: priority email, 24h response. Enterprise: named engineer, 4h SLA, Slack/Teams/phone. SLA: 99.9% uptime or you get credit.

Build on AI infrastructure you control today

Free to start. OpenAI-compatible. Live in minutes.