Plugsky Documentation — Complete A-Z Reference

Getting started

Five minutes from zero to your first model response. The Plugsky API is 100% OpenAI-compatible — change your base_url and your existing code, SDK, and prompts keep working.

Install

# Python (OpenAI SDK already works)
pip install opencode

# Node.js / TypeScript
npm i opencode

# Go
go get github.com/plugsky/opencode-go

# Or use raw HTTP — no SDK required

Your first API call

python

from openai import OpenAI

client = OpenAI(
    api_key="psk_live_…",                          # your Plugsky key
    base_url="https://api.plugsky.com/v1",         # the only line that changes
)

resp = client.chat.completions.create(
    model="plugsky-pro",
    messages=[{"role": "user", "content": "Say hello in 5 languages"}],
    max_tokens=200,
)
print(resp.choices[0].message.content)

That's it. Same request shape, same response shape, same streaming, same function-calling, same JSON mode. Full OpenAI-compatibility reference →

✓ Drop-in. Your existing OpenAI Python, Node, Go, Java, .NET, and cURL clients work once you point them at the Plugsky base URL and key; no other code changes are needed. Tools like LangChain, LlamaIndex, Vercel AI SDK, and the OpenAI Playground are supported as first-class integrations.
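For the raw-HTTP route mentioned under Install, a minimal standard-library sketch could look like this. It assumes the endpoint accepts the same JSON body as the SDK call above and returns the OpenAI response shape; build_chat_request and send are hypothetical helper names, not part of any Plugsky SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.plugsky.com/v1"  # base URL from the example above


def build_chat_request(api_key: str, prompt: str, model: str = "plugsky-pro") -> urllib.request.Request:
    """Build (but do not send) a chat-completions request. Hypothetical helper."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 200,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling send(build_chat_request(key, "Say hello")) would then return a dict whose choices[0]["message"]["content"] holds the reply, assuming the OpenAI response shape.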

Core concepts

Model. The inference engine. plugsky-micro through plugsky-frontier, plus 13+ third-party (opencode.ai, NVIDIA NIM, Mistral, Cohere, Stability).
Request. A single API call. Token-counted. Priced.
Thread / Run. Stateful multi-turn conversation (Assistants API).
Tool. A function you expose to the model for function-calling.
Knowledge base / Vector store. Indexed documents the model can retrieve from (RAG).
Endpoint / Region. Where the model runs. me-central-1 (UAE), eu-west-1, us-east-1, plus customer VPC and on-prem.

Authentication

Plugsky uses bearer-token API keys. Keys are project-scoped, role-scoped, and rotatable without downtime.

API keys

Generate keys in the Dashboard → API keys. Each key carries a scope and a project. Three built-in roles:

Role	Can do	Cannot do
`read`	List models, get usage, view audit logs	Inference, key creation
`infer`	Chat / embeddings / image / audio / batch	Admin, billing, key creation
`admin`	All of the above + key mgmt + billing + RBAC	Org-level settings (root only)

Environment variables

bash

export PLUGSKY_API_KEY="psk_live_…"
export PLUGSKY_BASE_URL="https://api.plugsky.com/v1"
export PLUGSKY_PROJECT="prj_8x2…"   # optional, defaults to your first project
export PLUGSKY_REGION="me-central-1"
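The stock OpenAI SDK reads OPENAI_API_KEY and OPENAI_BASE_URL, not the PLUGSKY_* variables, so a small loader is one way to bridge the two. The sketch below is an assumption about how you might wire this up; load_plugsky_config is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def load_plugsky_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read Plugsky settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("PLUGSKY_API_KEY")
    if not api_key:
        raise RuntimeError("PLUGSKY_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("PLUGSKY_BASE_URL", "https://api.plugsky.com/v1"),
        # Optional values stay None so the documented server-side defaults apply.
        "project": env.get("PLUGSKY_PROJECT"),
        "region": env.get("PLUGSKY_REGION"),
    }
```

With this in place, OpenAI(api_key=cfg["api_key"], base_url=cfg["base_url"]) builds the client from cfg = load_plugsky_config().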

Scopes & roles

Each key has a comma-separated scope list. Examples: chat:write,embeddings:write,files:read. Use the narrowest scope that works — production keys should never have admin.
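A client-side preflight check can catch a missing scope before a request fails with 403. The sketch below only parses the comma-separated scope string shown above; the helper names are hypothetical, and the server remains the authority on what a key may do.

```python
def parse_scopes(scope_list: str) -> set:
    """Split a comma-separated scope list into a set, ignoring stray whitespace."""
    return {s.strip() for s in scope_list.split(",") if s.strip()}


def missing_scopes(granted: str, required: list) -> list:
    """Return the required scopes that the key's scope list does not contain."""
    have = parse_scopes(granted)
    return [s for s in required if s not in have]
```

For example, missing_scopes("chat:write,files:read", ["chat:write", "embeddings:write"]) reports that embeddings:write is absent.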

OAuth 2.0 (3rd-party apps)

Plugsky supports standard OAuth 2.0 authorization-code flow with PKCE for SaaS apps that want to offer "Sign in with Plugsky" or access their users' Plugsky workspaces. Read the full OAuth guide →

⚠ Rotate keys quarterly. Old keys stay valid for 24 hours after rotation so deployments can roll without a brownout. Rotate now →

API reference

Every endpoint, every parameter, every status code. Compatible with OpenAI's /v1/* namespace; Plugsky-specific extensions live under /v1/plugsky/*.

Chat completions

POST /v1/chat/completions

The primary inference endpoint. Supports streaming, function-calling, JSON mode, structured outputs, vision, and tool use.

Required: model, messages · Optional: temperature, top_p, n, stream, stop, max_tokens, presence_penalty, frequency_penalty, tools, tool_choice, response_format, seed, user

python

from openai import OpenAI
client = OpenAI(api_key="psk_live_…", base_url="https://api.plugsky.com/v1")

stream = client.chat.completions.create(
    model="plugsky-pro",
    messages=[{"role": "user", "content": "Write a haiku about GCC summers"}],
    stream=True,
    temperature=0.7,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Legacy completions

POST /v1/completions

Pre-chat raw completion endpoint. Backwards-compatible. Prefer /v1/chat/completions for new integrations.

Embeddings

POST /v1/embeddings

Vector embeddings for semantic search, RAG, clustering, and recommendations. Models →

python

resp = client.embeddings.create(
    model="plugsky-embed",   # default embedding model from the Models table
    input=["Plugsky is a sovereign AI cloud", "GCC banks run on us"],
)
print(len(resp.data[0].embedding), "dimensions")  # 4096 for plugsky-embed, per the Models table
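For the semantic-search use case, the returned vectors are typically compared with cosine similarity. The following is a plain-Python sketch of that comparison, not a Plugsky API; the function names are illustrative, and a production system would more likely use a vector store.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs) -> list:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)
```

In practice query_vec and doc_vecs would be the embedding lists taken from resp.data[i].embedding.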

Image generation

POST /v1/images/generations

DALL·E-compatible image generation. Models: plugsky-imagine-xl, plugsky-imagine-fast.

Audio (ASR / TTS)

POST /v1/audio/transcriptions

Speech-to-text. 99 languages, Arabic dialect support, real-time streaming. Model: whisper-plugsky.

POST /v1/audio/speech

Text-to-speech. 30+ voices, SSML, Arabic & English, real-time.

Moderation

POST /v1/moderations

Classify text for harmful content. Returns per-category scores (hate, violence, sexual, self-harm). Plugsky extensions add two more categories: pii and prompt-injection.

Files

POST /v1/files

Upload files for fine-tuning, batch, or vector-store ingestion. Up to 512 MB per file. PDF, DOCX, MD, TXT, JSON, CSV, audio, video.

Batch API

POST /v1/batches

Process up to 50,000 requests asynchronously. 24-hour SLA, 50% cheaper than sync. Returns JSONL results.
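A batch is usually submitted as a JSONL file with one request per line. The sketch below assumes Plugsky accepts the OpenAI Batch input format (custom_id, method, url, body), which this page does not spell out; build_batch_jsonl is a hypothetical helper.

```python
import json

MAX_BATCH_REQUESTS = 50_000  # per-batch limit stated above


def build_batch_jsonl(prompts, model: str = "plugsky-pro") -> str:
    """Serialize prompts as JSONL batch input (assumed OpenAI Batch line format)."""
    if len(prompts) > MAX_BATCH_REQUESTS:
        raise ValueError(f"a batch holds at most {MAX_BATCH_REQUESTS} requests")
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"req-{i}",  # used to match results back to inputs
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)
```

The resulting text can be written to a file and uploaded through POST /v1/files before creating the batch.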

Fine-tuning

POST /v1/fine_tuning/jobs

Supervised fine-tuning (SFT) and DPO. Base models: plugsky-micro, plugsky-lite, plugsky-pro. Your data, your region, your keys.

Assistants / Agents

POST /v1/assistants

Stateful agents with tools (code interpreter, file search, function calling), compatible with the OpenAI Assistants API.

Function calling / tools

Pass a tools array with JSON-Schema function definitions. The model returns a structured tool_calls payload you execute, then return the result. Streaming and parallel tool calls supported. Generate your function schema →
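The round trip described above can be sketched as follows, using the OpenAI chat-completions tool format. get_weather, TOOLS, REGISTRY, and run_tool_calls are illustrative names, and the tool body is a stand-in rather than a real weather lookup.

```python
import json


def get_weather(city: str) -> dict:
    """Stand-in tool implementation; a real one would call a weather service."""
    return {"city": city, "forecast": "sunny"}


# JSON-Schema definition passed as tools=TOOLS in the chat-completions request.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

REGISTRY = {"get_weather": get_weather}  # maps tool names to local callables


def run_tool_calls(tool_calls: list) -> list:
    """Execute each tool call and build the tool-role messages to send back."""
    results = []
    for call in tool_calls:
        fn = REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results
```

Here tool_calls is the list of dicts taken from the assistant message (for SDK objects, message.model_dump()["tool_calls"]). The returned messages are appended to the conversation, after the assistant message that requested them, before the next completion call.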

Responses (stateful)

POST /v1/responses

OpenAI Responses-compatible stateful endpoint. Built-in RAG, web search, code interpreter, and computer use. Recommended for new agentic builds.

Plugsky extensions

GET /v1/plugsky/usage

Per-key, per-model, per-day usage. Returns the same shape used by the Dashboard.

POST /v1/plugsky/route

Smart routing — pass model="auto" and Plugsky picks the best model for the prompt, your cost target, and your latency target.

Models

31 first-party models across free, paid, and embedding tiers — all served behind one OpenAI-compatible endpoint, billed from one invoice, governed by one set of policies. Use GET /v1/models for the live list. Hover the ⓘ icon in the dashboard Models page for full details on any model.

Free tier — instant, no API key required

Model	Context	Best for	Tier
`plugsky-micro`	131K	Fast, cheap — classification, simple chat, intent detection	free
`plugsky-lite`	32K	Support & chat automation — moderate complexity	free

Paid tier — balanced general agents

Model	Context	Best for	Tier
`plugsky-plus`	32K	Balanced general agent — good quality at lower cost	paid
`plugsky-pro`	65K	Coding & reasoning (default) — strong general purpose	default
`plugsky-max`	131K	Complex multi-step — deep reasoning	reasoning
`plugsky-frontier`	131K	Frontier-tier — Mistral Large 3 675B (EU origin, 128K context)	reasoning

Specialized — reasoning, vision, code, long-context

Model	Context	Best for	Capabilities
`plugsky-reasoning`	65K	Deep reasoning, math, code — NVIDIA Nemotron Super 120B	🧠 reasoning · 🔧 tools
`plugsky-kimi`	131K	Long-context (256K) — MoonshotAI Kimi K2.6	📄 long-context
`plugsky-deepseek-pro`	65K	Reasoning + code — DeepSeek V4 Pro	🧠 reasoning · 💻 code
`plugsky-deepseek-flash`	32K	Fast DeepSeek — V4 Flash	⚡ fast
`plugsky-gpt-oss`	32K	Open-source GPT — gpt-oss-120B (NVIDIA)	🧠 reasoning · 📄 long-context
`plugsky-qwen-next`	131K	Alibaba Qwen3 Next 80B MoE (256K context)	📄 long-context · ⚛ MoE
`plugsky-coder`	131K	Best open coding model — Qwen3 Coder 480B MoE	💻 code · 📄 long-context
`plugsky-minimax`	32K	NVIDIA MiniMax-M3 — strong multimodal + reasoning	👁 vision · 🎬 video · 🧠 reasoning
`plugsky-vision-fast`	32K	Multimodal fast — Llama 3.2 Vision 11B	👁 vision
`plugsky-llama4`	131K	Meta Llama 4 Maverick 17B (128 experts MoE)	⚛ MoE · 📄 long-context
`plugsky-qwen-vl`	262K	Qwen 3.5 397B MoE — multimodal + 256K context + reasoning	👁 vision · 🧠 reasoning · 📄 long-context
`plugsky-longctx`	131K	Mistral Large 3 675B — European, 128K context	📄 long-context · 🇪🇺 EU
`plugsky-mistral-medium`	131K	Mistral Medium 3.5 128B — fast 128K context	📄 long-context
`plugsky-gemma-4`	32K	Google Gemma 3 Nano 4B — fast + multimodal	👁 vision · ⚡ fast
`plugsky-nano`	1M	NVIDIA Nemotron 3 Nano 30B MoE — 1M context, fast	📄 long-context · ⚛ MoE
`plugsky-tiny`	131K	NVIDIA Nemotron Nano 9B v2 — small, fast, low cost	⚡ fast
`plugsky-coder-fast`	32K	Fast coding — Llama 3.2 3B (newer than 3.1)	💻 code · ⚡ fast
`plugsky-phi`	131K	NVIDIA Nemotron Mini 4B — ultra-compact, very fast	⚡ fast
`plugsky-ultra`	1M	NVIDIA Nemotron 3 Nano Omni 30B — omni-modal + reasoning, 1M context	👁 vision · 🧠 reasoning · 📄 long-context
`plugsky-gemma3-nano-2b`	32K	Google Gemma 3 Nano 2B — ultra-compact, fast	⚡ fast
`plugsky-gemma3-nano-4b`	32K	Google Gemma 3 Nano 4B — small + fast + multimodal	⚡ fast
`plugsky-mistral-small`	131K	Mistral Small 4 119B — fast + 128K context	📄 long-context

Embedding models — for RAG & semantic search

Model	Dimensions	Max tokens	Notes
`plugsky-embed`	4096	8192	Default. Best $/quality for RAG.
`plugsky-embed-nim`	4096	8192	NVIDIA NV-Embed v1 — best general embeddings.
`plugsky-embed-multilingual`	1024	8192	BGE-M3 — multilingual embeddings (100+ langs).

Smart routing & Model Fusion

Two ways to get cost savings automatically. Set model="plugsky-fusion" to use the dashboard's default chain (sequential, parallel, cost-saver — your choice). Or set model="auto" with a route_hint for Plugsky's classifier-based routing. Typical savings: 60-80% on production traffic. See Model Fusion in the dashboard for the full UI.

python

# Option 1: use your configured Fusion chain
resp = client.chat.completions.create(
    model="plugsky-fusion",   # runs the workspace's default chain
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.model)             # which model actually answered (e.g. "plugsky-micro")

# Option 2: smart routing with cost hint
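resp = client.chat.completions.create(
    model="auto",              # classifier picks the best model
    messages=[{"role": "user", "content": "Hello!"}],
    # Plugsky-specific fields are not OpenAI SDK parameters, so pass them via extra_body
    extra_body={
        "route_hint": "cost",        # cost | quality | latency
        "max_cost_per_1m": 0.50,
    },
)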
print(resp.model)             # the model Plugsky chose

SDKs

The OpenAI SDK works as-is. We also ship idiomatic first-party SDKs that add Plugsky-specific features (smart routing, batch, regional pinning).

Python v2.1.0. 3.8+. pip install opencode. Streaming, async, type hints.
Node.js / TypeScript v1.4.0. 18+. npm i opencode. Browser, Bun, Deno, Cloudflare Workers.
Go v0.9.0. 1.21+. go get github.com/plugsky/opencode-go. Context-aware.
Java / Kotlin v1.1.0. JDK 11+. Maven & Gradle. Coroutines, Reactor, sync.
Rust v0.6.0. 1.74+. cargo add opencode. tokio, async-std, sync.
cURL & raw HTTP v1.0. Any HTTP client. OpenAPI 3.1 spec published at /v1/openapi.json.

Framework integrations

LangChain — ChatOpenAI(base_url="https://api.plugsky.com/v1")
LlamaIndex — OpenAI(base_url=…)
Vercel AI SDK — openai("…", { baseURL: "https://api.plugsky.com/v1" })
Haystack — OpenAIGenerator(api_base=…)
Semantic Kernel — OpenAIChatCompletion(…endpoint=…)
AutoGen — OpenAIWrapper(base_url=…)
OpenAI Playground — Custom base URL field. Tested daily.

Example: streaming with back-pressure

python

import openai
client = openai.OpenAI(api_key="psk_live_…", base_url="https://api.plugsky.com/v1")

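# chat.completions.stream() needs a recent openai release (older ones expose it under client.beta)
with client.chat.completions.stream(
    model="plugsky-pro",
    messages=[{"role":"user","content":"Tell me a 500-word story about Plugsky"}],
) as stream:
    for event in stream:
        if event.type == "content.delta":   # incremental text arrives as content.delta events
            print(event.delta, end="", flush=True)
    final = stream.get_final_completion()
    print("\n--- usage:", final.usage)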

Official software & installers

First-party desktop, CLI, and self-hosted apps — all open-source under MIT / Apache-2.0, branded Plugsky, and pre-configured to talk to the Plugsky API. Install with one command.

Plugsky CLI

AI coding agent for your terminal. Fork of opencode, rebranded 100% as Plugsky. MIT-licensed, supports all major platforms.

bash

# macOS / Linux
curl -fsSL https://plugsky.com/install | bash

# Windows (PowerShell)
irm https://plugsky.com/install | iex

# With API key
curl -fsSL https://plugsky.com/install | bash -s -- --api-key psk_live_…
plugsky

View on GitHub → · All 58 integrations →

Plugsky Desktop (Jan-based)

Native AI chat client for macOS, Linux, Windows. Built on the open-source Jan project (Apache-2.0). Branded Plugsky, pre-configured with plugsky-pro as the default model.

bash

curl -fsSL https://plugsky.com/install-desktop | bash
plugsky-desktop

Available for macOS (Apple Silicon + Intel), Linux (x64), Windows (x64).

Plugsky Web (Open WebUI-based)

Self-hosted AI chat UI in your browser. Built on the open-source Open WebUI project. Docker or pip install, both backends supported. The plugsky-fusion model is pre-listed.

bash

curl -fsSL https://plugsky.com/install-web | bash
plugsky-web start
# → opens http://localhost:8080

Latest releases (v0.1.0 — 2026-06-23)

Plugsky CLI v0.1.0. macOS arm64, macOS x64, Linux x64, Linux arm64, Windows x64, Windows arm64. ~52 MB. curl -fsSL https://plugsky.com/install | bash
Plugsky Desktop (Jan) v0.1.0. macOS arm64/x64 (universal), Linux x64 AppImage, Windows x64 installer. ~150 MB. curl -fsSL https://plugsky.com/install-desktop | bash
Plugsky Web (Open WebUI) v0.1.0. Docker image (1.5 GB) or pip install. curl -fsSL https://plugsky.com/install-web | bash

All three are open-source under MIT or Apache-2.0, following their upstream projects. See NOTICE for full upstream attribution.

Operations

Rate limits & quotas

All plans include unlimited usage within fair-use rate limits — no per-token charges, no per-request charges, no overage fees. The only limit is the per-minute request rate (RPM) for your tier. Increase limits from the Dashboard or by emailing support@plugsky.com.

Plan	Monthly fee	Fair-use RPM	Concurrent	API keys	Seats
Trial	$5 / 7 days	60	5	1	1
Starter	$20 / mo	60	10	5	1
Builder	$60 / mo	300	50	20	5
Scale	$120 / mo	1,000	200	100	25
Enterprise	Annual contract	Custom (10K+)	Custom	Unlimited	Unlimited

Same flat rate on every model — no separate pricing for plugsky-frontier vs plugsky-micro. All 31 models are included on every paid plan. Hit a 429? The response includes a Retry-After header. The SDKs retry with exponential backoff automatically.
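Staying under the fair-use RPM on the client side avoids most 429s in the first place. The sliding-window limiter below is one possible sketch; RpmLimiter is a hypothetical class, and the injectable clock exists only to make it testable.

```python
import time
from collections import deque


class RpmLimiter:
    """Track request timestamps and report how long to wait to stay under an RPM cap."""

    def __init__(self, rpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock
        self.sent = deque()  # timestamps of requests in the last 60 seconds

    def wait_time(self) -> float:
        """Seconds to wait before the next request may be sent (0.0 if none)."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()  # drop requests that left the window
        if len(self.sent) < self.rpm:
            return 0.0
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self.sent.append(self.clock())
```

A caller would sleep for limiter.wait_time() before each request and call limiter.record() right after sending it, for example with RpmLimiter(60) on the Starter plan.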


Retries & idempotency

All POST endpoints accept an Idempotency-Key header. Re-sending the same key returns the cached result for 24 hours. This makes your POSTs safe to retry without double-billing or double-creating resources.

bash

curl -X POST https://api.plugsky.com/v1/chat/completions \
  -H "Authorization: Bearer $PLUGSKY_API_KEY" \
  -H "Idempotency-Key: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d '{"model":"plugsky-pro","messages":[{"role":"user","content":"hello"}]}'
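The same idea in Python: generate the key once per logical request and reuse it on every retry, so a repeated POST hits the cached result. This is a sketch with an injected transport; post_with_retry and the send signature are hypothetical, and Retry-After is assumed to carry a number of seconds.

```python
import time
import uuid

RETRYABLE = {429, 500, 503}  # status codes this sketch treats as retryable


def post_with_retry(send, body, max_attempts: int = 4, sleep=time.sleep):
    """Retry a POST with one Idempotency-Key shared by all attempts.

    send(body, headers) must return (status, response_headers, payload).
    """
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # one key per logical request
    status, payload = None, None
    for attempt in range(max_attempts):
        status, resp_headers, payload = send(body, headers)
        if status not in RETRYABLE:
            return status, payload
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last failure
        retry_after = resp_headers.get("Retry-After")
        # Prefer the server's hint; otherwise back off exponentially from 0.5 s.
        delay = float(retry_after) if retry_after is not None else 0.5 * (2 ** attempt)
        sleep(delay)
    return status, payload
```

Generating a fresh key inside the loop would defeat the purpose, since each retry would then count as a new request.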

Errors & status codes

Code	Meaning	What to do
`400`	Bad request — malformed JSON, invalid param	Validate locally before sending
`401`	Invalid or missing API key	Check `Authorization` header
`403`	Key lacks required scope	Check key role in Dashboard
`404`	Model or resource not found	List `/v1/models` to see what's available
`409`	Conflict (duplicate idempotency key with different body)	Generate a fresh key per logical request
`429`	Rate limit hit	Honor `Retry-After`
`500`	Internal error	Retry with backoff. Open a ticket if persistent.
`503`	Upstream provider down	Smart-routed requests fail over automatically

Webhooks

Subscribe to 9 event types: batch.completed, fine_tuning.completed, invoice.paid, key.rotated, quota.warning, quota.exceeded, model.deprecated, usage.threshold, audit.alert. HMAC-SHA256 signed. Configure in Dashboard → Webhooks.
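Verifying the HMAC-SHA256 signature before trusting a webhook payload could look like the sketch below. It assumes the signature is the hex digest of the raw request body under your webhook secret; the exact header name and encoding are not given on this page, so check the Dashboard before relying on this.

```python
import hashlib
import hmac


def verify_webhook(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex matches HMAC-SHA256(secret, payload)."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected, signature_hex)
```

Always verify against the raw bytes as received; re-serializing parsed JSON can change whitespace and break the signature.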

Logs & observability

Every request is logged with: timestamp, model, tokens, latency, status, key ID, project ID, region, request ID, optional user tag. Export to Datadog, Splunk, Grafana, New Relic, OpenTelemetry, or your SIEM.

Status & SLAs

Live status: /status. Public incident history. Uptime SLAs:

Builder / Scale: 99.9% monthly uptime, 10% credit on miss
Enterprise: 99.95% monthly uptime, 25% credit, 99.99% on multi-region deployments
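To make the percentages concrete, an uptime target converts into a downtime budget. The helper below is plain arithmetic under the assumption of a 30-day month; how Plugsky actually measures a month is not stated here.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in a period at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)
```

Over 30 days, 99.9% allows about 43.2 minutes of downtime, 99.95% about 21.6 minutes, and 99.99% about 4.3 minutes.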

Deployment topologies

Same API, six deployment models. Pick one, or combine them across teams.

Plugsky Cloud (DEFAULT). Multi-tenant SaaS. me-central-1 (UAE) primary, eu-west-1, us-east-1, ap-southeast-1. Fastest to start.
VPC deployment (ENTERPRISE). Plugsky runs inside your AWS / GCP / Azure / OCI VPC. Your network, your peering, your KMS, your keys.
On-premises (REGULATED). Helm chart or air-gap installer for your data center. GPU pool: H100, H200, MI300X, or CPU-only.
Air-gapped (SOVEREIGN). No internet at all. Bundle ships on physical media. Monthly model refresh by courier.
Bring-your-own-cloud (FINANCIAL). Plugsky control plane runs in our cloud; inference runs in your cloud account. You pay your hyperscaler directly.
White-label (SAAS). Your brand on the dashboard, your domain, your colors. Resell AI under your own SKU.

Decision matrix

You need…	Use
Ship in 1 day, no compliance	Plugsky Cloud
Data stays in me-central-1	Plugsky Cloud (region pinned)
No data leaves your AWS / Azure / GCP account	VPC deployment
SAMA / CBUAE / NSD audit trail	VPC deployment + customer-managed keys
Air-gap, no internet	On-prem or air-gapped
Resell AI under your brand	White-label

Security & compliance

Security model

Encryption in transit: TLS 1.3 only, HSTS, modern ciphers
Encryption at rest: AES-256-GCM, customer-managed keys available
Network isolation: per-tenant VPC, security groups, no shared kernel
Tenant isolation: logical (RBAC + scoped keys) or physical (your own cluster)
Secret hygiene: keys never logged, never returned in responses, hashed at rest with Argon2id
Pen tests: quarterly by HackerOne + an external firm. Reports under NDA.
Bug bounty: up to $25,000. security@plugsky.com

Data residency

Choose per-request or pin globally. Regions: me-central-1 (UAE — default GCC), sa-central-1 (Riyadh — Enterprise), eu-west-1 (Dublin), eu-central-1 (Frankfurt), us-east-1, us-west-2, ap-southeast-1 (Singapore). Data never leaves the pinned region. Deep dive →

Compliance & certifications

SOC 2 Type II — annually audited, report under NDA
ISO 27001 — InfoSec management
ISO 27701 — Privacy management
ISO 27017 / 27018 — Cloud & PII
GDPR — EU data protection
HIPAA — Healthcare (Enterprise + BAA)
PCI DSS — Card data safety (no inference on card data unless on-prem)
FedRAMP Moderate — In process, available on Enterprise
UAE PDPL — Federal Data Protection Law
DIFC DPL — Dubai International Financial Centre
SAMA CSF — Saudi Central Bank cyber framework
NSD — National Security Directive alignment (Enterprise on-prem)

DPA & legal

Standard Contractual Clauses (SCCs) baked into the DPA. Sub-processor list published and updated within 30 days of any change. Read the DPA →

Audit logs

Every key action — creation, rotation, scope change, deletion — is logged with actor, timestamp, IP, and request body hash. Exportable to your SIEM (Splunk, Sentinel, QRadar, Chronicle) via webhook or Kinesis/Firehose.

BYOK / HSM

Bring Your Own Key. Plugsky never sees your key — you import it into our HSM integration (AWS KMS, GCP KMS, Azure Key Vault, HashiCorp Vault, Thales Luna, AWS CloudHSM). Key rotation, revocation, and audit all yours.

PII handling

Three modes: no-PII (strict filter, PII auto-redacted), detect-only (PII tagged but not modified), passthrough (your responsibility). Default is detect-only for inference, no-PII for embeddings. Run the residency checklist →

Billing

Plans

Flat monthly fee per workspace. All 31 models included on every paid plan — no per-token charges, no per-request charges, no overage fees, no surprise bills. Cancel or downgrade anytime.

Plan	Monthly fee	What's included	Best for
Trial	$5 / 7 days	All 31 models, 60 RPM, 1 seat, 1 API key, no card	First call, evaluation
Starter	$20 / mo	All models up to plugsky-pro, 60 RPM, 1 seat, 5 keys	Solo devs, side projects
Builder	$60 / mo	All models up to plugsky-max + vision, 300 RPM, 5 seats, 20 keys	Production teams
Scale	$120 / mo	All 31 models including frontier, 1,000 RPM, 25 seats, 100 keys, SSO	High-volume SaaS
Enterprise	Annual contract	Unlimited usage, 10K+ RPM, unlimited seats, on-prem, BYOK, 99.99% SLA, DPA, BAA, dedicated engineer	Banks, gov, regulated

Annual billing saves 20%. Enterprise plans are annual contracts priced to your deployment, security, and volume requirements — enterprise@plugsky.com.

Usage & metering

Unlimited usage on every plan. No per-token charges, no per-request charges, no overage fees. The only limit is the fair-use RPM for your tier (60, 300, 1,000, or 10K+ on Enterprise). Token counts are still returned in every response (usage.prompt_tokens, usage.completion_tokens) for observability — but they don't drive billing.

Invoices & taxes

Monthly billing on the 1st. PDF invoices emailed automatically. VAT-compliant for UAE (5%), KSA (15%), EU (reverse charge), and US (no sales tax on SaaS in most states). Wire transfer, ACH, SEPA, and major credit cards. Annual contracts: pay upfront, save 15%.

Quotas & limits

Hard $ caps at the project level prevent runaway spend. Soft warning alerts at 50%, 80%, 95%. Hard block at 100% (auto-reject 402). You can set overage_behavior=allow with a finance-approved key to allow overage up to 3× the cap with auto-billing.
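The alert thresholds above map naturally onto a small classifier. The sketch below mirrors the 50%, 80%, 95%, and 100% levels; quota_state and its return labels are hypothetical, and it ignores the overage_behavior=allow case.

```python
def quota_state(spend: float, cap: float) -> str:
    """Classify project spend against its cap using the thresholds above."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    if spend >= cap:
        return "blocked"  # at 100% the API rejects requests with 402
    for pct in (95, 80, 50):
        # Compare without dividing to avoid floating-point edge cases.
        if spend * 100 >= pct * cap:
            return f"warning-{pct}"
    return "ok"
```

A monitoring job could call this with figures from GET /v1/plugsky/usage and page someone on the first warning-80 result.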

Migration guides

From OpenAI

1. Generate a Plugsky key in the Dashboard
2. In your code, change base_url to https://api.plugsky.com/v1
3. Optionally map gpt-4o → plugsky-pro, gpt-4o-mini → plugsky-lite for cost savings
4. Run your existing evals; they should pass unchanged
5. Switch DNS / cut over when ready
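The optional model mapping can live in one place so call sites stay untouched. A minimal sketch, using only the two mappings named above; MODEL_MAP and to_plugsky_model are hypothetical names.

```python
# Mappings suggested in the migration steps; extend as needed.
MODEL_MAP = {
    "gpt-4o": "plugsky-pro",
    "gpt-4o-mini": "plugsky-lite",
}


def to_plugsky_model(model: str) -> str:
    """Translate an OpenAI model name, passing unknown names through unchanged."""
    return MODEL_MAP.get(model, model)
```

Passing unmapped names through unchanged means a request for an unsupported model fails visibly with 404 instead of being silently rerouted.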

Need a per-language migration walkthrough? Full guide → or generate code for your stack →

From Anthropic

Plugsky exposes the Messages API at /v1/messages with full Anthropic compatibility. Just change base_url and your Claude SDK code works as-is. Map claude-3-5-sonnet → plugsky-pro for 40%+ savings, same quality tier.

From Azure OpenAI

Point your Azure SDK at https://api.plugsky.com/v1 (Azure SDK supports custom endpoints). Models keep their Azure names with the azure/ prefix. Existing content filters and Azure-specific features have Plugsky equivalents — see the full compatibility matrix →

From AWS Bedrock

Use the Bedrock SDK's endpoint_url parameter. The Converse API is supported at /v1/bedrock/converse for drop-in compatibility.

Reference

Glossary

Term	Definition
Token	The atomic unit of text a model reads and writes, reported in each response's usage fields. ~4 chars in English. ~1.5 chars in Arabic.
Context window	Max tokens a model can see in a single call (input + output).
Embedding	A fixed-length vector representation of text. Used for semantic search.
RAG	Retrieval-Augmented Generation: retrieve relevant docs, stuff into prompt, generate.
Function calling	Model returns a structured tool call instead of free text. You execute, return result.
Fine-tuning	Continue training a base model on your data. SFT (supervised) or DPO (preference).
Distillation	Train a small model to mimic a large one. Cheaper, faster, similar quality.
Agent	Model + tools + memory + planning loop. Autonomous multi-step task execution.
Vector store	Indexed embeddings for fast similarity search. Plugsky includes one out of the box.
BYOK	Bring Your Own Key. You control the encryption keys. We can't read your data.
Sovereign	Hosted entirely inside one jurisdiction, with no foreign access. PDPL-compliant.
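The characters-per-token ratios in the Token entry give a quick way to estimate prompt size before sending. This is a rough sketch based only on those two ratios; real tokenizers vary by model, and estimate_tokens is a hypothetical helper.

```python
import math

# Approximate characters per token, from the glossary above.
CHARS_PER_TOKEN = {"en": 4.0, "ar": 1.5}


def estimate_tokens(text: str, lang: str = "en") -> int:
    """Rough token estimate from character count; not a substitute for real usage data."""
    return math.ceil(len(text) / CHARS_PER_TOKEN[lang])
```

Compare the estimate against the model's context window, and treat usage.prompt_tokens in the response as the authoritative count.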

Changelog

/changelog — full release history. Subscribe to RSS or the model.deprecated webhook.

Support

Email: support@plugsky.com (24/5 on Builder+, 24/7 on Scale+)
Slack Connect: for Enterprise customers
Office hours: weekly community call, Thu 4pm GST — join →
Status: /status
Security issues: security@plugsky.com (PGP key in /.well-known/security.txt)