Quantlix Documentation
Quantlix is the AI Runtime Control Plane for production AI systems. Deploy, control, observe, and evaluate AI workflows from one runtime layer.
Start here
The shortest path to understanding the AI Runtime Control Plane and getting a request through it.
What is Quantlix?
Plain-language overview of the AI Runtime Control Plane, the four pillars, and how Quantlix differs from gateways, evals, and observability tools.
AI Runtime Control Plane
Concept page: the runtime layer in the request path versus the control plane for configuration, and how the two surfaces work together.
Quickstart
Get a real request flowing through the runtime in a few minutes, without an ML background.
Architecture
Request lifecycle, managed vs self-hosted deployments, data flow, and what Quantlix stores.
Runtime protection (Boundary)
How policy runs before the model on /run and /v1/* gateway routes, contextual packs, and where to set org, project, and deployment defaults.
Boundary pricing
Starter / Team / Enterprise limits, protected-flow definition, and what is enforced in the API vs contact-led packaging.
Public sandbox quick test
Run Boundary against curated prompts in a browser-first sandbox before signup. Fastest way to see allow/redact/block outcomes.
Product & integrations
Deployments
Turn workflows, models, retrieval, and tools into production-grade deployments. Manage revisions, environments, and rollouts.
Templates
Start from concrete outcomes (safer chat, document Q&A, approvals), not blank graphs.
Workflows
Workflow + MCP guide: input normalization, redaction chains, model/agent prompt fields, functions, retries, approvals, and routing.
Workflow setup examples
Concrete graph recipes for ticket analysis, CRM summaries, native agents, functions, knowledge lookup, approvals, policy gates, and model fallback.
Deploy multi-model workflows
Step-by-step guide for multi-model workflows with model roles, native agents, functions, provider-backed deployments, tools, routing, policies, and evals.
Runtime policies
Schema enforcement, PII/payment blocking, budget gates, allow/block/redact decisions, packs, CLI, and API.
Budget control
Cap AI cost and usage at runtime with request ceilings, compute limits, retry controls, and agent/tool safeguards.
Boundary enforcement
Practical setup guide for schema validation, guardrails, budgets, and audit evidence at the request path.
Governed chat widget
Script embed, React/Vue/Svelte npm packages (@quantlix-ai/chat-*), publishable keys, and CORS for site widgets.
Knowledge & docs
Answers from your own documents: chunking, embeddings, and sources (document lookup; also called RAG).
Deploy a RAG model
Step-by-step guide: connect providers, create a knowledge base, ingest docs, test retrieval, build a workflow, and audit results.
Retrieval integrations
Vector backends (pgvector, Pinecone, Weaviate, Qdrant) and semantic search APIs.
Evals: quality gates
Quality gates before changes reach production. Suites, regressions, scoring dimensions, and how evals tie into deployments and traces.
Quality checks workspace
Golden datasets and side-by-side comparisons inside the dashboard.
Observability
Trace every request, model call, tool call, policy decision, and eval result in one timeline.
Provider integrations
Anthropic, OpenAI, Azure OpenAI, Bedrock, Groq, Together, Voyage AI, credentials, and provider-backed inference targets.
Pricing, usage, and limits
Plan concepts, usage drivers, managed vs self-hosted costs, budget gates, and audit export expectations.
CLI
quantlix deploy, run, status. Full command reference with options and examples.
API Reference
REST API documentation. OpenAPI spec. Deploy, run, status, usage.
Advanced runtime control & contracts
For teams tightening enforcement and schema contracts after the basics feel familiar.
Contracts
Schema and feature contracts. Versioned, strict by default.
Runtime guarantees
What the runtime layer promises: retry limits, rate control, validation rules, compute ceilings, audit exports.
Troubleshooting
Fix common deployment, provider, MCP, model prompt, redaction, and portal dev issues.
Platform
Security & compliance
MFA, SSO/OIDC, signed audit exports, GDPR rights, data flow, ownership, hosting, and provider boundaries.
Glossary
Plain-language definitions for deployments, workflows, MCP, policies, contracts, traces, evals, and audit events.
Student privacy
Block or flag requests when student PII patterns are detected. Intended for education and privacy contexts.
Teams
Org → Team → Project → Deployment. Roles: team_admin, developer, viewer.
Knowledge source configuration
Ingestion: upload, S3, web. Chunking, embedding, vector index setup.
Quick start (API)
Deploy with a contract, run inference, and see enforcement at the boundary.
1. Deploy with enforcement
Use dashboard onboarding or the CLI. Contract enforcement is enabled by default.
2. Run inference
Valid requests pass. Invalid requests are blocked at the boundary.
curl -X POST https://api.quantlix.ai/run \
-H "X-API-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"deployment_id": "YOUR_DEPLOYMENT_ID", "input": {"prompt": "Hello, world!"}}'
Markdown mirror for editors: docs/START_HERE.md in the repo (linked from /docs/start above).
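For teams calling the API from code rather than a shell, the curl request in step 2 might be sketched in Python roughly as follows. This is a minimal illustration using only the standard library, not an official client: build_run_request is a hypothetical helper name, the URL assumes the /run route with a single slash, and the header names and payload shape are copied from the curl example. The response format for allowed or blocked requests is not described on this page, so the sketch only builds the request and leaves sending and error handling to the caller.

```python
import json
import urllib.request

# Assumption: endpoint taken from the curl example, using the /run route.
API_URL = "https://api.quantlix.ai/run"


def build_run_request(api_key: str, deployment_id: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example.

    Hypothetical helper for illustration; not part of any Quantlix SDK.
    """
    payload = {"deployment_id": deployment_id, "input": {"prompt": prompt}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_run_request("YOUR_API_KEY", "YOUR_DEPLOYMENT_ID", "Hello, world!")
    # Inspect the request locally before sending it anywhere.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send: urllib.request.urlopen(req). A request rejected by the server
    # would raise urllib.error.HTTPError; the status code and body that
    # Quantlix returns for blocked requests are not documented on this page.
```

Building the request separately from sending it makes it easy to log or unit-test the exact payload a deployment will receive before any traffic reaches the boundary.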