Quantlix Documentation

Quantlix is the AI Runtime Control Plane for production AI systems. Deploy, control, observe, and evaluate AI workflows from one runtime layer.

Start here

The shortest path to understanding the AI Runtime Control Plane and getting a request through it.

What is Quantlix?

Plain-language overview of the AI Runtime Control Plane, the four pillars, and how Quantlix differs from gateways, evals, and observability tools.

AI Runtime Control Plane

Concept page: the runtime layer in the request path versus the control plane for configuration, and how the two surfaces work together.

Quickstart

Get a real request flowing through the runtime in a few minutes — without an ML background.

Architecture

Request lifecycle, managed vs self-hosted deployments, data flow, and what Quantlix stores.

Runtime protection (Boundary)

How policy runs before the model on /run and /v1/* gateway routes, contextual packs, and where to set org, project, and deployment defaults.

Boundary pricing

Starter / Team / Enterprise limits, protected-flow definition, and what is enforced in the API vs contact-led packaging.

Public sandbox quick test

Run Boundary against curated prompts in a browser-first sandbox before signup. Fastest way to see allow/redact/block outcomes.

Product & integrations

Deployments

Turn workflows, models, retrieval, and tools into production-grade deployments. Manage revisions, environments, and rollouts.

Templates

Start from concrete outcomes—safer chat, document Q&A, approvals—not blank graphs.

Workflows

Workflow + MCP guide: input normalization, redaction chains, model/agent prompt fields, functions, retries, approvals, and routing.

Workflow setup examples

Concrete graph recipes for ticket analysis, CRM summaries, native agents, functions, knowledge lookup, approvals, policy gates, and model fallback.

Deploy multi-model workflows

Step-by-step guide for multi-model workflows with model roles, native agents, functions, provider-backed deployments, tools, routing, policies, and evals.

Runtime policies

Schema enforcement, PII/payment blocking, budget gates, allow/block/redact decisions, packs, CLI, and API.

Budget control

Cap AI cost and usage at runtime with request ceilings, compute limits, retry controls, and agent/tool safeguards.

Boundary enforcement

Practical setup guide for schema validation, guardrails, budgets, and audit evidence at the request path.

Governed chat widget

Script embed, React/Vue/Svelte npm packages (@quantlix-ai/chat-*), publishable keys, and CORS for site widgets.

Knowledge & docs

Answers from your own documents—chunking, embeddings, and sources (document lookup; also called RAG).

Deploy a RAG model

Step-by-step guide: connect providers, create a knowledge base, ingest docs, test retrieval, build a workflow, and audit results.

Retrieval integrations

Vector backends (pgvector, Pinecone, Weaviate, Qdrant) and semantic search APIs.

Evals — quality gates

Quality gates before changes reach production. Suites, regressions, scoring dimensions, and how evals tie into deployments and traces.

Quality checks workspace

Golden datasets and side-by-side comparisons inside the dashboard.

Observability

Trace every request, model call, tool call, policy decision, and eval result in one timeline.

Provider integrations

Anthropic, OpenAI, Azure OpenAI, Bedrock, Groq, Together, Voyage AI, credentials, and provider-backed inference targets.

Pricing, usage, and limits

Plan concepts, usage drivers, managed vs self-hosted costs, budget gates, and audit export expectations.

CLI

quantlix deploy, run, status. Full command reference with options and examples.

API Reference

REST API documentation. OpenAPI spec. Deploy, run, status, usage.

Advanced runtime control & contracts

For teams tightening enforcement and schema contracts after the basics feel familiar.

Quick start (API)

Deploy with a contract, run inference, and see enforcement at the boundary.

1. Deploy with enforcement

Use dashboard onboarding or the CLI. Contract enforcement is enabled by default.

2. Run inference

Valid requests pass. Invalid requests are blocked at the boundary.

curl -X POST https://api.quantlix.ai/run \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"deployment_id": "YOUR_DEPLOYMENT_ID", "input": {"prompt": "Hello, world!"}}'
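For teams calling the API from application code, the same request can be sketched in Python with only the standard library. This is a minimal sketch rather than an official client: the function names build_run_request and run are hypothetical, the endpoint, headers, and body shape are copied from the curl example above, and treating a blocked request as a non-2xx HTTP status is an assumption, since the response format is not described here.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.quantlix.ai"  # host taken from the curl example


def build_run_request(api_key: str, deployment_id: str, prompt: str) -> urllib.request.Request:
    """Build the POST /run request from the curl example without sending it."""
    body = json.dumps({"deployment_id": deployment_id, "input": {"prompt": prompt}})
    return urllib.request.Request(
        f"{API_BASE}/run",
        data=body.encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run(api_key: str, deployment_id: str, prompt: str) -> tuple[int, str]:
    """Send the request and return (HTTP status, raw response body)."""
    req = build_run_request(api_key, deployment_id, prompt)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Assumption: a request blocked at the boundary surfaces as a
        # non-2xx status; the body is returned unparsed for inspection.
        return err.code, err.read().decode("utf-8")
```

Calling run("YOUR_API_KEY", "YOUR_DEPLOYMENT_ID", "Hello, world!") sends the request. Keeping the request builder separate from the network call makes it easy to inspect or unit-test the payload before anything leaves the machine.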

3. CLI

Install, authenticate, deploy, and run. Enforcement applies to every request.

pip install quantlix
quantlix login
quantlix deploy qx-example
quantlix run <deployment_id> -i '{"prompt": "Hello!"}'
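When the CLI is driven from a script or CI job, the run step can be wrapped in a small helper. The sketch below is an illustration under assumptions: build_run_command and run_deployment are hypothetical names, only the `quantlix run <deployment_id> -i` form shown above is used, and the CLI's output format and exit codes are not assumed, so both are returned to the caller untouched.

```python
import json
import shlex
import subprocess


def build_run_command(deployment_id: str, payload: dict) -> list[str]:
    """Return the argv for `quantlix run <deployment_id> -i '<json>'`."""
    # Passing argv as a list avoids shell quoting problems in the JSON input.
    return ["quantlix", "run", deployment_id, "-i", json.dumps(payload)]


def run_deployment(deployment_id: str, payload: dict) -> subprocess.CompletedProcess:
    """Invoke the CLI and hand back its exit code, stdout, and stderr."""
    return subprocess.run(
        build_run_command(deployment_id, payload),
        capture_output=True,
        text=True,
        check=False,  # let the caller decide how to treat a non-zero exit
    )


if __name__ == "__main__":
    cmd = build_run_command("YOUR_DEPLOYMENT_ID", {"prompt": "Hello!"})
    print(shlex.join(cmd))  # shows the equivalent shell command line
```

This requires the quantlix CLI to be installed and authenticated (the pip install and quantlix login steps above) before run_deployment is called.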

Markdown mirror for editors: docs/START_HERE.md in the repo (linked from /docs/start above).
