Quantlix Documentation

Quantlix is the AI Runtime Control Plane for production AI systems. Deploy, control, observe, and evaluate AI workflows from one runtime layer.

Start here

The shortest path to understanding the AI Runtime Control Plane and getting a request through it.

Product & integrations

Deployments

Turn workflows, models, retrieval, and tools into production-grade deployments. Manage revisions, environments, and rollouts.

Read more →

Templates

Start from concrete outcomes—safer chat, document Q&A, approvals—not blank graphs.

Read more →

Workflows

Workflow + MCP guide: input normalization, redaction chains, model/agent prompt fields, functions, retries, approvals, and routing.

Read more →

Workflow setup examples

Concrete graph recipes for ticket analysis, CRM summaries, native agents, functions, knowledge lookup, approvals, policy gates, and model fallback.

Read more →

Deploy multi-model workflows

Step-by-step guide for multi-model workflows with model roles, native agents, functions, provider-backed deployments, tools, routing, policies, and evals.

Read more →

Runtime policies

Schema enforcement, PII/payment blocking, budget gates, allow/block/redact decisions, packs, CLI, and API.

Read more →

Budget control

Cap AI cost and usage at runtime with request ceilings, compute limits, retry controls, and agent/tool safeguards.

Read more →

Boundary enforcement

Practical setup guide for schema validation, guardrails, budgets, and audit evidence at the request path.

Read more →

Governed chat widget

Script embed, React/Vue/Svelte npm packages (@quantlix-ai/chat-*), publishable keys, and CORS for site widgets.

Read more →

Knowledge & docs

Answers from your own documents—chunking, embeddings, and sources (document lookup; also called RAG).

Read more →

Deploy a RAG model

Step-by-step guide: connect providers, create a knowledge base, ingest docs, test retrieval, build a workflow, and audit results.

Read more →

Retrieval integrations

Vector backends (pgvector, Pinecone, Weaviate, Qdrant) and semantic search APIs.

Read more →

Evals — quality gates

Quality gates before changes reach production. Suites, regressions, scoring dimensions, and how evals tie into deployments and traces.

Read more →

Quality checks workspace

Golden datasets and side-by-side comparisons inside the dashboard.

Read more →

Observability

Trace every request, model call, tool call, policy decision, and eval result in one timeline.

Read more →

Provider integrations

Anthropic, OpenAI, Azure OpenAI, Bedrock, Groq, Together, Voyage AI, credentials, and provider-backed inference targets.

Read more →

Pricing, usage, and limits

Plan concepts, usage drivers, managed vs self-hosted costs, budget gates, and audit export expectations.

Read more →

CLI

Full command reference for quantlix deploy, quantlix run, and quantlix status, with options and examples.

Read more →

API Reference

REST API documentation. OpenAPI spec. Deploy, run, status, usage.

Open API docs →

Advanced runtime control & contracts

For teams tightening enforcement and schema contracts after the basics feel familiar.

Platform

Quick start (API)

Deploy with a contract, run inference, and see enforcement at the boundary.

1. Deploy with enforcement

Use the dashboard onboarding flow or the CLI. Contract enforcement is enabled by default.

2. Run inference

Valid requests pass. Invalid requests are blocked at the boundary.

curl -X POST https://api.quantlix.ai/run \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"deployment_id": "YOUR_DEPLOYMENT_ID", "input": {"prompt": "Hello, world!"}}'
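For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: it assumes the same host and /run path, the X-API-Key header, and the JSON body shape from the curl call. The function names build_run_request and run are hypothetical. The status code and body returned for a blocked request are not specified here, so the sketch simply returns whatever the server sends.

```python
import json
import urllib.error
import urllib.request

# Assumption: same host and /run path as the curl call; adjust if yours differs.
API_URL = "https://api.quantlix.ai/run"


def build_run_request(api_key, deployment_id, prompt):
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps(
        {"deployment_id": deployment_id, "input": {"prompt": prompt}}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run(api_key, deployment_id, prompt, timeout=30):
    """Send the request and return (status_code, response_text).

    Non-2xx responses (for example, a request blocked at the boundary)
    raise HTTPError in urllib, so they are caught and returned the same way.
    """
    request = build_run_request(api_key, deployment_id, prompt)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

Calling run("YOUR_API_KEY", "YOUR_DEPLOYMENT_ID", "Hello, world!") sends the request. Keeping build_run_request separate makes it possible to inspect the payload and headers without touching the network.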

3. CLI

Install, authenticate, deploy, and run. Enforcement applies to every request.

pip install quantlix
quantlix login
quantlix deploy qx-example
quantlix run <deployment_id> -i '{"prompt": "Hello!"}'
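Hand-quoting JSON for the -i flag gets fragile once the input contains quotes or spaces. One way around that is to build the command line programmatically. The sketch below assumes only the quantlix run <deployment_id> -i form shown above. The helper name quantlix_run_command and the deployment ID dep_123 are hypothetical. It builds the command string without executing it.

```python
import json
import shlex


def quantlix_run_command(deployment_id, payload):
    """Return a shell-safe `quantlix run` command line for the given input.

    The payload is serialized with json.dumps and quoted by shlex.join,
    so quotes and spaces inside the input survive the shell.
    """
    return shlex.join(
        ["quantlix", "run", deployment_id, "-i", json.dumps(payload)]
    )


print(quantlix_run_command("dep_123", {"prompt": "Hello!"}))
# prints: quantlix run dep_123 -i '{"prompt": "Hello!"}'
```

The resulting string can be pasted into a POSIX shell. To run it directly from Python, pass the argument list to subprocess.run instead, which avoids the shell entirely.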

Markdown mirror for editors: docs/START_HERE.md in the repo (linked from /docs/start above).