Examples
Basic deployment budget
Cap direct provider-backed model calls before unexpected traffic reaches the model.
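As a rough back-of-envelope check, the three budget ceilings in the policy for this example can be multiplied into a worst-case compute figure. The sketch below assumes the fields compose multiplicatively (requests per minute, times compute seconds per request, times the retry multiplier). That reading is an assumption, not documented platform behavior, and the helper name `worst_case_compute_seconds_per_minute` is hypothetical.

```python
# Assumption: the three ceilings multiply into an upper bound on compute
# admitted per minute. The platform may define or combine them differently.

def worst_case_compute_seconds_per_minute(budget: dict) -> float:
    """Upper bound on compute-seconds admitted per minute under the budget."""
    return (
        budget["request_rate_per_minute"]
        * budget["max_compute_per_request_seconds"]
        * budget["retry_cost_multiplier_ceiling"]
    )


# Values copied from the "budget" block of this example's policy.
budget = {
    "request_rate_per_minute": 60,
    "max_compute_per_request_seconds": 120,
    "retry_cost_multiplier_ceiling": 2.0,
}

ceiling = worst_case_compute_seconds_per_minute(budget)
print(ceiling)  # 14400.0
```

Under that assumption, the deployment can admit at most 14,400 compute-seconds per minute. The policy that sets these ceilings follows.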
{
  "pipeline_lock": {
    "contract_version": "1.0",
    "mode": "enforce",
    "schema": {
      "strict": true,
      "input_schema": {
        "type": "object",
        "required": ["prompt"],
        "properties": { "prompt": { "type": "string" } },
        "additionalProperties": false
      }
    },
    "policies": {
      "actions": { "on_violation": "block", "emit_event": true },
      "budget": {
        "request_rate_per_minute": 60,
        "max_compute_per_request_seconds": 120,
        "retry_cost_multiplier_ceiling": 2.0
      }
    }
  }
}
Cost-sensitive pack
Use the preset when you want rate, compute, and retry ceilings without hand-writing the full policy.
curl -X POST https://api.quantlix.ai/deployments/DEPLOYMENT_ID/apply-pack \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"pack_id": "cost-sensitive"}'
Agent budget pattern
Keep native tool-calling agents bounded so a tool loop cannot surprise finance.
{
  "agent": {
    "deployment_id": "dep_reasoning_model",
    "prompt_field": "question",
    "max_iterations": 4,
    "tools": [
      {
        "name": "lookup_account",
        "description": "Fetch account context",
        "input_schema": {
          "type": "object",
          "properties": { "account_id": { "type": "string" } },
          "required": ["account_id"]
        },
        "function": {
          "execution_type": "http",
          "method": "POST",
          "endpoint": "https://crm.internal/account",
          "timeout_ms": 5000
        }
      }
    ]
  }
}
Workflow cost pattern
Route cheap classification before expensive retrieval, tools, or stronger models.
input
-> policy_check
-> router / condition
-> cheap classifier model
-> retrieval or tool_call only when needed
-> final model
-> output
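The flow above can be sketched in a few lines of Python to show where the saving comes from: the expensive retrieval step runs only when the cheap classifier asks for it. This is a minimal illustration, not the platform's workflow engine. Every function here (`policy_check`, `cheap_classifier`, `retrieve`, `final_model`, `run_workflow`) is a hypothetical stand-in, the keyword-based classifier is a placeholder for a small model, and the policy check assumes a payload with a single string `prompt` field.

```python
# Hypothetical stand-ins for the workflow nodes; none of these are platform APIs.
calls = {"retrieve": 0}  # counts how often the expensive step actually runs


def policy_check(payload: dict) -> str:
    """Reject anything that is not exactly one string field named 'prompt'."""
    if set(payload) != {"prompt"} or not isinstance(payload["prompt"], str):
        raise ValueError("policy violation: expected exactly one string field 'prompt'")
    return payload["prompt"]


def cheap_classifier(prompt: str) -> str:
    """Placeholder for a small, inexpensive model that labels the request."""
    return "needs_context" if "account" in prompt.lower() else "simple"


def retrieve(prompt: str) -> str:
    """Placeholder for expensive retrieval or a tool call."""
    calls["retrieve"] += 1
    return "account context"


def final_model(prompt: str, context: str) -> str:
    """Placeholder for the final (stronger) model call."""
    return f"answer(prompt={prompt!r}, context={context!r})"


def run_workflow(payload: dict) -> str:
    prompt = policy_check(payload)        # input -> policy_check
    label = cheap_classifier(prompt)      # router / condition
    # Retrieval or tool_call only when the classifier says it is needed.
    context = retrieve(prompt) if label == "needs_context" else ""
    return final_model(prompt, context)   # final model -> output
```

Calling `run_workflow({"prompt": "hello"})` skips retrieval entirely, while a prompt that mentions an account triggers it once. A payload with extra fields is rejected before any model is called.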