Runtime Layer Extraction · Bring Your Own Model
Govern a model you host anywhere
The external_model workflow node wraps the full Quantlix runtime control plane — input policy, HTTP call to your endpoint, output validation, and one auditable trace — around a model Quantlix never hosts. Your model stays where it is; governance runs through Quantlix.
When to use this
Use Runtime Layer Extraction when you need Quantlix policy, tracing, and output validation around a model that must stay on your infrastructure — without self-hosting the entire Quantlix stack or routing inference through a provider-backed deployment.
model node
Calls a Quantlix provider-backed deployment via /run. Input policy only; inference goes to the bound provider model.
external_model node
Calls your HTTP endpoint with a stored credential. Input and output policy on one trace. Additive — does not change the gateway or /run.
This is distinct from self-hosted setup, where you run the API, portal, orchestrator, and cluster yourself. Runtime Layer Extraction keeps Quantlix managed while your model endpoint stays in your environment.
Request flow
Request
Your app or CI pipeline sends a workflow run.
Input policy
Quantlix enforces rules before the model is called.
Your model
HTTP call to your hosted endpoint with a stored credential.
Output policy
Egress pass on the model response before it leaves the node.
Response
Logged on one trace with both enforcement events.
How it works
One node. Two enforcement passes. One trace. Quantlix walks every input rule, calls your hosted endpoint, walks every output rule on the response, and records the whole loop.
Input enforcement
PII detection, prompt-injection defenses, schema and budget rules run at the boundary — before your model is ever called. Blocked requests stop here, logged.
Your model, your environment
Quantlix calls your endpoint over HTTP with a stored, encrypted credential. The model runs where you run it; the URL and secret reference live in Quantlix, not in your pipeline.
Output validation
The egress pass enforces output rules and redaction on the model response before it leaves the node — closing the part of the timeline input-only governance misses.
input ──▶ external_model ──▶ output
│
├─ 1. input policy → enforcement event (run_id, trace_id)
├─ 2. apply redacted input
├─ 3. HTTP call → your hosted model
├─ 4. normalize response (response_path)
├─ 5. output policy → enforcement event (run_id, trace_id)
└─ 6. emit model_output, input_decision, output_decisionEgress is bundled into the node, on by default. The node refuses to disable it silently. Blocked input stops with a logged event; endpoint failures retry per policy and then fail closed — unscreened output is never forwarded.
Configuration
Set these on the node's config block. In the portal builder, the External model configuration panel writes them (Workflows → edit → add an External model (bring-your-own) step).
| Field | Notes |
|---|---|
| endpoint | Required. Your hosted model URL. |
| method | HTTP method. Default POST. |
| credential_ref | Id of an existing stored credential (decrypted server-side). Never a raw secret. |
| auth_header | Header for the credential. Default Authorization. |
| auth_scheme | Scheme prefix. Default Bearer. Set to "" for bare tokens (e.g. x-api-key). |
| request_template | Body sent to your model. "{messages}" injects the full array; {prompt} resolves to prompt text. |
| response_path | Required. Dotted path to assistant text (e.g. choices.0.message.content). |
| prompt_field | Payload key holding the prompt text. |
| timeout_ms | Per-request timeout. Default 30000. |
| egress | Output policy pass. Default true; disabling skips output validation and surfaces a warning. |
credential_ref holds the id of a credential already stored in Quantlix — for example from Providers. The portal never accepts or persists a raw API key in the workflow config; secrets stay encrypted at rest and decrypt server-side only at call time.
Setup
- Create or select a deployment the workflow will bind to for policy context.
- Ensure contextual policy applies to the output surface (contextual_policy.apply_to includes "output") so egress enforcement events are recorded. Many packs default to input-only; apply an enforcement pack that covers output, or configure apply_to explicitly.
- Store credentials in Quantlix (for example via Providers) and note the credential id for credential_ref.
- Open Dashboard → Workflows, add an input → external_model → output graph, and configure the External model panel (endpoint, request_template, response_path, credential_ref).
- Activate the workflow and run a test request. The trace should show two enforcement events (input + egress) on one run_id when output policy is configured.
You can also provision the graph via API or scripts/create_external_model_workflow.py for repeatable setup from CI. See Workflows for node semantics and execution APIs.
Evidence and audit trail
The external_model node is designed to produce a complete per-request timeline for models Quantlix does not host:
- Art. 9 — boundary enforcement before the model runs; policy firings are recorded as evidence the control was active.
- Art. 12 — input, policy decisions, and output on one
run_id, not separate fragments. - Art. 15 — output validation on the model response, logged alongside the input that produced it.
Quantlix provides runtime enforcement and exportable evidence trails — not a determination that your organization is compliant. Compliance depends on factors beyond any single control. See EU AI Act readiness.
Next steps
- Workflows & MCP integration — node types, graph wiring, and portal setup.
- Provider integrations — store credentials used by
credential_ref. - Self-hosted setup — when you need the full Quantlix stack in your environment instead.
- Observability — read traces and enforcement events for workflow runs.