Runtime Layer Extraction · Bring Your Own Model

Govern a model you host anywhere

The external_model workflow node wraps the full Quantlix runtime control plane — input policy, HTTP call to your endpoint, output validation, and one auditable trace — around a model Quantlix never hosts. Your model stays where it is; governance runs through Quantlix.

When to use this

Use Runtime Layer Extraction when you need Quantlix policy, tracing, and output validation around a model that must stay on your infrastructure — without self-hosting the entire Quantlix stack or routing inference through a provider-backed deployment.

model node

Calls a Quantlix provider-backed deployment via /run. Input policy only; inference goes to the bound provider model.

external_model node

Calls your HTTP endpoint with a stored credential. Input and output policy on one trace. Additive — does not change the gateway or /run.

This is distinct from self-hosted setup, where you run the API, portal, orchestrator, and cluster yourself. Runtime Layer Extraction keeps Quantlix managed while your model endpoint stays in your environment.

Request flow

Request

Your app or CI pipeline sends a workflow run.

Input policy

Quantlix enforces rules before the model is called.

Your model

HTTP call to your hosted endpoint with a stored credential.

Output policy

Egress pass on the model response before it leaves the node.

Response

Logged on one trace with both enforcement events.

How it works

One node. Two enforcement passes. One trace. Quantlix walks every input rule, calls your hosted endpoint, walks every output rule on the response, and records the whole loop.

Input enforcement

PII detection, prompt-injection defenses, and schema and budget rules run at the boundary, before your model is ever called. Blocked requests stop here and are logged.

Your model, your environment

Quantlix calls your endpoint over HTTP with a stored, encrypted credential. The model runs where you run it; the URL and secret reference live in Quantlix, not in your pipeline.

Output validation

The egress pass enforces output rules and redaction on the model response before it leaves the node — closing the part of the timeline input-only governance misses.

input ──▶ external_model ──▶ output
              │
              ├─ 1. input policy   → enforcement event (run_id, trace_id)
              ├─ 2. apply redacted input
              ├─ 3. HTTP call → your hosted model
              ├─ 4. normalize response (response_path)
              ├─ 5. output policy  → enforcement event (run_id, trace_id)
              └─ 6. emit model_output, input_decision, output_decision

Egress is bundled into the node and on by default. It cannot be turned off silently: disabling it surfaces a warning. Blocked input stops with a logged event. Endpoint failures retry per policy and then fail closed, so with egress enabled, unscreened output is never forwarded.
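The six steps in the diagram can be sketched in plain Python. This is only an illustration of the control flow, not Quantlix's implementation: the `Decision` type, the `run_external_model` function, the callable parameters, and the `max_attempts` retry count are all hypothetical names introduced here, and the HTTP call is passed in as a function so the sketch runs without a network.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Decision:
    """Hypothetical result of one enforcement pass."""
    allowed: bool
    text: str      # the text after any redaction
    surface: str   # "input" or "output"


def run_external_model(
    prompt: str,
    input_policy: Callable[[str], Decision],
    call_endpoint: Callable[[str], Any],
    normalize: Callable[[Any], str],
    output_policy: Callable[[str], Decision],
    max_attempts: int = 2,  # assumed retry budget, not a documented default
) -> dict:
    # 1. Input policy runs before the model is ever called.
    input_decision = input_policy(prompt)
    if not input_decision.allowed:
        return {"model_output": None, "input_decision": input_decision,
                "output_decision": None}

    # 2. Only the (possibly redacted) input is sent onward.
    safe_prompt = input_decision.text

    # 3. Call the hosted model, retrying on failure.
    raw: Any = None
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            raw = call_endpoint(safe_prompt)
            last_error = None
            break
        except Exception as exc:  # broad on purpose: any failure counts
            last_error = exc
    if last_error is not None:
        # Fail closed: nothing is forwarded when the endpoint keeps failing.
        return {"model_output": None, "input_decision": input_decision,
                "output_decision": None, "error": str(last_error)}

    # 4. Normalize the provider-specific response to plain text.
    text = normalize(raw)

    # 5. Output policy (egress pass) on the model response.
    output_decision = output_policy(text)

    # 6. Emit the output together with both decisions.
    model_output = output_decision.text if output_decision.allowed else None
    return {"model_output": model_output, "input_decision": input_decision,
            "output_decision": output_decision}
```

Passing the policies and the endpoint call in as functions keeps the ordering visible: the model is unreachable unless the input pass allows the request, and the response is unreachable unless the output pass allows it.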

Configuration

Set these on the node's config block. In the portal builder, the External model configuration panel writes them (Workflows → edit → add an External model (bring-your-own) step).

Field             Notes
endpoint          Required. Your hosted model URL.
method            HTTP method. Default POST.
credential_ref    Id of an existing stored credential (decrypted server-side). Never a raw secret.
auth_header       Header for the credential. Default Authorization.
auth_scheme       Scheme prefix. Default Bearer. Set to "" for bare tokens (e.g. x-api-key).
request_template  Body sent to your model. "{messages}" injects the full array; {prompt} resolves to prompt text.
response_path     Required. Dotted path to assistant text (e.g. choices.0.message.content).
prompt_field      Payload key holding the prompt text.
timeout_ms        Per-request timeout. Default 30000.
egress            Output policy pass. Default true; disabling skips output validation and surfaces a warning.
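As a rough illustration of how request_template and response_path might behave, the sketch below builds a sample config and applies both to an OpenAI-style payload. Several things here are assumptions rather than documented behavior: the endpoint URL and credential id are placeholders, the template is assumed to be a JSON object, and the helper names `render_template` and `resolve_path` are hypothetical.

```python
from typing import Any

# Sample node config. Field names come from the table above;
# every value is a placeholder chosen for illustration.
node_config = {
    "endpoint": "https://models.example.internal/v1/chat/completions",
    "method": "POST",
    "credential_ref": "cred_123",   # id of a stored credential, never a raw secret
    "auth_header": "Authorization",
    "auth_scheme": "Bearer",
    "request_template": {"model": "my-model", "messages": "{messages}"},
    "response_path": "choices.0.message.content",
    "timeout_ms": 30000,
    "egress": True,
}


def render_template(template: Any, messages: list, prompt: str) -> Any:
    """Fill a template: "{messages}" becomes the array, {prompt} the prompt text."""
    if isinstance(template, dict):
        return {k: render_template(v, messages, prompt) for k, v in template.items()}
    if isinstance(template, list):
        return [render_template(v, messages, prompt) for v in template]
    if template == "{messages}":
        return messages  # the whole string is replaced by the full array
    if isinstance(template, str):
        return template.replace("{prompt}", prompt)
    return template


def resolve_path(payload: Any, dotted: str) -> Any:
    """Walk a dotted path; numeric segments index into lists."""
    current = payload
    for part in dotted.split("."):
        if isinstance(current, list):
            current = current[int(part)]
        else:
            current = current[part]
    return current
```

With this config, a response shaped like {"choices": [{"message": {"content": "..."}}]} resolves to the assistant text, which is the value the output policy pass would then screen.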

credential_ref holds the id of a credential already stored in Quantlix — for example from Providers. The portal never accepts or persists a raw API key in the workflow config; secrets stay encrypted at rest and decrypt server-side only at call time.
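The interaction between auth_header and auth_scheme can be sketched as follows. This is an assumption about how the header is assembled, using the conventional "scheme, space, token" layout. The function name `build_auth_headers` is hypothetical, and the secret is passed in directly only because the sketch stands in for the server-side step that decrypts the stored credential.

```python
def build_auth_headers(secret: str,
                       auth_header: str = "Authorization",
                       auth_scheme: str = "Bearer") -> dict:
    """Return the header carrying the credential.

    An empty auth_scheme yields a bare token, as used by x-api-key style APIs.
    """
    value = f"{auth_scheme} {secret}" if auth_scheme else secret
    return {auth_header: value}
```

For example, `build_auth_headers("abc")` gives an Authorization header with a Bearer prefix, while `build_auth_headers("abc", auth_header="x-api-key", auth_scheme="")` sends the token on its own.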

Setup

  1. Create or select a deployment the workflow will bind to for policy context.
  2. Ensure contextual policy applies to the output surface (contextual_policy.apply_to includes "output") so egress enforcement events are recorded. Many packs default to input-only; apply an enforcement pack that covers output, or configure apply_to explicitly.
  3. Store credentials in Quantlix (for example via Providers) and note the credential id for credential_ref.
  4. Open Dashboard → Workflows, add an input → external_model → output graph, and configure the External model panel (endpoint, request_template, response_path, credential_ref).
  5. Activate the workflow and run a test request. The trace should show two enforcement events (input + egress) on one run_id when output policy is configured.

You can also provision the graph via API or scripts/create_external_model_workflow.py for repeatable setup from CI. See Workflows for node semantics and execution APIs.
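The check in step 5 can be automated in a CI script once you have the trace events for a run. The sketch below assumes each event is a dict with `run_id` and `surface` keys and that the egress event is recorded with the surface "output". That shape is hypothetical, so adapt the key names to whatever your trace export actually contains.

```python
from typing import Iterable, Mapping


def has_both_enforcement_events(events: Iterable[Mapping], run_id: str) -> bool:
    """True when one run_id carries both an input and an output enforcement event."""
    surfaces = {e["surface"] for e in events if e["run_id"] == run_id}
    return {"input", "output"} <= surfaces
```

A run that shows only the input event usually means the applied policy pack does not cover the output surface (see step 2).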

Evidence and audit trail

The external_model node is designed to produce a complete per-request timeline for models Quantlix does not host. The timeline lines up with these EU AI Act articles:

  • Art. 9 — boundary enforcement before the model runs; policy firings are recorded as evidence the control was active.
  • Art. 12 — input, policy decisions, and output on one run_id, not separate fragments.
  • Art. 15 — output validation on the model response, logged alongside the input that produced it.

Quantlix provides runtime enforcement and exportable evidence trails — not a determination that your organization is compliant. Compliance depends on factors beyond any single control. See EU AI Act readiness.
