Multi-model workflows

Deploy multi-model workflows with Quantlix

This guide shows how to deploy an agent-style workflow in which different steps can use different models, providers, tools, policies, budgets, and approval gates.

1. Define the workflow job

Start with the outcome, not the models: support triage, document Q&A, escalation, classification, tool execution, or multi-step analysis.

2. Connect model providers

Add the providers you need, such as Anthropic for reasoning, OpenAI or Azure OpenAI for general chat, Voyage AI for embeddings, Bedrock for AWS-governed access, or Groq/Together where supported.

3. Create one deployment per model role

Create separate provider-backed deployments for each role, for example planner, classifier, summarizer, and final answer model. Each deployment can have its own provider, contract, policies, and budget gates.
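As a sketch, the role-to-deployment mapping might look like the following. The dict shape and field names (provider, model, budget_usd) are illustrative assumptions, not Quantlix's actual deployment schema:

```python
# Illustrative only: one deployment per model role, each with its own
# provider and budget gate. Field names are assumptions, not the real schema.
deployments = {
    "planner":      {"provider": "anthropic",    "model": "reasoning-tier",    "budget_usd": 5.0},
    "classifier":   {"provider": "openai",       "model": "small-fast-tier",   "budget_usd": 0.5},
    "summarizer":   {"provider": "azure-openai", "model": "general-chat-tier", "budget_usd": 1.0},
    "final_answer": {"provider": "anthropic",    "model": "reasoning-tier",    "budget_usd": 5.0},
}

def resolve(role: str) -> dict:
    """Look up the deployment bound to a given workflow role."""
    return deployments[role]
```

The point of the split is that swapping the classifier's provider later touches one entry, not the whole graph.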

4. Create the workflow graph

Use the workflow editor to compose supported nodes such as input, retrieval, mcp_input, tool_call, function, model, agent, router, condition, approval, and output. Each model or agent node can bind to a different provider-backed deployment.
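A minimal graph built from those node types could be modeled as below. The list-of-dicts shape is a stand-in for whatever the workflow editor produces, not the real export format:

```python
# Hypothetical sketch of a small graph using the node types named above.
# Node ids, "next" links, and deployment names are illustrative assumptions.
graph = [
    {"id": "in",     "type": "input",     "next": "rag"},
    {"id": "rag",    "type": "retrieval", "next": "plan"},
    {"id": "plan",   "type": "model",     "deployment": "planner", "next": "route"},
    {"id": "route",  "type": "router",    "routes": {"simple": "answer", "complex": "agent"}},
    {"id": "agent",  "type": "agent",     "deployment": "planner", "next": "answer"},
    {"id": "answer", "type": "model",     "deployment": "final_answer", "next": "out"},
    {"id": "out",    "type": "output"},
]
node_types = {n["type"] for n in graph}
```

Note the two model-family nodes bind to different deployments, which is the core of the multi-model pattern.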

5. Add routing and fallbacks

Use router or condition nodes for tiering and escalation. Use fallback_node_id for supported fallback behavior when a node fails.
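Assuming fallback_node_id names the node to run when its owner fails, the behavior can be sketched with a toy executor (the dict shape and executor are illustrative, not product code):

```python
# Sketch: a primary classifier with a stronger fallback deployment.
# fallback_node_id is from the docs; everything else is an assumption.
nodes = {
    "classify":        {"type": "model", "deployment": "classifier",
                        "fallback_node_id": "classify_backup"},
    "classify_backup": {"type": "model", "deployment": "planner"},
}

def execute(node_id: str, fail: bool) -> str:
    """Toy executor: on failure, fall through to the fallback node."""
    node = nodes[node_id]
    if fail and "fallback_node_id" in node:
        return execute(node["fallback_node_id"], fail=False)
    return node["deployment"]
```

Escalating from a cheap deployment to a stronger one on failure keeps the happy path inexpensive.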

6. Add tools and MCP inputs

Use function/tool_call nodes for HTTP/webhook/internal tools, and mcp_input nodes for external MCP servers. Agent nodes can expose those functions as native provider tools.
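One way to picture the tool surface: function/tool_call nodes carry an HTTP or webhook target, and an agent node lists which of them it may call. All field names and URLs below are made up for illustration:

```python
# Hypothetical node shapes; "kind", "url", and "server" are assumptions.
tool_nodes = [
    {"id": "lookup_order", "type": "function",  "kind": "http",
     "url": "https://internal.example/orders"},
    {"id": "notify",       "type": "tool_call", "kind": "webhook",
     "url": "https://hooks.example/notify"},
    {"id": "docs",         "type": "mcp_input", "server": "mcp://docs.example"},
]

# An agent node can expose the function-style nodes as native provider tools.
agent_node = {
    "id": "support_agent", "type": "agent", "deployment": "planner",
    "tools": [n["id"] for n in tool_nodes if n["type"] in ("function", "tool_call")],
}
```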

7. Put policy boundaries before risky steps

Add schema contracts, redact_text, policy_check nodes, budget policies, approval nodes, and enforcement packs before expensive or sensitive model calls.
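The intent of a policy boundary can be sketched as a gate function that redacts and budget-checks a payload before the expensive call. The redaction rule and thresholds are toy stand-ins, not Quantlix's redact_text or policy_check behavior:

```python
# Illustrative gate combining a redaction pass with a budget check.
def policy_gate(payload: str, spent_usd: float, budget_usd: float) -> dict:
    # Stand-in for redact_text: scrub one known-sensitive pattern.
    redacted = payload.replace("ssn=123-45-6789", "ssn=[REDACTED]")
    if spent_usd >= budget_usd:
        return {"action": "block", "reason": "budget_exceeded"}
    return {"action": "allow", "payload": redacted,
            "redactions": int(redacted != payload)}
```

The model node downstream only ever sees the gate's output, so sensitive input and over-budget runs never reach the provider.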

8. Bind deployments to model and agent nodes

In the workflow editor, select each model or agent node and choose the deployment that matches its role. For simple graphs, you can also set a run-level default deployment instead of binding every node individually.
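The binding logic amounts to a lookup with a default, which can be sketched as follows (names are illustrative assumptions):

```python
# Hypothetical node-to-deployment bindings plus a run-level default
# used for any node left unbound.
bindings = {"plan": "planner", "route": "classifier", "answer": "final_answer"}
default_deployment = "summarizer"

def deployment_for(node_id: str) -> str:
    """Resolve a node's deployment, falling back to the run default."""
    return bindings.get(node_id, default_deployment)
```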

9. Run and inspect the trace

Execute the workflow, then inspect node executions, tool calls, model payloads, latency, cost, policy decisions, and errors in observability.

10. Add evals for critical paths

Create golden datasets for the workflow’s important tasks. Use eval runs and comparisons to catch regressions after prompt, provider, routing, or policy changes.
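A golden dataset and regression check can be as small as the sketch below; this is a stand-in for Quantlix eval runs, not the product's eval API, and the toy router under test is invented for illustration:

```python
# Illustrative golden dataset for a routing step.
golden = [
    {"input": "reset my password",                      "expected_route": "simple"},
    {"input": "our cluster is losing data under load",  "expected_route": "complex"},
]

def run_eval(route_fn) -> float:
    """Fraction of golden cases where the router picks the expected route."""
    hits = sum(route_fn(case["input"]) == case["expected_route"] for case in golden)
    return hits / len(golden)

# Toy router under test: escalate longer tickets to the complex tier.
score = run_eval(lambda text: "complex" if len(text.split()) > 5 else "simple")
```

Re-running the same dataset after a prompt, provider, routing, or policy change makes regressions visible as a score drop.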

Recommended graph pattern

Start simple, then add routing and fallbacks once the main path is reliable:

input
  → redact_text / policy_check
  → planner model
  → retrieval or mcp_input
  → agent or function
  → router / condition
  → specialist model
  → approval (optional)
  → final answer model
  → output

Bind each model or agent node to the deployment that fits the role. For example, use a cheaper classifier model for routing and a stronger reasoning model for native tool-calling.
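The pattern above can be written down as data, with per-role bindings attached; the shapes are assumptions for illustration only:

```python
# The linear main path from the diagram above, as (node_id, node_type) pairs.
main_path = [
    ("in", "input"), ("guard", "policy_check"), ("plan", "model"),
    ("fetch", "retrieval"), ("act", "agent"), ("tier", "router"),
    ("spec", "model"), ("gate", "approval"), ("answer", "model"),
    ("out", "output"),
]

# Cheaper classifier for the routing tier, stronger reasoning model for
# the agent's native tool-calling (binding names are illustrative).
bindings = {"tier": "classifier", "act": "planner", "answer": "final_answer"}
```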

Common questions

Can every model node use a different provider?

Yes. Quantlix workflows support per-model-step deployment bindings, so one graph can use different providers and models for different roles.

Does Quantlix execute agent nodes today?

Yes. Agent nodes run a bounded native tool-calling loop against a provider-backed chat deployment. Tools are configured as HTTP, webhook, or internal functions and every step is recorded in the workflow trace.

How do I prevent runaway workflow cost?

Use budget policies, retry limits, route-specific deployments, and approval gates before expensive or tool-heavy branches.

How do I audit what happened?

Each workflow execution records node-level inputs, outputs, errors, timings, traces, provider/model metadata, and policy decisions.