
# Retrieval integrations: Vector backends and RAG

Quantlix Team

Retrieval-Augmented Generation (RAG) needs a vector store. Quantlix supports several backends: pgvector (built in via Postgres), Pinecone, Weaviate, and Qdrant. Choose the backend that fits your scale and deployment model.

## Vector backends

| Backend | Type | Use case |
|---------|------|----------|
| **pgvector** | Built-in (Postgres) | Default; no extra setup. |
| **Pinecone** | Cloud | Fully managed, scalable. |
| **Weaviate** | Local or cloud | Self-hosted or managed cloud. |
| **Qdrant** | Local or cloud | Self-hosted or managed cloud. |

Create vector indexes in **Dashboard → Knowledge → Vector indexes**. Assign one to each knowledge base. pgvector works out of the box. For Pinecone, Weaviate, or Qdrant, you add the service (e.g. via Docker for local) and configure the URL and credentials.
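For example, a local Qdrant instance can be started with Docker before pointing a vector index at it. This is a minimal sketch; the storage path is an assumption, and the URL and any API key you configure in the dashboard must match your own setup:

```shell
# Run Qdrant locally, persisting data to ./qdrant_storage (path is an example).
# The REST API is then reachable at http://localhost:6333 — use that as the
# backend URL when creating the vector index in the dashboard.
docker run -d \
  -p 6333:6333 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant
```

Pinecone and Weaviate follow the same pattern: stand up (or sign up for) the service, then supply its URL and credentials when creating the index.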

## Two ways to query

**Semantic search only** — `POST /retrieval/query` returns chunks. You get the relevant text chunks for a query; you decide what to do with them (e.g. pass to your own LLM).

```bash
curl -s -X POST "$API_URL/retrieval/query" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"knowledge_base_id": "'"$KB_ID"'", "query": "What is RAG?", "top_k": 5}' | python3 -m json.tool
```
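With semantic search, assembling the retrieved chunks into a prompt for your own LLM is up to you. Here is one way to do that; the response shape (`chunks` as a list of objects with `text` and `score` fields) is an assumption for illustration, so check it against your actual API response:

```python
# Sketch: turn /retrieval/query results into a grounded prompt for your own
# LLM. The chunk shape below is assumed, not taken from the Quantlix API docs.

def build_prompt(question: str, chunks: list[dict]) -> str:
    """Number each retrieved chunk and prepend it as context to the question."""
    context = "\n\n".join(
        f"[{i + 1}] {c['text']}" for i, c in enumerate(chunks)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Example chunks, as they might come back from the query above.
chunks = [
    {"text": "RAG combines retrieval with generation.", "score": 0.91},
    {"text": "Chunks are embedded and stored in a vector index.", "score": 0.87},
]
print(build_prompt("What is RAG?", chunks))
```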

**Full RAG** — `POST /rag/run` retrieves chunks and generates an answer with citations. You provide a chat model and a question; Quantlix does the rest.

```bash
curl -s -X POST "$API_URL/rag/run" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"knowledge_base_id": "'"$KB_ID"'", "provider_model_id": "'"$CHAT_MODEL_ID"'", "question": "What is RAG?", "top_k": 5}' | python3 -m json.tool
```

The response includes `answer`, `citations`, `model`, and token usage.
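A small formatter makes those fields easy to display. The exact shapes below (citations as objects with a `source` field, usage as `prompt_tokens`/`completion_tokens`, and the model name) are assumptions for illustration, not the documented schema:

```python
# Sketch: render a /rag/run response for display. Field shapes and the
# example values are hypothetical — adapt to the real response.

def format_answer(resp: dict) -> str:
    """Print the answer, a numbered source list, and a usage footer."""
    lines = [resp["answer"], "", "Sources:"]
    for i, cite in enumerate(resp["citations"], start=1):
        lines.append(f"  [{i}] {cite['source']}")
    usage = resp["usage"]
    lines.append(
        f"({resp['model']}, "
        f"{usage['prompt_tokens']}+{usage['completion_tokens']} tokens)"
    )
    return "\n".join(lines)

# Hypothetical response, shaped like the fields listed above.
resp = {
    "answer": "RAG retrieves relevant chunks and grounds generation on them.",
    "citations": [{"source": "intro.pdf"}],
    "model": "example-chat-model",
    "usage": {"prompt_tokens": 512, "completion_tokens": 48},
}
print(format_answer(resp))
```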

## Try it in the portal

Add a knowledge base, add sources, upload documents, run ingestion. Then use the **Try RAG** modal on the knowledge base page to ask questions and see answers with citations.