Retrieval integrations: Vector backends and RAG

Quantlix Team

Retrieval-Augmented Generation (RAG) needs a vector store. Quantlix supports four: pgvector (built-in), Pinecone, Weaviate, and Qdrant. Choose the backend that fits your scale and deployment model.

Vector backends

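| Backend  | Type                | Use case                 |
|----------|---------------------|--------------------------|
| pgvector | Built-in (Postgres) | Default. No extra setup. |
| Pinecone | Cloud               | Managed, scalable.       |
| Weaviate | Local or cloud      | Self-hosted or cloud.    |
| Qdrant   | Local or cloud      | Self-hosted or cloud.    |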

Create vector indexes in Dashboard → Knowledge → Vector indexes, and assign one to each knowledge base. pgvector works out of the box. For Pinecone, Weaviate, or Qdrant, you set up the service first (e.g. via Docker for a local instance) and then configure its URL and credentials.

Two ways to query

Semantic search only: `POST /retrieval/query` returns the text chunks most relevant to a query. You decide what to do with them (e.g. pass them to your own LLM).

curl -s -X POST "$API_URL/retrieval/query" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"knowledge_base_id": "'"$KB_ID"'", "query": "What is RAG?", "top_k": 5}' | python3 -m json.tool
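The same request can also be issued from Python. The following is a minimal sketch using only the standard library; it reuses the endpoint, headers, and body fields from the curl command above. The helper names `build_retrieval_request` and `query_chunks` are illustrative and not part of any Quantlix SDK, and the response is returned as decoded JSON without assuming its shape.

```python
import json
import urllib.request


def build_retrieval_request(api_url, api_key, kb_id, query, top_k=5):
    """Build the POST /retrieval/query request without sending it."""
    payload = {"knowledge_base_id": kb_id, "query": query, "top_k": top_k}
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/retrieval/query",
        # json.dumps escapes quotes in the query, which shell splicing does not.
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def query_chunks(api_url, api_key, kb_id, query, top_k=5, timeout=30):
    """Send the request and return the decoded JSON body unchanged."""
    request = build_retrieval_request(api_url, api_key, kb_id, query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_chunks(API_URL, API_KEY, KB_ID, "What is RAG?")` with the same values as the curl example should send an equivalent request. Building the body with `json.dumps` keeps queries that contain quotes or backslashes valid JSON.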

Full RAG: `POST /rag/run` retrieves chunks and generates an answer with citations. You provide a chat model and a question; Quantlix does the rest.

curl -s -X POST "$API_URL/rag/run" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"knowledge_base_id": "'"$KB_ID"'", "provider_model_id": "'"$CHAT_MODEL_ID"'", "question": "What is RAG?", "top_k": 5}' | python3 -m json.tool

The response includes `answer`, `citations`, `model`, and token usage.
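As a rough illustration of consuming that response from Python, the sketch below posts to `/rag/run` and renders the result as plain text. It reads only the top-level keys named above (`answer`, `citations`, `model`). The inner structure of each citation and the name of the token-usage field are not described in this post, so citations are printed as raw JSON and token usage is left untouched. The function names `run_rag` and `format_rag_response` are hypothetical.

```python
import json
import urllib.request


def run_rag(api_url, api_key, kb_id, chat_model_id, question, top_k=5, timeout=30):
    """POST /rag/run and return the decoded JSON body."""
    payload = {
        "knowledge_base_id": kb_id,
        "provider_model_id": chat_model_id,
        "question": question,
        "top_k": top_k,
    }
    request = urllib.request.Request(
        url=f"{api_url.rstrip('/')}/rag/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def format_rag_response(body):
    """Render answer, model, and citations as plain text.

    Citation entries are dumped as JSON because their fields are an unknown
    here; adapt this once you have inspected a real response.
    """
    lines = [
        f"Answer: {body.get('answer', '')}",
        f"Model: {body.get('model', 'unknown')}",
    ]
    for index, citation in enumerate(body.get("citations") or [], start=1):
        lines.append(f"[{index}] {json.dumps(citation, sort_keys=True)}")
    return "\n".join(lines)
```

A typical call would be `print(format_rag_response(run_rag(API_URL, API_KEY, KB_ID, CHAT_MODEL_ID, "What is RAG?")))`, using the same values as the curl example.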

Try it in the portal

Create a knowledge base, add sources, upload documents, and run ingestion. Then use the Try RAG modal on the knowledge base page to ask questions and see answers with citations.
