
Knowledge source configuration: Build RAG from upload, S3, or web

Quantlix Team

A knowledge base is a RAG container: it groups sources (where documents come from) and defines how they're chunked, embedded, and indexed. You configure it once, then ingest and query.

The setup flow

  1. Provider — Add a provider with embeddings (e.g. Voyage AI)
  2. Chunking profile — Strategy (fixed, markdown, semantic), chunk size, overlap
  3. Embedding profile — Which provider model to use for embeddings
  4. Vector index — Backend (pgvector, Pinecone, Weaviate, Qdrant)
  5. Knowledge base — Container with default profiles + vector index
  6. Sources — Add upload, S3, or web source
  7. Ingestion — Upload docs, trigger ingestion job

All of this lives in Dashboard → Knowledge.
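
To make the dependencies between these steps concrete, here is a minimal sketch of the objects as plain Python dataclasses. Every class name, field name, and default value below is an illustrative assumption, not the Quantlix API schema; only the strategy and backend names come from the list above.

```python
from dataclasses import dataclass, field

# Illustrative model of the setup flow. All names and defaults are
# assumptions for this sketch, not the actual Quantlix schema.

CHUNKING_STRATEGIES = {"fixed", "markdown", "semantic"}
VECTOR_BACKENDS = {"pgvector", "pinecone", "weaviate", "qdrant"}


@dataclass
class ChunkingProfile:
    strategy: str = "markdown"
    chunk_size: int = 512   # hypothetical default
    overlap: int = 64       # hypothetical default

    def validate(self) -> None:
        if self.strategy not in CHUNKING_STRATEGIES:
            raise ValueError(f"unknown chunking strategy: {self.strategy}")
        if not 0 <= self.overlap < self.chunk_size:
            raise ValueError("overlap must be >= 0 and smaller than chunk_size")


@dataclass
class EmbeddingProfile:
    provider: str   # e.g. a provider that offers embeddings
    model: str      # placeholder; use a model your provider actually exposes


@dataclass
class VectorIndex:
    backend: str = "pgvector"

    def validate(self) -> None:
        if self.backend not in VECTOR_BACKENDS:
            raise ValueError(f"unknown vector backend: {self.backend}")


@dataclass
class KnowledgeBase:
    name: str
    chunking: ChunkingProfile
    embedding: EmbeddingProfile
    index: VectorIndex
    sources: list = field(default_factory=list)  # filled in at step 6

    def validate(self) -> None:
        self.chunking.validate()
        self.index.validate()


kb = KnowledgeBase(
    name="product-docs",
    chunking=ChunkingProfile(strategy="markdown", chunk_size=512, overlap=64),
    embedding=EmbeddingProfile(provider="voyage", model="example-embedding-model"),
    index=VectorIndex(backend="pgvector"),
)
kb.validate()
```

The knowledge base only holds references to the profiles and the index, which is why steps 1 to 4 have to exist before step 5.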

Source types

| Type | Use case |
| --- | --- |
| upload | Files uploaded via portal or API. PDFs, markdown, etc. |
| s3 | S3-compatible storage (MinIO, Scaleway, AWS). Bucket + path. |
| web | Fetch from a URL. Good for docs sites. |

Each source has a sync mode: manual (you trigger ingestion), scheduled (runs on an interval), or webhook (external system triggers via URL).
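
As a rough illustration, a source definition pairs one type with one sync mode. The sketch below validates that pairing; the field names (`location`, `interval_minutes`) and the rules that a scheduled source needs an interval and that `s3` and `web` sources need a location are assumptions made for the example, not documented Quantlix behavior.

```python
from dataclasses import dataclass
from typing import Optional

SOURCE_TYPES = {"upload", "s3", "web"}
SYNC_MODES = {"manual", "scheduled", "webhook"}


@dataclass
class Source:
    type: str
    sync_mode: str = "manual"
    location: Optional[str] = None          # hypothetical: bucket + path, or URL
    interval_minutes: Optional[int] = None  # hypothetical: only for "scheduled"

    def validate(self) -> None:
        if self.type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {self.type}")
        if self.sync_mode not in SYNC_MODES:
            raise ValueError(f"unknown sync mode: {self.sync_mode}")
        # Assumed rule: remote sources must say where to fetch from.
        if self.type in {"s3", "web"} and not self.location:
            raise ValueError(f"{self.type} source needs a location")
        # Assumed rule: a scheduled source needs a positive interval.
        if self.sync_mode == "scheduled" and not (self.interval_minutes or 0) > 0:
            raise ValueError("scheduled source needs a positive interval_minutes")


sources = [
    Source(type="upload"),  # manual by default
    Source(type="s3", sync_mode="scheduled",
           location="docs-bucket/handbook/", interval_minutes=60),
    Source(type="web", sync_mode="webhook",
           location="https://docs.example.com/"),
]
for source in sources:
    source.validate()
```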

The pipeline

Ingestion runs in four stages: fetch documents → chunk → embed → index into the vector store. Documents and chunks are stored in Postgres; embeddings go to the vector backend. Once ingestion finishes, you can query via the retrieval API or the RAG endpoint.
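
The four stages can be sketched end to end in a few lines of Python. This is a toy stand-in, not how Quantlix implements ingestion: the fetch stage is a dict of already-loaded text, chunking is fixed-size by character, the "embedding" is a hashed bag of words rather than a real model, and the index is an in-memory list instead of a vector backend.

```python
import hashlib
import math


def chunk_fixed(text, size=40, overlap=10):
    """Fixed-size character chunks; consecutive chunks share `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    chunks, start, step = [], 0, size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


def embed(text, dim=16):
    """Toy embedding: hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class InMemoryIndex:
    """Stand-in for a vector backend; ranks rows by cosine similarity."""

    def __init__(self):
        self.rows = []

    def add(self, doc_id, chunk, vector):
        self.rows.append((doc_id, chunk, vector))

    def search(self, vector, k=3):
        scored = [
            (sum(a * b for a, b in zip(vector, v)), doc_id, chunk)
            for doc_id, chunk, v in self.rows
        ]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:k]


def ingest(documents, index, size=40, overlap=10):
    """fetch (stubbed as a dict) -> chunk -> embed -> index."""
    count = 0
    for doc_id, text in documents.items():
        for chunk in chunk_fixed(text, size, overlap):
            index.add(doc_id, chunk, embed(chunk))
            count += 1
    return count


index = InMemoryIndex()
docs = {"intro.md": "A knowledge base groups sources and defines how "
                    "documents are chunked, embedded, and indexed."}
indexed = ingest(docs, index)
top_hits = index.search(embed("how are documents chunked"), k=2)
```

Because the vectors are normalized, the dot product in `search` is the cosine similarity, so a query identical to a stored chunk scores 1.0.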

Webhook for CI/CD

For sources that update from external systems (e.g. docs built in CI), use `sync_mode: "webhook"`. The API returns a webhook secret. When your docs change, call the webhook URL with that secret as the token; no API key is needed. Quantlix then triggers ingestion automatically.
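
A CI step that fires the webhook could look roughly like the sketch below. The post does not specify the URL shape, the HTTP method, or how the token is passed, so the POST method, the `X-Webhook-Token` header, the environment variable names, and the placeholder URL are all assumptions; check the values the API returns for your source before using anything like this.

```python
import os
import urllib.request


def build_webhook_request(webhook_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the ingestion trigger request.

    Assumptions: the webhook accepts an empty POST and reads the token
    from an X-Webhook-Token header. Neither is confirmed by the post.
    """
    return urllib.request.Request(
        webhook_url,
        data=b"",
        method="POST",
        headers={"X-Webhook-Token": token},
    )


def trigger_ingestion(webhook_url: str, token: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_webhook_request(webhook_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# In CI, read both values from secrets rather than hard-coding them.
# The variable names and fallback values are placeholders.
request = build_webhook_request(
    os.environ.get("KB_WEBHOOK_URL", "https://example.invalid/hooks/ingest"),
    os.environ.get("KB_WEBHOOK_TOKEN", "placeholder-token"),
)
```

Keeping request construction separate from sending makes the CI script easy to test without network access; call `trigger_ingestion` as the last step of the docs build.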
