Deploying models with GPU

2025-02-18 · Quantlix Team

GPU acceleration can significantly reduce latency for larger models. Here's how to enable it.

Enabling GPU deployments

GPU-backed deployments are available on supported sandboxes and Enterprise engagements. Enable a GPU deployment with:

quantlix deploy qx-example-gpu --gpu --api-key <your_api_key>
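If you would rather not type the key inline (where it can end up in shell history), one option is to read it from an environment variable. The wrapper below is a minimal sketch, not an official Quantlix feature: the function name deploy_gpu and the variable name QUANTLIX_API_KEY are assumptions made for this example. It uses only the command and flags shown above.

```shell
# Hypothetical helper: deploy_gpu and QUANTLIX_API_KEY are names chosen
# for this sketch; they are not defined by the quantlix CLI.
deploy_gpu() {
  # Refuse to run if the key is missing, instead of sending an empty value.
  if [ -z "${QUANTLIX_API_KEY:-}" ]; then
    echo "QUANTLIX_API_KEY is not set" >&2
    return 1
  fi
  # Same command as above, with the key taken from the environment.
  quantlix deploy "$1" --gpu --api-key "$QUANTLIX_API_KEY"
}
```

With the variable exported in your session, running deploy_gpu qx-example-gpu issues the same deploy command as above.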

We use NVIDIA RTX 4000 Ada GPUs with 20 GB of VRAM. GPU compute is billed at €0.50/hour.
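To get a rough sense of what a deployment will cost, you can multiply the hourly rate by the hours you expect it to run. The helper below is a sketch only: the name gpu_cost_eur is made up for this example, and it assumes billing scales linearly with running time, which may not match how partial hours are actually rounded on your invoice.

```shell
# Hourly GPU rate in euros, taken from the pricing stated above.
GPU_RATE_EUR_PER_HOUR=0.50

# Hypothetical helper: print the estimated cost in euros for a number of hours.
# Assumes simple linear billing (hours x rate); real invoices may round differently.
gpu_cost_eur() {
  # LC_ALL=C keeps the decimal separator a dot regardless of system locale.
  LC_ALL=C awk -v hours="$1" -v rate="$GPU_RATE_EUR_PER_HOUR" \
    'BEGIN { printf "%.2f\n", hours * rate }'
}
```

For example, gpu_cost_eur 720 estimates the cost of running one GPU deployment continuously for 30 days, which under this assumption comes to €360.00.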