Deployment

I3K RAG Enterprise is typically installed with a one-command script on a single Linux host. For larger or more constrained environments, multi-server and air-gapped topologies are also supported. The same FastAPI backend, Qdrant vector store and Ollama runtime are used in every topology — what changes is only how the components are wired together.

Topologies

Single-host

All components (FastAPI backend, React + Vite frontend, Qdrant, Ollama, SQLite user DB, Apache Tika and Tesseract) run on the same server. Installation takes a single command and roughly one hour, most of which is spent pulling the Qwen3:14b-q4_K_M and Mistral 7B Q4 models plus the BAAI/bge-m3 embedding model (29 languages).

Single-host suits teams of up to a few hundred users. The stack is production-ready for 10,000+ documents on commodity hardware and can handle datasets in the tens of thousands of documents.

Air-gapped

The air-gapped topology targets isolated networks with no outbound connectivity, typical of defense, healthcare and critical infrastructure. The installer supports offline installation from pre-downloaded packages and model bundles. Transfer the bundle through your approved channel, run the installer, and the system comes up without making any outbound connection.
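Before running the installer on the isolated host, it is worth confirming that the bundle survived the transfer intact. The sketch below is not part of the installer; it assumes you recorded a SHA-256 digest on the connected side and carried it across alongside the bundle. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that multi-gigabyte model bundles do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: Path, expected_hex: str) -> bool:
    """True if the file's digest matches the digest recorded before transfer.
    Whitespace and letter case in the recorded digest are ignored."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Compute the digest with `sha256_of` on the connected machine, then call `verify_bundle` on the air-gapped host and only proceed with installation if it returns `True`.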

Multi-server (advanced)

For larger workloads, Qdrant and Ollama can be moved onto dedicated GPU nodes, separated from the FastAPI backend. Configuration is manual and applied after the standard install: point the backend at the remote Qdrant endpoint and the remote Ollama endpoint, then restart the API service.
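Before restarting the API service, a quick reachability check from the backend host can catch firewall or address mistakes. The following is a minimal sketch under stated assumptions: the host names are hypothetical, the ports are the upstream defaults for Qdrant (6333) and Ollama (11434), and the remote services are assumed to accept unauthenticated GET requests. How the backend itself is pointed at these endpoints is documented in the repository, not here.

```python
import urllib.error
import urllib.request
from typing import Dict

# Hypothetical hosts for illustration only.
QDRANT_URL = "http://gpu-node-1:6333"
OLLAMA_URL = "http://gpu-node-2:11434"


def probe_urls(qdrant_base: str, ollama_base: str) -> Dict[str, str]:
    """Build one cheap, read-only GET URL per service."""
    return {
        "qdrant": qdrant_base.rstrip("/") + "/collections",  # lists collections
        "ollama": ollama_base.rstrip("/") + "/api/tags",  # lists local models
    }


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers with a 2xx status within the timeout.
    Connection errors, HTTP error statuses and malformed URLs give False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


def check_all(qdrant_base: str, ollama_base: str) -> Dict[str, bool]:
    """Probe both services and report reachability per service name."""
    urls = probe_urls(qdrant_base, ollama_base)
    return {name: is_reachable(url) for name, url in urls.items()}
```

Calling `check_all(QDRANT_URL, OLLAMA_URL)` from the backend host returns a dictionary such as `{"qdrant": True, "ollama": False}`. If the Qdrant node enforces an API key, the unauthenticated probe will report `False` even though the node is up.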

Hardware requirements

Resource   Requirement
GPU        NVIDIA CUDA with 8–16 GB VRAM (recommended), AMD ROCm, or CPU-only (reduced performance)
RAM        16 GB minimum, 32 GB recommended
Storage    50 GB minimum, scales with the dataset
OS         Ubuntu 20.04+ (22.04 recommended)
Network    80+ Mbit/s recommended for initial setup (model download)

CPU-only is supported and useful for evaluation, but expect noticeably higher latency on generation. For production workloads, plan around a CUDA GPU with at least 12 GB VRAM.
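The RAM and storage figures above can be checked on a candidate host before installing. This is an illustrative preflight sketch, not something shipped with the installer. It assumes a Linux host, treats the 50 GB storage minimum as free space on the target filesystem (an assumption), and does not attempt GPU detection.

```python
import os
import shutil
from typing import List, Tuple

# Thresholds taken from the requirements listed above.
MIN_RAM_GB = 16
RECOMMENDED_RAM_GB = 32
MIN_FREE_DISK_GB = 50  # assumption: interpreted as free space


def evaluate(ram_gb: float, free_disk_gb: float) -> List[str]:
    """Return human-readable findings; an empty list means nothing to flag."""
    findings = []
    if ram_gb < MIN_RAM_GB:
        findings.append(f"RAM {ram_gb:.0f} GB is below the {MIN_RAM_GB} GB minimum")
    elif ram_gb < RECOMMENDED_RAM_GB:
        findings.append(f"RAM {ram_gb:.0f} GB is below the recommended {RECOMMENDED_RAM_GB} GB")
    if free_disk_gb < MIN_FREE_DISK_GB:
        findings.append(f"Free disk {free_disk_gb:.0f} GB is below the {MIN_FREE_DISK_GB} GB minimum")
    return findings


def detect(path: str = "/") -> Tuple[float, float]:
    """Measure physical RAM and free disk space, both in decimal GB (Linux)."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free
    return ram_bytes / 1e9, free_bytes / 1e9
```

On the target host, `evaluate(*detect("/"))` prints nothing to worry about when it returns an empty list. Pass the mount point that will hold the data directory if it is not the root filesystem.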

Backup & restore

Backup is built in and uses rclone under the hood, so every one of the 70+ providers rclone supports is available out of the box.

  • Object storage: S3, MinIO, Backblaze B2, Wasabi and S3-compatible endpoints.
  • Consumer cloud: Google Drive, OneDrive, Dropbox, Mega, pCloud.
  • Self-hosted: WebDAV / Nextcloud, ownCloud.
  • Traditional: FTP, SFTP.

You can:

  • Schedule backups via cron (daily, hourly, custom cadence).
  • Configure a retention policy (keep last N daily, weekly, monthly snapshots).
  • Run zero-downtime backups — Qdrant snapshots and the SQLite user DB are captured consistently without interrupting the API.
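To illustrate how a SQLite database can be captured consistently while it stays in use, the sketch below uses the online backup API exposed by Python's standard `sqlite3` module. It is an illustration of the general technique, not the product's own backup code, and the function name is hypothetical. Qdrant has its own snapshot mechanism, which is not shown here.

```python
import sqlite3
from pathlib import Path


def snapshot_sqlite(src_path: Path, dest_path: Path) -> None:
    """Copy a live SQLite database to dest_path using the online backup API.
    The copy is a consistent image of the source even if other connections
    are reading from or writing to it while the backup runs."""
    src = sqlite3.connect(str(src_path))
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)  # copies all pages of the "main" database
    finally:
        dest.close()
        src.close()
```

The resulting file is an ordinary SQLite database and can be uploaded to the backup remote like any other file.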

Restore is the inverse operation against the same remote: pull the bundle, run the restore command, and the instance comes back at the chosen point in time.
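Which points in time remain available for restore depends on the retention policy. The sketch below shows one common way a "keep last N daily, weekly, monthly" rule can be evaluated over a list of snapshot timestamps. It is an assumption about the semantics, not the product's implementation, and the default counts are hypothetical.

```python
from datetime import datetime
from typing import Callable, Hashable, Iterable, Set


def _newest_per_bucket(
    snapshots: Iterable[datetime],
    key: Callable[[datetime], Hashable],
    keep: int,
) -> Set[datetime]:
    """Keep the newest snapshot from each of the `keep` most recent buckets."""
    chosen = {}
    for ts in sorted(snapshots, reverse=True):  # newest first
        bucket = key(ts)
        if bucket in chosen:
            continue  # a newer snapshot already represents this bucket
        if len(chosen) == keep:
            break  # every remaining snapshot is older than the kept buckets
        chosen[bucket] = ts
    return set(chosen.values())


def select_to_keep(
    snapshots: Iterable[datetime],
    daily: int = 7,  # hypothetical defaults
    weekly: int = 4,
    monthly: int = 6,
) -> Set[datetime]:
    """Union of the snapshots retained by the daily, weekly and monthly rules."""
    snaps = list(snapshots)
    keep = _newest_per_bucket(snaps, lambda t: t.date(), daily)
    keep |= _newest_per_bucket(snaps, lambda t: tuple(t.isocalendar()[:2]), weekly)
    keep |= _newest_per_bucket(snaps, lambda t: (t.year, t.month), monthly)
    return keep
```

Anything not returned by `select_to_keep` is a candidate for deletion from the remote. Weeks are ISO weeks, so a week's bucket runs Monday to Sunday.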

Reverse proxy & TLS

The FastAPI backend listens on localhost:8000 and the React frontend on localhost:3000. Both should sit behind a reverse proxy that terminates TLS — Caddy, nginx or Traefik all work. A minimal Caddy configuration:

rag.example.com {
  # API requests go to the FastAPI backend
  reverse_proxy /api/* localhost:8000
  # Everything else goes to the React frontend
  reverse_proxy localhost:3000
}

Caddy will provision and renew a Let's Encrypt certificate automatically. For nginx or Traefik, mirror the same routing: /api/* to port 8000, everything else to port 3000.

Production checklist

Before going to production, work through:

  • Frontend and backend behind a reverse proxy with TLS 1.3.
  • Automatic backups to an external destination configured via rclone.
  • Disk quotas and log rotation in place for the data and log directories.
  • Monitoring wired up — Prometheus + node_exporter is a fine baseline; any solution you already run will do.
  • Update plan documented — how upstream repo updates are pulled and applied.
  • Disaster recovery plan tested end-to-end on a non-production dataset.
  • JWT secret rotated away from the default value.
  • Admin user provisioned with a strong password.
  • IP-based access restrictions configured if your threat model requires them.
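For the JWT secret item, a replacement value should come from a cryptographically secure source rather than being typed by hand. The sketch below assumes the backend signs tokens with a symmetric (HMAC) secret, which the source does not state explicitly. The function name is hypothetical, and where the value is configured is defined by the repository, not by this example.

```python
import secrets


def new_jwt_secret(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from `num_bytes` of entropy.
    Values below 32 bytes (256 bits) are rejected as too weak for signing."""
    if num_bytes < 32:
        raise ValueError("use at least 32 bytes (256 bits) for a signing secret")
    return secrets.token_urlsafe(num_bytes)
```

Generate the value once, store it in your secret manager, and expect existing sessions to be invalidated when the backend restarts with the new secret.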

I3K RAG Enterprise is distributed under AGPL-3.0. The source repository — github.com/I3K-IT/RAG-Enterprise — is the canonical reference for installer scripts, configuration knobs and supported upgrade paths.
