Open-source · AGPL-3.0

Enterprise RAG.
Self‑hosted. EU‑sovereign.

The open-source RAG platform for organizations that cannot send their data to American clouds. One-command setup, on-premise or air-gapped.

50+ GitHub stars · GDPR ready · EU AI Act ready

Built on European open-source

Sovereignty isn't a slogan — it's a supply chain.

Every line we depend on is open-source, auditable, and developed inside the European Union. No US cloud in the request path.

+ more EU open-source projects we contribute to

Why I3K RAG Enterprise

Built for organizations that can't outsource their data.

Most RAG platforms assume you're happy sending your documents to a US cloud. We assume you're not — and we engineer accordingly.

EU-sovereign by design

Servers, data, models and the engineering team are all in the European Union. No transatlantic data transfers, no Schrems II exposure.

Enterprise-grade backup & restore

Full system backup to 70+ storage destinations via rclone, including S3, MEGA, Google Drive, OneDrive, Dropbox, WebDAV/Nextcloud, FTP, SFTP, Backblaze B2 and pCloud. Cron scheduling, retention policies, zero downtime.
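To make "retention policies" concrete, here is a minimal sketch of one possible rule: always keep the newest few archives, and delete anything else older than a cutoff. The function name `select_expired`, its parameters and the archive names are hypothetical and only illustrate the idea; they are not the product's actual configuration.

```python
from datetime import datetime, timedelta

def select_expired(backups, now, keep_last=7, max_age_days=30):
    """Return the archive names a simple retention rule would delete.

    backups: dict mapping archive name -> creation datetime.
    The newest `keep_last` archives are always kept; any other archive
    older than `max_age_days` is marked for deletion.
    """
    newest_first = sorted(backups, key=backups.get, reverse=True)
    protected = set(newest_first[:keep_last])
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        name for name in newest_first
        if name not in protected and backups[name] < cutoff
    )

# Hypothetical archives created 1, 5, 20, 40 and 60 days ago.
now = datetime(2026, 1, 31)
backups = {
    f"backup-{age:02d}.tar.gz": now - timedelta(days=age)
    for age in (1, 5, 20, 40, 60)
}
expired = select_expired(backups, now, keep_last=2, max_age_days=30)
```

With these sample values, only the 40-day and 60-day archives are selected. A scheduled job could then pass that list to whatever deletes files on the backup destination.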

Runs on the hardware you have

NVIDIA CUDA, AMD ROCm, or CPU-only. No vendor lock-in. Choose during setup.
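As an illustration only, a setup script could suggest a default backend by checking which vendor tools are on the PATH. This sketch assumes `nvidia-smi` indicates an NVIDIA driver and `rocminfo` indicates a ROCm install; the function name `suggest_backend` is hypothetical, and the real installer may decide differently.

```python
import shutil

def suggest_backend(which=shutil.which):
    """Suggest a compute backend based on vendor tools found on PATH.

    `which` is injectable so the logic can be exercised without GPUs.
    """
    if which("nvidia-smi"):
        return "cuda"
    if which("rocminfo"):
        return "rocm"
    return "cpu"

backend = suggest_backend()
```

The result is only a suggestion: the operator can still override it during setup.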

Multilingual out of the box

BAAI/bge-m3 embeddings cover 29 languages. No per-language fine-tuning.

On-premise first

One-command install on Ubuntu 20.04+. Runs fully air-gapped on your hardware. No outbound calls to third-party APIs.

GDPR & EU AI Act ready

JWT auth, RBAC with 3 roles, audit log and retention policies. AGPL-3.0 source — auditable end to end.
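A minimal sketch of how JWT verification and a three-role permission check could fit together, using only the Python standard library. The role names (`admin`, `editor`, `viewer`), the permission sets and the choice of HS256 are assumptions for illustration, not taken from the product.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical roles and permissions; the product's three roles may differ.
PERMISSIONS = {
    "admin": {"read", "upload", "manage_users"},
    "editor": {"read", "upload"},
    "viewer": {"read"},
}

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign(payload: dict, secret: bytes) -> str:
    """Create an HS256-signed JWT."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url(sig)}"

def verify(token: str, secret: bytes, now=None) -> dict:
    """Check signature and expiry; always uses HS256, ignoring the header's alg."""
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(body))
    if claims.get("exp", 0) < (time.time() if now is None else now):
        raise PermissionError("token expired")
    return claims

def can(claims: dict, action: str) -> bool:
    """Role-based check: is the action allowed for the role in the token?"""
    return action in PERMISSIONS.get(claims.get("role"), set())

secret = b"demo-secret"
token = sign({"sub": "user-1", "role": "viewer", "exp": 2000}, secret)
claims = verify(token, secret, now=1000)
```

In production, a maintained JWT library is a safer choice than hand-rolled signing; the point here is only the flow from verified token to role to permitted action.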

How it works

The RAG pipeline in 4 steps.

Ingest, embed, retrieve, generate. Every stage runs locally. No data ever leaves your infrastructure.

  1. Ingest

    Upload via web UI or API. Apache Tika and Tesseract extract text from PDF, DOCX, PPTX, XLSX, ODT, RTF, HTML, XML and scanned documents (OCR).

    Tika · OCR

  2. Embed & store

    Documents are chunked and embedded with BAAI/bge-m3 (29 languages). Vectors are stored in Qdrant with metadata for RBAC filtering.

    bge-m3 · Qdrant

  3. Retrieve

    Our in-house retrieval orchestrator runs semantic retrieval from Qdrant. The relevance threshold and top-K are configurable. Role-based filtering is applied at the retrieval layer.

    I3K orchestrator

  4. Generate

    Retrieved chunks are passed to EuLLM (default) or a compatible LLM for grounded answer generation. Default models: Qwen3:14b-q4_K_M, Mistral 7B Q4. Fully local, zero external calls.

    EuLLM · Mistral 7B
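The four steps above can be sketched end to end in a few lines. This is a toy illustration under stated assumptions, not the I3K orchestrator: a hashed bag-of-words vector stands in for bge-m3, an in-memory list stands in for Qdrant, and every function name, parameter default and sample document is hypothetical. Text extraction and the LLM call are left out.

```python
import hashlib
import math
import re

DIM = 64  # toy dimension; a real embedding model produces far larger vectors

def chunk(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]

def embed(text):
    """Toy hashed bag-of-words vector, L2-normalised (stand-in for bge-m3)."""
    vec = [0.0] * DIM
    for token in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, index, role, top_k=3, threshold=0.15):
    """Cosine search over entries the role may see, keeping scores >= threshold."""
    q = embed(query)
    scored = []
    for entry in index:
        if role not in entry["roles"]:
            continue  # role filter runs before scoring
        score = sum(a * b for a, b in zip(q, entry["vector"]))
        if score >= threshold:
            scored.append((score, entry["text"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

def build_prompt(question, hits):
    """Assemble a grounded prompt from the retrieved chunks."""
    context = "\n".join(f"[{i + 1}] {text}" for i, (_, text) in enumerate(hits))
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"

# Hypothetical documents with the roles allowed to read them.
docs = [
    ("Backups run nightly and are kept for thirty days.", {"admin", "viewer"}),
    ("Salary bands are reviewed every January.", {"admin"}),
]
index = [
    {"text": piece, "vector": embed(piece), "roles": roles}
    for text, roles in docs
    for piece in chunk(text)
]

question = "How long are backups kept?"
hits = retrieve(question, index, role="viewer")
prompt = build_prompt(question, hits)
```

Filtering by role before scoring means a restricted chunk never reaches the prompt, whatever its similarity to the query. The resulting `prompt` is what would be handed to the local LLM.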

  • 29 Languages

  • 70+ Backup destinations

  • 10,000+ Documents per node

  • 100% Local

Compare

I3K RAG Enterprise vs the alternatives.

We've tried to be fair. If we've got something wrong about a competitor, tell us.

| Capability | I3K RAG Enterprise | Onyx | Glean | Cohere North |
| --- | --- | --- | --- | --- |
| Open-source core | Yes | Yes | No | No |
| EU-sovereign LLM engine included (EuLLM) | Yes | No | No | No |
| 100% self-hosted | Yes | Yes | Hybrid | No |
| Air-gapped deployment | Yes | Partial | No | No |
| EU-resident development & data | Yes | No | No | No |
| Built-in backup (70+ cloud providers) | Yes | No | N/A | N/A |
| Multi-GPU support (NVIDIA/AMD/CPU) | Yes | Limited | N/A | N/A |
| Multilingual (29 languages, bge-m3) | Yes | EN-first | EN-first | Yes |
| GDPR + EU AI Act tooling | Yes | DIY | Add-on | Add-on |
| No vendor lock-in | Yes | Yes | No | No |

Comparison based on publicly available documentation as of 2026. Vendor capabilities evolve — verify against the latest releases.

Ready to run RAG on your own infrastructure?

Start with the open-source Community edition, or talk to us about Pro with structured extraction, SSO, audit log and SLA.