Editions

Community for the open team. Pro for production scale.

Same engine core, packaged for two very different needs. Community ships under AGPL-3.0 as a one-command install. Pro adds a structured-data warehouse, hybrid retrieval, a smart scheduler and an admin panel — for teams that need to query thousands of documents as data, not just text.

Community

AGPL-3.0 · open-source

Sovereign RAG, one command away.

One-command install in ~1 hour
Vector retrieval with multilingual embeddings
Integrated backup to 70+ destinations (rclone)
Runs on NVIDIA, AMD or CPU-only hosts
Up to ~10,000 documents per node

Get on GitHub

Pro

Commercial license

Most powerful

Production-grade for regulated workloads.

Custom EuLLM inference engine (sovereign default)
Hybrid retrieval: vector + BM25 + cross-encoder rerank
LLM-powered extraction into a SQL warehouse
Smart scheduler with live user coordination
Admin panel + fully air-gapped offline mode

Talk to sales

Shared foundation

Both editions run entirely inside your perimeter. No outbound calls, no telemetry, no managed dependencies.

FastAPI backend with JWT authentication and RBAC
React + Vite frontend
Qdrant vector store (Berlin-built, open-source)
Local LLM inference (Ollama or EuLLM)
Apache Tika + Tesseract for ingestion and OCR
10+ document formats including scanned PDFs
Different default ports — both editions can run on the same host

What Pro adds

Six concrete capabilities that the open Community edition does not ship.

Custom EuLLM inference engine

Pro ships with EuLLM as the default generation backend: an EU-sovereign LLM inference stack built around GGUF weights, with continuous batching and a 16K-token context window. Ollama remains available as a fallback profile for evaluation.

Continuous batching for high throughput on single-GPU hosts
16K context window for long contracts, transcripts and reports
Fully air-gapped: offline HuggingFace mode, no calls home
Compatible with Mistral 7B (built in France) and other GGUF models

Hybrid retrieval

Pure vector search misses precise keyword matches; pure BM25 misses semantics. Pro runs both, fuses them with Reciprocal Rank Fusion, and reranks the result with a cross-encoder.

Vector + BM25 with configurable lexical-semantic balance
Cross-encoder reranking for top-K precision
Multi-query expansion for ambiguous questions
Query analytics on every step of the pipeline

Structured data warehouse

Pro exclusive

Pro doesn't just index your documents — it extracts entities, events and amounts into a normalized relational database. That database becomes a source of verified facts the chat pipeline can quote with provenance.

Six-table schema: extraction jobs, entities, events, amounts, progress, document summaries
Indexed columns and foreign keys — analytical queries are fast
SQLite with WAL today; PostgreSQL-portable when you outgrow it
Four domain profiles: Generic, Intelligence/OSINT, Medical, Legal
Pydantic validation with automatic recovery for malformed LLM JSON
Every extracted fact links back to its source chunk for audit

Result: questions like "how many contracts mention vendor X", "list all events in Q3 2025" or "total exposure by counterparty" are answered from SQL, not by guessing through vector similarity.

Query router

A small classifier decides at runtime whether your question is aggregate, semantic or hybrid — then routes it to the right backend and assembles the answer.

Aggregate questions (count, list, group-by, date filters) → SQL warehouse
Semantic questions ("why", "how", "explain") → vector retrieval
Hybrid questions get both: SQL facts injected into the LLM prompt as "verified data", retrieved chunks as context
Deterministic, transparent, and inspectable per request

Smart scheduler with live coordination

Extraction is a heavy job. Pro coordinates it with the people actually using the system, instead of crashing the chat experience.

Configurable run time (default 20:00); checks every 60 seconds
Connected users get a WebSocket notification before extraction starts
Confirm now, or postpone — up to 3 times, 15 minutes each
Dynamic model switch during extraction: chat drops from 14B to 8B to free VRAM, then restores
Per-document checkpoints: resume without restarting from zero

Hands-free ingestion + admin panel

Drop a file in a watched folder and it lands in both the vector index and the extraction queue. The admin panel gives operators a single pane of glass.

Folder watcher for batch ingestion — no manual upload step
Extended OCR pipeline for handwritten forms, multi-column scans and low-quality faxes
Resource monitoring (CPU, memory, GPU) inside the UI
Service control: restart components from the admin panel
Extraction job dashboard with status, errors and manual rerun
Configuration UI for scheduler, retention and model selection

Community covers 80% of cases

Most teams don't need an extraction warehouse — and Community is the right choice for them.

Best fit: departments up to a few hundred users, datasets up to tens of thousands of documents, single-host deployments
Killer feature: integrated rclone backup with 70+ destinations, cron scheduling, retention and zero-downtime restores
AGPL-3.0 means you can audit it, modify it and embed it in your own AGPL-compatible work
No commercial contract, no vendor dependency — fork it the day we stop maintaining it

When to choose Pro

If any of these are true, the Pro edition pays for itself quickly.

You need aggregate queries over thousands of documents (count, list, group-by, filter by date)
Compliance requires deterministic structured outputs (KYC, regulatory reporting, audit trails)
You operate in a regulated vertical — legal, healthcare, intelligence/OSINT, defense — and benefit from a domain-tuned extraction profile
You need full air-gapped operation, including offline models and offline metadata
You already run Community and want Pro side-by-side on the same hosts during evaluation

Designed to coexist

Pro and Community use different default ports so you can run both on the same hardware during evaluation or migration.

Component	Community	Pro
Frontend	:3000	:3002
Backend API	:8000	:8001
Qdrant	:6333	:6334
LLM engine	:11434	:11435

Same data formats. Moving from Community to Pro is a configuration change and a license key, not a re-ingestion.

Start where you are. Upgrade when you need to.

The Community edition is on GitHub today. For Pro, book a 30-minute call and we'll come back with a deployment proposal in two working days.

Get Community Talk to sales about Pro

Ready to run RAG on your own infrastructure?

Start with the open-source Community edition, or talk to us about Pro with structured extraction, SSO, audit log and SLA.

Star on GitHub Book a call