// Air-Gapped Analytics Agent

Air-gapped analytics agent on Bielik - Polish government, cleared by the Cyberspace Defence Force

A Polish government agency needed an analytics agent its analysts could query in Polish, across 5,000+ internal documents spread over the intranet, document repositories, and internal databases. No third-party LLM was on the table, and the network is air-gapped. We deployed Bielik on customer hardware at 30 tok/s, wired hybrid retrieval over the source systems, and took the stack through security testing by the Polish Cyberspace Defence Force (Wojska Obrony Cyberprzestrzeni).

offices

Poland

size

Government agency - anonymized

industry

Public sector

// Outcomes

The numbers that matter

  • 30 tok/s

    Bielik inference on-prem

  • 5,000+

    documents across internal systems

  • Air-gapped

    cleared by Cyberspace Defence Force

01 · Analysts working across five internal systems - and no LLM was allowed to leave the network

The Challenge

The customer is a Polish government agency. Their analysts answer questions that span multiple internal systems - the intranet, document repositories, internal databases, and other source systems we can't name. The work is mostly retrieval-and-synthesize: a question comes in, the analyst opens four or five tabs, reads across them, and writes the answer. Junior analysts spend most of a shift on the retrieval part.

Two constraints ruled out the obvious approach. First, the network is air-gapped - no internet, no third-party API, no SaaS. Sending a query to OpenAI or Anthropic was never on the table. Second, the working language is Polish, and the documents are written in Polish administrative and legal style. Frontier models handle that distribution reasonably well, but raw quality is not the only criterion when the model has to live behind the perimeter anyway.

The procurement bar was also defense-grade. Whatever we shipped had to pass security testing by the Polish Cyberspace Defence Force (Wojska Obrony Cyberprzestrzeni) - the branch responsible for cyber-operations and accreditation across military and government networks. That set the engineering constraints from day one: signed install bundles, zero outbound packets, audit logs into the agency's existing security tooling, no opaque dependencies.

02 · Bielik on customer hardware, hybrid retrieval over the source systems, signed everything

Approach

Step 1: Bielik as the language model - Polish-native, open-weight, on-prem.

We picked Bielik (the Polish open-weight LLM family) as the generation model. Two reasons. The training distribution is heavily Polish, including administrative and legal text, which matters for the agency's corpus. And the weights are open and redistributable, which makes air-gapped deployment straightforward - there is no licence server to phone home, no usage-metering API, no provider that has to know the model is running. Inference runs on customer hardware at 30 tok/s, which is comfortably above the threshold where streaming feels responsive on a chat surface.

Step 2: Hybrid retrieval over 5,000+ documents from multiple source systems.

The corpus isn't a single Confluence space. It's documents pulled from the intranet, document repositories, and other internal systems - each with its own format quirks and access pattern. We built ingestion adapters for each source, normalized the documents into a common chunked representation, and indexed the result with a hybrid retriever (BM25 + dense embeddings) tuned for Polish. The top 50 retrieved candidates feed a cross-encoder reranker; the agent reasons over the top 5–10 reranked chunks and cites them in its answer.
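One way to picture the fusion step is the sketch below. It is an illustration under stated assumptions, not the production retriever: the case study does not say how BM25 and dense scores are combined, so the weighted sum, the min-max normalization, the `alpha` weight, and the function names are all hypothetical. Only the candidate count of 50 comes from the text above.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [0, 1] so BM25 and embedding scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as an equally good match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_candidates(
    bm25_scores: Dict[str, float],
    dense_scores: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight: 1.0 = lexical only, 0.0 = dense only
    top_k: int = 50,     # candidate pool handed to the reranker
) -> List[Tuple[str, float]]:
    """Fuse lexical and dense scores with a weighted sum (assumed fusion rule)."""
    lexical = min_max_normalize(bm25_scores)
    semantic = min_max_normalize(dense_scores)
    fused = {
        doc_id: alpha * lexical.get(doc_id, 0.0)
        + (1.0 - alpha) * semantic.get(doc_id, 0.0)
        for doc_id in set(lexical) | set(semantic)
    }
    # Sort by fused score, highest first; break ties by id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

A document found by only one retriever still enters the pool with a zero contribution from the other side, so exact-term matches on statute numbers and paraphrased matches can both reach the reranker.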

Step 3: An agent loop, not a one-shot RAG call.

Analyst questions are not single-hop. "Find me the latest version of X policy and tell me which prior policy it supersedes" requires two retrievals and a comparison. The agent orchestrates retrieval, document filtering, and synthesis as separate steps, with the trace exposed in the UI so the analyst can verify which document the answer came from. Tool calls are bounded - the agent can't perform actions outside the read surface.
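The bounded, traceable loop described here might look roughly like the following sketch. Everything in it is hypothetical: the `BoundedAgent` class, the two read-only tools, the step budget, and the toy policy records are stand-ins chosen to show the two-hop "latest version, then what it supersedes" pattern, not the deployed agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TraceStep:
    tool: str
    argument: str
    result: str


@dataclass
class BoundedAgent:
    """Runs only allowlisted read tools and records every call in a trace."""
    tools: Dict[str, Callable[[str], str]]
    max_steps: int = 6  # hypothetical step budget
    trace: List[TraceStep] = field(default_factory=list)

    def call(self, tool: str, argument: str) -> str:
        if tool not in self.tools:
            raise PermissionError(f"tool {tool!r} is outside the read surface")
        if len(self.trace) >= self.max_steps:
            raise RuntimeError("step budget exhausted")
        result = self.tools[tool](argument)
        self.trace.append(TraceStep(tool, argument, result))
        return result


# Toy in-memory corpus standing in for the real source systems.
POLICIES = {
    "policy-x-v2": {"supersedes": "policy-x-v1"},
    "policy-x-v3": {"supersedes": "policy-x-v2"},
}


def find_latest(topic: str) -> str:
    matches = sorted(doc_id for doc_id in POLICIES if doc_id.startswith(topic))
    return matches[-1] if matches else ""


def get_superseded(doc_id: str) -> str:
    return POLICIES.get(doc_id, {}).get("supersedes", "")


def answer_supersession(agent: BoundedAgent, topic: str) -> str:
    # Two retrievals, then a comparison: the second hop depends on the first.
    latest = agent.call("find_latest", topic)
    prior = agent.call("get_superseded", latest)
    return f"{latest} supersedes {prior}"
```

Because each call is appended to `trace`, a UI can show the analyst exactly which lookups produced the answer, and any tool that is not on the allowlist fails before it runs.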

Step 4: Air-gapped infrastructure, signed and verifiable end-to-end.

Everything ships as a signed install bundle. OS packages, Python wheels, OCI images, and model weights are mirrored locally - the install completes with zero outbound packets. Signatures are verified with cosign at every boundary, and a CycloneDX SBOM ships with each release. Updates roll forward as signed delta bundles that operators carry across the gap on whatever media the agency's policy allows. Audit logs land in the agency's existing security tooling, with append-only hash chaining and tamper detection wired to their runbook.
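Append-only hash chaining can be sketched in a few lines. This is a minimal illustration assuming SHA-256 and JSON-serialized events; the record layout, the all-zero genesis value, and the function names are hypothetical and say nothing about the agency's actual log format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _digest(prev_hash: str, event: Dict[str, str]) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_event(log: List[dict], event: Dict[str, str]) -> None:
    """Append a record whose hash covers the event and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append(
        {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    )


def first_tampered_index(log: List[dict]) -> int:
    """Return the index of the first record that breaks the chain, or -1."""
    prev_hash = GENESIS
    for index, record in enumerate(log):
        expected = _digest(prev_hash, record["event"])
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return index
        prev_hash = record["hash"]
    return -1
```

Editing or deleting any earlier record changes every hash after it, so a verifier that replays the chain can point to the first record that no longer matches.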

Step 5: Cleared by the Cyberspace Defence Force.

The full stack - install bundle, dependency graph, network behaviour, audit pipeline - went through security testing by the Polish Cyberspace Defence Force. We delivered the documentation pack, addressed the findings, and shipped the cleared build into production. The same hardening posture is what we re-use on every air-gapped engagement we run.

03 · Discovery, paired install, hardening to clearance

Implementation

Phase 1 - Discovery and bill of materials (weeks 1–2).

Mapped the source systems, sized the hardware against target throughput, and aligned the bundle to the agency's accreditation paperwork. Output: a signed bill of materials (BoM) and a deployment design the customer's CISO approved before any image was built.

Phase 2 - Paired air-gapped install.

All work ran inside the customer's network, on their hardware, with their operators paired in. Bielik weights, ingestion services, retriever, agent loop, and audit forwarders all went in as a single signed bundle. Operators owned the runbooks at the end of the install, not after a separate handover.

Phase 3 - Retrieval and agent tuning on Polish administrative language.

We built a 200-question golden eval with the customer's analysts and tuned the retriever weights, chunking, and agent prompts against it. Every change shipped with a measured delta on recall, precision, and answer faithfulness - no anecdotal improvements.
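A measured delta of this kind can be computed with a small harness like the sketch below. It assumes each golden question is labeled with a set of relevant document ids; that labeling scheme, the cutoff `k`, and the function names are hypothetical. Answer faithfulness is left out because it needs a human or model judge rather than set arithmetic.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def precision_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Share of the top k results that are relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant) / len(top)


def evaluate(
    runs: Dict[str, List[str]], golden: Dict[str, Set[str]], k: int
) -> Dict[str, float]:
    """Average recall@k and precision@k over every golden question."""
    if not golden:
        raise ValueError("golden set is empty")
    n = len(golden)
    return {
        "recall": sum(recall_at_k(runs.get(q, []), rel, k) for q, rel in golden.items()) / n,
        "precision": sum(precision_at_k(runs.get(q, []), rel, k) for q, rel in golden.items()) / n,
    }


def delta(baseline: Dict[str, float], candidate: Dict[str, float]) -> Dict[str, float]:
    """Per-metric change of a candidate configuration against the baseline."""
    return {metric: candidate[metric] - baseline[metric] for metric in baseline}
```

Running `evaluate` once for the current configuration and once for the proposed change, then passing both results to `delta`, gives the per-metric difference that accompanies a change.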

Phase 4 - Cyberspace Defence Force security testing and clearance.

Coordinated the security review, addressed findings, and delivered the cleared production build. The accreditation pack is reusable for the agency's downstream surveillance reviews.

04 · An analyst tool inside the perimeter - not a slide deck

Outcome

A working analytics agent for analysts who can't use cloud LLMs.

Analysts now ask questions in Polish across 5,000+ documents from the intranet, document repositories, and internal systems, and get answers with citations they can verify. The retrieval-and-synthesize work that used to span four or five tabs is folded into a single conversation surface.

30 tok/s on Bielik, on customer hardware, with zero outbound traffic.

Streaming feels responsive on the chat surface. Inference, retrieval, and audit all happen inside the perimeter - no third-party API, no telemetry leaving the network, no licence server in the path.

Cleared by the Polish Cyberspace Defence Force.

The full stack passed security testing by the Cyberspace Defence Force (Wojska Obrony Cyberprzestrzeni). The install bundle, audit pipeline, and update procedure are now part of the agency's accredited operating environment.

// Expert insight

Most LLM stacks assume outbound HTTPS at install time. This one couldn't have it. Starting from the assumption that nothing reaches the internet - and that every dependency, weight, and audit log has to be signed and verifiable - is what made the Cyberspace Defence Force review go smoothly. The customer is named in their accreditation paperwork, not on our website.
Karol Gawron

Head of R&D @ bards.ai
