Hush AI

Infrastructure

How the Sovereign Prism runs: agentic node orchestration, automated backups, the local LLM fleet, and multi-layer hardening. Everything described here runs on hardware we physically own in the United Kingdom.

System Architecture (Live)

Internet → Cloudflare (TLS/DDoS) → Nginx (reverse proxy + rate limits)
     ↓
Sovereign Gateway (port 7777) — auth, routing, quota, think-token stripping
     ↓             ↓           ↓         ↓
LLM Proxy    Wiki App    Dashboard   Services
     ↓
LLM Server (llama.cpp) — local inference, no cloud
     ↓
Agentic Nodes: Iris → Researcher → Archivist → Analyst → Guardian
     ↓
Sovereign Refinery (SQLite + FTS + ChromaDB) — local NVMe storage

Agentic Node Orchestration

The Sovereign Prism runs as a multi-agent pipeline. Each node is a specialised Python service with its own systemd unit, watchdog supervision, and sandboxed filesystem access.

Iris

Node 1 · Telegram Bot

Receives user queries via Telegram, routes to the LLM proxy, and streams responses back. Handles inline commands, document forwarding, and multi-modal inputs.

Researcher

Node 2 · Web Intelligence

Performs deep web research using the Elite Scraper and SearXNG private search. Extracts, summarises, and archives source material for the knowledge base.

Archivist

Node 3 · Filing & Ingestion

Classifies incoming content, generates embeddings, and files articles into the Sovereign Refinery wiki with full-text search indexing.
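
As a rough illustration of the full-text indexing step, the sketch below files articles into an SQLite FTS5 table and queries it. The table name articles, its columns, and the helper names are hypothetical, since the Archivist's real schema is not described here, and FTS5 is assumed to be compiled into the local SQLite build.

```python
import sqlite3
from typing import List


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical FTS5 index of filed articles."""
    conn = sqlite3.connect(path)
    # FTS5 virtual table: both columns are tokenised and searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS articles USING fts5(title, body)"
    )
    return conn


def file_article(conn: sqlite3.Connection, title: str, body: str) -> None:
    """Insert one article; FTS5 updates its index as part of the insert."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO articles (title, body) VALUES (?, ?)", (title, body)
        )


def search(conn: sqlite3.Connection, query: str, limit: int = 10) -> List[str]:
    """Return titles of matching articles, best match first."""
    rows = conn.execute(
        "SELECT title FROM articles WHERE articles MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]
```

Embedding generation and classification are left out on purpose: they depend on models and a ChromaDB layout that this page does not specify.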

Analyst

Node 4 · Content Analysis

Runs deeper analysis on filed content: cross-referencing, trend detection, and quality scoring across the knowledge base.

Guardian

Node 5 · Security Monitor

Monitors system health, detects anomalies, and enforces content policies. Acts as the internal security watchdog.

Wiki App

SaaS · Port 8300

Public wiki and archival application. JWT authentication, team management, RSS feed ingestion, and full-text search across the entire knowledge base.

LLM Model Fleet

All models run locally on AMD Strix Halo APU hardware with ROCm/HIP acceleration. No inference leaves the building.

Model	Role	Context (tokens)	Quantisation
Qwen 3.6 35B A3B	Primary (Sovereign Prism)	262K	Q6_K_XL
Gemma 4 E4B	Gateway chatbot	32K	Q8_K_XL
Qwen 3.6 27B	Fast inference	65K	Q8_K_XL
Qwen3-VL-32B	Vision & multimodal	128K	Q8_K_XL
VL-Rethinker 72B	Deep reasoning	262K	Q4_K_M
Qwen3 Coder Next	Code generation	65K	Q6_K_XL

Automated Backup System (Hardened)

Schedule: Daily at 03:00 UTC via cron

Code backups — all source code, configs, and scripts (excluding secrets, models, and runtime data). Each backup is verified by extracting it and checking for core files, and a SHA-256 checksum is stored alongside the archive so later corruption can be detected.

Data backups — the entire Sovereign Refinery (wiki database, FTS index, embeddings). SQLite databases get a safe .backup snapshot before archival, guaranteeing consistency even during writes.
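
The snapshot step can be sketched in Python, whose sqlite3.Connection.backup method wraps the same online backup API that the sqlite3 shell's .backup command uses. The function name and paths are hypothetical, and the production job may well call the shell command instead.

```python
import sqlite3


def snapshot_sqlite(live_path: str, snapshot_path: str) -> None:
    """Copy a live SQLite database to snapshot_path via the online backup API."""
    source = sqlite3.connect(live_path)
    target = sqlite3.connect(snapshot_path)
    try:
        # The backup API copies a consistent view of the database even if
        # other connections are writing, unlike a plain file copy.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

The resulting snapshot file, rather than the live database, is what would then go into the archive.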

Rotation — backups older than 30 days are automatically purged. Checksums are rotated alongside their archives.
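
A minimal sketch of the rotation rule, assuming archives are named *.tar.gz and each has a sidecar checksum file with a .sha256 suffix. Both naming conventions are assumptions made for illustration, not details taken from the real backup script.

```python
import time
from pathlib import Path
from typing import List, Optional

RETENTION_DAYS = 30  # retention window stated in the rotation policy


def purge_old_backups(backup_dir: Path, now: Optional[float] = None) -> List[Path]:
    """Delete archives older than RETENTION_DAYS along with their checksums."""
    now = time.time() if now is None else now
    cutoff = now - RETENTION_DAYS * 86400
    removed = []
    for archive in sorted(backup_dir.glob("*.tar.gz")):
        if archive.stat().st_mtime < cutoff:
            # Hypothetical sidecar name: <archive>.sha256
            checksum = archive.with_name(archive.name + ".sha256")
            archive.unlink()
            checksum.unlink(missing_ok=True)  # tolerate a missing sidecar
            removed.append(archive)
    return removed
```

Age is judged by file modification time here; a script that encodes the date in the filename could compare on that instead.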

Verification — every code backup is automatically extracted to a temporary directory and checked for the presence of config.py, wiki_app.py, and utils.py before the backup is considered valid.
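
The verification and checksum steps might look roughly like this. The three core file names come from the description above; the function names, the .sha256 sidecar, and the choice to accept core files at any depth inside the archive are assumptions.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path

CORE_FILES = ("config.py", "wiki_app.py", "utils.py")


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large archives do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_code_backup(archive: Path) -> bool:
    """Extract to a temporary directory and confirm every core file is present."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            if hasattr(tarfile, "data_filter"):
                # Safer extraction on Python versions that support filters.
                tar.extractall(scratch, filter="data")
            else:
                tar.extractall(scratch)
        root = Path(scratch)
        return all(any(root.rglob(name)) for name in CORE_FILES)


def write_checksum(archive: Path) -> Path:
    """Write '<hash>  <name>' next to the archive, in sha256sum's format."""
    sidecar = archive.with_name(archive.name + ".sha256")
    sidecar.write_text(f"{sha256_of(archive)}  {archive.name}\n")
    return sidecar
```

A backup job would call verify_code_backup first and only write the checksum, and keep the archive, if it returns True.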

What the backup excludes (by design)

.env, agent_keys.env, .master_key — secrets never enter the backup archive
models/ — multi-gigabyte model files are managed separately
venv/, node_modules/, __pycache__/ — reproducible from requirements
.git/ — version history lives in the git repository
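
One way to enforce this exclusion list is a tarfile filter that drops any member whose path contains an excluded name. This is a sketch under assumptions: the real backup may be a shell script using tar's own exclude options, and the function names here are hypothetical.

```python
import tarfile
from pathlib import Path
from typing import Optional

# Names taken from the exclusion list above.
EXCLUDED_NAMES = {
    ".env", "agent_keys.env", ".master_key",
    "models", "venv", "node_modules", "__pycache__", ".git",
}


def exclude_filter(member: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
    """Return None (skip) if any path component is on the exclusion list."""
    parts = Path(member.name).parts
    return None if EXCLUDED_NAMES.intersection(parts) else member


def create_code_backup(source_dir: Path, archive: Path) -> None:
    """Archive source_dir, skipping excluded files and whole excluded directories."""
    with tarfile.open(archive, "w:gz") as tar:
        # Returning None for a directory also skips everything beneath it.
        tar.add(source_dir, arcname=".", filter=exclude_filter)
```

Matching on path components rather than substrings means a file such as models_readme.txt would still be backed up, while anything under models/ would not.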

Service Management

Process supervision — each node runs as a sandboxed systemd service with Restart=always and a 10-second restart delay. A dedicated watchdog process polls every 30 seconds and automatically restarts any crashed node.
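
The watchdog loop could be sketched as follows, assuming it asks systemd for each unit's state through systemctl. The unit names passed in and the function names are hypothetical; only the 30-second interval comes from the description above.

```python
import subprocess
import time
from typing import Callable, Iterable, List

POLL_SECONDS = 30  # polling interval stated above


def is_active(unit: str) -> bool:
    """True if systemd reports the unit as active (exit status 0)."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode == 0


def restart(unit: str) -> None:
    """Ask systemd to restart the unit."""
    subprocess.run(["systemctl", "restart", unit], check=False)


def poll_once(
    units: Iterable[str],
    check: Callable[[str], bool] = is_active,
    fix: Callable[[str], object] = restart,
) -> List[str]:
    """Check every unit once and restart the ones that are down."""
    restarted = []
    for unit in units:
        if not check(unit):
            fix(unit)
            restarted.append(unit)
    return restarted


def watch(units: Iterable[str], interval: float = POLL_SECONDS) -> None:
    """Poll forever; intended to run as its own supervised process."""
    units = list(units)
    while True:
        poll_once(units)
        time.sleep(interval)
```

Passing check and fix as parameters keeps the polling logic testable without a running systemd.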

Systemd sandboxing (applied to all services):

NoNewPrivileges=yes — prevents privilege escalation
ProtectSystem=strict — mounts the filesystem read-only except for explicitly allowed paths
PrivateTmp=yes — each service gets its own isolated /tmp
RestrictSUIDSGID=yes — blocks creation of SUID/SGID binaries
ReadWritePaths — limited to logs/ and Sovereign_Refinery/ only

Startup sequence — pre-flight health checks verify the LLM server, SearXNG, ChromaDB, bot token, and browser tool extractor before launching any nodes.
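
A pre-flight check of this kind can be as simple as confirming that each dependency accepts a TCP connection before any node starts. In the sketch below the function names are hypothetical and the caller supplies the host and port of each dependency; the real checks may go further, for example by calling health endpoints or validating the bot token.

```python
import socket
from typing import Callable, Dict, List, Tuple

Endpoint = Tuple[str, int]


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(
    checks: Dict[str, Endpoint],
    probe: Callable[[str, int], bool] = port_open,
) -> List[str]:
    """Return the names of dependencies that failed their check."""
    return [
        name for name, (host, port) in checks.items() if not probe(host, port)
    ]
```

A launcher would abort with a clear message if preflight returns a non-empty list, and start the nodes otherwise.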

Security Hardening Layers

Layer 1: Network (UFW Firewall)

Default deny incoming, allow outgoing
SSH restricted to LAN subnets only (192.168.1.0/24, 192.168.100.0/24, 192.168.101.0/24)
HTTP/HTTPS open for hush-ai.uk public access
All internal services bind to 127.0.0.1 — never exposed to the network

Layer 2: SSH

Root login disabled
Password authentication disabled — key-only access
Max 3 authentication attempts per connection
Idle timeout: 300 seconds with 3 keepalive retries

Layer 3: Nginx Reverse Proxy

Server tokens hidden on all virtual hosts
HSTS with preload (1 year), X-Frame-Options DENY, X-Content-Type-Options nosniff
Per-route rate limiting with burst control
Dotfile and scanner bot blocking (WordPress, phpMyAdmin probes return 444)
TLS 1.2+ with strong cipher suite — Cloudflare Origin Certificate
Internal proxy lanes locked to LAN IPs only

Layer 4: Application

LAN Shield — IP-based access control with rate-limited authorization
JWT-based authentication for all user sessions
Think-token stripping on all LLM responses — internal reasoning never reaches the client
Secret masking in all application logs
TOTP two-factor authentication support
Login brute-force protection (15-minute lockout after repeated failures)
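
Think-token stripping can be illustrated with a small filter over a completed response. This sketch assumes the model wraps its reasoning in <think>...</think> tags; the tag name, the function name, and the handling of unclosed blocks are assumptions rather than a description of the gateway's actual code.

```python
import re

# Complete reasoning blocks, including any whitespace that follows them.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
# A block that was opened but never closed, e.g. in a truncated response.
UNCLOSED_THINK = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think_tokens(text: str) -> str:
    """Remove reasoning blocks so only the final answer reaches the client."""
    text = THINK_BLOCK.sub("", text)
    # Drop the tail rather than leak partial reasoning.
    text = UNCLOSED_THINK.sub("", text)
    return text.strip()
```

This version works on whole responses only. A streaming gateway would need a stateful filter that buffers output while inside a reasoning block.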

Layer 5: Startup Scripts

All scripts use set -euo pipefail — fail fast on any error
Model file existence verified before launch — clear error messages on missing models
Secrets excluded from version control via .gitignore
PID files are chmod 600 — not world-readable
Token parsing hardened against values containing = characters
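
The token-parsing fix is easiest to see in miniature. The startup scripts themselves are shell; the Python sketch below only illustrates the same idea, which is to split each KEY=VALUE line on the first = so that a value containing = survives intact. The function name and the quote handling are hypothetical.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, keeping any '=' characters inside the value."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        # partition splits on the FIRST '=' only; a naive split("=")[1]
        # would truncate a token such as "abc=def==" to "abc".
        key, _, value = line.partition("=")
        value = value.strip()
        # Remove one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

Base64-style tokens often end in = padding, which is exactly the case a naive split mishandles.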

Proxy Lane Architecture

Internal services are exposed through dedicated Nginx proxy lanes, each locked to LAN traffic only.

Lane	Service	Port	Backend / notes
Alpha	Llama 4 Scout 109B	8001	DGX Spark (Alicia)
Omega	120B Expert	8002	Workstation (James)
Prism	27B Multimodal	8003	Local proxy
MCP	Sovereign MCP Server	8004	SSE-enabled
Echo	Audio Transcription	8005	100 MB upload limit
SearXNG	Private Search	8006	Docker container
Cyclops	Document OCR & Vision	8007	55 MB upload limit

Audit Trail

The Sovereign Prism codebase undergoes regular security audits. Hardening is applied in phases, with full rollback procedures documented for every change.

Last audit: 31 May 2026

Hardening phases completed: 8

Items addressed in latest pass:

Added shebang lines and set -euo pipefail to all startup scripts
Fixed duplicate shebang block in Qwen3-VL-32B launcher
Added pre-flight model file verification to all LLM launchers
Hardened systemd services with sandboxing directives
Added SHA-256 checksum generation to backup pipeline
Fixed token parsing vulnerability in start_all.sh
Fixed unquoted variable expansion in stop_all.sh
Removed duplicate .gitignore entries
