Hush AI
Infrastructure
How the Sovereign Prism runs: agentic node orchestration, automated backups, LLM model fleet, and multi-layer hardening. Everything described here runs on hardware we physically own in the United Kingdom.
System Architecture (Live)
Internet → Cloudflare (TLS/DDoS) → Nginx (reverse proxy + rate limits)
↓
Sovereign Gateway (port 7777) — auth, routing, quota, think-token stripping
↓ ↓ ↓ ↓
LLM Proxy · Wiki App · Dashboard · Services
↓
LLM Server (llama.cpp) — local inference, no cloud
↓
Agentic Nodes: Iris → Researcher → Archivist → Analyst → Guardian
↓
Sovereign Refinery (SQLite + FTS + ChromaDB) — local NVMe storage
Agentic Node Orchestration
The Sovereign Prism runs as a multi-agent pipeline. Each node is a specialised Python service with its own systemd unit, watchdog supervision, and sandboxed filesystem access.
Iris
Node 1 · Telegram Bot
Receives user queries via Telegram, routes to the LLM proxy, and streams responses back. Handles inline commands, document forwarding, and multi-modal inputs.
Researcher
Node 2 · Web Intelligence
Performs deep web research using the Elite Scraper and SearXNG private search. Extracts, summarises, and archives source material for the knowledge base.
Archivist
Node 3 · Filing & Ingestion
Classifies incoming content, generates embeddings, and files articles into the Sovereign Refinery wiki with full-text search indexing.
Analyst
Node 4 · Content Analysis
Runs deeper analysis on filed content: cross-referencing, trend detection, and quality scoring across the knowledge base.
Guardian
Node 5 · Security Monitor
Monitors system health, detects anomalies, and enforces content policies. Acts as the internal security watchdog.
Wiki App
SaaS · Port 8300
Public wiki and archival application. JWT authentication, team management, RSS feed ingestion, and full-text search across the entire knowledge base.
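As a rough illustration of the full-text search layer, the sketch below files articles into an SQLite FTS5 table and queries it. The table name `articles`, its columns, and the helper names are hypothetical, and the sketch assumes the local SQLite build ships with FTS5; the real Sovereign Refinery schema is not shown on this page.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a database with a hypothetical FTS5 `articles` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS articles USING fts5(title, body)"
    )
    return conn


def file_article(conn, title, body):
    """Insert one article; the FTS index is updated in the same transaction."""
    with conn:
        conn.execute(
            "INSERT INTO articles (title, body) VALUES (?, ?)", (title, body)
        )


def search(conn, query, limit=10):
    """Return matching titles, best match first (FTS5 `rank` ordering)."""
    rows = conn.execute(
        "SELECT title FROM articles WHERE articles MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]
```

Embeddings and ChromaDB are deliberately left out here; this only covers the keyword-search half of the storage layer.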
LLM Model Fleet
All models run locally on AMD Strix Halo APU hardware with ROCm/HIP acceleration. No inference leaves the building.
| Model | Role | Context | Quantisation |
| Qwen 3.6 35B A3B | Primary (Sovereign Prism) | 262K | Q6_K_XL |
| Gemma 4 E4B | Gateway chatbot | 32K | Q8_K_XL |
| Qwen 3.6 27B | Fast inference | 65K | Q8_K_XL |
| Qwen3-VL-32B | Vision & multimodal | 128K | Q8_K_XL |
| VL-Rethinker 72B | Deep reasoning | 262K | Q4_K_M |
| Qwen3 Coder Next | Code generation | 65K | Q6_K_XL |
Automated Backup System (Hardened)
Schedule: Daily at 03:00 UTC via cron
Code backups — all source code, configs, and scripts (excluding secrets, models, and runtime data). Each backup is verified by extracting it and checking for core files, then fingerprinted with a SHA-256 checksum stored alongside the archive.
Data backups — the entire Sovereign Refinery (wiki database, FTS index, embeddings). SQLite databases get a safe .backup snapshot before archival, guaranteeing consistency even during writes.
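The snapshot step can be sketched with Python's `sqlite3.Connection.backup()`, which wraps the same SQLite online backup API that the command-line `.backup` command uses. The function name and paths are illustrative; the production script may well call the `sqlite3` CLI instead.

```python
import sqlite3


def snapshot_sqlite(src_path, dest_path):
    """Copy a live SQLite database to dest_path via the online backup API.

    The copy is transactionally consistent even if other connections
    are writing to the source while it runs.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
    return dest_path
```

The snapshot file, not the live database, is what would then go into the archive.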
Rotation — backups older than 30 days are automatically purged. Checksums are rotated alongside their archives.
Verification — every code backup is automatically extracted to a temporary directory and checked for the presence of config.py, wiki_app.py, and utils.py before the backup is considered valid.
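A minimal sketch of the verify-then-checksum flow described above, assuming the code backup is a tar archive. The three core file names come from this page; the helper names and the `.sha256` sidecar format (the layout `sha256sum` produces) are assumptions.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path

CORE_FILES = ("config.py", "wiki_app.py", "utils.py")


def verify_backup(archive_path, core_files=CORE_FILES):
    """Extract the archive to a temp dir; return the core files that are missing."""
    # Use the safe "data" extraction filter where this Python version offers it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(archive_path) as tar:
            tar.extractall(tmp, **kwargs)
        found = {p.name for p in Path(tmp).rglob("*") if p.is_file()}
    return [name for name in core_files if name not in found]


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum(archive_path):
    """Write '<hex digest>  <file name>' next to the archive; return the sidecar path."""
    sidecar = str(archive_path) + ".sha256"
    line = f"{sha256_of(archive_path)}  {Path(archive_path).name}\n"
    Path(sidecar).write_text(line)
    return sidecar
```

A backup would count as valid only when `verify_backup()` returns an empty list; the checksum is written afterwards so that invalid archives never get one.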
What the backup excludes (by design)
- .env, agent_keys.env, .master_key — secrets never enter the backup archive
- models/ — multi-gigabyte model files are managed separately
- venv/, node_modules/, __pycache__/ — reproducible from requirements
- .git/ — version history lives in the git repository
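The exclusion list above could be enforced while the archive is built, for example with a `tarfile` filter. This is a sketch under assumptions: the function names are hypothetical, and matching is done on exact path components rather than on whatever patterns the real backup script uses.

```python
import tarfile
from pathlib import PurePosixPath

# Names taken from the exclusion list above.
EXCLUDED_FILES = {".env", "agent_keys.env", ".master_key"}
EXCLUDED_DIRS = {"models", "venv", "node_modules", "__pycache__", ".git"}


def is_excluded(member_name):
    """True if the archive member is a secret file or lives in an excluded directory."""
    parts = PurePosixPath(member_name).parts
    if not parts:  # the archive root itself
        return False
    return parts[-1] in EXCLUDED_FILES or any(p in EXCLUDED_DIRS for p in parts)


def _skip_excluded(info):
    # Returning None tells tarfile to skip the member (and, for a
    # directory, everything beneath it).
    return None if is_excluded(info.name) else info


def create_code_backup(source_dir, archive_path):
    """Archive source_dir as gzip-compressed tar, leaving out excluded paths."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=".", filter=_skip_excluded)
    return archive_path
```

Filtering at archive time means a secret is never written to disk inside a backup, rather than being removed from one afterwards.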
Service Management
Process supervision — each node runs as a sandboxed systemd service with Restart=always and a 10-second restart delay. A dedicated watchdog process polls every 30 seconds and automatically restarts any node that has crashed.
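The watchdog loop might look roughly like the following. The 30-second interval is from the text above; the unit names and the choice to query state with `systemctl is-active` are assumptions, since the page does not say how the real watchdog detects a crash.

```python
import subprocess
import time

POLL_INTERVAL_SECONDS = 30  # polling interval stated above


def is_active(unit):
    """True if systemd reports the unit as active (exit status 0)."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", unit], check=False
    )
    return result.returncode == 0


def restart(unit):
    """Ask systemd to restart the unit."""
    subprocess.run(["systemctl", "restart", unit], check=False)


def poll_once(units, is_active=is_active, restart=restart):
    """Restart every unit that is not active; return the names restarted."""
    restarted = []
    for unit in units:
        if not is_active(unit):
            restart(unit)
            restarted.append(unit)
    return restarted


def run_watchdog(units, interval=POLL_INTERVAL_SECONDS):
    """Poll forever; intended to run as its own long-lived service."""
    while True:
        poll_once(units)
        time.sleep(interval)
```

Passing `is_active` and `restart` as parameters keeps `poll_once()` testable without a running systemd.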
Systemd sandboxing (applied to all services):
- NoNewPrivileges=yes — prevents privilege escalation
- ProtectSystem=strict — mounts the filesystem read-only except for explicitly allowed paths
- PrivateTmp=yes — each service gets its own isolated /tmp
- RestrictSUIDSGID=yes — blocks creation of SUID/SGID binaries
- ReadWritePaths — limited to logs/ and Sovereign_Refinery/ only
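To show how these directives fit together in one unit file, here is a small Python helper that renders a unit as text. The directive names and values are the ones listed above plus the restart policy described earlier; the helper itself, the description, the `ExecStart` command, and the absolute paths in the usage example are hypothetical.

```python
SANDBOX_DIRECTIVES = {
    "NoNewPrivileges": "yes",
    "ProtectSystem": "strict",
    "PrivateTmp": "yes",
    "RestrictSUIDSGID": "yes",
}


def render_unit(description, exec_start, read_write_paths):
    """Return the text of a systemd service unit with the sandboxing applied."""
    lines = [
        "[Unit]",
        f"Description={description}",
        "",
        "[Service]",
        f"ExecStart={exec_start}",
        "Restart=always",
        "RestartSec=10",  # the 10-second restart delay
    ]
    lines += [f"{key}={value}" for key, value in SANDBOX_DIRECTIVES.items()]
    # ReadWritePaths takes a space-separated list of absolute paths.
    lines.append("ReadWritePaths=" + " ".join(read_write_paths))
    lines += ["", "[Install]", "WantedBy=multi-user.target", ""]
    return "\n".join(lines)
```

For example, `render_unit("Iris node", "/usr/bin/python3 /opt/prism/iris.py", ["/opt/prism/logs", "/opt/prism/Sovereign_Refinery"])` yields a unit whose service can write only to those two directories.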
Startup sequence — pre-flight health checks verify the LLM server, SearXNG, ChromaDB, bot token, and browser tool extractor before launching any nodes.
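The network side of such a pre-flight check can be sketched as a simple TCP reachability test. The helper names are hypothetical and the hosts and ports passed in are up to the caller; checks that are not network-based (such as validating the bot token) would need their own logic.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(checks):
    """checks maps a service name to (host, port); return the names that failed."""
    return [
        name
        for name, (host, port) in checks.items()
        if not port_is_open(host, port)
    ]
```

A launcher would refuse to start any node unless `preflight()` returns an empty list, and would print the failed names otherwise.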
Security Hardening Layers
Layer 1: Network (UFW Firewall)
- Default deny incoming, allow outgoing
- SSH restricted to LAN subnets only (192.168.1.0/24, 192.168.100.0/24, 192.168.101.0/24)
- HTTP/HTTPS open for hush-ai.uk public access
- All internal services bind to 127.0.0.1 — never exposed to the network
Layer 2: SSH
- Root login disabled
- Password authentication disabled — key-only access
- Max 3 authentication attempts per connection
- Idle timeout: 300 seconds with 3 keepalive retries
Layer 3: Nginx Reverse Proxy
- Server tokens hidden on all virtual hosts
- HSTS with preload (1 year), X-Frame-Options DENY, X-Content-Type-Options nosniff
- Per-route rate limiting with burst control
- Dotfile and scanner bot blocking (WordPress, phpMyAdmin probes return 444)
- TLS 1.2+ with strong cipher suite — Cloudflare Origin Certificate
- Internal proxy lanes locked to LAN IPs only
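For readers unfamiliar with rate limiting with burst control, the idea can be modelled in a few lines of Python. This is a simplified leaky-bucket sketch for illustration only: the class is hypothetical, the rates are example values, and it is not a description of how Nginx is configured on this host.

```python
import time


class LeakyBucket:
    """Allow `rate` requests per second per key, plus up to `burst` extra at once."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.burst = burst
        self._state = {}  # key -> (excess requests, time of last accepted request)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        if key not in self._state:
            self._state[key] = (0.0, now)
            return True
        excess, last = self._state[key]
        # The backlog drains at `rate` per second; this request adds one.
        excess = max(0.0, excess - self.rate * (now - last) + 1.0)
        if excess > self.burst:
            return False  # over the burst allowance: reject
        self._state[key] = (excess, now)
        return True
```

With `rate_per_second=1` and `burst=2`, a client can send three requests back to back; the fourth is rejected until a second has passed.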
Layer 4: Application
- LAN Shield — IP-based access control with rate-limited authorization
- JWT-based authentication for all user sessions
- Think-token stripping on all LLM responses — internal reasoning never reaches the client
- Secret masking in all application logs
- TOTP two-factor authentication support
- Login brute-force protection (15-minute lockout after repeated failures)
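Think-token stripping, for instance, can be approximated with a regular expression over the completed response. The sketch assumes the model wraps its reasoning in `<think>...</think>` tags, which is a common convention but is not confirmed by this page, and it does not handle responses that are still being streamed.

```python
import re

# Matches a <think> block through its closing tag, or through the end of
# the text if the block was never closed (for example, a truncated reply).
_THINK_BLOCK = re.compile(r"<think>.*?(?:</think>\s*|\Z)", re.DOTALL)


def strip_think_tokens(text):
    """Remove reasoning blocks so only the final answer reaches the client."""
    return _THINK_BLOCK.sub("", text)
```

Treating an unclosed block as reasoning errs on the side of withholding text rather than leaking it.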
Layer 5: Startup Scripts
- All scripts use set -euo pipefail — fail fast on any error
- Model file existence verified before launch — clear error messages on missing models
- Secrets excluded from version control via .gitignore
- PID files are chmod 600 — not world-readable
- Token parsing hardened against values containing = characters
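The token-parsing issue is easiest to see in code. The actual fix lives in a shell script, so the Python below only illustrates the same idea: split each `KEY=value` line on the first `=` so that values containing `=` survive intact. The function name and the quote handling are assumptions.

```python
def parse_env_line(line):
    """Parse 'KEY=value' into (key, value); return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#") or "=" not in line:
        return None
    # partition() splits on the FIRST '=', so any later '=' stays in the value.
    key, _, value = line.partition("=")
    value = value.strip()
    # Drop one pair of matching surrounding quotes, if present.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        value = value[1:-1]
    return key.strip(), value
```

A naive split on every `=` would truncate a token such as `abc=def==` to `abc`.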
Proxy Lane Architecture
Internal services are exposed through dedicated Nginx proxy lanes, each locked to LAN traffic only.
| Lane | Service | Port | Backend |
| Alpha | Llama 4 Scout 109B | 8001 | DGX Spark (Alicia) |
| Omega | 120B Expert | 8002 | Workstation (James) |
| Prism | 27B Multimodal | 8003 | Local proxy |
| MCP | Sovereign MCP Server | 8004 | SSE-enabled |
| Echo | Audio Transcription | 8005 | 100MB upload limit |
| SearXNG | Private Search | 8006 | Docker container |
| Cyclops | Document OCR & Vision | 8007 | 55MB upload limit |
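A LAN-only check of the kind these lanes rely on can be expressed with Python's `ipaddress` module. The sketch assumes the lanes accept the same three subnets listed for SSH under Layer 1; the page does not state which ranges the proxy lanes actually allow, and the function name is hypothetical.

```python
import ipaddress

# Assumption: the same LAN subnets as the SSH rule in Layer 1.
LAN_SUBNETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.168.1.0/24", "192.168.100.0/24", "192.168.101.0/24")
]


def is_lan_client(remote_addr):
    """True if remote_addr parses as an IP inside one of the LAN subnets."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # malformed addresses are treated as outside the LAN
    return any(addr in net for net in LAN_SUBNETS)
```

Behind a reverse proxy the address checked must be the real client address, not the proxy's own loopback address.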
Audit Trail
The Sovereign Prism codebase undergoes regular security audits. Hardening is applied in phases, with a rollback procedure documented for every change.
Last audit: 31 May 2026
Hardening phases completed: 8
Items addressed in latest pass:
- Added shebang lines and set -euo pipefail to all startup scripts
- Fixed duplicate shebang block in Qwen3-VL-32B launcher
- Added pre-flight model file verification to all LLM launchers
- Hardened systemd services with sandboxing directives
- Added SHA-256 checksum generation to backup pipeline
- Fixed token parsing vulnerability in start_all.sh
- Fixed unquoted variable expansion in stop_all.sh
- Removed duplicate .gitignore entries