Server Hardware & Baselines

Overview

Property	phil-app	phil-db
IP	157.90.134.159	88.198.7.144
WireGuard	10.42.10.4	10.42.10.3
RAM	64 GB	128 GB
CPUs	12	12
Role	~100 Docker containers	MariaDB only
Swap	8 GB (`/var/swapfile`), `vm.swappiness=10`	8 GB configured, `vm.swappiness=0` (never used)
CPU governor	`powersave` (Hetzner default)	`powersave`
IO scheduler	default	`none` (NVMe, correct)
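
The kernel settings in the table above can be compared against a live host. The following is a minimal sketch, not the project's actual tooling: the function names, the `PHIL_DB_EXPECTED` dictionary, and the block device name `nvme0n1` are assumptions made for illustration. The sysfs and procfs paths are the standard Linux locations for these values.

```python
from pathlib import Path


def active_scheduler(raw: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    for token in raw.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Fall back to the raw value if nothing is bracketed.
    return raw.strip()


def read_value(path: str):
    """Read a one-line kernel setting, or None if it is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def read_baseline(block_device: str = "nvme0n1") -> dict:
    """Collect swappiness, CPU governor and IO scheduler from the running kernel."""
    sched = read_value(f"/sys/block/{block_device}/queue/scheduler")
    return {
        "vm.swappiness": read_value("/proc/sys/vm/swappiness"),
        "cpu_governor": read_value(
            "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"
        ),
        "io_scheduler": active_scheduler(sched) if sched else None,
    }


def drift(expected: dict, actual: dict) -> dict:
    """Map each mismatching key to an (expected, actual) pair."""
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Expected values for phil-db, copied from the overview table (hypothetical name).
PHIL_DB_EXPECTED = {
    "vm.swappiness": "0",
    "cpu_governor": "powersave",
    "io_scheduler": "none",
}

if __name__ == "__main__":
    print(drift(PHIL_DB_EXPECTED, read_baseline()))
```

An empty dictionary means the host matches the documented baseline; any entry shows the documented value next to what the kernel currently reports.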

phil-app (157.90.134.159)

Disk Layout

Device	Mount	Size	Use%	FS	Notes
`/dev/md0`	`/boot`	1 GB	—	ext4	NVMe RAID1
`/dev/md1`	`/`	10 GB	~32%	ext4	NVMe RAID1 — system only
`/dev/md2`	`/tmp`	5 GB	~72%	ext4	NVMe RAID1
`/dev/md3`	`/var`	30 GB	~21%	ext4	NVMe RAID1 — logs, swapfile
`/dev/md4`	`/var/lib/docker`	848 GB	~24%	XFS	NVMe RAID1 — all Docker volumes + images
`/dev/sda`	`/srv/nextcloud_data`	938 GB	~6%	ext4	SATA HDD — Nextcloud primary data
StorageBox	`/mnt/nextcloud_data`	4.8 TB	—	SSHFS	Legacy — to be replaced (see roadmap P4)
StorageBox	`/mnt/paperless_data`	4.8 TB	—	SSHFS

Critical: `/` is only 10 GB. Never place swap files or large data on `/`. The swapfile lives at `/var/swapfile` on the 30 GB `/var` partition.
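
A script that creates large files could enforce this rule before writing. The sketch below is one possible guard, with hypothetical function names. It treats two paths as being on the same filesystem when their device IDs match, which is a heuristic and may not hold for bind mounts or some container setups.

```python
import os
import shutil


def on_root_filesystem(path: str) -> bool:
    """True if `path` has the same device ID as `/`."""
    return os.stat(path).st_dev == os.stat("/").st_dev


def free_gib(path: str) -> float:
    """Free space on the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def safe_for_large_file(directory: str, size_gib: float) -> bool:
    """Allow a large file only off the root filesystem and with enough free space."""
    return not on_root_filesystem(directory) and free_gib(directory) >= size_gib
```

For example, `safe_for_large_file("/var", 8)` would be the check to run before creating an 8 GB swapfile under `/var`.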

I/O Hotspots (md4)

Container	Observed writes	Root cause
`friendica-cron-1`	Previously 6.81 TB	OOM crash-loop — Guzzle buffering multi-GB federation responses. Fixed: `worker_memory_limit` capped
`opensocial-redis-1`	~87 GB	Redis RDB persistence (`--save 60 1`) — normal
`friendicame-redis-1`	~64 GB	Same
`opensocial-cron-1`	~1 GB	Previously 57k ENOSPC errors from 256 MB tmpfs. Fixed: tmpfs → 1 GB

Performance Baseline

Docker storage: overlay2 on /var/lib/docker (848 GB, ~24% used)
Docker logging: json-file, 25 MB × 5 files, compressed, non-blocking
~100 containers, ~22 GB RAM used
XFS quotas per container via userns-remap (UID offset 165536)
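
The logging limits above bound how much disk container logs can consume. The following worked arithmetic uses only the figures from this page and treats the container count as exactly 100 for simplicity. Compression applies to rotated files, so this is an upper bound rather than an expected value.

```python
MAX_SIZE_MB = 25      # per log file
MAX_FILES = 5         # rotated files kept per container
CONTAINERS = 100      # approximate container count on phil-app
DOCKER_FS_GB = 848    # size of /var/lib/docker

per_container_mb = MAX_SIZE_MB * MAX_FILES
fleet_gb = per_container_mb * CONTAINERS / 1000
share = fleet_gb / DOCKER_FS_GB

print(
    f"{per_container_mb} MB per container, "
    f"{fleet_gb:.1f} GB fleet-wide, "
    f"{share:.1%} of /var/lib/docker"
)
# prints: 125 MB per container, 12.5 GB fleet-wide, 1.5% of /var/lib/docker
```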

Ansible-managed Scripts

Script	Playbook	Location	Purpose
XFS Quota Exporter	`55_monitoring.yml`	`/opt/xfs_quota_exporter/`	Exports `xfs_quota_used_bytes` per Docker volume via textfile; scraped at `host.docker.internal:9101`
Node Reboot Required	`55_monitoring.yml`	systemd timer	Writes `node_reboot_required` Prometheus metric to `/var/lib/prometheus/node-exporter/`
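
To illustrate how a textfile metric such as `node_reboot_required` can be produced, here is a minimal sketch. It is not the script deployed by `55_monitoring.yml`. The flag file `/var/run/reboot-required` is the Debian/Ubuntu convention and is an assumption about these hosts, and the helper names are hypothetical. The file is written to a temporary name and then renamed so that a scrape never sees a half-written file.

```python
import os
import tempfile
from pathlib import Path


def render_gauge(name: str, value: int, help_text: str) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    return f"# HELP {name} {help_text}\n# TYPE {name} gauge\n{name} {value}\n"


def write_textfile(directory: str, filename: str, body: str) -> Path:
    """Write atomically: temp file in the same directory, then rename."""
    target = Path(directory) / filename
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; the collector must be able to read it
    os.replace(tmp, target)
    return target


def reboot_required(flag_file: str = "/var/run/reboot-required") -> int:
    """1 if the distribution's reboot flag file exists, else 0 (assumed convention)."""
    return 1 if os.path.exists(flag_file) else 0
```

A timer-driven run could then call `write_textfile("/var/lib/prometheus/node-exporter", "reboot_required.prom", render_gauge("node_reboot_required", reboot_required(), "1 if a reboot is pending"))`, where the `.prom` filename is a placeholder.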

phil-db (88.198.7.144)

Disk Layout

NVMe RAID1 — Samsung MZVL21T0HCLR 1 TB × 2:

Device	Mount	Size	Notes
`/dev/md0`	`/`	~1 TB	NVMe RAID1 — MariaDB data + system

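Note: Both NVMe drives report SMART Critical Warning 0x04, recorded here as "volatile memory backup degraded". In the NVMe SMART/Health log, 0x04 is bit 2 (NVM subsystem reliability degraded), while the volatile memory backup flag is bit 4 (0x10), so confirm against `smartctl` output which condition applies. Hetzner describes the warning as expected behavior at end of life (EOL), and RAID1 provides redundancy. The drives are excluded from the SmartCriticalWarning alert; remove the exclusion if the disks are replaced.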
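
The Critical Warning field is a bitmask, so a raw value such as 0x04 can be decoded bit by bit. The sketch below lists the bit meanings as I understand them from the NVMe SMART/Health log definition; treat the labels as a reading aid and verify them against the specification. The function names and the `tolerated` parameter are hypothetical and only illustrate how an alert exclusion for one bit could work.

```python
# Bit meanings of the NVMe SMART "Critical Warning" byte (to be verified
# against the NVMe specification before relying on them).
CRITICAL_WARNING_BITS = {
    0x01: "available spare below threshold",
    0x02: "temperature outside threshold",
    0x04: "NVM subsystem reliability degraded",
    0x08: "media placed in read-only mode",
    0x10: "volatile memory backup device failed",
    0x20: "persistent memory region read-only or unreliable",
}


def decode_critical_warning(value: int) -> list[str]:
    """Return the label of every bit set in `value`."""
    return [label for bit, label in CRITICAL_WARNING_BITS.items() if value & bit]


def unexpected_bits(value: int, tolerated: int = 0x04) -> int:
    """Mask out tolerated bits; a non-zero result should still raise an alert."""
    return value & ~tolerated


print(decode_critical_warning(0x04))
print(hex(unexpected_bits(0x05)))
```

Masking a single bit rather than silencing the whole field keeps the alert useful if a drive later reports an additional condition.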

Key Databases

Database	Size	Owner
`friendica_opensocial`	~157 GB	opensocial.at
`friendica_friendicame`	~95 GB	friendica.me
`keycloak`	small	Keycloak
Matrix stack	on PostgreSQL (phil-app)	not applicable

Performance Baseline

InnoDB buffer pool: 80 GB, hit rate 99.93%
Buffer pool free pages: ~79% (about 63 GB of the 80 GB pool is unused), so the pool is oversized for the current workload
Temp tables to disk: 0.66% (excellent)
Slow queries: ~61k since last restart (threshold 1s)
Active connections: ~19 (of max 300)
Swap usage: 0 bytes (correct for MariaDB)
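
The hit rate and free-page figures above can be derived from server status counters. This sketch assumes the usual definition based on `Innodb_buffer_pool_read_requests` (logical reads) and `Innodb_buffer_pool_reads` (reads that had to go to disk). The counter values in the example are invented to reproduce the 99.93% figure and are not measurements from phil-db.

```python
def buffer_pool_hit_rate(read_requests: int, disk_reads: int) -> float:
    """Fraction of logical reads served from the buffer pool."""
    if read_requests == 0:
        return 0.0  # no traffic yet, so no meaningful rate
    return 1 - disk_reads / read_requests


def unused_pool_gb(pool_gb: float, free_fraction: float) -> float:
    """Size of the free part of the pool in GB."""
    return pool_gb * free_fraction


# Hypothetical counters chosen to match the documented hit rate.
rate = buffer_pool_hit_rate(read_requests=1_000_000, disk_reads=700)
print(f"hit rate: {rate:.2%}")  # prints: hit rate: 99.93%

# Figures from this page: 80 GB pool, ~79% free pages.
unused = unused_pool_gb(80, 0.79)
print(f"unused: {unused:.1f} GB, in use: {80 - unused:.1f} GB")
```

With roughly 17 GB of the pool actually holding pages, the working set is far below the configured size, which is the basis for calling the pool oversized.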

Ansible-managed Scripts

Script	Playbook	Location	Purpose
Borg Backup	`60_backup.yml`	`/opt/borg-backup/backup.sh`	`mariabackup` → `borg create` to Hetzner StorageBox
MySQL Snapshot	`40_mariadb.yml`	systemd timer `mysql_snapshot.timer`	Full diagnostics snapshot (InnoDB status, slow queries via `pt-query-digest`) → tar.gz → Nextcloud WebDAV
Node Reboot Required	`55_monitoring.yml`	systemd timer	Same as phil-app