Engagement · Non-profit · Bronxville, NY
Christ Church Bronxville. UDM Pro under load.
A year of intermittent UniFi console disconnections, traced to two compounding root causes — a legacy firmware backup file pushing the OS partition to 99% and an OOM killer reaping the unifi-core process under a four-application load. Resolved in two on-site sessions, with every change logged and signed off by the client.
- Client: Christ Church Bronxville
- Sector: Non-profit
- Location: Bronxville, NY
- Engagement: Forensic diagnostic + remediation
- Performed by: ShiftCTRL
- Device under review: UniFi Dream Machine Pro
Two compounding faults under a four-app load.
The UDM Pro was running all four UniFi applications (Network, Protect, Access, and Talk) on a 4 GB device. The reported symptom was intermittent UniFi console disconnections under normal operating load.
Root cause analysis surfaced two compounding faults: a legacy firmware backup file at ~897 MB had pushed the OS partition to 99% capacity, and the Linux Out-of-Memory killer was issuing SIGKILL to the unifi-core process when available memory dropped to ~66 MB. Both faults were resolved on-site. With explicit client authorization, secondary remediation steps were also completed: VoIP log files purged, memory-snapshot logs cleared, the unifi-talk service restarted to release open file descriptors, and optional services tuned in coordination with the client to recover ~233 MB of RAM.
The device is now stable. The capacity constraint underneath it is structural, not configurational; the recommendations section below outlines the path from a no-cost log-rotation cron job to a single-device hardware refresh.
What was running, where.
Four findings. Two primary, one secondary, one structural.
OS partition at 99% capacity · PRIMARY
The root filesystem (/) sat at 99% capacity, driven by a ~897 MB legacy firmware backup file left behind by a prior firmware upgrade. When a Linux filesystem fills up, processes can no longer write state files, PID files, or runtime sockets, which surfaces as sudden console disconnections.
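A check along the following lines would surface this kind of file. It is a minimal sketch, not the commands run on site: the function names usage_pct and largest_files and the 100 MB default threshold are illustrative assumptions.

```shell
# Print the used-capacity percentage of the filesystem that holds a path.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List regular files larger than a threshold (in MB), largest first.
# -xdev keeps find from crossing into other mounted partitions.
largest_files() {
    dir="$1"
    min_mb="${2:-100}"
    find "$dir" -xdev -type f -size +"$((min_mb * 1024))"k \
        -exec du -k {} + 2>/dev/null | sort -rn | head -n 10
}
```

Run as root, `usage_pct /` reports the root partition's fill level and `largest_files / 100` lists the biggest candidates for review before anything is deleted.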
OOM kills on unifi-core · PRIMARY
With ~66 MB of available RAM, the Linux OOM killer was sending SIGKILL (9) to unifi-core — the central management process — producing the observed console drops. Per-app footprints under load: Network ~780 MB, Protect ~124 MB, Access ~50 MB, Talk ~100 MB, plus optional security services at ~233 MB. On a 4 GB device, this leaves no headroom under normal operating conditions.
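One way to confirm this pattern is to count OOM-killer victims in the kernel log. The sketch below assumes the kernel writes a "Killed process <pid> (<name>)" line for each kill, which is the usual Linux wording but varies slightly between kernel versions; the function name oom_victims is illustrative.

```shell
# Read kernel log text on stdin and count OOM kills per process name.
oom_victims() {
    grep -oE 'Killed process [0-9]+ \([^)]+\)' |
        sed -E 's/^Killed process [0-9]+ \(([^)]+)\)$/\1/' |
        sort | uniq -c | sort -rn
}
```

A typical invocation would be `dmesg | oom_victims`. A process that appears repeatedly at the top of the output is the one being reaped.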
FreeSwitch logs accumulating without rotation · SECONDARY
The /var/log partition had grown to 89%, principally from FreeSwitch (UniFi Talk) log files. UniFi does not ship a built-in log rotation setting for the Talk application. A second behavior compounds the issue: deleting log files while unifi-talk holds open file descriptors does not reclaim disk blocks until the service restarts.
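The open-descriptor behavior can be observed directly on Linux, where /proc marks such files with a " (deleted)" suffix. The following is a minimal sketch under that assumption; the function name list_deleted_open_files is illustrative, and it needs root to see other users' processes.

```shell
# Print "PID<TAB>path (deleted)" for files under a directory that were
# removed while a process still holds them open.
list_deleted_open_files() {
    dir_prefix="${1:-/var/log}"
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we cannot inspect; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$dir_prefix"*" (deleted)")
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

Any line in the output means disk blocks are still pinned by that process and will be released only when it closes the file or restarts.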
Hardware capacity constraint · STRUCTURAL
The UDM Pro’s 4 GB of RAM is operating at its practical limit with all four applications active. Even after software optimizations, the UniFi Network Java application alone consumes ~780 MB RSS, with optional security services adding ~233 MB on top. This is a structural hardware limitation, not a configuration error. The recommendations section addresses it in three steps: no-cost, no-additional-cost, and capital.
What we did, in two sessions.
SESSION 01 — INITIAL REMEDIATION
- Identified and removed a ~897 MB legacy firmware backup file from the OS partition. Usage dropped from 99% to ~45%.
- Cleared stale system log files in non-rotating directories.
- Reviewed running processes and memory allocation (ps aux, free -h).
- Identified the FreeSwitch log accumulation as secondary disk pressure and documented remediation commands for client review prior to execution.
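The memory review can also be scripted by reading the kernel's own accounting. This is a minimal sketch, assuming a Linux kernel that exposes MemAvailable in /proc/meminfo (3.14 and later); the function name mem_available_mb is illustrative.

```shell
# Print available memory in whole MB, read from /proc/meminfo by default.
# An alternate file can be passed as the first argument for offline analysis.
mem_available_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Logging this value periodically alongside `ps aux` output gives a before-and-after baseline for any tuning.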
SESSION 02 — REMEDIATION Authorized by Nelson, CCB
- Deleted FreeSwitch VoIP log files: rm -rf /var/log/freeswitch/* (authorized by client).
- Deleted memory-snapshot logs: rm -rf /var/log/mem_snapshot/* (authorized by client).
- Restarted the unifi-talk service to release open file handles. Log partition fell from 89% to 22% (711 MB free).
- Tuned optional services in coordination with the client to relieve memory pressure. ~233 MB RAM recovered. Available RAM rose from ~66 MB to ~368 MB.
Measured at the close of session 02.
From no-cost to capital. In that order.
Configure FreeSwitch log rotation.
UniFi Talk does not include a built-in log-rotation setting. Without intervention, FreeSwitch logs return the partition to a critical state. A cron job on a weekly schedule purges logs older than 7–14 days. Brief maintenance window, SSH access only.
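A purge script of the kind described might look like the sketch below. It is illustrative only: the function name purge_old_logs, the script path, and the Sunday 03:00 schedule are assumptions, and the 14-day retention is the upper end of the range above.

```shell
#!/bin/sh
# Delete regular files under a log directory that were last modified
# more than N full days ago. Does nothing if the directory is missing.
purge_old_logs() {
    log_dir="$1"
    max_age_days="${2:-14}"
    [ -d "$log_dir" ] || return 0
    find "$log_dir" -type f -mtime +"$max_age_days" -exec rm -f {} +
}

# Hypothetical weekly crontab entry, assuming this file is saved as
# /usr/local/bin/purge-freeswitch-logs.sh with the call below uncommented:
#   0 3 * * 0 /usr/local/bin/purge-freeswitch-logs.sh
# purge_old_logs /var/log/freeswitch 14
```

Because only files untouched for the full retention period are removed, the log that unifi-talk is actively writing is normally left alone, which avoids the open-descriptor problem noted in the findings.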
Deploy the second UDM Pro in HA failover mode.
The client has a second UDM Pro currently idle. UniFi does not support application clustering across UDM Pros, so this unit cannot distribute RAM load. It can, however, run as a hot standby:
- The secondary takes over automatically if the primary fails.
- Failover requires no user-side action.
- Both units stay synchronized to the primary configuration.
Important: HA failover is a business-continuity measure, not a performance fix. RAM pressure on the primary is unaffected.
Move Protect to a dedicated UNVR.
UniFi Protect is architecturally designed to run on dedicated NVR hardware. Moving it to a UNVR frees ~124 MB of RAM on the primary, dedicates storage and processing to camera feeds, and restores headroom for the full application stack. Camera configurations migrate with minimal disruption.
UDM Pro Max as primary, 8 GB RAM.
The cleanest single-device resolution. The Max runs all four applications at full capacity; the existing UniFi backup restores directly. The retired UDM Pro becomes the HA standby described in the failover recommendation above, making the two investments complementary.
Best-outcome path: UDM Pro Max as primary + existing UDM Pro as HA failover — full RAM headroom and full redundancy across the application stack.