Run Your Logging Platform on a Separate Server
In earlier posts ("Log What the LLM Says" and "Aggregate Logs and Set Alerts Early") we covered capturing structured logs and routing them into an aggregator for filtering and alerts. That’s the right first step. The next step is to run your logging platform externally (on a separate server or cluster, or via a managed service) rather than bundling it inside the same Docker Compose file as your app stack.
This article explains why separating logging infrastructure matters, how to wire it up, and walks through a simple example (using Graylog) of how your app stack (OpenWebUI, N8N, Ollama, Postgres) can send logs to a remote logging platform.
Why This Matters
Running your logging platform in the same Compose file as your app (co-located) is convenient for development, but it has major drawbacks in production:
- Blast radius: If the app causes resource exhaustion, it can bring down the logging stack too, making debugging impossible.
- Scalability: Logs grow fast. A single host’s disk/IO can become a bottleneck. External platforms are easier to scale independently.
- Isolation & security: Centralized logs often need strict access controls, encryption, and retention policies that are easier to enforce on dedicated infrastructure.
- Resilience & durability: An external cluster (or managed service) can provide HA, retention policies, and backups so logs survive app incidents.
- Operational separation: Teams that run the app don’t need full control over logging infra; SRE or platform teams can operate and secure it separately.
Think of logs as critical business data: treat the logging platform like a system of record, not a dev convenience.
What to Do
- Deploy your logging platform on separate infrastructure (a different VM, cluster, or managed service). For self-hosting: Graylog, ELK (Elasticsearch + Logstash + Kibana), or Grafana Loki are good options. For managed: Splunk Cloud, Elastic Cloud, Grafana Cloud, Sumo Logic.
- Standardize structured logs (JSON) across your services.
- Forward logs from your app servers to the remote logging host using one of:
- Docker logging drivers (e.g., GELF) that send container stdout to the remote collector (a minimal Compose sketch follows this list).
- A lightweight forwarder like Fluent Bit / Fluentd on each host that tails files and forwards to the remote cluster.
- Agent libraries or GELF clients in your app (useful for low-volume, structured events).
- Protect the transport: use TLS / TCP where possible, firewall rules, and authentication. Do not expose UDP GELF inputs publicly without network controls.
- Configure retention, indices, and alerts on the external platform.
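For the first forwarding option, the Docker GELF driver needs nothing more than a logging block on each service in your app-side Compose file. Here is a minimal sketch, assuming a GELF UDP input is listening at logs.example.com:12201 (that input is created on the Graylog server shown in the Code Example below); the N8N image tag is just illustrative:

services:
  n8n:
    image: n8nio/n8n:latest
    logging:
      driver: gelf
      options:
        # Ship this container's stdout/stderr to the remote Graylog GELF input.
        gelf-address: "udp://logs.example.com:12201"
        # Tag each message so you can filter by service in Graylog.
        tag: "n8n"

The same logging block works for OpenWebUI, Ollama, and Postgres (remember to start a GELF UDP input under System > Inputs in Graylog first). Keep in mind that the Docker GELF driver does not encrypt traffic, so keep it on a private network or VPN, or use a forwarder such as Fluent Bit when you need TLS.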
Production Tip
Run logging on separate infrastructure and treat logs as first-class data: secure the transport, plan retention and indexing, and give logs their own capacity and uptime SLOs.
Code Example
Below is a minimal docker-compose.yml you could run on a dedicated logging server. This keeps the logging stack independent from your app compose.
Note: This example is simplified for clarity. In production you’ll want secure passwords, secrets management, TLS, resource limits, and proper volume backups.
version: '3.8'

services:
  mongo:
    image: mongo:6.0
    container_name: graylog_mongo
    volumes:
      - mongo_data:/data/db

  elasticsearch:
    # Graylog 5.x supports Elasticsearch 7.10.2 (or OpenSearch), not Elasticsearch 8.x
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
    environment:
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
    volumes:
      - es_data:/usr/share/elasticsearch/data

  graylog:
    image: graylog/graylog:5.1
    container_name: graylog
    depends_on:
      - mongo
      - elasticsearch
    environment:
      GRAYLOG_PASSWORD_SECRET: "replace_with_random_string"   # at least 16 characters
      GRAYLOG_ROOT_PASSWORD_SHA2: "replace_with_sha2_hash_of_password"
      GRAYLOG_HTTP_EXTERNAL_URI: "https://logs.example.com/"
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://elasticsearch:9200"
      GRAYLOG_MONGODB_URI: "mongodb://mongo:27017/graylog"
    ports:
      - "9000:9000"         # Web UI (restrict in firewall)
      - "12201:12201/udp"   # GELF UDP input (prefer internal network / VPN)
      - "12201:12201/tcp"   # optional: GELF TCP input (better for reliability)

volumes:
  mongo_data:
  es_data:
Run this on a dedicated machine, not on the same host running your app containers.
Example Output
If you configure an alert in Graylog for LLM error spikes, you might receive something like:
{
  "event": "LLM Error Rate Alert",
  "description": "Error rate exceeded 5% in the last 5 minutes",
  "timestamp": "2025-08-10T14:55:32Z",
  "affected_services": ["N8N LLM Workflow", "Ollama API"],
  "link_to_dashboard": "http://your-graylog-server:9000/alerts/123"
}
Going Further
You can extend this approach by:
- Sending logs from multiple servers or services into the same Graylog instance (a Compose-level sketch follows this list).
- Integrating with Slack, email, or PagerDuty for real-time notifications.
- Enabling log retention policies and archiving to cheaper storage like S3.
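For the first point above, a Compose extension field plus a YAML anchor lets every service in the app stack reuse the same forwarding config. A sketch under the same assumptions as before (GELF UDP input at logs.example.com:12201, image tags illustrative):

# Shared logging config, reused below via a YAML anchor.
x-logging: &graylog-logging
  driver: gelf
  options:
    gelf-address: "udp://logs.example.com:12201"

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    logging: *graylog-logging
  ollama:
    image: ollama/ollama:latest
    logging: *graylog-logging

Servers outside Docker can ship to the same input via a forwarder (Fluent Bit, Fluentd) or a GELF client library, so logs from every environment land in one searchable place.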
Final Thought
An external logging platform acts like a “black box” or “flight data recorder” for your AI system. Even if the main engine fails, you still have the data to figure out what happened. Moving your logs off your primary environment is a small step that makes your AI automation dramatically more resilient… and more production-ready!