<!-- OSSAlt AI-readable guide source -->
<!-- Canonical: https://ossalt.com/guides/self-host-netdata-2026 -->
<!-- Raw Markdown: https://ossalt.com/guides/self-host-netdata-2026/raw.md -->
<!-- Source path: content/guides/self-host-netdata-2026.mdx -->

---
og_image: "/images/guides/self-host-netdata-2026.webp"
title: "How to Self-Host Netdata Monitoring 2026"
description: "Self-host Netdata in 2026. GPL 3.0, ~69K stars, C — per-second metrics for servers, containers, databases. 800+ integrations, anomaly detection, alerts."
date: "2026-03-09"
author: "OSSAlt Team"
tier: 1
tags: ["netdata", "monitoring", "infrastructure", "self-hosting", "docker", "metrics", "2026"]
---

## TL;DR

[Netdata](https://www.netdata.cloud) (GPL 3.0, ~69K GitHub stars, C) is a real-time infrastructure monitoring agent. It collects 1-second resolution metrics for CPU, memory, disk, network, processes, containers, databases, and 800+ other integrations, and serves them on a built-in web dashboard with little or no configuration. Unlike Prometheus, which scrapes targets on an interval, each Netdata agent collects locally every second and can stream its metrics to a parent node in real time. Run it on any server for near-instant observability; for a basic setup, `docker compose up` is enough.

## Key Takeaways

- **Netdata**: GPL 3.0, ~69K stars, C — 1-second per-metric resolution, 800+ collectors
- **Zero config**: Automatically detects and monitors MySQL, PostgreSQL, Redis, Nginx, Docker, etc.
- **Anomaly detection**: ML-based anomaly detection on every metric, no configuration needed
- **Alerts**: Built-in alert conditions for common failure patterns with notifications
- **Distributed**: Each agent is standalone; use Netdata Parents for centralized multi-host view
- **vs Prometheus+Grafana**: Netdata is turnkey; Prometheus is more customizable but needs configuration


## Netdata vs Prometheus+Grafana vs Zabbix

| Feature | Netdata | Prometheus+Grafana | Zabbix |
|---------|---------|-------------------|--------|
| License | GPL 3.0 | Apache 2.0 (Prometheus), AGPL 3.0 (Grafana) | AGPL 3.0 (7.0+) |
| Setup time | Minutes | Hours | Hours |
| Resolution | 1 second | 15s typical (configurable) | 1 minute typical |
| Auto-discovery | Yes (800+ collectors) | Manual scrape config | Agent-based |
| Anomaly detection | Yes (ML, built-in) | Manual rules | Trigger-based |
| Long-term storage | 1 month (local) | 15 days by default (configurable; remote storage for longer) | DB-based |
| Dashboards | Built-in | Grafana required | Built-in |
| Alerting | Built-in | Alertmanager | Built-in |
| Agent RAM | ~100MB | ~50MB | ~200MB |


## Part 1: Docker Setup

### Single-node monitoring

```yaml
# docker-compose.yml
services:
  netdata:
    image: netdata/netdata:latest
    container_name: netdata
    restart: unless-stopped
    pid: host
    network_mode: host     # Required for full network monitoring
    cap_add:
      - SYS_PTRACE
      - SYS_ADMIN
    security_opt:
      - apparmor:unconfined
    volumes:
      - netdataconfig:/etc/netdata
      - netdatalib:/var/lib/netdata
      - netdatacache:/var/cache/netdata
      - /etc/passwd:/host/etc/passwd:ro
      - /etc/group:/host/etc/group:ro
      - /etc/localtime:/etc/localtime:ro
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /etc/os-release:/host/etc/os-release:ro
      - /var/log:/host/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      NETDATA_CLAIM_TOKEN: "${NETDATA_CLAIM_TOKEN}"    # Optional: Netdata Cloud claim
      NETDATA_CLAIM_URL: "https://app.netdata.cloud"
      NETDATA_CLAIM_ROOMS: "${NETDATA_CLAIM_ROOMS}"

volumes:
  netdataconfig:
  netdatalib:
  netdatacache:
```

```bash
docker compose up -d
```

Visit `http://your-server:19999` — the Netdata dashboard loads immediately.

> **Note:** `network_mode: host` is needed for Netdata to see host network interfaces and monitor processes properly. Port 19999 is exposed by default.

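To confirm the agent is actually answering before moving on, a small readiness check can poll the `/api/v1/info` endpoint. This is a minimal sketch, not an official Netdata tool: `netdata_info_url` and `wait_for_netdata` are hypothetical helper names, and the 30-attempt limit is an arbitrary assumption.

```bash
# Hypothetical helpers; names and retry limits are illustrative assumptions.

# Build the URL of the agent's info endpoint (defaults: localhost, 19999).
netdata_info_url() {
  printf 'http://%s:%s/api/v1/info\n' "${1:-localhost}" "${2:-19999}"
}

# Poll the endpoint once per second until it answers or attempts run out.
# The body runs in a subshell so its variables do not leak into your shell.
wait_for_netdata() (
  url=$(netdata_info_url "${1:-}" "${2:-}")
  attempts=${3:-30}
  while [ "$attempts" -gt 0 ]; do
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "Netdata is answering at $url"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 1
  done
  echo "Netdata did not answer at $url" >&2
  return 1
)
```

After `docker compose up -d`, something like `wait_for_netdata your-server 19999` would block until the dashboard is reachable or give up after roughly 30 seconds.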

## Part 2: HTTPS with Caddy

Since Netdata uses `network_mode: host`, it's already on port 19999. Add a Caddy reverse proxy on the same host:

```caddyfile
metrics.yourdomain.com {
    reverse_proxy localhost:19999
}
```

### Access Control

Netdata is open by default. Restrict access:

```bash
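# Open a shell in the container and edit netdata.conf:
docker exec -it netdata sh
vi /etc/netdata/netdata.conf

# Then add this section to netdata.conf (config file content, not shell commands):
#   [web]
#       allow connections from = localhost 192.168.0.0/24 10.0.0.0/8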
```

Or restrict entirely and access only via Caddy with basic auth:

```caddyfile
metrics.yourdomain.com {
    basic_auth {
        admin $2a$14$hashofyourpassword
    }
    reverse_proxy localhost:19999
}
```


## Part 3: Auto-Detected Collectors

Netdata automatically detects and configures:

| Service | What it monitors |
|---------|-----------------|
| **Docker** | Container CPU, memory, network, I/O |
| **PostgreSQL** | Queries/sec, connections, replication lag, table bloat |
| **MySQL/MariaDB** | Queries, threads, InnoDB metrics |
| **Redis** | Operations/sec, memory, hit rate, keyspace |
| **Nginx** | Requests/sec, connections, response codes |
| **Node.js** | Event loop lag, heap, GC (not auto-detected; requires the app to expose a Prometheus endpoint, e.g. via `prom-client`) |
| **systemd** | Service status, CPU, memory per service |
| **Disk** | IOPS, latency, utilization per device |
| **Network** | Packets/sec, bandwidth, errors per interface |
| **CPU** | Per-core utilization, interrupts, softirqs |

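Most of these need no configuration: Netdata finds running services automatically. Services that require credentials (PostgreSQL, MySQL) may need a connection string in the collector's config before metrics appear.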


## Part 4: Custom Alerts

Default alerts cover common failure scenarios. Add custom ones:

```bash
# Edit alerts:
docker exec -it netdata sh
vi /etc/netdata/health.d/custom.conf
```

```conf
# Alert if disk usage > 85% ("template" applies to every chart in the disk.space context):
template: disk_space_warning
on: disk.space
os: linux
lookup: average -10m unaligned of used
calc: $this * 100 / ($used + $avail)
units: %
every: 1m
warn: $this > 85
crit: $this > 95
info: disk ${label:mount_point} space utilization
delay: down 5m multiplier 1.5 max 1h
to: sysadmin

# Alert if PostgreSQL connection utilization is too high (percent of max_connections).
# Context name as used by the go.d postgres collector; check your dashboard if it differs:
template: pg_connections_warning
on: postgres.connections_utilization
lookup: average -5m unaligned of used
units: %
every: 1m
warn: $this > 80
crit: $this > 95
info: PostgreSQL connection utilization
to: dba
```

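The `calc` line in the disk alert turns raw used and available space into a percentage, which `warn` and `crit` then compare against 85 and 95. As a worked sketch of that arithmetic outside Netdata (the function names `disk_used_pct` and `disk_alert_status` are hypothetical, and the sample sizes are made up):

```bash
# Percentage of space used, mirroring: used * 100 / (used + avail)
disk_used_pct() {
  awk -v used="$1" -v avail="$2" 'BEGIN { printf "%.1f\n", used * 100 / (used + avail) }'
}

# Map a percentage onto the thresholds used in the alert above.
disk_alert_status() {
  awk -v pct="$1" 'BEGIN {
    if (pct > 95)      print "CRITICAL"
    else if (pct > 85) print "WARNING"
    else               print "CLEAR"
  }'
}

# 180 GiB used, 20 GiB free: 90.0% used, so this prints WARNING
disk_alert_status "$(disk_used_pct 180 20)"
```

Running a few of your own disk sizes through this is a quick way to sanity-check thresholds before putting them in `health.d`.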

## Part 5: Notifications

Configure alert notifications:

```bash
# Edit notifications:
docker exec -it netdata vi /etc/netdata/health_alarm_notify.conf
```

```bash
# Slack:
SEND_SLACK="YES"
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
DEFAULT_RECIPIENT_SLACK="#alerts"

# Telegram:
SEND_TELEGRAM="YES"
TELEGRAM_BOT_TOKEN="your-bot-token"
DEFAULT_RECIPIENT_TELEGRAM="your-chat-id"

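# ntfy (the recipient is the full topic URL):
SEND_NTFY="YES"
DEFAULT_RECIPIENT_NTFY="https://ntfy.yourdomain.com/alerts"

# PagerDuty (needs the PagerDuty agent's pd-send available; the recipient is the service key):
SEND_PD="YES"
DEFAULT_RECIPIENT_PD="your-service-key"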
```

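Before relying on a channel, it can help to confirm which `SEND_*` switches are actually set to `YES`. A minimal sketch, assuming you have copied `health_alarm_notify.conf` out of the container to a local path; `enabled_channels` is a hypothetical helper, not a Netdata command:

```bash
# Print the channels whose SEND_* switch is set to YES in a notify config file.
# Commented-out lines (starting with #) are ignored.
enabled_channels() {
  sed -n 's/^[[:space:]]*SEND_\([A-Z_]*\)="\{0,1\}YES"\{0,1\}[[:space:]]*$/\1/p' "$1" | sort
}
```

For example, after `docker cp netdata:/etc/netdata/health_alarm_notify.conf ./health_alarm_notify.conf` (assuming the file exists at that path in your container), `enabled_channels ./health_alarm_notify.conf` lists one enabled channel per line.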

## Part 6: Distributed Multi-Server Setup

Monitor multiple servers from one UI using **Netdata Parents**:

### On the parent server

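```ini
# /etc/netdata/stream.conf on the parent server
# The section name is the API key that children present (a UUID).
[11111111-2222-3333-4444-555555555555]
    enabled = yes
    # Restrict which child IPs may use this key:
    allow from = 10.* 192.168.0.*
```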

### On each child server

```bash
# stream.conf on each agent:
docker exec -it netdata vi /etc/netdata/stream.conf
```

```ini
[stream]
    enabled = yes
    destination = parent.yourdomain.com:19999
    # Must be a UUID (generate one with uuidgen) and match the key section in the parent's stream.conf
    api key = 11111111-2222-3333-4444-555555555555
```

All child server metrics stream to the parent in real-time. The parent UI shows all servers in one dashboard.

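Streaming API keys are expected to be UUIDs, and the same key has to appear on both sides. The sketch below generates a key and prints matching child and parent stanzas. `new_stream_key` and `render_stream_conf` are hypothetical helper names, and the fallback to `/proc/sys/kernel/random/uuid` assumes a Linux host.

```bash
# Generate a UUID to use as the streaming API key.
new_stream_key() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen
  else
    cat /proc/sys/kernel/random/uuid   # Linux-only fallback
  fi
}

# Print the child and parent stream.conf stanzas for one destination and key.
# $1 = parent address (host:port), $2 = API key
render_stream_conf() {
  printf '# child: /etc/netdata/stream.conf\n'
  printf '[stream]\n    enabled = yes\n    destination = %s\n    api key = %s\n\n' "$1" "$2"
  printf '# parent: /etc/netdata/stream.conf\n'
  printf '[%s]\n    enabled = yes\n' "$2"
}
```

Calling `render_stream_conf parent.yourdomain.com:19999 "$(new_stream_key)"` prints both stanzas with the same key, which avoids the most common streaming failure: a key mismatch between child and parent.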

## Part 7: Anomaly Detection

Netdata runs ML models on every metric:

- Each metric gets its own unsupervised model (k-means clustering, per Netdata's ML documentation), trained on a rolling window of that metric's recent history
- **Anomaly Rate** = percentage of metrics in anomalous state right now
- No configuration needed — it just works
- Access: **Dashboard → Anomaly Advisor tab**

```bash
# List alerts currently in WARNING or CRITICAL state via the API:
curl -s "http://localhost:19999/api/v1/alarms?all" | jq '.alarms | to_entries[] | select(.value.status=="WARNING" or .value.status=="CRITICAL") | .value.name'
```

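The anomaly rate itself is simple arithmetic: the share of metrics flagged anomalous out of all metrics in a window. A worked sketch with made-up counts follows; `anomaly_rate_pct` is a hypothetical helper. The commented `curl` line assumes the agent's `/api/v1/data` endpoint accepts `options=anomaly-bit`, which is worth checking against the API documentation for your Netdata version.

```bash
# Anomaly rate as a percentage: anomalous / total * 100
anomaly_rate_pct() {
  awk -v a="$1" -v t="$2" 'BEGIN {
    if (t <= 0) { print "0.00"; exit }
    printf "%.2f\n", a * 100 / t
  }'
}

# 12 of 2400 metrics flagged: prints 0.50
anomaly_rate_pct 12 2400

# Assumed query for a chart's anomaly rate over the last minute (verify for your version):
# curl -s "http://localhost:19999/api/v1/data?chart=system.cpu&after=-60&options=anomaly-bit"
```

A node-level rate that stays well under 1% is ordinary background noise in this framing; sustained jumps are what the Anomaly Advisor is meant to surface.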

## Maintenance

```bash
# Update Netdata:
docker compose pull
docker compose up -d

# Check Netdata version:
docker exec netdata netdata -v

# Logs:
docker compose logs -f netdata

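# Reload alert (health) config without a restart:
docker exec netdata netdatacli reload-health
# Other config changes (netdata.conf, stream.conf) need: docker compose restart netdata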

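# Backup config (volume name is <compose-project>_netdataconfig; check with `docker volume ls`).
# Reading the volume's mountpoint usually requires root:
sudo tar -czf netdata-config-$(date +%Y%m%d).tar.gz \
  "$(docker volume inspect netdata_netdataconfig --format '{{.Mountpoint}}')"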
```


## Why Self-Host Netdata

Datadog starts at $15/host/month for infrastructure monitoring — a 10-server setup costs $1,800/year before adding APM, logs, or synthetics. New Relic's compute-based pricing lands in similar territory for any meaningful fleet. Grafana Cloud's free tier cuts off after 10,000 series, and production workloads routinely exceed that. Netdata gives you 1-second resolution monitoring for every metric, ML-based anomaly detection, and 800+ auto-configuring collectors — for the cost of the server you're already running.

The 1-second resolution is the headline differentiator. Prometheus setups commonly scrape every 15 seconds, and Zabbix typically polls every minute. When a CPU spike lasts 3 seconds and causes an outage, 15-second resolution tells you at best that "something happened," and it may miss the spike entirely. Netdata's 1-second resolution shows you which processes spiked, for how long, and what happened to disk I/O and network at the same moment. This matters most when debugging intermittent issues that don't show up in coarser metrics.

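The effect of the collection interval on a short spike can be checked with a little arithmetic. This sketch counts how many sample points land inside a spike; `samples_in_spike` is a hypothetical helper and the timings are invented for illustration.

```bash
# Count sample times (0, interval, 2*interval, ...) that fall inside
# the half-open window [start, start + duration). All values in seconds.
samples_in_spike() (
  start=$1 duration=$2 interval=$3
  [ "$interval" -gt 0 ] || return 2
  end=$((start + duration))
  # First sample time at or after the spike start (ceiling division).
  t=$(( (start + interval - 1) / interval * interval ))
  count=0
  while [ "$t" -lt "$end" ]; do
    count=$((count + 1))
    t=$((t + interval))
  done
  echo "$count"
)

# Spike from t=16s to t=19s
samples_in_spike 16 3 15   # prints 0: scrapes at 15s and 30s both miss it
samples_in_spike 16 3 1    # prints 3: samples at 16s, 17s, 18s
```
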
Zero-configuration discovery is the other standout. Netdata detects running MySQL, PostgreSQL, Redis, Nginx, Docker, and dozens of other services automatically — no YAML scrape configs, no exporters to install. The moment you run a container, Netdata starts collecting from it. For teams that are already managing complex infrastructure, eliminating the maintenance overhead of Prometheus exporters and scrape config files is a significant operational saving.

The anomaly detection capability runs without any configuration. Netdata trains an unsupervised model (k-means clustering, according to its ML documentation) on every metric and flags deviations from normal behavior. This catches subtle degradations, such as a slow memory leak, gradually increasing disk I/O, or a creeping queue backup, before they become outages. This kind of ML-based alerting typically requires expensive observability platforms or significant custom engineering.

**When NOT to self-host Netdata.** Netdata stores metrics locally (1 month by default). If you need indefinite long-term metric retention for capacity planning or compliance, Prometheus with remote write to object storage is a better fit. Netdata's dashboards are excellent for real-time operations but are less flexible for custom reporting compared to Grafana. And if your team is already invested in the Prometheus/Grafana ecosystem, adding Netdata alongside it can create alert duplication rather than reducing overhead.


## Prerequisites

Netdata's resource footprint depends on how many metrics it collects. On a baseline server with moderate services, expect around 100–150MB RAM and 2–5% CPU at idle. With many containers and databases, RAM can climb to 300–400MB. A **Hetzner CX22** (2 vCPU, 4GB RAM) at €4.50/month handles a well-loaded single node comfortably. If you're running the multi-server Parent setup (collecting from 10+ child nodes), upgrade to a CX32 (8GB RAM). Refer to the [VPS comparison guide](/guides/self-hosting-vps-comparison-2026) for a detailed breakdown of provider options for monitoring workloads.

Docker Engine 24+ and Docker Compose v2 are required. Note that Netdata's compose configuration uses `network_mode: host` and mounts several `/proc` and `/sys` paths — this is necessary for Netdata to see host-level metrics rather than just container metrics. On SELinux-enabled systems (CentOS, Fedora), you may need to add `:z` to bind mounts or set `security_opt: label:disable`. The `SYS_PTRACE` and `SYS_ADMIN` capabilities are also required — these let Netdata inspect running processes and their resource usage. Without them, process-level monitoring (seeing which specific app is consuming CPU) is unavailable.

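A quick preflight check for the Docker Engine requirement might look like the sketch below. `docker_major_ok` is a hypothetical helper that only compares the major version; the commented invocation assumes a running Docker daemon on the host.

```bash
# Succeed if a version string such as "24.0.7" has a major version >= the minimum.
# $1 = version string, $2 = minimum major version (default 24)
docker_major_ok() (
  major=${1%%.*}
  min=${2:-24}
  case $major in
    ''|*[!0-9]*) echo "unrecognized version: $1" >&2; return 2 ;;
  esac
  [ "$major" -ge "$min" ]
)

# Example (requires a running Docker daemon):
# docker_major_ok "$(docker version --format '{{.Server.Version}}')" && echo "Docker OK"
```
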
The Netdata dashboard runs on port 19999 with no authentication by default. This port should never be publicly exposed — proxy it through Caddy with at minimum HTTP basic auth. The recommended approach is to access it only over a private VPN or Tailscale/Headscale mesh, with no public exposure at all. Exposing monitoring dashboards publicly leaks infrastructure topology to anyone who discovers the URL.

For the multi-server setup, plan your network topology before deployment. The Parent node needs to be reachable from each child node on port 19999. If your servers are on different networks, you'll need either public exposure (not recommended without authentication) or a VPN mesh to link them privately.

DNS: create an A record for `metrics.yourdomain.com` pointing to your server. Caddy manages the TLS certificate automatically. Make sure ports 80 and 443 are open in your firewall before Caddy starts: certificate issuance uses the ACME HTTP-01 challenge on port 80 or the TLS-ALPN-01 challenge on port 443.


## Production Security Hardening

Netdata's dashboard shows detailed information about every process, service, and network connection on your server. Exposing it publicly without authentication is a significant information leak.

**UFW firewall.** Allow only SSH, HTTP, and HTTPS. Block 19999 from external access:

```bash
ufw default deny incoming
ufw default allow outgoing
ufw allow ssh
ufw allow 80/tcp
ufw allow 443/tcp
ufw enable
```

Port 19999 is blocked by default since `ufw default deny incoming` covers it. Confirm with `ufw status`.

**Caddy with basic auth.** Protect the metrics dashboard with a password at minimum:

```caddyfile
metrics.yourdomain.com {
    basic_auth {
        admin $2a$14$yourhashhere
    }
    reverse_proxy localhost:19999
}
```

Generate the hash with: `caddy hash-password --plaintext yourpassword`.

**Fail2ban.** Enable the SSH jail to protect the server from brute-force login attempts:

```bash
apt install fail2ban -y
```

Create `/etc/fail2ban/jail.local`:

```ini
[DEFAULT]
bantime  = 2h
findtime = 15m
maxretry = 5

[sshd]
enabled = true
```

**SSH hardening.** Disable password authentication:

```bash
# /etc/ssh/sshd_config
PasswordAuthentication no
PermitRootLogin no
```

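Restart SSH: `systemctl restart ssh` on Debian/Ubuntu (the unit is named `sshd` on RHEL-family systems). Ensure your SSH key is in `~/.ssh/authorized_keys` before doing this, and keep your current session open until you have confirmed that a new login works.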

**Automatic updates.** OS security patches should apply without manual intervention:

```bash
apt install unattended-upgrades -y
dpkg-reconfigure --priority=low unattended-upgrades
```

**Backup Netdata configuration.** The `netdataconfig` volume contains your custom alert rules and any tuned collector configs. Back this up using Restic to avoid reconfiguring from scratch after a server rebuild — see the [automated backup guide](/guides/automated-server-backups-restic-rclone-2026) for the full workflow. The metrics data itself (in `netdatalib` and `netdatacache`) is ephemeral by nature and generally not worth backing up.

For a complete hardening reference, see the [self-hosting security checklist](/guides/self-hosting-security-checklist-2026).


## Troubleshooting Common Issues

**Netdata container starts but shows no host metrics.** This usually means `network_mode: host` is not set, or the `/proc` and `/sys` mounts are missing. Verify the compose configuration includes both `network_mode: host` and all required volume mounts (particularly `/proc:/host/proc:ro` and `/sys:/host/sys:ro`). Without these, Netdata sees the container's namespaced view rather than the host.

**Docker container metrics missing.** Netdata needs access to the Docker socket to collect container metrics. Verify the volume mount `/var/run/docker.sock:/var/run/docker.sock:ro` is present in the compose file. Also confirm the `SYS_PTRACE` and `SYS_ADMIN` capabilities are listed under `cap_add:`.

**Alert notifications not sending.** Edit `/etc/netdata/health_alarm_notify.conf` inside the container and set `SEND_SLACK="YES"` (or your target). Then send a test notification as the `netdata` user: `docker exec -u netdata netdata /usr/libexec/netdata/plugins.d/alarm-notify.sh test`. Common issues are incorrect webhook URLs or Telegram bot tokens with no permission to send messages to the target chat.

**High CPU usage on the Netdata container.** Netdata is written in C and is normally very efficient, but certain collectors (Python-based ones, in particular) can spike CPU if the monitored service is returning slow responses. Check the Netdata dashboard's own performance section — go to **Netdata Monitoring → Plugins** to see which collector is consuming CPU. Disable noisy collectors by editing the relevant config in `/etc/netdata/`.

**Metrics data lost after container restart.** Verify the Docker volumes `netdatalib` and `netdatacache` are defined and mounted correctly. If you're using bind mounts instead of named volumes, check file permissions. The Netdata process needs write access to its data directories.

**Cannot access the Netdata dashboard over Caddy.** Since Netdata uses `network_mode: host`, a Caddy instance running directly on the host can reach it via `localhost:19999` without a Docker bridge. If Caddy can't reach Netdata, confirm the Netdata container is healthy: `docker compose ps` and `curl http://localhost:19999/api/v1/info`. If Caddy is itself running in a container (not host mode), use `host.docker.internal:19999` as the upstream address; on Linux this name only resolves if the Caddy service declares `extra_hosts: ["host.docker.internal:host-gateway"]`.

**Database-specific collectors not showing up.** Netdata auto-discovers services but needs to be able to connect to them. For PostgreSQL, Netdata needs a database user that can read the `pg_stat_*` views (for example, a member of the `pg_monitor` role). If PostgreSQL is running in a container without host networking, Netdata may not be able to reach it. Configure the PostgreSQL collector manually in `/etc/netdata/go.d/postgres.conf` (current Netdata releases ship this collector under `go.d`, not `python.d`) with a DSN that points at your PostgreSQL container's IP or Docker network name.

**Alerts firing incorrectly for expected conditions.** Netdata ships with default alert thresholds that may not match your specific workload. A 90% disk usage alert may trigger legitimately in production systems where high disk utilization is normal. Adjust thresholds by editing the relevant health configuration file inside the container: list your overrides with `docker exec netdata ls /etc/netdata/health.d/`, then edit the file that defines the alert. Apply the change without a restart by running `docker exec netdata netdatacli reload-health`.

**Netdata Parent not receiving data from child nodes.** Multi-server streaming requires the child's `stream.conf` to have the correct API key and destination. The API key must also be enabled on the Parent, in its `stream.conf`, as a section named after the key (`[your-api-key]`) containing `enabled = yes`. Firewall rules on the Parent must allow inbound connections on port 19999 from the child server IPs. Use `docker compose logs netdata` on both parent and child to see streaming connection attempts and any rejection messages.

**Very high cardinality metrics causing memory growth.** If you have many short-lived containers (CI/CD build agents, for example), Netdata tracks metrics for each container ID including terminated ones. Over time this increases memory use. Set `[db] mode = ram` in `netdata.conf` (older releases call this `[global] memory mode = ram`) to keep metrics in memory only, at the cost of losing data after restart, or configure a shorter retention period. For ephemeral container environments, also configure Netdata to track containers by image name rather than container ID to reduce cardinality.


*See all open source monitoring tools at [OSSAlt.com/categories/devops](https://ossalt.com).*