# Self-Hosting Grafana for Observability (2026)
Grafana is the industry-standard open source observability platform — dashboards, alerting, and data exploration across metrics, logs, and traces. Self-hosting gives you unlimited dashboards, data sources, and users.
## Requirements

- VPS with 2 GB RAM minimum (4 GB with Prometheus + Loki)
- Docker and Docker Compose
- Domain name (e.g., grafana.yourdomain.com)
- 20+ GB disk
## Step 1: Create Docker Compose

Create `docker-compose.yml`:

```yaml
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SERVER_ROOT_URL=https://grafana.yourdomain.com
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=your-admin-password
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=smtp.resend.com:587
      - GF_SMTP_USER=resend
      - GF_SMTP_PASSWORD=re_your_api_key
      - GF_SMTP_FROM_ADDRESS=grafana@yourdomain.com

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'

volumes:
  grafana_data:
  prometheus_data:
```
## Step 2: Configure Prometheus

Create `prometheus.yml`:

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Prometheus self-monitoring
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Server metrics
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

  # Add your application metrics
  # - job_name: 'my-app'
  #   static_configs:
  #     - targets: ['my-app:8080']
  #   metrics_path: /metrics
```
## Step 3: Start the Stack

```bash
docker compose up -d
```
## Step 4: Reverse Proxy (Caddy)

Add a site block to `/etc/caddy/Caddyfile`:

```
grafana.yourdomain.com {
    reverse_proxy localhost:3000
}
```

Then restart Caddy:

```bash
sudo systemctl restart caddy
```
## Step 5: Add Data Sources

- Open https://grafana.yourdomain.com
- Log in with the admin credentials
- Go to Connections → Data sources → Add data source
| Data Source | URL | Use Case |
|---|---|---|
| Prometheus | http://prometheus:9090 | Metrics (CPU, RAM, custom) |
| Loki | http://loki:3100 | Log aggregation |
| PostgreSQL | host:5432 | Database metrics |
| InfluxDB | http://influxdb:8086 | Time series data |
| Elasticsearch | http://elasticsearch:9200 | Logs and search |
| CloudWatch | AWS credentials | AWS metrics |
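Data sources can also be provisioned from files rather than clicked together in the UI, which makes rebuilds reproducible. A sketch, assuming you mount a local `./provisioning` directory to `/etc/grafana/provisioning` in the grafana service:

```yaml
# ./provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```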
## Step 6: Import Pre-Built Dashboards
Grafana has 1000+ community dashboards at grafana.com/grafana/dashboards.
Essential dashboards to import:
| Dashboard ID | Name | For |
|---|---|---|
| 1860 | Node Exporter Full | Server metrics |
| 3662 | Prometheus Overview | Prometheus health |
| 14055 | Docker Containers | Container metrics |
| 12708 | PostgreSQL | Database metrics |
| 763 | Redis | Redis metrics |
To import:
- Dashboards → New → Import
- Enter the dashboard ID
- Select your data source
- Click Import
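Imported dashboards can likewise be provisioned from JSON files on disk, so they survive a rebuild of the container. A sketch, assuming dashboard JSON is stored in a mounted `/var/lib/grafana/dashboards` directory:

```yaml
# ./provisioning/dashboards/default.yml
apiVersion: 1
providers:
  - name: default
    type: file
    options:
      path: /var/lib/grafana/dashboards
```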
## Step 7: Create Custom Dashboards

Example: an application metrics dashboard.

- Dashboards → New Dashboard → Add visualization
- Select the Prometheus data source
- Use PromQL queries:

```promql
# CPU usage per container
rate(container_cpu_usage_seconds_total[5m]) * 100

# Memory usage per container, in MB
container_memory_usage_bytes / 1024 / 1024

# HTTP request rate
rate(http_requests_total[5m])

# HTTP error rate (5xx)
rate(http_requests_total{status=~"5.."}[5m])

# Request latency (p95)
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
```

Note that the `container_*` metrics require a cAdvisor exporter (not part of the stack above), and the `http_*` metrics only exist if your application is instrumented to export them.
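The p95 query relies on `histogram_quantile`, which linearly interpolates within cumulative histogram buckets. A minimal Python sketch of the idea (illustrative only, not the Prometheus implementation; the bucket data is made up):

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
    bound, ending with (inf, total). Mirrors PromQL's approach of
    linear interpolation inside the bucket containing the target rank.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf"):
                # Rank falls in the +Inf bucket: the best we can
                # report is the highest finite bound.
                return prev_bound
            # Interpolate the rank's position within this bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return prev_bound

# 100 requests: 50 took <= 0.1s, 90 took <= 0.5s, all took <= 1.0s
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (float("inf"), 100)]
print(round(histogram_quantile(0.95, latency_buckets), 4))  # 0.75
```

This is also why p95 accuracy depends on bucket boundaries: the estimate can never be more precise than the bucket the rank lands in.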
## Step 8: Set Up Alerting

- Go to Alerting → Alert rules → New alert rule
- Define your condition
Example alerts:
| Alert | Condition | Severity |
|---|---|---|
| High CPU | CPU > 80% for 5 min | Warning |
| Disk full | Disk usage > 90% | Critical |
| Service down | Up metric = 0 for 1 min | Critical |
| High error rate | 5xx > 1% of requests | Warning |
| Memory pressure | RAM > 90% for 10 min | Warning |
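For reference, the conditions in the table translate to PromQL expressions along these lines (a sketch: the `node_*` metrics come from node-exporter, while `http_requests_total` assumes your application exports a counter with a `status` label):

```promql
# High CPU: busy fraction across all cores above 80%
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80

# Disk full: root filesystem above 90%
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 90

# Service down
up == 0

# High error rate: 5xx above 1% of all requests
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01

# Memory pressure: above 90% used
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
```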
- Configure Contact points (where alerts go):
- Email (via SMTP)
- Slack webhook
- Discord webhook
- PagerDuty
- Telegram
## Step 9: Add Log Aggregation with Loki (Optional)

Add to `docker-compose.yml` (and register `loki_data` under the top-level `volumes:` key, alongside `grafana_data` and `prometheus_data`, or Compose will reject the file):

```yaml
  loki:
    image: grafana/loki:latest
    container_name: loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    volumes:
      - loki_data:/loki
    command: -config.file=/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:latest
    container_name: promtail
    restart: unless-stopped
    volumes:
      - /var/log:/var/log:ro
      - ./promtail-config.yml:/etc/promtail/config.yml
    command: -config.file=/etc/promtail/config.yml
```
Create `promtail-config.yml`:

```yaml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*.log
```
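The static `/var/log` scrape covers host logs only. To also ship container logs, Promtail supports Docker service discovery; a sketch, assuming you additionally mount `/var/run/docker.sock` into the promtail container:

```yaml
# Additional entry under scrape_configs in promtail-config.yml
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Label each log stream with its container name (strip the leading /)
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container
```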
## Production Hardening

Backups:

```bash
# Grafana (dashboards, users, settings)
docker cp grafana:/var/lib/grafana/grafana.db /backups/grafana-$(date +%Y%m%d).db

# Prometheus data (if needed; usually regenerated from exporters)
docker run --rm -v prometheus_data:/data -v /backups:/backup alpine \
  tar czf /backup/prometheus-$(date +%Y%m%d).tar.gz /data
```
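To run the Grafana backup nightly and prune old copies, a cron sketch (the schedule, paths, and 14-day retention are examples; note that `%` must be escaped as `\%` in crontab entries):

```
# /etc/cron.d/grafana-backup
0 3 * * * root docker cp grafana:/var/lib/grafana/grafana.db /backups/grafana-$(date +\%Y\%m\%d).db
30 3 * * * root find /backups -name 'grafana-*.db' -mtime +14 -delete
```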
Data retention:

```yaml
# Prometheus: keep 30 days (already set in the Compose command flags above)
command:
  - '--storage.tsdb.retention.time=30d'

# Loki: configure retention in the Loki config
# limits_config:
#   retention_period: 744h  # 31 days
```
Updates:

```bash
docker compose pull
docker compose up -d
```
Security:

- Disable sign-up: `GF_USERS_ALLOW_SIGN_UP=false`
- Set up OIDC auth for team access
- Restrict Prometheus/Loki to the internal network
- Use read-only data source connections
## Resource Usage
| Stack | RAM | CPU | Disk |
|---|---|---|---|
| Grafana only | 512 MB | 1 core | 5 GB |
| Grafana + Prometheus | 2 GB | 2 cores | 20 GB |
| Full stack (+ Loki) | 4 GB | 4 cores | 50 GB |
## VPS Recommendations
| Provider | Spec (full stack) | Price |
|---|---|---|
| Hetzner | 4 vCPU, 8 GB RAM | €8/month |
| DigitalOcean | 2 vCPU, 4 GB RAM | $24/month |
| Linode | 2 vCPU, 4 GB RAM | $24/month |
## Why Self-Host Grafana
Grafana Cloud's free tier limits you to 10,000 active series, 50 GB of logs, and 50 GB of traces — limits that a single production application can exceed within days of launch. The Pro plan starts at $8/month but scales linearly with usage. A team monitoring 5 microservices across staging and production can easily hit $50–100/month just in Grafana Cloud costs, before you factor in separate charges for Prometheus-compatible metric storage.
Self-hosted Grafana is free regardless of how many dashboards, users, or data sources you connect. The only cost is your server — a Hetzner CX21 (€3.79/month, 2 vCPU, 2 GB RAM) handles Grafana for teams of up to 20 users with room to spare. Add Prometheus and Loki on the same server and you have a complete observability stack for under €10/month.
Data sovereignty: Grafana Cloud routes your metrics and logs through Grafana Labs' infrastructure. If you're under GDPR, HIPAA, or SOC 2 compliance requirements, self-hosting keeps sensitive telemetry — which may include user IDs, request paths, and business metrics — within your own infrastructure. Financial services, healthcare, and SaaS companies with enterprise customers increasingly find self-hosting non-negotiable for their monitoring stack.
Plugin ecosystem: Grafana has 300+ community plugins. Some — including specialized data source connectors and custom authentication plugins — are only available when self-hosting. Grafana Cloud restricts plugin installation and frequently limits which plugins are available on lower-tier plans.
When NOT to self-host Grafana: If you have fewer than 5 services to monitor, Grafana Cloud's free tier is probably sufficient. The operational overhead of managing your own Grafana instance — updates, backups, SSL, data retention tuning — isn't worth it at small scale. Also consider the managed option if your team has no DevOps capacity; Grafana on a VPS requires active maintenance and is not "set and forget."
## Prerequisites (Expanded)
Understanding what each requirement actually means before you start helps avoid mid-deployment surprises.
2 GB RAM minimum (4 GB with Prometheus + Loki): The RAM requirement isn't Grafana itself — Grafana alone runs comfortably in 256 MB. The larger footprint comes from Prometheus, which holds its time-series data in memory for fast querying, and Loki, which buffers log ingestion. For a production stack with all three services, plan for 4 GB minimum. If you're only running Grafana with external data sources (like a hosted Prometheus), 1 GB is workable.
20+ GB disk: Prometheus compresses metrics well, but data still accumulates. With a 30-day retention window and moderate scrape frequency (15s), expect 5–15 GB of Prometheus data. Loki log storage varies dramatically by log verbosity — a chatty application can generate gigabytes per day. Start with 40 GB and monitor disk usage with Grafana's Node Exporter dashboard.
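As a rough sanity check, Prometheus's documentation estimates about 1–2 bytes per sample after compression, so disk use can be estimated from active series count, scrape interval, and retention. A back-of-the-envelope sketch (the series count below is illustrative):

```python
def prometheus_disk_gb(active_series, scrape_interval_s=15,
                       retention_days=30, bytes_per_sample=2):
    """Estimate Prometheus disk use: samples/sec x bytes/sample x retention.

    bytes_per_sample=2 is the conservative end of the ~1-2 bytes/sample
    figure from the Prometheus storage docs.
    """
    samples_per_sec = active_series / scrape_interval_s
    total_bytes = samples_per_sec * bytes_per_sample * retention_days * 86_400
    return total_bytes / 1024**3

# e.g. ~30,000 active series (a few node-exporter hosts plus an
# instrumented application) at a 15s scrape interval, 30-day retention
print(round(prometheus_disk_gb(30_000), 1))  # 9.7
```

This lines up with the 5–15 GB range above; double the series count or halve the scrape interval and the estimate doubles.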
Ubuntu 22.04 LTS is the recommended OS: its repositories carry Docker packages that work with Compose v2, it has extensive community troubleshooting documentation, and it receives security patches through 2027. Debian 12 is a solid alternative.
For choosing the right VPS for your observability stack — particularly if you're deciding between Hetzner, DigitalOcean, and Vultr — see the VPS comparison for self-hosters. Network egress pricing matters when exporters are sending metrics from multiple servers.
## Production Security Hardening
Grafana with weak security is a significant risk — it has access to metrics from every service you're monitoring, and those metrics often reveal sensitive operational details. Take the following hardening steps seriously.
Firewall with UFW: Expose only the ports that need to be public. Prometheus (9090) and node-exporter (9100) should never be reachable from the internet.

```bash
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
```

Be aware that ports published by Docker bypass UFW, because Docker inserts its own iptables rules. To keep Prometheus and node-exporter truly internal, remove their `ports:` entries from the Compose file, or bind them to localhost only (e.g. `"127.0.0.1:9090:9090"`).
Fail2ban: Brute-force attacks on Grafana's login page are common once the service is publicly accessible.

```bash
sudo apt install fail2ban
```

Create `/etc/fail2ban/filter.d/grafana.conf`:

```ini
[Definition]
failregex = logger=context userId=0 orgId=0 uname= t=\S+ level=warn msg="Invalid username or password" \S+ remote_addr=<HOST>
ignoreregex =
```
Add to `/etc/fail2ban/jail.local`:

```ini
[grafana]
enabled = true
port = http,https
filter = grafana
logpath = /var/log/grafana/grafana.log
maxretry = 5
bantime = 3600
```

Note that Dockerized Grafana logs to stdout by default; for fail2ban to see a log file, mount a host directory at `/var/lib/grafana`'s log path and set `GF_LOG_MODE=file`, or point `logpath` at wherever your logs actually land.
Keep secrets out of docker-compose.yml: Store GF_SECURITY_ADMIN_PASSWORD, SMTP credentials, and any API keys in a .env file that is excluded from version control. Never commit credentials to a Git repository.
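A sketch of the pattern (the variable names match the Compose file above; Docker Compose reads a `.env` file in the project directory automatically):

```
# .env (add this file to .gitignore)
GF_SECURITY_ADMIN_PASSWORD=your-admin-password
GF_SMTP_PASSWORD=re_your_api_key
```

Then reference the values in `docker-compose.yml` as `- GF_SECURITY_ADMIN_PASSWORD=${GF_SECURITY_ADMIN_PASSWORD}` and so on, so the literal credentials never appear in the committed file.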
Disable SSH password authentication: After confirming key-based SSH access works:

```bash
sudo sed -i 's/#PasswordAuthentication yes/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo systemctl restart sshd
```
Restrict Prometheus access: Prometheus has no authentication by default. Add a reverse proxy with basic auth in front of it, or restrict it to Docker's internal network only (remove the ports entry from the Compose service definition).
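If Prometheus genuinely must be reachable from outside, a Caddy basic-auth sketch (the hostname and bcrypt hash are placeholders; generate a real hash with `caddy hash-password`):

```
prometheus.yourdomain.com {
    basic_auth {
        admin $2a$14$replace-with-your-bcrypt-hash
    }
    reverse_proxy localhost:9090
}
```

The directive is spelled `basic_auth` in Caddy 2.8+; older Caddy 2.x versions use `basicauth`.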
Automatic security updates:

```bash
sudo apt install unattended-upgrades
sudo dpkg-reconfigure --priority=low unattended-upgrades
```
For a complete hardening checklist covering log monitoring, certificate management, and network security for self-hosted services, see the self-hosting security checklist.
## Troubleshooting Common Issues

### Grafana shows "Data source not found" for Prometheus

The most common cause is a typo in the Prometheus URL. When Grafana and Prometheus share a Docker Compose network, the correct URL is http://prometheus:9090, using the service name rather than localhost. If you named the service differently in your Compose file, use that name. Verify connectivity with:

```bash
docker exec grafana wget -qO- http://prometheus:9090/-/healthy
```
### Prometheus can't scrape targets ("connection refused")

Check that the scrape target is actually reachable from inside the Prometheus container. A common mistake is scraping localhost:9100 for node-exporter when the two run in different containers; use the Docker service name (node-exporter:9100) instead. Also confirm the target port isn't firewalled: node-exporter's port 9100 should be reachable on Docker's internal network but not exposed publicly.
### Grafana dashboards show "No data" after importing

Dashboard templates reference a data source by name. If your Prometheus data source is named "Prometheus" but the dashboard template expects "prometheus" (lowercase) or "default", queries return no data. Go to Dashboard settings → Variables and update the data source variable to match your actual data source name.
### Container uses too much disk space

Prometheus stores data in the `prometheus_data` Docker volume. Without a retention policy, it accumulates indefinitely. Check volume sizes with `docker system df -v`, and add `--storage.tsdb.retention.time=30d` to the Prometheus command flags to limit retention. For Loki, configure `retention_period` in the Loki config file.
### Alerts fire but notifications aren't being sent

Test your notification channel from Alerting → Contact points → Test. If the test succeeds but real alerts don't notify, check that the alert rule's notification policy routes to the correct contact point. Also verify the SMTP credentials if using email alerts; authentication failures show up in Grafana's logs:

```bash
docker compose logs grafana | grep -i smtp
```
### Grafana is slow or unresponsive under load

Grafana itself is lightweight. Slowness usually originates from expensive PromQL queries running against large Prometheus datasets. Use the Query Inspector in dashboard edit mode to measure query execution time. Keep `rate()` windows proportionate to your scrape interval, and avoid querying a wider time range than needed. For very large datasets, consider Thanos or Grafana Mimir (the successor to Cortex) as a long-term Prometheus storage backend.
## Extending Grafana with Additional Data Sources
Grafana's power comes from its breadth of data source integrations. Beyond Prometheus and Loki, the plugin ecosystem connects Grafana to dozens of specialized data stores.
InfluxDB is a popular time-series database that pairs naturally with Grafana — many IoT and hardware monitoring setups use InfluxDB + Grafana + Telegraf as an alternative to the Prometheus stack. PostgreSQL and MySQL can be queried directly in Grafana, making it useful for business dashboards that pull from application databases. Elasticsearch integrations bring log search capabilities that complement Loki for cases where full-text search is needed.
Alertmanager integration is where Grafana's alerting capabilities fully emerge. Rather than managing alert rules and routing separately in Alertmanager, Grafana's unified alerting (introduced in Grafana 8 and the default since Grafana 9) centralizes alert definitions, silences, and routing in the Grafana UI. Contact points can route to Slack, PagerDuty, email, OpsGenie, Telegram, or any webhook endpoint. Alert rules can reference any configured data source, so you can alert on Prometheus metrics, Loki log patterns, and database queries from the same interface.
Dashboard sharing deserves mention for teams. Grafana supports public dashboards (accessible without login) for sharing infrastructure status with stakeholders, embedded panels in other web applications via iframe, and snapshot URLs that capture a dashboard's current state for sharing in incident reports. These features make Grafana useful not just for monitoring but as a reporting and communication tool.
For a broader view of observability tooling, see the best open source monitoring tools — Netdata, Zabbix, and Uptime Kuma each serve different niches that complement Grafana's dashboarding layer.
Set up automated server backups with restic to protect your Grafana dashboards and Prometheus data.
Compare monitoring and observability tools on OSSAlt — features, data sources, and pricing side by side.
See open source alternatives to Grafana on OSSAlt.