Self-Hosting Guide: Deploy Grafana for Observability
OSSAlt Team
Tags: grafana, monitoring, self-hosting, docker, guide
Grafana is the industry-standard open source observability platform for dashboards, alerting, and data exploration across metrics, logs, and traces. Self-hosting gives you unlimited dashboards, data sources, and users.
Requirements
- VPS with 2 GB RAM minimum (4 GB with Prometheus + Loki)
- Docker and Docker Compose
- Domain name (e.g., grafana.yourdomain.com)
- 20+ GB disk
Step 1: Create Docker Compose
# docker-compose.yml
services:
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SERVER_ROOT_URL=https://grafana.yourdomain.com
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=your-admin-password
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=smtp.resend.com:587
      - GF_SMTP_USER=resend
      - GF_SMTP_PASSWORD=re_your_api_key
      - GF_SMTP_FROM_ADDRESS=grafana@yourdomain.com

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'

volumes:
  grafana_data:
  prometheus_data:
Step 2: Configure Prometheus
Create prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Prometheus self-monitoring
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Server metrics
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']

  # Add your application metrics
  # - job_name: 'my-app'
  #   static_configs:
  #     - targets: ['my-app:8080']
  #   metrics_path: /metrics
Step 3: Start the Stack
docker compose up -d
Step 4: Reverse Proxy (Caddy)
# /etc/caddy/Caddyfile
grafana.yourdomain.com {
    reverse_proxy localhost:3000
}
sudo systemctl restart caddy
Step 5: Add Data Sources
- Open https://grafana.yourdomain.com
- Log in with the admin credentials from Step 1
- Go to Connections → Data sources → Add data source
| Data Source | URL | Use Case |
|---|---|---|
| Prometheus | http://prometheus:9090 | Metrics (CPU, RAM, custom) |
| Loki | http://loki:3100 | Log aggregation |
| PostgreSQL | host:5432 | Database metrics |
| InfluxDB | http://influxdb:8086 | Time series data |
| Elasticsearch | http://elasticsearch:9200 | Logs and search |
| CloudWatch | AWS credentials | AWS metrics |
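The same data sources can also be declared in a file rather than added through the UI. The sketch below uses Grafana's data source provisioning format. It assumes a file named ./provisioning/datasources/datasources.yml (a hypothetical path) that you mount into the grafana container at /etc/grafana/provisioning/datasources. The Loki entry only applies if you add Loki in Step 9.

```yaml
# ./provisioning/datasources/datasources.yml (hypothetical path, mounted at
# /etc/grafana/provisioning/datasources inside the grafana container)
apiVersion: 1

datasources:
  # Matches the Prometheus row in the table above
  - name: Prometheus
    type: prometheus
    access: proxy          # Grafana's backend makes the requests
    url: http://prometheus:9090
    isDefault: true

  # Matches the Loki row; remove if you skip Step 9
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
```

Grafana reads provisioning files at startup, so restart the grafana container after adding or changing the file.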
Step 6: Import Pre-Built Dashboards
Grafana has 1000+ community dashboards at grafana.com/grafana/dashboards.
Essential dashboards to import:
| Dashboard ID | Name | For |
|---|---|---|
| 1860 | Node Exporter Full | Server metrics |
| 3662 | Prometheus Overview | Prometheus health |
| 14055 | Docker Containers | Container metrics |
| 12708 | PostgreSQL | Database metrics |
| 763 | Redis | Redis metrics |
To import:
- Dashboards → New → Import
- Enter the dashboard ID
- Select your data source
- Click Import
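Dashboards can also be loaded from disk at startup instead of being imported by hand. This is a minimal sketch of a dashboard provider file, assuming you download the dashboard JSON files yourself into a host directory such as ./dashboards (hypothetical) and mount it at /etc/grafana/dashboards. The provider file itself would be mounted under /etc/grafana/provisioning/dashboards.

```yaml
# ./provisioning/dashboards/provider.yml (hypothetical path, mounted at
# /etc/grafana/provisioning/dashboards inside the grafana container)
apiVersion: 1

providers:
  - name: default
    orgId: 1
    folder: ''                  # empty string places dashboards in the root folder
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30   # how often Grafana rescans the directory
    options:
      path: /etc/grafana/dashboards   # assumed mount point for the JSON files
```

Dashboards published on grafana.com often reference a data source placeholder such as ${DS_PROMETHEUS}. When loading them from a file you may need to replace that placeholder with the name or UID of your own data source.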
Step 7: Create Custom Dashboards
Example: Application metrics dashboard
- Dashboards → New Dashboard → Add visualization
- Select Prometheus data source
- Use PromQL queries:
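# CPU usage per container, as a percentage of one core
# (container_* metrics come from cAdvisor, which the Step 1 stack does not include)
rate(container_cpu_usage_seconds_total[5m]) * 100
# Memory usage per container, in MiB
container_memory_usage_bytes / 1024 / 1024
# HTTP request rate (requests per second)
rate(http_requests_total[5m])
# HTTP 5xx rate (requests per second; the status label name depends on your app's instrumentation)
rate(http_requests_total{status=~"5.."}[5m])
# Request latency (p95), aggregated across all series
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))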
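The container_* queries above only return data if something exports container metrics. One common option is cAdvisor. The sketch below is an assumption layered on the Step 1 stack, not part of it: the service name, image tag, and mounts are illustrative, so check them against the cAdvisor release you use.

```yaml
# docker-compose.yml: add under the existing services: key
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest   # assumption: pin a release tag you have verified
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
```

Prometheus then needs a matching scrape job, reaching cAdvisor over the Compose network on its default port 8080:

```yaml
# prometheus.yml: add under the existing scrape_configs: key
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
```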
Step 8: Set Up Alerting
- Go to Alerting → Alert rules → New alert rule
- Define your condition:
Example alerts:
| Alert | Condition | Severity |
|---|---|---|
| High CPU | CPU > 80% for 5 min | Warning |
| Disk full | Disk usage > 90% | Critical |
| Service down | Up metric = 0 for 1 min | Critical |
| High error rate | 5xx > 1% of requests | Warning |
| Memory pressure | RAM > 90% for 10 min | Warning |
- Configure Contact points (where alerts go):
  - Email (via SMTP)
  - Slack webhook
  - Discord webhook
  - PagerDuty
  - Telegram
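The example conditions in the table can be written as PromQL. The sketch below uses the Prometheus rule-file format as a compact notation, which is a swap from the Grafana-managed rules this step describes. In the Grafana UI you would paste each expr into the rule's query and set the pending period to the for value. It assumes the node-exporter metrics from Step 1 and an http_requests_total counter with a status label, as in Step 7. The group and alert names are illustrative.

```yaml
groups:
  - name: example-alerts          # hypothetical group name
    rules:
      # High CPU: more than 80% busy for 5 minutes
      - alert: HighCPU
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning

      # Disk full: root filesystem above 90% (the mountpoint label may differ on your host)
      - alert: DiskFull
        expr: (1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 90
        labels:
          severity: critical

      # Service down: a scrape target has been unreachable for 1 minute
      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical

      # High error rate: 5xx responses exceed 1% of all requests
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
        labels:
          severity: warning

      # Memory pressure: more than 90% of RAM in use for 10 minutes
      - alert: MemoryPressure
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 10m
        labels:
          severity: warning
```

If you load this as an actual Prometheus rule file, it also needs a rule_files entry in prometheus.yml and an Alertmanager to deliver notifications, neither of which this guide sets up.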
Step 9: Add Log Aggregation with Loki (Optional)
Add these services under services: in docker-compose.yml, and declare loki_data under the top-level volumes: key:
  loki:
    image: grafana/loki:latest
    container_name: loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    volumes:
      - loki_data:/loki
    command: -config.file=/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:latest
    container_name: promtail
    restart: unless-stopped
    volumes:
      - /var/log:/var/log:ro
      - ./promtail-config.yml:/etc/promtail/config.yml
    command: -config.file=/etc/promtail/config.yml

volumes:
  grafana_data:
  prometheus_data:
  loki_data:
Create promtail-config.yml:
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*.log
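The config above only tails files under /var/log. If you also want logs from the containers themselves, Promtail's Docker service discovery is one option. This is a sketch under assumptions: it requires mounting /var/run/docker.sock into the promtail container (read-only), and the job name and the container label are illustrative choices.

```yaml
# promtail-config.yml: add under the existing scrape_configs: key
# Assumes /var/run/docker.sock:/var/run/docker.sock:ro is mounted into promtail
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Docker reports names with a leading slash (for example /grafana); strip it
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
```

In Grafana's Explore view, a query such as {container="grafana"} would then show that container's logs.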
Production Hardening
Backups:
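# Grafana (dashboards, users, settings)
# Copying a live SQLite file can catch it mid-write; stop Grafana first if you need a guaranteed-consistent copy
docker cp grafana:/var/lib/grafana/grafana.db /backups/grafana-$(date +%Y%m%d).db
# Prometheus data (optional; without it you lose history, but collection resumes immediately)
# --volumes-from avoids guessing the volume name, which Compose prefixes with the project name
docker run --rm --volumes-from prometheus -v /backups:/backup alpine \
  tar czf /backup/prometheus-$(date +%Y%m%d).tar.gz /prometheus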
Data retention:
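# Prometheus: keep 30 days (already part of the command list in Step 1)
command:
  - '--storage.tsdb.retention.time=30d'
# Loki: configure retention in the Loki config
# (retention_period is applied by the compactor, which must run with retention_enabled: true)
# limits_config:
#   retention_period: 744h  # 31 days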
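For Loki, the retention settings live in Loki's own config file rather than in docker-compose.yml. The fragment below is a sketch to merge into a full Loki config that you mount in place of the built-in /etc/loki/local-config.yaml. Field names vary between Loki versions: delete_request_store is expected by Loki 3.x when retention is enabled, while older 2.x releases used shared_store instead, so treat these keys as assumptions to check against your version's documentation.

```yaml
# Fragment of a Loki config file (not a complete config)
compactor:
  working_directory: /loki/compactor   # inside the loki_data volume
  retention_enabled: true              # without this, retention_period is not enforced
  delete_request_store: filesystem     # assumption: Loki 3.x; 2.x used shared_store

limits_config:
  retention_period: 744h               # 31 days
```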
Updates:
docker compose pull
docker compose up -d
Security:
- Disable sign-up: GF_USERS_ALLOW_SIGN_UP=false
- Set up OIDC auth for team access
- Restrict Prometheus/Loki to internal network
- Use read-only data source connections
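Two of these points can be expressed directly in docker-compose.yml. The sketch below shows only the keys that change. Binding published ports to 127.0.0.1 keeps Prometheus and node-exporter off the public interface while Caddy on the host can still reach Grafana. The OAuth variables use Grafana's generic OAuth settings, and every URL, client ID, and secret shown is a placeholder for values from your own identity provider.

```yaml
# docker-compose.yml: edit the existing services in place
services:
  grafana:
    ports:
      - "127.0.0.1:3000:3000"   # reachable only from the host, where Caddy runs
    environment:
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_AUTH_GENERIC_OAUTH_ENABLED=true
      - GF_AUTH_GENERIC_OAUTH_NAME=SSO
      - GF_AUTH_GENERIC_OAUTH_CLIENT_ID=your-client-id            # placeholder
      - GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET=your-client-secret    # placeholder
      - GF_AUTH_GENERIC_OAUTH_SCOPES=openid profile email
      - GF_AUTH_GENERIC_OAUTH_AUTH_URL=https://idp.example.com/authorize   # placeholder
      - GF_AUTH_GENERIC_OAUTH_TOKEN_URL=https://idp.example.com/token      # placeholder
      - GF_AUTH_GENERIC_OAUTH_API_URL=https://idp.example.com/userinfo     # placeholder

  prometheus:
    ports:
      - "127.0.0.1:9090:9090"   # Grafana still reaches it as prometheus:9090 over the Compose network

  node-exporter:
    ports:
      - "127.0.0.1:9100:9100"
```

Make these changes in the main file rather than in an override file, because Compose appends port mappings from override files instead of replacing them. The existing environment entries from Step 1 (root URL, admin credentials, SMTP) stay as they are.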
Resource Usage
| Stack | RAM | CPU | Disk |
|---|---|---|---|
| Grafana only | 512 MB | 1 core | 5 GB |
| Grafana + Prometheus | 2 GB | 2 cores | 20 GB |
| Full stack (+ Loki) | 4 GB | 4 cores | 50 GB |
VPS Recommendations
| Provider | Spec (full stack) | Price |
|---|---|---|
| Hetzner | 4 vCPU, 8 GB RAM | €8/month |
| DigitalOcean | 2 vCPU, 4 GB RAM | $24/month |
| Linode | 2 vCPU, 4 GB RAM | $24/month |