Best Open Source Datadog Alternatives for Metrics, Logs, and APM in 2026
Datadog costs $15-35/host/month. The best open source alternatives: Grafana+Prometheus, Netdata, SigNoz, VictoriaMetrics, and Zabbix — compared by use case.
TL;DR
Datadog charges $15–35/host/month — $150–350/month for 10 servers. The open source stack covers the same ground for server cost only. For most teams: Grafana + Prometheus + Loki replaces Datadog's metrics and log pipeline; SigNoz replaces Datadog's APM; Netdata provides real-time per-second metrics with zero config. No single tool matches Datadog feature-for-feature, but the combination gets you 90% there at 10% of the cost.
Key Takeaways
- Grafana + Prometheus + Loki: Best metrics + logs stack, ~62K + 55K + 23K stars
- SigNoz: Best APM (distributed tracing + metrics + logs unified), ~19K stars
- Netdata: Best real-time monitoring, ~73K stars, 1-min setup, per-second resolution
- VictoriaMetrics: Best Prometheus alternative for high-cardinality data, ~12K stars
- Zabbix: Best for enterprise/network monitoring, AGPL-3.0 since version 7.0 (GPL-2.0 before), ~8K stars
- Cost: $0 in license fees (you pay only for servers) vs Datadog's $150–350/month for 10 hosts
What Datadog Provides (and OSS alternatives)
| Datadog Feature | Open Source Replacement |
|---|---|
| Infrastructure metrics | Prometheus + Grafana / Netdata |
| Log management | Loki + Grafana / ELK Stack |
| APM (distributed tracing) | SigNoz / Jaeger / Grafana Tempo |
| Dashboards | Grafana |
| Alerting | Grafana Alerts / Alertmanager |
| Container monitoring | cAdvisor + Prometheus |
| Synthetic monitoring | Blackbox Exporter / Checkly (hosted service with an open source CLI) |
| Profiling | Grafana Pyroscope |
1. Grafana + Prometheus + Loki: The Core Stack
The most widely deployed open source monitoring stack. Covers metrics, logs, and dashboards with a consistent UI.
- Prometheus (~55K stars): Metrics collection and storage
- Grafana (~62K stars): Dashboards, visualization, alerting
- Loki (~23K stars): Log aggregation and search
- node_exporter + cAdvisor: Host and container metrics
Deployment: See our full setup guide at /guides/grafana-prometheus-loki-self-hosted-observability-stack-2026.
Cost comparison:
- Datadog Pro: $23/host × 10 hosts = $230/month
- Grafana+Prometheus+Loki: one ~$15/month VPS running the whole stack = $15/month
Gaps vs Datadog:
- No built-in APM / distributed tracing (add SigNoz for this)
- Requires more configuration upfront
- No hosting included: you run and maintain the stack yourself (Grafana Cloud's free tier is the managed alternative)
2. SigNoz: The APM Replacement
SigNoz is an open source APM (Application Performance Monitoring) platform with ~19K GitHub stars. It's the most direct replacement for Datadog APM — unified metrics, logs, and distributed tracing in one UI, built on OpenTelemetry.
What SigNoz Does
- Distributed tracing: Track requests across microservices (like Datadog APM)
- Service maps: Auto-generated topology of your services
- Metrics: Infrastructure and custom application metrics
- Log management: Correlated with traces and metrics
- Alerts: Anomaly detection and threshold alerts
Quick Deploy
# SigNoz ships a multi-service Docker Compose setup with ClickHouse as the storage backend.
# Rather than writing the compose file by hand, clone the repo and run the official installer:
git clone https://github.com/SigNoz/signoz.git
cd signoz/deploy
./install.sh
# Access the UI at http://your-server:3301 (recent releases serve it on port 8080; check the installer output)
Instrument Your App (OpenTelemetry)
SigNoz uses the OpenTelemetry standard — the same SDK works for any OTEL-compatible backend:
// Node.js instrumentation (load this file before the rest of the app):
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

const sdk = new NodeSDK({
  // Traces go to the OTLP/HTTP endpoint of the SigNoz collector
  traceExporter: new OTLPTraceExporter({
    url: 'http://your-signoz-server:4318/v1/traces',
  }),
  // Metrics are exported periodically to the same collector
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://your-signoz-server:4318/v1/metrics',
    }),
  }),
  // Auto-instrument common libraries (HTTP, Express, database clients, ...)
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();
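# Python:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
exporter = OTLPSpanExporter(endpoint="http://your-signoz-server:4318/v1/traces")
trace.set_tracer_provider(TracerProvider())  # register the SDK tracer provider
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(exporter))  # batch spans, then export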
Resource requirements: SigNoz needs at least 4GB RAM just to run, and 8GB is a safer floor for a small production deployment (ClickHouse is resource-intensive).
3. Netdata: Real-Time Monitoring
Netdata (~73K stars) provides per-second metrics with zero configuration. Install in 60 seconds, get instant visibility into CPU, memory, disk, network, Docker containers, Postgres, Redis, Nginx, and 800+ more.
Best for: Real-time operational monitoring — "is the server behaving normally right now?"
# Install:
bash <(curl -Ss https://my-netdata.io/kickstart.sh)
# Immediate dashboard at:
http://your-server:19999
See our full guide: /guides/how-to-self-host-netdata-real-time-server-monitoring-2026
Gaps vs Datadog:
- No long-term metric retention in free version
- No APM / distributed tracing
- Less powerful alerting than Prometheus Alertmanager
4. VictoriaMetrics: High-Performance Prometheus
VictoriaMetrics is a fast, cost-efficient monitoring solution compatible with Prometheus and Grafana. ~12K GitHub stars, Apache 2.0 license.
Why VictoriaMetrics instead of Prometheus
- 5–10x lower storage requirements than Prometheus for the same data
- Better performance at high cardinality (millions of time series)
- Longer retention possible on the same hardware
- Drop-in Prometheus replacement — Grafana, Alertmanager, and all Prometheus integrations work unchanged
Quick Deploy
services:
  victoriametrics:
    image: victoriametrics/victoria-metrics:latest
    ports:
      - "8428:8428"
    volumes:
      - victoriametrics_data:/victoria-metrics-data
    command:
      - '--storageDataPath=/victoria-metrics-data'
      - '--retentionPeriod=12' # 12 months retention
volumes:
  victoriametrics_data: # named volumes must be declared at the top level
Prometheus Config (just add a remote_write URL)
# prometheus.yml: use VictoriaMetrics as remote storage
remote_write:
  - url: http://victoriametrics:8428/api/v1/write
# Or replace Prometheus entirely: VictoriaMetrics can scrape targets itself from a Prometheus-style scrape config (-promscrape.config)
When to use: If Prometheus is struggling with storage costs or high-cardinality metrics from hundreds of services/containers.
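Because VictoriaMetrics exposes the Prometheus querying API, a quick way to confirm that remote-written data has arrived is to run an instant query against it. The following is a minimal sketch using only the Python standard library; the host name `victoriametrics` and port 8428 are assumptions taken from the compose example above, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus-style instant query URL (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def run_instant_query(base_url: str, promql: str, timeout: float = 5.0) -> dict:
    """Run the query and return the decoded JSON response."""
    with urlopen(instant_query_url(base_url, promql), timeout=timeout) as response:
        return json.load(response)


# Assumed host name; replace with the address of your own instance.
print(instant_query_url("http://victoriametrics:8428", "up"))
```

Calling `run_instant_query("http://victoriametrics:8428", "up")` against a live instance should return a JSON document whose `data.result` list contains one entry per scraped target.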
5. Zabbix: Enterprise Infrastructure Monitoring
Zabbix is a mature enterprise monitoring platform with ~8K GitHub stars and 20+ years of production use. It has been licensed under AGPL-3.0 since version 7.0 (earlier releases are GPL-2.0).
What Zabbix Specializes In
- Network monitoring: SNMP support, network device monitoring (switches, routers, firewalls)
- Agent-based monitoring: Zabbix agent installed on servers
- Auto-discovery: Discovers network devices automatically
- Template library: 1,000+ pre-built templates for common software
- Enterprise features: SLA tracking, trend prediction, audit log
When to Choose Zabbix
- You're monitoring network infrastructure (not just servers/containers)
- Your team already knows Zabbix
- You need SNMP monitoring for network devices
- You want a GUI-first monitoring tool (not YAML config)
Gaps: Steeper learning curve, heavier than Netdata, less modern than Grafana.
6. Elastic Stack (ELK): Log-Focused Observability
Elasticsearch + Logstash + Kibana — the original large-scale log management stack.
| Component | Purpose |
|---|---|
| Elasticsearch | Full-text search and analytics engine for logs |
| Logstash | Log ingestion and transformation pipeline |
| Kibana | Visualization and search UI |
| Beats (Filebeat) | Lightweight log shippers |
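License note: Elasticsearch and Kibana moved from Apache 2.0 to the SSPL and the Elastic License (neither is OSI-approved) in 2021, and Elastic added AGPLv3 as a further option in 2024. OpenSearch (the fork started by AWS) remains under Apache 2.0.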
When to use ELK/OpenSearch: You need powerful full-text log search across petabytes of logs. For most self-hosters, Loki + Grafana provides a simpler and lighter alternative.
Decision Guide
For most self-hosted teams (start here):
Grafana + Prometheus + Loki
→ Add Netdata for real-time per-second monitoring
→ Add SigNoz for distributed tracing if needed
For high-cardinality metrics (100K+ time series):
VictoriaMetrics as Prometheus backend
For real-time only (simplest setup):
Netdata → 60 seconds to install, zero config
For APM + distributed tracing:
SigNoz (all-in-one), or Jaeger or Grafana Tempo as a standalone tracing backend
For network/SNMP monitoring:
Zabbix
For log search at scale:
OpenSearch (Apache 2.0 Elasticsearch fork) + OpenSearch Dashboards
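The decision guide above can be read as a small rule set. The sketch below encodes it as a function, purely as an illustration: the need names (`real_time`, `tracing`, and so on) are hypothetical labels invented for this example, and it simplifies the guide by always starting from the core Grafana stack.

```python
def recommend_stack(needs: set[str]) -> list[str]:
    """Map a set of monitoring needs to the tools suggested in the guide."""
    tools = ["Grafana", "Prometheus", "Loki"]  # default starting point
    if "real_time" in needs:
        tools.append("Netdata")           # per-second metrics
    if "tracing" in needs:
        tools.append("SigNoz")            # APM and distributed tracing
    if "high_cardinality" in needs:
        tools.append("VictoriaMetrics")   # Prometheus-compatible storage backend
    if "snmp" in needs:
        tools.append("Zabbix")            # network device monitoring
    if "log_search_at_scale" in needs:
        tools.append("OpenSearch")        # full-text log search
    return tools


print(recommend_stack({"tracing", "snmp"}))
```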
Cost Breakdown: Self-Hosted vs Datadog
Setup: 10 application servers + monitoring server
| Solution | Infrastructure | License | Total/month |
|---|---|---|---|
| Datadog Pro (10 hosts) | Included | $230 | $230 |
| Datadog Enterprise (10 hosts) | Included | $350 | $350 |
| Grafana+Prometheus+Loki | $6–15/month VPS | $0 | $6–15 |
| SigNoz | $8/month VPS | $0 | $8 |
| Netdata | $0 (runs on monitored servers) | $0 | $0 |
| Full self-hosted stack | $15–20/month | $0 | $15–20 |
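The arithmetic behind the table is simple enough to rerun for your own host count. This sketch uses the per-host and VPS prices quoted in this article as inputs; they are illustrative figures, not live pricing, and the function names are hypothetical.

```python
HOSTS = 10
DATADOG_PRO_PER_HOST = 23       # USD per host per month, as quoted above
SELF_HOSTED_VPS_RANGE = (15, 20)  # USD per month for the full self-hosted stack


def datadog_monthly(hosts: int, per_host: float) -> float:
    """Datadog bills per monitored host."""
    return hosts * per_host


def monthly_savings(hosts: int, per_host: float, vps_range: tuple) -> tuple:
    """Return (minimum, maximum) monthly savings in USD, infrastructure only."""
    datadog = datadog_monthly(hosts, per_host)
    cheapest, priciest = vps_range
    return datadog - priciest, datadog - cheapest


low, high = monthly_savings(HOSTS, DATADOG_PRO_PER_HOST, SELF_HOSTED_VPS_RANGE)
print(f"Monthly savings: ${low}-${high}; yearly: ${low * 12}-${high * 12}")
```

Note that this covers infrastructure cost only; engineering time for setup and maintenance is not included.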
Alerting and Incident Response Without Datadog
A monitoring stack without alerting is just a dashboard — the operational value comes from reliable, low-noise alerting that routes the right signal to the right person.
Grafana Alerting (v11+) is now the standard alerting layer for the self-hosted stack. Grafana Alerting supports:
- Alert rules on any Grafana data source (Prometheus, Loki, InfluxDB, PostgreSQL)
- Alert routing to PagerDuty, OpsGenie, Slack, email, and webhook
- Alert grouping and inhibition rules (silence low-priority alerts when a high-severity one fires)
- Multi-dimensional alerts (alert per-service, per-host, per-endpoint — matching Datadog's monitor scoping)
- Notification policies with escalation chains
For teams coming from Datadog, the functional equivalence is high. The configuration workflow differs (Grafana supports YAML file provisioning and Terraform; Datadog monitors are built in a GUI and exported as JSON or managed through Terraform), but the core alert model of conditions, thresholds, and multi-condition rules maps directly. Anomaly detection is the main exception: open source Grafana has no built-in equivalent, so it usually has to be approximated with PromQL.
Alertmanager (the Prometheus-native alerting component) handles routing, grouping, and deduplication of alerts fired by Prometheus alerting rules. For simpler stacks without Grafana, Prometheus alerting rules plus Alertmanager make a low-overhead alerting setup.
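To make grouping and deduplication concrete, here is a toy model in Python. It is not Alertmanager's implementation, only a sketch of the idea: alerts with identical label sets collapse into one, and the survivors are bundled into one notification per distinct value of the `group_by` labels. The alert names and labels are made up for the example.

```python
from collections import defaultdict


def deduplicate(alerts):
    """Collapse alerts with identical label sets, keeping the first one seen."""
    seen = {}
    for alert in alerts:
        fingerprint = tuple(sorted(alert["labels"].items()))
        seen.setdefault(fingerprint, alert)
    return list(seen.values())


def group_alerts(alerts, group_by):
    """Bundle deduplicated alerts by the values of the group_by labels."""
    groups = defaultdict(list)
    for alert in deduplicate(alerts):
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


alerts = [
    {"labels": {"alertname": "HighCPU", "service": "api", "instance": "web-1"}},
    {"labels": {"alertname": "HighCPU", "service": "api", "instance": "web-1"}},  # duplicate
    {"labels": {"alertname": "HighCPU", "service": "api", "instance": "web-2"}},
    {"labels": {"alertname": "DiskFull", "service": "db", "instance": "db-1"}},
]
grouped = group_alerts(alerts, group_by=("alertname", "service"))
for key, members in grouped.items():
    print(key, len(members))
```

Four incoming alerts become two notifications here: one for HighCPU on the api service (two instances) and one for DiskFull on the db service.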
SigNoz includes a built-in alerting UI that creates alerts directly from traces, metrics, and logs in a single interface. For teams migrating from Datadog's APM-integrated monitors (alert when p99 latency > 500ms for service X), SigNoz's correlation between traces and alerts is the closest open source equivalent.
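The "p99 latency > 500ms" condition mentioned above boils down to a percentile check over a window of request latencies. The sketch below shows that logic with a nearest-rank percentile; it illustrates the condition only, not how SigNoz or Datadog evaluates it internally, and the function names are hypothetical.

```python
def percentile(values, pct: int):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile of an empty window is undefined")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100), 1-based
    return ordered[max(rank, 1) - 1]


def latency_alert(latencies_ms, threshold_ms: float = 500.0, pct: int = 99) -> bool:
    """Return True when the chosen percentile exceeds the threshold."""
    return percentile(latencies_ms, pct) > threshold_ms


one_slow_request = [100] * 99 + [900]
two_slow_requests = [100] * 98 + [900, 900]
print(latency_alert(one_slow_request), latency_alert(two_slow_requests))
```

With 100 samples, a single slow request does not move p99 past the threshold, but two do, which is why percentile alerts are less noisy than alerts on the maximum.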
Migrating from Datadog: Practical Steps
The migration path from Datadog follows a parallel-run approach to avoid monitoring gaps:
Step 1: Install the self-hosted stack alongside Datadog. Run Prometheus and the Grafana stack for 1-2 weeks while still paying for Datadog. This validates that your metrics collection is complete and catches any gaps in coverage.
Step 2: Recreate your Datadog dashboards in Grafana. Datadog dashboard exports (JSON) don't import directly to Grafana, but the panel types (timeseries, table, histogram, heatmap) have direct equivalents. For teams with many dashboards, Grafana's provisioning API allows bulk dashboard creation.
Step 3: Migrate Datadog monitors to Grafana alerting rules. Map each Datadog monitor to a Grafana alert rule, adding a Prometheus recording rule where the query is expensive enough to precompute. Confirm that alerts fire correctly by temporarily lowering thresholds.
Step 4: Validate log coverage with Promtail + Loki. Datadog's log management is often the most-used feature after metrics. Loki plus a log shipper replaces log ingestion (Promtail is the classic agent; Grafana has since deprecated it in favor of Grafana Alloy, so prefer Alloy for new installs), and Grafana Explore provides the log search interface. For high-volume log environments, benchmark query performance against your retention requirements.
Step 5: Cancel Datadog. After 2-4 weeks of parallel operation with no gaps, the migration is complete.
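For Step 2, bulk dashboard creation usually means generating dashboard JSON and sending it to Grafana's dashboard HTTP API (`POST /api/dashboards/db`). The sketch below only builds the request body; the panel layout, dashboard title, and PromQL queries are illustrative assumptions, and real dashboards typically need more fields (data source references, units, schema version), so check the output against your Grafana version.

```python
import json


def timeseries_panel(title: str, promql: str, panel_id: int, y: int = 0) -> dict:
    """A minimal full-width timeseries panel with one Prometheus query."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 24, "x": 0, "y": y},
        "targets": [{"refId": "A", "expr": promql}],
    }


def dashboard_payload(title: str, queries: dict, overwrite: bool = False) -> dict:
    """Build the body for POST /api/dashboards/db from {panel title: PromQL}."""
    panels = [
        timeseries_panel(name, expr, panel_id=i + 1, y=i * 8)
        for i, (name, expr) in enumerate(queries.items())
    ]
    return {
        "dashboard": {"id": None, "uid": None, "title": title, "panels": panels},
        "overwrite": overwrite,
    }


payload = dashboard_payload(
    "Host overview",
    {
        "CPU busy": 'avg(rate(node_cpu_seconds_total{mode!="idle"}[5m]))',
        "Memory available": "node_memory_MemAvailable_bytes",
    },
)
print(json.dumps(payload, indent=2))
```

Send the JSON with an `Authorization: Bearer <token>` header from a Grafana service account; looping over a list of exported Datadog dashboards turns this into a bulk import.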
For the complete self-hosting setup, see our Grafana + Prometheus self-hosted observability guide and the best open source monitoring tools roundup for additional tool comparisons.
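For Step 4, one way to validate log coverage is to push a known test line into Loki and then search for it in Grafana Explore. The sketch below builds the JSON body for Loki's push endpoint (`POST /loki/api/v1/push`); the label values and log lines are made up for the example, and the helper name is hypothetical.

```python
import json
import time


def loki_push_body(labels: dict, lines: list, now_ns=None) -> dict:
    """Build a Loki push payload: one stream, one [timestamp_ns, line] pair per log line."""
    start = time.time_ns() if now_ns is None else now_ns
    values = [[str(start + i), line] for i, line in enumerate(lines)]  # timestamps are strings
    return {"streams": [{"stream": labels, "values": values}]}


body = loki_push_body(
    {"job": "migration-test", "host": "web-1"},  # assumed labels
    ["coverage check: line 1", "coverage check: line 2"],
    now_ns=1_700_000_000_000_000_000,
)
print(json.dumps(body))
```

POST the result with `Content-Type: application/json`, then query `{job="migration-test"}` in Explore; if the lines show up, ingestion and labels are working end to end.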
Infrastructure Requirements for Your Monitoring Stack
One practical question when evaluating the migration is what server resources the self-hosted stack requires. Datadog's agent runs on every monitored host and ships data to Datadog's cloud — you pay per host but don't provision monitoring infrastructure yourself. Self-hosting shifts that burden to you.
Grafana + Prometheus stack requirements (monitoring 10-30 servers):
- A single VPS with 2 vCPUs and 4GB RAM handles the Prometheus + Grafana + Loki + Alertmanager stack comfortably at this scale
- Disk: budget 10-20GB for the Prometheus TSDB at the default 15-day retention (usage levels off once old blocks start expiring), plus 5-15GB per month of retained Loki logs depending on verbosity
- At 30-50 servers: upgrade to 4 vCPUs / 8GB RAM; consider VictoriaMetrics as the Prometheus backend for its better compression and lower memory footprint
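To sanity-check the disk figures for your own fleet, the Prometheus documentation gives a rough sizing rule: retention time in seconds × ingested samples per second × bytes per sample, with roughly 1-2 bytes per sample after compression. The sketch below applies it; the series count per host and the scrape interval are assumptions you should replace with your own numbers.

```python
SECONDS_PER_DAY = 86_400


def prometheus_disk_bytes(active_series: int, scrape_interval_s: float,
                          retention_days: float, bytes_per_sample: float = 2.0) -> float:
    """Rough TSDB size: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


# Assumption: 20 hosts exposing about 1,500 series each, scraped every 15 seconds.
estimate = prometheus_disk_bytes(active_series=20 * 1500, scrape_interval_s=15, retention_days=15)
print(f"{estimate / 1e9:.2f} GB")
```

Container-heavy hosts or verbose application metrics can multiply the series count, so treat the result as a floor and leave headroom for the write-ahead log and compaction.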
Netdata (agent-based, runs on each monitored server):
- Netdata runs on the same server it monitors, adding only 1-2% CPU overhead and 100-200MB RAM per host
- No separate monitoring server needed for standalone Netdata — each node has its own dashboard
- For centralized dashboards across many Netdata nodes: Netdata Cloud (free tier available) or Grafana federation
SigNoz (all-in-one APM):
- Minimum 4 vCPUs / 8GB RAM for the SigNoz stack (ClickHouse-backed trace storage is resource-intensive)
- Recommended 8 vCPUs / 16GB RAM for production workloads with >10 instrumented services
- Docker Compose deployment is well-documented; Kubernetes Helm chart available for larger deployments
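The infrastructure cost of $6-20/month is the headline number, but don't ignore the operational time budget: expect 4-8 hours of initial setup, then 1-2 hours per month for maintenance (updates, retention tuning, dashboard additions). Price those hours at your own engineering rate: at 10 hosts they can absorb a large share of the savings against a $230/month Datadog bill, while at larger fleets Datadog's per-host pricing keeps growing and the maintenance time stays roughly flat.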
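A quick way to weigh maintenance time against the Datadog bill is to compute the hourly rate at which the two options cost the same. This is a sketch using the figures quoted in this article as example inputs; it ignores the one-time setup hours and any value placed on data ownership, and the function names are hypothetical.

```python
def self_hosted_monthly(infra_usd: float, maintenance_hours: float, hourly_rate: float) -> float:
    """Monthly cost of self-hosting: servers plus engineering time."""
    return infra_usd + maintenance_hours * hourly_rate


def break_even_hourly_rate(datadog_usd: float, infra_usd: float, maintenance_hours: float) -> float:
    """Hourly rate at which self-hosting costs the same as Datadog."""
    return (datadog_usd - infra_usd) / maintenance_hours


# Example inputs: $230/month Datadog Pro for 10 hosts, $20/month VPS, 2 hours of upkeep.
rate = break_even_hourly_rate(datadog_usd=230, infra_usd=20, maintenance_hours=2)
print(f"Break-even engineering rate: ${rate:.0f}/hour")
print(f"Self-hosted at $75/hour: ${self_hosted_monthly(20, 2, 75):.0f}/month")
```

Because the Datadog side scales with host count while the maintenance hours stay roughly constant, rerunning this with 30 or 50 hosts pushes the break-even rate up sharply.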
One underrated benefit of self-hosting observability: your monitoring data is yours indefinitely. Datadog's retention is limited by plan tier — Infrastructure Pro retains metrics for 15 months maximum. Grafana + Prometheus with a VictoriaMetrics long-term storage backend can retain years of metric history at minimal cost ($5-10/month for a high-density storage VPS). For teams that need historical trending for capacity planning, compliance audits, or incident post-mortems, unlimited retention is a structural advantage that no Datadog tier can match at equivalent cost.
For teams starting the migration, Netdata's self-hosting guide covers the quickest path to immediate monitoring coverage while the longer-term Grafana stack is being configured.
Compare all open source Datadog alternatives at OSSAlt.com/alternatives/datadog.