Best Open Source Alternatives to Datadog in 2026

Datadog's per-host pricing starts out reasonable, then you add APM, log management, RUM, and synthetics, each billed separately. Suddenly you're paying $23-65/host/month per product. Open source observability has matured dramatically, and the Grafana stack (Grafana, Prometheus, Loki, Tempo) now covers every major Datadog capability.

TL;DR

The Grafana stack (Grafana + Prometheus + Loki + Tempo) is the most complete Datadog replacement: metrics, logs, traces, and dashboards in one ecosystem. SigNoz offers a single-platform alternative with Datadog-like UX built on OpenTelemetry. Uptrace is the lightweight option for smaller teams.

Key Takeaways

Grafana stack is the industry standard — used by thousands of companies, massive community, handles any scale
SigNoz is the closest UX match to Datadog — single platform for metrics, traces, and logs with built-in dashboards
Prometheus is unmatched for metrics — the CNCF standard, supported by every cloud-native tool
The cost difference is dramatic — Datadog at 50 hosts with APM + logs costs $50K-150K/year; self-hosting costs $5K-15K/year
OpenTelemetry is the key — vendor-neutral instrumentation means you can switch backends without changing application code
Trade-off: You manage infrastructure; Datadog manages it for you

The Comparison

Feature	Datadog	Grafana Stack	SigNoz	Uptrace
Price	$15-65/host/mo	Free (OSS)	Free (OSS)	Free (OSS)
Metrics	✅	Prometheus/Mimir	✅	✅
Logs	✅	Loki	✅	✅
Traces	✅	Tempo	✅	✅
Dashboards	✅	Grafana (best)	✅	✅
Alerting	✅	✅	✅	✅
APM	✅	Tempo + Grafana	✅	✅
RUM	✅	Faro	Coming	❌
Synthetics	✅	k6	❌	❌
Profiling	✅	Pyroscope	❌	❌
OpenTelemetry	✅	✅	✅ (native)	✅ (native)
Single binary	N/A (SaaS)	No (multiple)	Yes	Yes
Setup complexity	Low	Medium-High	Low	Low

1. The Grafana Stack (LGTM)

The complete open source observability platform.

The Grafana ecosystem provides a component for every observability pillar:

Component	Role	Replaces
Grafana	Dashboards & visualization	Datadog dashboards
Prometheus	Metrics collection & storage	Datadog metrics
Loki	Log aggregation	Datadog logs
Tempo	Distributed tracing	Datadog APM
Mimir	Long-term metrics storage	Datadog metrics (at scale)
Pyroscope	Continuous profiling	Datadog profiling
k6	Load testing & synthetics	Datadog synthetics
Faro	Frontend monitoring	Datadog RUM
Alloy	Telemetry collector	Datadog Agent

Quick Setup

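# docker-compose.yml: minimal LGTM stack for local evaluation
# The top-level `version` key is obsolete in Compose v2 and is omitted here.
# Pin image tags instead of `latest` for anything long-lived.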
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"

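  tempo:
    image: grafana/tempo:latest
    # Tempo is normally started with an explicit config file; ./tempo.yaml is
    # assumed to exist and to enable the OTLP receiver used on port 4317.
    command: ["-config.file=/etc/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
    ports:
      - "3200:3200"
      - "4317:4317"  # OTLP gRPC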

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      # Anonymous access is for local evaluation only; disable it before exposing Grafana
      - GF_AUTH_ANONYMOUS_ENABLED=true
    depends_on:
      - prometheus
      - loki
      - tempo

Strengths

Community: Largest observability community, thousands of pre-built dashboards
Flexibility: Mix and match components, replace any piece
Scale: Mimir is built for very large deployments; Grafana Labs reports testing it at 1 billion active series, and it powers Grafana Cloud
Dashboards: Grafana dashboards are the gold standard — nothing else comes close
Ecosystem: Native integrations with every cloud-native tool

Trade-offs

Multiple components to deploy and manage
More operational overhead than a single-binary solution
Learning curve for PromQL, LogQL, TraceQL

Best for: Organizations with DevOps/SRE teams, large-scale infrastructure, anyone already familiar with Prometheus.

2. SigNoz

The Datadog-like experience, fully open source.

GitHub: 20K+ stars
Stack: Go, React, ClickHouse
License: AGPL-3.0 (recently changed from MIT)
Deploy: Docker, Helm, manual

SigNoz is the closest thing to a drop-in Datadog replacement. It's a single platform — not a collection of tools — with built-in dashboards for metrics, traces, and logs. The UX feels familiar to anyone coming from Datadog.

Standout features:

Unified metrics, traces, and logs in one UI
Built on OpenTelemetry (native OTLP support)
ClickHouse backend for fast queries at scale
Service maps and dependency graphs
Custom dashboards with query builder
Alert rules with multiple channels (Slack, PagerDuty, email)
Exceptions tracking
Infrastructure monitoring

Quick Setup

git clone https://github.com/SigNoz/signoz.git
cd signoz/deploy
docker compose -f docker/clickhouse-setup/docker-compose.yaml up -d

Instrumenting Your App

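// OpenTelemetry setup: works with SigNoz, Grafana, or any OTLP backend
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';

const sdk = new NodeSDK({
  // 'signoz' is the collector hostname in this example; 4318 is the OTLP/HTTP port
  traceExporter: new OTLPTraceExporter({
    url: 'http://signoz:4318/v1/traces',
  }),
  // NodeSDK expects a metric reader wrapping the exporter, not a bare exporter
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://signoz:4318/v1/metrics',
    }),
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

// Start the SDK before the rest of the app loads so auto-instrumentation can patch modules
sdk.start();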

Best for: Teams wanting a single-platform experience, organizations without dedicated SRE teams, anyone migrating from Datadog who wants familiar UX.

3. Uptrace

Lightweight observability with OpenTelemetry.

GitHub: 3K+ stars
Stack: Go, Vue.js, ClickHouse
License: BSL 1.1
Deploy: Docker, binary

Uptrace is leaner than SigNoz — fewer features, but simpler to deploy and operate. It focuses on traces and metrics with a clean interface. Good for smaller teams that don't need every bell and whistle.

Standout features:

Distributed tracing with service graphs
Metrics with dashboards
Log management
Alerting with notification channels
SQL-based query language (familiar for most developers)
Single binary deployment option

Best for: Small-to-medium teams, simpler architectures, teams wanting the lightest possible observability setup.

Cost Comparison

Scenario	Datadog	Grafana Stack	SigNoz
10 hosts, metrics only	$230/month	$50/month (VPS)	$30/month (VPS)
10 hosts + APM + logs	$1,150/month	$100/month	$50/month
50 hosts + APM + logs	$5,750/month	$300/month	$200/month
100 hosts, full stack	$15,000+/month	$800/month	$500/month
Annual savings (50 hosts)	—	$65,400/year	$66,600/year

Self-hosted figures above are server infrastructure estimates; the full cost of self-hosting is server infrastructure plus engineer time. Datadog figures are subscription only.
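The annual savings row can be checked directly from the monthly figures. The sketch below only reuses the numbers from the 50-host row of the table; the variable and function names are illustrative.

```typescript
// Monthly figures copied from the "50 hosts + APM + logs" row of the table.
const datadogMonthly = 5750;
const grafanaMonthly = 300;
const signozMonthly = 200;
const hosts = 50;

// Annual savings = (SaaS monthly - self-hosted monthly) * 12 months.
function annualSavings(saasMonthly: number, selfHostedMonthly: number): number {
  return (saasMonthly - selfHostedMonthly) * 12;
}

const grafanaSavings = annualSavings(datadogMonthly, grafanaMonthly);
const signozSavings = annualSavings(datadogMonthly, signozMonthly);
const datadogPerHost = datadogMonthly / hosts;

console.log(grafanaSavings); // 65400
console.log(signozSavings);  // 66600
console.log(datadogPerHost); // 115 (USD per host per month implied by the table)
```

Swapping in your own host count and quotes gives a quick first-pass comparison; engineer time is not part of this arithmetic.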

The OpenTelemetry Advantage

The key insight: instrument with OpenTelemetry, then choose your backend. OTel is vendor-neutral — the same instrumentation code works with Grafana, SigNoz, Datadog, or any OTLP-compatible platform.

This means:

Instrument your app once with OpenTelemetry SDKs
Send data to your open source backend
If you outgrow self-hosted, switch to a managed backend without code changes
If you already send OpenTelemetry data to Datadog, your instrumentation carries over when you switch
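One way to keep the backend swappable is to read the OTLP endpoint from configuration rather than hard-coding it. The sketch below is a minimal illustration, not part of any SDK: `resolveOtlpUrls` is a hypothetical helper, `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OpenTelemetry environment variable, and the `alloy` hostname is an assumption.

```typescript
// Environment passed in explicitly so the helper is easy to test.
type Env = Record<string, string | undefined>;

interface OtlpUrls {
  traces: string;
  metrics: string;
  logs: string;
}

// Hypothetical helper: derive per-signal OTLP/HTTP URLs from one base endpoint.
function resolveOtlpUrls(env: Env, fallback: string = "http://localhost:4318"): OtlpUrls {
  const configured = env["OTEL_EXPORTER_OTLP_ENDPOINT"];
  // Strip trailing slashes so the joined paths stay well-formed.
  const base = (configured !== undefined ? configured : fallback).replace(/\/+$/, "");
  return {
    traces: base + "/v1/traces",
    metrics: base + "/v1/metrics",
    logs: base + "/v1/logs",
  };
}

// Same application code, two different backends (hostnames are examples).
const toSigNoz = resolveOtlpUrls({ OTEL_EXPORTER_OTLP_ENDPOINT: "http://signoz:4318/" });
const toAlloy = resolveOtlpUrls({ OTEL_EXPORTER_OTLP_ENDPOINT: "http://alloy:4318" });

console.log(toSigNoz.traces); // http://signoz:4318/v1/traces
console.log(toAlloy.metrics); // http://alloy:4318/v1/metrics
```

With this shape, moving between backends becomes a deployment setting rather than a code change. The OpenTelemetry SDKs also read this environment variable themselves when no URL is passed to the exporter.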

Decision Guide

Choose the Grafana Stack if:

You have a DevOps/SRE team to manage it
You want the most flexible, component-based approach
Grafana dashboards are important to you
You need to scale to hundreds of hosts
You want the largest community and ecosystem

Choose SigNoz if:

You want a single platform (not multiple components)
You're coming from Datadog and want familiar UX
You don't have a dedicated SRE team
ClickHouse performance for log/trace queries matters
You want the simplest path to full observability

Choose Uptrace if:

You have a small infrastructure (< 20 hosts)
You want the lightest possible solution
SQL-based querying is more comfortable than PromQL
You need something running quickly with minimal setup

Observability at Scale: When Self-Hosting Gets Complex

The tools listed above work well for typical infrastructure (under 50 hosts, moderate log volume, standard metrics). At scale, self-hosted observability introduces operational complexity that warrants specific attention before you commit to the architecture.

Prometheus cardinality limits. Prometheus stores metrics as time series, and each unique combination of metric name and label values creates a new time series. High cardinality (many unique label values) consumes disproportionate memory and slows queries. Common cardinality traps: using user IDs, request IDs, or customer names as label values. A single metric labeled with 100,000 unique user IDs creates at least 100,000 time series, and that count multiplies with every other label on the metric. Because Prometheus keeps every active series in memory, RAM becomes the limiting factor at high cardinality. Solutions include VictoriaMetrics (a Prometheus-compatible TSDB with significantly better cardinality handling) or Thanos, Cortex, or Mimir for horizontally scaling Prometheus storage.
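The multiplication is easy to underestimate, so here is a small worked sketch. It gives an upper bound, assuming every label combination actually occurs; the label names and counts are hypothetical.

```typescript
// Number of distinct values each label can take, keyed by label name.
type LabelCardinality = Record<string, number>;

// Upper bound on series for one metric: the product of per-label cardinalities.
function estimateSeries(labels: LabelCardinality): number {
  let total = 1;
  for (const name of Object.keys(labels)) {
    total *= labels[name];
  }
  return total;
}

// Hypothetical HTTP request counter with bounded labels.
const bounded = estimateSeries({ method: 5, status: 6, route: 40 });

// The same metric after adding a user_id label with 100,000 distinct values.
const withUserId = estimateSeries({ method: 5, status: 6, route: 40, user_id: 100000 });

console.log(bounded);    // 1200
console.log(withUserId); // 120000000
```

Real series counts are usually lower than this bound because not every combination appears, but the estimate is a useful check before adding a label.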

Log volume and retention economics. Loki keeps log storage cheap by indexing only metadata (labels), not log content, and storing compressed log chunks in object storage (S3, GCS, or a local filesystem). This makes Loki much more cost-effective for log retention than Elasticsearch, which indexes the full text of every log line. For 100 GB/day of logs retained for 30 days (about 3 TB raw, considerably less once the chunks are compressed), Loki on object storage costs approximately $5-15/month in storage. Elasticsearch on an equivalent disk costs 10-20x more. For teams migrating from Elasticsearch-based log stacks, the economic case for Loki is compelling.
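The storage estimate can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the 10x compression ratio and the $0.023 per GB-month object storage price are illustrative inputs, not measured values, and request and index costs are ignored.

```typescript
interface RetentionScenario {
  ingestGbPerDay: number;
  retentionDays: number;
  compressionRatio: number; // assumed; depends heavily on log content
  pricePerGbMonth: number;  // assumed object storage price in USD
}

// Steady-state monthly storage cost once the retention window is full.
function monthlyStorageCostUsd(s: RetentionScenario): number {
  const rawGb = s.ingestGbPerDay * s.retentionDays;
  const storedGb = rawGb / s.compressionRatio;
  return storedGb * s.pricePerGbMonth;
}

// 100 GB/day for 30 days, as in the scenario above.
const compressed = monthlyStorageCostUsd({
  ingestGbPerDay: 100,
  retentionDays: 30,
  compressionRatio: 10,
  pricePerGbMonth: 0.023,
});

// Same volume with no compression, for comparison.
const uncompressed = monthlyStorageCostUsd({
  ingestGbPerDay: 100,
  retentionDays: 30,
  compressionRatio: 1,
  pricePerGbMonth: 0.023,
});

console.log(compressed.toFixed(2));   // 6.90
console.log(uncompressed.toFixed(2)); // 69.00
```

The result is sensitive to the compression ratio, so it is worth measuring on a sample of your own logs before relying on the estimate.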

ClickHouse as the observability database. SigNoz uses ClickHouse as its backend — and ClickHouse's columnar storage makes it exceptionally fast for log and trace queries at scale. Full-text search across billions of log lines in under a second is realistic on modest hardware. If your log queries are slow on Loki or if you need trace analytics across millions of spans, SigNoz's ClickHouse backend is the right architecture. The tradeoff: ClickHouse requires more operational expertise than Loki's simpler model.

Distributed tracing and the missing context problem. Metrics and logs tell you what happened and when. Distributed tracing tells you why: the complete call chain across microservices that led to a slow request or error. Prometheus and Grafana on their own do not include distributed tracing; in the Grafana stack, Tempo fills that role. SigNoz includes tracing via OpenTelemetry. Uptrace supports traces, metrics, and logs in one system. If your application is microservices-based and you struggle to attribute latency or errors across service boundaries, distributed tracing is essential.

Alert fatigue management at scale. As you add more services and more metrics, the number of possible alert conditions grows faster than your team's ability to respond. Prometheus Alertmanager's inhibition rules and grouping reduce noise, but they require intentional configuration. Review alert rules quarterly: delete alerts that fire but never result in action, raise thresholds for alerts that fire too frequently to be meaningful, and add routing rules so alerts reach the right team rather than a single shared channel. An untended alert configuration degrades over time.

Synthetic monitoring for external validation. Most of the tooling described above monitors from inside your infrastructure. Synthetic monitoring validates from the outside: it runs scheduled HTTP checks against your application's public endpoints from external locations to verify that users can actually reach your service. Uptime Kuma handles this for basic HTTP/TCP checks. For more sophisticated synthetic monitoring (multi-step flows, simulated user journeys, API endpoint chaining), Playwright scripts scheduled via cron provide comparable capability to Datadog Synthetics. The key is monitoring from outside your infrastructure so that network issues, DNS failures, or CDN problems that don't affect internal metrics are still caught.
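A basic external check is only a few lines. The sketch below is illustrative rather than a replacement for a dedicated tool: `runCheck`, the 2000 ms latency budget, and the rule that any 2xx or 3xx status counts as healthy are all assumptions. The HTTP client is injected so the logic can be exercised without network access.

```typescript
// Minimal shape of what the check needs from an HTTP response.
interface HttpResponseLike {
  status: number;
}

type Fetcher = (url: string) => Promise<HttpResponseLike>;

interface CheckResult {
  url: string;
  healthy: boolean;
  status: number | null; // null when the request itself failed (DNS, TLS, timeout)
  latencyMs: number;
  error?: string;
}

// Hypothetical check: healthy means a 2xx/3xx status within the latency budget.
async function runCheck(
  url: string,
  fetcher: Fetcher,
  maxLatencyMs: number = 2000
): Promise<CheckResult> {
  const started = Date.now();
  try {
    const response = await fetcher(url);
    const latencyMs = Date.now() - started;
    const statusOk = response.status >= 200 && response.status < 400;
    return { url, healthy: statusOk && latencyMs <= maxLatencyMs, status: response.status, latencyMs };
  } catch (err) {
    // Failures before any response arrives are exactly what internal metrics miss.
    return { url, healthy: false, status: null, latencyMs: Date.now() - started, error: String(err) };
  }
}

// Usage on Node 18+, where fetch is global (URL is a placeholder):
// runCheck("https://example.com/health", (u) => fetch(u)).then(console.log);
```

Run it on a schedule from a machine outside your own network, and send unhealthy results to the same alerting channel as your internal alerts.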

Dashboard sprawl and governance. As self-hosted Grafana matures, dashboard proliferation becomes a problem. Every team creates their own dashboards; dashboards become stale when services change; no one knows which dashboard is canonical for a given service. Establish dashboard governance: a standard template for service-level dashboards (latency, error rate, throughput, saturation — the RED/USE methodology), a convention for dashboard naming (prefix with team name), and a process for archiving dashboards when services are decommissioned. Grafana's folder structure helps organize by team; combined with dashboard permissions, it prevents unauthorized modification of shared dashboards.

For the step-by-step Grafana + Prometheus + Loki setup, see Grafana + Prometheus + Loki self-hosted observability stack 2026. For the broader monitoring tools comparison including Uptime Kuma and NetData, see best open source monitoring tools 2026. For server sizing and cost planning for observability infrastructure, see self-hosting VPS comparison 2026.

Compare open source monitoring and observability tools on OSSAlt — features, deployment complexity, and community health side by side.