Angga.

Observability on one EC2 box: the Grafana stack

Self-hosting traces, logs, and metrics on a single EC2 instance with Grafana, Alloy, Tempo, and Loki — how the pieces fit, the push-vs-pull split that trips people up, wiring it for a Go backend, and the caveats nobody mentions.

Datadog is wonderful until the invoice arrives. For a side project or a small production service, you can run a genuinely good observability stack yourself on a single EC2 box — traces, logs, and metrics, all correlated, queried from one UI. The pieces come from Grafana Labs and go by the initialism LGTM: Loki, Grafana, Tempo, Mimir — plus OpenTelemetry as the glue.

This is how the pieces actually fit, the push-vs-pull distinction that trips people up, how I'd wire it for a Go backend, and the caveats that matter before you lean on a single node.

The cast

Grafana — the front door

The visualization and dashboarding layer: the UI where you query, explore, and alert. It connects to data sources (Tempo, Loki, Prometheus) and renders dashboards. It stores no telemetry itself, only its own dashboards and settings; it's a query-and-render layer over the backends.

Grafana Alloy — the collector

The unified telemetry collector, successor to Grafana Agent. It runs on the box alongside your app and does three things: receives telemetry (traces, logs, metrics), processes it (filter, batch, relabel, enrich), and forwards it to the right backend. Config is written in the Alloy syntax (.alloy files) as a set of pipelines. Mental model: it's the plumbing between your app and storage.

OpenTelemetry — the instrumentation standard

Not a server — a protocol + SDK ecosystem. Your app uses OTel SDKs to emit traces, metrics, and logs over the OTLP protocol. Alloy has a built-in OTLP receiver (otelcol.receiver.otlp) that accepts this and routes it into the pipeline. The win of OTel is vendor-neutral instrumentation: instrument once, swap backends later without touching app code.

Grafana Tempo — traces

The distributed tracing backend. Stores spans, queried with TraceQL. It's cheap because it writes traces to object storage (S3, or plain local disk on a single node) and doesn't index span content. Correlates with Loki for trace-to-log jumps.

Grafana Loki — logs

The log aggregation backend — "like Prometheus, but for logs." The key trick: it indexes only labels, not the full log content, which makes it very storage-efficient. Queried with LogQL. Alloy forwards logs via the loki.write component. The killer feature: a trace ID in a log line lets you jump from a slow trace in Tempo straight to the logs for that exact request.

(Metrics go to Prometheus or Mimir — optional here, and scraped rather than pushed.)

How they fit together

Step by step:

  • Your app emits traces and logs via the OTel SDK over OTLP, and exposes metrics on a /metrics endpoint.
  • Alloy receives the pushed traces and logs, scrapes the metrics, batches and relabels, and fans out to the backends.
  • Tempo stores traces; Loki stores logs; Prometheus/Mimir stores metrics.
  • Grafana queries all three and renders them in one place.
  • Click a slow trace in Tempo, jump straight to the correlated logs in Loki via the shared trace ID.
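
That last jump only works if the log line actually carries the trace ID. Here is a minimal sketch of the idea using only the standard library (Go 1.21+ for log/slog). The helpers traceIDKey, withTraceID, logWithTrace, and renderLine are hypothetical names, and the trace ID is passed in by hand; with the OTel SDK it would normally come from the active span, for example via trace.SpanContextFromContext(ctx).TraceID().String().

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
)

// traceIDKey is a hypothetical context key standing in for the OTel span context.
type traceIDKey struct{}

// withTraceID stores a trace ID on the context (the OTel SDK does this for you).
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceIDKey{}, id)
}

// logWithTrace writes msg and attaches trace_id when the context carries one.
func logWithTrace(ctx context.Context, logger *slog.Logger, msg string) {
	if id, ok := ctx.Value(traceIDKey{}).(string); ok && id != "" {
		logger = logger.With("trace_id", id)
	}
	logger.InfoContext(ctx, msg)
}

// renderLine returns the JSON log line for one message. The timestamp is
// dropped only so the output is stable enough to inspect.
func renderLine(traceID, msg string) string {
	var buf bytes.Buffer
	h := slog.NewJSONHandler(&buf, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if a.Key == slog.TimeKey && len(groups) == 0 {
				return slog.Attr{} // a zero Attr is discarded
			}
			return a
		},
	})
	logWithTrace(withTraceID(context.Background(), traceID), slog.New(h), msg)
	return buf.String()
}

func main() {
	fmt.Print(renderLine("4bf92f3577b34da6a3ce929d0e0e4736", "payment declined"))
}
```

Because trace_id sits in the log line rather than in a Loki label, Grafana can match on it without inflating the index.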

Push vs pull: the distinction that trips people up

Worth slowing down on, because this stack uses both.

Pull-based

The collector goes to the source and fetches on a schedule.

[Your app]  <-- scrape every 15s --  [Alloy / Prometheus]
 exposes /metrics
  • The app exposes a /metrics endpoint; the collector scrapes it periodically.
  • The collector controls the timing.
  • Pros: a failed scrape is an obvious "service is down" signal; the collector controls the rate so it can't be flooded; new targets are just config.
  • Cons: the app must expose a reachable endpoint; awkward behind NAT/firewalls, or for short-lived jobs (lambdas, batch).
  • Used by: Prometheus metrics, Alloy scraping.
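
To make the pull side concrete, here is a minimal standard-library sketch of an app exposing a counter in the Prometheus text format, plus a scrape function playing the collector's role. The metric name app_requests_total and the helpers renderMetrics, metricsHandler, and scrape are illustrative assumptions; a real service would more likely use an established Prometheus client library than hand-format the output.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// requestsTotal is a hypothetical counter the app increments as it works.
var requestsTotal atomic.Int64

// renderMetrics returns the counter in the Prometheus text exposition format.
func renderMetrics() string {
	return fmt.Sprintf(
		"# TYPE app_requests_total counter\napp_requests_total %d\n",
		requestsTotal.Load(),
	)
}

// metricsHandler serves /metrics. The app never initiates anything; it only
// answers when the collector asks.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	io.WriteString(w, renderMetrics())
}

// scrape plays the collector: one HTTP GET, on the collector's schedule.
func scrape(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err // a failed scrape is itself the "target is down" signal
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	// A throwaway test server stands in for the app's real listener.
	srv := httptest.NewServer(http.HandlerFunc(metricsHandler))
	defer srv.Close()

	requestsTotal.Add(3) // pretend the app served three requests
	out, err := scrape(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```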

Push-based

The source sends to the collector.

[Your app]  ---- push on event/interval ---->  [Alloy / backend]
  • The app actively pushes telemetry; the collector just listens.
  • The app controls the timing.
  • Pros: works anywhere — behind NAT, serverless, short-lived jobs; lower latency (sent on event); no endpoint to expose.
  • Cons: silent failure is harder to detect (the app stops pushing but looks fine); the app controls the rate, so it can overwhelm the collector.
  • Used by: OTLP traces and logs from the OTel SDK, Loki's push API.
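
The "app controls the rate" problem is usually handled with a bounded buffer on the sending side. The sketch below shows the idea with the standard library only; pusher, enqueue, and drain are hypothetical names, not an OTel API. As far as I know, the OTel SDK's batch span processor takes a similar approach by default: a fixed-size queue that drops when full rather than blocking the request path.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pusher is a hypothetical bounded buffer between the app and the collector.
type pusher struct {
	queue   chan string
	dropped atomic.Int64
}

func newPusher(capacity int) *pusher {
	return &pusher{queue: make(chan string, capacity)}
}

// enqueue never blocks the caller: if the buffer is full, the event is
// dropped and counted instead of stalling the request.
func (p *pusher) enqueue(event string) bool {
	select {
	case p.queue <- event:
		return true
	default:
		p.dropped.Add(1)
		return false
	}
}

// drain removes up to max queued events and returns them as one batch,
// ready to be sent to the collector in a single request.
func (p *pusher) drain(max int) []string {
	batch := make([]string, 0, max)
	for len(batch) < max {
		select {
		case ev := <-p.queue:
			batch = append(batch, ev)
		default:
			return batch // queue is empty
		}
	}
	return batch
}

func main() {
	p := newPusher(2)
	for i := 1; i <= 3; i++ {
		p.enqueue(fmt.Sprintf("event-%d", i))
	}
	batch := p.drain(10)
	fmt.Println(len(batch), "sent,", p.dropped.Load(), "dropped") // 2 sent, 1 dropped
}
```

Exporting the drop counter as a metric also helps with the silent-failure problem: a rising drop count is visible even when the app otherwise looks healthy.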

Which part of the stack uses which

Hop                      Model  What happens
-----------------------  -----  ------------------------------------
OTel -> Alloy            push   App sends OTLP traces/logs to Alloy
Alloy -> Tempo           push   Alloy forwards traces
Alloy -> Loki            push   Alloy forwards logs
Alloy scrapes metrics    pull   Alloy scrapes /metrics endpoints
Grafana -> Tempo/Loki    pull   Grafana queries backends on demand

The rule of thumb: push for traces and logs (event-driven, bursty, often behind NAT), pull for metrics (steady-state, and a failed scrape is itself a useful signal).

Wiring a Go backend

Two pieces: instrument the app, and configure Alloy to receive it.

App side — the OTel SDK exports over OTLP to Alloy's receiver on localhost:

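// Imports: "log", "go.opentelemetry.io/otel",
//   "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc",
//   "go.opentelemetry.io/otel/sdk/resource",
//   sdktrace "go.opentelemetry.io/otel/sdk/trace",
//   semconv "go.opentelemetry.io/otel/semconv/v1.26.0" (any recent version should do)
exporter, err := otlptracegrpc.New(ctx,
    otlptracegrpc.WithEndpoint("localhost:4317"), // Alloy's OTLP gRPC receiver
    otlptracegrpc.WithInsecure(),                 // plaintext is fine on loopback
)
if err != nil {
    log.Fatalf("create OTLP trace exporter: %v", err)
}
tp := sdktrace.NewTracerProvider(
    sdktrace.WithBatcher(exporter), // batch spans instead of sending one by one
    sdktrace.WithResource(resource.NewWithAttributes(
        semconv.SchemaURL,
        semconv.ServiceName("checkout-api"),
    )),
)
otel.SetTracerProvider(tp)
// Call tp.Shutdown(ctx) on exit so buffered spans are flushed.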

Wrap your HTTP handlers with the otelhttp middleware and your database/sql calls with one of the community otelsql wrappers, and you get spans automatically.

Alloy side — receive OTLP and fan out:

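otelcol.receiver.otlp "default" {
  grpc { endpoint = "0.0.0.0:4317" }
  http { endpoint = "0.0.0.0:4318" }
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
    logs   = [otelcol.exporter.loki.default.input]
  }
}

// Forward to Tempo's OTLP gRPC ingest port, not its 3200 query port. 4417 is an
// assumption: Tempo's receiver moved off 4317 so it doesn't collide with Alloy.
otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "localhost:4417"
    tls { insecure = true }
  }
}

otelcol.exporter.loki "default" {
  forward_to = [loki.write.local.receiver]
}
loki.write "local" {
  endpoint { url = "http://localhost:3100/loki/api/v1/push" }
}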

The point: your Go app only ever talks to Alloy at localhost:4317. Alloy owns the knowledge of where Tempo and Loki live. Swap a backend — change Alloy, leave the app alone.

EC2 deployment layout

On a single box, run each piece as a systemd service or a Docker container:

Service   Port        Role
--------  ----------  ------------------------------
Grafana   3000        UI
Alloy     4317/4318   OTLP receiver (gRPC / HTTP)
Tempo     3200        Trace storage + query
Loki      3100        Log storage + query

Tempo's 3200 is its HTTP query API, the port Grafana talks to. Trace ingest is a separate OTLP port, and its default (4317) is the same one Alloy listens on, so if both run directly on the host rather than in separate containers, one of them has to move.

For a dev or single-node setup, Docker Compose is the easiest way to wire it together — one file, one docker compose up.

The honest caveats

A single EC2 box is a dev / small-production topology, not an HA one. Before you rely on it:

  • Use object storage (S3) for Tempo and Loki, not the local disk — otherwise a terminated instance takes your telemetry with it.
  • Watch cardinality. Loki is cheap *because* it only indexes labels. Put a high-cardinality value — user ID, request ID — into a label and you reverse that and blow up the index. Same discipline as Prometheus labels: keep IDs in the log line, never in the label set.
  • A single node is a single point of failure for your observability. If the box dies you lose the very thing that would tell you why. For anything serious, the backends move to their own nodes — or you reach for Grafana Cloud.
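
The cardinality rule is easiest to see in the shape of a Loki push request. The sketch below builds the JSON body for Loki's /loki/api/v1/push endpoint with two bounded labels (service, level) and keeps the per-request IDs inside the log line. The stream and pushRequest types, the buildPush helper, and the example label names are assumptions for illustration; in this stack Alloy would normally do the pushing for you.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// stream is one Loki stream: a label set plus its log entries.
type stream struct {
	Stream map[string]string `json:"stream"` // indexed labels: keep the value set small
	Values [][2]string       `json:"values"` // [unix-nanosecond timestamp, log line]
}

// pushRequest is the JSON body accepted by Loki's push endpoint.
type pushRequest struct {
	Streams []stream `json:"streams"`
}

// buildPush puts only low-cardinality values in labels. The user and request
// IDs go into the log line, where they stay searchable but are never indexed.
func buildPush(tsNano int64, service, level, userID, requestID, msg string) ([]byte, error) {
	line := fmt.Sprintf("user_id=%s request_id=%s %s", userID, requestID, msg)
	req := pushRequest{Streams: []stream{{
		Stream: map[string]string{"service": service, "level": level},
		Values: [][2]string{{strconv.FormatInt(tsNano, 10), line}},
	}}}
	return json.Marshal(req)
}

func main() {
	body, err := buildPush(time.Now().UnixNano(),
		"checkout-api", "error", "42", "req-9f3", "payment declined")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

With this layout the number of streams is bounded by services times levels, no matter how many users or requests pass through.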

Closing

The mental model that makes this click: OTel is the language, Alloy is the switchboard, Tempo / Loki / Prometheus are the filing cabinets, and Grafana is the reading room. Your app speaks OTel to one local address and never has to know the rest.

Get the push/pull split right (push the bursty event data, pull the steady metrics) and a single EC2 box gives you the same trace-to-log correlation the paid tools sell, for the price of the instance plus storage.
