Dashboard
The Overview page (the default route after sign-in) is your roll-up of every endpoint in scope. It’s built around six KPI cards on top, a stack of time-series charts in the middle, and per-endpoint rankings at the bottom — all driven by the same time-window picker so the whole page tells a consistent story.
Filter bar
Sticky at the top of the page:
| Control | Behaviour |
|---|---|
| Endpoints | Multi-select with text search. Empty selection means “fleet-wide” (every endpoint). |
| Range | 24h · 7d · 30d · 90d. Drives every chart and KPI on the page. |
| Refresh | Forces a refetch immediately. The dashboard polls; this is the bypass. |
The header next to the title shows “X of Y endpoints in scope”, the active range as “Metrics over <range> rolling · status reflects now”, and an “Updated Xs ago” freshness counter.
How fresh the data is
The dashboard polls; it doesn’t subscribe to realtime changes. Three mechanisms govern freshness:
- Time-series refetch — fires when the range or endpoint selection changes.
- In-window memos — recompute every minute so the right edge of every chart slides forward without a refetch.
- Freshness label — ticks every 5 seconds so the “Updated Xs ago” counter is current.
For a 24h range the dashboard pulls hourly rollups plus a synthetic in-progress bucket from raw check rows, so the right edge is genuinely live. For 7d / 30d / 90d it pulls daily summaries plus today’s hourlies as fill-in.
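The range-to-source mapping above can be sketched as a small lookup. This is only an illustration: the names `FetchPlan`, `plan_for_range`, `freshness_label`, and the source labels are hypothetical, not identifiers from the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchPlan:
    base: str      # hypothetical label for the main rollup source
    fill_in: str   # hypothetical label for what covers the newest, not-yet-rolled-up period


def plan_for_range(range_key: str) -> FetchPlan:
    """Map a Range picker value to the data sources described in the text."""
    if range_key == "24h":
        # Hourly rollups plus a synthetic in-progress bucket built from raw check rows.
        return FetchPlan(base="hourly_rollups", fill_in="in_progress_bucket_from_raw_checks")
    if range_key in ("7d", "30d", "90d"):
        # Daily summaries plus today's hourlies as fill-in.
        return FetchPlan(base="daily_summaries", fill_in="todays_hourly_rollups")
    raise ValueError(f"unknown range: {range_key!r}")


def freshness_label(seconds_since_update: float) -> str:
    """Render the header counter; whole seconds only (an assumption)."""
    return f"Updated {int(seconds_since_update)}s ago"
```

Calling `plan_for_range("7d")` returns the daily-summary plan, and `freshness_label` would be re-evaluated on each 5-second tick.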
Fleet hero
The top strip of counters:
- Healthy / Degraded / Down / Inconclusive / Paused — current status snapshot from each endpoint’s last_status.
- Active incidents — count of incidents in the active state.
- Notifications sent — count of dispatches in the selected window.
- Incidents in window — count of incidents opened within the selected range.
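As a rough sketch of how the status counters could be derived, the snippet below tallies each endpoint’s last_status and counts active incidents. The lowercase status strings, the `state` key, and the function names are assumptions made for illustration; only last_status and the five status names come from the text.

```python
from collections import Counter

# Assumed lowercase spellings of the five statuses named in the text.
STATUSES = ("healthy", "degraded", "down", "inconclusive", "paused")


def status_snapshot(endpoints: list[dict[str, str]]) -> dict[str, int]:
    """Count endpoints per status, including statuses with zero endpoints."""
    counts = Counter(e["last_status"] for e in endpoints)
    return {status: counts.get(status, 0) for status in STATUSES}


def active_incident_count(incidents: list[dict[str, str]]) -> int:
    """Count incidents whose (assumed) 'state' field is 'active'."""
    return sum(1 for incident in incidents if incident["state"] == "active")
```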
KPI cards
Six cards across, each with a number and a small inline sparkline drawn from the same buckets the charts use:
| KPI | What it shows |
|---|---|
| Fleet uptime | % of buckets that were healthy across the selection. |
| Global P95 | 95th-percentile response time across all included endpoints. |
| Error rate | Share of buckets that recorded any non-healthy status. |
| Avg response | Average response time, with an early-half vs late-half trend % on the back. |
| Checks ran | Total probes executed in the window. |
| Incidents | Total incidents with an open / resolved breakdown. |
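To make the bucket-based definitions concrete, here is a minimal sketch that derives several of these KPIs from a list of buckets. The `Bucket` shape, the unweighted averaging, and the trend formula (late-half mean relative to early-half mean) are assumptions; the product may weight or round differently.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Bucket:
    healthy: bool            # True if the bucket recorded only healthy status (assumed shape)
    avg_response_ms: float   # average response time within the bucket
    checks: int              # probes executed within the bucket


def kpis(buckets: list[Bucket]) -> dict[str, float]:
    """Compute uptime, error rate, average response, trend, and check count."""
    n = len(buckets)
    if n < 2:
        raise ValueError("need at least two buckets to compute a half-vs-half trend")
    healthy = sum(1 for b in buckets if b.healthy)
    half = n // 2
    early = mean(b.avg_response_ms for b in buckets[:half])
    late = mean(b.avg_response_ms for b in buckets[half:])
    if early <= 0:
        raise ValueError("early-half average must be positive to express a trend in %")
    return {
        "fleet_uptime_pct": 100 * healthy / n,
        "error_rate_pct": 100 * (n - healthy) / n,
        "avg_response_ms": mean(b.avg_response_ms for b in buckets),
        "trend_pct": 100 * (late - early) / early,  # positive means slower in the late half
        "checks_ran": sum(b.checks for b in buckets),
    }
```

Under these definitions, fleet uptime and error rate are complementary: every bucket is either healthy or counted toward the error rate.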
Error budget banner
A gradient bar shows budgetRemaining against your SLO target (set under Settings → Check defaults and SLO). When you’re well inside budget the bar reads green; as you spend, it shifts amber and then red.
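One common way to compute a remaining error budget is sketched below. It assumes budgetRemaining is a fraction between 0 and 1 and that the colour thresholds are 50% and 20%; neither detail is stated in this page, so treat both as placeholders.

```python
def budget_remaining(uptime_pct: float, slo_target_pct: float) -> float:
    """Fraction of the error budget left, clamped to the range [0, 1]."""
    allowed = 100.0 - slo_target_pct   # downtime the SLO tolerates, in percentage points
    if allowed <= 0:
        raise ValueError("an SLO target of 100% leaves no error budget")
    spent = 100.0 - uptime_pct         # downtime actually observed
    return min(1.0, max(0.0, 1.0 - spent / allowed))


def bar_colour(remaining: float, amber_below: float = 0.5, red_below: float = 0.2) -> str:
    """Pick a bar colour; both thresholds are hypothetical defaults."""
    if remaining < red_below:
        return "red"
    if remaining < amber_below:
        return "amber"
    return "green"
```

For example, with a 99.0% target and 99.5% observed uptime, half of the allowed downtime has been spent, so half the budget remains.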
Charts
Stacked one above the other, all sharing the time window:
- Response Time Percentiles — P50 / P95 / P99 lines.
- Uptime % vs SLO — uptime line plus your SLO target, with a toggleable downtime series for context.
- Error Rate — non-healthy share over time.
- Status Bar Chart — stacked healthy / degraded / down per bucket, useful for spotting clustered failures.
Endpoint heatmap
Every endpoint × every bucket as a coloured cell. Lets you scan an entire fleet in seconds and spot an endpoint that’s been silently degraded for hours.
Per-endpoint chart
A response-time area chart of the top 8 endpoints by average latency in the window. Use it to find the small set of endpoints driving your fleet P95.
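The selection rule for this chart can be sketched as a sort over per-endpoint averages. The input shape (endpoint name mapped to its response times in the window) and the function name are assumptions for illustration.

```python
from statistics import mean


def top_by_avg_latency(series: dict[str, list[float]], limit: int = 8) -> list[str]:
    """Return up to `limit` endpoint names, slowest average first."""
    # Endpoints with no samples in the window are skipped rather than treated as zero.
    averages = {name: mean(values) for name, values in series.items() if values}
    return sorted(averages, key=lambda name: averages[name], reverse=True)[:limit]
```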
Active incidents + live activity
Two side-by-side panels at the bottom of the chart stack:
- Active incidents — every currently-active incident with its endpoint, cause, and time-since-open.
- Live activity feed — the most recent incident state changes (opens and resolves).
SLO compliance
A per-endpoint list of window-uptime against your SLO target. Endpoints below target sort first.
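A minimal sketch of that ordering follows. Sorting below-target endpoints first comes from the text; ordering by ascending uptime within each group is an assumption, as are the names used.

```python
def slo_compliance(uptimes: dict[str, float], target_pct: float) -> list[tuple[str, float, bool]]:
    """Return (endpoint, uptime %, meets_target) rows, below-target endpoints first."""
    rows = [(name, uptime, uptime >= target_pct) for name, uptime in uptimes.items()]
    # False sorts before True, so endpoints missing the target lead the list;
    # within each group the lowest uptime comes first (assumed tie-break).
    return sorted(rows, key=lambda row: (row[2], row[1]))
```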
Rank cards
Three “top offenders” cards on the bottom row:
- Slowest — top endpoints by P95 in the window.
- Flakiest — top endpoints by incident count.
- Highest error rate — top endpoints by share of non-healthy buckets.
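The three rankings share one pattern: take the largest values of a single per-endpoint metric. A sketch, assuming a hypothetical stats dictionary with `p95_ms`, `incident_count`, and `error_rate` keys and a card size of five (the actual card size is not stated here):

```python
import heapq


def rank_cards(stats: dict[str, dict[str, float]], limit: int = 5) -> dict[str, list[str]]:
    """Build the three 'top offenders' lists from per-endpoint window stats."""

    def top(metric: str) -> list[str]:
        # nlargest iterates over endpoint names and ranks them by the chosen metric.
        return heapq.nlargest(limit, stats, key=lambda name: stats[name][metric])

    return {
        "slowest": top("p95_ms"),
        "flakiest": top("incident_count"),
        "highest_error_rate": top("error_rate"),
    }
```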
Empty state
If you haven’t added any endpoints yet, the page shows a centred icon, the line “No endpoints yet”, and an Add Endpoint button. Every other widget is hidden until you have at least one endpoint with at least one check on it.