Automate on-call at agent scale

Engineering teams are shipping faster than ever, and we've all felt where the operational floor gives out. OnePatch is the observability stack we wanted — built for teams whose code moves faster than their pager rotation can keep up.

can you make a dashboard for cpu metrics
8 events · 1 skill · 5 bash · 2 read
dashboards/host-metrics/ already covers host metrics. Building a CPU-focused one in Dashboard.tsx. Smoke-testing the queries first.
5 events · 1 write · 3 bash · 1 skill
Queries return rows, no runtime errors. Hovering each chart to verify tooltips.
Bash · agent-browser hover .recharts-surface 2>&1
All four panels verified — KPIs and charts paint with real values, tooltips hover-tested.
Dashboard.tsx shipped.
  • 4 KPIs: latest CPU %, load 1m / 5m / 15m
  • Full-width line: CPU utilization averaged across cores
  • Half-width line: utilization per core (cpu0, cpu1)
  • Half-width stacked area: CPU time by state

// vision

An autonomous SRE agent on your rotation that actually drives incident resolution, instead of just paging you.

// today

Get there today by chatting with an agent to build live dashboards and alert monitors over your OTel data.
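
To make "alert monitor" concrete, here is a minimal TypeScript sketch of the kind of check a latency monitor runs over span durations. The `percentile` and `shouldFire` helpers, the nearest-rank method, and the sample data are all assumptions for illustration, not OnePatch's implementation. The 1.2s threshold mirrors the checkout-svc alert shown in the alert-noise example.

```typescript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of empty sample");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical monitor check: fire when the p99 span duration exceeds the threshold.
function shouldFire(durationsSec: number[], thresholdSec: number): boolean {
  return percentile(durationsSec, 99) > thresholdSec;
}

// Illustrative data (invented): 98 fast requests and 2 slow ones, in seconds.
const durations: number[] = [...Array(98).fill(0.3), 1.42, 1.42];
console.log(shouldFire(durations, 1.2)); // true
```

A real monitor would also need an evaluation window and some debounce so a single slow request does not page anyone; those are left out here to keep the sketch short.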


// 01 · alert noise

The alert that matters is buried in the ones that don't.

You're drowning in so much alert noise that you've tuned it out, so you miss the one alert that matters when it hits.

// before
A acme.io
channels
#general
#engineering
#production-alerts (29)
#deploys
#oncall
# production-alerts
PagerDuty APP 12:41 · resolved · auth-svc.canary.error_rate · auto · 47s
PagerDuty APP 12:41 · resolved · notification-svc.queue.lag · auto · 31s
PagerDuty APP 12:42 · firing · checkout-svc.p99 > 1.2s · paged @oncall
PagerDuty APP 12:42 · resolved · auth-svc.canary.error_rate · auto · 52s
PagerDuty APP 12:43 · resolved · search-svc.cache.miss · auto · 28s
PagerDuty APP 12:43 · resolved · notification-svc.queue.lag · auto · 19s
PagerDuty APP 12:43 · resolved · cart-validator.timeout · auto · 41s
PagerDuty APP 12:44 · resolved · auth-svc.canary.error_rate · auto · 38s
// after · onepatch
A acme.io
channels
#general
#engineering
#onepatch-alerts (1)
#deploys
#oncall
# onepatch-alerts
OnePatch APP 12:42 · monitor fired · checkout-svc · p99 1.42s ▲ 38%
28 transient alerts auto-resolved by OnePatch. 1 actionable alert left.
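
The page does not say how OnePatch decides which alerts are transient. One plausible reading of the before/after feeds is sketched below in TypeScript: treat anything that auto-resolved within a short window as noise and surface only what is still firing. The `AlertEvent` shape, the `triage` function, and the 60-second cutoff are hypothetical; the alert names and durations are taken from the feed above.

```typescript
// Hypothetical alert shape; field names are illustrative, not OnePatch's schema.
type AlertEvent = {
  name: string;
  status: "firing" | "resolved";
  autoResolved: boolean;
  durationSec?: number; // how long the alert was open before resolving
};

// Assumption: an alert is "transient" if it auto-resolved within maxTransientSec.
function triage(events: AlertEvent[], maxTransientSec = 60) {
  const transient: AlertEvent[] = [];
  const actionable: AlertEvent[] = [];
  for (const e of events) {
    const isTransient =
      e.status === "resolved" &&
      e.autoResolved &&
      (e.durationSec ?? Infinity) <= maxTransientSec;
    (isTransient ? transient : actionable).push(e);
  }
  return { transient, actionable };
}

const feed: AlertEvent[] = [
  { name: "auth-svc.canary.error_rate", status: "resolved", autoResolved: true, durationSec: 47 },
  { name: "notification-svc.queue.lag", status: "resolved", autoResolved: true, durationSec: 31 },
  { name: "checkout-svc.p99 > 1.2s", status: "firing", autoResolved: false },
  { name: "search-svc.cache.miss", status: "resolved", autoResolved: true, durationSec: 28 },
];

const result = triage(feed);
console.log(result.actionable.map((e) => e.name)); // only the checkout-svc alert remains
```

The cutoff is the main design choice: too low and flapping alerts still reach the channel, too high and a slow-burning incident gets filed as noise.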
// 02 · tool sprawl

By the time you've stitched it together, the SLO is already gone.

An incident opens seven tabs. Logs, traces, the dashboard from last quarter, the runbook in Notion, the deploy in GitHub, the Slack thread.

// before
tabs: Datadog · Grafana · Sentry · GitHub · Slack
https://app.datadoghq.com/dashboard/wfp-mh3-x82/checkout-svc?env=prod-us-east
checkout-svc · prod-us-east · past 1h · 5m
⚠ p99 1.42s · breaching SLO · ack'd · jess.l
p99 latency: 1.42s
errors / min: 128
requests / s: 1,247
// after · onepatch
tabs: OnePatch
https://app.onepatch.dev/chat/alerts/4471
checkout-svc · investigating rca-02 · query-agent
why is checkout slow right now?
Bash · duckdb -c "SELECT p99 FROM otel.spans WHERE svc=…"
Read · /deploys.log · last 30m
Skill · rca · correlate(deploy,latency)
p99 1.42s ▲ 38% since deploy a7c8e3f (9m ago). Cart-validator regression → db pool exhaustion on writer-2 (94% util).
one tab. metrics, traces, logs, and the offending deploy already correlated.
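
The "▲ 38% since deploy" figure in the answer above is the kind of number a deploy/latency correlation produces. A minimal TypeScript sketch, assuming the simplest version: compare mean p99 before and after the deploy timestamp. The `Sample` type, the `changeSinceDeploy` function, and the sample values are invented for illustration and chosen to land near the 1.42s and 38% shown in the mockup; the real `rca` skill is not documented here.

```typescript
// Hypothetical p99 reading: how long ago it was taken and its value in seconds.
type Sample = { minutesAgo: number; p99Sec: number };

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Percent change in mean p99 after the deploy versus before it.
// Returns null when either side of the deploy has no samples.
function changeSinceDeploy(samples: Sample[], deployMinutesAgo: number): number | null {
  const before = samples.filter((s) => s.minutesAgo > deployMinutesAgo).map((s) => s.p99Sec);
  const after = samples.filter((s) => s.minutesAgo <= deployMinutesAgo).map((s) => s.p99Sec);
  if (before.length === 0 || after.length === 0) return null;
  const base = mean(before);
  return ((mean(after) - base) / base) * 100;
}

// Illustrative readings (invented), with the deploy 9 minutes ago.
const samples: Sample[] = [
  { minutesAgo: 20, p99Sec: 1.02 },
  { minutesAgo: 15, p99Sec: 1.03 },
  { minutesAgo: 12, p99Sec: 1.04 },
  { minutesAgo: 8, p99Sec: 1.41 },
  { minutesAgo: 4, p99Sec: 1.42 },
  { minutesAgo: 1, p99Sec: 1.43 },
];

const pct = changeSinceDeploy(samples, 9);
console.log(pct === null ? "not enough data" : `${Math.round(pct)}%`); // 38%
```

A step change lining up with a deploy is a lead, not proof of cause. In the mockup the agent goes on to tie the regression to cart-validator and pool exhaustion on writer-2.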


Make on-call zen.

Stop paging eng for problems agents can solve.