Incidents represent time-sensitive threats to your infrastructure, such as URL downtime, a missed heartbeat, or a low-level infrastructure alert.
The current on-call person is alerted when an incident is created.
If the current on-call person doesn't acknowledge the incident within a specified time period, the entire team is alerted.
You can configure on-call escalations and escalation periods for each Monitor and Heartbeat separately.
facebook.com goes down at 3:25 AM
A new incident is created
Better Uptime calls, sends an SMS, and sends an e-mail to the current on-call person
The current on-call person is asleep
They don't acknowledge the incident and continue dreaming 😴
After 3 minutes, Better Uptime alerts (call, SMS, e-mail) the entire team
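The escalation rule in this example can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Better Uptime's implementation: the names EscalationPolicy, team_wait, and alert_target are hypothetical, and the 3-minute wait is taken from the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class EscalationPolicy:
    # Hypothetical setting: how long to wait for the on-call person
    # to acknowledge before the entire team is alerted.
    team_wait: timedelta


def alert_target(
    policy: EscalationPolicy,
    started_at: datetime,
    acknowledged_at: Optional[datetime],
    now: datetime,
) -> str:
    """Return who should be alerted for an incident at time `now`."""
    # Once the incident is acknowledged, nobody else gets alerted.
    if acknowledged_at is not None and acknowledged_at <= now:
        return "nobody"
    # The escalation period has passed without an acknowledgement.
    if now - started_at >= policy.team_wait:
        return "entire team"
    # Still inside the escalation period.
    return "on-call person"
```

With a policy of EscalationPolicy(team_wait=timedelta(minutes=3)) and an incident started at 3:25 AM, the function returns "on-call person" at 3:26 AM and "entire team" from 3:28 AM onward, unless the incident was acknowledged in the meantime.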
After you acknowledge the incident, no other team members get alerted.
You can manually acknowledge the incident in the upper-right corner of the incident detail page.
When Better Uptime calls you, you are prompted to press 1 to acknowledge the incident.
If you don't want to acknowledge the incident (for example, you are away from your computer and can't start resolving the problem right away), just hang up and other team members will get alerted.
To acknowledge the incident by e-mail, click the Acknowledge incident button in the e-mail you receive when a new incident is started.
Once the incident is acknowledged, you will be able to resolve it.
Incidents are automatically resolved after the endpoint becomes available again.
You can manually resolve the incident by clicking Resolve in the upper-right corner of the incident detail page, or wait until it is resolved automatically.
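The incident lifecycle described here (started, acknowledged, resolved) can be pictured as a small state machine. The sketch below is a hypothetical model for illustration only: the Status and Incident names are invented, and it assumes that automatic resolution does not require a prior acknowledgement, which the text above does not state explicitly.

```python
from enum import Enum


class Status(Enum):
    STARTED = "started"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


class Incident:
    """Hypothetical model of an incident's lifecycle."""

    def __init__(self) -> None:
        self.status = Status.STARTED

    def acknowledge(self) -> None:
        # Acknowledging only has an effect on a freshly started incident.
        if self.status is Status.STARTED:
            self.status = Status.ACKNOWLEDGED

    def resolve(self) -> None:
        # Manual resolution is available once the incident is acknowledged.
        if self.status is not Status.ACKNOWLEDGED:
            raise ValueError("acknowledge the incident before resolving it")
        self.status = Status.RESOLVED

    def endpoint_recovered(self) -> None:
        # Automatic resolution when the endpoint becomes available again
        # (assumed here to work regardless of acknowledgement).
        self.status = Status.RESOLVED
```

In this model, calling resolve() on a new incident raises an error, while acknowledge() followed by resolve(), or endpoint_recovered() on its own, both leave the incident in the RESOLVED state.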
We take a screenshot and save the raw response of your website every time an incident caused by downtime happens. Both can be extremely useful when figuring out exactly what happened.
You can find the screenshot and the response in the headline of the incident detail page.
You can collaborate with your colleagues on resolving an incident using comments: upload screenshots, share your insights, and work through the problem together.
You can use Markdown in the comments.
Post mortems are short summaries of incidents.
They typically describe why an incident happened, its estimated cost, and how to prevent similar incidents in the future.
The best teams write and share their post mortems after each significant incident.