Alerting System
Uptime matters for a backend, and we need to be notified when shit hits the fan. The first step in figuring out that the service isn't working is to check its status regularly.
However, placing the alerting system on the same machine as the service is unwise: if the server crashes, it takes your alerting system with it. The only reliable way around this is to run the alerting on separate machines, i.e. distributed alerting. We're cheap, so we don't want to pay for something like PagerDuty, Datadog, etc. Enter DIY solutions :D
The problems we need to solve:
- Service to send HTTP(S) requests to an endpoint and interpret the result
- Service to store that information in a time series database (makes it easier to query)
- Service to read time series database and fire an alert when rules are met
- Service to detect the alert and send it to a phone/email/human shock collar
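To make the four pieces concrete, here is a rough sketch of how they fit together in one process. It is only an illustration under assumptions: the names `probe`, `SampleStore`, `should_alert` and `notify` are made up for this example, the in-memory store stands in for a real time series database, and the "three failures in a row" rule is an arbitrary choice.

```python
import time
import urllib.error
import urllib.request
from collections import deque


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Step 1: send an HTTP(S) request and interpret the result as up or down."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, DNS failures, refused connections, timeouts.
        return False


class SampleStore:
    """Step 2: toy in-memory stand-in for a time series database (assumption)."""

    def __init__(self, max_samples=1000):
        self.samples = deque(maxlen=max_samples)  # (timestamp, is_up) pairs

    def record(self, is_up, timestamp=None):
        self.samples.append((time.time() if timestamp is None else timestamp, is_up))

    def last(self, n):
        return list(self.samples)[-n:]


def should_alert(store, consecutive_failures=3):
    """Step 3: the rule fires when the last N checks all failed (N >= 1)."""
    recent = store.last(consecutive_failures)
    return len(recent) == consecutive_failures and not any(up for _, up in recent)


def notify(message, send=print):
    """Step 4: hand the alert to a delivery channel; print is a placeholder."""
    send(f"ALERT: {message}")
```

A loop that calls `probe`, records the result, and calls `notify` whenever `should_alert` returns true covers all four bullets. In the real setup each step would live in its own service, and not on the machine being watched.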
Service to send HTTP(S) requests to an endpoint and interpret the result:
Prometheus is supposed to have given humanity the gift of fire; the software gives us the gift of monitoring, clearly the better gift. Prometheus scrapes metrics over HTTP(S) at a regular interval and organizes them into a time series database. It then exposes a port (9090 by default) that the rest of the monitoring system can query.
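For Prometheus to store the result of an HTTP check, something has to serve that result in the plain-text exposition format it reads over HTTP. The sketch below renders one gauge in that format. The metric name `endpoint_up`, the `target` label and the helper names are assumptions made up for this example, not something Prometheus requires.

```python
def escape_label_value(value):
    # Backslash, double quote and newline must be escaped inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge metric; samples is a list of (labels_dict, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            # No labels: emit the bare metric name.
            lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gauge(
        "endpoint_up",
        "1 if the last HTTP check succeeded, 0 otherwise.",
        [({"target": "https://example.com/health"}, 1)],
    ), end="")
```

Serving this text from a `/metrics` path on a small HTTP server is enough for a scrape job to collect it; the value then becomes a queryable series labelled by target.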