Have you ever thought about how to monitor your NetEye system or other critical applications in a network failure scenario?
To handle this scenario, some customers have implemented SMS notifications, relying on the mobile network as a notification channel.
But what happens when even the mobile network doesn’t work?
Today we have a more effective solution thanks to Atlassian’s OpsGenie Heartbeats.
OpsGenie is used by many Wuerth Phoenix customers in addition to the NetEye ecosystem to achieve various objectives including:
But in this post I want to show you the Heartbeat Monitoring feature, which is very useful in case there’s some kind of network failure and NetEye can’t even reach OpsGenie.
In computer science, a heartbeat is a periodic signal generated by hardware or software to indicate normal operation or to synchronize other parts of a computer system. [1]
Atlassian defines OpsGenie Heartbeats in this way: “The Heartbeats feature can be used to ensure that your environment is able to connect to OpsGenie continuously and that:
Heartbeats can also be used to monitor that your periodic tasks are running as expected by using a suitable Heartbeat expiration interval regarding your periodic task run time.”
The idea of a heartbeat is really simple: if the heartbeat stops for a given period of time, you get alerted.
Important: The Heartbeat Monitoring feature is available to Standard and Enterprise plans only.
For details on configuring Heartbeats, you can refer to the documentation provided by Atlassian. I quote this short but important extract from it to help you better understand the functionality:
When a Heartbeat is added to Opsgenie with an interval of X minutes, your system is expected to send HTTP based Heartbeat requests periodically, at least every X minutes. If a Heartbeat request is not received for more than X minutes, Opsgenie will conclude that there is a problem between your system and Opsgenie, and create an alert according to your settings.
In the following scenario, I created a heartbeat named neteye-master-heartbeat-crontab as follows:
So if within 10 minutes OpsGenie doesn’t receive a ping to this Heartbeat, it will send an alarm signal.
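To make the expiration rule concrete, here is a purely illustrative sketch of the comparison described in the extract above. It is not how OpsGenie implements the check internally; the function name `heartbeat_expired` and its arguments are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Illustrative only: this is NOT OpsGenie's implementation.
# Returns 0 (expired) when more than interval_minutes have passed
# since the last ping, and 1 otherwise.
# Arguments: last_ping_epoch now_epoch interval_minutes
heartbeat_expired() {
    elapsed=$(( $2 - $1 ))
    limit=$(( $3 * 60 ))
    [ "$elapsed" -gt "$limit" ]
}
```

With a 10-minute interval, a last ping at second 0 counts as expired at second 601 but not at second 600. This is also why pinging more often than the interval is a sensible choice: a single lost ping does not immediately trigger an alert.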
To generate the ping from NetEye, you can add a simple curl command to crontab that pings the heartbeat with an HTTP POST request.
curl test example
curl -X POST 'https://api.opsgenie.com/v2/heartbeats/neteye-master-heartbeat-crontab/ping' --header 'Authorization: GenieKey XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXX'
{"result":"PONG - Heartbeat received","took":0.007,"requestId":"f7a9fdf0-73b2-4f33-9346-5251d11a9978"}
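If you want the ping to fail loudly when OpsGenie does not answer as expected, the curl call can be wrapped in a small script that checks the reply. The following is a minimal sketch under some assumptions: the function names `is_pong` and `ping_heartbeat` are hypothetical, and the match relies on the compact JSON format of the response shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the response body looks like
# the "PONG - Heartbeat received" reply shown above.
is_pong() {
    case "$1" in
        *'"result":"PONG'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: ping a heartbeat and verify the reply.
# $1 = heartbeat name, $2 = API key (placeholders, as in the example above).
ping_heartbeat() {
    response=$(curl -s --max-time 10 -X POST \
        "https://api.opsgenie.com/v2/heartbeats/$1/ping" \
        --header "Authorization: GenieKey $2") || return 1
    is_pong "$response"
}
```

A call such as `ping_heartbeat neteye-master-heartbeat-crontab "$API_KEY"` then returns a non-zero exit status when curl fails, times out, or receives anything other than the PONG reply.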
crontab example
#OpsGenie heartbeat neteye-master-heartbeat-crontab ping
*/5 * * * * curl -s -X POST 'https://api.opsgenie.com/v2/heartbeats/neteye-master-heartbeat-crontab/ping' --header 'Authorization: GenieKey XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXX'
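The Atlassian definition quoted earlier also mentions monitoring periodic tasks. One possible way to do that from crontab is to ping the heartbeat only when the task itself succeeds, so that a failing job lets the heartbeat expire. The sketch below assumes a hypothetical environment variable `OPSGENIE_API_KEY` holding the key; the function names `send_ping` and `run_and_ping` are also made up for this example.

```shell
#!/bin/sh
# Sends the ping; OPSGENIE_API_KEY is an assumed environment variable.
send_ping() {
    curl -s -o /dev/null --max-time 10 -X POST \
        "https://api.opsgenie.com/v2/heartbeats/$1/ping" \
        --header "Authorization: GenieKey ${OPSGENIE_API_KEY}"
}

# Runs the given command and pings the heartbeat only if it exits with 0.
# $1 = heartbeat name, remaining arguments = command to run.
run_and_ping() {
    heartbeat_name=$1
    shift
    "$@" || return $?   # job failed: skip the ping and keep its exit status
    send_ping "$heartbeat_name"
}
```

For example, `run_and_ping my-backup-heartbeat /path/to/backup.sh` (both names are placeholders) would ping only after a successful backup. The heartbeat interval should then be chosen to be somewhat longer than the schedule of the job.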
In the following scenario, I created a heartbeat named neteye-master-heartbeat-check-scheduler as follows:
This Heartbeat is configured like the previous one, but with a different name. If OpsGenie doesn’t receive a ping to this Heartbeat within 10 minutes, it will send an alarm signal. This time, however, the ping is sent by a scheduled NetEye check instead of crontab.
The easiest way to create a scheduled NetEye check is to use this plugin from the OpsGenie GitHub release page, creating a command and a service check following this example:
'/neteye/shared/monitoring/plugins/oghb-linux-amd64' '-action' 'send' '-apiKey' 'XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXX' '-name' 'neteye-master-heartbeat-check-scheduler'
INFO[0000] Successfully sent heartbeat [neteye-master-heartbeat-check-scheduler]
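If you prefer the check output to follow the usual monitoring plugin convention (exit code 0 for OK, 2 for CRITICAL), the plugin call can be wrapped in a small script. This is only a sketch: it assumes that the oghb binary exits with a non-zero status when the heartbeat cannot be sent, which I have not verified here, and the function name `check_heartbeat` and the `OGHB` variable are hypothetical.

```shell
#!/bin/sh
# Path taken from the command above; override OGHB to use another location.
OGHB=${OGHB:-/neteye/shared/monitoring/plugins/oghb-linux-amd64}

# $1 = heartbeat name, $2 = API key
# Assumption: oghb returns a non-zero exit status on failure.
check_heartbeat() {
    if output=$("$OGHB" -action send -apiKey "$2" -name "$1" 2>&1); then
        echo "OK - heartbeat $1 sent"
        return 0
    fi
    echo "CRITICAL - heartbeat $1 not sent: $output"
    return 2
}
```

The wrapper keeps the original plugin output in the CRITICAL message, so the reason for a failed ping is visible directly in the service check.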
OpsGenie allows for the creation of different Heartbeats that can be used for different purposes and in different systems to ensure that communication between the “monitored” system and OpsGenie is always valid. Otherwise, OpsGenie will send an alert.
In the examples above we’ve seen how to monitor NetEye to ensure that it’s always able to communicate with OpsGenie. Clearly these configurations can be extended to all critical systems for which the same guarantees are desired.
For further information on the integration between OpsGenie and NetEye you can also read this blog post.
[1] Wikipedia source