Using Alerts in HQ to Manage Resources
Alerting is one of the major features of HQ. Once you understand HQ's monitoring functionality, the natural next step is to set alerts so that you are notified when a metric moves outside its normal operating range.
Alerting is an important tool when it comes to managing resources. For this reason HQ's alerting engine was designed to be powerful, as well as flexible enough to allow for even the most demanding alerting and notification scenarios. The general steps for creating an alert are:
Determine which resource or resource type needs the alert
Example: A Tomcat server keeps crashing. You need to be alerted about the resource usage of this server as well as its availability.
Determine which metric(s) the alerts will be associated with
Our example Tomcat server appears to be exhausting its available memory and then crashing. Alerts should be put on the metrics JVM Free Memory and Availability.
Determine the threshold(s) where alerts will fire
A warning alert would be nice for when the JVM Free Memory gets low, so the threshold for that alert might be set at <10M. That could give you some time to investigate before your server crashes.
A high-severity alert should then be set for Availability <100%. That will let you know that your Tomcat server has gone down and needs immediate attention.
Determine when you want the alert to fire
Sometimes you need to know immediately when a critical event happens. Sometimes you want to be alerted if a specific event has been happening frequently over a specified period of time. HQ is flexible enough to allow these and many more alerting options.
In our example, perhaps you want to know about it only when the JVM Free Memory is <10M for 30 minutes during a period of one hour. Almost certainly, you'd want to know immediately when the server goes down.
Determine the action(s) to be taken
You'll want an email notification sent when the server gets low on free memory. An email to yourself might be appropriate, as well as an alert notification sent to the network operations center, so the on-duty administrator is notified.
Perhaps a simple restart of the server is all that is necessary to put things back on track when it crashes. HQ supports adding a control action as part of an alert event, so you can have HQ send you an email notification and also restart the Tomcat server.
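To make the five decisions above concrete, here is a minimal Python sketch of the Tomcat example. It is not HQ's API or configuration format: the names AlertRule and should_fire are hypothetical, and the sketch assumes that each metric is sampled once per minute and that "10M" means 10 megabytes. It only illustrates how a threshold, a firing condition, and a list of actions fit together in one alert definition.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MB = 1024 * 1024  # assumption: "10M" in the text means 10 megabytes


@dataclass
class AlertRule:
    """Hypothetical alert definition; not an HQ data structure."""
    name: str
    severity: str
    metric: str
    breached: Callable[[float], bool]  # threshold test for one sample
    min_breach_minutes: int            # breached minutes required to fire
    window_minutes: int                # look-back period that is examined
    actions: List[str]


def should_fire(rule: AlertRule, samples: List[Tuple[int, float]]) -> bool:
    """Return True if the rule's firing condition holds at the latest sample.

    samples holds (minute, value) pairs, oldest first, one per minute
    (an assumption of this sketch).
    """
    if not samples:
        return False
    latest_minute = samples[-1][0]
    window_start = latest_minute - rule.window_minutes
    # Count the minutes inside the window in which the threshold was crossed.
    breach_minutes = sum(
        1 for minute, value in samples
        if minute > window_start and rule.breached(value)
    )
    return breach_minutes >= rule.min_breach_minutes


# Warning: JVM Free Memory < 10M for 30 minutes during a period of one hour.
low_memory = AlertRule(
    name="Low JVM free memory", severity="warning",
    metric="JVM Free Memory", breached=lambda v: v < 10 * MB,
    min_breach_minutes=30, window_minutes=60,
    actions=["email administrator", "email network operations center"],
)

# High severity: fire immediately (a one-minute window) when Availability < 100%.
tomcat_down = AlertRule(
    name="Tomcat unavailable", severity="high",
    metric="Availability", breached=lambda v: v < 100.0,
    min_breach_minutes=1, window_minutes=1,
    actions=["email administrator", "restart Tomcat server"],
)

# Made-up sample data: 30 low-memory minutes in the last hour, then an outage.
memory_samples = [(m, 8 * MB if m <= 30 else 50 * MB) for m in range(1, 61)]
availability_samples = [(1, 100.0), (2, 0.0)]

for rule, samples in ((low_memory, memory_samples),
                      (tomcat_down, availability_samples)):
    if should_fire(rule, samples):
        print(f"[{rule.severity}] {rule.name}: {', '.join(rule.actions)}")
```

Treating "fire immediately" as a one-minute window with one required breach lets both alerts share a single evaluation function. In HQ itself these choices are made when the alert is defined on the resource, not written as code.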