Defining Alerts
Defining Alerts for an Individual Resource
Available in HQ Open Source unless marked by * for HQ Enterprise only
After your resources are in the HQ inventory and metrics are being collected, the first step in using the alerting functionality is defining the alerts. This page explains how to define an alert for an individual resource. Alert definitions can be made complex, but the minimum requirements are few: a name and a single condition under which the alert will fire.
Alerts can also be defined for resource types (a very similar process).
- Defining an alert for an individual resource
- Conditions for triggering the alert
- When and how often the alert will fire
- Filtering alerts to fine tune alert functionality
Defining an Alert for an Individual Resource
To define an alert for an individual resource:
- Determine which resource the alert will be defined for.
- Navigate to the "New Alert" screen, where alerts can be defined for this resource.
There are a couple of ways of getting there:
- On the Current Health screen for that resource, click the
tab.
One way to get to the Current Health page is to click the name of the resource on the Browse Resources screen (accessible via the masthead menu). You now see a list of alerts defined for this resource. Click
.
OR
- On any metric chart for this resource, click
in the Tools Menu. This route has the benefit of pre-populating the metric drop-down with the particular metric that was being charted.
- Name the alert.
Give it a name that clearly tells you what the problem is. The name of the alert is displayed in the subject header of an alert notification email like this:
Subject: [HQ] !! resource-name alert-name
For example, if the resource is a MySQL 3.x server residing on a Linux platform named grimlock.hyperic.net, the resource name is probably grimlock.hyperic.net MySQL 3.x. For an alert that fires when the MySQL server goes down, name it simply "down!" and give it a high priority. The subject header of the alert notification email would then look like:
Subject: [HQ] !!! grimlock.hyperic.net MySQL 3.x down!
- Select the alert's Priority.
- Enable or disable the alert definition in Active.
- Determine why the alert will fire.
Alerts can be defined for metric thresholds, inventory property changes, or log/config tracking events. Additionally, if the resource supports control actions, an alert can be defined for control actions executed on the resource.
- (optional) Add multiple conditions.*
Click Add Another Condition, select the appropriate AND or OR operator from the drop-down, and add the new condition. Repeat for as many conditions as you need for the alert.
- (optional) Select another alert definition for which this alert definition will be a recovery alert.*
At least one alert must already be defined for the resource before you can designate this alert definition as a recovery alert.
- Determine when and how often the alert will fire when the defined conditions are met.
- (optional) Select a "filter" for the alert definition.*
- Click
.
This takes you to the screen where the now minimally complete alert definition can be optionally fleshed out.
- (optional) Select a control action for HQ to perform if the alert is triggered.*
- Click
in the "Control Action" section.
- Select the following values and click
.
- Resource Type of the resource on which to perform the action, which filters the values in:
- Resource Name, where users select the specific resource on which to perform the action
- Control Action: The action to perform on the selected resource when the alert is triggered
- (optional) Select an escalation scheme for this alert definition.
- (optional) Select users (either via their roles or their usernames) or arbitrary email addresses to notify via email when the alert is triggered.
- On the "Notify Roles", "Notify HQ Users", or "Notify Other Recipients" tab, respectively, click

- For "Notify Roles" and "Notify HQ Users," check the desired roles or users at the left and click
between the two lists to move the selected items to the right. (The arrow is enabled only when some resources are selected.)
- For "Notify Other Recipients," type in one or more comma-separated email addresses and click
.
- (optional) Set up an SNMP trap to be thrown when the alert is triggered.
On the "SNMP Trap" tab, type the Address of the target SNMP trap engine and the OID and click
. SNMP traps must be [enabled] in HQ for this option to work.
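The subject-line convention described in the "Name the alert" step can be sketched as follows. This is only an illustration, not HQ code: the function name and the mapping from priority to the number of exclamation marks are assumptions, inferred from the "!!" template and the "!!!" high-priority example above.

```python
# Hypothetical mapping: the page shows "!!" in the template and "!!!" for a
# high-priority alert, so one extra mark per priority level is assumed here.
PRIORITY_MARKS = {"low": "!", "medium": "!!", "high": "!!!"}


def alert_subject(resource_name: str, alert_name: str, priority: str) -> str:
    """Build the notification subject from the resource and alert names."""
    marks = PRIORITY_MARKS[priority]
    return f"Subject: [HQ] {marks} {resource_name} {alert_name}"


print(alert_subject("grimlock.hyperic.net MySQL 3.x", "down!", "high"))
```

Because the resource name is already part of the subject, a short alert name such as "down!" is enough to make the email self-explanatory.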
Conditions for Triggering the Alert
- To set an alert on a metric threshold, select the metric from the Metric drop-down box, select the radio button for absolute value or baseline, and select the operator (>, < or =) from the operator drop-down. For absolute value thresholds, enter the threshold in the Absolute Value text field.
- For a value change alert, select the value change radio button; the alert will fire whenever the value of the metric changes.
- For baseline thresholds, select one of the values (Baseline, Min, Max) from the baseline drop-down and enter the percentage in the textbox. *
- To alert on an inventory property value change, simply select the desired inventory property. *
- If the resource supports control actions, you can alert on a control action and its resulting status. First select the control action, and then select one of the following states: In Progress, Completed, or Failed. *
- If the resource supports event/log tracking, you can alert on the log level and, optionally, a substring to match in the log string. *
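The metric conditions listed above can be pictured with a minimal sketch. The function names are hypothetical, and the baseline rule (comparing the metric against a percentage of the chosen Baseline, Min, or Max value) is an assumption about how the percentage field is applied, not a description of HQ internals.

```python
import operator

# The three operators offered in the operator drop-down.
OPERATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def absolute_condition_met(value, op, threshold):
    """Absolute value threshold: compare the metric directly to the threshold."""
    return OPERATORS[op](value, threshold)


def baseline_condition_met(value, op, percent, reference):
    """Baseline threshold: compare the metric to percent% of the chosen
    reference value (Baseline, Min, or Max). Assumed semantics."""
    return OPERATORS[op](value, reference * percent / 100.0)


def value_changed(previous, current):
    """Value change condition: met whenever the metric value differs."""
    return previous != current


# Free Memory of 8 (MB) against an absolute threshold of "< 10".
print(absolute_condition_met(8, "<", 10))
# A metric at 130 against "> 120% of Baseline", where the baseline is 100.
print(baseline_condition_met(130, ">", 120, 100))
```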
When and How Often the Alert Will Fire
Each time conditions are exceeded or met
This is the selection that will apply to most alerts. It means "alert me immediately when the conditions for my alert have been met". However, with no further configuration it also means "continue alerting me every x minutes until the conditions are no longer met". The value of x varies with the collection interval for the metric(s) in the alert definition, but generally it will be 1, 5, or 10. This selection is very effective when defined along with a Recovery Alert, which eliminates the 'alert storm' described above.
When conditions are exceeded for X within a time period of Y *
This is a fairly complex action and is most effective when the time periods represented by X and Y are relatively large. For instance, if you want to make Y anything less than 30 minutes, you probably want to use the "Each time conditions are exceeded or met" selection.
To really understand this configuration selection you need to be familiar with the concept of metric collection intervals and know the collection interval(s) of the metric(s) in your alert definition. Let's say your alert definition is for Free Memory < 10M and you want to be alerted whenever this condition is met for 20 minutes within a time period of 1 hour. It's not absolutely necessary to know that the collection interval for the Free Memory metric is 5 minutes, but it helps, because then you know that there are 12 collections per hour for that metric and that if 4 of those 12 collections meet or exceed the threshold, the alert will fire.
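The arithmetic in the Free Memory example can be sketched as follows. The function names are hypothetical, and the rule that X is converted into a count of collections (duration divided by collection interval) follows the worked example above rather than any documented HQ formula.

```python
def collections_in(period_minutes, interval_minutes):
    """Number of metric collections taken during a period."""
    return period_minutes // interval_minutes


def should_fire(samples_met, duration_minutes, interval_minutes):
    """samples_met holds one boolean per collection in the time period Y.
    The alert fires when enough collections met the condition to cover
    the duration X."""
    needed = collections_in(duration_minutes, interval_minutes)
    return sum(samples_met) >= needed


# Free Memory is collected every 5 minutes; Y = 60 minutes, X = 20 minutes.
print(collections_in(60, 5))   # collections per hour
print(collections_in(20, 5))   # collections needed to cover 20 minutes

# 4 of the 12 collections in the hour met the condition.
hour = [True, False, True, False, False, True, False, False, True, False, False, False]
print(should_fire(hour, 20, 5))
```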
Once every X times conditions are exceeded within a time period of Y *
This selection is very similar to the previous one. With the previous option, it was nice to know the collection intervals for the metrics in the alert definition. With this option, it is absolutely necessary, because it is possible to create an alert that can never fire. If the metric collection interval is large enough that X collections will never be taken in the Y time period, the alert can never fire. For example, if we take our Free Memory metric with its 5 minute collection interval and configure an alert definition to fire "Once every 15 times conditions are exceeded within a time period of 1 hour", this alert will never fire: only 12 collections are taken per hour, so we will never see 15 metrics per hour, much less 15 exceeding the alert threshold.
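A quick feasibility check along these lines can catch an alert definition that can never fire. This is a sketch with a hypothetical function name; it simply compares X with the number of collections that fit into Y.

```python
def can_ever_fire(times_required, period_minutes, interval_minutes):
    """Return True if X collections can occur within the time period Y."""
    collections_per_period = period_minutes // interval_minutes
    return times_required <= collections_per_period


# Free Memory, 5 minute collection interval, Y = 1 hour.
print(can_ever_fire(15, 60, 5))  # X = 15 exceeds the 12 collections per hour
print(can_ever_fire(12, 60, 5))  # X = 12 requires every collection to exceed
```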
Managing Alert Triggering and Action Execution*
You can control and manage Alert notification behavior in HQ Enterprise. You can prevent the same Alert from firing repeatedly and configure consolidated notifications for related Alerts.
The "Enable Action Filters" portion of the Alert definition page provides these Alert configuration options.
Disable alert until re-enabled manually or by recovery alert
When the Alert fires, it will be automatically disabled, and remain so until it is re-enabled manually or via a Recovery Alert. Recovery Alerts are covered in greater detail in the [Defining Recovery Alerts] section.
Filter notification actions that are defined for related alerts.
Configure this option to reduce the number of notifications when related Alerts fire. Related Alerts are alerts on Resources on the same Platform, including the Platform. When you filter notifications for related Alerts, a notification for the first of the related Alerts to fire will be issued. After that a bundle of filtered alerts will be sent out every 5 minutes as Alerts continue to fire.
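The effect of filtering notifications for related Alerts can be pictured with a small simulation. It is only a sketch: the function name is hypothetical, and the assumption that the 5 minute bundling windows start when the first related Alert fires is made for illustration and is not stated above.

```python
BUNDLE_MINUTES = 5  # bundle interval given in the text above


def plan_notifications(fire_times):
    """fire_times: minutes at which related alerts fired.
    Returns (send_time, [fire times covered]) pairs: the first alert is
    notified immediately, later ones are grouped into 5 minute bundles."""
    times = sorted(fire_times)
    if not times:
        return []
    first = times[0]
    plan = [(first, [first])]
    bundles = {}
    for t in times[1:]:
        # Assumed: each bundle goes out at the end of its 5 minute window.
        window = (t - first) // BUNDLE_MINUTES + 1
        send_at = first + window * BUNDLE_MINUTES
        bundles.setdefault(send_at, []).append(t)
    plan.extend(sorted(bundles.items()))
    return plan


for send_at, alerts in plan_notifications([0, 1, 3, 7, 12]):
    print(send_at, alerts)
```

In this run five related Alerts produce four notifications instead of five; the saving grows as more Alerts fire within the same window.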