HQ Script Service
The Script service, available under every platform resource, can be used to invoke Nagios-style plugins written in the language of your choice.
Each instance of a Script service provides the following metrics:
- Availability - Uses process exit code:
  - 0 == OK
  - 1 == Warning
  - 2 == Critical
  - 3 == Unknown
  - 4 == Paused
- Execution Time - Real time (in milliseconds) taken for each execution
- Result Value - First number (if any) seen in the output stream, example: "Number of Transactions: 234"
Script services are configured using the following properties:
| Name | Description | Required | Example | Notes |
|---|---|---|---|---|
| prefix | Space delimited prefix argument(s) | No | sudo | |
| path | Path to the script or program | Yes | /usr/local/nagios/libexec/check_http | HQ will check that the path exists |
| arguments | Space delimited script arguments | No | -H 209.237.227.36 | Use quotes for arguments with spaces |
| timeout | Timeout in seconds | Yes | 120 | Execution is aborted if > timeout |
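As a rough illustration of how these properties and the three metrics fit together, the following Python sketch assembles a command line from prefix, path and arguments, enforces the timeout, and derives a status, an execution time and a result value. It is not HQ's implementation: the function names, the regular expression used to find the first number, and the choice to report "Unknown" for a missing path or an aborted run are assumptions made for this example.

```python
import os
import re
import shlex
import subprocess
import time

# Exit-code meanings as listed for the Availability metric.
STATUS_BY_EXIT_CODE = {0: "OK", 1: "Warning", 2: "Critical", 3: "Unknown", 4: "Paused"}

# Assumption: a "number" is an optional sign, digits and an optional fraction.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def build_command(path, prefix="", arguments=""):
    """Return argv: prefix words, then the script path, then the arguments."""
    # shlex.split honours quotes, so '-m "two words"' yields two arguments.
    return shlex.split(prefix) + [path] + shlex.split(arguments)


def result_value(output):
    """Return the first number in the output as a float, or None if there is none."""
    match = NUMBER.search(output)
    return float(match.group()) if match else None


def run_script(path, prefix="", arguments="", timeout=120):
    """Run the script once and return (status, elapsed_ms, value)."""
    if not os.path.exists(path):
        return "Unknown", 0.0, None  # sketch convention for a missing path
    started = time.monotonic()
    try:
        done = subprocess.run(build_command(path, prefix, arguments),
                              capture_output=True, text=True, timeout=timeout)
        status = STATUS_BY_EXIT_CODE.get(done.returncode, "Unknown")
        value = result_value(done.stdout)
    except subprocess.TimeoutExpired:
        status, value = "Unknown", None  # sketch convention for an aborted run
    elapsed_ms = (time.monotonic() - started) * 1000.0
    return status, elapsed_ms, value
```

With the example values from the table, build_command("/usr/local/nagios/libexec/check_http", "sudo", "-H 209.237.227.36") yields the argument list sudo, the plugin path, -H and the address, in that order.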
To create a Script service, click on the "New Platform Service" link from the platform view, select Script from the drop-down and enter a name of your choice. Edit the Configuration Properties to configure the fields listed above.
Another option is to click the "New Server" link and choose Nagios from the drop-down. You can then either import an existing Nagios configuration file by checking the Auto-Discover Plugins check box, or create services individually by clicking the "New Service" link and selecting Nagios Plugin from the drop-down. The Nagios Plugin service type has the same functionality as the platform Script service. Note that HQ has several built-in services for network protocols, which do not require any Nagios plugins.
Script Based Plugins
While the Script service can be useful, its metrics are very limited. The Script concept can be expanded to include any number of metrics by implementing an XML plugin. The plugin can use multiple scripts or a single script that outputs key=value pairs. These pairs are parsed by the script executor and collected as metrics using the keys within the metric templates. As an example, we will create an I/O Device service which uses an iostat script wrapper to format the data. Most Linux admins are familiar with the iostat command, which reports CPU and I/O stats for devices and partitions:
% iostat -d -x
Linux 2.6.9-22.EL (hammer) 06/23/2006
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
hdc 0.00 0.00 0.00 0.00 0.00 0.00 96.94 0.00 129.82 113.47 0.00
sda 0.02 0.59 0.07 0.54 2.00 9.06 17.95 0.00 4.21 1.75 0.11
Each device (hdc, sda) will be an instance of the I/O Device service, so we want the wrapper script to only collect metrics for a given device:
% iostat -d -x sda
Linux 2.6.9-22.EL (hammer) 06/23/2006
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.02 0.59 0.07 0.54 2.00 9.06 17.95 0.00 4.21 1.75 0.11
The wrapper script will invoke iostat with the arguments given above and parse the tabular output into key=value pairs like so:
% ./pdk/scripts/device_iostat.pl sda
rrqm/s=0.02
wrqm/s=0.59
r/s=0.07
w/s=0.54
rsec/s=2.00
wsec/s=9.06
avgrq-sz=17.95
avgqu-sz=0.00
await=4.21
svctm=1.75
%util=0.11
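The source of device_iostat.pl is not reproduced here, so the following is only a Python sketch of the same idea: find the header row, find the row for the requested device, and pair each column label with its value. The function names and the embedded sample text are illustrative; the parser also assumes the header row starts with "Device" (with or without a trailing colon), which may differ between iostat versions.

```python
SAMPLE = """\
Linux 2.6.9-22.EL (hammer) 06/23/2006
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.02 0.59 0.07 0.54 2.00 9.06 17.95 0.00 4.21 1.75 0.11
"""


def parse_iostat(text, device):
    """Return {column label: value string} for one device row, or {} if absent."""
    labels = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].rstrip(":") == "Device":
            labels = fields[1:]          # column labels follow the "Device:" cell
        elif labels is not None and fields[0] == device:
            return dict(zip(labels, fields[1:]))
    return {}


def format_pairs(pairs):
    """Render the pairs as key=value lines, one per metric."""
    return "\n".join(f"{key}={value}" for key, value in pairs.items())


if __name__ == "__main__":
    print(format_pairs(parse_iostat(SAMPLE, "sda")))
```

In a real wrapper the text would come from running iostat -d -x with the device name rather than from an embedded sample.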
The plugin defines configuration properties for the script path and the device name:
<config>
<option name="script"
description="Collector script"
default="pdk/scripts/device_iostat.pl"/>
<option name="device"
description="Device name"
default="sda"/>
</config>
The properties are then applied to a filter template:
<filter name="template"
value="exec:file=%script%,args=%device%"/>
Here, exec routes collection of the metric to the script executor plugin, and the properties are expanded to:
exec:file=pdk/scripts/device_iostat.pl,args=sda
The filter is then used in each metric template with the key it is to collect:
<metric name="Write Requests per Second"
category="PERFORMANCE"
indicator="true"
template="${template}:w/s"/>
Which is expanded to:
template="exec:file=pdk/scripts/device_iostat.pl,args=sda:w/s"
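The two substitution steps can be traced with a small Python sketch. It only mimics the textual expansion described here; the expand helper and the regular expressions are assumptions for illustration, not part of the plugin framework.

```python
import re


def expand(text, values, pattern):
    """Replace every placeholder matched by `pattern` with its value."""
    return re.sub(pattern, lambda match: values[match.group(1)], text)


# Step 1: %name% placeholders are filled from the config options.
config = {"script": "pdk/scripts/device_iostat.pl", "device": "sda"}
filters = {
    "template": expand("exec:file=%script%,args=%device%", config, r"%(\w+)%"),
}

# Step 2: ${name} placeholders in a metric template are filled from the filters.
metric_template = expand("${template}:w/s", filters, r"\$\{(\w+)\}")

if __name__ == "__main__":
    print(filters["template"])
    print(metric_template)
```

The first value matches the expanded filter and the second matches the expanded metric template for Write Requests per Second.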
Command Line Test
% java -jar pdk/lib/hq-product.jar -Dplugins.include=io-device -Ddevice=sda
I/O Device Availability:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:Availability
=>100.0%<=
I/O Device Read Requests Merged per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:rrqm/s
=>0.0<=
I/O Device Write Requests Merged per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:wrqm/s
=>0.6<=
I/O Device Read Requests per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:r/s
=>0.1<=
I/O Device Write Requests per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:w/s
=>0.6<=
I/O Device Sectors Read per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:rsec/s
=>2.0<=
I/O Device Sectors Writen per Second:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:wsec/s
=>9.1<=
I/O Device Average Sector Request Size:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:avgrq-sz
=>18.0<=
I/O Device Average Queue Length:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:avgqu-sz
=>0.0<=
I/O Device Average Wait Time:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:await
=>0.004s<=
I/O Device Average Service Time:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:svctm
=>0.001s<=
I/O Device CPU Usage:
I/O Device:exec:file=pdk/scripts/device_iostat.pl,args=sda:%util
=>0.1%<=
Create an I/O Device service
After you have deployed the iodevice plugin, instances of the service must be created manually. Using the HQ GUI:
- Navigate to the platform view of choice and click the "New Platform Service" link
- Enter a Name of your choice
- Select I/O Device from the drop-down list
- Click the OK button
- From the Inventory tab, click the Edit button
- Change any properties if needed
- Click the OK button
Your service is now configured, and metric data will be viewable from the Monitor tab.
I/O Device plugin sources
How often are my scripts executed?
The script collector caches the results to avoid executing the script for each individual metric. The cache key is the set of properties in the metric template.
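In the iostat example, where the template is:
exec:file=pdk/scripts/device_iostat.pl,args=sda:w/s
The properties/cache-key would be:
file=pdk/scripts/device_iostat.pl,args=sda
If you wanted the script to collect data for all resources in a single round, you could just change the template like so:
exec:file=pdk/scripts/device_iostat.pl:sda_w/s
Which makes the properties/cache-key the same for all resources:
file=pdk/scripts/device_iostat.pl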
The io-device-plugin.xml would change to:
<filter name="template"
value="exec:file=%script%:%device%"/>
<metric name="Availability"
template="${template}_Availability"
indicator="true"/>
And device_iostat.pl would prefix each output key with the device name:
print "${device}_$labels[$i]=$values[$i]\n";
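For comparison, a Python analogue of that Perl line might look like the sketch below, which emits a device-prefixed key for every device row in one pass. The function name and sample text are hypothetical, and the header detection (a row starting with "Device") is an assumption about the iostat output format.

```python
SAMPLE_ALL = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
hdc 0.00 0.00 0.00 0.00 0.00 0.00 96.94 0.00 129.82 113.47 0.00
sda 0.02 0.59 0.07 0.54 2.00 9.06 17.95 0.00 4.21 1.75 0.11
"""


def format_all_devices(text):
    """Return '<device>_<label>=<value>' lines for every device row."""
    lines, labels = [], None
    for row in text.splitlines():
        fields = row.split()
        if not fields:
            continue
        if fields[0].rstrip(":") == "Device":
            labels = fields[1:]
        elif labels is not None:
            device, values = fields[0], fields[1:]
            lines.extend(f"{device}_{label}={value}"
                         for label, value in zip(labels, values))
    return lines
```

A metric template ending in sda_w/s would then pick out the line sda_w/s=0.54 from the shared output.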
The lifetime of the cache is defined by the metric intervals, whose defaults are defined by the plugin and can be changed later per-resource or globally per-type in the UI. So, if your metric intervals were configured to collect every 5 minutes, the script would only be run once every 5 minutes regardless of how many resources the script output applies to.
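The caching behaviour can be sketched in Python as a small cache keyed by the properties portion of the template. This is an illustration under stated assumptions, not the collector's actual code: the class and function names are invented, the template is assumed to split at its first and last colon, and a single fixed interval stands in for the per-metric intervals.

```python
import time


def split_template(template):
    """Split 'domain:properties:key' into (properties, key).

    Assumption: the domain ends at the first colon and the metric key
    starts after the last colon.
    """
    _domain, rest = template.split(":", 1)
    properties, key = rest.rsplit(":", 1)
    return properties, key


class ScriptResultCache:
    """Run the script at most once per interval for each distinct properties string."""

    def __init__(self, run, interval_seconds, clock=time.monotonic):
        self._run = run                  # callable: properties -> {key: value}
        self._interval = interval_seconds
        self._clock = clock
        self._entries = {}               # properties -> (timestamp, results)

    def get(self, template):
        properties, key = split_template(template)
        now = self._clock()
        entry = self._entries.get(properties)
        if entry is None or now - entry[0] >= self._interval:
            entry = (now, self._run(properties))   # cache miss or expired entry
            self._entries[properties] = entry
        return entry[1].get(key)
```

With a 300 second interval, requesting w/s and r/s for the same properties triggers one script run; the next run happens only after the interval has elapsed.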