SNMP Plugin
- Introduction
- Getting Started
- Iteration 1: A Very Basic Plug-in
- Iteration 2: Additional Platform Metrics
- Iteration 3: Pulling in Network Interfaces as Platform Services
- Iteration 4: Collecting Network Interface Service Metrics
- Iteration 5: Adding Auto-Discovery Components for the Platform
- The Final Product
- Additional SNMP Plugin Examples
- Resources
Introduction
SNMP is the standard protocol for monitoring network-attached devices. Several bundled HQ plugins rely on it, and the PDK makes SNMP-based plugins easy to write. The bundled netdevice plugin provides a generic Network Device platform type, which can be used to monitor any device that implements IF-MIB (RFC 2863) and IP-MIB (RFC 4293). The Network Host platform type extends Network Device with support for HOST-RESOURCES-MIB (RFC 2790). The Cisco IOS platform also extends Network Device, adding metrics from CISCO-PROCESS-MIB and CISCO-MEMORY-POOL-MIB. The Cisco PIXOS platform extends Cisco IOS, adding metrics from CISCO-FIREWALL-MIB.
In looking at any HQ plug-in, there are two main concepts to grok:
First is the inventory model. Resource types define where things live in the hierarchy along with supported metrics, control actions, log message sources, etc., as well as the configuration properties used by each feature. In the case of implementing a custom SNMP plug-in for a network device, you are typically defining a platform type that collects any scalar variables that apply to the device and one or more service types to collect table data such as interfaces, power supplies, fans, etc.
Second is the metric template attribute, which is a string containing all the information required to collect a particular data point. In an SNMP plug-in, each metric corresponds to an SNMP OID. While we tend to use object names to identify the desired data points in the plugins, you can also use the numeric OID. Numeric OIDs have the added benefit that the MIB file does not need to be available everywhere the plug-in is used.
The process of implementing a new SNMP-based plugin for HQ starts with locating the device vendor's MIB file(s) and choosing which OIDs you wish to collect as metrics in HQ.
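To make the numeric-OID option concrete, here is a minimal sketch of a metric that references sysUpTime by number rather than by name. It borrows the ${snmp.template} property and metric attributes used in the iterations that follow. 1.3.6.1.2.1.1.3 is the standard numeric OID for sysUpTime in the MIB-2 system group. Whether a given HQ version also expects an instance suffix such as .0 is not covered here, so treat the exact form as an assumption to verify against your own agent.

```xml
<!-- Sketch only: an Uptime metric addressed by numeric OID instead of object name. -->
<!-- 1.3.6.1.2.1.1.3 is sysUpTime; no MIB file is needed to resolve a numeric OID. -->
<metric name="Uptime"
        template="${snmp.template}:1.3.6.1.2.1.1.3"
        category="AVAILABILITY"
        units="jiffys"
        defaultOn="true"
        collectionType="static"/>
```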
Getting Started
We'll be implementing a NetScreen plugin, so the first step is to verify basic connectivity to the device using the snmpwalk command:
$ snmpwalk -Os -v2c -c netscreen 10.2.0.140 system
sysDescr.0 = STRING: NetScreen-5GT version 5.0.0r11.1 (SN: 0064062006000809, Firewall+VPN)
sysObjectID.0 = OID: enterprises.3224.1.14
sysUpTimeInstance = Timeticks: (186554200) 21 days, 14:12:22.00
sysContact.0 = STRING: ops@hyperic.com
sysName.0 = STRING: ns5gt
sysLocation.0 = STRING: SF Office
sysServices.0 = INTEGER: 72
Next, having downloaded and unarchived the MIB packages, we install the NetScreen MIB files in the appropriate location for our machine's SNMP installation:
$ sudo cp NS-SMI.mib NS-RES.mib NS-INTERFACE.mib /usr/share/snmp/mibs
Now, verify we can view OIDs defined in NS-RES.mib:
$ snmpwalk -Os -M /usr/share/snmp/mibs -m all -v2c -c netscreen 10.2.0.140 netscreenResource
nsResCpuAvg.0 = INTEGER: 2
nsResCpuLast1Min.0 = INTEGER: 2
nsResCpuLast5Min.0 = INTEGER: 2
nsResCpuLast15Min.0 = INTEGER: 2
nsResMemAllocate.0 = INTEGER: 47310400
nsResMemLeft.0 = INTEGER: 60884848
nsResMemFrag.0 = INTEGER: 2537
nsResSessAllocate.0 = INTEGER: 19
nsResSessMaxium.0 = INTEGER: 2000
nsResSessFailed.0 = INTEGER: 0
And the tabular OIDs defined in NS-INTERFACE.mib:
$ snmpwalk -Os -M /usr/share/snmp/mibs -m all -v2c -c netscreen 10.2.0.140 netscreenInterface | more
nsIfIndex.0 = INTEGER: 0
nsIfIndex.1 = INTEGER: 1
nsIfIndex.2 = INTEGER: 2
nsIfIndex.3 = INTEGER: 3
nsIfName.0 = STRING: "trust"
nsIfName.1 = STRING: "untrust"
nsIfName.2 = STRING: "serial"
nsIfName.3 = STRING: "vlan1"
...
Iteration 1: A Very Basic Plug-in
Once the MIBs are sorted out, you can begin with a very simple plug-in that might look something like this (line numbers added for instructional purposes):
 1  <plugin>
 2    <property name="MIBDIR" value="/usr/share/snmp/mibs"/>
 3
 4    <property name="MIBS"
 5              value="${MIBDIR}/NS-SMI.mib,${MIBDIR}/NS-RES.mib,${MIBDIR}/NS-INTERFACE.mib"/>
 6
 7    <platform name="NetScreen">
 8
 9      <config include="snmp"/>
10
11      <plugin type="measurement"
12              class="org.hyperic.hq.product.SNMPMeasurementPlugin"/>
13
14      <property name="template" value="${snmp.template}:${alias}"/>
15
16      <metric name="Availability"
17              template="${snmp.template},Avail=true:sysUpTime"
18              indicator="true"/>
19
20      <metric name="Uptime"
21              alias="sysUpTime"
22              category="AVAILABILITY"
23              units="jiffys"
24              defaultOn="true"
25              collectionType="static"/>
26
27    </platform>
28  </plugin>
Let's dissect this to better understand what is going on:
- The first and last lines enclose the plug-in contents within the tags <plugin> and </plugin>
- Line 2 defines where the MIB files can be found on the system that will be collecting the SNMP data from the NetScreen device
- Lines 4 and 5 define which specific MIBs our plug-in will use
- Line 7 begins the Platform definition, and provides the type name that will appear in HQ
- Line 9 specifies that we want to include the HQ default SNMP information and templates available in the Network Host and Network Device specifications
- Lines 11 and 12 specify that we are defining a measurement plug-in using the org.hyperic.hq.product.SNMPMeasurementPlugin class
- Line 14 declares the template we will use for the measurement data we collect
- Lines 16 through 18 define how the Availability metric will be collected. The name set for the metric is how it will show up in the HQ UI. Note also that we change the template to denote that Availability is true if we can get the sysUpTime OID data, and that we set this as an indicator value that is turned on (provides the green light / red light information for the Platform)
- Lines 20 through 25 define our Uptime metric. Note the alias attribute, whose value is substituted for ${alias} in the template (line 14) as data is collected. We also specify the category, units, defaultOn, and collectionType values as per the measurement plug-in documentation
- Line 27 closes out the platform definition
This gets us going, but does not yet provide us with a lot of useful information about the platform. Before diving in to gather more information, let's take another look at line 21. Instead of using the alias parameter, we could also have defined that line like this:
template="${snmp.template}:sysUpTime"
This explicitly defines a template for this metric rather than relying on the alias value and the default measurement template we set.
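Written out in full, the Uptime metric with an explicit template might look like the sketch below. It is the same definition as lines 20 through 25, with the alias attribute dropped and its value folded into the template. Nothing else is assumed to change.

```xml
<!-- Sketch: Uptime metric using an explicit template instead of alias + default template. -->
<metric name="Uptime"
        template="${snmp.template}:sysUpTime"
        category="AVAILABILITY"
        units="jiffys"
        defaultOn="true"
        collectionType="static"/>
```

The alias form is shorter when many metrics share one template, which is why the remaining iterations use it.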
Iteration 2: Additional Platform Metrics
OK. Let's gather some more scalar Platform metrics that might prove interesting:
...
26      <metric name="Average CPU Utilization"
27              alias="nsResCpuAvg"
28              units="percent"/>
29
30      <metric name="Average CPU Utilization (Last 1 min)"
31              alias="nsResCpuLast1Min"
32              units="percent"/>
33
34      <metric name="Average CPU Utilization (Last 5 min)"
35              alias="nsResCpuLast5Min"
36              units="percent"/>
37
38      <metric name="Average CPU Utilization (Last 15 min)"
39              alias="nsResCpuLast15Min"
40              indicator="true"
41              units="percent"/>
42
43      <metric name="Memory Allocated"
44              alias="nsResMemAllocate"
45              units="B"/>
46
47      <metric name="Memory Left"
48              alias="nsResMemLeft"
49              indicator="true"
50              units="B"/>
51
52      <metric name="Memory Memory Fragment"
53              alias="nsResMemFrag"
54              units="B"/>
55
56      <metric name="Sessions Allocated"
57              alias="nsResSessAllocate"/>
58
59      <metric name="Sessions Maximum"
60              alias="nsResSessMaxium"/>
61
62      <metric name="Sessions Failed"
63              alias="nsResSessFailed"
64              collectionType="trendsup"/>
...
Again, we provide a name value for how the metric will appear in HQ, use the alias to specify the OID name to be used with the template, and where necessary, specify units, whether or not this will be a default indicator, and the collectionType. This gets us good, basic system information for the platform.
Iteration 3: Pulling in Network Interfaces as Platform Services
Now, we want to get information about the device network interfaces. To do this, we must query the SNMP table data from the device, and put them in proper context as Service definitions within HQ. We add the following to the plug-in:
...
67      <!-- index to get table data -->
68      <filter name="index"
69              value="snmpIndexName=${snmpIndexName},snmpIndexValue=%snmpIndexValue%"/>
70
71      <filter name="template"
72              value="${snmp.template}:${alias}:${index}"/>
73
74      <server>
75        <service name="Interface">
76          <config>
77            <option name="snmpIndexValue"
78                    description="Interface name"/>
79          </config>
80
81          <property name="snmpIndexName" value="nsIfName"/>
82
83          <metric name="Availability"
84                  template="${snmp.template},Avail=true:nsIfStatus:${index}"
85                  indicator="true"/>
86
87        </service>
88      </server>
...
Breaking the collection of this table data down:
- In lines 68 and 69 we define an index filter to correlate name and value pairs from the SNMP table data
- In lines 71 and 72 we define a new template that takes into account the OID and its associated index
- In line 74 we start a Server definition. In this case, the Server's only attributes are the Platform Services we are defining in lines 75 through 87: the network interfaces for the device
- In lines 75 through 81 the Service is given a name, and the individual interface name is derived by associating the snmpIndexValue with the nsIfName (through the snmpIndexName association) defined by the OID
- In lines 83 through 85, as we did at the Platform level, we define our Availability metric, defining availability as true if we can gather the nsIfStatus value for the interface, and setting it as a default indicator
- In line 87 we close the Service definition with the </service> tag
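To see how the two filters combine, consider an Interface service whose snmpIndexValue option is set to trust, one of the nsIfName values returned by the earlier snmpwalk. Assuming straightforward substitution of ${snmpIndexName}, %snmpIndexValue%, and ${index}, the Availability template from lines 83 through 85 would effectively resolve to the following. This is shown only as a sketch of the expansion; you would not write it by hand.

```xml
<!-- Sketch of the resolved Availability template for the "trust" interface. -->
<!-- ${index} has expanded to snmpIndexName=nsIfName,snmpIndexValue=trust -->
<metric name="Availability"
        template="${snmp.template},Avail=true:nsIfStatus:snmpIndexName=nsIfName,snmpIndexValue=trust"
        indicator="true"/>
```

The intent is that the row whose nsIfName equals trust is located first, and nsIfStatus is then read from that same row.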
Iteration 4: Collecting Network Interface Service Metrics
Collecting the metric data for each interface is very similar to what we did to collect the scalar data for the Platform. The difference is that it is contained within the Service definition. Here's what that looks like:
...
 87          <!-- nsIfFlow* metrics -->
 88          <metric name="Bytes Received"
 89                  alias="nsIfFlowInByte"
 90                  indicator="true"
 91                  collectionType="trendsup"
 92                  category="THROUGHPUT"
 93                  units="B"/>
 94
 95          <metric name="Bytes Sent"
 96                  alias="nsIfFlowOutByte"
 97                  indicator="true"
 98                  collectionType="trendsup"
 99                  category="THROUGHPUT"
100                  units="B"/>
101
102          <metric name="Packets Received"
103                  alias="nsIfFlowInPacket"
104                  collectionType="trendsup"
105                  category="THROUGHPUT"/>
106
107          <metric name="Packets Sent"
108                  alias="nsIfFlowOutPacket"
109                  collectionType="trendsup"
110                  category="THROUGHPUT"/>
111
112          <!-- nsIfMon* metrics -->
113          <metric name="Auth Failures"
114                  alias="nsIfMonAuthFail"
115                  collectionType="trendsup"
116                  category="AVAILABILITY"/>
...
Iteration 5: Adding Auto-Discovery Components for the Platform
The final touch is adding the pieces needed for auto-discovery to work. This is a nice convenience when you use the plug-in, since inventory information for the Platform and any discoverable services that are defined is automatically pulled into HQ. The additions are:
...
  7  <!-- for autoinventory plugin -->
  8  <classpath>
  9    <include name="pdk/plugins/netdevice-plugin.jar"/>
 10  </classpath>
...
 11  <properties>
 12    <property name="sysContact"
 13              description="Contact Name"/>
 14    <property name="sysName"
 15              description="Name"/>
 16    <property name="sysLocation"
 17              description="Location"/>
 18    <property name="Version"
 19              description="Version"/>
 20  </properties>
 21
 22  <plugin type="autoinventory"
 23          class="org.hyperic.hq.plugin.netdevice.NetworkDevicePlatformDetector"/>
...
 94  <plugin type="autoinventory"
 95          class="org.hyperic.hq.plugin.netdevice.NetworkDeviceDetector"/>
...
105  <plugin type="autoinventory"/>
106
107  <properties>
108    <property name="nsIfIp"
109              description="IP Address"/>
110    <property name="nsIfNetmask"
111              description="Netmask"/>
112    <property name="nsIfGateway"
113              description="Gateway"/>
114  </properties>
...
- In lines 7 through 10, we import the netdevice-plugin to enable auto-discovery
- In lines 11 through 20, we add some inventory properties that will show up on the Inventory tab
- In lines 22 and 23, we call out the NetworkDevicePlatformDetector for auto-inventory of the Platform scalar values (enabled through the inclusion we did in lines 7 through 10)
- In lines 94 and 95, we call out the NetworkDeviceDetector for auto-inventory of the Platform table values (also enabled through the inclusion we did in lines 7 through 10)
- In lines 105 through 114, we ensure that the network data is incorporated into the Platform inventory as part of the auto-discovery process
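Because the listings above show only fragments, it may help to see how the pieces could nest. The skeleton below is a sketch assembled from the five iterations. Placing the two detector plug-ins at the platform and server levels, and the bare autoinventory plug-in inside the service, is inferred from the line numbers and descriptions rather than copied from the final file, so treat netscreen-plugin.xml as authoritative wherever the two differ.

```xml
<!-- Sketch only: element placement is inferred from the listings, not taken from the final file. -->
<plugin>
  <property name="MIBDIR" value="/usr/share/snmp/mibs"/>
  <property name="MIBS"
            value="${MIBDIR}/NS-SMI.mib,${MIBDIR}/NS-RES.mib,${MIBDIR}/NS-INTERFACE.mib"/>

  <!-- for autoinventory plugin -->
  <classpath>
    <include name="pdk/plugins/netdevice-plugin.jar"/>
  </classpath>

  <platform name="NetScreen">
    <config include="snmp"/>

    <!-- platform inventory properties (sysContact, sysName, sysLocation, Version) go here -->

    <plugin type="autoinventory"
            class="org.hyperic.hq.plugin.netdevice.NetworkDevicePlatformDetector"/>
    <plugin type="measurement"
            class="org.hyperic.hq.product.SNMPMeasurementPlugin"/>

    <property name="template" value="${snmp.template}:${alias}"/>
    <!-- scalar platform metrics from Iterations 1 and 2 go here -->

    <!-- index to get table data -->
    <filter name="index"
            value="snmpIndexName=${snmpIndexName},snmpIndexValue=%snmpIndexValue%"/>
    <filter name="template"
            value="${snmp.template}:${alias}:${index}"/>

    <server>
      <plugin type="autoinventory"
              class="org.hyperic.hq.plugin.netdevice.NetworkDeviceDetector"/>

      <service name="Interface">
        <config>
          <option name="snmpIndexValue"
                  description="Interface name"/>
        </config>
        <property name="snmpIndexName" value="nsIfName"/>

        <plugin type="autoinventory"/>
        <!-- interface inventory properties (nsIfIp, nsIfNetmask, nsIfGateway) go here -->
        <!-- interface metrics from Iterations 3 and 4 go here -->
      </service>
    </server>
  </platform>
</plugin>
```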
The Final Product
The final plug-in in its entirety is here in netscreen-plugin.xml.