We thought it was OK but decided to make it better

It seems like only a few days ago that we announced the HMC checks shipping in the latest updates for AAG, but here we are announcing a NEW update for the HMC checks!

So why did we change so quickly and come up with a totally new set of HMC checks and a new install wizard?

a. First of all, we did not like the REST API process. We discussed a number of times just how overloaded the data sent back from the REST APIs was, and even after we figured out ways to reduce the data content, it added a number of convoluted actions to extract and store the UUID for each of the elements we needed to extract data for.

b. This added a lot of complexity to the Nagios XI wizard, especially when new systems or LPARs had to be added. We preloaded the database we managed with all of the systems and LPARs when the wizard ran; if anything changed, that process had to be run again to repopulate the DB and the checks had to be manually re-sorted.

c. Dealing with XML-formatted data is pretty heavy even with a C program, so the overhead required just to load and review the data from the HMC was a burden. Parsing XML also brings its own issues in terms of what has to happen and how much memory needs to be allocated.

d. The added complexity it brought was not required. The information provided by IBM about the APIs also left a lot to be desired in terms of what the returned content would be. There were gaps in the data provided, and the missing data, which we could not access, was probably more important from a monitoring standpoint.

What's new then?

a. The new process uses an SSH connection to the HMC and can run any command available on the HMC. The information we can glean from the commands is far better and a lot less work to extract than with the REST APIs.

b. Information that was missing from the REST APIs is now easily available via a number of commands. One item we were particularly interested in is whether updates are available for the HMC software that need to be installed. We can also send out a list of service events that have been sent in the last x hours, plus many other important data points.

c. The installation is quicker: no more parsing through pages of XML or JSON returned from the REST APIs just to set up the UUIDs for the various elements. The data returned is also significantly smaller, which means the parsing is quicker and less resource intensive.
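
To give a feel for why the SSH route is lighter, here is a minimal Python sketch of a check in that style. It is not the AAG implementation, just an illustration under some assumptions: that key-based SSH access to the HMC is already set up (the wizard itself takes a user profile and password), that the HMC command `lssyscfg -r sys -F name,state` returns one `name,state` line per managed system, and that a healthy system reports the state `Operating`. The function names are made up for the example.

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_hmc_command(host, user, command, timeout=30):
    """Run one command on the HMC via the system ssh client.

    Assumes key-based authentication is already configured for this user.
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", command],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return result.stdout


def parse_sys_states(output):
    """Turn 'name,state' lines into a {name: state} dictionary."""
    states = {}
    for line in output.splitlines():
        line = line.strip()
        if "," not in line:
            continue  # skip blank or unexpected lines
        name, _, state = line.partition(",")
        states[name] = state
    return states


def check_system(states, system_name):
    """Return (exit_code, message) for a single managed system."""
    if system_name not in states:
        return UNKNOWN, f"UNKNOWN - {system_name} not found on HMC"
    state = states[system_name]
    if state == "Operating":  # assumed healthy state
        return OK, f"OK - {system_name} is {state}"
    return CRITICAL, f"CRITICAL - {system_name} is {state}"
```

A check would then call `run_hmc_command(host, user, "lssyscfg -r sys -F name,state")`, pass the output to `parse_sys_states`, and exit with the code returned by `check_system`. There is no UUID lookup and no XML tree to build, and each result is a single short line of text.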

The installation via the wizard has changed slightly and, in our opinion, is a lot more flexible because we are not fixing the UUID to a specific managed system or LPAR and automatically setting up every one for each monitor.

The following shows how easy it is to install the HMC checks and how it differs from the original process.

Once you have uploaded the wizard zip file to the Nagios XI instance, it can be selected to add an HMC to the monitoring.

AAG for HMC Wizard uploaded to Nagios XI
Adding the HMC Address

Click the wizard icon and you are presented with the first screen, which requires the IP address for the HMC to be entered. Press Next.

Enter the profile information

You will need to enter the user profile and password for the HMC that can be used to collect the required information. Enter a name that will be used to store the connection information; this has to be unique for the hosts being configured. The default is to overwrite any existing data that exists for a host with this name. Press Next.

Default checks list

You are now presented with a list of the checks available. The defaults entered will not work in most cases because the names being passed in will not match any that are configured for the HMC. You can uncheck the checks that you are not interested in and add the correct parameters for the ones that you do want. This list reflects the checks for a single managed system and a single LPAR; there are normally a lot more that need to be added, which can be done after the initial check commands are installed. Click Finish.

Configuration successful

Click the link View status details for xxxx (your system config name) to see the checks that have been created.

Checks configured to run against HMC

That's all you have to do. The next step will be to copy the relevant checks to cover your other managed systems and LPARs so that you have everything being monitored.

The new checks are a lot faster and more efficient than using the REST APIs. My personal opinion is that the REST APIs are great for adding different views of the HMC using the data returned, but monitoring needs to be fast and efficient because it will be called constantly. Check out our website for a view of what we are delivering with our AAG monitoring solution for Nagios XI; the number of checks is increasing regularly. If something is not right, we will always try to find a better way to do it.

Chris…
