
We have been aware of a limitation in the IBM-provided monitors: when they are set up in a specific way, CPU and resource usage can increase noticeably while the jobs are running. We have never seen the impact at any of our managed customers, but one of our larger customers informed us that they were seeing some impact on the systems they monitor, so we felt it was time to look at options to work around the limitation.
The problem originates in the Watch programs when the messages to watch for are set to *ALL. The IBM manuals and the EM4i manual clearly state that *ALL can affect performance and should not be used. However, most customers use it anyway because it's easier than working out which messages to watch for across the various message queues that need to be monitored. The benefit of the watches is that they pick up a message instantaneously, so we can send out notifications immediately and users can see the message and, if necessary, respond to it within a very short period. Watches can have many messages configured with little performance impact; it's only the *ALL setting that causes the issues. It is much better to know quickly that a batch job has a problem and respond than to leave the job and all follow-on jobs hanging until a response is given. Knowing that a disk failure has occurred can be pretty time sensitive as well.
The new feature creates a polling process, run at a configurable interval, that scans each of the message queues for messages that have not been configured on a watch and that either need a response or are *DIAG type messages. We have also added the ability to filter further by message severity, so you can limit the notifications sent out to only those messages that are urgent. When pulling messages back from the message queues we had to ensure we were not reading the entire queue every time; we are only interested in messages that have arrived since the last time we looked at that queue. We also want to track the messages picked up by this process so they can be added to the watch monitors in the future, ensuring those messages are picked up as quickly as possible.
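To make the polling approach described above more concrete, here is a minimal Python sketch of the general idea. It is not the EM4i implementation: the class names, the in-memory list standing in for a message queue, the `offset` bookmark, and the sample message IDs are all assumptions made for illustration. It shows the three parts of the logic: read only what has arrived since the last poll, keep only unwatched messages that are inquiry (`*INQ`) or `*DIAG` type and meet a minimum severity, and count what was picked up so those IDs can be considered for a watch later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QueueMessage:
    msg_id: str     # message identifier (sample values are illustrative)
    msg_type: str   # "*INQ" (needs a response), "*DIAG", "*INFO", ...
    severity: int   # 0-99, higher is more urgent


@dataclass
class QueuePoller:
    """Hypothetical poller; all names here are assumptions for this sketch."""
    watched_ids: set                  # IDs already configured on a watch
    min_severity: int = 0             # severity filter for notifications
    offset: int = 0                   # bookmark: messages already read
    candidates: dict = field(default_factory=dict)  # msg_id -> times seen

    def poll(self, queue):
        # Assumes the queue is an append-only list, so slicing from the
        # bookmark returns only messages not seen on a previous poll.
        new_messages = queue[self.offset:]
        self.offset = len(queue)

        alerts = []
        for msg in new_messages:
            if msg.msg_id in self.watched_ids:
                continue  # a watch already handles this message
            if msg.msg_type not in ("*INQ", "*DIAG"):
                continue  # only inquiry or diagnostic messages
            if msg.severity < self.min_severity:
                continue  # below the configured urgency threshold
            alerts.append(msg)
            # Track the ID so it can be added to a watch in the future.
            self.candidates[msg.msg_id] = self.candidates.get(msg.msg_id, 0) + 1
        return alerts


if __name__ == "__main__":
    queue = [
        QueueMessage("CPA0701", "*INQ", 99),
        QueueMessage("CPD0001", "*DIAG", 20),
    ]
    poller = QueuePoller(watched_ids=set(), min_severity=30)
    for alert in poller.poll(queue):
        print(alert.msg_id, alert.msg_type, alert.severity)
```

In a real poll job the `poll` call would run on a timer at the configured interval, and the bookmark would need to survive restarts and queue clears; both are left out here to keep the sketch short.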
The result is a process that runs very quickly and efficiently (we have seen very little resource use even when running the poll job very frequently) and provides an alternative to the *ALL option that many of our customers have implemented. It's easy to set up and should make using EM4i much better for those customers who are seeing performance impacts with *ALL settings.
The new feature will be released in the next update (due shortly).
Get in touch and ask for a demo; there are lots of cool features to see and many more coming soon.
Chris…