We have placed the latest PTF (PTF05) for HA4i on the website. To download the PTF, simply sign in and go to the product pages, where the download link is available. The following is a sample of the changes we have made to the product in this PTF; for a full list of enhancements and fixes, see the Cover Letter installed with the PTF.
Because the product is installed in a number of very large and active accounts, we found that the message queue used by HA4i would fill quickly and wrap with all of the status and diagnostic messages being sent. While some of the diagnostic messages are important, they matter only when investigating problems in the process. As part of the update we have removed many of these messages, such as the member delete messages issued when the output file is cleaned up after APYJRNCHG has run, and the 'No object saved or restored' messages sent to the HA4iMSGQ, unless HA4i is set to debug mode. Other messages, such as QCMDEXC messages that were followed by more detailed messages, and retry messages, have also been removed.
We have added email capabilities to the source system. You can now configure and start the email manager on each system, which allows processes such as the STATUSCHK program to send emails when the target system, or the link to it, is down. A new function key on the IFS audit screen allows all errors to be re-submitted at once instead of having to select each entry individually. If a constraint prevents the submission of a request, a message is sent stating that the submission was not carried out.
A major addition is a new APYJRNCHG process which reduces the locking required each time it is called. On V6R1 and above, the majority of the journal entries can now be processed without having to be filtered and managed using the QDBRPLAY API, which reduces the number of times the process has to lock and unlock all of the objects described to the journal.
Auditing has had a major upgrade in a number of areas. Some changes relate to the output of the audits: for example, we now mark a missing file with M instead of the Y/N flag which simply states that an error exists. A new logical file audit capability was added which audits logical files and reports errors in the make-up of the logical file, such as the based-on file information, the number of members in the logical file, access path information and indexes.
One concern many users reported was the amount of data they had to trawl through to find the errors that had been logged. With this update we have reduced the amount of data written to the spool and DB file via a new setting on the command, which provides either *FULL output (the same as before) or *ERR, which identifies just those objects in error. We have also moved the audit output to a separate output queue so it can be found by anyone who needs it, and added filters to the file audit views to allow the display of *SRC, *LGL, *PHY, *DTA (*LGL and *PHY) and *ALL, which improves data analysis.
Due to the way APYJRNCHG works, we have also updated the automated audit process to automatically retry *FILE object errors before running the audit against the physical files and after the existing journal receiver has been applied. This reduces the number of false-positive errors where a logical file member does not exist because the underlying data member has not yet been created by the APYJRNCHG process. As part of the same update we have added a new parameter to the object audit command (AUDLIB) to determine whether the Object Filter file is checked for matches before an object is audited. This stops audit errors being reported for objects which are not being replicated by HA4i.
Another concern was how to end an audit once it had been started: if a file audit started to process a file with millions of records, it could take a significant amount of time to complete. We added a couple of things to help with this. First, a new command will end the auditing of records after a certain number of records have been checked. Next, a new feature allows you to set where the audit starts, by skipping either an initial number of records or a percentage of the total record count. This allows files with a large number of records that are only ever extended (such as history files) to be audited by skipping a percentage of the records and fully auditing the remainder. The default audit will still skip a percentage of the records across the entire member. We have also added a new check to the file audit: it will now flag where the journal information differs or the file is no longer journaled on each system.
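To illustrate how the new start-position and record-limit options interact, here is a minimal Python sketch of the record range an audit would cover; the function and parameter names are our own for illustration, not HA4i's actual commands or internals.

```python
# Illustrative sketch only: computes the record range an audit would
# cover given a skip count or skip percentage, plus an optional limit.
def audit_window(total_records, skip_records=0, skip_percent=0, max_audit=None):
    """Return the (start, end) record range an audit would cover."""
    # Skip either a fixed number of records or a percentage of the total.
    start = skip_records if skip_records else total_records * skip_percent // 100
    end = total_records
    # Optionally stop after a set number of records have been checked.
    if max_audit is not None:
        end = min(end, start + max_audit)
    return start, end

# A 1,000,000-record history file, skipping the first 90%:
print(audit_window(1_000_000, skip_percent=90))   # (900000, 1000000)
```

For an extend-only history file this concentrates the audit on the newest records, which is where differences are most likely to appear.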
Object Replication Improvements
A new object retry process has been developed which cycles through failed object replication requests. Each time a request fails to be replicated it is marked, and a delay of between 1 and 50 seconds, increasing with each attempt, is applied before the same object is retried. Each object will be retried 5 times before being marked as a failed request. If an object no longer exists when the retry is processed, it is removed from the retry list and not marked in the error list. Objects that cannot be replicated because the library does not exist are automatically registered as failed requests.
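The retry rules above can be sketched roughly as follows in Python; the function names, the checks passed in as callbacks, and the exact delay schedule are assumptions for illustration, not HA4i's implementation.

```python
import time

MAX_RETRIES = 5

# Illustrative sketch of the retry cycle: missing library fails at once,
# a deleted object is dropped silently, otherwise retry with a growing
# delay and mark as failed after MAX_RETRIES attempts.
def retry_cycle(request, object_exists, library_exists, replicate, sleep=time.sleep):
    """Retry a failed replication request with an increasing delay."""
    if not library_exists(request):
        return "failed"            # missing library: register as failed at once
    for attempt in range(1, MAX_RETRIES + 1):
        if not object_exists(request):
            return "dropped"       # object deleted: remove from the retry list
        if replicate(request):
            return "replicated"
        # delay grows with each attempt, within the 1-50 second range
        sleep(min(attempt * 10, 50))
    return "failed"                # all retries exhausted: mark as failed
```

Passing the delay in as a `sleep` callback keeps the sketch testable without actually waiting between attempts.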
A new option on the RETRYOBJF command allows object retries to be re-submitted by type, such as *FILE or *DTAARA, plus *IFS for all IFS-type failures. This provides the ability to manage which objects are sync'd in which order, such as physical files before logical files.
Our initial take on object replication was that it would be based at the library level: if any supported object changed within the specified library, it would be replicated. This worked fine for most of our customers, but some had problems where objects are created and deleted constantly, or are always locked. The initial solution was to add object-specific entries so any object which matched would be ignored. However, this was still unsuitable for some customers, who had so many objects being created with generic names that they could not keep adding them to the list. So we added generic name support to the object replication filter process. The name is checked for an '*'; if one is found, the name up to the '*' is used to check the object request, i.e. TST* will filter all objects of the given type from the given library which begin with TST.
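The generic matching rule described above amounts to a simple prefix test, which can be sketched in Python as follows (illustrative only; not HA4i's actual matching code):

```python
# A trailing '*' makes a filter entry generic: any object whose name
# begins with the characters before the '*' is matched. Entries with
# no '*' still require an exact name match.
def matches_filter(object_name, filter_name):
    """Return True if object_name matches a fixed or generic filter entry."""
    if "*" in filter_name:
        prefix = filter_name.split("*", 1)[0]   # text before the '*'
        return object_name.startswith(prefix)
    return object_name == filter_name           # exact match otherwise

print(matches_filter("TSTFILE", "TST*"))   # True
print(matches_filter("PRDFILE", "TST*"))   # False
```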
Another request was to automatically repair a failed request, such as one where a command failed on the target because the object did not exist. Now, if a command fails, the object is located on the source and automatically replicated to the target system, and the error is not logged. If the object does not exist on the source either, the request is discarded.
Status & Management
While the status screens are not meant to be monitored constantly, we found that some additional information would benefit the user when reviewing certain statuses. As part of this PTF we have added the following information to the status screens.
- Depth of the process queue on the retry manager status.
- An indicator on the apply status screen showing whether any object or IFS errors are logged.
Finally, we have added a new command which allows the status of a remote journal apply process to be retrieved from HA4i. The command can only be run from a program, which passes a parameter declared as *CHAR(10) to receive the status. This allows users to develop programs that manage the apply process when carrying out functions such as running a save on the target system.
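As an illustration of how such a management program might use the retrieved status to gate a save, here is a minimal Python sketch; the status values (*ACTIVE, *ENDED) and the retrieval callback are assumptions for illustration, not HA4i's documented interface.

```python
# Hypothetical example: only proceed with a target-system save once the
# apply process has ended. The retrieve_status callback stands in for
# whatever mechanism the calling program uses to invoke the command.
def safe_to_save(retrieve_status):
    """Gate a target-system save on the apply process being inactive."""
    status = retrieve_status()          # a *CHAR(10)-style, blank-padded value
    return status.strip() == "*ENDED"   # only save once the apply has ended

print(safe_to_save(lambda: "*ENDED    "))   # True
print(safe_to_save(lambda: "*ACTIVE   "))   # False
```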
This has been a major PTF for HA4i, not in terms of fixes but in the additional functionality and features it provides to the user. All of the updates have been implemented as a result of customer requests, and that has to be good for everyone, because we are delivering what customers require, not what we feel they need.
HA4i is an affordable option for High Availability; if you are considering an HA solution, HA4i has to be part of your review process. If you need more information or would like to see a demo of the product, let us know.