We have a client running our HA software who recently had files reported as missing from the target system even though they existed on the source system. Our first thought was that some process must be deleting the files and adding them back without journaling them. That did turn out to be the case, but the problem we faced was how to identify the actual reason for the delete and which process was doing it.
The client has limited knowledge of the application, and the supplier promised faithfully that they were not deleting files and adding them back. We had to be able to prove that the HA application was not causing the issue because, as usual, that’s where the finger always points. The first thing we did was run a DSPJRN against the journal on the production system, looking for any entries for the files and using the entire receiver chain to make sure we caught everything. The command came back with 0 entries converted. As we had no history with the files, we assumed they had never been journaled, which would explain why we had never kept them updated. Easy fix: we journaled the files and sync’d them up. A couple of days later the same files re-appeared in the audits, which reported only that the files did not exist on the target. Again we ran the DSPJRN command to see what had happened, and again it came back stating that the file had no entries in the journal!
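For anyone wanting to repeat the check, the object-specific DSPJRN described above might look something like the following. This is a sketch only: the library, journal, file and outfile names (PRODLIB, PRODJRN, ORDWORK, JRNCHK) are made up for illustration, and writing the results to an outfile is our assumption about how the command was run.

```
/* Hypothetical names: PRODLIB/PRODJRN is the journal,      */
/* PRODLIB/ORDWORK is one of the files reported as missing. */
/* RCVRNG(*CURCHAIN) searches the whole current receiver    */
/* chain rather than only the attached receiver.            */
DSPJRN     JRN(PRODLIB/PRODJRN) +
           FILE((PRODLIB/ORDWORK)) +
           RCVRNG(*CURCHAIN) +
           OUTPUT(*OUTFILE) +
           OUTFILE(QTEMP/JRNCHK)
```

If no entries match the selection, nothing is written to the outfile, which is the "0 entries converted" result we kept seeing.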
We believed we must have done something wrong with the journaling and sync’ing of the files. So we made the same request to start journaling and resync, only this time we checked that the files were journaled and were the same on both systems after the sync. The next audit came up clean, so we thought OK, it’s fixed. It wasn’t. A couple of days later the files appeared in the audit reports as missing from the remote system again!
DSPJRN on the source system yet again showed no entries! So we looked at the logs on the target system and could see that the files had been synced in a particular receiver, as we expected. We looked closely at the entries deposited by the APYJRNCHG command and could see the file being deleted as part of the apply process. Using the data in the receiver we were able to track down the offending program and prove to the client that we were in fact working as expected, and that the application and a user were responsible for deleting the files. Now the client and the application vendor have to decide how they want to handle the files and whether they are important for recovery. They cannot automatically journal new objects because of the high volume of temporary objects being created in the database library! Why developers don’t separate temporary objects from production objects is beyond us. Another option would be to clear the files instead of deleting them!
Enough of the rant! What does appear strange to us is how the DSPJRN command behaved. On the source system the object existed, yet DSPJRN failed to find any entries for it. If the object had been journaled we could have believed that DSPJRN used the JID to select entries, but the object wasn’t journaled. On the target system the object did not exist, and yet DSPJRN did find all the entries for that particular object. As on the source system, it was not journaled there either, so why did it find the entries? The problem only appeared when we were looking for entries for a specific object. Trawling through the receivers without selecting an object showed all of the entries for all the objects, so they are there.
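Trawling the receivers without naming an object, as described above, could be done along these lines. Again this is only a sketch: PRODLIB, PRODJRN and JRNALL are made-up names, and selecting journal code D with entry type DT (which, as we understand it, is the entry deposited when a file is deleted) is our assumption about a sensible way to narrow the output.

```
/* Hypothetical names throughout. No FILE parameter is      */
/* given, so entries are not selected by object. Journal    */
/* code D / entry type DT is assumed to be "file deleted".  */
DSPJRN     JRN(PRODLIB/PRODJRN) +
           RCVRNG(*CURCHAIN) +
           JRNCDE((D)) +
           ENTTYP(DT) +
           OUTPUT(*OUTFILE) +
           OUTFILE(QTEMP/JRNALL)
```

The outfile can then be queried by object name. With the default outfile format the object, library, job, user and program should be in the JOOBJ, JOLIB, JOJOB, JOUSER and JOPGM fields, which is the kind of information needed to tie a delete back to a program.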
So if you run a DSPJRN looking for entries for a specific object, be aware that the results may not be complete.