The proper way to take care of DFHSM
William Moll Jr.
Mainframe Storage Administrator at Kyndryl working remotely from home
DFHSM Issues
As a Mainframe Storage Administrator, I have worked at approximately 10 companies, either as an employee or as a contractor. What I have seen at all of them is that DFHSM is being treated improperly.
The largest violation I have come across was a company that had HSM journaling turned off. This tells me they have no clue about HSM. Journaling writes every update to a control dataset out to the journal dataset, which gives you the capability to forward recover a control dataset when it gets corrupted. First you restore the failing HSM control dataset from the latest backup. Then you issue the UPDATEC command, which reads the journal dataset and applies all the journaled updates to the restored control dataset.
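To make that concrete, here is a minimal sketch of the forward-recovery flow. It assumes journaling is enabled in your ARCCMDxx parmlib member, that the HSM started task is named DFHSM, and that it is the MCDS that got corrupted; the names are illustrative, so check the exact syntax against the DFSMShsm Storage Administration manual for your release.

1. In ARCCMDxx, journaling should already be on: SETSYS JOURNAL(RECOVERY)
2. Stop HSM on every system sharing the control datasets: F DFHSM,STOP
3. Restore the corrupted MCDS from its latest backup copy.
4. Restart HSM: S DFHSM
5. Apply the journaled updates from TSO: HSENDCMD WAIT UPDATEC MIGRATIONCONTROLDATASET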
The next issue I saw is that a lot of installations do not understand HSM backups. They think that if you define 7 backup versions in HSM, then when HSM creates backup number 8 it will delete backup number 1. WRONG!!! HSM only flags backup versions for deletion and will not delete them until an EXPIREBV EXECUTE runs. One company I worked at had a whole 3390 mod 9 allocated for the Backup Control Dataset, and it needed to be expanded because it was full. I analyzed the issue: EXPIREBV EXECUTE had never been run. I ran EXPIREBV EXECUTE, it ran for almost a week, and it freed around 4 million backup versions. I then cut the BCDS down to about 1/4 of a 3390 mod 9 and set up a job to run EXPIREBV EXECUTE weekly so the problem would never occur again.
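For anyone who wants to set up the same weekly job, here is a minimal sketch using TSO batch; the job card and accounting details are illustrative and will differ at your shop:

//EXPIREBV JOB (ACCT),'WEEKLY EXPIREBV',CLASS=A,MSGCLASS=X
//TSOBATCH EXEC PGM=IKJEFT01
//SYSTSPRT DD  SYSOUT=*
//SYSTSIN  DD  *
  HSENDCMD WAIT EXPIREBV EXECUTE
/*

The WAIT keyword keeps the TSO step from ending before HSM has finished processing the command.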
The next issue I have seen at many places is that reorgs of the HSM control datasets and Small Dataset Packing (SDSP) datasets are ignored. One company I started at in 2012 had not done a reorg on these datasets since 1999; I could see why HSM requests were sort of slow. I did a reorg on these datasets and then set up jobs to reorg them every 6 months. Keeping these datasets in good shape keeps HSM running efficiently.
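The usual way to reorg a control dataset is an IDCAMS EXPORT/IMPORT, with HSM shut down on every sharing system first. Here is a minimal sketch for the MCDS, with illustrative dataset names and space figures; some shops prefer to REPRO into a freshly defined cluster instead:

//CDSREORG JOB (ACCT),'MCDS REORG',CLASS=A,MSGCLASS=X
//* HSM must be stopped on all systems before this job runs
//EXPORT   EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=*
//PORTOUT  DD  DSN=HSM.MCDS.EXPORT,DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,SPACE=(CYL,(300,100),RLSE)
//SYSIN    DD  *
  EXPORT HSM.MCDS -
         OUTFILE(PORTOUT) TEMPORARY
/*
//IMPORT   EXEC PGM=IDCAMS,COND=(0,NE)
//SYSPRINT DD  SYSOUT=*
//PORTIN   DD  DSN=HSM.MCDS.EXPORT,DISP=OLD
//SYSIN    DD  *
  IMPORT INFILE(PORTIN) -
         OUTDATASET(HSM.MCDS) PURGE
/*

TEMPORARY leaves the original cluster in place until the IMPORT replaces it. The rewrite puts the records back in key order and cleans up the CI/CA splits that build up over the years, which is where the slow HSM requests come from.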
Now comes the most violated HSM issue. HSM control datasets tend to accumulate errors, and HSM tapes get errors as well. There is a facility in HSM that will allow you to clean up these errors: the various AUDIT commands. You can audit the MCDS (Migration Control Dataset), the BCDS (Backup Control Dataset), and the OCDS (Offline Control Dataset), and finally run AUDIT MEDIACONTROLS against your HSM tapes. The control dataset audits generate FIXCDS commands that fix the control dataset records, while AUDIT MEDIACONTROLS goes against your HSM tapes and fixes their records. By running the media controls audit you are also fixing a RECYCLE issue: a tape showing 20% valid data may actually hold only 10% valid data, with the other 10% being bad records that need to be fixed, so RECYCLE keeps passing over tapes it should be reclaiming.
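Here is a minimal sketch of the commands involved, with an illustrative tape volser and output dataset names; a sensible habit is to run NOFIX first so the generated FIXCDS commands land in the output dataset for review, then rerun with FIX to apply the repairs:

HSENDCMD WAIT AUDIT DATASETCONTROLS(MIGRATION) NOFIX ODS(HSM.AUDIT.MCDS)
HSENDCMD WAIT AUDIT DATASETCONTROLS(BACKUP) NOFIX ODS(HSM.AUDIT.BCDS)
HSENDCMD WAIT AUDIT MEDIACONTROLS VOLUMES(T10001) FIX ODS(HSM.AUDIT.T10001)

Once AUDIT MEDIACONTROLS has cleaned up the bad records, the valid-data percentage RECYCLE sees on each tape is accurate again, so tapes get reclaimed when they actually should be.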