High-Performance IT - Data Centre and Operations
The best-practice IT standard is:
Protection of data centre assets, completion of the night’s batch processing, successful completion and verification of back-ups, and availability of online systems ready for the next day’s business opening hours.
The performance and capability of data centre operations are assessed every night and every morning. Production and development batch processing, scripts, job scheduling and job run sheets are assessed daily, as are database availability, completion of overnight back-ups and the successful start-up of online production systems ready for business opening hours. Because of this daily scrutiny, a thorough health check is usually not warranted unless there have been recurring problems.
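The daily assessment described above can be pictured as a simple morning readiness roll-up. The following is a minimal sketch only, assuming each overnight check reports a pass or fail; the names `CheckResult` and `morning_readiness`, and the sample results, are hypothetical and not part of any particular scheduling or monitoring product.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one overnight check (hypothetical structure)."""
    name: str          # e.g. "back-up verification"
    passed: bool       # True if the check completed successfully
    detail: str = ""   # optional note for the operator diary


def morning_readiness(results):
    """Return (ready, failures).

    ready is True only when every check passed; failures lists
    the checks that need attention before business opening hours.
    """
    failures = [r for r in results if not r.passed]
    return (not failures, failures)


# Illustrative results for one night (made-up data).
results = [
    CheckResult("batch processing", True),
    CheckResult("back-up verification", False, "verification did not finish"),
    CheckResult("database availability", True),
    CheckResult("online systems", True),
]

ready, failures = morning_readiness(results)
for failure in failures:
    print(f"NOT READY: {failure.name} ({failure.detail})")
```

A roll-up like this treats any single failed check as blocking, which matches the standard above: all four outcomes are needed before the business day opens.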
Performance Assessment
Data Centre Security.
1. How often are physical security systems tested?
2. Video Tests?
3. Man Traps/Bollards?
4. Doors/Vents/Fan exhaust?
Redundancy.
1. How often are water and backup water systems checked?
2. How often are backup communications links checked?
Is the data centre manual up to date in respect of the following?
1. Changes to movement sensors, alarms.
2. Intruder monitoring, external security services.
3. Electronic access controls.
4. Off-site media transport and storage.
5. Air Conditioning.
6. Fire suppression systems.
7. Generator testing.
8. Equipment maintenance schedules.
9. UPS and generator testing.
10. Surveillance checks.
11. External physical penetration testing.
12. Review the data centre maintenance procedures.
13. Review the data centre operations manual.
Operations.
1. If there are processing problems, what tends to be the most common cause: 1) change control, 2) job scheduling, 3) run sheets or 4) other?
2. Is the operator’s manual up to date?
3. Manage and schedule batch jobs.
4. Batch job dependencies.
5. Printer definitions and queue management.
6. Is an operator (handover) diary in use?
7. Who manages data restoration services?
8. Who creates the application production scheduling calendar?
9. When was the last disaster recovery plan walkthrough?
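Batch job dependencies, raised in the list above, are often easiest to review when written down as an explicit graph. The sketch below is illustrative only: it assumes dependencies are recorded as a mapping from each job to the jobs that must finish first, and the job names and the `run_order` helper are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def run_order(dependencies):
    """Return one valid execution order for the given jobs.

    dependencies maps each job name to the set of jobs that must
    complete before it can start. Raises CycleError if the jobs
    depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical overnight schedule.
deps = {
    "extract": set(),
    "load": {"extract"},
    "report": {"load"},
    "backup": {"load"},
}

order = run_order(deps)
print(order)  # "extract" first, then "load", then the remaining jobs

# A circular dependency is reported rather than silently scheduled.
try:
    run_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("circular dependency found; review the run sheet")
```

Checking for cycles before the schedule is loaded is one way to catch a run sheet error during the day rather than during the overnight run.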
Sample Task List
1. Determine what risks are evident or may exist in the data centre’s physical and logical environments.
2. Arrange an assessment of physical security.
3. Confirm that support facilities (back-up communications, generators, water supply, fire suppression) are being regularly tested.
4. Determine further works required and scope out.
5. Break down the scope of works to task level, ready for loading into the change management project schedule.