How to improve your IT. Part 17 - Data Centre and Operations

IT Data Centre

A series of posts on how to improve the performance of your IT

The best practice IT standard is:

Protection of data centre assets, completion of the night’s batch processing, successful completion and verification of back-ups, and availability of on-line systems ready for the next day’s business opening hours. The performance and capability of data centre operations are assessed every night and every morning. Production and development batch processing, scripts, job scheduling and job run sheets are assessed daily. Database availability, completion of overnight back-ups and the successful start-up of on-line production systems ready for business opening hours are also confirmed daily. As a result, a thorough health check is usually not warranted unless there have been recurring problems.
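The daily assessment described above can be thought of as a simple pass/fail roll-up of the overnight checks. The following is a minimal sketch only, not a description of any particular scheduling or monitoring product. The `NightlyCheck` class, the `morning_readiness` function and the sample results are all hypothetical and are there purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class NightlyCheck:
    """One overnight check and its outcome (hypothetical structure)."""
    name: str          # what was checked
    passed: bool       # True if it completed and was verified
    detail: str = ""   # optional note for the operator diary


def morning_readiness(checks):
    """Return (ready, failures); ready is True only if every check passed."""
    failures = [c for c in checks if not c.passed]
    return (not failures, failures)


# Sample results, invented for illustration only.
checks = [
    NightlyCheck("night's batch processing completed", True),
    NightlyCheck("back-ups completed and verified", False,
                 "verification step did not run"),
    NightlyCheck("databases available", True),
    NightlyCheck("on-line production systems started", True),
]

ready, failures = morning_readiness(checks)
print("Ready for business opening hours:", ready)
for c in failures:
    print(f"  FAILED: {c.name} ({c.detail})")
```

In practice the results would come from the job scheduler and back-up logs rather than being typed in by hand. The point of the sketch is that a single failed check is enough to mark the morning as not ready.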

Performance Assessment

1.     Data Centre Security.

a.     How often are physical security systems tested?

b.     Video Tests?

c.      Man Traps/Bollards?

d.     Doors/Vents/Fan exhaust?

2.     Redundancy.

a.     How often are water and back-up water systems checked?

b.     How often are back-up communications links checked?

3.     Is the data centre manual up to date in respect of the following?

a)     Changes to movement sensors, alarms.

b)     Intruder monitoring, external security services.

c)     Electronic access controls.

d)     Off-site media transport and storage.

e)     Air Conditioning.

f)      Fire suppression systems.

g)     Generator testing.

h)     Equipment maintenance schedules.

i)       UPS testing.

j)       Surveillance checks.

k)     External physical penetration testing.

l)       Review the data centre maintenance procedures.

m)    Review the data centre operations manual.

4.     Operations.

a)     If there are processing problems, what tends to be the most common cause, 1) change control, 2) job scheduling, 3) run sheets, 4) other?

b)     Is the operator’s manual up to date?

        i)     Manage and schedule batch jobs.

        ii)    Batch job dependencies.

        iii)   Printer definitions and queue management.

c)     Is an operator (hand over) diary in use?

d)     Who manages data restoration services?

e)     Who creates the application production scheduling calendar?

f)     When was the last disaster recovery plan walkthrough?
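Many of the questions above come down to "when was this last tested, and is that recent enough?". The following is a minimal sketch of how that could be tracked. The `overdue_items` function, the item names, the intervals and the dates are all hypothetical assumptions for illustration; the right intervals depend on your own policies and equipment.

```python
from datetime import date, timedelta


def overdue_items(last_tested, max_age_days, today):
    """Return the sorted names of items never tested, or last tested
    more than the allowed number of days ago."""
    overdue = []
    for name, limit in max_age_days.items():
        tested = last_tested.get(name)
        if tested is None or (today - tested) > timedelta(days=limit):
            overdue.append(name)
    return sorted(overdue)


# Hypothetical maximum intervals between tests, in days.
max_age_days = {
    "generator test": 30,
    "UPS test": 90,
    "fire suppression": 180,
    "physical penetration test": 365,
}

# Hypothetical record of the most recent tests (no penetration test on file).
last_tested = {
    "generator test": date(2024, 1, 10),
    "UPS test": date(2024, 2, 1),
    "fire suppression": date(2023, 6, 1),
}

print(overdue_items(last_tested, max_age_days, today=date(2024, 3, 1)))
# prints ['fire suppression', 'generator test', 'physical penetration test']
```

An item with no recorded test date is treated as overdue, so gaps in the records show up instead of being silently ignored.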

Sample Task list

  1. Determine what risks are evident or may exist in the data centre physical and logical environments.

  2. Arrange an assessment of physical security.

  3. Confirm that support facilities are being regularly tested (back-up communications, generators, water supply, fire suppression).

  4. Determine further works required and scope out.

  5. Break down the scope of works to task level, ready for loading into the change management project schedule.

