COVID-19 webinar: Minimizing critical facility risk (English) – Q&A

Rhonda Ascierto / March 23, 2020

UII Note 51 • Q1 2020

On March 18, 2020, Uptime Institute hosted the webinar COVID-19: Minimizing critical facility risk (recording available here). Much of the content was based on the Uptime Institute report COVID-19: Minimizing critical facility risk (full report download available here).

Below you’ll find the questions (Q) attendees submitted, organized by topic and answered (A) by Uptime Institute experts.

CONSTRUCTION

Q: What is your recommendation regarding active data center construction projects, given the number of personnel they involve? What precautions should be taken?

A: For those organizations involved in data center construction, major upgrades or extensions of capacity, the pandemic presents challenges. Construction speed has a big impact on cost, and delays in one area can impact many other areas and other suppliers. In this case, however, delays may be advisable, and the following actions may be appropriate:

  • Suspend all nonessential projects when possible.
  • If the project must continue, coordinate with contractors to ensure all subcontractors/vendors are applying appropriate safeguards.
  • If possible, create a separate, secure entrance for all parties involved in the project and establish isolation of the project personnel from the operations personnel. Operations team members who are assigned to project oversight or supervision should be dedicated to those duties and not allowed to interact with the duty operations personnel.

Q: Can you recommend any resources related to the precautions that should be taken during construction projects, given the number of personnel on-site?

A: The US Occupational Safety and Health Administration has guidance on how to protect workers in various professions.

MANAGEMENT AND OPERATIONS

Q: How is Uptime Institute revising or changing its operations standards to address the operational changes needed to address the pandemic?

A: Uptime Institute will review what may need to change within the standard in light of current events. For example, behaviors for disaster recovery plans may need to be enhanced or revised to better prepare critical facilities for situations like the COVID-19 pandemic. Watch for further guidance from Uptime on this topic.

Q: What are some possible unintended consequences to operations due to the measures commonly being taken to avoid spread of the virus among staff?

A: Companies are taking a variety of actions to avoid spreading the virus, including reducing shift numbers, limiting interaction between shift teams, or covering their critical facilities remotely. Any options that deviate from normal processes can increase the risk of human error or extend response times. The key is to keep communication open between all groups, so they are in alignment with their support of the critical facility.

Another possible unintended consequence of reduced shifts is overworked staff, which can result in human error. Take steps to alleviate worker fatigue. Additionally, with fewer staff on-site, there are fewer resources available to respond to an emergency. This can be mitigated by maintaining a robust on-call schedule and ensuring that emergency operating procedures are thorough and complete. Where possible, arrange for external expert cover as a contingency.

Q: What are your recommended procedures for maintenance under this situation?

A: If activities such as operations and maintenance are outsourced, collaborate with partners to set and align policies.

Plan for essential maintenance visits. Governments or companies may relax rules, or provide exemptions, for the maintenance of essential equipment. Most of the “lockdowns” currently in place make exception for people going to work; however, other authorities having jurisdiction may apply stricter controls on travel within their areas of control. Operators must plan for how to manage this in advance and obtain the necessary permissions where required. Permissions may depend on the applications/services being run in the data center. Most governments maintain a list of “critical industries,” but some data center services may fall into a gray area — it is important to clarify this in advance.

Review maintenance plans and prioritize: Determine which tasks and issues can be downgraded/responded to last or not at all if operating on a skeleton staff crew.

Consider the consequences of deferred maintenance, as it may increase risk of component or system failure. As always, have a plan in place to respond to any major problem, coordinating with vendors as necessary, to ensure issues can be addressed.

  • If equipment failure cannot be addressed in a timely manner, ensure procedures are in place for safe shutdown/isolation of the equipment and that digital infrastructure is sufficiently resilient to absorb the loss of the failed equipment (at least until workload can be transferred).
  • As time passes and restrictions remain in place, revisit deferred tasks and determine whether continued delay increases risks beyond reasonable tolerances.

Postpone all nonessential maintenance (e.g., infrared scanning and quarterly electrical power management system visits) and major projects where possible.

If nonessential, reschedule high-risk testing (e.g., black start/plug-pull tests, generator load bank tests) for after pandemic risks have subsided.

In mixed-use facilities (such as a server room in a mixed-use building), requirements such as maintenance (and access) for critical staff, and for critical facility exceptions to the general building rules, should be clearly identified to establish exception policies where appropriate.

Q: As part of our plan to reduce work in this situation, is it appropriate to hold preventive maintenance activities?

A: As stated separately, prioritize the maintenance activities that are part of your preventive maintenance program. Determine which activities must be performed and which can be delayed. Review and re-prioritize monthly to ensure those items put on hold or delayed are not postponed indefinitely. Watch for further guidance from Uptime on this topic.
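The monthly review described above can be tracked with a simple list of deferred tasks. The following is a minimal sketch, not an Uptime Institute tool: the `DeferredTask` record, the 30-day interval standing in for "monthly," and the example dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "monthly" is approximated as 30 days; adjust to local policy.
REVIEW_INTERVAL = timedelta(days=30)


@dataclass
class DeferredTask:
    """Hypothetical record for a maintenance task that has been put on hold."""
    name: str
    deferred_on: date
    last_reviewed: date


def tasks_due_for_review(tasks, today):
    """Return deferred tasks whose last review is at least one interval old."""
    return [t for t in tasks if today - t.last_reviewed >= REVIEW_INTERVAL]


# Illustrative entries only; the dates are made up.
tasks = [
    DeferredTask("Infrared scanning", date(2020, 3, 20), date(2020, 3, 20)),
    DeferredTask("Generator load bank test", date(2020, 3, 20), date(2020, 4, 15)),
]

due = tasks_due_for_review(tasks, today=date(2020, 4, 22))
for task in due:
    print(f"Re-prioritize: {task.name} (deferred since {task.deferred_on})")
```

Here only the first task is flagged, because it has gone more than 30 days without a review. The point of the sketch is that every deferred item carries a review date, so nothing is postponed indefinitely by default.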

Q: Rather than rescheduling all high-risk testing, shouldn’t the backup power generator and uninterruptible power systems be tested, to prepare for the worst?

A: Even with the measures being discussed to mitigate risk of spreading COVID-19, infrastructure readiness must still be considered a high priority for mission-critical facilities. When prioritizing maintenance and testing activities, the following should be considered:

  • The resources required to complete the activity.
  • The risk of the activity.
  • The risk of NOT doing the test or maintenance activity.
  • The risk to the facility if the activity causes an emergency (i.e., extra resources may be required on-site to address the emergency, which may involve individuals who may not have been screened).
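One way to apply the four considerations above consistently is to score each activity and rank the results. The sketch below is illustrative only: the 1-to-5 scales, the weighting formula and the example ratings are assumptions, not values published by Uptime Institute, and any real scoring should be set by the facility's own operations team.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Hypothetical record for one maintenance or testing activity."""
    name: str
    staff_needed: int    # resources required to complete the activity
    activity_risk: int   # 1 (low) to 5 (high): risk of performing the activity
    deferral_risk: int   # 1 (low) to 5 (high): risk of NOT performing it
    emergency_risk: int  # 1 (low) to 5 (high): risk if it triggers an emergency


def priority_score(a: Activity) -> int:
    """Higher score means a stronger case for performing the activity now.

    Assumed formula: weight the risk of deferral, then subtract the exposure
    created by doing the work (its own risk, emergency risk and headcount).
    """
    exposure = a.activity_risk + a.emergency_risk + a.staff_needed
    return 3 * a.deferral_risk - exposure


# Placeholder ratings for illustration; they are not assessments of these tasks.
activities = [
    Activity("Black start test", staff_needed=4, activity_risk=5,
             deferral_risk=3, emergency_risk=5),
    Activity("Generator test run", staff_needed=2, activity_risk=2,
             deferral_risk=5, emergency_risk=3),
    Activity("Infrared scanning", staff_needed=1, activity_risk=1,
             deferral_risk=1, emergency_risk=1),
]

ranked = sorted(activities, key=priority_score, reverse=True)
for a in ranked:
    print(f"{priority_score(a):>3}  {a.name}")
```

With these placeholder ratings, the activity with the highest deferral risk and modest exposure ranks first, while the high-exposure test ranks last. A score like this supports the discussion; it does not replace engineering judgment.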

RESILIENCY

Q: What would be the implications for our fuel backup supply (12 hours) under a lockdown scenario, and what are your recommendations for fuel storage under such a scenario?

A: Data center operators should ensure they have a priority delivery contract with fuel vendors. Operators without an existing agreement may not be able to negotiate one under current conditions, but an attempt should be made.

At the very least, discuss this eventuality with fuel suppliers: Ask what their plans are for being available if shelter-in-place orders are given in the area. This will at least set a baseline for the capability you would have to get additional fuel in a worst-case scenario.

In extreme situations — for example, regions at high risk of shelter-in-place orders, areas susceptible to regular power outages, or situations in which the data center operator does not have a priority delivery contract in place — data center operators could consider having temporary fuel tanks or tanker trucks parked at the site. Remember that hospitals will get the first fuel deliveries in health emergencies.
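To size a temporary tank or to frame the conversation with a fuel supplier, it can help to work out how much additional fuel a target runtime would require. The arithmetic below is a minimal sketch: the burn rate, tank volume and 72-hour target are hypothetical figures chosen so that on-site storage matches the 12 hours mentioned in the question. Substitute measured values for the actual site.

```python
def runtime_hours(usable_fuel_liters: float, burn_rate_lph: float) -> float:
    """Hours of generator runtime for a given usable fuel volume."""
    if burn_rate_lph <= 0:
        raise ValueError("burn rate must be positive")
    return usable_fuel_liters / burn_rate_lph


def extra_fuel_needed(target_hours: float, on_site_hours: float,
                      burn_rate_lph: float) -> float:
    """Liters of additional fuel required to reach the target runtime."""
    return max(0.0, (target_hours - on_site_hours) * burn_rate_lph)


# Hypothetical figures for illustration only.
burn_rate = 400.0     # liters per hour at the expected load (assumed)
usable_fuel = 4800.0  # liters of usable fuel in on-site tanks (assumed)
target = 72.0         # desired hours of autonomy under a lockdown (assumed)

on_site = runtime_hours(usable_fuel, burn_rate)
extra = extra_fuel_needed(target, on_site, burn_rate)

print(f"On-site autonomy: {on_site:.1f} h")
print(f"Additional fuel for {target:.0f} h: {extra:.0f} L")
```

Under these assumptions, the site holds 12 hours of fuel and would need a further 24,000 liters to reach 72 hours. Burn rate varies with load, so the calculation should be repeated for the load the facility expects to carry during an extended outage.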

Q: COVID-19 will cease to be an urgent concern at some time in the future. Climate change, though, is truly endemic and may have played a role in the initiation and transmission of this virus. After this crisis passes, I’m curious if anyone is proactively tracking data center operations against accepted RCPs (relative concentration pathways or representative concentration pathways) for future operational efficiency and generalized asset risk (e.g., adequate energy sourcing, facility resiliency, etc.) We see this as coming rapidly on the heels of the wind-down of the COVID-19 pandemic.

A: To date, Uptime has not observed clients tracking data center operations against RCPs, although our report on climate change resiliency provides advice on preparation. However, energy sourcing is becoming an increasingly important factor in site selection, alongside other factors such as distance from users and network availability. In general, while RCPs are not unimportant, Uptime clients currently place a higher value on other considerations. Time will tell if COVID-19 and its aftermath change that weighting.

SANITIZATION

Q: How can operators identify appropriate specialist firms under these conditions? They can’t possibly be ready to scale up like this.

A: Here, we assume the participant is referring to specialty cleaning firms. Contact service providers directly and inquire about their experience with this type of deep cleaning — ask about past and current clients, what methods are used, etc. Apply the same diligence used in vetting any critical contractor. While it is true that it will be difficult to secure new suppliers in the short term, cleaning companies will be attracted by the opportunity to work with large, stable clients and will likely help as much as they can. Many will be scaling up to meet demand.

Q: Is there a chemical agent or compound that allows disinfection of the data center in case it becomes contaminated with the coronavirus that causes COVID-19?

A: Uptime does not have the subject matter expertise to make specific recommendations for products or systems to sanitize data centers that have been contaminated. Data center operators should consider the following:

  • Directly contact the cleaning company contracted for deep cleaning and investigate their capabilities for decontamination and sterilization.
  • Consult relevant websites (see the Appendix in our report COVID-19: Minimizing critical facility risk) and determine if the chemicals and systems the cleaners propose to use are consistent with the recommendations of public health experts.
  • Contact equipment manufacturers and discuss the materials and procedures to determine potential impacts on specific equipment.

Q: Are there approved cleaning products that will remove the virus but not damage the equipment?

A: Consult equipment manufacturers to discuss the materials and procedures to determine potential impacts on specific equipment. There are many products designed to clean and disinfect electrical equipment.

Q: Using gloves isn’t a good practice, because the virus can live on their surface. Isn’t it better to wash your hands?

A: References to “gloves” in our report COVID-19: Minimizing critical facility risk are intended to mean single-use, disposable gloves that are discarded in a waste receptacle immediately after use. Work gloves and electrical personal protective equipment gloves are not suitable for sanitization procedures.

Q: Can you address sanitizing procedures after a potential or known exposure in the data center?

A: Uptime does not have the subject matter expertise to make specific recommendations for sanitization procedures in a data center potentially contaminated with the virus. Data center operators should consider the following:

  1. Directly contact the cleaning company contracted for deep cleaning and investigate their capabilities for decontamination and sterilization.
  2. Consult relevant websites (see the Appendix in our report COVID-19: Minimizing critical facility risk) and determine if the chemicals and systems the cleaners propose to use are consistent with the recommendations of public health experts.
  3. Contact equipment manufacturers and discuss the materials and procedures to determine potential impacts on specific equipment.
  4. Contact local authorities or other in-country advisory bodies.

Q: Will air scrubbers in the data center help prevent the spread of the virus?

A: This will depend on the type of scrubber and the minimum efficiency reporting value of the filters in the scrubbers. Watch for further guidance from Uptime Institute on this topic.

Q: At this time, do you know of any data centers that have been contaminated with the coronavirus that causes COVID-19? If so, what was the main problem they encountered in addressing the situation?

A: Uptime is not presently aware of any data centers that have been contaminated. If a data center does become contaminated, the main problems will likely pertain to staffing/isolation and physically decontaminating the facility.

STAFFING MANAGEMENT

Q: In what ways can/should businesses maintain employee morale (mental health) during these difficult times?

A: Several online resource articles are available.

Consider implementing a “buddy” scheme so that colleagues reach out and engage in daily communication with others. Many people who are now working from home in various industries are moving to online social media platforms and are creating crowd-sourced documents (see, for example, the Coronavirus tech handbook). In addition, many organizations — sports teams, educational establishments, zoos, opera houses, musicians and more — are creating online events and remote access to stay engaged with others.

It is important to keep a level head. Staying informed about the latest developments helps with preparations, but some people’s mental health will be affected by following the news closely. Encourage those individuals to limit how often they check the news, and ensure that all team members feel comfortable calling for help when needed.

Q: How should a critical staff member react to a case of possible contamination? For example, should they not get directly involved and instead use specialized personnel to reduce exposure to critical staff?

A: The reaction should be immediate: Isolate any affected staff member, contact management/Human Resources, and follow all corporate and government guidelines related to quarantine and reporting. Implement the top level of the company’s pandemic preparedness plan. Any staff member having close contact with a confirmed COVID-19 case should be advised to self-quarantine for the appropriate period, usually 14 days.

For a confirmed COVID-19 case at the site: Contact the cleaning contractor. Cleaning personnel should use single-use bio-hazard suits, gloves, shoe coverings, etc. All materials should be bagged and removed from the site once cleaning is complete.

Q: Data center staff are skilled technicians. Have any data centers reached out to temp agencies for staffing? What other options are available in the event proprietary staff get sick?

A: Uptime is not aware of any data centers contacting temp agencies (outside of their existing contractors) or using staff to perform technical tasks that they are not trained for. Should the use of staff unfamiliar with the facility be necessary, the skilled and experienced technicians who staff the data center should review all standard operating procedures, methods of procedure and emergency operating procedures to determine if those instructions are clear and specific enough to be followed by personnel who are less technically trained or are less familiar with the specific data center’s equipment and operations. Procedures written by experienced staff tend to have “assumed knowledge” that a temp or nontechnical staff member might not possess.

COSTS

Q: How do I prepare for the financial impact of many of the practical suggestions I am hearing?

A: Here, we assume the participant is referring to impact on data center operations budgets.

Clearly, executive management will need to assess the situation and prepare accordingly. Considerable government support is available in many countries.

As is the case with other abnormal events (e.g., equipment failure or a severe weather event), management typically takes the reasonable approach of instructing the operations team to spend what is necessary to protect staff and the data center infrastructure while keeping track of the costs. The justifications will be examined as part of an ongoing review process.

If management is asking for justification prior to expenditure, ensure the case for all proposed spending is detailed, highlighting the risks to operations and personnel if the expenditure is not authorized.

COMPLIANCE

Q: In our Singapore data center, we are taking temperatures as a precautionary measure, as mandated by the Ministry of Health. In areas in which the government has not mandated such measures, how are data centers responding, and what are the legal implications?

A: Uptime cannot comment on legal consequences as these will vary from location to location. In areas where there is no regulatory mandate, data centers should decide (in consultation with insurance companies, legal advisors, Human Resources departments and other management) at which response level to institute such measures.

If temperatures are taken, follow the guidelines issued by public health authorities regarding the safest procedures, to minimize the risk of infection.

Q: How can data centers ensure compliance with service level agreements (SLAs), at all levels?

A: For the SLAs between service providers/vendors and data center owners, this will depend on the terms of the SLA and the strength of the relationship between the data center operations team and the supplier.

There are likely to be situations where SLA breaches occur (for example, on levels of redundancy or on-site staffing). Financial penalties and problems with customers may be avoided if all parties consult in advance.

Q: If a staff member had COVID-19, recovered from it, and spent 10 days in home confinement after the symptoms subsided, can they come back to work?

A: Guidance from health authorities is evolving in this area and should be followed at all times.

Consider “recovered” staff both potentially infectious and at risk. There are reports indicating that people who have contracted the virus and recovered have only limited immunity and may become re-infected. Therefore, all the same rules and policies should apply to all staff: Until more data becomes available, consider staff who have had COVID-19 to be both as potentially infectious and as at risk as all other staff.

TIER CERTIFICATIONS

Q: Have you learned of any Tier-certified data center shutting down during the pandemic?

A: Uptime is not aware of any such cases at this time.

Q: During the pandemic, what will be the procedure for currently scheduled data center certifications? If inspections are delayed, will Uptime extend the design validity?

A: The COVID-19 pandemic is a dynamically evolving situation affecting different parts of the world in different ways. Therefore, currently scheduled on-site visits from Uptime Institute consultants are being evaluated on a case-by-case basis. Clients with an ongoing project should contact their assigned Project Manager to discuss scheduling.

Uptime Institute is still supporting remote work and certifications without encountering issues on our side.

Uptime Institute does have an established process for requesting and evaluating extension requests. Cases where extensions are sought due to delays caused by COVID-19 are likely to successfully receive an extension, but the formal process still needs to be followed, with the requested details provided. Clients who believe they have encountered COVID-19-related delays should contact the local business development representative to initiate an extension request.


Uptime Institute wants to ensure the data center industry is prepared and informed about countermeasures to the COVID-19 situation. Click here to access live support and on-demand emergency management resources. Follow the COVID-19 Intelligence Collection on Inside Track for frequent updates.