Disaster Recovery Exercise Standards
Standardized DR exercise procedure for OHEMR Epic healthcare infrastructure, ensuring compliance, resilience, and clinical uptime.
Overview
This DR exercise standard documents the validated, repeatable steps for simulating an Epic failover between Azure regions (WUS3 to CUS) and for validating both infrastructure and application recovery. Running the exercise produces the evidence needed for Epic, HIPAA, and SOX compliance reviews and prepares teams for real-world disaster scenarios.
Benefits
- Epic Certification Compliance: Demonstrates Epic-required DR test execution and documentation
- HIPAA/SOX Readiness: Provides full audit trail and evidence of PHI protection during DR
- Clinical Uptime: Validates minimal downtime and reliable Epic failover/restore paths
- Operational Efficiency: Ensures roles, responsibilities, and handoffs are clear
DR Exercise Timeline & Task Matrix
Standard DR Exercise Phases
| Phase | Purpose | Healthcare Requirement |
|---|---|---|
| Pre-Req | Data sync and readiness | PHI protection, Epic validation |
| Failover ODB to CUS | Application failover | Clinical continuity, Epic compliance |
| Failover WSS to CUS | Citrix/infra cutover | User access, downtime minimization |
| Validations | Service and system checks | Clinical function, audit evidence |
| Service Restore | Full service recovery | Business continuity, compliance |
Epic-Specific DR Patterns
Sample DR Timeline Table
| Tag | Task | Estimated Duration (min) | Team | Resource |
|---|---|---|---|---|
| Pre-Req | Flatfile: sync directory files between WUS3 and CUS regions | | Epic Infra (ODBA) | Chris L / Laura / |
| | RedAlert checkdrreadiness | | | |
| Failover ODB to CUS | Send message to users to log out of Epic | 1 | UHG Citrix | Jason / Bharat |
| Failover ODB to CUS | Turn BCA Web Edit Mode to On | | ?? | |
| Failover ODB to CUS | Kick users out of WUS3 Epic PRD instance | 1 | UHG Citrix | Jason / Bharat |
| | ODB Failover to CUS | | | |
| Failover WSS to CUS | Perform Citrix failover to CUS | 1 | Citrix Team | Jason / Bharat |
| Failover ODB to CUS | Turn off Epic interfaces in Epic and Rhapsody | | Interface | Linda Wilson (UHG) |
| Failover ODB to CUS | Allow interface messages to process; notify group once queue is at zero | | Interface | Linda Wilson (UHG) |
| Failover ODB to CUS | Bring WUS3 to runlevel D, perform ODB failover | 20 | Epic Infra (ODBA) | Mason / Laura |
| Failover ODB to CUS | Webblob: ANF cutover | 5 | NAS Team | Liviu / Neha |
| Failover ODB to CUS | DNS cutover (Step 19 in ODB run book, A1 in DNS tab) | 45 | DNS Team | Ken Cox |
| Failover ODB to CUS | Promote CUS ODB as primary, demote WUS3 ODB to secondary | 20 | Epic Infra (ODBA) | Chris L / Laura |
| Failover WSS to CUS | HSW (Internal) failover | 20 | DNS Team | Ken Cox |
| | HSW (External) failover | | | |
| | DNS cutover, Bundle #2 | | | |
| Failover WSS to CUS | Validate DNS failover | 10 | Epic Infra ECSA | Matt / Jerry Bennet / M |
| Failover WSS to CUS | Validate internal / external zones | 10 | Epic Infra ECSA | Matt / Jerry Bennet / M |
| Failover WSS to CUS | Epic Infra ECSA web servers: (1) in Kuiper, change WUS3 (PRD) servers to "Transitioning out of service" and CUS (DR) servers to "In service"; (2) recycle application pools on CUS (DR) servers | | Epic Infra ECSA | Matt / Jerry Bennet / M |
| Failover WSS to CUS | Cogito changes to monitor/modify ETLs as needed | | Epic Cogito | Angelea / Maria / Nick |
| | Pre-Service Restore Validations | | | |
| Validations | Integration Team turns on interfaces in appropriate order | 15 | Interface Team | |
| Validations | Confirm: interfaces up and messages processing | 5 | Interface Team | |
| Validations | Confirm: can log in to Hyperspace PRD | 5 | ECSA | |
| Validations | Confirm: can open Chart Review | 5 | ECSA | |
| Service Restore | GC Analyst files back all transactions in BCA | 30 | ?? | |
| | End Downtime | | | |
| Service Restore | Change PRD to access level "all" | 5 | Epic ODBA | |
| Service Restore | Call service management hotline number | 1 | Leader | |
| Service Restore | Leadership communication #5 sent (final) | 1 | Leader | |
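A timeline like this can be sanity-checked before the exercise by loading its rows into a small script that totals the estimates per phase and lists tasks still missing a duration or an owner. The following is a minimal sketch, not part of the standard: the `Task` structure and the `summarize` helper are hypothetical names, bare durations are assumed to be minutes, and only a handful of rows are transcribed.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    phase: str
    name: str
    minutes: Optional[int]  # None when the timeline has no estimate
    team: str               # "" or "??" when no owner is assigned


def summarize(tasks):
    """Return (minutes per phase, names of tasks missing a duration or owner)."""
    totals = defaultdict(int)
    gaps = []
    for t in tasks:
        if t.minutes is not None:
            totals[t.phase] += t.minutes
        if t.minutes is None or t.team in ("", "??"):
            gaps.append(t.name)
    return dict(totals), gaps


# A few rows transcribed from the sample timeline (durations assumed to be minutes)
tasks = [
    Task("Failover ODB to CUS", "Send message to users to log out of Epic", 1, "UHG Citrix"),
    Task("Failover ODB to CUS", "Turn BCA Web Edit Mode to On", None, "??"),
    Task("Failover ODB to CUS", "Bring WUS3 to runlevel D, perform ODB failover", 20, "Epic Infra (ODBA)"),
    Task("Failover ODB to CUS", "DNS cutover", 45, "DNS Team"),
    Task("Service Restore", "GC Analyst files back all transactions in BCA", 30, "??"),
]

totals, gaps = summarize(tasks)
for phase, minutes in totals.items():
    print(f"{phase}: {minutes} min estimated")
for name in gaps:
    print(f"Needs duration or owner: {name}")
```

The per-phase sum assumes tasks run one after another; where teams work in parallel, the real elapsed time would be shorter.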
Standard DR Validation Requirements
Pre-Exercise Validation
Before starting DR failover, verify:
- Epic Requirements: All data and systems are in sync, ready for cutover
- Network Security: Citrix, DNS, and NAS changes follow approved playbooks
- Communication: All teams notified and ready for handoff/validation per timeline
- Documentation: All tasks, owners, and durations tracked for audit
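The four pre-exercise items can be treated as a go/no-go gate in which every item must pass and must carry a piece of evidence for the audit trail. The sketch below illustrates that idea only; `ReadinessCheck`, `go_no_go`, and the evidence identifiers are hypothetical and do not refer to any existing tool or ticket.

```python
from dataclasses import dataclass


@dataclass
class ReadinessCheck:
    area: str           # e.g. "Epic Requirements"
    question: str       # what must be true before failover starts
    passed: bool        # recorded by the owning team
    evidence: str = ""  # ticket, screenshot, or log reference (placeholder values below)


def go_no_go(checks):
    """Return ("GO" or "NO-GO", blocking areas). A check blocks if it failed or has no evidence."""
    blockers = [c.area for c in checks if not c.passed or not c.evidence]
    return ("GO" if not blockers else "NO-GO"), blockers


checks = [
    ReadinessCheck("Epic Requirements", "Data and systems in sync, ready for cutover", True, "example-evidence-001"),
    ReadinessCheck("Network Security", "Citrix, DNS, and NAS changes follow approved playbooks", True, "example-evidence-002"),
    ReadinessCheck("Communication", "All teams notified and ready per timeline", False, "example-evidence-003"),
    ReadinessCheck("Documentation", "Tasks, owners, and durations tracked", True),  # no evidence attached yet
]

decision, blockers = go_no_go(checks)
print(decision, blockers)
```

Requiring evidence alongside each pass keeps the gate aligned with the audit requirement rather than relying on verbal confirmation.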
Validation Checks
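    # Example PowerShell validation: confirm Epic systems and interfaces are up post-failover.
    # Get-EpicSystemStatus is assumed to be a site-specific helper; it is not a built-in cmdlet.
    $EpicStatus = Get-EpicSystemStatus -Region "CUS"
    if ($EpicStatus -ne "Healthy") {
        Write-Error "Epic DR system status is NOT healthy. Investigate immediately."
    }

    # Confirm Citrix access: the gateway must accept TCP connections on port 443
    $Citrix = Test-NetConnection -ComputerName "cus-citrix-gateway" -Port 443
    if (-not $Citrix.TcpTestSucceeded) {
        Write-Error "Citrix gateway is not reachable on port 443."
    }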
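Because validation results also serve as audit evidence, it can help to record each connectivity check with a timestamp. The following is a rough Python sketch of the same TCP reachability test; `check_tcp` and `to_evidence_line` are hypothetical helper names, and the record layout is an assumption rather than a prescribed evidence format.

```python
import json
import socket
from datetime import datetime, timezone


def check_tcp(host, port, timeout=3.0):
    """Attempt a TCP connection and return a timestamped result record."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:  # covers refused connections, timeouts, and name-resolution failures
        reachable = False
    return {
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "target": f"{host}:{port}",
        "reachable": reachable,
    }


def to_evidence_line(record):
    """Serialize one record as a single JSON line, suitable for appending to a log file."""
    return json.dumps(record, sort_keys=True)
```

A call such as `to_evidence_line(check_tcp("cus-citrix-gateway", 443))` would yield one log line per check; the host name is the placeholder already used in the PowerShell example.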
Troubleshooting Guide
Common DR Issues
Problem: Users cannot access Epic after failover
Diagnosis: Citrix or DNS issue
Resolution:
- Validate Citrix failover completion and gateway update
- Confirm DNS cutover and propagation
- Communicate to Service Desk for user notifications
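Confirming the DNS cutover usually means checking that a name now resolves only to addresses in the DR region. The sketch below shows one way to express that check with the Python standard library; the function names are hypothetical, and the expected addresses must come from the DNS team's cutover plan. It reflects only what the local resolver returns, so cached records on other clients may still differ until their TTLs expire.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the resolver currently gives for hostname."""
    infos = resolver(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return {info[4][0] for info in infos}


def dns_cutover_complete(hostname, expected_ips, resolver=socket.getaddrinfo):
    """True when the name resolves and every returned address is an expected DR address."""
    try:
        current = resolved_addresses(hostname, resolver)
    except socket.gaierror:
        return False  # name does not resolve at all
    return bool(current) and current <= set(expected_ips)
```

The `resolver` parameter exists so the check can be exercised with a stub before the exercise, without touching production DNS.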
Problem: Interfaces not processing messages
Diagnosis: Interface queue is stalled or an interface system was not restarted
Resolution:
- Check that the Interface Team brought interfaces up in the appropriate order
- Validate that the interface message queue drains to zero
- Confirm Epic interface status in application
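Waiting for an interface queue to reach zero, whether during troubleshooting or before failover, amounts to polling a queue depth until it drains or a time limit expires. The following is a minimal sketch of that loop; `wait_for_drain` is a hypothetical helper, and `get_depth` stands in for whatever the Interface Team uses to read the queue depth, since no specific monitoring API is assumed here.

```python
import time


def wait_for_drain(get_depth, timeout_s=600, interval_s=10,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_depth() until it returns 0. Return True if drained, False on timeout."""
    deadline = clock() + timeout_s
    while True:
        if get_depth() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage with a stand-in depth source that drains over three polls
depths = iter([12, 4, 0])
drained = wait_for_drain(lambda: next(depths), timeout_s=5, interval_s=0)
print("Queue drained" if drained else "Queue did not drain in time")
```

A `False` result is the signal to escalate rather than proceed, since failing over with messages still queued risks losing or reordering them.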
Problem: Performance issues post-failover
Diagnosis: NAS or database not fully cut over
Resolution:
- Validate ANF cutover (Webblob)
- Check SQL/ODB failover completion
- Monitor Epic application logs for errors
Related Documentation
- VM Deployment Standards: Virtual machine deployment procedures
- Azure NetApp Files SnapMirror Migration: NAS migration and cutover playbook
- Epic Architecture Requirements: Epic-specific infra standards
- Security Baseline: Healthcare security controls
Support & Contacts
Emergency Contacts
- Epic Production Deployment Issues: [email protected]
- Terraform/IaC Support: [email protected]
- Security/Compliance Questions: [email protected]
Deployment Excellence: Standardized Terraform-based VM deployment ensures reliable, compliant healthcare infrastructure supporting Epic clinical systems.