Cloud and disaster recovery: Load-balanced data centers are not a perfect solution

A growing trend in disaster recovery for cloud providers is the use of load-balanced data centers instead of hot-cold pairs. Companies are deploying private clouds that are load balanced between their data centers to meet disaster recovery needs. If one data center suffers a disaster, the other continues to operate, albeit at reduced capacity.

But there are still challenges. Tracking the various infrastructure configurations of an application is tricky. Each application requires server names, open IP addresses, DNS mappings, physical and virtual server definitions, firewall rules, SAN and NAS configurations, load balancer rules, and database cluster definitions.
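As a rough illustration, all of that per-application, per-environment metadata could be captured in a single record. The following sketch is purely hypothetical; the field names and structure are assumptions, not any particular product's schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical record of one application's infrastructure configuration
# in a single environment (development, test, or production).
@dataclass
class AppInfraConfig:
    application: str
    environment: str                                             # "development" | "test" | "production"
    data_center: str
    server_names: List[str] = field(default_factory=list)
    ip_addresses: Dict[str, str] = field(default_factory=dict)   # server name -> IP address
    dns_mappings: Dict[str, str] = field(default_factory=dict)   # hostname -> DNS record
    firewall_rules: List[str] = field(default_factory=list)
    san_nas_config: Dict[str, str] = field(default_factory=dict)
    load_balancer_rules: List[str] = field(default_factory=list)
    database_clusters: List[str] = field(default_factory=list)
```

Having one structure like this per application and environment is what makes centralization (and the cloning and versioning discussed below) even conceivable.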

All of these elements exist for an application in each environment, such as development, test, and production. Many of these configurations are maintained through multiple web-based applications. Those maintenance applications are not integrated, so the configuration metadata is not centralized. Worse yet, urgent administrative changes are often made to products, such as a SAN subsystem, at implementation time and never captured in the change management system. Hence the metadata is often out of date as well.

It would be great to have a tool that clones the configuration in one data center to the data center it is load balanced with. The cloned configuration would need unique server names and new IP addresses, modeling the symmetry of the application in the other data center while still providing the necessary infrastructure if the first data center fails. But creating such a tool or wizard would be difficult given all of the valid permutations of products that could be configured.
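To make the idea concrete, here is a minimal sketch of what the cloning step might look like, reusing the hypothetical AppInfraConfig record above and assuming some pool of free IP addresses in the secondary data center. It only renames servers and reassigns addresses; the hard part the article points to (DNS, firewall, SAN/NAS, and load balancer permutations) is deliberately left as a comment:

```python
import copy
from typing import Iterator

def clone_to_secondary(config: AppInfraConfig,
                       target_dc: str,
                       free_ips: Iterator[str]) -> AppInfraConfig:
    """Sketch: clone an application's configuration into the load-balanced
    data center, giving each server a unique name and a new IP address."""
    clone = copy.deepcopy(config)
    clone.data_center = target_dc
    clone.server_names = [f"{name}-{target_dc}" for name in config.server_names]
    clone.ip_addresses = {
        f"{name}-{target_dc}": next(free_ips)   # pick an open address in the target DC
        for name in config.server_names
    }
    # DNS mappings, firewall rules, SAN/NAS configurations and load balancer
    # rules would all need equivalent treatment; the valid permutations of
    # products involved are what make a general-purpose wizard so difficult.
    return clone
```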

Centralization of infrastructure configuration metadata is critical.

Without centralizing these parameters and versioning them, the deployed application and its supporting infrastructure will drift in small ways over time. Small configuration changes can cause problems in both the primary and secondary load-balanced data centers. If the configuration data is not versioned, it may be very difficult to return a data center to a stable state when a change leads to immediate production errors.
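The rollback the paragraph calls for only works if every change is recorded as a version. Below is a toy sketch of a centralized, versioned store, assuming the configuration records are JSON-serializable; in practice a version control system or configuration management database would play this role:

```python
import json
from dataclasses import asdict
from typing import List

class ConfigStore:
    """Toy centralized, versioned store for configuration metadata."""

    def __init__(self) -> None:
        self._versions: List[str] = []

    def commit(self, config: AppInfraConfig) -> int:
        """Record a new version of the configuration and return its number."""
        self._versions.append(json.dumps(asdict(config), sort_keys=True))
        return len(self._versions) - 1

    def rollback(self, version: int) -> dict:
        """Return the configuration exactly as it was at a trusted earlier version."""
        return json.loads(self._versions[version])
```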

This also points to the need to certify critical elements of the architecture. Companies should have a policy stating that only tested product configurations, such as approved combinations of virtual machine versions, kernels, and operating systems, can be deployed within the data center, and that only specific versions of firewall hardware can be deployed across the data centers. Another danger is a lack of options, such as single-sourced software or hardware, for various infrastructure components: a common hardware flaw or software bug could lead to a dramatic failure in multiple data centers.
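Such a certification policy can be enforced with a simple allowlist check at deployment time. The component names and version strings below are invented for illustration only:

```python
# Hypothetical allowlist of certified component versions; anything not
# listed here should be rejected before it reaches a data center.
CERTIFIED = {
    "hypervisor": {"6.7u3", "7.0u2"},
    "firewall_firmware": {"9.1.4"},
    "os_image": {"rhel-8.6", "ubuntu-22.04"},
}

def check_certified(component: str, version: str) -> bool:
    """Return True only if this component/version pair has been tested."""
    return version in CERTIFIED.get(component, set())

# Example: an untested firewall firmware build is blocked, a tested one passes.
assert check_certified("firewall_firmware", "9.1.4")
assert not check_certified("firewall_firmware", "9.2.0-beta")
```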

In conclusion, corporations are addressing disaster recovery concerns by deploying applications in load-balanced architectures. But this does not protect against human error, particularly configuration errors. Corporations may turn to certified components, such as specific virtual machines or load balancers, to avoid some of the disasters caused by untested configurations or unversioned configuration metadata. Configuration metadata needs to be stored centrally and versioned so that the application can fall back to a trusted configuration if errors occur.

For more information and a personalized IT Solutions business offer, please contact us.

Source: www.computerworld.com