Early in September, New York Federal Reserve employees in Financial Markets and Operations began relocating to the bank's technology center in New Jersey.
The occasion of the mass movement is now all too well mourned. With airplanes having rained down from the sky just 24 hours before,
As it turned out, the Fed's downtown offices were soon reoccupied--and the banking system held steady. Even so, the World Trade Center attacks illustrated the critical importance of an effective business continuity plan. They pointed to the need for a systematic guide under which data, personnel, and process, as well as the ever-critical transactions, would be retained even should all hell break loose.
"Systems are only part of the equation," says Philip Jan Rothstein, a consultant based in Brookfield, Conn.
Greg Valdez, CIO of Veritas, a business continuity services and software provider in Mountain View, Calif., agrees. "A business can fail if the result is the loss of domain and business process knowledge. New York City is certainly experiencing that after Sept. 11," says Valdez.
Certainly, the collapse of the Twin Towers was an emergency of a fundamental sort. In addition to generating ambient anxiety, it prompted a major psychological shift in the business world.
"You'll still have many CEOs that remain unconvinced that the expense of failover equipment or other recovery technology is worth it," says Valdez. "But others are looking more closely at both our technology and their recovery plans," he explains. "Many CIOs are beginning to understand the kinds of revisions that need to be made."
Rothstein notes that many executives have been chastened by the experience of terrorism on U.S. soil, however indirect it may have been for any given company.
"In Europe, of course, terrorism has always been real and evident," he says. "After the hijackings, much of our cultural naivete was shattered," he explains. As the corporate world begins to feel its way around in more cautious times, revising disaster recovery plans will become important. And though many will ignore the problem and stay at risk, many more will begin to think about disruptions of the catastrophic sort.
"What the WTC situation illustrated was the need to consider personnel very carefully," says Greg Benton, director of strategic alliances with Agilera, an application management services provider in Englewood, Colo. "You can have all the redundant data in the world and be in significant trouble if you haven't planned for the death, defection, or illness of staff." Benton sees third-party providers taking a bigger role in an environment shaped by increasingly sophisticated technologies and limited personnel resources.
State of the union
On business disruptions, the statistics are telling. For every minute that passes during a bank system failure, about $250,000 is lost, according to Valdez. Meanwhile, 40% of businesses that experience a major disruption from a disaster never really bounce back.
Whether the cause is an act of God, an act of war, or a simple but devastating computer chip malfunction that leads to a single point of failure, the long-term effects can be huge. "Five years later, on average, these disaster-stricken companies are out of business," says Valdez.
He explains that continuity or contingency planning is designed to address the whole gamut of business disruption. Disaster recovery, as a subset of this discipline, is arguably the most crucial aspect.
By way of definition, Valdez offers this take: "'Failover' refers to saving a transaction by switching to another system should a component fail. 'Recovery' refers to recreating a transaction in the case when a system is destroyed."
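Valdez's distinction can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not a description of any vendor's product: the `Ledger` class, the `post_with_failover` and `recover` functions, and the account names are all hypothetical. Failover saves the in-flight transaction by switching to a standby system; recovery recreates transactions on a fresh system from a log.

```python
class SystemDown(Exception):
    """Raised when a (simulated) system cannot accept transactions."""


class Ledger:
    """Hypothetical in-memory system; `healthy` simulates a component failure."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.balances = {}

    def apply(self, account, amount):
        if not self.healthy:
            raise SystemDown(self.name)
        self.balances[account] = self.balances.get(account, 0) + amount


def post_with_failover(txn, primary, standby, log):
    """Failover: save the transaction by switching systems if the primary fails."""
    log.append(txn)  # record first, so the transaction can be replayed later
    try:
        primary.apply(*txn)
        return primary.name
    except SystemDown:
        standby.apply(*txn)
        return standby.name


def recover(log, name="rebuilt"):
    """Recovery: recreate every logged transaction on a fresh system."""
    fresh = Ledger(name)
    for txn in log:
        fresh.apply(*txn)
    return fresh


# Illustrative run with made-up transactions.
log = []
primary, standby = Ledger("primary"), Ledger("standby")
first = post_with_failover(("acct-1", 100), primary, standby, log)
primary.healthy = False  # simulate a component failure
second = post_with_failover(("acct-1", -30), primary, standby, log)
rebuilt = recover(log)   # simulate rebuilding after a system is destroyed
```

In this toy version the standby only sees transactions that arrive after the failure, so the rebuilt ledger, not the standby, holds the complete picture. A real installation would presumably keep the standby synchronized as well.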
Thirty percent of all CIOs polled in a recent CIO Insight survey indicated that they'd been forced to enact a disaster recovery plan at some point in the last five years. And yet only about a third of all U.S. businesses even have a plan.
Banks are a different matter. Regulations require them to have contingency plans on file and tested. Regulatory requirements notwithstanding, banks can still come up short, depending on the nature of the catastrophe. While the system held on Sept. 11, some institutions experienced significant disruptions. Few bankers would speak on the record on this subject, but various business continuity companies offered their take--and advice.
Get specific--and real
Most bank managements mean well, says consultant Rothstein, but don't give enough quality time to planning, relying instead on software boilerplates to sustain a business in time of need.
As vice-president of product management with Ingrian Software, Redwood City, Calif., a provider of security solutions that include methods to encrypt transactions in storage and transit, Rod Murchison sees a lot of data centers. In his view, banks are on top of managing the backup of transactional data. "They do fine with backup of the kinds of accounting records so critical to regulators and investors," he explains.
They fall off, though, in terms of saving copies of work accumulated on a day-to-day basis or work involving project detail. "Day-to-day process-related work is hard to back up--much of it isn't digital to begin with," says Murchison. "Think of backing up all your e-mails or Word files. How much time would that take? Is it worth it?"
Figuring out just what is worth it is the hard part of planning. But instead of seeing a blueprint that is the brainchild of thorough situational analysis, Rothstein says he is often faced with generalities. Software tools please auditors but don't help bankers face the facts or address the particularities of geography, industry, or client and employee needs.
On the subject of people and logistics, Rothstein once had a client who needed to determine where to locate a backup facility. With its primary offices in a Jersey City location close to public transportation, the client had to consider what would work for employees should they be forced to relocate suddenly. "Many of them are single parents. Many wouldn't have a way to travel. In that situation, it wouldn't really make sense to have a backup facility in central Pennsylvania," he adds.
"You can't have a backup facility half a country away from your primary facility and simply assume that the personnel will relocate there temporarily or that you'll find other people easily," agrees Ruth LaStina, vice-president of operations and engineering at Three Pillars, a digital security company based in Norcross, Ga.
"Most in bank management realize this, but planning can get so decentralized that sometimes less than optimal decisions get made." In contrast to Rothstein's view, LaStina believes banks have detailed enough plans, but fall down on testing and modification.
"You really need to think through a variety of personnel and systems issues," says LaStina. "You need to make sure that, for instance, when you reactivate key network functions that the applications which sit on the top layer aren't adversely affected."
Steve Wilson, president and CEO of Lebanon Citizens National Bank, Lebanon, Ohio, is a bit more upbeat on the subject of the industry's readiness. "We passed Y2K with flying colors and, after Sept. 11, the banking system remained operational, despite not being able to collect checks and other operational challenges," he says. "I think how the industry handled those two situations indicates that it's serious about disaster recovery." Wilson's own bank had to make use of a disaster recovery plan in 1989, when a catastrophic fire damaged headquarters.
It might have put a less well-prepared institution out of commission, but Lebanon Citizens "didn't skip a beat," says Wilson. Since the early '90s, when the bank took data processing back in-house, it has taken special care to test and retest methods and system backups. When Y2K planning was at its most intense, the focus shifted from coping with the physical to attempting to address every conceivable data processing-oriented disaster that might occur. "Senior management made planning a priority," says Wilson, "but we got input from all personnel. Basically, we asked them how they would do their jobs if the computers were out or if electricity was down and the generator had problems," he explains. "You're never going to get 100%. The idea is to think flexibly. Whatever comes up, you can give yourself a way to cope."
A big part of planning for data backup is the question of how to manage the use--and cost--of the backup facilities under ordinary business conditions.
"Most money center banks have multiple data centers that they run at less than full capacity, so if there is a system problem in one center, it goes into failover elsewhere," says Rothstein.
It may make sense to let backup facilities do double duty, but specific tools and strategies need to be in place to address how conversions are to be handled in times of crisis. This includes knowing what data takes priority in the case of a shutdown or failover. These are the sorts of particulars, says Rothstein, that a boilerplate can't address.
Hampering effective planning
Several factors can hamper the continuity planning process. These include a hazy reporting structure, with no top-down authority guiding the planning process, and inadequate budgeting for recovery.
Then there's a pesky people problem that gets in the way. "Basically, nobody wants to admit that what they do for a living isn't absolutely vital to a firm's well being," says LaStina.
"And yet, the person in charge of formulating a business recovery plan has to know who is doing what in the organizational chart--and who is strictly necessary for the ongoing health of a business versus who isn't, in order to prioritize properly," she explains.
Another pesky people matter--one that's high on Rothstein's list--is a lack of general awareness. "Many executives miss fairly obvious vulnerabilities," says Rothstein. "When I go into offices, I see it everywhere." These range from heavy cabinets loaded with hundreds of pounds of paper to poorly designed and overly crowded offices that wouldn't be easily evacuated in a time of crisis. The fix?
"Think about creating the most physically stable environment," he explains. "Do this even if you don't think you're in the earthquake belt."
A better way
As an alternative to scattershot planning, recovery experts think plans should be multifaceted, taking all aspects of the business into account. The resulting plan ought to be able to utterly resurrect a business, whatever happens.
First in a long series of recommendations from Valdez (see chart, below) is the need to place an individual in charge of continuity planning within the top two tiers of an organization. "Have a senior executive handle this and report directly to the CEO to get the best results," he says.
Rothstein thinks it strictly necessary for several key managers and officers to sit around a table in a conference room and go over the critical portions of an existing plan point by point, arguing each one's merits before adding or subtracting elements from the plan. This process might take place at a monthly gathering or it could be something done a few times annually, depending on the scope of change within the institution. It's also important to establish a chain of command in the case of a disaster, so that it's clear from the outset who calls whom and what action is taken in what order.
As a discipline, continuity planning should let a business come back online whatever the nature of the disruption. For a large organization, this involves having multiple layers of storage, including redundant arrays of independent disks (RAID) and the more enterprise-wide storage area network (SAN) approach. Increasingly, it may also involve outsourced offsite facilities that handle key aspects of functionality management. It could involve engaging an internet service provider to handle router management, for instance. Prioritizing is also important. Banks need to determine, to the extent possible, exactly what is worth it and why. Notes Valdez: "You need to ask yourself, 'What applications need to be brought up in what order, what processes must be brought up in minutes or hours, what processes can hold for a few weeks?'" Having the answers thought out in advance can make all the difference.
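The questions Valdez poses can be written down as a simple inventory. The sketch below is one hedged way to do it, not a method attributed to Valdez or Veritas: the `Process` record, the `restart_order` and `tier` functions, the tier cutoffs, and every process name and tolerance figure are invented placeholders. Each process is tagged with the longest outage the business can tolerate, and the restart order follows from sorting on that figure.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Process:
    name: str
    max_downtime: timedelta  # longest outage the business can tolerate (assumed)


def restart_order(processes):
    """Sort so that the least outage-tolerant processes come up first."""
    return sorted(processes, key=lambda p: p.max_downtime)


def tier(process):
    """Bucket a process by tolerance; the cutoffs are illustrative assumptions."""
    if process.max_downtime <= timedelta(hours=1):
        return "minutes"
    if process.max_downtime <= timedelta(days=1):
        return "hours"
    return "days or weeks"


# Placeholder inventory; a real one would come out of the planning sessions.
inventory = [
    Process("marketing data warehouse", timedelta(weeks=2)),
    Process("wire transfers", timedelta(minutes=15)),
    Process("internal e-mail", timedelta(hours=8)),
]

plan = [(p.name, tier(p)) for p in restart_order(inventory)]
for name, bucket in plan:
    print(f"{bucket:>13}: {name}")
```

A fuller version would also have to account for dependencies, since an application cannot come up before the network and data services it relies on.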
RECOVERY
Rebuilding your business

Level 1   People: Understanding, knowledge, dispersion, contact mechanisms
Level 2   Plan, processes: Tests, rehearsal, crisis, response team
Level 3   Physical facilities: Alternate sites, utilities, geopolitical, mother nature, network, command center
Level 4   Hardware and OS components: Servers, disk, OS systems software
Level 5   Network connectivity: LAN, WAN, hubs, routers, switches
Level 6   Network services: IP, DNS, DHCP, SNA, directory services
Level 7   Data services: Backup, recovery, file systems, SANs
Level 8   Systems management: Monitoring, alerting, job execution
Level 9   Security: System administration, user access control lists
Level 10  Other infrastructure components: Middleware, gateways, transaction managers, logs, ORBs, e-mail, fax
Level 11  Business data: DBMS, utilities, actual databases, resync checkpoints
Level 12  Enterprise shared applications: ISP, e-biz, ERP, GL, sales, VNet, e-mail, support, business operations
Level 13  User access: PC, LAN, office software
Level 14  Individual business applications and decision support: Legal, marketing, data warehouse, remote dial-in

Source: Veritas