Enterprise Information Architecture: Don't Do ECM Without It

By: Byrne, Tony
Publication: EContent
Date: Saturday, May 1 2004
Subject: Information management
Location: United States

Two questions resound throughout the content industry: Why do Enterprise Content Management (ECM) projects take so long to implement? And why do they fail with such alarming frequency? While all enterprise-level IT projects prove to be difficult and risky undertakings, a deeper examination of the ECM challenge in particular will reveal an endemic inattention to (or at best a belated appreciation of) its critical corollary: the need for Enterprise Information Architecture (EIA).

Information Architecture (IA) is commonly understood to be the art and science of structuring, organizing, and labeling information so that content owners can better manage it and users can find what they're looking for more effectively. IA can be bottom-up (i.e., analyzing and labeling content chunks) or top-down (i.e., developing standardized categorization schemes or taxonomies).

This traditional approach for IA works very well for improving individual Web sites or helping departments find and manage information more efficiently. But developing effective information architectures at the enterprise level presents another dimension altogether.

THE EIA PROBLEM

To understand the case for EIA, consider the following examples:

An enterprise employee needs to file a trip expense report. The policy that describes acceptable billings resides on a local server, but another applicable policy resides with HR, and the expense form application sits on the IT intranet. Thus, the enterprise's org chart stands between this weary traveler and her simple reimbursement task.

A major manufacturer changes a product category from "filters" to "furnace filters" when launching a new line of products and, as a result, the entire enterprise must respond in unison. Unless all the different business units embrace this new labeling simultaneously, customers will get confused across Web sites, channel partners, newsletters, call centers, etc.

A company sells software through multiple channels. Unfortunately, but not uncommonly, the marketing silo tells customers something different than its catalog and support groups do (or vice versa). Clearly they need a common labeling structure. To support casual site browsers, this requires a unified taxonomy across divisions. Assisting hard-core searchers requires content analysis across silos to describe chunks of information so that they can be retrieved effectively and properly linked together.

Lou Rosenfeld, an IA guru who teaches a highly-respected seminar on EIA, puts his finger on the problem saying, "No one is advocating for users of enterprise-wide content. We just have advocates for business units and this leads to a lot of frustration and confusing duplication."

ENTERPRISE IS HARDER

An enterprise-wide information architecture should define a model for working with disparate business units in a manner that presents employees and customers alike with a unified way of accessing information across the whole company, according to Peter Morville, president of Semantic Studios and Rosenfeld's co-author of O'Reilly & Associates' Information Architecture for the World Wide Web (a.k.a. the "Polar Bear book" for the bear depicted on its cover).

The problem is that, in an enterprise setting, IA evaluation methods, designs, and (especially) governance models become much more complex, but there are no textbooks for practicing IA in large, decentralized environments made up of content silos. Indeed, this very complexity is likely to introduce new, arcane concepts (like ontology, thesaurus, and metamodel) that make EIA seem all the more overwhelming, especially to senior decision-makers.

For example, at an enterprise level, different "types" of metadata rise to the fore, followed by the need for synonyms to achieve effective enterprise-wide search engine results. Rosenfeld cites the example of a multinational corporation whose national divisions employed a plethora of overlapping terms to describe employee time off: annual leave (Australia), the holidays (U.S.), public holidays (Australia, U.S.), vacation (U.S.), bank holidays (U.K.), holiday (Australia and U.K.), and personal leave (all). What term will any given searcher use? And remember that this international company had it easy: at least those terms were all in English!
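One lightweight way to cope with overlapping terms like these is a synonym ring: every variant points to a preferred term, and a query on any of them is expanded to all of them before it reaches the search engine. The sketch below is illustrative only. The preferred labels, the `SYNONYM_RINGS` table, and the `expand_query` function are assumptions, and the way the terms are split into two concepts is just one plausible reading, not something Rosenfeld or the company in his example describes.

```python
# Hypothetical synonym rings: each preferred term lists the variants that
# should retrieve the same content. The grouping is an assumption made for
# illustration, not taken from the article.
SYNONYM_RINGS = {
    "paid time off": {"annual leave", "vacation", "holiday", "personal leave"},
    "public holidays": {"public holidays", "bank holidays", "the holidays"},
}


def expand_query(query: str) -> set[str]:
    """Return every term a search engine should try for this query."""
    term = query.strip().lower()
    for preferred, variants in SYNONYM_RINGS.items():
        if term == preferred or term in variants:
            # Search on the preferred term plus all of its variants.
            return {preferred} | variants
    # Terms outside every ring pass through unchanged.
    return {term}


if __name__ == "__main__":
    print(sorted(expand_query("Bank holidays")))
```

A U.K. employee typing "bank holidays" and a U.S. employee typing "the holidays" would then be matched against the same set of terms, whichever label a given division used when it tagged the content.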

So, in order to face this level of complexity, there'd better be a darned good business case.

EIA OFFERS ROI

Rosenfeld notes that every enterprise information system possesses some notion of information or data architecture, albeit usually an independent one. By investing in EIA, companies can increase the ROI on all information applications by exposing the information across the corporation so users can find things faster and managers can do their jobs better.

Put another way, EIA helps users find the right information, which is the whole point of IT investments in the first place. In the words of Joseph Busch, founder and principal of Taxonomy Strategies, content becomes "actionable." For enterprise customers where that action is "buy things," EIA adds tangible value to the top line.

According to Morville, EIA can also add value to the bottom line by reducing costs. Among his clients, Morville has seen huge duplication of effort as different departments deploy various search engines, each configured differently. He also sees substantial redundancy of effort going into navigational structures, which makes it difficult and costly for enterprises to update Web sites as products and services evolve.

OK, EIA is important, but how do you actually practice it?

EIA AS A DISCIPLINE

There may be a dearth of EIA textbooks, but a set of common practices is slowly emerging. Some key issues revolve around:

* Balancing central versus local needs and authority

* Tactical versus strategic approaches

* Understanding (and acting on) the different (but related) EIA problem domains

Let's take a look at each issue in turn.

Centralized versus departmental control represents a struggle as old as the first distributed enterprise. In IA today, it appears that departments are winning for the most part. But there are some approaches to restoring balance.

Even in a highly decentralized environment, EIA specialists will, at a minimum, seek some coordination among different information systems, obtaining consistency in content models where possible (i.e. two departments agree to label first names "fname"), and mapping terms where things must have different labels or categories. In other words, information starts to become integrated at the metadata tier, not the content tier. The idea here is that departments don't have to comply with a central standard, but they do have to expose data models, especially if they want access to information in the central directory.
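Integration at the metadata tier can be pictured as a crosswalk between each department's field names and the shared labels, leaving the content itself where it is. The following is a minimal sketch under stated assumptions: the department names, the local field names other than "fname", and the `to_shared_model` helper are all hypothetical.

```python
# Hypothetical crosswalk: (department, local field name) -> shared field name.
# "fname" is the agreed label from the example above; everything else is
# invented for illustration.
CROSSWALK = {
    ("sales", "given_name"): "fname",
    ("sales", "surname"): "lname",
    ("support", "first"): "fname",
}


def to_shared_model(department: str, record: dict) -> dict:
    """Relabel one department's record using the shared field names.

    Fields with no crosswalk entry keep their local name, so a department
    is not forced to adopt the central standard for everything.
    """
    return {
        CROSSWALK.get((department, field_name), field_name): value
        for field_name, value in record.items()
    }


if __name__ == "__main__":
    print(to_shared_model("sales", {"given_name": "Ada", "region": "EMEA"}))
```

The design choice mirrors the paragraph above: departments only have to expose their data models so that the mapping can be written, not rewrite their own systems.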

Peter Hallett, VP of marketing at SchemaLogic, likens a standard enterprise content model to having a municipal plan or blueprint. If you want to construct a building, you know where to connect up the sewer, electricity, phone line, and so forth. That sort of standard infrastructure approach sounds nice as a goal, but Hallett is quick to point out that, given its complexity, the typical enterprise supports multiple taxonomies, each covering a broad swath of enterprise content, ranging from audience, to subject, to processes, and so forth.

Rosenfeld counsels a two-step approach to the problem of multiple taxonomies:

1. Build a shallow, broad central taxonomy that answers "where will I find the information I need?"

2. Rely on independent departmental taxonomies to answer "where in this area will I find the information I need?"
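The two steps can be sketched as a two-level lookup: the central taxonomy only routes a topic to an area, and each area's own taxonomy takes it from there. All topic names, department names, and paths below are hypothetical, chosen to echo the expense-report example earlier in the article.

```python
from typing import Optional

# Step 1: a shallow, broad central taxonomy. It only says which area owns
# a topic. (Hypothetical entries.)
CENTRAL_TAXONOMY = {
    "travel and expenses": "finance",
    "benefits": "hr",
}

# Step 2: deeper departmental taxonomies, maintained independently.
DEPARTMENT_TAXONOMIES = {
    "finance": {
        "expense policy": "/finance/policies/expenses",
        "expense form": "/finance/forms/expense-report",
    },
    "hr": {
        "leave policy": "/hr/policies/leave",
    },
}


def locate(topic: str, item: str) -> Optional[str]:
    """Find the owning area first, then the item within that area."""
    area = CENTRAL_TAXONOMY.get(topic)
    if area is None:
        return None
    return DEPARTMENT_TAXONOMIES.get(area, {}).get(item)


if __name__ == "__main__":
    print(locate("travel and expenses", "expense form"))
```

Because the central layer knows nothing about what sits inside each area, a department can reorganize its own taxonomy without touching the enterprise-level one.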

The much-lauded Microsoft intranet, MSWeb, was an early adopter of a similar approach two years ago. The intranet team catalogued a collection of taxonomies (some enterprise, some domain-specific) and allowed them to coexist. As at many companies, Microsoft's product taxonomy became sacrosanct, but individual departmental navigational taxonomies were also sanctioned, as long as they were registered centrally. Over time, the various intranet navigation schemes were harmonized as well.

Some U.S. federal agencies are also beginning to take a more centralized approach to metatagging standards. In one of his first acts as administrator of the Environmental Protection Agency, Mike Leavitt challenged his organization's departmental Web managers to "reorganize content to serve users, not your programs," and to develop a high-level, agency-wide taxonomy of 30 terms within six months (to be doubled in another six months).

Like all enterprise projects, EIA has both a tactical and strategic dimension. At a tactical level, it encompasses the standard IA design deliverables (ranging from wire frames and design guidelines to taxonomies and controlled vocabularies), but promulgated enterprise-wide. These, in turn, can help drive specifications for content management systems and search engine optimization.

The part about enterprise promulgation, though, requires some real strategy, and here's where information architects often need help. Rosenfeld argues for a formal EIA governance structure centered around an independent team, complete with charter, staff, and budget. In such a model, the group might serve as a self-funded consulting resource to the rest of the enterprise.

Centralized resources are handy, confirms Morville, because "different business units are going to be at different stages at different times and one way to accommodate that is to see those needs as market demand: One business unit might want different things than another."

But at the strategic level, companies must deal with corporate politics. Morville says, "Trying to get major stakeholders on the same page for navigation and search support, given dozens of different business units who control the actual content, requires a skill set that most information architects don't have." He suggests educating and winning over a senior manager who can communicate with peers and connect tactical detail with enterprise objectives.

Assuming that a company creates an EIA group or governance board, it needs a charter, which raises the question of scope. What should fall under its purview?

Rosenfeld has attempted to create a generic EIA roadmap that, while he worked hard to fit it on a single page, is still pretty intimidating. Rosenfeld admits as much. "The roadmap doesn't make sense for all enterprises, and some may be unworkable in your environment," he says, "but you should have an understanding of your options." And, of course, recognize relationships among the choices you make.

Rosenfeld's roadmap does have practical applications. It suggests, for example, that doing a content analysis before conducting a content inventory is premature. "I think the roadmap sums up the challenge," confirms Victor Lombardi, a former IA lead at Razorfish and now an information architect at a major New York City financial services firm. "Taxonomy is here, and search is there, and the roadmap shows all the steps we need to take over the next three years, listed on one page. That's pretty remarkable," says Lombardi.

Chart: Lou Rosenfeld's EIA roadmap, from "Very Soon" to "Way Off"

But as a practical matter, where to begin?

GETTING STARTED

If EIA sounds intimidating, consider the following practical advice. For veterans of epic enterprise content management campaigns, many of these suggestions will sound quite familiar.

Invest in search and very lightweight metadata in the near term. For search, the most important metadata attribute may be page or document title. "If you can standardize on implementation and naming conventions, you can improve users' experience dramatically," according to Rosenfeld. Simple search log analysis can persuade skeptical business managers that searchers are looking for information across silos, though you will likely need IT help to merge and analyze disparate logs.

Recognize the political implications of EIA. When trying to reconcile taxonomies, start at a broader level first and focus on official products, services, and other public-facing terms. Lombardi preaches, "Follow the politics." At his super-distributed financial services firm, he is trying to create an EIA process that is friendly to the firm's unique culture, emphasizing handbooks over rule-books and directories over gospels. In fact, Lombardi's firm is so decentralized that individual business units have complete freedom over product naming conventions. So Lombardi built a guide to all the products, classifying them and linking to their URLs.
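The kind of simple search log analysis mentioned here might look like the sketch below: merge the query logs from each silo's search engine and report the queries that turn up in more than one of them. The input shape (a mapping of silo name to a list of raw query strings) and the `cross_silo_queries` function are assumptions; real logs would first need parsing, which is where the IT help comes in.

```python
from collections import defaultdict


def cross_silo_queries(logs: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return the queries seen in two or more silos, with the silos for each.

    `logs` maps a silo name to the raw queries taken from its search log.
    Queries are compared after trimming whitespace and lowercasing.
    """
    silos_by_query = defaultdict(set)
    for silo, queries in logs.items():
        for query in queries:
            silos_by_query[query.strip().lower()].add(silo)
    # Keep only the queries that cross silo boundaries.
    return {q: silos for q, silos in silos_by_query.items() if len(silos) > 1}


if __name__ == "__main__":
    sample = {
        "hr": ["Expense Policy", "leave"],
        "it": ["expense policy ", "vpn"],
        "finance": ["expense policy"],
    }
    print(cross_silo_queries(sample))
```

A short list of queries that appear on every departmental search page is often enough evidence that users are not thinking in terms of the org chart.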

Publish guides. Sometimes this is the best way to help people find stuff. Rosenfeld defines a guide as a "single page containing a selective set of important links embedded in narrative text that address important, common user needs." A guide might highlight a specific topic or help that corporate traveler complete her expense reimbursement process. Guides are easy to create, and, Rosenfeld notes, they minimize political headaches by creating new real estate rather than redeveloping old turf.

Consider starting with one business unit or Web site at a time. Look for high-value content that is heavily trafficked to give your pilot project some visibility. Mike Lee, the manager of creative Web development for the AARP.org site, took this approach. Over the past year, he and his team have been converting the organization's main site into a content management system and registering the content with rich metadata. Like most enterprises, AARP has several Web sites, but now Lee has a master information template to work from.

Don't be a control freak. "IA inherits a lot from Library and Information Science, which is much more about controlling the organization of content," notes information architect Lombardi, echoing a common lament among EIA specialists. "We need to be more like Google, and help content organize itself," he adds. AARP's Lee points out that, at a minimum, an IA specialist can serve simply as a resource to other departments struggling to figure out how to work together.

Table: Companies Featured in This Article

GET VERTICAL

Thus far, we've been discussing working horizontally across multiple silos to create common content models and directories. But enterprises also face a vertical IA problem: converging data that's often buried in legacy systems with front-end (typically customer-facing) unstructured content. For example, you might want to link a product or ingredient (data) with a recipe (content).

At Taxonomy Strategies, Busch has worked on a couple of commercial projects where the enterprises were very interested in integrating back-end supply chain data stores with their public ecommerce Web content. In one case he says, "The IA people had designed a lovely user experience but could not make integration happen because they couldn't bridge to legacy data," since no one had aligned the two systems' metadata.
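Aligning metadata across the two tiers can be as modest as making sure content is tagged with the same identifiers the back-end data uses. The sketch below follows the product-and-recipe example; the SKU values, field names, and both helper functions are hypothetical, and a real project would be bridging actual systems rather than in-memory structures.

```python
# Structured back-end data, keyed by a product identifier (hypothetical).
PRODUCTS = {
    "SKU-100": {"name": "Olive oil", "in_stock": True},
    "SKU-200": {"name": "Sea salt", "in_stock": False},
}

# Unstructured front-end content, tagged with those same identifiers.
RECIPES = [
    {"title": "Simple vinaigrette", "ingredient_skus": ["SKU-100", "SKU-200"]},
    {"title": "Roasted vegetables", "ingredient_skus": ["SKU-100"]},
]


def recipes_for_product(sku: str) -> list[str]:
    """Content that should be linked from a product page."""
    return [r["title"] for r in RECIPES if sku in r["ingredient_skus"]]


def products_for_recipe(title: str) -> list[dict]:
    """Back-end records that should be linked from a recipe page."""
    for recipe in RECIPES:
        if recipe["title"] == title:
            return [PRODUCTS[s] for s in recipe["ingredient_skus"] if s in PRODUCTS]
    return []


if __name__ == "__main__":
    print(recipes_for_product("SKU-100"))
```

Without the shared key, neither lookup is possible, which is essentially the failure Busch describes.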

The gap between data modeling and information architecture is part cultural but still quite real. It often manifests itself as a puzzling bifurcation between content and services in many enterprises, including on large government Web sites.

There is a growing sense that to do EIA, IA specialists are going to have to step up and deal with data and legacy IT systems and issues, instead of just content and user experience. To be sure, this will be a difficult (and perhaps not always welcomed) transition for those without an IT background. Busch also points out that "data architecture (DA) people have the opposite problem: just because you have good clean data, it doesn't mean you're going to have a good user experience." Clearly there is a need to meet in the middle, with IA and DA specialists cooperating to develop truly enterprise applications.

This is particularly the case in financial services, where lots of data resides in enterprise systems that remain far removed from customer interfaces. Lombardi, who practices IA in a large financial services firm, sees much of his work as decreasing the distance between the enterprise and its various customers. "You can have the most organized thesaurus," he says, "but if it doesn't serve the customer, it has no effect."

Lombardi has found that making information models customer-centric tends to take control away from business people (or is at least perceived that way), and some managers become uneasy about this. A good EIA strategy might reward information managers who think about the customer first and the business unit second, the way the EPA's Leavitt has mandated.

WHERE'S THE EIA?

If EIA is so great, then why is it apparently practiced so little? There are several plausible reasons.

Many IA specialists concede a generalized failure across the industry in getting the message of enterprise IA across to senior corporate leaders in ways that would spawn more effective champions and project sponsors. Says AARP's Lee, "As the discipline grows, we are going to have to be able to evangelize our techniques to a room full of C-level leaders." Lee recounts a breakthrough of his own when he persuaded the CFO at a major financial institution to learn about the role of "user personas" and support their application in the enterprise.

Ironically, EIA also suffers from a lack of expensive software being associated with it. Without a big budget line item (and major implementation) at stake, EIA tends not to command the attention of senior sponsors. As a result, IA staffers get tacked on to heavyweight ECM projects, sometimes as an afterthought.

Some responsibility for the dearth of EIA activity also lies with IA specialists themselves. There is a tendency in the IA community to over-invest precious energy in KM-esque intellectual debates about ontologies and topic maps, when thought and research could better be applied to more pressing issues, like how to build compelling business cases for a corporate EIA team.

THE FUTURE OF EIA

Still, some positive change is afoot. A recent survey by the Asilomar Institute for Information Architecture (AIfIA) found significantly more IA specialists working at big companies than in the two years previous. During the research for this article, a handful of major enterprises were reluctant to discuss their own EIA successes; clearly they see their hard-earned EIA achievements as a competitive advantage. Doubtless there is more good EIA work going on out there than is generally known.

Amid nearly endemic corporate reorganizations and priority shifts, departments today get buffeted a lot. A canonical reference repository or unified content model can serve as a force for continuity and adaptation that an enterprise can rally around to make sure its most valuable and sought-after information continues to be readily available to staff and customers alike.

IA guru Rosenfeld believes that most enterprises are going to get this right at some point. His only question is whether it will take three to five years or ten to fifteen. Surely a solid EIA strategy, effectively communicated to company leaders, will accelerate the process.

SIDEBAR

A Select Glossary of IA Terms

The IA community, characteristically, has come up with its own common glossary of terms, excerpted here from the IA Wiki:

Bottom Up: A process of developing an information architecture based on an understanding of the content and the tools used to leverage that content (e.g. search, indexes). This involves the creation of building blocks, the databases to contain them, and the procedures for their maintenance.

Card Sorting: A user-centered design method to discover the inherent categories of collections of content.

Content Inventory: A complete list of all the content that the information space holds and will hold.

Controlled Vocabularies: A collection of preferred terms that are used to assist in more precise retrieval of content. Controlled vocabulary terms can be used for populating attribute values during indexing, building labeling systems, and creating style guides and database schema. One type of a controlled vocabulary is a thesaurus.

Labeling: The systematic application of terms used to describe content objects. A Controlled Vocabulary can be used to develop appropriate labels.

Metadata: A definition or description of data, often described as data about data. For example, the data of a newspaper story is the headline and the story, whereas the metadata describes who wrote it, when and where it was published, and what section of the newspaper it appears in. Metadata can help us determine who content is for and where, how, and when it should appear.

Ontology: Resembles faceted taxonomies but uses richer semantic relationships among terms and attributes, as well as strict rules about how to specify terms and relationships. Because ontologies do more than just control a vocabulary, they are thought of as knowledge representation. The oft-quoted definition of ontology is "the specification of one's conceptualization of a knowledge domain."

Taxonomy: A set of controlled vocabulary terms, usually hierarchical. Once created, it can help inform navigation and search systems.

Top Down: The process of developing an information architecture based on an understanding of the context of the content and the user's needs. This involves determining the scope of the site and the creation of blueprints and mockups detailing the grouping and labeling of content areas.

User Personas: A user archetype you can use to help guide decisions about product features, navigation, interactions, and even visual design.

Wire Frames: A rough outline of page elements and their arrangement within the page.

SIDEBAR

Where are the Tools?

Since Enterprise Information Architecture can become quite laborious, you might think a thriving market for EIA automation tools would emerge. Think again.

Illustration: Participants sort terms by dragging them from the left pane to the right in EZSort.

Illustration: Participants using Boiko's custom collaboration tools can hold an asynchronous discussion regarding a change to a vocabulary term.


Enterprises are investing in content integration applications, although that's not the same as EIA. Point-to-point integration applications (often homegrown) can allow two systems to share content, but this tight coupling tends to become fragile in the face of business change and the nearly inevitable need to draw other repositories into the mix. Enterprise Content Integration (ECI) software is designed to solve this problem by creating a "virtual abstraction layer" above diverse information sets [see "Content Integration", EContent, March 2003, pp. 26-31], but successful implementations typically presume that the enterprise has worked out some sort of underlying reference model to describe and find the content intended to be shared or exchanged. In short, ECI depends on EIA.

Collaboration Tools

Since much of EIA revolves around obtaining group consensus, there could be a role for specialized collaboration tools. IBM recently released a beta version of EZSort, a client application to help companies organize information based on users' expectations gathered from card-sorting exercises. This is potentially useful, but card sorting represents only a very small piece of the EIA puzzle.
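For readers unfamiliar with how card-sorting results are typically summarized, one common first step is to count how often each pair of cards lands in the same group across participants. The sketch below shows only that counting step; it is not a description of how EZSort works, and the `co_occurrence` function and the sample cards are assumptions.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts: list[list[list[str]]]) -> Counter:
    """Count how many participants placed each pair of cards in one group.

    `sorts` holds one entry per participant; each entry is that
    participant's list of groups, and each group is a list of card names.
    Pairs are stored in alphabetical order so (a, b) and (b, a) match.
    """
    counts = Counter()
    for participant_groups in sorts:
        for group in participant_groups:
            for a, b in combinations(sorted(set(group)), 2):
                counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    sorts = [
        [["filters", "furnace filters"], ["invoices"]],
        [["filters", "furnace filters", "invoices"]],
    ]
    for pair, n in co_occurrence(sorts).most_common():
        print(pair, n)
```

Pairs with high counts are candidates for sharing a category; turning the counts into an actual taxonomy still takes judgment.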

CMS Bible author Bob Boiko, now a professor at the University of Washington, has also developed a collaboration utility with some of his students. Boiko has been around the EIA block too many times to believe that companies can easily create a common taxonomy, even when people are talking about the same thing. His Web-based tool assumes that business units in the same enterprise still use different vocabularies and simply tries to create a kind of metadictionary to relate terms.

In short, the application allows a set of people who implicitly share a common vocabulary to co-create an explicit vocabulary. You can't buy this software; it was simply a feasibility study. Through experiments, Boiko's team found that collaboration was indeed feasible, but automation adherents take note: it still took a lot of attention and energy on the part of the participants.

SchemaLogic

One software company, SchemaLogic, has addressed the EIA marketplace head-on. The company's product, SchemaServer, manages taxonomies and vocabularies in shared repositories. It imports, reconciles, stores, and makes those models available to departmental subscribers, who can import pieces into their systems.

As you might imagine, reconciling is the tricky part. Whenever a change to a vocabulary is suggested, SchemaServer supports an annotated consent process that includes voting and owner veto where necessary. In the end, though, there is still manual work for the various systems representatives to reconcile the changes, ideally coordinated by an IA specialist who is administering the whole process.
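To make the idea of an annotated consent process concrete, here is a minimal sketch of a vocabulary change that is put to a vote, with the term's owner able to veto. It is not SchemaServer's API or data model, which the article does not describe; the `ChangeProposal` class, the simple-majority rule, and the system names are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeProposal:
    """A proposed relabeling of one vocabulary term (hypothetical model)."""
    term: str
    new_label: str
    owner: str                                  # system that owns the term
    votes: dict = field(default_factory=dict)   # voter -> True / False
    notes: list = field(default_factory=list)   # (voter, comment) pairs

    def vote(self, voter: str, approve: bool, comment: str = "") -> None:
        """Record a vote, keeping any comment as an annotation."""
        self.votes[voter] = approve
        if comment:
            self.notes.append((voter, comment))

    def accepted(self) -> bool:
        """Simple majority of votes cast, unless the owner voted no."""
        if self.votes.get(self.owner) is False:
            return False  # owner veto overrides the tally
        approvals = sum(1 for approved in self.votes.values() if approved)
        return approvals > len(self.votes) / 2


if __name__ == "__main__":
    proposal = ChangeProposal("filters", "furnace filters", owner="catalog")
    proposal.vote("marketing", True)
    proposal.vote("support", True, "Call scripts need two weeks to update.")
    proposal.vote("catalog", False, "Conflicts with existing SKU groups.")
    print(proposal.accepted())
```

Even in a sketch this small, the software only records the decision; someone still has to carry the accepted change into each subscribing system.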


The software sounds fascinating, but it may be a measure of the relative immaturity of EIA that the company can boast only one implementation in production after more than a year of promoting the product. (SchemaLogic says several other firms are piloting the software.)

Peter Hallett, the company's VP of marketing, remains optimistic: "If you build an infrastructure where people can share content structures and then see how each other system is using similar or even overlapping information structures, then it is not a far step to getting to core taxonomies or vocabularies. So, when anything is changed, that impact is known and people can have access to it, and subscribing systems can get the new labels as a web service."

Most EIA specialists maintain a healthy skepticism for automated solutions. "I haven't seen any good EIA tools except those that help with project management," harrumphs Rosenfeld. But Taxonomy Strategies' Busch cautions that if you are serious about EIA, you will likely need some kind of enterprise reference data repository or tool, even if it's just MS Excel.

Illustration: Representatives from different information systems within an enterprise use SchemaServer to "vote" on changes to a shared vocabulary term.


TONY BYRNE (tbyrne@cmswatch.com) is founder and principal of CMSWatch (www.cmswatch.com) and author of the CMS Report, now in its 5th Edition.

Comments? Email letters to the editor to ecletters@infotoday.com.
