
DEVELOPMENT OF A WEB SITE USABILITY INSTRUMENT BASED ON ISO 9241-11

INTRODUCTION

Web site usability is an important issue in e-commerce. The redesign of the Staples Web site, prompted by usability testing, helped decrease drop-off rates in the registration process by 25 percent. Statements about the importance of Web site usability come at a time when consumers are increasing spending at B2C Web sites. The continuing growth of e-commerce business-to-consumer sales remains apparent. E-commerce estimates for the first quarter of 2004 from the Census Bureau of the Department of Commerce put online sales at $15.5 billion, a 28.1 percent increase over first-quarter 2003 sales. More recently, the Holiday eSpending report from Harris Interactive, Goldman Sachs and Nielsen NetRatings indicated that online business-to-consumer sales rose by 19% in November 2004 over November 2003.

However, studies have reported that current websites contain numerous usability problems (23). Difficult-to-understand formats, difficulty in navigation, disorientation, and lack of interaction and reliability are frequently mentioned problems (38). E-commerce experts agree that poor website design is one of the major reasons for recent dot.com failures, and over half of online traffic was driven away due to poor website design (38).

This paper attempts to better understand Web site usability by developing an instrument based on the often-referenced ISO 9241-11 standard for usability. A literature review on usability and Web site usability is followed by an explanation of the methodology used, the results of the e-commerce simulation, and discussion of the results.

LITERATURE REVIEW

Usability Research

Usability is the most traditional concept in HCI research. Usability can be defined as a "measurable characteristic of a product's user interface that is present to a greater or lesser degree" (19). There are multiple attributes and definitions of usability, including learnability, efficiency, memorability, control of errors, and satisfaction (25).

Usability testing is a process for determining what problems the user may have in using the system (32). One appropriate methodology is contextual design. Contextual design is a customer-centered approach to designing products. It is based upon customer data gathered in the field and models how a customer works (6). According to Binstock (7) "two factors central to building usability into applications are interaction design and usability testing. Both practices seek to ensure the user's experience with the software is consistent with expectations; that the use of the software is intuitive; and that there's no needless obstacles to successful completion of the transaction."

Others have tried to define the factors that must be considered when evaluating a system. Reiter and Oppermann (34) believed a complete evaluation of human-computer interaction must consider the user, the tasks, the computer, the organization, and the relations between them. Bevan and Macleod (5) used more broadly defined terms for what should be included in evaluations of information technologies: tasks, equipment, and environment.

Usability must extend beyond the issues of ease of use, ease of learning, and navigation (35). With the emergence of the Internet there has been additional pressure for the design to be intuitive because often there is no opportunity to train customers to use the software, and users easily leave a Web site for another if unsatisfied.

Usability Definitions

The term usability has been used in many different ways, making it a very confusing concept. Seffah and Metzker (40) explain that "usability refers to both a set of independent quality attributes such as user performance, satisfaction, and learnability, or all at once, making it very difficult to precisely measure usability." The different viewpoints have led to different definitions and standards. Without consistent terminology it is difficult to examine the concept of usability (40).

Some of the more frequently used definitions of usability are those of ISO 9241-11 and Jakob Nielsen (25). The ISO 9241-11 standard says usability is: "The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use" (15). Although widely quoted, the ISO 9241-11 definition has done little to help us understand what usability actually means.

Quesenbery (32) notes three important criticisms of the ISO 9241-11 definition: (1) It is too focused on well-defined tasks and goals, either ignoring the less tangible elements of user experience or forcing simplistic definitions of tasks (such as reducing an e-commerce site to the simple task "buy things"). (2) The emphasis on efficiency and effectiveness as the most important attributes of an interaction makes it difficult to talk about how usability applies to products or contexts where these are less important. Work that looks at pleasure, engagement, or other difficult-to-measure emotional aspects is often defined as "beyond usability." (3) "Satisfaction" is not a robust enough term to cover the needs in many situations. Based on Quesenbery's criticisms, it appears the ISO 9241-11 definition of usability may have been acceptable in a context of enterprise or other work-related applications, but in the consumer world of shopping, information-seeking and online services, the ISO 9241-11 definition is not a broad enough view of human interaction to describe the usability goals of either the users or the business (32).

Quesenbery (32) built on Nielsen's definition of Web site usability as well as the ISO 9241-11 to develop five dimensions which can be used in a Web site setting as well as for software development. Quesenbery's five dimensions of usability include effectiveness, efficiency, engagement, error tolerance, and ease of learning.

Empirical Tests of Website Usability

The idea of website usability has been around for some time. Only recently, though, have there been attempts to develop and test website usability as a theoretical construct (1, 28, 43). According to Palmer (28), the measure of what users want in a website is an important area of study because the website is a primary user interface for net-enabled business, information provision, and promotional activities (2, 16, 39).

Agarwal and Venkatesh (1) presented categories and subcategories incorporating the Microsoft Usability guidelines, while developing an instrument that operationalizes website usability. Their findings suggested that the evaluation procedure, the instrument, and the usability metric all exhibit good metric properties. Palmer (28) also created a metric/instrument for the study of website usability (see Figures 2-3). Palmer suggested that robust metrics can be obtained from multiple sources to identify key usability elements for website design. Building on earlier works in usability (23, 31, 38), Palmer developed a set of constructs that suggest a focus on download delay, organization, and navigation as well as media richness (8) with interactivity and responsiveness constructs (29, 45). Palmer's (28) study found that website usability factors (download delay, navigability, content, interactivity, and responsiveness) are important in explaining the success of websites. Although Agarwal and Venkatesh (1) and Palmer (28) developed good starting points for studying website usability empirically, Green and Pearson (12) found that both instruments needed modification for use in e-commerce research. Other studies examined similar variables such as download time, navigation, graphics usage, and interactivity (48).

Web Site Usability as a Valid Research Construct

Several studies have attempted to explain or predict online customer perceptions within a business-to-consumer (B2C) setting. B2C e-commerce is the ability of consumers to purchase products and services online using Internet technologies and associated infrastructure (27). For example, McKnight, Choudhury, and Kacmar (22) proposed a Web-customer satisfaction model which includes several usability constructs (e.g. navigation and interactivity), finding significant influence on online customer satisfaction. Szymanski and Hise (44) also proposed an online customer satisfaction model which includes ease of navigation as a usability construct. Devaraj, Fan, and Kohli (9) included supportability as a part of their B2C channel satisfaction and preference model. However, these studies included only a small number of Web site usability constructs in their models, and therefore the full effects of Web site usability on online customer perceptions were not precisely captured.

Recently researchers have begun proposing theoretical models of Web site usability (1, 17, 18, 28, 41). For example, based on the analogy of Web sites as buildings, Kim et al. (17) adopted a theory of architectural quality to measure Web site architectural quality, including firmness, convenience, and delight. Singh et al. (41) adopted Kaplan's model of landscape preference to explain Web site preference, assuming that Web site preference is similar to landscape (or place) preference.

Although these models provided a better understanding of the effects of Web site usability constructs, they examined only direct effects of Web site usability constructs on online customer perceptions, not indirect effects through other usability constructs. Recently, some empirical evidence on these relationships was provided by Web site usability researchers. For instance, Norman (26) pointed out that simplicity is positively related to ease of navigation, but negatively related to interactivity. Nielsen (23) also identified a positive influence of consistency on navigability and a negative influence of simplicity on interactivity. While researchers requested further studies to examine those relationships (17), there has been no study that systematically examined several indirect relationships to Web site usability or acceptance.

Previous studies have found that a usable web site creates a positive attitude toward online stores, increases stickiness and revisit rates, and eventually stimulates online purchase (3). Browsing behavior has also been examined, but is very difficult to quantify (42). A delay occurs for browser users when a user clicks on a hyperlink and nothing seems to happen for several seconds. Several studies have determined delay to be one of the most important aspects of e-commerce quality (21, 46, 47), seriously interfering with a site's usability (43).

Fui-Hoon Nah (11) examined another aspect of Web site usability, finding that the availability of feedback prolongs Web users' tolerable waiting time, reducing uncertainty about the wait. An experimental study also found that most users are willing to wait for only about 2 seconds for simple information retrieval tasks on the Web.

Hall and Hanna (14) conducted an experiment that resulted in several findings for Web site usability: colors with greater contrast ratio generally lead to greater readability, color combination did not significantly affect retention, preferred colors (i.e. blues and chromatic colors) led to higher ratings of aesthetic quality and intention to purchase, and ratings of aesthetic quality were significantly related to intention to purchase.

Web Site Usability Instrument

The current study used Quesenbery's definition and dimensions of Web site usability as a foundation for building an appropriate instrument. Quesenbery defines Web site usability by building on the ISO 9241-11 definition. Web site usability is defined as the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, engagement, error tolerance, and ease of learning in a specified context of use.

Effectiveness is the completeness and accuracy with which users achieve specified goals. It is determined by looking at whether the user's goals were met successfully and whether all work is correct.

Efficiency is the speed (with accuracy) with which work can be done. Efficiency may be a subjective judgment of when a task is taking "too long" or "too many clicks." Quesenbery (32) notes that navigation design elements such as keyboard shortcuts, menus, links and other buttons all have an impact on efficiency. When they are well-designed, with clearly expressed actions, less time and effort are needed for the user to make navigation and action choices.

Engagement is defined as how pleasant, satisfying or interesting an interface is to use. "Engaging" replaces "satisfaction," looking for a word that suggests the ways that the interface can draw someone into a site or a task. It also looks at the quality of the interaction, or how well the user can connect with the way the product is presented and organized.

The error tolerance of a Web site is how well the product prevents errors, and helps the user recover from any errors that do occur. It would be great to say "error free" or "prevents errors" but mistakes and accidents and misunderstandings will happen. You misread a link and need to find your way back, or enter a number with a typo. The real test is how helpful the interface is when an error does occur.

Ease of learning refers to how well the Web site supports both initial orientation and deeper learning. A Web site may be used just once, once in a while, or on a daily basis. It may support a task that is easy or complex; and the user may be an expert or a novice in this task. Every time it is used, the interface must be remembered or relearned, and new areas of the Web site may be explored over time.

Quesenbery (32) attempted to consolidate important usability concepts. Based on the five E's of Web site usability, a study was conducted to examine the dimensions of usability and their ability to predict Web site performance.

PROCEDURE

Although Web site performance is important to several target populations, the current study focuses on the online consumer in accordance with the Pavlou (30), Palmer (28), and Agarwal and Venkatesh (1) studies. The study utilized 375 undergraduate business students from a large Midwestern university.

Students were directed to an e-commerce scenario. The scenario involved the process of searching, selecting, and inquiring about products available from the selected Web retailer (Sears.com). In the task, each participant was asked to find all the items of a shopping list on the Web site. After the subject put all the items into a shopping cart, he or she was required to go through the checkout process up to the point of actually buying the product. The participants then completed a survey containing questions related to the five dimensions of Web site usability on a 7-point Likert-type scale (1 = Strongly Disagree, 7 = Strongly Agree) (Table 1).

ANALYSIS

Exploratory Factor Analysis

When developing a measure for Web site usability, Quesenbery (32) provides categories which give a starting point for understanding the underlying factors. Exploratory factor analysis is appropriate for this study because it seeks to uncover the underlying structure of a set of variables. The researcher's a priori assumption is that any indicator may be associated with any factor. The exploratory factor analysis will allow the development of Web site usability as a valid construct for further research or statistical analysis such as regression by showing that the variables exhibit acceptable discriminant and convergent validity.

Inspection of the correlation matrix found significant relationships among the variables, supporting the use of factor analysis. The Kaiser-Meyer-Olkin (KMO) measure was .926, supporting the adequacy of the sample for use in factor analysis. The Bartlett test of sphericity, which examines the presence of correlations among the variables, was used to determine the appropriateness of factor analysis on the entire correlation matrix. For the present data, the Bartlett test of sphericity was highly significant, χ² = 3364.826, p < .001, with 136 d.f., suggesting that the variables are appropriate for factor analysis.
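The Bartlett statistic reported above can be illustrated with a minimal sketch. It assumes the standard form of the test, chi-square = -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix of p variables observed on n cases. The function names and the two-variable matrix are hypothetical and are not the study's data. Under this formula, 136 degrees of freedom would correspond to 17 items, a count the text does not state directly.

```python
import math

def bartlett_df(p):
    """Degrees of freedom for Bartlett's test with p variables."""
    return p * (p - 1) // 2

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_chi_square(corr, n_obs):
    """Bartlett's sphericity statistic for a correlation matrix (assumed formula)."""
    p = len(corr)
    return -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))

# Hypothetical two-item correlation matrix; 375 matches the sample size in the text.
toy_corr = [[1.0, 0.5], [0.5, 1.0]]
print(bartlett_df(17), bartlett_chi_square(toy_corr, 375))
```

An identity correlation matrix has determinant 1, so the statistic is zero; the stronger the correlations, the smaller the determinant and the larger the statistic.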

Deriving Factors and Assessing Overall Fit

Extraction and retention of the factors is the next step in the factor analysis of the data set under study. A principal component analysis was used. Three factors were retained based on the latent root criterion (eigenvalues > 1). An eigenvalue represents the amount of total variance accounted for by a factor. Based on the related eigenvalues, the three factors explain 82.83% of the variance in the 5 variables.
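The latent root criterion and the variance-explained figure can be sketched as follows. For a principal component analysis of a correlation matrix, the eigenvalues sum to the number of variables, so the percentage of variance explained is the sum of the retained eigenvalues divided by the sum of all eigenvalues. The eigenvalues and function names below are hypothetical and serve only to illustrate the arithmetic.

```python
def retain_by_latent_root(eigenvalues, cutoff=1.0):
    """Keep components whose eigenvalue exceeds the cutoff (latent root criterion)."""
    return [ev for ev in eigenvalues if ev > cutoff]

def percent_variance_explained(eigenvalues, retained):
    """Share of total variance carried by the retained components, in percent."""
    return 100.0 * sum(retained) / sum(eigenvalues)

# Hypothetical eigenvalues for a six-variable correlation matrix (they sum to 6).
eigenvalues = [3.0, 1.3, 1.1, 0.3, 0.2, 0.1]
retained = retain_by_latent_root(eigenvalues)
share = percent_variance_explained(eigenvalues, retained)
print(len(retained), round(share, 1))
```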

Rotation of the factor structure is used to interpret the factors in the order of importance. A Direct Oblimin rotation was used to develop theoretically meaningful factors for Web site usability. Each of the factor loadings exceeds .40, which is considered significant (13). It is also important to examine the communalities, which represent the amount of variance in an individual item that is accounted for by the factor solution. Each of the communalities for the sample is above .5, the acceptable threshold for explained variance.
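The two screening rules above (loadings above .40, communalities above .5) can be sketched as a simple item check. The sketch assumes an oblique solution described by a pattern matrix P and a factor correlation matrix Phi, in which case the communality of item i is the sum over factors j and k of P[i][j] * Phi[j][k] * P[i][k]. With uncorrelated factors this reduces to the sum of squared loadings. All values and function names are hypothetical.

```python
def communality(loadings_row, factor_corr):
    """Variance of one item explained by the factors (oblique-solution formula)."""
    k = len(loadings_row)
    return sum(
        loadings_row[j] * factor_corr[j][m] * loadings_row[m]
        for j in range(k)
        for m in range(k)
    )

def primary_loading(loadings_row):
    """Largest absolute loading of the item across factors."""
    return max(abs(value) for value in loadings_row)

def item_passes(loadings_row, factor_corr, min_loading=0.40, min_communality=0.50):
    """Apply both thresholds named in the text to a single item."""
    return (
        primary_loading(loadings_row) > min_loading
        and communality(loadings_row, factor_corr) > min_communality
    )

# Hypothetical two-factor solution with a factor correlation of .30.
phi = [[1.0, 0.3], [0.3, 1.0]]
print(item_passes([0.8, 0.1], phi), item_passes([0.35, 0.2], phi))
```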

The Direct Oblimin rotation resulted in the retention of four factors. The retained four-factor structure resulted in items with factor loadings greater than .4 with a very clean structure, avoiding the need to delete items and rerun the analysis (Table 2). Three of the original factors (effectiveness, error tolerance, and ease of learning) were discriminant factors, holding to their original form. The other two factors, efficiency and engagement, loaded together onto a single factor. Based on the definitions and the questions asked, efficacy appeared to be a better label for the new factor. According to Webster's dictionary, efficacy is the power to produce a desired effect. Efficacy relates to Quesenbery's definition of efficiency in that speed is the ultimate outcome that is expected. Efficacy is also appropriate for the engagement category in that it involves the pleasantness and satisfaction of the Web site. Again, both measures capture individual perceptions of a Web site; therefore the efficacy label seems appropriate in this measure, where the user's intended goals are so important to understanding the usability of a Web site.

The reproduced correlation matrix allows the observed correlations to be compared with the correlations reproduced by the factor solution. There are 42 (30.9%) nonredundant residuals with absolute values greater than 0.05. This 30.9% is acceptable because it is below the acceptable level of 50% (13). Thus, we can conclude that there was not a significantly large difference in the residuals between the observed and reproduced correlations, supporting the appropriateness of continuing analysis with this data.
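The residual check can be sketched as a count over the distinct off-diagonal pairs of the correlation matrix. The function name and the three-variable matrices are hypothetical. As a point of arithmetic, 42 large residuals out of 136 distinct pairs (the number of pairs among 17 items, which the 136 degrees of freedom of the Bartlett test suggest) is about 30.9%.

```python
def residual_summary(observed, reproduced, threshold=0.05):
    """Count nonredundant residuals whose absolute value exceeds the threshold."""
    p = len(observed)
    total = p * (p - 1) // 2  # distinct off-diagonal pairs
    large = sum(
        1
        for i in range(p)
        for j in range(i + 1, p)
        if abs(observed[i][j] - reproduced[i][j]) > threshold
    )
    return large, total, 100.0 * large / total

# Hypothetical observed and reproduced correlation matrices.
observed = [[1.0, 0.50, 0.30], [0.50, 1.0, 0.20], [0.30, 0.20, 1.0]]
reproduced = [[1.0, 0.40, 0.29], [0.40, 1.0, 0.22], [0.29, 0.22, 1.0]]
print(residual_summary(observed, reproduced))
print(round(100.0 * 42 / 136, 1))
```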

In summary, an exploratory factor analysis was used to identify the underlying factor structure of Web site usability based on the categories theorized by Quesenbery (32).

Nomological Validity

Multiple regressions were performed to demonstrate the instrument's nomological validity. As shown in Table 3, only efficacy was found to be a significant predictor of intention to return to the Web site (20), an important measure for e-commerce sites (47). The instrument explained 12.3% of the variance of intention to return to the Web site. Table 4 shows that error tolerance was found to be a significant predictor of intention to make a transaction with the Web site in the regression model, another important evaluation of the consumer's experience with an e-commerce Web site (4, 30, 49, 50). The Web site usability instrument explained 2.8% of the variance of intention to transact with the Web site.
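Because the discussion refers to adjusted R², a minimal sketch of the usual adjustment may help: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of cases and k the number of predictors. The inputs below are assumptions for illustration only: n = 375 matches the sample size, k = 4 assumes the four retained factors were the predictors, and the text does not say whether the reported percentages are raw or adjusted values.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Illustrative inputs (assumed): a raw R-squared of .123, 375 cases, 4 predictors.
print(round(adjusted_r_squared(0.123, 375, 4), 4))
```

With a sample this large relative to the number of predictors, the adjustment is small, so low explained variance here reflects the model rather than the penalty.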

DISCUSSION & CONCLUSION

The regression results demonstrated that although some nomological validity of the instrument was present, the adjusted R² values for both regressions were rather small. The results tell us one of two things: either the instrument does not fully encompass all the dimensions of Web site usability, or Web site usability is not as important a factor for retaining potential customers as thought. There were limitations to the study which may provide some insight into the results. The study used students, and although students are deemed acceptable for use in e-commerce research, there still remains a doubt about the representativeness of the sample. The participants were also told which items to search for on the Web site and given a realistic scenario, but one that may not have applied to their interests or needs.

This study shows that Web site usability, at least as defined by ISO 9241-11, does not account for all of the variance of a shopper's intention to make a transaction with a particular site. Other factors, such as provision of information on the firm, presence of an FAQ section, use of multimedia, user accounts, security, and privacy statements have already been shown to significantly impact online sales (33). In addition, factors such as price, brand, variety, tax, fee, delivery, and shipping cost are likely to be important criteria as well for users of B2C Web sites.

One must note that other measures of Web site usability have been developed that account for larger proportions of the variance in measures such as intention to return to a site and intention to make a transaction, although this study is the first to attempt a strict adherence to the ISO 9241-11 standard for usability in the domain of the Web. Questions about a better definition for Web site usability should be debated. Studies examining more comprehensive models of the dimensions of Web site usability are necessary to further the use of the Web site usability construct in research and for practitioners. Other studies should examine categories of Web sites beyond the B2C Web sites examined in this study. News/information-oriented Web sites (37), B2B sites, and investor-related sites are just a few of the potential categories of Web sites that should be explored in future research.
