A comparison of two methods for scoring an in-basket exercise.

By Strubler, David S.
Publication: Public Personnel Management
Date: Thursday, September 22 2005

The in-basket exercise has been successfully used for decades by a wide variety of organizations for selection and management development in both the public and private sector. (1,2,3,4,5) An in-basket was one of the exercises in AT&T's pioneering Assessment Center. (6) It is now one of the most commonly used situational exercises, (7,8,9) and is often used outside assessment center programs. (10) However, because each candidate's responses to the in-basket items must be evaluated by trained assessors, the cost of using an in-basket exercise may discourage organizations from using it, despite its success in predicting performance in management jobs. (11,12,13) If an easier-to-score in-basket exercise that still retained the situational test format of a traditional in-basket could be developed, organizations might make greater use of this well-established selection and management development tool.

The purpose of this study was to compare two methods for scoring an in-basket. The first method involved a traditional in-basket exercise during which participants wrote down the actions they would take on each item. The second method consisted of a multiple-choice in-basket test based on the same collection of in-basket materials.

The typical in-basket contains a collection of items of varying importance and priority that managers find in their in-baskets, such as phone messages, memos, and documents, and the candidate must indicate what action they would take on each item. (11,14,15,16) Some of the items may be interrelated to add complexity, and there is also generally a time limit, which puts candidates under some time pressure to handle all of the items. It is a simulated work task designed to measure performance on work that managers typically do, so it has high face validity for candidates. (17) The collection of items in the in-basket is usually targeted to a specific job, or it can be made very general, including the kinds of items that any manager might deal with. (18) Trained assessors score the exercise by coming to consensus on ratings on performance dimensions such as prioritization, decision making, delegation, organization, and interpersonal skills, or on some overall measure of performance such as "exercise effectiveness." (19)

A few researchers have experimented with alternative scoring methods primarily designed to make scoring faster and easier to do with large numbers of candidates. Felix M. Lopez, former chairman of the Educational Testing Service's Executive Study Conference, experimented with a 111-item multiple-choice questionnaire for an in-basket developed for the fictional AMA Company, as part of the American Management Association Management Course. (9) Betty Salem, Don Ellis, and Douglas Johnson developed an in-basket for a promotion test for police sergeant (which the Civil Service Commission ruled was legitimate after an official protest over its use), consisting of multiple-choice questions relating to organizational, decision-making, and administrative skills. (2) A. Ralph Hakstian and Karen P. Harlos, in one of a series of studies on alternative in-basket scoring systems conducted at the University of British Columbia, used a multiple-choice test to score one of the eight performance dimensions measured by the in-basket. (20) Gerald A. Kesselman, Felix M. Lopez, and Felix E. Lopez of Lopez Assessment Services used a kind of checklist of possible actions (participants could check more than one) to score an in-basket exercise. (21) Richard C. Joines, president of Management & Personnel Systems, originated an "item-by-item" approach to scoring in-baskets. Each item was scored using a detailed scoring key that was supported by criterion-related validation. Some items were designated as priority items and an extra point was awarded if the candidate completed the item. This approach reduced scoring time to less than 30 minutes per in-basket and increased scoring reliability to the 0.90 range. (18,22) Dennis A. Joiner, a consultant who has specialized in the use of the assessment center method since 1977, described a similar "item approach" for an in-basket developed for the New York Department of Corrections. Subject matter experts specified "must dos," "nice to dos," and "should not dos" for each item, and each item was scored by comparing the responses to the expert judgments. None of these studies reported tests for adverse impact.

To test whether equivalent forms of an in-basket could be developed, a collection of in-basket items must be given to two groups. In the traditional in-basket group, participants indicate the action they would take on each item by writing their responses on a worksheet. In the multiple-choice in-basket group, participants indicate the action they would take by choosing one of the multiple-choice responses. If the two forms of the test are equivalent, the mean scores for the two groups should not be significantly different. Additionally, the two forms of the test should not have an adverse impact.

Method

Participants

One hundred sixty-four MBA students (52 female, 99 male and 13 unreported) in two management courses at two medium-sized universities in the Midwest were volunteer participants in a study as a class project. In Sample 1, participants were in three different management courses--two on campus and one extension site of the first university. In Sample 2, participants were in a traditional management course at the second university, or at a distance learning management course at the second university, with students at 40 learning centers. The mean age of the pooled sample was 32.4 years, with 10.6 years of work experience and 3.8 years of management work experience.

Procedures

A subject matter expert was used to assist in the development of the in-basket for the job of controller in a medium-sized construction firm. This expert has been in the construction field for 13 years, and has been a controller for 15 years. Several interviews were conducted to obtain information about the nature of the controller's job, including sample documents and common critical incidents. Based on that information, 25 items were created for the in-basket, and a multiple-choice test was created. The subject matter expert reviewed the final version of the in-basket for realism, and the actions she indicated she would take were used as the answer key. The in-basket scenario, an example item, the multiple-choice question for that item, and the in-basket worksheet are shown in Figure 1.

Participants were randomly assigned to either the traditional or multiple-choice in-basket conditions. Booklets were given to each participant containing the instructions, a scenario to set up the exercise, an organizational chart, 25 in-basket items, and either a worksheet to record their actions (traditional in-basket) or the 25-item multiple-choice test (multiple-choice in-basket). Participants were given the instructions orally, including a reminder of the 30-minute time limit, then told to begin working on the exercise. At the end of the time limit, the booklets were collected and participants were debriefed about the purpose of the study and the possible results.

Scoring the In-Baskets

For the multiple-choice form of the in-basket, participants were given one point for each keyed correct action, according to the subject matter expert. The sum of these points was the dependent variable "action." For the traditional in-basket, the authors separately awarded one point for each keyed correct action written on the in-basket worksheet, and then came to consensus on each one. In both conditions, participants also rated the priority of each item as "vital," "important," or "trivial." Participants were given one point for each keyed correct "priority," again according to the subject matter expert. The sum of these points was the dependent variable "priority."
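The scoring rule for the multiple-choice form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the answer key below is invented (the study's actual key came from the subject matter expert and is not reproduced in the article), only three items are shown instead of 25, and the names `ANSWER_KEY` and `score_in_basket` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical answer key: item number -> (keyed action, keyed priority).
# These values are made up for illustration; they are NOT the study's key.
ANSWER_KEY: Dict[int, Tuple[str, str]] = {
    1: ("c", "vital"),
    2: ("a", "trivial"),
    3: ("b", "important"),
}

Response = Tuple[Optional[str], Optional[str]]


def score_in_basket(responses: Dict[int, Response],
                    key: Dict[int, Tuple[str, str]] = ANSWER_KEY) -> Tuple[int, int]:
    """Return (action, priority): one point for each response matching the key."""
    action = 0
    priority = 0
    for item, (keyed_action, keyed_priority) in key.items():
        # Unanswered items earn no points on either measure.
        chosen_action, chosen_priority = responses.get(item, (None, None))
        if chosen_action == keyed_action:
            action += 1
        if chosen_priority == keyed_priority:
            priority += 1
    return action, priority


# One invented participant: item -> (chosen action, chosen priority).
participant = {
    1: ("c", "important"),
    2: ("a", "trivial"),
    3: ("d", "important"),
}
print(score_in_basket(participant))  # (2, 2)
```

The traditional form cannot be scored this mechanically, because raters must first judge whether each written response matches the keyed action; only the priority ratings use the same fixed choices in both conditions.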

Results

Sample Differences

As a procedural check of the random assignment of subjects to conditions across the multiple data collection sites, Analyses of Variance were calculated to test for mean differences between sites on each of the continuous demographic variables, comparing the traditional and multiple-choice in-basket conditions. There were no significant differences between the participants in the two experimental conditions for age (F (1,147) = 0.64, p = .427), work experience (F (1,147) = 0.50, p = .480), or for management experience (F (1,147) = 0.22, p = .639), indicating that the two experimental groups were equivalent.

Before the two samples were pooled, ANOVAs were calculated on the continuous demographic variables and the two dependent variables to see whether there were significant differences between the two samples. There were significant effects for age (F (1,147) = 20.37, p < .001, means: Sample 1 = 28.6, Sample 2 = 34.1); work experience (F (1,147) = 17.15, p < .001, means: 6.9 and 12.3); and management work experience (F (1,147) = 8.09, p = .005, means: 1.9 and 4.7); but not for action (F (1,147) = 0.75, p = .387, means: 9.6 and 9.1); or for priority (F (1,147) = 2.09, p = .151, means: 12.4 and 13.1). Although the two samples from the two different universities were significantly different on the demographic variables, there were no significant differences between the samples on the dependent measures, indicating that the participants in both samples responded similarly to the in-basket situation. Therefore the samples were pooled and all of the analyses that follow are based on the pooled sample.

Correlations with Demographic Variables and Dependent Variables

Correlations were calculated among age, work experience, and management work experience, as well as the dependent variables "action" and "priority." Not unexpectedly, age significantly correlated with work experience (r = .95) and management experience (r = .71), and work experience correlated with management experience (r = .74). Action and priority did not correlate (r = .11), indicating that the two dependent variables were measuring different aspects of performance in the in-basket. Work experience correlated with priority (r = .24), but not with action (r = -.04). Management experience, in contrast, did not correlate with priority (r = .19).
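For readers who want to see how such a coefficient is obtained, the following is a minimal sketch of the Pearson product-moment correlation. It assumes that Pearson's r is the statistic reported (the article does not name it), and the data are invented, since the raw scores are not published; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys):
        raise ValueError("both variables need one value per participant")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


# Invented example: years of work experience and priority scores.
work_experience = [2, 5, 8, 12, 20]
priority_score = [11, 12, 12, 14, 15]
print(round(pearson_r(work_experience, priority_score), 2))
```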

Tests for Equivalence of the Traditional and Multiple-Choice In-Basket

The traditional in-basket and the multiple-choice in-basket conditions presented participants with the same stimulus materials; the only difference between conditions was whether participants wrote down their actions on a worksheet in the traditional in-basket condition or answered the multiple-choice test items in the multiple-choice in-basket condition. In both conditions, participants chose "vital," "important," or "trivial" as the priority for each item. To test for equivalence of forms (traditional and multiple-choice in-basket), an ANOVA was calculated with forms as the grouping variable and action and priority as the dependent variables. The Multivariate Analysis of Variance test for mean differences on both dependent variables taken together was significant (F (2,161) = 8.17, p < .001), indicating that the two forms were not equivalent. However, the univariate Analyses of Variance showed a significant difference between the two forms only for action (F (1,162) = 16.20, p < .001); the two forms were equivalent for priority (F (1,162) = 0.02, p = .901). The mean score on action was 9.99 for the multiple-choice form and 8.29 for the traditional form, while the mean score for priority was, respectively, 12.89 and 12.96. The results of these analyses are shown in Table 1, and the means and standard deviations for participants in each condition are shown in Table 2.
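The univariate comparison of the two forms can be illustrated with a small one-way ANOVA written in plain Python. This is a sketch under assumptions, not the authors' analysis: the scores are invented (five per group rather than the study's samples), and `one_way_anova` is a hypothetical helper that returns only the F ratio and its degrees of freedom, not a p value.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova(*groups: Sequence[float]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Invented action scores for the two forms (not the study's data).
traditional = [8, 7, 9, 10, 6]
multiple_choice = [10, 11, 9, 12, 8]

f_ratio, df1, df2 = one_way_anova(traditional, multiple_choice)
print(f"F ({df1},{df2}) = {f_ratio:.2f}")  # F (1,8) = 4.00
```

With two groups, as here, the F ratio is the square of the independent-samples t statistic, so either test leads to the same conclusion about the difference between forms.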

Adverse Impact

To test whether either the traditional or multiple-choice in-basket had an adverse impact on women, an ANOVA was calculated with sex as the grouping variable and action and priority as dependent variables. There was no adverse impact on women for either form. For the multiple-choice in-basket, there was no significant difference between women and men for action (F (1,82) = 0.15, p = .704) or priority (F (1,82) = 1.25, p = .267), nor was there for both action and priority together (F (2,81) = 0.80, p = .452). For the traditional in-basket, there was also no significant difference between women and men for action (F (1,65) = 0.24, p = .628) or priority (F (1,65) = 0.22, p = .644), or for both action and priority together (F (2,64) = 0.23, p = .792).

To test for adverse impact on race minorities, the same tests were calculated for Asians, Blacks, and Hispanics. No adverse impact was found; there were no significant effects of race for Asians, Blacks, or Hispanics for either action or priority, or for action and priority together. The small number of race minority participants in the samples precluded the calculation of adverse impact within each condition; there were only five Asian (three percent), two Hispanic (one percent), and eight Black (five percent) participants. The results of these adverse impact analyses are shown in Table 3.

Discussion

The purpose of this study was to test whether an alternate form of the in-basket could be developed--one that is equivalent to the traditional form of the exercise, faster to score, and without adverse impact. Across two demographically different groups of participants, the multiple-choice and traditional forms of the in-basket were found to be equivalent on one performance measure (the priority participants assigned to each item), but not equivalent on the other (the action participants took on each item). Neither form had an adverse impact on women or race minorities for either performance measure.

A possible explanation for differences between action scores on the two forms is that the subjective scoring of the traditional in-basket leads to systematically lower scores than the objective scoring of the multiple-choice in-basket. Participants may have found it more difficult to describe the action they would take on each item than to select one action from among four alternatives. This explanation is supported by the lack of difference between the two forms on the priorities participants chose for the items. Participants in both conditions were given the same three choices for priority--essentially a multiple-choice type of test.

Another possible explanation for the difference between the traditional and multiple-choice forms of the in-basket on the actions that participants said they would take is that the traditional in-basket test is a different stimulus situation than a multiple-choice test. For the traditional in-basket, participants must interpret the situation and generate a course of action, whereas for the multiple-choice in-basket, participants only need to choose the best course of action from alternatives that have already been generated. The multiple-choice form of the in-basket tends to narrow the search field when participants look for the correct response. Participants may have been "channeled" into one of the four responses available, which cannot happen when an open-ended free response is required, as in the traditional in-basket. In short, while the multiple-choice form is more of a recognition task, the traditional form is more of a recall or creative task. The multiple-choice form of the in-basket may achieve a greater efficiency in terms of time and cost reduction. However, the efficiency is achieved with some loss of the developmental or learning aspect of the traditional in-basket, because participants do not have to generate their own response--they choose their answer from among four alternatives.

The results of the adverse impact test were also encouraging. There was no adverse impact on women, for either action, or for priority, in the multiple-choice or traditional form. There was also no evidence of adverse impact on Asians, Blacks, or Hispanics for either action or priority. Tests with adverse impact are less attractive to employers.

The participants in this study came from two samples from two different universities. There were significant differences between the samples on all of the demographic measures (age, work experience, and management work experience) indicating that the samples were drawn from different populations, but there were no significant differences between the samples on the two measures of performance in the in-basket (action and priority). These results are encouraging for generalizing the results to other samples. Further research might test additional samples or different content areas for the in-basket.

There were a number of limitations with the current study. First, the samples included only a small number of race minority participants. Although the adverse impact tests on women were based on 52 women (34 percent of the sample), the adverse impact tests for race were based on small sub-sample sizes. Future research in this area might profit from oversampling minorities to get a better indication of whether either version of the in-basket has adverse impact on race minorities. Second, the sample consisted of MBA students with some work experience and some management experience--not employees in an actual selection or management development situation where in-baskets are typically used. Third, the scoring of the in-basket was based on the judgments of a single subject matter expert (SME). Although this SME had 15 years of experience as a controller and 13 years in the construction field, the realism of the in-basket for a specific job depends on the expertise of the SME. Future research might use a panel of SMEs to develop the in-basket scenario and scoring key. Fourth, the in-basket presented participants with a situation that was designed to be unfamiliar to most of the participants. This was an experimental control to ensure that current or recent job experience would be unlikely to bias participants' scores, but in-baskets are typically administered to people who have some familiarity with the job, and contain job-specific in-basket materials. Although in-baskets have been used successfully for a wide variety of jobs, the multiple-choice form of the in-basket developed in this study needs to be tested further by developing traditional and multiple-choice forms of in-baskets for different content areas. Finally, this study did not include a measure of actual job performance to allow for validation of the test or to test for the differential validity of the traditional versus the multiple-choice in-basket.

The in-basket has been successfully used for selection and management development for decades, most commonly as part of an assessment center. This study provided partial support for the idea that a multiple-choice form of the traditional in-basket can be developed that is statistically equivalent and that has no adverse impact. The multiple-choice in-basket was found to be equivalent to the traditional in-basket when participants were determining the relative importance of the items in the in-basket, but not when participants were determining what action should be taken on the items. Additionally, there was no adverse impact by race or sex for either the traditional or for the multiple-choice in-basket. Future research might expand the number of performance dimensions measured, to include such management skills as delegation, decision-making, or interpersonal skills.

Notes

(1) Gaugler, B. B., Rosenthal, D. B., Thornton, G. C. III, and Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72(3), 493-511.

(2) Salem, B., Ellis, D., and Johnson, D. (1981). Development and use of an in-basket promotional exam for police sergeant. Review of Public Personnel Administration, 1(2), 23-35.

(3) Sever, J. W., Knippenberg, R. W., and Perfetto, U. J. (1977). Minneconsin: A behavior based oral test. Public Personnel Management, 6(6), 427-436.

(4) Spychalski, A. C., Quinones, M. A., Gaugler, B. B., and Pohley, K. (1997). A survey of assessment center practices in organizations in the United States. Personnel Psychology, 50(1), 71-90.

(5) Strausbaugh, D., and Wagman, B. L. (1977). An assessment center examination to select administrative interns. Public Personnel Management, 6(4), 263-268.

(6) Byham, W. C. (1970, July/August). Assessment centers for spotting future managers. Harvard Business Review, 150-167.

(7) Bender, J. M. (1973). What is "typical" of assessment centers? Personnel, 50(4), 50-57.

(8) Finkle (1976). Managerial assessment centers. In Handbook of Industrial and Organizational Psychology, M. D. Dunnette, Ed. Chicago, IL: Rand McNally.

(9) Lopez, F. M., Jr. (1966). Evaluating executive decision making: The in-basket technique. American Management Association.

(10) Joiner, D. A. (2002). Assessment centers: What's new? Public Personnel Management, 31(2), 179-185.

(11) Cascio, W. F. (2003). Managing human resources: Productivity, quality of work life, profits. New York, NY: Irwin, McGraw-Hill.

(12) Goldstein, H. W., Yusko, K. P., Braverman, E. P., Smith, D. B., and Chung, B. (1998). The role of cognitive ability in the subgroup differences and incremental validity of assessment center exercises. Personnel Psychology, 51(2), 357-374.

(13) Schippmann, J. S., Prien, E. P., and Katz, J. A. (1990). Reliability and validity of in-basket performance measures. Personnel Psychology, 43(4), 837-859.

(14) Frederiksen, N., Saunders, D. R., and Wand, B. (1957). The In-Basket Test. Psychological Monographs: General and Applied, Vol 71(9) Whole No. 438, 1-28.

(15) Joiner, D. A. (1984). Assessment centers in the public sector: A practical approach. Public Personnel Management, 13(4), 435-450.

(16) Thornton, G. C. III, and Byham, W. C. (1982). Assessment centers and managerial performance. New York, NY: Academic Press.

(17) Cooper, B. L., Clasen, P. S., and Butlen, M. C. (1999). Creative performance on an in-basket exercise: Effects of inoculation against extrinsic reward. Journal of Managerial Psychology, 14(1), 39-56.

(18) Joines, R. C. (1991). Innovations in In-Basket Technology: The General Management In-Basket. International Congress on the Assessment Center Method, Toronto, Canada.

(19) Klimoski, R., and Brickner, M. (1987). Why do assessment centers work? The puzzle of assessment center validity. Personnel Psychology, 40(2), 243-260.

(20) Hakstian, A. R., and Harlos, K. P. (1993). Assessment of in-basket performance by quickly-scored methods: Development and psychometric evaluation. International Journal of Selection and Assessment, 1, 135-142.

(21) Kesselman, G. A., Lopez, F. M., and Lopez, F. E. (1982). The development and validation of a self-report scored in-basket test in an assessment center setting. Public Personnel Management Journal, 11(3), 228-238.

(22) Joines, R. C. (1987). The Item-by-Item Scored General Management In-Basket. Paper presented at the International Personnel Management Association Assessment Council Conference, Philadelphia, PA.

Kenneth M. York, Ph.D.

School of Business Administration

Oakland University

Rochester, MI 48309-4401

Phone: (248) 370-3272

Fax: (248) 370-4319

E-mail: york@oakland.edu

David C. Strubler, Ph.D.

Industrial & Manufacturing Engineering and Business Department

Kettering University

Flint, MI 48504-4898

Phone: (810) 762-7979

E-mail: dstruble@kettering.edu

Elaine M. Smith

School of Business Administration

Oakland University

Rochester, MI 48309-4401

Phone: (248) 370-3279

Dr. Kenneth M. York received his Ph.D. in industrial/organizational psychology from Bowling Green State University, Bowling Green, Ohio, and is an associate professor of management in the School of Business Administration, Oakland University in Rochester, Michigan. He specializes in the application of behavioral decision theory, and the creation of experiential learning exercises for development of management skills.

Dr. David C. Strubler received his Ph.D. in organizational communication from Wayne State University, Detroit, Michigan, and is an associate professor of management in the Industrial & Manufacturing Engineering and Business Department, Kettering University in Flint, Michigan. He specializes in quality management, organizational communication, cross-cultural business communication, and business teams.

Elaine M. Smith was a student in the School of Business Administration at Oakland University in Rochester, Michigan, when this study was conducted. She received her bachelor of science degree in business administration in 2001.

Figure 1. In-Basket Scenario, Example Item, Multiple-Choice
Question, and In-Basket Worksheet.

   Scenario

   You are Terry Wilson, Senior Controller and Chief Information
   Officer for Ludwig Construction and Development, Inc., a
   medium-sized construction firm. The company is divided into
   two divisions: heavy construction and development of
   industrial buildings. Terry Wilson holds a middle management
   staff position, and is very autonomous dealing with company
   business. As Senior Controller, Terry handles most of the
   financial aspects of the company and has dotted line authority
   over the two division controllers. As Chief Information Officer,
   you are an advisor/consultant and troubleshooter for the
   company's in-house staff, keeping obstacles out of employees'
   way so they can do their work. It is 8:00am, August 1 and you
   have just returned from a one-week vacation. You have a number of
   items that have collected in your in-basket: phone messages, email
   messages, and other documents. Also, you are expecting a
   conference call to arrive at 8:30am, and need to take care of
   these items before then.

   Telephone Message

   Date: July 25

   To: Terry Wilson

   From: Jeff Ludwig

   Subject: Refinancing deal

   I reviewed the information, and I want to make some changes. We
   need to talk about this right away.

1. Priority of this item (circle one): Vital Important Trivial.
Action you would take:

   a. Call Jeff Ludwig this morning, ask him to list the changes he
   wants to make

   b. Go to Jeff's office this afternoon, find out what changes he
   has in mind

   c. Go to Jeff Ludwig's office before the 9:00am meeting to discuss
   the changes he wants to make

   d. Do nothing

   In-Basket Worksheet

   Item Priority Action You Would Take

   1
   ...
   25
   Priority: Vital, Important, or Trivial

Table 1. Test for Equivalence of Traditional and Multiple Choice
Forms of the In-Basket

Performance
Measure          Multiple R     F-Ratio     Probability      df

Action              .30         16.20 *        .001         1,162
Priority            .01          0.02          .901         1,162
Multivariate        .30          8.17 *        .001         2,161

Note. * p < .05. The multivariate tests were calculated on both the
dependent variables Action and Priority. Forms are Traditional vs.
Multiple Choice In-Basket, Samples are first vs. second university
sample.
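The Multiple R column in Table 1 can be cross-checked against the reported F ratios and degrees of freedom. The sketch below assumes the standard identity for a single effect, R squared = (F x df1) / (F x df1 + df2); for the multivariate row it further assumes that the reported F is the exact transform of Wilks' lambda for two groups, so that the result is the square root of 1 minus lambda. The article does not state how Multiple R was computed, so this is a consistency check rather than the authors' procedure, and `multiple_r` is a hypothetical helper name.

```python
from math import sqrt


def multiple_r(f_ratio: float, df_effect: int, df_error: int) -> float:
    """Recover Multiple R from an F ratio and its degrees of freedom."""
    # Proportion of variance attributable to the effect.
    r_squared = (f_ratio * df_effect) / (f_ratio * df_effect + df_error)
    return sqrt(r_squared)


# F ratios and degrees of freedom as reported in Table 1.
table_1 = [
    ("Action", 16.20, 1, 162),
    ("Priority", 0.02, 1, 162),
    ("Multivariate", 8.17, 2, 161),
]
for label, f_ratio, df1, df2 in table_1:
    print(f"{label}: R = {multiple_r(f_ratio, df1, df2):.2f}")
# Action: R = 0.30
# Priority: R = 0.01
# Multivariate: R = 0.30
```

Under these assumptions, the recomputed values agree with the Multiple R column to two decimal places.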

Table 2. Means and Standard Deviations of Action and Priority for
Traditional and Multiple Choice Forms of the In-Basket

                                                         Significant
                                                         Difference
Performance                                                Between
Measure       Condition                   Mean (StDev)     Forms?

Action        Traditional In-Basket        8.29 (2.4)        Yes
              Multiple Choice In-Basket    9.99 (2.8)

Priority      Traditional In-Basket       12.96 (2.9)        No
              Multiple Choice In-Basket   12.89 (3.7)

Table 3. Adverse Impact for Traditional and Multiple Choice
Forms of In Basket

                          Dependent       Mean (StDev)          Adverse
Group   Condition         Measure      Majority     Minority    Impact?

Sex     Traditional       Action       8.4 (2.4)    8.1 (2.7)   No
                          Priority    13.0 (2.6)   13.4 (3.1)   No
        Multiple Choice   Action      10.3 (2.9)   10.0 (2.7)   No
                          Priority    12.5 (4.2)   13.5 (3.1)   No

Race    Asian             Action       9.4 (2.8)   10.6 (2.5)   No
                          Priority    13.1 (3.4)   11.6 (6.6)   No
        Black             Action       9.4 (2.8)    8.5 (3.7)   No
                          Priority    13.1 (3.4)   12.5 (2.7)   No
        Hispanic          Action       9.4 (2.8)    9.0 (4.2)   No
                          Priority    13.1 (3.4)   11.5 (0.7)   No
