Comparing survey and diary measures of Internet and traditional media use.

By Eastin, Matthew S.
Publication: Communication Reports
Date: Friday, April 1 2005

Considerable debate exists over the accuracy of self-reported media use measures. This report compares two methodologies for studying Internet and traditional media use: online surveys and diaries. A study was conducted with undergraduate students from two universities. Participants were asked to (a) complete a survey and (b) keep a diary over the course of one day. Both instruments assessed how frequently they engaged in various media use activities, including television viewing, radio listening, Web surfing, email sending and receiving, music listening, and video game playing. Results indicate that survey estimates of media use are consistently higher than diary estimates, but the two methods are significantly correlated with each other within a given medium. Given uncertainty about which method is more accurate, a third method of data collection, electronic tracking, is described.

Keywords: Internet; World Wide Web (WWW); Media Use; Research Methods

**********

The research debate over the accuracy of self-reported media use is not new (Coffey & Stipp, 1997; Reagan, 1996; Sheehan & Hoy, 1999; Yun & Trumbo, 2000; Zillmann & Bryant, 1985). Recent technological advances and increasing Internet penetration have stimulated new forms of data collection and new methodological research questions. For example, the Internet has recently enabled some survey research to move from expensive phone or direct mail methods to faster, less expensive email or Web-based surveys. Given the variety of options now available, questions must be raised about the stability of various methodological approaches to studying media use, i.e., how consistent and accurate they are. These types of questions can help improve the validity of media use measurement.

To date, the vast majority of research assessing the new technique of online data collection has focused on response rates and generalizability. Little empirical research exists from which to understand response differences between retrospective self-report Web-based data and such other measures as diaries or electronic measurement. The present study compares two of these methods--Web self-reports and diaries--and discusses the third, electronic measurement, as a potentially superior alternative to the more traditional techniques.

Web-based Surveys

Web-based surveys can be used efficiently to collect demographic, behavioral, and attitudinal data, among others. The notable benefits of Web-based surveys include design flexibility (Schillewaert, Langerak & Duhamel, 1998), large samples (Kehoe & Pitkow, 1996), efficient data collection from time and cost perspectives (Eastin, 2002; Yun & Trumbo, 2000), increased anonymity (Kiesler & Sproull, 1986), minimized interviewer error and bias (McCullough, 1998), and relative novelty. Limitations also exist. Although the generalizability of online samples is improving, it remains problematic (Yun & Trumbo, 2000). Further, multiple responses and ethical considerations (Greenberg, Eastin & Garramone, 2003) present problems for online data collection. Although these issues are relevant in the overall assessment of Web-based data collection, they do not address how the response patterns in Web-based surveys may differ from alternative forms of data collection.

Estimating Media Use

Generally speaking, how do people estimate behavior frequencies? Cognitive scientists such as Sudman, Bradburn, and Schwarz (1996) posit that people expend cognitive effort only to the extent required to form a minimally satisfying response. In other words, when asked to complete a survey, people are 'cognitive misers' when it comes to estimating behavior frequencies. As cognitive misers, people use such response strategies as estimating an ongoing rate of behavior, then approximating this to the time period specified by the question. This method of reasoning typically leads to overestimates of behavior in surveys. Diary entries, on the other hand, have been considered by some to be a more accurate representation of use (Anderson, Field, Collins, Lord & Nathan, 1985). However, completing the diary entries places a heavier burden on the user throughout the data collection period: diarists must remember to make an entry each time they use a medium. In addition, a diary may require users to report engaging in sensitive behaviors such as viewing sexual content or visiting pornographic Web sites, if that is the focus of the research.

While there is a tendency for respondents to over-report their use of traditional media, research has generally found a moderately high correlation between retrospective self-reports and other benchmark measures. Van der Voort and Voojis (1990) found a correlation of .54 between diary data and self-reported television viewing. Further, this relationship increased to .77 for older children with higher education and family income. For the Internet, Yun and Trumbo (2000) compared traditional-mail survey responses on various types of email use to email and Web-based responses. Results indicated that both Web and email survey formats produced significantly higher reported levels of email sent and received, social email use, and task email use. Finally, and most relevant to this research, LaRose, Eastin, and Gregg (2001) reported for general Internet use a significant correlation (r = .65) between recall and diary data. Since Internet use occurs in smaller 'chunks' than traditional media use, which typically involves longer, more singular usage activities such as watching half-hour or hour-long programs (Heeter & Greenberg, 1988), measurement issues are particularly important in the information age.

This study includes both Internet use measures and traditional media use measures in a single effort to map out similarities and differences between retrospective and diary reports of these behaviors.

Methods

Participants

In the spring and summer of 2002, undergraduate students (N = 456) from two large Midwestern US universities who were enrolled in introductory communication and telecommunication classes were recruited to participate in this study, for which they received extra credit. Approximately 7% of these students (31/456) completed only one phase of the study and were therefore not included in subsequent analyses. The vast majority (96%) of the remaining 425 students were between the ages of 18 and 25 (M = 21), and 50% were female.

Procedures

The data collection process was divided into two phases, each using a different method. The first phase required participants to complete an online survey about their mass media use. In the second phase, participants kept a diary of their media use for one day.

For the first phase, a survey instrument was created in HTML and placed on the Web. When voluntary participants accessed the survey site, the survey provided instructions about how to complete and submit it. Students could complete the online survey from Tuesday through Friday of the first data collection week.

Immediately after submitting their survey, participants received a screen online with information about the second phase of the study, the media use diary. Researchers visited classes the following week and handed out the diaries. The four-page diaries, printed on heavy cardstock, each had a day identified on them ranging from Sunday to Saturday, with all days represented equally. Students were instructed to fill out their diary for the single designated day, yielding a composite week of media use from the diary data. The first page of the diary contained instructions about how to fill it out, and the next three pages asked about mass media and Internet use. Students returned completed diaries to their class.

Survey Variables

The survey instrument contained measures designed to tap how frequently students used different media. All time-spent items had a scale ranging either from '0' to 'more than 3.5 hours' or '0' to 'more than 5.5 hours'. The scale levels were displayed in half-hour increments.

Television use, radio use, and Web use each were measured with five items. Respondents indicated how many hours they used each medium yesterday in the morning, afternoon, and evening, and on Saturday and Sunday.

Email use was measured by asking for the number of emails sent and received. Six separate items asked about number of emails received from (and sent to) friends, relatives, and for school or work. These measures used scales with increments of '0,' '1-9,' '10-19,' '20-29,' and '30 or more' messages. Therefore, a score of '1' on the email sent variable, for example, would indicate that the respondent sent 1-9 emails.
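The email coding described above can be sketched in a few lines of Python. This is only an illustration: the function name email_scale_score is hypothetical, and the sketch assumes the five categories were scored 0 through 4 in the order listed, which is consistent with a score of '1' meaning 1-9 emails.

```python
def email_scale_score(count: int) -> int:
    """Map a raw number of emails onto the survey's five-level scale.

    Assumed coding (0-4), inferred from the text:
    0 -> '0', 1 -> '1-9', 2 -> '10-19', 3 -> '20-29', 4 -> '30 or more'.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count >= 30:
        return 4  # top category: '30 or more'
    return count // 10 + 1  # 1-9 -> 1, 10-19 -> 2, 20-29 -> 3
```

Under this coding, a respondent who sent 12 emails would receive a score of 2, so the email means in Table 1 are scale scores rather than raw message counts.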

Music listening was assessed with two items, one asking about hours spent listening to music on CDs or tapes yesterday and the other asking about hours spent listening to music in MP3 format yesterday.

Six items measured video game use. Four asked how many hours were spent yesterday playing video games on the Web, on a computer (but not the Web), on a console system, and on a handheld system. The last two asked about overall video game use on Saturday and Sunday.

A final section of the survey asked for age, sex, whether or not they had a job, class level, college, living location (on or off campus), and grade point average.

Diary Variables

The diary instrument required participants to keep track of and log their media use activities throughout the course of one specified day. That day was randomly assigned.
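One way to assign diary days at random while keeping all seven days represented about equally is to shuffle the participant list and then deal out the days in rotation. The sketch below is an assumption about how such an assignment could be done, not a description of the procedure actually used; the function name, the seed, and the use of 425 participant IDs are illustrative.

```python
import random
from collections import Counter

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

def assign_diary_days(participant_ids, seed=0):
    """Randomly assign one diary day per participant, balanced across days.

    Shuffling first makes the assignment random; cycling through DAYS
    afterwards keeps the day counts within one of each other.
    """
    rng = random.Random(seed)  # fixed seed only so the example is repeatable
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: DAYS[i % len(DAYS)] for i, pid in enumerate(ids)}

# Hypothetical participant IDs 0..424.
assignment = assign_diary_days(range(425))
counts = Counter(assignment.values())
```

With 425 participants, five days receive 61 diaries and two days receive 60, which approximates the composite week described above.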

The general media use page of the diary had 10 rows, each corresponding to a media use session. Columns enabled the participants to write down the start and end times for each media use session, and to identify each medium accessed. Options were provided for TV, radio, music player and off-line video game by oneself or with others. Diarists could check as many boxes as applied in a given session, to account for multiple simultaneous uses (i.e., media multi-tasking).

The first Internet use page was formatted exactly like the general media use page, except that the 'nature of use' columns contained different headings. They were (1) watched a movie or video clip, (2) listened to music, (3) work-related information gathering, (4) entertainment-related information gathering, (5) online video game by self, and (6) online video game with others. The second Internet use page asked diary keepers to write down their 'number of contacts' for overall email sent, email sent to family, email sent to friends, email sent for work or school, and email sent for other reasons. The same options were offered for email received. Diary keepers also were asked to write down the number of chat rooms they entered, the number of discussion groups they contacted, and the number of different people they instant messaged during the course of that one day.

Constructed Measures

Several composite measures were created. From the survey, total television time yesterday, total radio time yesterday, and total Web time yesterday were constructed by summing the three yesterday day-part scores for these activities. Total music listening time yesterday is the sum of listening to music on tape/CDs and computer music listening. Total video game time yesterday summed the computer, console, and hand-held video game use items.
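The survey composites described above are simple sums of item scores. A minimal sketch follows; the item names (tv_morning, game_console, and so on) are hypothetical labels for the survey items, not names taken from the instrument.

```python
# Which survey items feed each composite, following the text above.
# All item names are hypothetical labels.
COMPOSITES = {
    "tv_total": ["tv_morning", "tv_afternoon", "tv_evening"],
    "radio_total": ["radio_morning", "radio_afternoon", "radio_evening"],
    "web_total": ["web_morning", "web_afternoon", "web_evening"],
    "music_total": ["music_cd_tape", "music_computer"],
    "video_game_total": ["game_computer", "game_console", "game_handheld"],
}

def build_composites(response):
    """Sum the 'yesterday' item scores (in hours) into composite measures."""
    return {
        name: sum(response[item] for item in items)
        for name, items in COMPOSITES.items()
    }

# Made-up responses for one participant, in hours.
example = {
    "tv_morning": 0.5, "tv_afternoon": 1.0, "tv_evening": 1.5,
    "radio_morning": 0.5, "radio_afternoon": 0.0, "radio_evening": 0.0,
    "web_morning": 1.0, "web_afternoon": 1.0, "web_evening": 2.0,
    "music_cd_tape": 1.0, "music_computer": 0.5,
    "game_computer": 0.0, "game_console": 0.5, "game_handheld": 0.0,
}
totals = build_composites(example)
```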

With the diary data, the first transformation was to calculate the number of minutes spent with general media and the Internet. Trained coders used the start and end time information to determine the number of minutes in each use session, and then all use sessions were added to create total media use and total Internet use variables. The minute data for each session were also used to determine how many minutes were spent with each specific type of medium, such as TV time, radio time, etc. Once all were converted to minutes, total video game use (offline) was formed by summing the two general media video game items, and total online video game use was formed by summing the two Internet video game items.
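The coding step described above, turning start and end times into minutes per session and per medium, might look like the following sketch. The 'HH:MM' time format, the medium labels, and the handling of a session that runs past midnight are all assumptions made for illustration; the study used trained human coders.

```python
from datetime import datetime, timedelta

def session_minutes(start: str, end: str) -> int:
    """Minutes between 'HH:MM' start and end times.

    Assumption: an end time earlier than the start time means the
    session crossed midnight.
    """
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 < t0:
        t1 += timedelta(days=1)
    return int((t1 - t0).total_seconds() // 60)

def minutes_by_medium(sessions):
    """Credit each session's minutes to every medium checked for it."""
    totals = {}
    for start, end, media in sessions:
        minutes = session_minutes(start, end)
        for medium in media:  # several boxes may be checked (multi-tasking)
            totals[medium] = totals.get(medium, 0) + minutes
    return totals

# Made-up diary rows: (start, end, media checked).
sessions = [
    ("08:00", "08:45", ["radio"]),
    ("19:30", "21:00", ["tv", "music_player"]),
    ("23:30", "00:15", ["tv"]),
]
per_medium = minutes_by_medium(sessions)
total_media_minutes = sum(session_minutes(s, e) for s, e, _ in sessions)
```

Note that a multi-tasking session counts once toward total media use but once per medium in the medium-specific totals, so the medium totals can add up to more than the overall total.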

All time figures were transformed into both hours and minutes to make appropriate comparisons. Since the survey items had 'capped' maximums, e.g., 'more than 5.5 hours', the diary hour figures also were capped so that their unlimited ranges did not inflate variances or means when the survey and diary results were compared. This makes for a slightly more conservative estimate of Internet and media use.
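The capping step can be expressed as a one-line transformation. In this sketch the numeric value given to the top survey category is an assumption: the text does not state what number represented 'more than 5.5 hours', so the cap is left as a parameter and 5.5 is used only as a placeholder.

```python
def capped_hours(minutes: float, cap_hours: float) -> float:
    """Convert diary minutes to hours and cap at the survey scale maximum.

    cap_hours is whatever value the survey's top category was scored as
    (an assumption here; the source does not give it).
    """
    return min(minutes / 60.0, cap_hours)

# A diarist who logged 8 hours is capped; one who logged 90 minutes is not.
heavy_user = capped_hours(480, cap_hours=5.5)
light_user = capped_hours(90, cap_hours=5.5)
```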

Results

The primary comparisons between the survey and diary methods are in Table 1. Seven different media measures can be compared in terms of units of time spent 'yesterday.' In addition, two email measures can be assessed by both methods.

There are two key findings. First, self-estimates of Internet and traditional media use are consistently higher when reported on a survey than when reported in a diary, except for radio use. Second, despite these absolute differences in projected mean level of activity, the two methods of collecting this information are consistently correlated with each other for all seven correlations (p < .001).

In Table 1, the hourly estimates of time spent on the Internet, with television, with both online and off-line music, and with off-line video games are consistently and significantly higher in the survey results. For example, the average time estimate on the Internet is three-quarters of an hour greater on the surveys; this is the largest discrepancy found between the two methods. Television is 30 minutes longer and music is listened to for an additional 30 minutes. Table 1 shows the same pattern for email received from others, but not for email sent.

Between the survey and diary methods, correlations range from .20 for listening to music off-line to .58 for email sent. Internet estimates are correlated .39 and television time estimates are correlated .35. These correlations come from two methods of data gathering that were not completed concurrently, yet all are statistically significant. On the other hand, if we were applying reliability estimate standards, most would fall short of that criterion.
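For readers who want to reproduce this kind of comparison, the sketch below computes a Pearson correlation and a paired-samples t statistic for two sets of scores from the same respondents. It assumes the t values in Table 1 come from paired t-tests, which the degrees of freedom (424 for 425 respondents) suggest but the text does not state. The data are invented.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1

# Invented hours for four respondents: survey estimate vs. diary total.
survey = [3.0, 4.0, 5.0, 6.0]
diary = [2.0, 2.0, 4.0, 4.0]
r = pearson_r(survey, diary)
t, df = paired_t(survey, diary)
```

The two statistics answer different questions: t tests whether the survey mean differs from the diary mean, while r indicates whether respondents who report more on one instrument also report more on the other.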

On an absolute media use basis, the survey results would have us believe that these respondents gave 9.3 hours 'yesterday' to the Internet, television, radio, and off-line music and video games. Even the diary results report 7.5 hours of use for these same activities. Neither result would be plausible save for the expectation that multi-tasking across media is a common activity--that listening to music or having the radio on while surfing the Internet is not unusual. There also may be categories of use that are not mutually exclusive, e.g., time on the Internet for information and/or entertainment. But the magnitude of the time estimates accentuates the need to determine whether any particular media use is a primary or secondary activity, as well as which estimate is more accurate.
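The 9.3-hour and 7.5-hour totals can be checked against Table 1. Summing the means for the Internet, television, radio, CD/tape music, and off-line video games reproduces both figures; that these five rows are the ones intended is an inference from the sentence above.

```python
# (survey mean, diary mean) in hours, copied from Table 1.
TABLE_1_MEANS = {
    "Internet": (3.26, 2.39),
    "Television": (2.96, 2.49),
    "Radio": (1.33, 1.21),
    "Music (CD/tape)": (1.30, 1.09),
    "Video games (off-line)": (0.48, 0.30),
}

survey_total = sum(s for s, _ in TABLE_1_MEANS.values())
diary_total = sum(d for _, d in TABLE_1_MEANS.values())

print(round(survey_total, 1), round(diary_total, 1))  # 9.3 7.5
```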

Discussion

A nagging problem in the measurement of media use remains that of identifying a meaningful common scale unit. The choice of time, e.g., minutes or hours, is more a convenience than a psychologically or semantically meaningful decision. When you watch TV, you watch programs, not minutes. When you go on the Internet, you are targeting Web sites, games or a friend, not minutes. In addition, equating 15 minutes of reading with 15 minutes of TV viewing ignores the differences in complexity of these two behaviors. Nonetheless, until time can somehow be refined in measurement schemes, it remains the most common index of use.

These results raise the question of which method, survey or diary, is more accurate. The two methods are different and, in this study, the survey provides consistently higher estimates across most major media use items, and especially those media that have the highest use estimates within the population segment studied. In other words, the more the medium appears to be used, the larger the discrepancy between the survey and diary results. In addition, the two methods are positively and significantly, albeit modestly, correlated. We argue below that this is in part a function of study methods.

Having earlier acknowledged the shortcomings of Web surveys in general, we can indicate that the diary suffers from problems of memory (when was it completed?), authorship (who really filled it out?), and mortality (for how long will a respondent be diligent?), among other issues. Claims of greater accuracy can be offset by these issues.

It also would be advantageous to incorporate more intricate data collection techniques such as Experience Sampling Methods (ESM) (Kubey, Larson & Csikszentmihalyi, 1996). Most applicable within the current framework is the external dimension of ESM, which focuses on time, companionship, and activity. By using random time sampling intervals to track media use, some of the time constraints inherent in the current study methods could be lessened.
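Random time sampling of the kind ESM relies on can be illustrated with a short sketch that draws signal times at random within a waking-hours window. The window (8:00 to 22:00), the number of signals, and the function name are hypothetical; real ESM designs often stratify signals across blocks of the day, which this simple version does not do.

```python
import random

def random_signal_times(n_signals, start_hour=8, end_hour=22, seed=None):
    """Draw n distinct random minutes in [start_hour, end_hour) as 'HH:MM'."""
    rng = random.Random(seed)
    window = range(start_hour * 60, end_hour * 60)
    minutes = sorted(rng.sample(window, n_signals))
    return [f"{m // 60:02d}:{m % 60:02d}" for m in minutes]

# Seven signals for one respondent-day; the seed only makes the example repeatable.
signals = random_signal_times(7, seed=1)
```

At each signal, the respondent would record what medium, if any, was in use at that moment, which removes the need to recall or log start and end times.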

Additionally, the need for the proposed third leg of measurement, e-tracking assessment, seems crucial. E-tracking is not expected to be a panacea, but it is a new approach worth exploring. Commercially available e-tracking systems for computers (e.g., 'spyware') offer potentially accurate information on single-user machines, and passwords can protect shared computers from being used by people who are not the subject of research. The data provided can be mined for a rich array of usage information, including time spent engaging in activities and usage patterns. As Internet penetration continues to increase in homes and the workplace, the desire for more accurate estimates of usage time (and usage content) online is likely to intensify. In addition, more homes are switching to dedicated broadband connections, e.g., cable and DSL modems (Pew Internet for Life, 2002). For those with dedicated connections, tracking systems must be able to separate use time from idle connection time and download time (e.g., large movie files). The situation for the Internet has an analogy in the Nielsen electronic TV ratings system. It can measure potential exposure (the set is on, but no one is in the room), but attention is not likely to be assessed. So, e-tracking has limitations, but it also presents researchers with a new avenue of inquiry into computer-based media use, which seems important in light of the deficiencies identified in this study.
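One simple way a tracking system could separate use time from idle connection time is to treat any long gap between logged user events as idle. The sketch below assumes a hypothetical log of event times (in minutes since the connection opened) and an arbitrary 5-minute idle threshold; it does not describe any particular commercial product.

```python
def active_minutes(event_minutes, idle_threshold=5):
    """Estimate active use time from a log of user-event timestamps.

    Gaps between consecutive events count as use only when they are no
    longer than idle_threshold minutes; longer gaps are treated as idle.
    """
    events = sorted(event_minutes)
    active = 0
    for previous, current in zip(events, events[1:]):
        gap = current - previous
        if gap <= idle_threshold:
            active += gap
    return active

# Made-up log: a burst of activity, a 26-minute idle gap, then another burst.
log = [0, 2, 4, 30, 31, 33]
use_time = active_minutes(log)       # 7 minutes of estimated use
connection_time = max(log) - min(log)  # 33 minutes connected
```

The choice of threshold matters: too short and reading a long page counts as idle, too long and an unattended connection counts as use, which parallels the exposure-versus-attention problem noted for TV ratings.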

The most problematic issue for this study is the offset comparison between the diary and survey completions. All responded online about a weekday, and then completed a diary day during the next week. The assumption was that the behaviors examined would be relatively similar from one weekday to another. This remains an untested assumption and may have contributed some measurement error. On the other hand, if the diary and survey were done for the same day by the same respondents, there is reason to believe that the second set of responses (regardless of which method was implemented first) would have been tainted by the first set, yielding higher reliability estimates than warranted. This again is empirically testable by implementing both concurrent and offset time entries with these methods and comparing the results.

This study contributes to our understanding of how alternative methods of assessing media use can yield different results and poses still another method to examine. The triangulation of media usage methods--survey, diary and e-tracking--has yet to be accomplished successfully. This remains a challenge to the research community.

References

Anderson, D., Field, D., Collins, P. A., Lord, E. P., & Nathan, J. (1985). Estimates of young children's time with television: A methodological comparison of parent reports with time-elapse video home observation. Child Development, 56, 1345-1357.

Coffey, S., & Stipp, H. (1997). The interaction between computer and television usage. Journal of Advertising Research, 37 (2), 61-67.

Eastin, M. S. (2002). Diffusion of E-commerce: An analysis of the adoption of four e-commerce activities. Telematics and Informatics, 19 (3), 251-267.

Greenberg, B. S., Eastin, M. S., & Garramone, G. (2003). Ethical issues in conducting mass communication research. In G. Stempel, D. Weaver, & G. C. Wilhoit (Eds.), Mass communication research and theory. Boston: Allyn & Bacon.

Heeter, C. & Greenberg, B. S. (1988). Cableviewing. Norwood, NJ: Ablex.

Kehoe, C., & Pitkow, J. (1996). Surveying the territory: GVU's five WWW user surveys. The World Wide Web Journal, 1 (3). Retrieved April 20, 2004, from http://w3j.com/3/s3 kehoe.html

Kiesler, S., & Sproull, L. S. (1986). Response effects in the electronic survey. Public Opinion Quarterly, 50, 402-413.

Kubey, R., Larson, R., & Csikszentmihalyi, M. (1996). Experience sampling method applications to communication research. Journal of Communication, 46 (2), 99-120.

LaRose, R., Eastin, M. S., & Gregg, J. (2001). Reformulating the Internet paradox: Social cognitive explanations of internet use and depression. Journal of Online Behavior, 1 (2). Retrieved July 5, 2004, from http://www.behavior.net/JOB/vln2/paradox.html

McCullough, D. (1998). Web-based market research, the dawning of a new era. Direct Marketing, 61 (8), 36-39.

Pew Internet for Life Project. (2002). The Broadband difference: How online Americans' behavior changes with high-speed Internet connections at home. Retrieved October 10, 2002, from http://www.pewinternet.org

Reagan, J. (1996). The 'repertoire' of information sources. Journal of Broadcasting & Electronic Media, 40 (1), 112-119.

Schillewaert, N., Langerak, F., & Duhamel, T. (1998). Non probability sampling for www surveys: A comparison of methods. Journal of the Market Research Society, 4 (40), 307-313.

Sheehan, K. & Hoy, M. (1999). Using email to survey Internet users in the United States: Methodology and assessment. Journal of Computer Mediated Communication, 4 (3). Retrieved August 17, 2004, from http://www.ascusc.org/jcmc/vol4/issue3/sheehan.html

Sudman, S., Bradburn, N. M., & Schwarz, N. (1996). Thinking about answers: The application of cognitive processes to survey methodology. San Francisco, CA: Jossey-Bass.

van der Voort, T. H. A., & Voojis, M. W. (1990). Validity of children's direct estimate of time spent viewing television. Journal of Broadcasting & Electronic Media, 34 (1), 93-99.

Yun, G., & Trumbo, C. (2000). Comparative response to survey executed by post, e-mail & web form. Journal of Computer Mediated Communication, 6 (1). Retrieved August 17, 2004, from http://www.ascusc.org/jcmc/vol6/issue1/yun.html

Zillmann, D., & Bryant, J. (Eds.). (1985). Selective exposure to communication. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Bradley S. Greenberg is a Professor of Communication and Telecommunication, Information Studies, and Media at Michigan State University. Matthew S. Eastin is an Assistant Professor of Communication at the Ohio State University. Paul Skalski is an Assistant Professor in the Department of Communication at University of Minnesota-Duluth. Len Cooper is a Doctoral Student in the School of Communication at The Ohio State University. Mark Levy is a Professor of Telecommunication, Information Studies, and Media at Michigan State University. Ken Lachlan is an Assistant Professor in the Department of Communication at Boston College. Correspondence to: Paul Skalski, UMD Communication, 469 ABAH, 1121 University Dr., Duluth, MN 55812, USA; Email: pskalski@d.umn.edu. A version of this paper was presented at an International conference on Mass Media and Communications in the e-society of the 21st Century: Access and Participation, at Moscow State University, Moscow Russia, 17-20 October 2002.

Table 1 Internet and Traditional Media Use Yesterday as Assessed by Survey and Diary Methods

                               Online survey   Diary      t      d.f.   r **

Internet hours                     3.26        2.39     6.26 *   424    .389
Television hours                   2.96        2.49     3.26 *   237    .353
Radio hours                        1.33        1.21     1.06     424    .284
Music (CD/tape) hours              1.30        1.09     2.07 *   424    .199
Music (Computer) hours             1.28        1.03     2.72 *   424    .392
Video Games (off-line) hours        .48         .30     2.71 *   424    .207
Video Games (online) hours          .23         .28     -.91     424    .289
Number of emails received          1.59        1.33     5.41 *   424    .468
Number of emails sent               .79         .76     1.34     424    .579

* p < .05. ** All correlations are significant at p < .001.
