With few exceptions, research in the commodity advertising literature is based on the econometric modeling of aggregate time series data with the assumption of the representative agent (see Ferrero et al. for an annotated bibliography of generic advertising research). In such studies, it is
There is a vast literature, however, indicating that this type of modeling may provide misleading conclusions since the approach ignores the heterogeneity of individual behavior. Examples of studies on problems of cross-sectional data aggregation in economics include Cooper and Nakanishi, Judge et al., Kirman, Maddala, Sonnenschein, Stoker, and Theil. Researchers in marketing have also contributed to this literature; for example, Christen et al.; Krishnamurthi, Raj, and Selvam; and Tellis and Weiss illustrate the potential problem of aggregation bias in estimating various marketing models.
The aim of this study is to investigate whether the use of aggregate cross-sectional data significantly biases the estimated consumers' response to advertising programs. The study addresses the question of consistency between household versus macro-level advertising response models. In his seminal work, Theil shows that the estimates obtained from an aggregate linear model are biased if the underlying parameters are heterogeneous across the units being aggregated. Our study extends Theil's framework to a double-logarithm functional form and examines the data aggregation problem in the context of advertising evaluation. The double-log form has been one of the most frequently used functional forms for advertising models mainly because (1) the sales response to advertising exhibits diminishing returns; and (2) it is easy to interpret estimates of the model because estimates themselves are elasticities. Unlike previous studies in marketing literature such as Tellis and Weiss and Christen et al., ours accounts for the parameter heterogeneity so that we can investigate potential data aggregation bias caused by both heterogeneous responses from individual agents and the nonlinear model. We decompose the aggregation bias into three parts: bias from (1) corresponding parameters; (2) non-corresponding parameters; and (3) using linearly aggregated data for a non-linear (here double-log) model.
In this paper, we first derive a statistical procedure to quantify the degree of bias and the conditions under which aggregation bias is likely to occur. Then, to illustrate the magnitude of the bias, we apply our conceptual derivations to the evaluation of generic fluid milk advertising using a household panel of U.S. milk consumers.
Aggregation Bias
Consider N households, observed over T time periods (i = 1 ... N; t = 1 ... T) whose purchases ([y.sub.it]) of a particular commodity can be characterized by
(1) log [y.sub.it] = [summation over (J/j=0)] [[beta].sub.ji] log [x.sub.jit] + [[epsilon].sub.it]
where [x.sub.jit] is a J + 1 vector of exogenous explanatory variables with log [x.sub.0it] = 1; [beta] is a J + 1 vector of corresponding coefficients; and [[epsilon].sub.it] is the error term. It is assumed that the x's and [beta]'s are non-stochastic. If we could consistently aggregate the data over N households in each time period, the corresponding macro equation could be estimated as
(2) 1/N [summation over (N/i=1)] log [y.sub.it] = 1/N [summation over (N/i=1)] [summation over (J/j=0)] [[beta].sub.ji] log [x.sub.jit] + 1/N [summation over (N/i=1)] [[epsilon].sub.it].
However, aggregate variables of (2) are rarely available in practice and are usually replaced by a simple average of each variable. Specifically, the macro-equivalent to the micro equation (1) is typically specified as
(3) log [y.sub.t] = [summation over (J/j=0)] [[beta].sub.j] log [x.sub.jt] + [[epsilon].sub.t]
where [y.sub.t] = [SIGMA] [y.sub.it]/N, [x.sub.jt] = [SIGMA] [x.sub.jit]/N, and [[epsilon].sub.t] is the error term of the macro model. Our task is to investigate the bias when (3) is estimated instead of (2).
To examine this bias, we first rewrite (2) using the relationship between arithmetic (x) and geometric ([x.sub.g]) means, and the definition of covariance. The relationship between the two types of means is derived from a power series of (log [x.sub.i] - log [x.sub.g]) as (1)
(4) x = [x.sub.g] {1 + 1/2! 1/N [summation over (N/i=1)] [(log [x.sub.i] - log [x.sub.g]).sup.2] + 1/3! 1/N [summation over (N/i=1)] [(log [x.sub.i] - log [x.sub.g]).sup.3] + ...}.
Assuming the log-normal distribution for [x.sub.i] yields (Cramer)
(5) x/[x.sub.g] [approximately equal to] exp {1/2N [summation over (N/i=1)] [(log [x.sub.i] - log [x.sub.g]).sup.2]}.
Equation (5) implies that the ratio of the arithmetic mean to the geometric mean is expressed in the second-order approximation. Equation (5) also provides important information for comparing (2) and (3) because the left-hand side variable of (2) is the logarithm of the geometric mean of [y.sub.i].
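The approximation in (5) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the simulated log-normal draws, the seed, and the helper name mean_ratio_check are illustrative assumptions.

```python
import math
import random


def mean_ratio_check(values):
    """Return (exact, approx) for the arithmetic-to-geometric mean ratio.

    exact  : arithmetic mean divided by geometric mean
    approx : exp of half the mean squared log-deviation, as in equation (5)
    """
    n = len(values)
    logs = [math.log(v) for v in values]
    log_geo = sum(logs) / n                       # log of the geometric mean
    arith = sum(values) / n                       # arithmetic mean
    s2 = sum((a - log_geo) ** 2 for a in logs) / n
    return arith / math.exp(log_geo), math.exp(0.5 * s2)


# Hypothetical data: log-normal draws with a fixed seed (an assumption).
rng = random.Random(0)
x = [rng.lognormvariate(0.0, 0.3) for _ in range(5000)]
exact, approx = mean_ratio_check(x)
print(exact, approx)
```

When every value is identical, both quantities equal one, which is the case in which the arithmetic and geometric means coincide.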
Rewriting (2) using (5) leads to
(6) log [y.sub.t] - 1/2N [summation over (N/i=1)] [(log [y.sub.it] - log [y.sub.gt]).sup.2] = 1/N [summation over (N/i=1)] [summation over (J/j=0)] [[beta].sub.ji] log [x.sub.jit] + [[epsilon].sub.t].
Adding and subtracting
[summation over (J/j=0)] 1/N [summation over (N/i=1)] [[beta].sub.ji] 1/N [summation over (N/i=1)] log [x.sub.jit]
to the right-hand side of (6) and rearranging, we obtain
(7) log [y.sub.t] = [summation over (J/j=0)] [[beta].sub.j] log [x.sub.jt] + [summation over (J/j=0)] [cov.sub.i]([[beta].sub.ji], log [x.sub.jit]) + 1/2N [summation over (N/i=1)] [(log [y.sub.it] - log [y.sub.gt]).sup.2] - 1/2N [summation over (J/j=0)] [[beta].sub.j] [summation over (N/i=1)] [(log [x.sub.jit] - log [x.sub.jgt]).sup.2] + [[epsilon].sub.t]
where [[beta].sub.j] = [[SIGMA].sub.i] [[beta].sub.ji]/N and
(8) [cov.sub.i]([[beta].sub.ji], log [x.sub.jit]) = 1/N [summation over (N/i=1)] [[beta].sub.ji] log [x.sub.jit] - 1/N [summation over (N/i=1)] [[beta].sub.ji] 1/N [summation over (N/i=1)] log [x.sub.jit].
Equation (7) clearly demonstrates that if we estimate the aggregate model (3) instead of the true macro model (2), we would obtain biased parameter estimates because of the specification error. Equation (7) also indicates that there will be no aggregation bias when the covariance and second-order approximation terms are zero. As Theil points out, the covariance terms are formed by the heterogeneity of economic agents (e.g., individuals, households, and stores). If each agent faces the same socioeconomic and marketing environments (i.e., [x.sub.jit] = [x.sub.jt], [for all]j), and/or agents share the same behavioral responses (i.e., [[beta].sub.ji] = [[beta].sub.j], [for all]j, j [not equal to] 0), there will be no aggregation bias originating from cross-section heterogeneity.
Another source of aggregation bias is the difference between geometric and arithmetic means. If these two means are the same, the second-order approximation terms in (7) vanish (see equation (5)). This happens in the special case where a variable assumes the same value across cross-sections. When all micro-variables, including the dependent variable, are the same across cross-sections, it is clear that the typical macro model (3) gives us consistent estimates. However, these conditions are rarely met in reality: economic agents generally have different socioeconomic and/or demographic characteristics. Also, most aggregate data are simple averages of micro variables, which are not necessarily identical to geometric means. Therefore, if we estimate (3) ignoring these covariances and second-order approximations, the omitted variables are absorbed into the error term, which leads to biased parameter estimates.
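The role of the covariance term in (8) can be illustrated with a small numerical sketch. The four households, their elasticities, and their prices below are hypothetical values chosen only for illustration, and cross_section_cov is an assumed helper name.

```python
import math


def cross_section_cov(betas, xs):
    """Covariance over households between beta_i and log x_i, as in (8)."""
    n = len(betas)
    logs = [math.log(x) for x in xs]
    mean_prod = sum(b * l for b, l in zip(betas, logs)) / n
    return mean_prod - (sum(betas) / n) * (sum(logs) / n)


# Hypothetical elasticities and prices for four households in one period.
betas = [-0.2, -0.4, -0.6, -0.8]
prices = [1.0, 2.0, 3.0, 4.0]

cov_hetero = cross_section_cov(betas, prices)          # both vary across households
cov_same_beta = cross_section_cov([-0.5] * 4, prices)  # identical responses
cov_same_x = cross_section_cov(betas, [2.5] * 4)       # identical environments
print(cov_hetero, cov_same_beta, cov_same_x)
```

In this sketch the covariance is nonzero only when both the responses and the regressor differ across households; it vanishes (up to floating-point error) if either is common to all households.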
From this theoretical foundation, Theil derived a framework that delineated aggregation bias from the linear model. He showed that the aggregation bias does not exist when the parameter vectors are all identical, but does exist when the parameter vectors are not all equal across cross-sections. We extend Theil's conceptual framework to the double-log model, and explicitly derive three sources of aggregation bias discussed earlier.
Rewriting (7) in matrix form gives
(9) Y = X[beta] + [xi]
where
[xi] = 1/N [summation over (N/i=1)] ([[beta].sub.i] - [beta])([X.sub.i] - X) + 1/2N [summation over (N/i=1)] ([S.sup.2.sub.y,ig] - [S.sup.2.sub.x,ig] [beta]) + [epsilon].
Y and [epsilon] are column vectors with T-elements of log [y.sub.t] and [[epsilon].sub.t]; [[beta].sub.i] and [beta] are column vectors with J-elements of [[beta].sub.ji], and [[beta].sub.j]; and the J x T exogenous variable matrices can be represented as
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
If elements of column vector [xi] are equal to [[epsilon].sub.t] with the usual property of the error terms, (9) is a macro counterpart of the micro model represented by (1), and would result in a consistent aggregation. However, if covariance and second-order approximation terms are nonzero, the assumption that the error term ([[xi].sub.t]) has the usual property of zero mean cannot be justified.
The implications of the nonzero mean of the error term for aggregation bias can be illustrated as follows. If one applies a least squares regression of Y on X for (9), the coefficient vector of this regression is
(10) b = [(X'X).sup.-1] X'Y
= [beta] + [(X'X).sup.-1] X'[xi]
= [beta] + 1/N [summation over (N/i=1)] [[(X'X).sup.-1] X'[X.sub.i] - I] ([[beta].sub.i] - [beta]) + 1/2N [summation over (N/i=1)] [(X'X).sup.-1] X' ([S.sup.2.sub.y,ig] - [S.sup.2.sub.x,ig] [beta]) + [(X'X).sup.-1] X'[epsilon]
= [beta] + [summation over (N/i=1)] [[W.sub.i] - 1/N I] [[beta].sub.i] + [summation over (N/i=1)] [R.sub.i] + [(X'X).sup.-1] X'[epsilon]
where [W.sub.i] = [(X'X).sup.-1] X' 1/N [X.sub.i], with [summation over (N/i=1)] [W.sub.i] = I; [R.sub.i] = [(X'X).sup.-1] X' 1/2N ([S.sup.2.sub.y,ig] - [S.sup.2.sub.x,ig] [beta]). Taking the expectation of (10) with the assumption that E([epsilon]) = 0 yields
(11) E(b) = [beta] + [summation over (N/i=1)] [[W.sub.i] - 1/N I] [[beta].sub.i] + [summation over (N/i=1)] [R.sub.i].
The second term of the right-hand side of (11) represents the aggregation bias from ignoring the parameter heterogeneity, while the last term reflects the bias caused by using linearly aggregated data for the nonlinear macro model. When the parameter vectors are all identical over cross-section i, the second term vanishes, which is consistent with Theil's results. However, the double-log macro model still suffers because of the linearly aggregated data.
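The last point can be seen in a stylized case. The sketch below assumes a noise-free micro relation y_i = x_i ** beta with the same beta for every household, so the covariance terms are zero by construction; the income values and the function name log_aggregation_gap are hypothetical.

```python
import math


def log_aggregation_gap(xs, beta):
    """Gap between log of averaged y and beta times log of averaged x.

    Assumed micro relation: y_i = x_i ** beta for every household (same beta,
    no intercept, no error), so a nonzero gap comes from linear aggregation only.
    """
    n = len(xs)
    y_bar = sum(x ** beta for x in xs) / n   # arithmetic mean of y
    x_bar = sum(xs) / n                      # arithmetic mean of x
    return math.log(y_bar) - beta * math.log(x_bar)


incomes = [10.0, 20.0, 40.0, 80.0]           # hypothetical values
gap_dispersed = log_aggregation_gap(incomes, 0.5)
gap_identical = log_aggregation_gap([30.0] * 4, 0.5)
print(gap_dispersed, gap_identical)
```

With dispersed values the gap is nonzero even though the parameter is homogeneous, and it disappears when all households have the same value of x. This gap is the quantity that the second-order terms in (7) approximate.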
An Application to the Evaluation of U.S. Milk Advertising Programs
This section illustrates some of the conceptual results we derived in the previous section. The data used for this illustration are monthly household purchases and price of fluid milk products, household income, and advertising expenditure data for fluid milk from January 1996 through September 1999. We used the ACNielsen Homescan Panel data for the household purchase, price, and income variables. (2) The original sample included approximately 32,000 households for the entire U.S. market. Since we are interested in tracking continuous and economically meaningful purchasing behavior of households, we selected only panelists who participated in reporting for the entire sample period and who purchased at least once every month. Then, we randomly sampled 10% of households for computational convenience. The final sample used for this illustration consists of 772 households. The monthly advertising expenditure data were provided by Bozell Inc. for seventy-five major media markets. (3)
First, we estimate the micro and macro models. The empirical specification of the micro model represented by (1) is given by
(12) log [Q.sub.it] = [[beta].sub.0i] + [[beta].sub.1i] log [P.sub.it] + [[beta].sub.2i] log [M.sub.it] + [[beta].sub.3i] log [ADV.sub.it] + [summation over (6/j=4)] [[beta].sub.ji] [QTR.sub.j] + [[epsilon].sub.it]
where Q is monthly per capita milk purchases, P is average net milk purchase price (gross price minus coupon value) deflated by CPI for nonalcoholic beverages, M is monthly household income deflated by CPI for all items, ADV is monthly advertising expenditures (deflated by the media cost index) for each market area, (4) [epsilon]'s are error terms with zero mean, QTR are dummy variables accounting for seasonal variation of milk consumption, i = 1, ..., 772 and t = 1, ..., 45.
We estimate (12) for each household so as to obtain household specific coefficient estimates. Estimates of the price variable have the correct sign (-) in 757 households, and are significant at the 10% level in more than half of these households. Estimates of the income variable also show the correct sign (+) in most households (623 households), but are significant at the 10% level in only a few cases (259 households). Finally, the advertising coefficients have the correct sign (+) in 577 households and are significant at the 10% level in 506 out of the 577 households. Mean values of these household specific estimates are listed in table 1.
The empirical counterpart of the macro model represented by (3) is given by
(13) log [Q.sub.t] = [[gamma].sub.0] + [[gamma].sub.1] log [P.sub.t] + [[gamma].sub.2] log [M.sub.t] + [[gamma].sub.3] log [ADV.sub.t] + [summation over (6/j=4)] [[gamma].sub.j][QTR.sub.j] + [[epsilon].sub.t]
where Q, P, M, ADV are averages of per capita purchases, price, income, and advertising expenditures over households in the sample for each time period t. Parameter estimates obtained from this model are also reported in table 1.
Equation (11) indicates that the aggregation bias depends on the corresponding and non-corresponding parameters, and coefficients from auxiliary regressions. This becomes clearer when we present (11) in scalar form for our empirical model:
(14) E([b.sub.1]) = [[beta].sub.1] + [summation over (N/i=1)] ([w.sub.11i] - 1/N) [[beta].sub.1i] + [summation over (N/i=1)] [w.sub.12i][[beta].sub.2i] + [summation over (N/i=1)] [w.sub.13i][[beta].sub.3i] + [summation over (N/i=1)] [r.sub.1i]
(15) E([b.sub.2]) = [[beta].sub.2] + [summation over (N/i=1)] ([w.sub.22i] - 1/N) [[beta].sub.2i] + [summation over (N/i=1)] [w.sub.21i][[beta].sub.1i] + [summation over (N/i=1)] [w.sub.23i][[beta].sub.3i] + [summation over (N/i=1)] [r.sub.2i]
(16) E([b.sub.3]) = [[beta].sub.3] + [summation over (N/i=1)] ([w.sub.33i] - 1/N) [[beta].sub.3i] + [summation over (N/i=1)] [w.sub.31i][[beta].sub.1i] + [summation over (N/i=1)] [w.sub.32i][[beta].sub.2i] + [summation over (N/i=1)] [r.sub.3i].
The first term in the RHS of each equation is the mean of corresponding micro parameters; the second, third, and fourth terms represent the aggregation bias due to the heterogeneity of parameters over individual households; and the last term is the aggregation bias due to use of linearly aggregated data for the nonlinear model. Here, the elements of the matrices W and R in (11) are derived from the following auxiliary regressions:
(17) 1/N log [P.sub.it] = [w.sub.01i] + [w.sub.11i] log [P.sub.t] + [w.sub.21i] log [M.sub.t] + [w.sub.31i] log [ADV.sub.t] + [v.sub.1it]
(18) 1/N log [M.sub.it] = [w.sub.02i] + [w.sub.12i] log [P.sub.t] + [w.sub.22i] log [M.sub.t] + [w.sub.32i] log [ADV.sub.t] + [v.sub.2it]
(19) 1/N log [ADV.sub.it] = [w.sub.03i] + [w.sub.13i] log [P.sub.t] + [w.sub.23i] log [M.sub.t] + [w.sub.33i] log [ADV.sub.t] + [v.sub.3it]
(20) 1/2N ([S.sup.2.sub.y,ig]-[S.sup.2.sub.x,ig][beta]) = [r.sub.0i]+[r.sub.1i] log [P.sub.t] + [r.sub.2i] log [M.sub.t] + [r.sub.3i] log [ADV.sub.t] + [k.sub.it].
To avoid computational complexity, we omit auxiliary regressions for the seasonality variables. This can be justified because we do not expect any serious aggregation bias from these variables that are identical across households.
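The mechanics of the auxiliary regressions can be sketched for the simpler linear case treated by Theil, with one regressor and noise-free micro relations. Everything in the block is an illustrative assumption (three hypothetical households, five periods, made-up slopes); it is not the procedure applied to the milk data, and it omits the nonlinear term r.

```python
def ols_slope(x, y):
    """Slope from a least squares regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: 3 households, 5 periods, linear micro relations without noise.
x_micro = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 2.5, 2.0, 3.5, 3.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
betas = [0.2, 0.5, 0.9]
n_hh = len(betas)
n_t = len(x_micro[0])

y_micro = [[b * x for x in row] for b, row in zip(betas, x_micro)]
x_macro = [sum(row[t] for row in x_micro) / n_hh for t in range(n_t)]
y_macro = [sum(row[t] for row in y_micro) / n_hh for t in range(n_t)]

# Auxiliary regressions of x_it / N on the macro regressor give the weights w_i.
weights = [ols_slope(x_macro, [v / n_hh for v in row]) for row in x_micro]

b_macro = ols_slope(x_macro, y_macro)        # macro coefficient
beta_bar = sum(betas) / n_hh                 # mean of micro parameters
bias = sum((w - 1.0 / n_hh) * b for w, b in zip(weights, betas))
print(sum(weights), b_macro, beta_bar + bias)
```

In this made-up example the weights sum to one, the macro slope equals the mean micro slope plus the weighted bias term, and the macro slope is negative although every micro slope is positive. With identical micro slopes the bias term is zero.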
With estimates from the 772 micro models and auxiliary regressions, E([b.sub.i]) can be calculated, and the resulting aggregation bias can be obtained from (14)-(16). Note that (14)-(16) require true parameters of micro models. However, since only estimates of these parameters are available in this study, we will compute the aggregation bias assuming that the estimates are equal to their parameters. Of course, this would result in some level of sampling error in the analysis. The difference between E([b.sub.i]) and estimates of the macro model represented by (13) is considered as a sampling error and is reported along with three components of aggregation bias in table 1.
The estimates [SIGMA] [[beta].sub.1]/N, [SIGMA] [[beta].sub.2]/N, and [SIGMA] [[beta].sub.3]/N are within one standard error of [[gamma].sub.1], [[gamma].sub.2], and [[gamma].sub.3]. Therefore, if the standard error is used to measure the bias of the macro estimates, one can say that these macro coefficients are not very different from the micro parameters. However, it should be noted that they are numerically very different and the magnitude of this difference is not negligible. In particular, income and advertising variables have different signs. In addition, the macro estimates of the advertising effects are not significant whereas the micro estimates of advertising had positive signs and were significant at the 10% level in 506 out of the 772 households. This shows that the aggregation bias could potentially be misleading for advertising evaluation if researchers had access only to aggregate data.
For the price coefficient, the use of linearly aggregated data for nonlinear model estimation contributes most of the aggregation bias, while corresponding micro parameters contribute the least. Using linearly aggregated data for nonlinear functions also causes the largest aggregation bias for the income coefficient, whereas non-corresponding micro parameters bring the largest bias for the advertising coefficient. This indicates that, in general, there is no fixed ranking by magnitude among the three sources of aggregation bias.
The estimated sampling error in the last row of table 1 is almost the same magnitude as total aggregation bias for all three variables. This sampling error should become smaller as the number of cross-sections becomes larger, based on the law of large numbers. Although there is no guarantee that numerical results will be similar in other cases, the extent of aggregation bias in table 1 appears to be of concern because a large amount of research on generic advertising evaluation has been conducted with aggregated national-level data. Hence, evaluation results based only on aggregate data could provide potentially erroneous policy implications.
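The accounting behind table 1 can be reproduced directly from its entries. The short sketch below only re-adds the reported numbers; the dictionary layout and the function name decompose are illustrative choices.

```python
# Values transcribed from table 1 (price, income, advertising).
micro_mean = {"price": -0.4878, "income": 0.1585, "advertising": 0.0066}
macro_est = {"price": -0.3618, "income": -0.6823, "advertising": -0.0476}
expected_b = {"price": 1.3944, "income": -2.0650, "advertising": -0.9785}
# Components: corresponding, non-corresponding, nonlinear with aggregated data.
components = {
    "price": (0.0138, 0.2936, 1.5748),
    "income": (-0.5925, -0.2066, -1.4244),
    "advertising": (0.0019, -0.8005, -0.1865),
}


def decompose(name):
    """Return (total aggregation bias, sum of three components, sampling error)."""
    total = expected_b[name] - micro_mean[name]
    parts = sum(components[name])
    sampling = macro_est[name] - expected_b[name]
    return round(total, 4), round(parts, 4), round(sampling, 4)


for name in micro_mean:
    print(name, decompose(name))
```

For each variable the three components add up to the total aggregation bias, and the sampling error is the macro estimate minus E(b).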
Conclusions
This article examines the potential data aggregation problem in generic advertising research. The motivation for this study is that the majority of past studies in this research area have relied upon aggregate data (mostly at the national level), while many studies in empirical economics have reported that estimates from aggregate data may be biased. In the past, researchers were generally forced to use highly aggregate data because of the limited availability of household or store-level scanner data. However, as these disaggregate data become more available, current researchers have options to choose between disaggregate and aggregate data. This study investigates how seriously the data aggregation may affect the evaluation of generic advertising. Deriving a statistical procedure to quantify the extent of bias for a double-log advertising model, we show that the aggregation bias exists as long as the covariances between marketing variables and corresponding parameters are nonzero, or the linearly aggregated data are used for nonlinear models.
For the purpose of illustration, the procedure is applied to the evaluation of U.S. milk advertising programs. We found significant aggregation bias in all three variables estimated: price, income, and advertising. Particularly, the macro estimate of advertising variable had a different sign from the mean of micro estimates. This illustrates that the aggregation bias could potentially mislead the advertising evaluation if one had access only to aggregate data. We hope our discussion stimulates empirical research on the commodity advertising evaluation that has traditionally relied on aggregate data. We also hope to trigger research interest in both conceptual and empirical modeling for the use of micro data.
Table 1
Aggregation Bias of Macro Coefficients

                                                              Price     Income    Advertising
Average of micro parameters, [SIGMA] [[beta].sub.i]/N        -0.4878    0.1585     0.0066
Macro estimates, [gamma]                                     -0.3618   -0.6823    -0.0476
                                                             (0.2153)  (1.1372)   (0.3173)
Total of estimated coefficients, E(b)                         1.3944   -2.0650    -0.9785
Aggregation bias, E(b) - [SIGMA] [[beta].sub.i]/N             1.8822   -2.2235    -0.9851
  corresponding micro parameters                              0.0138   -0.5925     0.0019
  non-corresponding micro parameters                          0.2936   -0.2066    -0.8005
  nonlinear model estimation with linearly aggregated data    1.5748   -1.4244    -0.1865
Sampling error, [gamma] - E(b)                               -1.7562    1.3827     0.9309

Numbers in parentheses are standard errors for macro estimates.
(1.) Following Cramer, consider N positive variables [a.sub.i], i = 1, ..., N, which are characterized as
(a) [a.sub.i] = log [x.sub.i].
Then, we can write
(b) a = 1/N [summation over (N/i=1)] log [x.sub.i] = log [x.sub.g].
From (a), [x.sub.i] can be written as an exponential and then expanded in a power series:
(c) [x.sub.i] = exp([a.sub.i]) = exp(a) exp([a.sub.i] - a) = exp(a) {1 + ([a.sub.i] - a) + 1/2! [([a.sub.i] - a).sup.2] + 1/3! [([a.sub.i] - a).sup.3] + ...}.
Applying the definition of the arithmetic mean of [x.sub.i] to (c), we have
(d) x = exp(a) {1 + 1/N [summation over (i)] ([a.sub.i] - a) + 1/2! 1/N [summation over (i)] [([a.sub.i] - a).sup.2] + 1/3! 1/N [summation over (i)] [([a.sub.i] - a).sup.3] + ...}.
Then, with (b) and [[SIGMA].sub.i] ([a.sub.i] - a)/N = 0, (d) leads to equation (4) in the text.
(2.) Copyright 2000 by ACNielsen
(3.) It would be more desirable to have advertising data at the level of households. However, it is almost impossible to obtain this type of data.
(4.) In this study, we define an advertising goodwill variable to be a function of current and lagged advertising expenditures, which allows for carryover effects of advertising on purchases. For a further discussion on the lag structure of this goodwill function, refer to Chung and Kaiser, and Cox.
References
Christen, M., S. Gupta, J.C. Porter, R. Staelin, and D.R. Wittink. "Using Market-Level Data to Understand Promotion Effects in a Nonlinear Model." J. Marketing Res. 34(August 1997):322-34.
Chung, C., and H.M. Kaiser. "Determinants of Temporal Variations in Generic Advertising Effectiveness." Agribusiness 16(2000):197-214.
Cox, T.L. "A Rotterdam Model Incorporating Advertising Effects: The Case of Canadian Fats and Oils." Commodity Advertising and Promotion. H. Kinnucan, S.R. Thompson, and H.S. Chang, eds., pp. 139-64. Ames: Iowa State University Press, 1992.
Cooper, L.G., and M. Nakanishi. Market Share Analysis: Evaluating Competitive Marketing Effectiveness. Boston: Kluwer Academic Publishers, 1988.
Cramer, J.S. Empirical Economics. North-Holland Publishing Company, Amsterdam, 1971.
Ferrero, J., L. Boom, H.M. Kaiser, and O.D. Forker. Annotated Bibliography of Generic Commodity Promotion Research (Revised). NICPRE Research Bulletin 96-03, Cornell University, 1996.
Judge, G.G., W.E. Griffiths, R.C. Hill, H. Lutkepohl, and T-C Lee. The Theory and Practice of Econometrics, 2nd ed. New York: John Wiley & Sons, 1985.
Kirman, A.P. "Whom or What Does the Representative Individual Represent?" J. Economic Perspectives 6(2) (Spring 1992):117-36.
Krishnamurthi, L., S.P. Raj, and R. Selvam. Statistical and Managerial Issues in Cross-Sectional Aggregation. Working Paper, Northwestern University, 1990.
Maddala, G.S. Econometrics. New York: McGraw-Hill Book Company, 1977.
Sonnenschein, H. "The Utility Hypothesis and Market Demand Theory." W. Econ. J. 11(1973):404-10.
Stoker, T.M. "Empirical Approaches to the Problem of Aggregation Over Individuals." J. Econ. Lit. 31(December 1993):1827-74.
Tellis, G.J., and D.L. Weiss. "Does TV Advertising Really Affect Sales? The Role of Measures, Models, and Data Aggregation." J. Advertising 24(Fall 1995):1-12.
Theil, H. Principles of Econometrics. New York: John Wiley & Sons, 1971.
This paper was presented at the ASSA winter meetings (Atlanta, GA, January 2002). Papers in these sessions are not subjected to the journal's standard refereeing process.
Chanjin Chung is assistant professor, Department of Agricultural Economics, Oklahoma State University and Harry M. Kaiser is professor, Department of Applied Economics and Management, Cornell University.