Estimating Lorenz curves using a Dirichlet distribution.

By Griffiths, William E.
Publication: Journal of Business & Economic Statistics
Date: Monday, April 1 2002

The Lorenz curve relates the cumulative proportion of income to the cumulative proportion of population. When a particular functional form of the Lorenz curve is specified, it is typically estimated by linear or nonlinear least squares estimation techniques that have good properties when the error terms are independently and normally distributed. Observations on cumulative proportions are clearly neither independent nor normally distributed. This article proposes and applies a new methodology that recognizes the cumulative proportional nature of the Lorenz curve data by assuming that the income proportions are distributed as a Dirichlet distribution. Five Lorenz curve specifications are used to demonstrate the technique. Maximum likelihood estimates under the Dirichlet distribution assumption provide better fitting Lorenz curves than nonlinear least squares and another estimation technique that has appeared in the literature.

KEY WORDS: Gini coefficient; Maximum likelihood estimation.

1. INTRODUCTION

The Lorenz curve is one of the most important tools upon which the measurement of income inequality is based. For a given economy or region, it relates the cumulative proportion of income to the cumulative proportion of population, after ordering the population according to increasing level of income. A number of approaches to Lorenz curve estimation have been adopted. In one approach, a particular assumption about the statistical distribution of income is made, the parameters of this income distribution are estimated, and a Lorenz curve consistent with the distributional assumption and consistent with the parameter estimates for that distribution is obtained. See, for example, McDonald (1984) and McDonald and Xu (1995). Ryu and Slottje (1996) suggest another approach. They approximate the Lorenz curve from any income distribution by expanding the inverse distribution function in terms of (a) an exponential polynomial series and (b) a sequence of Bernstein polynomial functions. When micro-data are available, nonparametric estimation of the Lorenz curve and related inequality measures is possible. See, for example, Beach and Davidson (1983); Gastwirth and Gail (1985); and Bishop, Chakraborti, and Thistle (1989). An alternative approach, more suited to grouped data, is to specify a particular functional form for the Lorenz curve and estimate it directly. It is this approach that is the focus of this article.

Early breakthroughs on Lorenz curve estimation were those of Gastwirth (1972) and Kakwani and Podder (1973, 1976). Kakwani and Podder recognized the multinomial nature of grouped data and used a Lorenz curve specification that, after transformation, could be placed in an approximate linear model framework. Other specifications have typically been estimated by linear or nonlinear least squares (Kakwani 1980; Basmann, Hayes, Slottje, and Johnson 1990; and Chotikapanich 1993). Such exercises are useful for fitting Lorenz curves, but, because the covariance matrix estimates they provide are only relevant for independent normally distributed errors, they do not provide a basis for inference about Lorenz curve parameters or any inequality measures derived from them. Clearly, observations on cumulative proportions, or even their logarithms if such a transformation is convenient, will be neither independent nor normally distributed. Sarabia, Castillo, and Slottje (1999) overcome this problem by suggesting a distribution-free method of estimation. Suppose that a Lorenz curve has n unknown parameters, and that M observations on the cumulative proportions are available. They find a set of parameter estimates for each of the K = $\binom{M}{n}$ subsets of n observations. Because each of the subsets yields n equations in n unknown parameters, a set of parameter estimates is obtained by solving these equations. The medians of the sets of parameter estimates are recommended as the final set of estimates. No distribution theory is available for this procedure, but the authors do provide some bootstrap standard errors.
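The subset-and-median idea can be illustrated for the simplest case, n = 1. The sketch below is not the implementation of Sarabia et al.; it assumes the one-parameter curve L = 1 - (1 - pi)^delta, for which each single-observation subset can be solved for delta in closed form, and the data points are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def solve_delta(point):
    """Solve eta = 1 - (1 - pi)**delta for delta from one (pi, eta) point."""
    pi, eta = point
    return math.log(1.0 - eta) / math.log(1.0 - pi)

def subset_median_estimate(points, n_params=1):
    """Median of the estimates obtained from every subset of n_params points.

    Only n_params = 1 is handled, because this curve can be solved in closed
    form; a larger n would need a nonlinear equation solver for each subset.
    """
    if n_params != 1:
        raise NotImplementedError("sketch covers the one-parameter case only")
    estimates = [solve_delta(s[0]) for s in combinations(points, n_params)]
    assert len(estimates) == math.comb(len(points), n_params)  # K subsets
    return median(estimates)

# Hypothetical interior observations (pi_i, eta_i). The end point (1, 1) is
# dropped because it carries no information about delta.
points = [(0.2, 0.10), (0.4, 0.22), (0.6, 0.37), (0.8, 0.58)]
delta_hat = subset_median_estimate(points)
print(round(delta_hat, 4))
```

With more than one parameter the number of subsets grows as the binomial coefficient of M and n, and each subset requires the solution of a small nonlinear system.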

An alternative way to proceed, and the approach adopted in this article, is to choose a distributional assumption that is consistent with the proportional nature of the data and to pursue maximum likelihood (ML) estimation. A suitable distribution is the Dirichlet distribution. It is a multivariate distribution for a vector of random variables that are shares that sum to unity. By relating the parameters of the Dirichlet distribution to Lorenz curve differences, we can accommodate the cumulative proportional nature of the Lorenz curve data and set up a likelihood function dependent on the unknown parameters of the Lorenz curve. A similar approach was adopted by Woodland (1979) for estimation of share equations that arise in demand and production theory. To further motivate the choice of a Dirichlet distribution, note that, with random sampling, the number of households in each of a number of income classes can be viewed as an observation from the multinomial distribution (Aigner and Goldberger 1970, Kakwani and Podder 1973). Furthermore, by using a transformation from cell numbers to cell proportions, the multinomial distribution can be approximated by a Dirichlet distribution (Johnson 1960, Johnson and Kotz 1969, p. 285). Thus, the Dirichlet distribution is a reasonable choice for share data, irrespective of the original income distribution from which the observations were drawn. The choice of a Dirichlet distribution for income shares is much less arbitrary than choosing a specific income distribution. In addition, the number of recognized multivariate distributions that are directly applicable to share data is very limited. Apart from the Dirichlet distribution, only two other possibly relevant generalized beta distributions are described by Johnson and Kotz (1972). These facts and the general lack of recognition of the share nature of the data in much of the literature on Lorenz curve estimation make the Dirichlet distribution a useful alternative to pursue.

In Section 2, we outline the distributional assumptions and how they relate to Lorenz curve estimation. The likelihood function for a set of unknown Lorenz curve parameters is derived. To illustrate our suggested techniques we use data on Sweden and Brazil considered earlier by Shorrocks (1983) and revisited by Sarabia et al. (1999). These data are described in Section 3; five different Lorenz functions that we use in the empirical work are presented. The results are given and discussed in Section 4. Several questions are investigated. To determine whether the results are sensitive to the chosen estimation technique we compare our estimates and their standard errors with those obtained by Sarabia et al. (1999) and those obtained using nonlinear least squares. Because Lorenz-curve estimation is usually a first step toward estimating inequality, ML and nonlinear least squares estimates for the Gini coefficient are obtained for each Lorenz curve specification. Finally, we attempt to determine which estimation technique leads to the best fitting Lorenz curve.

2. MODELS, ASSUMPTIONS, AND ESTIMATION

Suppose we have available observations on cumulative proportions of population ([[pi].sub.1], [[pi].sub.2], ..., [[pi].sub.M] with [[pi].sub.M] = 1) and corresponding cumulative proportions of income ([[eta].sub.1], [[eta].sub.2], ..., [[eta].sub.M] with [[eta].sub.M] = 1) obtained after ordering population units according to increasing income. We wish to use these observations to estimate a parametric version of a Lorenz curve that we write as [eta] = L ([pi]; [beta]), where [beta] is an (n x 1) vector of unknown parameters. Clearly, one would not expect all data points to lie exactly on the curve [[eta].sub.i] = L([[pi].sub.i]; [beta]). It seems reasonable to assume, however, that conditional on the population proportions [[pi].sub.i], the income shares [q.sub.i] = [[eta].sub.i] - [[eta].sub.i-1] are random variables with means

(1) E([q.sub.i]) = E([[eta].sub.i]) - E([[eta].sub.i-1]) = L([[pi].sub.i]; [beta]) - L([[pi].sub.i-1]; [beta]).

Our proposal is to also assume that q = ([q.sub.1], [q.sub.2], ..., [q.sub.M])' follows a Dirichlet distribution which is a distribution consistent with the share nature of the random vector q. The probability density function (pdf) for the Dirichlet distribution is given by

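(2) $f(q \mid \alpha) = \dfrac{\Gamma(\alpha_1 + \alpha_2 + \cdots + \alpha_M)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2) \cdots \Gamma(\alpha_M)} \prod_{i=1}^{M} q_i^{\alpha_i - 1},$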

where [alpha] = ([[alpha].sub.1], [[alpha].sub.2], ..., [[alpha].sub.M])' are the parameters of the pdf and [GAMMA](*) is the gamma function. By relating the [[alpha].sub.i] to the Lorenz function, we can find a pdf for q which has the mean given in Equation (1) and which is a function of the Lorenz curve parameters. Working in this direction, we set

(3) [[alpha].sub.i] = [lambda][L([[pi].sub.i]; [beta]) - L([[pi].sub.i-1]; [beta])],

where [lambda] is an additional unknown parameter. This definition for [[alpha].sub.i] gives the desired result because the mean of the Dirichlet distribution is given by

(4) $E(q_i) = \dfrac{\alpha_i}{\alpha_1 + \alpha_2 + \cdots + \alpha_M}$

$= \dfrac{\lambda\,[L(\pi_i; \beta) - L(\pi_{i-1}; \beta)]}{\lambda \sum_{j=1}^{M} [L(\pi_j; \beta) - L(\pi_{j-1}; \beta)]}$

$= L(\pi_i; \beta) - L(\pi_{i-1}; \beta),$

inasmuch as L([[pi].sub.M]; [beta]) = 1 and L([[pi].sub.0]; [beta]) = 0. We can now write the pdf for q as

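(5) $f(q \mid \theta) = \Gamma(\lambda) \prod_{i=1}^{M} \dfrac{q_i^{\lambda[L(\pi_i; \beta) - L(\pi_{i-1}; \beta)] - 1}}{\Gamma(\lambda[L(\pi_i; \beta) - L(\pi_{i-1}; \beta)])},$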

where [theta] = ([beta]', [lambda])'.
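As a minimal sketch of Equations (1), (3), and (4), the code below builds the Dirichlet parameters from Lorenz curve differences and checks that they sum to lambda, so that the implied means are the Lorenz differences themselves. The one-parameter curve of Equation (10) is used for illustration; the values of k and lambda and the function names are assumptions, not estimates from the article.

```python
import math

def lorenz_l1(pi, k):
    """One-parameter Lorenz curve (e^{k*pi} - 1) / (e^k - 1), Equation (10)."""
    return math.expm1(k * pi) / math.expm1(k)

def expected_shares(lorenz, pis, *params):
    """Differences L(pi_i) - L(pi_{i-1}), with pi_0 = 0, as in Equation (1)."""
    cum = [0.0] + [lorenz(p, *params) for p in pis]
    return [cum[i] - cum[i - 1] for i in range(1, len(cum))]

def dirichlet_alphas(lorenz, pis, lam, *params):
    """alpha_i = lambda * [L(pi_i) - L(pi_{i-1})], as in Equation (3)."""
    return [lam * d for d in expected_shares(lorenz, pis, *params)]

pis = [i / 10 for i in range(1, 11)]                    # decile proportions
alphas = dirichlet_alphas(lorenz_l1, pis, 500.0, 2.5)   # illustrative lambda, k
means = [a / sum(alphas) for a in alphas]               # Equation (4)
print(round(sum(alphas), 6))
```

Because the differences telescope to L(1) - L(0) = 1, the sum of the alphas equals lambda whatever Lorenz function is supplied.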

The variances and covariances between the shares are given by (Johnson and Kotz, 1972, pp. 231-234)

(6) $\mathrm{var}(q_i) = \dfrac{E(q_i)\,[1 - E(q_i)]}{\lambda + 1}$

(7) $\mathrm{cov}(q_i, q_j) = -\dfrac{E(q_i)\,E(q_j)}{\lambda + 1}, \quad i \neq j.$

Thus, the income shares are correlated, with correlations given by

(8) $r_{ij} = -\left\{\dfrac{E(q_i)\,E(q_j)}{[1 - E(q_i)]\,[1 - E(q_j)]}\right\}^{1/2}.$

Because the variances depend on E([q.sub.i]), the shares are also heteroscedastic. The parameter [lambda] acts as an inverse variance parameter. The larger the value of [lambda], the better the fit of the Lorenz curve to the data.
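The second-moment structure can be made concrete with a short sketch. It assumes the standard Dirichlet results var = m(1 - m)/(lambda + 1) and cov = -m_i m_j/(lambda + 1), where m denotes an expected share; the shares and lambda values below are hypothetical.

```python
import math

def dirichlet_share_moments(means, lam):
    """Variances, covariances, and correlations of Dirichlet shares."""
    m = len(means)
    var = [mu * (1.0 - mu) / (lam + 1.0) for mu in means]
    cov = [[var[i] if i == j else -means[i] * means[j] / (lam + 1.0)
            for j in range(m)] for i in range(m)]
    corr = [[1.0 if i == j else cov[i][j] / math.sqrt(var[i] * var[j])
             for j in range(m)] for i in range(m)]
    return var, cov, corr

means = [0.1, 0.2, 0.3, 0.4]   # hypothetical expected income shares
var_lo, cov_lo, corr_lo = dirichlet_share_moments(means, lam=50.0)
var_hi, cov_hi, corr_hi = dirichlet_share_moments(means, lam=5000.0)
```

Raising lambda shrinks every variance and covariance by the same factor, so the correlations are unchanged. This is the sense in which lambda acts purely as an inverse variance parameter.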

The ML estimate for [theta] can be found by maximizing the log-likelihood function,

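(9) $\log[f(q \mid \theta)] = \log \Gamma(\lambda) + \sum_{i=1}^{M} \{\lambda[L(\pi_i; \beta) - L(\pi_{i-1}; \beta)] - 1\} \log q_i - \sum_{i=1}^{M} \log \Gamma(\lambda[L(\pi_i; \beta) - L(\pi_{i-1}; \beta)]).$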
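A minimal sketch of the log-likelihood follows, assuming the Dirichlet density with alpha_i = lambda[L(pi_i; beta) - L(pi_{i-1}; beta)], so that the alphas sum to lambda. The "observed" shares are generated from the curve itself rather than taken from real data, and the grid search over k with lambda held fixed is only a stand-in for a proper joint numerical maximization.

```python
import math

def dirichlet_loglik(q, pis, lorenz, lam, *params):
    """Log-likelihood of shares q with alpha_i = lam * [L(pi_i) - L(pi_{i-1})]."""
    cum = [0.0] + [lorenz(p, *params) for p in pis]
    ll = math.lgamma(lam)
    for i, q_i in enumerate(q, start=1):
        alpha_i = lam * (cum[i] - cum[i - 1])
        ll += (alpha_i - 1.0) * math.log(q_i) - math.lgamma(alpha_i)
    return ll

def lorenz_l1(pi, k):
    """One-parameter curve (e^{k*pi} - 1) / (e^k - 1), used as the example."""
    return math.expm1(k * pi) / math.expm1(k)

pis = [i / 10 for i in range(1, 11)]
# Hypothetical shares generated from L1 with k = 2.5; not real data.
cum = [0.0] + [lorenz_l1(p, 2.5) for p in pis]
q = [cum[i] - cum[i - 1] for i in range(1, 11)]

# Crude grid search over k in [1.5, 3.5] with lambda fixed at 1000.
grid = [1.5 + 0.01 * j for j in range(201)]
k_hat = max(grid, key=lambda k: dirichlet_loglik(q, pis, lorenz_l1, 1000.0, k))
print(round(k_hat, 2))
```

The maximizing k lands close to, but not necessarily exactly at, the generating value, because a Dirichlet likelihood based on one observed share vector is not maximized exactly at the mean vector.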

3. DATA AND LORENZ CURVES

To illustrate our suggested techniques we use income distribution data on national samples of income recipients for a year close to 1970, for two countries, Sweden and Brazil. These data were used by Sarabia et al. (1999). They were derived from Jain (1975) and first published by Shorrocks (1983). The data are in the form of decile cumulative income shares. Shorrocks used the data on these two countries as part of the data on a group of 20 countries to examine the ranking of income distributions given different social states. Sarabia et al. (1999) used the data to illustrate their proposed method for the estimation of Lorenz curves. The data on these two countries were chosen because of their differences in the degree of inequality in income distributions.

A large number of functional forms have been suggested in the literature for modeling the Lorenz curve. For details of the various alternatives, see Sarabia et al. (1999) and references therein. To keep our study manageable, we chose only five, ranging from one simple function with only one unknown parameter, to two three-parameter functions which are more flexible but also harder to estimate precisely. The five different Lorenz functions to which we applied the two data sets are

(10) $L_1(\pi; k) = \dfrac{e^{k\pi} - 1}{e^{k} - 1}, \quad k > 0$

(11) $L_2(\pi; \alpha, \delta) = \pi^{\alpha}\,[1 - (1 - \pi)^{\delta}], \quad \alpha \geq 0,\ 0 < \delta \leq 1$

(12) $L_3(\pi; \delta, \gamma) = [1 - (1 - \pi)^{\delta}]^{\gamma}, \quad \gamma \geq 1,\ 0 < \delta \leq 1$

(13) $L_4(\pi; \alpha, \delta, \gamma) = \pi^{\alpha}\,[1 - (1 - \pi)^{\delta}]^{\gamma}, \quad \alpha \geq 0,\ \gamma \geq 1,\ 0 < \delta \leq 1$

(14) $L_5(\pi; a, b, d) = \pi - a\,\pi^{d}\,(1 - \pi)^{b}, \quad a > 0,\ 0 < d \leq 1,\ 0 < b \leq 1.$

The function [L.sub.1] is the relatively simple one-parameter function suggested by Chotikapanich (1993); [L.sub.2] coincides with the proposal of Ortega, Fernandez, Lodoux, and Garcia (1991). [L.sub.3] is a well-known form of Lorenz curve suggested by Rasche, Gaffney, Koo, and Obst (1980), and [L.sub.4] is an extension of [L.sub.3] and [L.sub.2] introduced by Sarabia et al. (1999). Note that [L.sub.4] nests both [L.sub.2] and [L.sub.3], where [L.sub.2] is [L.sub.4] with [gamma] = 1 and [L.sub.3] is [L.sub.4] with [alpha] = 0. Setting both [gamma] = 1 and [alpha] = 0 yields the Lorenz curve L = 1 - [(1 - [pi]).sup.[delta]], which originates from the classical Pareto distribution. The function [L.sub.5] is the "beta function" proposed by Kakwani (1980). It is considered one of the best performers among a number of different functional forms for Lorenz curves. See, for example, Datt (1998). Note that when a = 1 and d = 1, [L.sub.5] is the same as [L.sub.2] with [alpha] = 1.
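The nesting relations just described can be checked numerically. The sketch below codes the five functions as L1 = (e^{k pi} - 1)/(e^k - 1), L2 = pi^alpha [1 - (1 - pi)^delta], L3 = [1 - (1 - pi)^delta]^gamma, L4 = pi^alpha [1 - (1 - pi)^delta]^gamma, and L5 = pi - a pi^d (1 - pi)^b; the parameter values are arbitrary illustrations.

```python
import math

def L1(pi, k):
    return math.expm1(k * pi) / math.expm1(k)

def L2(pi, alpha, delta):
    return pi ** alpha * (1.0 - (1.0 - pi) ** delta)

def L3(pi, delta, gamma):
    return (1.0 - (1.0 - pi) ** delta) ** gamma

def L4(pi, alpha, delta, gamma):
    return pi ** alpha * (1.0 - (1.0 - pi) ** delta) ** gamma

def L5(pi, a, d, b):
    return pi - a * pi ** d * (1.0 - pi) ** b

grid = [i / 20 for i in range(21)]
close = lambda x, y: math.isclose(x, y, abs_tol=1e-12)

# L4 with gamma = 1 is L2; L4 with alpha = 0 is L3.
nests_l2 = all(close(L4(p, 0.6, 0.7, 1.0), L2(p, 0.6, 0.7)) for p in grid)
nests_l3 = all(close(L4(p, 0.0, 0.7, 1.5), L3(p, 0.7, 1.5)) for p in grid)
# L5 with a = 1 and d = 1 is L2 with alpha = 1 and delta = b.
l5_is_l2 = all(close(L5(p, 1.0, 1.0, 0.6), L2(p, 1.0, 0.6)) for p in grid)
print(nests_l2, nests_l3, l5_is_l2)
```

Each function also satisfies L(0) = 0 and L(1) = 1, which is what makes the Lorenz differences sum to one.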

Once a Lorenz curve has been estimated, one is usually interested in various inequality measures that are related to it. As an example, we compute ML estimates for the Gini coefficients that can be derived from each of the Lorenz functions. In each case the Gini coefficient is defined as

(15) G = 1 - 2 [[integral].sup.1.sub.0] L([pi]; [beta]) d[pi].

Alternative expressions for G can be found for some of the Lorenz curves. However, with the exception of [L.sub.1], they still generally involve a numerical integral. We obtain ML estimates by numerically evaluating (15) in each case with [beta] replaced by its ML estimate.
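The numerical evaluation of (15) can be sketched with a composite Simpson rule. The quadrature rule and function names are assumptions made here, since the article does not say which integration method was used. For L1 the integral also has a closed form, G = 1 - 2/k + 2/(e^k - 1), which serves as a check.

```python
import math

def gini(lorenz, *params, n=2000):
    """G = 1 - 2 * integral_0^1 L(pi) d(pi), composite Simpson rule (n even)."""
    h = 1.0 / n
    total = lorenz(0.0, *params) + lorenz(1.0, *params)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * lorenz(j * h, *params)
    return 1.0 - 2.0 * total * h / 3.0

def L1(pi, k):
    return math.expm1(k * pi) / math.expm1(k)

def gini_L1_closed_form(k):
    """Closed form obtained by integrating L1 analytically."""
    return 1.0 - 2.0 / k + 2.0 / math.expm1(k)

k_nl = 2.5029   # NL estimate of k for Sweden, taken from Table 1
print(round(gini(L1, k_nl), 4), round(gini_L1_closed_form(k_nl), 4))
# both print 0.3792, the NL Gini coefficient reported for L1 in Table 1
```

The same `gini` function accepts any of the other Lorenz specifications, for which no simple closed form is available.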

4. RESULTS

In addition to ML estimation using the assumption of a Dirichlet distribution, we also estimated each function, using nonlinear least squares (NL). Because nonlinear least squares has been popular in the literature, it is useful to compare its estimates and standard errors with those from ML estimation. However, conventional NL standard errors are computed assuming independent identically distributed error terms, an assumption that is unrealistic for share data. Thus, for NL standard errors we report those suggested by Newey and West (1987). The estimates and standard errors obtained by Sarabia et al. (1999) for [L.sub.2], [L.sub.3], and [L.sub.4] are also reported; they provide further evidence of the sensitivity of estimates to choice of estimation technique. However, "Sarabia estimates" for [L.sub.1] and [L.sub.5] are not available, nor are the standard errors for the "Sarabia-based" Gini coefficient estimates for all functions.

Point estimates and standard errors of the Lorenz curve parameters and the corresponding Gini coefficients for Sweden are presented in Table 1. With the exception of the function [L.sub.4], the estimates of the Lorenz parameters and the Gini coefficient are not sensitive to the estimation technique. NL, ML, and "Sarabia" lead to almost identical estimates. For [L.sub.4] there is considerable variation in the Lorenz parameter estimates, and the Sarabia-estimated Gini coefficient is noticeably different from the others. A somewhat remarkable outcome is that, with the exception of the estimate by Sarabia et al. from [L.sub.4], the point estimates of the Gini coefficient are relatively insensitive to estimation technique and functional form specification.

Although point estimation is robust with respect to choice of estimation technique (and functional form), assessment of the reliability of the estimates, via their standard errors, is heavily dependent on estimation technique. Choosing a ML technique that is consistent with the share nature of the data can have a big impact on the perceived precision of the estimates. In Table 1 the standard errors for ML are generally higher than those for NL; those reported by Sarabia et al. are higher for some coefficients and lower for others. The standard errors of the Gini coefficient were calculated using the asymptotic approximation

(16) $\mathrm{var}(G) = \left(\dfrac{\partial G}{\partial \beta}\right)' V_{\beta} \left(\dfrac{\partial G}{\partial \beta}\right),$

where [V.sub.[beta]] is the asymptotic covariance matrix for the ML or NL estimator for [beta]. Expressions derived using (16) for each of the Lorenz curves are given in the Appendix.
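For the one-parameter function L1, the approximation in (16) reduces to var(G) = (dG/dk)^2 var(k). The sketch below uses the closed form G = 1 - 2/k + 2/(e^k - 1) and its derivative, both derived here rather than quoted from the article, together with the ML estimate and standard error of k for Sweden from Table 1.

```python
import math

def gini_L1(k):
    return 1.0 - 2.0 / k + 2.0 / math.expm1(k)

def dgini_L1_dk(k):
    """Analytic derivative of gini_L1 with respect to k."""
    return 2.0 / k ** 2 - 2.0 * math.exp(k) / math.expm1(k) ** 2

def delta_method_se(grad, cov):
    """sqrt(grad' V grad) for a gradient list and a covariance matrix."""
    n = len(grad)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    return math.sqrt(var)

k_ml, se_k_ml = 2.5313, 0.1831   # ML estimate and standard error, Table 1
se_gini = delta_method_se([dgini_L1_dk(k_ml)], [[se_k_ml ** 2]])
print(round(se_gini, 4))   # 0.0228, the ML standard error of G for L1 in Table 1
```

For the multi-parameter functions the gradient has no closed form, and each partial derivative would be evaluated by numerical integration before being passed to `delta_method_se`.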

The remarks made about Sweden also hold for the estimates for Brazil given in Table 2, with some minor exceptions. Once again, there are vastly different estimates for [L.sub.4], confirming considerable instability in the estimation of this function. In contrast to Sweden, estimates of the [L.sub.1] parameter and corresponding Gini coefficient are also sensitive to choice of estimation technique. The other functions remain insensitive to choice of estimation technique. Except for [L.sub.1] the Gini coefficient estimates are insensitive with respect to both estimation technique and choice of functional form. Despite yielding similar point estimates, the three estimation techniques yield very different standard errors.

We turn now to questions of goodness of fit and choice between alternative Lorenz functions. For a straight goodness-of-fit comparison, we compare values of information inaccuracy (Theil 1967, 1975). For testing of nested functional forms, we use likelihood ratio tests and the ML estimates.

Let $\hat{q}_i$ denote the predicted income shares obtained from an estimated model. Theil's (1967) measure of information inaccuracy is defined as

(17) $I = \sum_{i=1}^{M} q_i \log(q_i / \hat{q}_i).$

Estimated functions with smaller values of I are better fits than those with larger values. If the $q_i$ are similar to the $\hat{q}_i$, then knowing their values provides little information relative to knowledge of the predictions, and the function is a good fit. On the other hand, $q_i$ quite different from the $\hat{q}_i$ convey considerable information, leading to a large value of I and a poor fit. The information inaccuracy measure was computed using predictions from the NL and ML estimates, and for the Sarabia et al. estimates for functions [L.sub.2], [L.sub.3], and [L.sub.4]. The outcomes are presented in Table 3.
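A short sketch of the measure, assuming the usual form of Theil's information inaccuracy, the sum over i of q_i log(q_i / qhat_i), with q the observed shares and qhat the predicted shares. All share vectors below are hypothetical.

```python
import math

def information_inaccuracy(q, q_hat):
    """Theil's measure: sum of q_i * log(q_i / q_hat_i) over all shares."""
    return sum(qi * math.log(qi / qh) for qi, qh in zip(q, q_hat) if qi > 0.0)

q     = [0.05, 0.10, 0.15, 0.25, 0.45]   # hypothetical observed shares
close = [0.05, 0.11, 0.15, 0.24, 0.45]   # hypothetical predictions, good fit
far   = [0.15, 0.20, 0.20, 0.20, 0.25]   # hypothetical predictions, poor fit

i_close = information_inaccuracy(q, close)
i_far = information_inaccuracy(q, far)
print(i_close < i_far)
```

The measure is zero when predictions reproduce the observed shares exactly and grows as they diverge, which is why smaller values in Table 3 indicate better fits.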

For the Swedish data, ML estimation provides a better fit than NL for all functional forms. It also provides better fits than those from the technique suggested by Sarabia et al. for the functions they considered. The differences are not great for [L.sub.1], [L.sub.2], and [L.sub.3]; they are most noticeable for [L.sub.4] and [L.sub.5]. The large improvement of ML over NL in the case of [L.sub.5] is perhaps surprising, given the apparent similarity of the two sets of Lorenz curve estimates. A closer examination of the two sets of predictions for this case revealed that they were not as close as one might suspect from a comparison of parameter estimates. Also, NL led to some relatively large overpredictions that were penalized heavily by the information criterion. Finally, it is interesting that a ranking of the relative magnitudes of the ML standard errors for the Gini coefficient corresponds exactly to a goodness-of-fit ranking of the ML-estimated Lorenz functions.

The information inaccuracies for the Brazilian data lead to the same conclusions, with two small modifications. NL and ML estimation of [L.sub.5] had the same fit. NL provided a better fit than ML for [L.sub.1].

To provide information about choice of functional form we attempted to determine whether likelihood ratio tests suggested that nested versions of [L.sub.4] and [L.sub.5] would be adequate. The availability of these tests is one of the advantages of the ML methodology that we have proposed. Table 4 contains [chi square] values for likelihood ratio tests for various hypotheses. These results suggest that [L.sub.3] is an acceptable restricted version of [L.sub.4] for both Sweden and Brazil. Also, [L.sub.2] is an acceptable restricted version of [L.sub.4] for Sweden, but not for Brazil. Finally, a restricted version of [L.sub.2], obtained by setting [alpha] = 1, is clearly rejected relative to the best fitting [L.sub.5].
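The test decisions can be reproduced from the statistics in Table 4. The sketch below assumes a 5% significance level and one or two restrictions, which is what the critical values 3.841 and 5.991 imply; for these two cases the chi-square upper-tail probability has a closed form, so no statistics library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise NotImplementedError("sketch handles 1 or 2 restrictions only")

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """Likelihood ratio statistic 2 * (log L_unrestricted - log L_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# Statistics copied from Table 4: (Sweden, Brazil, number of restrictions).
table4 = {
    "L4 vs L2": (1.351, 5.333, 1),
    "L4 vs L3": (0.000, 0.015, 1),
    "L5 vs L2 (alpha = 1)": (36.907, 31.355, 2),
}
for name, (sweden, brazil, df) in table4.items():
    for country, stat in (("Sweden", sweden), ("Brazil", brazil)):
        p = chi2_sf(stat, df)
        verdict = "reject" if p < 0.05 else "do not reject"
        print(f"{name:22s} {country}: p = {p:.4f} -> {verdict}")
```

In an application, `lr_statistic` would be fed the maximized log-likelihoods of the unrestricted and restricted Lorenz functions.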

5. CONCLUSIONS AND SUMMARY

One way of estimating a Lorenz curve is to assume a particular distribution for income, estimate the parameters of that distribution, and derive the corresponding Lorenz curve. Another way is to assume a particular Lorenz curve and estimate its parameters. For this second approach we have suggested a distributional assumption and a corresponding estimation technique that is consistent with the proportional nature of Lorenz curve data, can be used to approximate share data from any income distribution, and can be employed with any Lorenz curve specification.

Our model and estimation technique was applied to two data sets that have been the subject of past analyses, one for Sweden, a country with relatively low inequality, and one for Brazil, a country with relatively high inequality. Results were obtained for five different Lorenz curve specifications. Our findings do not necessarily carry over to other data sets and other functions. With this fact kept in mind, we reached the following conclusions. Point estimation of the Gini coefficient was generally insensitive to choice of distributional assumption, estimation technique, and Lorenz curve specification. There were two exceptions to this conclusion. One was for the function [L.sub.1] applied to the Brazilian data, using the Dirichlet distribution. The second exception was the estimate from [L.sub.4] with the Swedish data and the estimation technique of Sarabia et al. The discrepancy obtained in this case appears to be a consequence of estimation instability associated with this function.

Although point estimation of the Gini coefficient was robust, assessment of the precision of estimation was not. It depended heavily on choice of functional form and choice of estimation technique. With respect to estimation technique, we found that ML estimation, under our proposal to use the Dirichlet distribution, provided the best fit. Useful future work would be a Monte Carlo study to assess whether the standard errors produced by each estimation technique are an accurate reflection of finite-sample variability of the estimates.

APPENDIX: EXPRESSIONS FOR VARIANCES OF THE GINI COEFFICIENT

For $L_1$: $\mathrm{var}(G) = \left[\dfrac{2\{e^{k}(e^{k} - k^{2} - 2) + 1\}}{\{k(e^{k} - 1)\}^{2}}\right]^{2} \mathrm{var}(k).$

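For $L_2$: $\mathrm{var}(G) = \left(\dfrac{\partial G}{\partial \alpha}\right)^{2} \mathrm{var}(\alpha) + \left(\dfrac{\partial G}{\partial \delta}\right)^{2} \mathrm{var}(\delta) + 2\,\dfrac{\partial G}{\partial \alpha}\,\dfrac{\partial G}{\partial \delta}\,\mathrm{cov}(\alpha, \delta),$

where

$\dfrac{\partial G}{\partial \alpha} = -2 \int_{0}^{1} \pi^{\alpha} \log(\pi)\,[1 - (1 - \pi)^{\delta}]\, d\pi$

and

$\dfrac{\partial G}{\partial \delta} = 2 \int_{0}^{1} \pi^{\alpha}\,(1 - \pi)^{\delta} \log(1 - \pi)\, d\pi.$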

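For $L_3$: $\mathrm{var}(G) = \left(\dfrac{\partial G}{\partial \delta}\right)^{2} \mathrm{var}(\delta) + \left(\dfrac{\partial G}{\partial \gamma}\right)^{2} \mathrm{var}(\gamma) + 2\,\dfrac{\partial G}{\partial \delta}\,\dfrac{\partial G}{\partial \gamma}\,\mathrm{cov}(\delta, \gamma),$

where

$\dfrac{\partial G}{\partial \delta} = 2 \int_{0}^{1} \gamma\,[1 - (1 - \pi)^{\delta}]^{\gamma - 1}\,(1 - \pi)^{\delta} \log(1 - \pi)\, d\pi$

and

$\dfrac{\partial G}{\partial \gamma} = -2 \int_{0}^{1} [1 - (1 - \pi)^{\delta}]^{\gamma} \log[1 - (1 - \pi)^{\delta}]\, d\pi.$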

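For $L_4$: $\mathrm{var}(G) = \left(\dfrac{\partial G}{\partial \alpha}\right)^{2} \mathrm{var}(\alpha) + \left(\dfrac{\partial G}{\partial \delta}\right)^{2} \mathrm{var}(\delta) + \left(\dfrac{\partial G}{\partial \gamma}\right)^{2} \mathrm{var}(\gamma) + 2\,\dfrac{\partial G}{\partial \alpha}\,\dfrac{\partial G}{\partial \delta}\,\mathrm{cov}(\alpha, \delta) + 2\,\dfrac{\partial G}{\partial \alpha}\,\dfrac{\partial G}{\partial \gamma}\,\mathrm{cov}(\alpha, \gamma) + 2\,\dfrac{\partial G}{\partial \delta}\,\dfrac{\partial G}{\partial \gamma}\,\mathrm{cov}(\delta, \gamma),$

where

$\dfrac{\partial G}{\partial \alpha} = -2 \int_{0}^{1} \pi^{\alpha} \log(\pi)\,[1 - (1 - \pi)^{\delta}]^{\gamma}\, d\pi$

$\dfrac{\partial G}{\partial \gamma} = -2 \int_{0}^{1} \pi^{\alpha}\,[1 - (1 - \pi)^{\delta}]^{\gamma} \log[1 - (1 - \pi)^{\delta}]\, d\pi$

$\dfrac{\partial G}{\partial \delta} = 2 \int_{0}^{1} \pi^{\alpha}\,\gamma\,[1 - (1 - \pi)^{\delta}]^{\gamma - 1}\,(1 - \pi)^{\delta} \log(1 - \pi)\, d\pi.$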

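For $L_5$: $\mathrm{var}(G) = \left(\dfrac{\partial G}{\partial a}\right)^{2} \mathrm{var}(a) + \left(\dfrac{\partial G}{\partial d}\right)^{2} \mathrm{var}(d) + \left(\dfrac{\partial G}{\partial b}\right)^{2} \mathrm{var}(b) + 2\,\dfrac{\partial G}{\partial a}\,\dfrac{\partial G}{\partial d}\,\mathrm{cov}(a, d) + 2\,\dfrac{\partial G}{\partial a}\,\dfrac{\partial G}{\partial b}\,\mathrm{cov}(a, b) + 2\,\dfrac{\partial G}{\partial d}\,\dfrac{\partial G}{\partial b}\,\mathrm{cov}(d, b),$

where

$\dfrac{\partial G}{\partial a} = 2 \int_{0}^{1} \pi^{d}\,(1 - \pi)^{b}\, d\pi$

$\dfrac{\partial G}{\partial d} = 2 \int_{0}^{1} a\,\pi^{d}\,(1 - \pi)^{b} \log(\pi)\, d\pi$

$\dfrac{\partial G}{\partial b} = 2 \int_{0}^{1} a\,\pi^{d}\,(1 - \pi)^{b} \log(1 - \pi)\, d\pi.$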

Table 1. Estimates and Standard Errors for Lorenz Parameters and Gini Coefficients, Sweden

                      [alpha]   [delta]   [gamma]    Gini

[L.sub.2]     NL       .5954     .6352               .3880
                      (.0136)   (.0052)             (.0013)
              ML       .6068     .6412               .3872
                      (.0206)   (.0085)             (.0041)
            Sarabia    .5960     .6400               .3850
                      (.0018)   (.0303)

[L.sub.3]     NL                 .7269    1.5602     .3871
                                (.0032)   (.0076)   (.0007)
              ML                 .7335    1.5767     .3877
                                (.0072)   (.0176)   (.0036)
            Sarabia              .7300    1.5620     .3860
                                (.0263)   (.0022)

[L.sub.4]     NL      -.7552     .7931    2.2893     .3864
                      (.5638)   (.0366)   (.5458)   (.00004)
              ML       .0048     .7330    1.5721     .3876
                      (.6612)   (.0756)   (.6369)   (.0036)
            Sarabia    .0769     .6490    1.1740     .3210
                      (.0003)   (.0977)   (.0002)

                         k
[L.sub.1]     NL      2.5029                         .3792
                      (.0826)                       (.0292)
              ML      2.5313                         .3828
                      (.1831)                       (.0228)

                         a         d         b
[L.sub.5]     NL       .7664     .9397     .5929     .3876
                      (.0148)   (.0138)   (.0108)   (.0010)
              ML       .7492     .9199     .5862     .3870
                      (.0143)   (.0093)   (.0109)   (.0031)

Table 2. Estimates and Standard Errors for Lorenz Parameters and Gini Coefficients, Brazil

                      [alpha]   [delta]   [gamma]    Gini

[L.sub.2]     NL       .5727     .2876               .6361
                      (.0223)   (.0019)             (.0012)
              ML       .5270     .2857               .6326
                      (.0383)   (.0053)             (.0052)
            Sarabia    .4900     .2780               .6350
                      (.0038)   (.0662)

[L.sub.3]     NL                 .3782    1.4357     .6328
                                (.0038)   (.0127)   (.0010)
              ML                 .3721    1.4160     .6325
                                (.0068)   (.0225)   (.0040)
            Sarabia              .3640    1.3960     .6340
                                (.0713)   (.0004)

[L.sub.4]     NL       .2169     .3467    1.2674     .6339
                      (.1950)   (.0289)   (.1473)   (.0013)
              ML       .0262     .3683    1.3950     .6325
                      (.2148)   (.0318)   (.1734)   (.0039)
            Sarabia    .0770     .6170    1.1740     .6440
                      (.0001)   (.1041)   (.0091)

                         k

[L.sub.1]     NL      5.3685                         .6368
                      (.6726)                       (.1647)
              ML      3.8438                         .5234
                      (.8237)                       (.0747)
                         a         d         b

[L.sub.5]     NL       .9151    1.0001     .2698     .6349
                      (.0030)   (.0024)   (.0016)   (.0003)
              ML       .9131     .9990     .2685     .6349
                      (.0044)   (.0024)   (.0021)   (.0013)

Table 3. Information Inaccuracy Measure

                     Sweden

              ML       NL     Sarabia

[L.sub.1]   .00888   .00892
[L.sub.2]   .00029   .00031   .00030
[L.sub.3]   .00025   .00027   .00026
[L.sub.4]   .00025   .00029   .01259
[L.sub.5]   .00017   .00032

                     Brazil

              ML       NL     Sarabia

[L.sub.1]   .10851   .08791
[L.sub.2]   .00056   .00067   .00070
[L.sub.3]   .00031   .00034   .00035
[L.sub.4]   .00031   .00038   .09710
[L.sub.5]   .00003   .00003

Table 4. The Likelihood Ratio Test

                                             Sweden   Brazil   Critical
                                                                value
[L.sub.4] vs. [L.sub.2]                       1.351    5.333    3.841

[L.sub.4] vs. [L.sub.3]                        .000     .015    3.841

[L.sub.5] vs. [L.sub.2] (with [alpha] = 1)   36.907   31.355    5.991

REFERENCES

Aigner, D. J., and Goldberger, A. S. (1970), "Estimation of Pareto's Law from Grouped Observations," Journal of the American Statistical Association, 65, 712-723.

Basmann, R. L., Hayes, K. J., Slottje, D. J., and Johnson, J. D. (1990), "A General Functional Form for Approximating the Lorenz Curve," Journal of Econometrics, 43, 77-90.

Beach, C. M., and Davidson, R. (1983), "Distribution-Free Statistical Inference with Lorenz Curves and Income Shares," Review of Economic Studies, 50, 723-735.

Bishop, J. A., Chakraborti, S., and Thistle, P. D. (1989), "Asymptotically Distribution-Free Statistical Inference for Generalized Lorenz Curves," Review of Economics and Statistics, 71, 725-727.

Chotikapanich, D. (1993), "A Comparison of Alternative Functional Forms for the Lorenz Curve," Economics Letters, 41, 129-138.

Datt, G. (1998), "Computational Tools for Poverty Measurement and Analysis," FCND Discussion Paper No. 50, Washington, DC: International Food Policy Research Institute, World Bank.

Gastwirth, J. L. (1972), "The Estimation of the Lorenz Curve and Gini Index," Review of Economics and Statistics, 54, 306-316.

Gastwirth, J. L., and Gail, M. H. (1985), "Simple Asymptotically Distribution-Free Methods Comparing Lorenz Curves and Gini Indices Obtained from Complete Data," in Advances in Econometrics 4, eds. R. L. Basmann and G. F. Rhodes, Jr., Greenwich, CT: JAI Press.

Jain, S. (1975), Size Distribution of Income, Washington, DC: World Bank.

Johnson, N. L. (1960), "An Approximation to the Multinomial Distribution: Some Properties and Applications," Biometrika, 47, 93-102.

Johnson, N. L., and Kotz, S. (1969), Discrete Distributions, New York: Wiley.

-- (1972), Distributions in Statistics: Continuous Multivariate Distributions, New York: Wiley.

Kakwani, N. C. (1980), "On a Class of Poverty Measures," Econometrica, 48, 437-446.

Kakwani, N. C., and Podder, N. (1973), "On Estimation of Lorenz Curves from Grouped Observations," International Economic Review, 14, 278-292.

-- (1976), "Efficient Estimation of the Lorenz Curve and Associated Inequality Measures from Grouped Observations," Econometrica, 44, 137-148.

McDonald, J. B. (1984), "Some Generalized Functions for the Size Distribution of Income," Econometrica, 52, 647-663.

McDonald, J. B., and Xu, Y. J. (1995), "A Generalization of the Beta Distribution with Applications," Journal of Econometrics, 66, 133-152.

Newey, W., and West, K. (1987), "A Simple Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, 703-708.

Ortega, P., Fernandez, M. A., Lodoux, M., and Garcia, A. (1991), "A New Functional Form for Estimating Lorenz Curves," Review of Income and Wealth, 37, 447-452.

Rasche, R. H., Gaffney, J., Koo, A., and Obst, N. (1980), "Functional Forms for Estimating the Lorenz Curve," Econometrica, 48, 1061-1062.

Ryu, H. K., and Slottje, D. J. (1996), "Two Flexible Functional Form Approaches for Approximating the Lorenz Curve," Journal of Econometrics, 72, 251-274.

Sarabia, J.-M., Castillo, E., and Slottje, D. J. (1999), "An Ordered Family of Lorenz Curves," Journal of Econometrics, 91, 43-60.

Shorrocks, A. F. (1983), "Ranking Income Distributions," Economica, 50, 3-17.

Theil, H. (1967), Economics and Information Theory, Amsterdam: North-Holland.

-- (1975), Theory and Measurement of Consumer Demand, Amsterdam: North-Holland.

Woodland, A. D. (1979), "Stochastic Specification and the Estimation of Share Equations," Journal of Econometrics, 10, 361-383.

[Received November 1999. Revised July 2001.]
