Bayesian correction for missing rich using a Pareto II tail with unknown threshold: Combining EU-SILC and WID data

Abstract

Survey data are known to under-report rich households while providing rich information on contextual variables. Tax data represent top incomes better, but lack contextual variables. The literature has therefore developed several methods to combine the two sources of information. For Pareto imputation, the question is how to choose the Pareto model for the right tail of the income distribution. The Pareto I model has the advantage of simplicity, but Jenkins (2017) promoted the use of the Pareto II for its nicer properties, reviewing three different approaches to correcting for missing top incomes. In this paper, we propose a Bayesian approach to combining tax and survey data, using a Pareto II tail. We build on the extreme value literature to develop a compound model in which the lower part of the income distribution is approximated by a Bernstein polynomial truncated density estimate, while the upper part is represented by a Pareto II. This provides a way to estimate the threshold at which the Pareto II tail starts. WID tax data are then used to build prior information on the Pareto coefficient, in the form of a gamma prior density that is combined with the likelihood function. We apply the methodology to the EU-SILC data set to decompose the Gini index. Finally, we analyse the impact of the top income correction on the Growth Incidence Curve between 2008 and 2018 for a group of 23 European countries.
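To make the prior-times-likelihood step concrete, here is a minimal Python sketch of a gamma prior on the Pareto coefficient being updated by a Pareto II likelihood. It is an illustration only and not the paper's procedure: the threshold `mu` and scale `sigma` are treated as known (the paper estimates the threshold within a compound model), the data are simulated rather than taken from EU-SILC, and the prior hyperparameters `prior_shape` and `prior_rate` are hypothetical values, not ones derived from WID. Under these simplifying assumptions the gamma prior is conjugate, so the posterior of the Pareto coefficient is again a gamma density.

```python
import math
import random


def pareto2_sample(alpha, sigma, mu, n, rng):
    """Draw n values from a Pareto II with survival function
    S(y) = (1 + (y - mu) / sigma) ** (-alpha) for y >= mu (inverse-CDF method)."""
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [mu + sigma * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0)
            for _ in range(n)]


def gamma_posterior(incomes, mu, sigma, prior_shape, prior_rate):
    """Update a Gamma(prior_shape, prior_rate) prior on alpha, with mu and sigma
    assumed known (a simplification made for this sketch).

    The Pareto II likelihood is proportional to alpha**n * exp(-alpha * t), with
    t = sum(log(1 + (y_i - mu) / sigma)), so the posterior is
    Gamma(prior_shape + n, prior_rate + t)."""
    tail = [y for y in incomes if y >= mu]  # only observations above the threshold
    t = sum(math.log1p((y - mu) / sigma) for y in tail)
    return prior_shape + len(tail), prior_rate + t


# Simulated tail incomes; every numerical value below is hypothetical.
rng = random.Random(0)
sample = pareto2_sample(alpha=2.5, sigma=30000.0, mu=60000.0, n=5000, rng=rng)

post_shape, post_rate = gamma_posterior(
    sample, mu=60000.0, sigma=30000.0, prior_shape=4.0, prior_rate=2.0
)
posterior_mean = post_shape / post_rate  # mean of a Gamma(shape, rate) density
print(f"posterior mean of the Pareto coefficient: {posterior_mean:.3f}")
```

With a large simulated tail the posterior mean is driven mostly by the data; with few observations above the threshold, as is typical when surveys under-report rich households, the prior carries more weight.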
