7  Conclusion

This analysis has investigated how survey responses from HES compare with the administrative data available in the IDI on a limited range of variables. Three types of comparative analysis were presented. The first compared the linked population to the unlinked population and found significant differences between the two populations in income and ethnicity, which would need to be addressed through adjustments to the calibration processes for Taxwell. The second compared HES income to IRD income. We found a strong overall correlation, although HES income is on average 1.5-6% higher than IRD income. This suggests that a tax model based on IRD income data could be somewhat more accurate. The third compared HES benefit measures to MSD benefit measures and concluded that, because the survey measures of benefit receipt compare poorly to the administrative measures, there is a strong case for incorporating administrative data on benefit receipt into Taxwell.

For researchers considering supplementing HES with administrative data, replacing the survey benefit measures with administrative data is where the largest gains are likely to be made. Replacing the HES income data with IRD data is the next logical step. However, some categories of income (such as investment income and irregular income) are collected in HES but not recorded in the linked IRD data, so which income source a researcher prefers will depend on their research question. Any of these changes would also require careful consideration of the survey weights and population definition.

