Comparing the Household Economic Survey to administrative records: An analysis of income and benefit receipt

4  Comparing the linked people to the unlinked people and summary statistics


  1. The linked and unlinked populations differ on socioeconomic and demographic characteristics. Relative to the unlinked population, people in the linked population have higher incomes, are more likely to be male, are more likely to report European ethnicity and are less likely to be on a benefit.

  2. Restricting analysis to the linked HES-IDI population means the survey weights would need to be adjusted to make the linked population representative of the New Zealand population.
  3. About 82% of HES income and 96% of IRD income is comparable. Wages and salaries make up 84% of comparable HES income and 77% of comparable IRD income.

This section provides summary statistics for the HES and IRD income data and compares the linked sample to the unlinked sample across selected socioeconomic and demographic variables. We compare the linked to the unlinked sample on HES variables, as administrative data are, by definition, not available for unlinked people. Caution is required when interpreting comparisons based on HES reported measures, as there may be a correlation between being linked and reporting error.[16] All comparisons are on characteristics that can be assigned to the individual. All results are unweighted.

Table 1 compares the linked and unlinked populations across demographic and socioeconomic variables available from HES, while Table 2 shows summary statistics of IRD income (which is naturally only available for the linked population). Most variables in Table 1 show statistically significant differences between the linked and unlinked populations, though the absolute size of a difference matters too, whether or not it is statistically significant. In general, the linked HES population has higher average income, has a higher average age, is more likely to be of European ethnicity, is more likely to be male and is less likely to be on a benefit.
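As a rough illustration of how a significance star in Table 1 might arise, the sketch below applies a pooled two-proportion z-test to the "Female (proportion)" row. The paper does not state which test was used, so the choice of test is an assumption, as is treating the two groups as simple random samples (ignoring household clustering). The inputs are the rounded, unweighted figures from Table 1, reading the first data column as the unlinked group, which is the reading consistent with the text.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-sided z-test for a difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Table 1, "Female (proportion)": 0.543 (n = 10,626) vs 0.523 (n = 53,136)
z, p_value = two_proportion_z_test(0.543, 10_626, 0.523, 53_136)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Under these assumptions the p-value falls below 0.01, which is consistent with the three stars reported for that row.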

We have not looked at whether these distributional differences make the linked subsample more or less representative of the New Zealand population. However, because the linked population is different to the unlinked population, it follows that different survey weights would need to be calculated. This calculation is beyond the scope of this paper, and the relevant variables to weight on would depend on a study's research question.
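To make the idea of recalculated weights concrete, the following is a minimal sketch of one possible approach: a cell-based adjustment that scales each linked respondent's survey weight by the inverse of the weighted link rate in their cell. This is not the method of this paper, which does not calculate new weights. The record keys (`cell`, `weight`, `linked`), the choice of cells and the toy data are all hypothetical.

```python
from collections import defaultdict

def adjust_weights_for_linkage(records):
    """Scale linked respondents' weights by the inverse weighted link rate in their cell.

    Each record is a dict with hypothetical keys 'cell', 'weight' and 'linked'.
    Returns one adjusted weight per record (None for unlinked records).
    """
    total = defaultdict(float)   # sum of weights, all respondents, by cell
    linked = defaultdict(float)  # sum of weights, linked respondents only, by cell
    for r in records:
        total[r["cell"]] += r["weight"]
        if r["linked"]:
            linked[r["cell"]] += r["weight"]

    adjusted = []
    for r in records:
        if r["linked"]:
            adjusted.append(r["weight"] * total[r["cell"]] / linked[r["cell"]])
        else:
            adjusted.append(None)
    return adjusted

# Made-up data for illustration only
records = [
    {"cell": "female", "weight": 100.0, "linked": True},
    {"cell": "female", "weight": 100.0, "linked": False},
    {"cell": "male", "weight": 100.0, "linked": True},
    {"cell": "male", "weight": 100.0, "linked": True},
    {"cell": "male", "weight": 50.0, "linked": False},
]
adjusted = adjust_weights_for_linkage(records)
print(adjusted)
```

After the adjustment, the linked respondents' weights in each cell sum to the original weighted total for that cell. Which variables define the cells would depend on the research question.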

There are plausible reasons for the observed differences in link rates. Females are more likely to change their name due to marriage (and divorce), and some ethnicities may have more variety in name transliteration or shortening, all of which would present difficulties in the linking process. These groups have lower incomes on average, which may explain why the unlinked population has lower average income.

Table 2 shows summary statistics for IRD data. Naturally this is only for the linked population. The next section compares Table 1 and Table 2.

Table 1: Linked vs unlinked population on selected variables within HES
| Variable | Unlinked (1) | Linked (2) | Difference (3) | % difference (4) |
|---|---|---|---|---|
| Number of people | 10,626 | 53,136 | | |
| Income († = not part of compared income) | | | | |
| Total HES income - all categories including those not in IRD (mean) | 36,460 | 41,425 | -4,966*** | -12% |
| HES compared income (mean) | 29,884 | 34,609 | -4,725*** | -14% |
| Wages and salaries (mean) | 25,079 | 29,052 | -3,973*** | -14% |
| Self-employment income (mean) | 1,754 | 2,098 | -343 | -16% |
| Benefit income (mean)† | 2,262 | 2,019 | 244*** | 12% |
| Investment income (mean)† | 1,326 | 1,603 | -277** | -17% |
| Overseas income (mean)† | 930 | 613 | 317*** | 52% |
| Other regular income (mean)† | 907 | 1,182 | -275*** | -23% |
| Irregular income (mean)† | 1,074 | 1,261 | -187 | -15% |
| Age (mean) | 43.7 | 45.9 | -2.2*** | -5% |
| Female (proportion) | 0.543 | 0.523 | 0.02*** | 4% |
| European (proportion) | 0.678 | 0.779 | -0.101*** | -13% |
| Māori (proportion) | 0.143 | 0.113 | 0.03*** | 27% |
| Pacific (proportion) | 0.083 | 0.056 | 0.027*** | 48% |
| Asian (proportion) | 0.132 | 0.082 | 0.05*** | 61% |
| Middle Eastern/Latin American/African (proportion) | 0.014 | 0.009 | 0.005*** | 56% |
| Other ethnicity (proportion) | 0.03 | 0.031 | -0.001 | -3% |
| New benefits (HES 2014/15 only) | | | | |
| Proportion with JSS | 0.05 | 0.041 | 0.01* | 24% |
| Proportion with SPS | 0.029 | 0.021 | 0.008** | 38% |
| Proportion with SLP | 0.032 | 0.026 | 0.006 | 23% |
| Old benefits (HES 2006/07-2012/13) | | | | |
| Proportion with UB | 0.027 | 0.02 | 0.007*** | 35% |
| Proportion with SB | 0.023 | 0.019 | 0.004** | 21% |
| Proportion with DPB | 0.033 | 0.028 | 0.005** | 18% |
| Proportion with IB | 0.024 | 0.023 | 0.001 | 4% |

Notes: This table reports comparisons between linked and unlinked HES members across a range of variables. Dollar values are rounded to the nearest dollar, mean age to one decimal place and proportions to three decimal places. Due to rounding, the difference in column (3) may not equal column (1) minus column (2). Column (4) expresses the difference as a percentage of the column (2) value; it is based on rounded data and so is quite approximate for the smaller proportions. Stars denote: * p<0.1, ** p<0.05, *** p<0.01.
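The arithmetic behind columns (3) and (4) can be checked from the rounded figures. The sketch below recomputes both columns for a few rows of Table 1; the function name `compare` is illustrative only. For these rows, the published percentages are reproduced when the column (2) value is used as the denominator.

```python
def compare(col1, col2):
    """Return (difference, percentage difference) in the style of columns (3) and (4)."""
    diff = col1 - col2
    pct = 100 * diff / col2  # column (2) value as the denominator
    return diff, pct

# (column 1, column 2) values copied from Table 1
rows = {
    "HES compared income (mean)": (29_884, 34_609),
    "Wages and salaries (mean)": (25_079, 29_052),
    "Overseas income (mean)": (930, 613),
    "Asian (proportion)": (0.132, 0.082),
}
for label, (c1, c2) in rows.items():
    diff, pct = compare(c1, c2)
    print(f"{label}: difference = {diff:,.3f}, percentage difference = {pct:.0f}%")
```

Because the inputs are already rounded, small discrepancies against column (3) are expected in other rows, as the table notes say.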

Table 2: IRD income composition (linked sample)
| Income source | Mean income | Percentage with non-zero income | Mean conditional on non-zero income |
|---|---|---|---|
| Wages and salaries | 25,497 | 60.5% | 42,115 |
| Benefits† | 1,113 | 11.7% | 9,541 |
| IR20 director income | 1,659 | 5.1% | 32,523 |
| PAYE director income | 851 | 2.0% | 42,326 |
| Withholding tax director income | 9 | 0.0% | 20,870 |
| ACC payments† | 201 | 2.0% | 10,030 |
| IR20 partner income | 652 | 3.7% | 17,852 |
| PAYE partner income | 29 | 0.1% | 31,396 |
| NZ Superannuation | 2,879 | 18.2% | 15,814 |
| Paid parental leave | 52 | 1.1% | 4,713 |
| IR3 income (sole-trader) | 956 | 5.7% | 16,834 |
| Sole-trader receiving PAYE deducted income | 5 | 0.0% | 18,655 |
| Sole-trader withholding tax income | 556 | 3.1% | 17,727 |
| IR3 rental income | 34 | 1.8% | 1,815 |
| Student allowance | 107 | 2.2% | 4,854 |
| IRD income - all sources | 34,600 | 90.0% | 38,432 |
| Total income - comparable sources | 33,286 | 84.5% | 39,393 |

† non-compared category
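The three columns of Table 2 are linked by a simple identity: the unconditional mean is approximately the share with non-zero income multiplied by the mean conditional on non-zero income. The sketch below checks this for a few rows using the rounded published figures, and recomputes the comparable share of IRD income and the wage and salary share of comparable IRD income. Variable names are illustrative only.

```python
# (mean income, share with non-zero income, mean conditional on non-zero income)
table2 = {
    "Wages and salaries": (25_497, 0.605, 42_115),
    "NZ Superannuation": (2_879, 0.182, 15_814),
    "IRD income - all sources": (34_600, 0.900, 38_432),
    "Total income - comparable sources": (33_286, 0.845, 39_393),
}

for label, (mean, share, conditional_mean) in table2.items():
    implied_mean = share * conditional_mean
    print(f"{label}: published mean = {mean:,}, implied mean = {implied_mean:,.0f}")

# Shares quoted in the key points, recomputed from Table 2
comparable_share = 100 * 33_286 / 34_600   # comparable IRD income / all IRD income
wage_share = 100 * 25_497 / 33_286         # wages and salaries / comparable IRD income
print(f"Comparable share of IRD income: {comparable_share:.0f}%")
print(f"Wage and salary share of comparable IRD income: {wage_share:.0f}%")
```

The implied means differ slightly from the published ones because the percentages are rounded to one decimal place.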


  • [16] For example, if individuals misreport their age, it is unlikely they will be correctly linked.
