3 Data Description
The data used in the analyses in this paper come from the New Zealand Household Labour Force Survey (HLFS). This section begins with a brief discussion of the HLFS and then describes the characteristics of our analysis samples.
3.1 The Household Labour Force Survey
The HLFS is an ongoing quarterly survey that began in 1985 and is designed to produce a comprehensive range of statistics on the employed, the unemployed and those not in the labour force, who together comprise New Zealand's working-age population. The current target population for the survey is the civilian, non-institutionalised, usually resident New Zealand population aged 15 and over. The HLFS uses an eight-quarter rotating panel, with one-eighth of households rotating out each quarter. The panel is a representative sample of approximately 15,000 households and 30,000 individuals, all of whom have a statutory obligation to respond to the survey.[6]
In the first quarter that a household is in the frame, personal interviews are used to collect responses to both a household questionnaire and an individual questionnaire for each working-age person in the household; telephone interviews are used in the subsequent quarters. The HLFS collects information on the labour-force status, hours worked and educational status[7] of individuals and households, together with basic demographic information, but does not collect any wage or non-categorical income information. However, since 1997, the June quarter HLFS has included an extensive supplemental questionnaire known as the New Zealand Income Survey (HLFS-IS), which collects information on pre-tax income from wages and salaries, self-employment, government transfers and other sources for the purpose of producing a comprehensive range of income statistics. The core HLFS is often conducted by proxy interview, and missing responses in the Income Survey are imputed by Statistics New Zealand.
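The rotation and interview rules described above can be illustrated with a minimal Python sketch. It assumes equal-sized entry cohorts and integer quarter indices; the function names are hypothetical and are not part of any Statistics New Zealand system.

```python
# Illustrative sketch of an eight-quarter rotating panel.
# Assumptions (not from the survey documentation): quarters are integer
# indices and one equal-sized cohort of households enters every quarter.
PANEL_LENGTH = 8  # quarters a household remains in the sample


def quarters_in_sample(entry_quarter):
    """Quarter indices in which a household entering at entry_quarter is surveyed."""
    return list(range(entry_quarter, entry_quarter + PANEL_LENGTH))


def interview_mode(entry_quarter, quarter):
    """Personal interview in the first quarter in sample, telephone afterwards."""
    wave = quarter - entry_quarter + 1
    if wave < 1 or wave > PANEL_LENGTH:
        return "not in sample"
    return "personal" if wave == 1 else "telephone"


def share_rotating_out(quarter):
    """Share of cohorts in sample at `quarter` that are in their final wave."""
    cohorts = range(quarter - PANEL_LENGTH + 1, quarter + 1)
    leaving = [c for c in cohorts if quarter - c + 1 == PANEL_LENGTH]
    return len(leaving) / len(cohorts)
```

Under these assumptions, a household entering in quarter 3 is interviewed in person in quarter 3, by telephone in quarters 4 to 10, and is out of the sample from quarter 11; in steady state one of the eight cohorts leaves each quarter.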
3.2 The Samples
We construct samples of 16-25 year-olds from both the core HLFS and the HLFS-IS supplements. Our analysis of non-income-related outcomes (employment, hours worked, studying, unemployment and inactivity) uses quarterly data from the core HLFS over the period from the first quarter of 1997 to the third quarter of 2003. Our analysis of income-related outcomes (wages, receipt of non-student benefits, weekly earnings and weekly income) uses annual data from the 1997-2003 June quarter HLFS-IS supplements.[8] All our analyses include observations that have been obtained by proxy interview and/or for which any data has been imputed.[9] Although data from these observations are likely to contain significant measurement error, appendix table A1 shows that the observations are clearly not randomly distributed in the population.[10] Preliminary analysis suggested that omitting the large numbers of proxy and imputed responses could cause substantial sample selection bias. For this reason, in our regression analysis, we allow the relationship between all covariates in the models and the outcome of interest to differ for both proxy and imputed observations.
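One common way to let every covariate's relationship with the outcome differ for proxy and imputed observations is to interact the full covariate vector (including the intercept) with a proxy indicator and an imputed indicator. The Python sketch below builds one such design-matrix row. It is an illustration under that assumption, not the paper's exact specification, and the function name is hypothetical.

```python
def interacted_row(covariates, proxy, imputed):
    """Build one design-matrix row with proxy and imputed interactions.

    Layout (an assumed, illustrative layout):
      [1, x_1..x_k,                      baseline intercept and slopes
       proxy, proxy*x_1..proxy*x_k,      shifts for proxy observations
       imputed, imputed*x_1..imputed*x_k] shifts for imputed observations
    `proxy` and `imputed` are 0/1 indicators.
    """
    base = [1.0] + [float(x) for x in covariates]
    row = list(base)
    row += [proxy * value for value in base]
    row += [imputed * value for value in base]
    return row
```

For a non-proxy, non-imputed observation all interaction terms are zero, so its fitted value depends only on the baseline coefficients; for a proxy observation the second block adds a separate intercept shift and slope shifts.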
Table 2 presents summary statistics for key demographic characteristics and all outcome variables for our analysis samples. The first two columns pertain to the sample of quarterly data from the HLFS, while the latter two columns pertain to the sample of annual data from the HLFS-IS. The first and third columns describe the characteristics of the full samples, and the second and fourth columns describe the characteristics of the wage and salary workers in these samples. The summary statistics and regression results are estimated using sampling weights created by Statistics New Zealand, which increase the representativeness of the samples by taking account of the sample frame, non-random survey response and individual attrition.
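The role of the sampling weights in a summary statistic can be shown with a short Python sketch of a weighted mean. The values and weights in the usage note are made up for illustration and are not HLFS data.

```python
def weighted_mean(values, weights):
    """Mean of `values` in which each observation counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight
```

For example, with hypothetical employment indicators `[1, 0, 1, 1]` and weights `[100, 300, 50, 50]`, the unweighted employment rate is 0.75 but the weighted rate is 0.4, because the one non-employed respondent represents a larger share of the population.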
During 1997-2003, on average, 58 percent of 16-25 year-olds are employed as wage or salary workers, 30 percent are studying, 8 percent are unemployed, 17 percent are inactive and 15 percent receive a non-student benefit.[11] The average real wage of the wage and salary workers is $11.07 and they work around 31 hours per week. Our quarterly sample has 125,486 observations and our annual sample 31,371, an average of 465 in each age-quarter cell and 523 in each age-year cell, respectively.[12] With the exception of hours worked per week, which are 1.2 hours higher in the annual (June quarter) sample than in the quarterly sample, the characteristics of the two samples are almost identical.
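The average quarterly cell size can be checked with simple arithmetic. The sketch below assumes that an age-quarter cell is one single-year age (16 to 25) in one quarter between 1997Q1 and 2003Q3; that cell definition is an assumption made for illustration.

```python
# Assumed cell definition: single-year ages 16..25 crossed with every
# quarter from 1997Q1 to 2003Q3 inclusive.
n_ages = len(range(16, 26))          # 10 single-year ages
n_quarters = (2003 - 1997) * 4 + 3   # 6 full years plus 3 quarters of 2003
n_cells = n_ages * n_quarters

quarterly_observations = 125_486     # sample size reported in the text
avg_cell_size = quarterly_observations / n_cells

print(n_cells, round(avg_cell_size))
```

Under this assumption there are 270 age-quarter cells, and the average rounds to the 465 observations per cell reported above.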
Our analysis of employment, hours worked, hourly wages and weekly labour earnings focuses on wage and salary workers, as minimum wage laws do not apply to the self-employed.[13] The second and fourth columns of Table 2 pertain to the wage and salary workers in our data. Compared to the full samples, these workers are, on average, older; more likely to be male, married and of European ethnicity (and less likely to be Maori, Pacific Islander or Asian); less likely to be studying or receiving benefit income; and have higher total incomes.
Notes
- [6]The sampling frame for the HLFS is updated every five years following the New Zealand Census. When this occurs, more complicated panel rotation rules are used to reduce the transition period to the new frame.
- [7]The information collected in the HLFS does not allow us to accurately identify individuals who are studying if they have finished secondary school and are in the labour force. This makes it difficult to meaningfully compare study rates across groups with different participation rates.
- [8]Fifteen year-olds are not covered by minimum wage legislation and thus we exclude them from our current analysis. Although they might provide a suitable comparison group, the minimum school-leaving age of 16 means that all employed 15 year-olds will be school pupils working part-time.
- [9]HLFS proxy interviews are used for 50, 38, and 27% of 16-17, 18-19, and 20-25 year-olds respectively. The vast majority of these are conducted with one of the sample member’s parents. We only have information on whether any Income Survey data for a particular observation has been imputed: data has been imputed for 11, 13, and 13% of 16-17, 18-19, and 20-25 year-olds, respectively. Also, in rare circumstances, proxy interviews are used for the income survey: this occurs for 4, 2, and 2% of 16-17, 18-19, and 20-25 year-olds respectively.
- [10]Appendix tables A1 and A2 present summary statistics comparing the proxy and imputed data to non-proxy, non-imputed observations in the HLFS and HLFS-IS samples. One key finding is that teenagers who are employed are much more likely to have their data collected by a proxy interview.
- [11]Note that these states are not all mutually exclusive – e.g. students may also be working. Labour market inactivity is defined as neither working nor studying. Study rates are higher in the annual samples because we are able to use receipt of student benefits to further classify individuals as students.
- [12]In our regression analysis, 255 observations are dropped from models of hours worked and 145 observations are dropped from models of labour earnings because of missing data or zero values on these outcomes. An additional 64, 31, 17, and 4 observations are dropped from all models with covariates using the full quarterly sample, the wage and salary quarterly sample, the full annual sample, and the wage and salary annual sample, respectively, because of missing marital status or country of birth. To handle non-positive incomes, we have censored weekly incomes at the 1st percentile of the positive all-age income distribution.
- [13]We often refer to wage and salary workers simply as ‘workers’ throughout the paper. When the distinction is not obvious, we will be explicit.
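The censoring rule in note 12 can be sketched in Python as follows. The sketch assumes an unweighted nearest-rank percentile and treats the list passed in as the all-age income distribution; the note does not specify the percentile method or whether sampling weights are used, and the function names are hypothetical.

```python
import math


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (an assumed method)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def censor_from_below(incomes, pct=1.0):
    """Raise every income below the pct-th percentile of positive incomes to that value."""
    positive = sorted(x for x in incomes if x > 0)
    if not positive:
        raise ValueError("no positive incomes to define the censoring point")
    floor = nearest_rank_percentile(positive, pct)
    return [max(x, floor) for x in incomes]
```

With this rule, zero and negative weekly incomes are replaced by a small positive value, so a log transformation of income remains defined for every observation.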
