Appendix B
Survey methodology
When SoFIE commenced in 2002, a total of 15,000 households were approached, of which around 11,500 (77%) agreed to participate. In the initial interview, data was collected from around 22,000 individuals aged 15 and over. All respondents in the original sample (original sample members) are followed over time, even if their household or family circumstances change, forming a longitudinal sample. In later waves, new cohabitants of the sample members are interviewed but asked only a reduced set of questions. These additional sample members are not followed if, in future waves, they no longer live with the original sample member. For these reasons, only original sample members are included in this analysis. All SoFIE interviews are carried out face to face using computer-assisted interviewing.[37][38]
Statistics New Zealand provides a longitudinal weight which accounts for non-response and aligns the composition of the sample with that of the New Zealand population in October 2002. SoFIE interviews were conducted throughout the year with the sample spread evenly over the 12-month wave period. Each respondent is asked about the previous 12 months (their annual reference period). As a result of this continuous interviewing, there are 12 reference periods in each wave. Some variables collected in each wave of SoFIE, such as age, can be measured at the household interview date or at a point in the reference period. Figure B1 shows the relationship between these dates for a hypothetical SoFIE respondent.
At the end of the SoFIE health module, respondents were asked to give permission for their data to be linked to information on hospitalisations and cancer registrations held by the New Zealand Health Information Service back to 1990. For respondents who agreed to the data linkage and were successfully matched, it was possible to identify those listed on the Cancer Register as having been diagnosed with cancer.[39] As the linked information only goes back to 1990, this is only a measure of recent cancer diagnosis. Where descriptive (prevalence) statistics are based only on the linked sample, adjusted weights (adjusted longitudinal weights) were used to realign the sample with the population, in place of the weights provided by Statistics New Zealand (standard longitudinal weights).[40]
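The construction of the adjusted longitudinal weight is not described in this appendix (see note 40), so the following is only a minimal sketch of one common approach, a ratio adjustment within cells. The standard weights of linked respondents are scaled up so that, within each cell, they sum to the weighted total for all respondents. The cell definition, field names and figures are hypothetical and should not be read as the author's method.

```python
from collections import defaultdict

def adjust_weights(respondents):
    """Return {id: adjusted weight} for linked respondents only.

    Each respondent is a dict with hypothetical keys:
    'id', 'cell' (adjustment cell, e.g. an age group),
    'weight' (standard longitudinal weight) and 'linked' (bool).
    """
    total = defaultdict(float)   # weighted total of all respondents per cell
    linked = defaultdict(float)  # weighted total of linked respondents per cell
    for r in respondents:
        total[r["cell"]] += r["weight"]
        if r["linked"]:
            linked[r["cell"]] += r["weight"]

    adjusted = {}
    for r in respondents:
        if r["linked"]:
            # Scale so that linked respondents carry the whole cell's weight.
            adjusted[r["id"]] = r["weight"] * total[r["cell"]] / linked[r["cell"]]
    return adjusted

# Made-up example: respondent 2 did not consent to (or failed) linkage.
sample = [
    {"id": 1, "cell": "15-44", "weight": 100.0, "linked": True},
    {"id": 2, "cell": "15-44", "weight": 100.0, "linked": False},
    {"id": 3, "cell": "45-64", "weight": 150.0, "linked": True},
    {"id": 4, "cell": "45-64", "weight": 50.0, "linked": True},
]
adjusted = adjust_weights(sample)
```

In this example respondent 1 takes on the weight of the unlinked respondent in the same cell, while weights in the fully linked cell are unchanged, so the adjusted weights still sum to the original weighted total.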
Population and sample of interest
The questionnaire is administered only to those aged 15 and over. To ensure there is full information on respondents in all waves, the analysis focuses on those aged 15 and over at the end of the reference period in Wave 1 who remain eligible and respond in all three waves of the survey (adult longitudinal respondents). This balanced panel comprises 17,615 respondents in Waves 1-3, which corresponds to an unadjusted attrition rate of 20.5%. Once those who die or move out of the scope of the survey are removed, the adjusted attrition rate is 17.2%. Those over working age or who are full-time students in each wave are excluded from the analysis. The results are therefore representative of working age non-students in the usual adult resident population of New Zealand who lived in private dwellings on the main islands of New Zealand in 2002/03. Around three-quarters of the 17,615 adult longitudinal respondents are working age non-students in Waves 1, 2 or 3.[41][42]
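The two attrition rates differ only in their denominator. As a rough sketch, assuming the unadjusted rate is taken over everyone in the Wave 1 sample and the adjusted rate first drops those who died or moved out of scope from that base, the calculation might look like this. The function name and the example counts are illustrative only and are not published SoFIE figures.

```python
def attrition_rate(baseline, retained, out_of_scope=0):
    """Share of the eligible baseline sample lost by the final wave.

    baseline     -- respondents at Wave 1
    retained     -- respondents who answered every wave
    out_of_scope -- baseline members who died or left the survey's scope
    """
    eligible = baseline - out_of_scope
    if eligible <= 0:
        raise ValueError("eligible baseline must be positive")
    return 1 - retained / eligible

# Illustrative counts only: 1,000 at Wave 1, 800 retained, 40 out of scope.
unadjusted = attrition_rate(1000, 800)      # about 0.20
adjusted = attrition_rate(1000, 800, 40)    # 1 - 800/960, about 0.167
```

Removing out-of-scope cases shrinks the denominator, so the adjusted rate can never exceed the unadjusted one, which is the pattern seen in the 17.2% and 20.5% reported above.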
Figure B1 - SoFIE wave structure
Household is selected for interview - January 2003
Wave 1 (October 2002 to September 2003)
- Household interview date - usually a day in January 2003*
- Annual reference period - January 2002 to December 2002
- Household interview date - usually a day in January 2004*
- Annual reference period - January 2003 to December 2003
- Household interview date - usually a day in January 2005*
- Annual reference period - January 2004 to December 2004
* This date could be later if there are problems contacting respondent or arranging an interview; however, even if this moves into February or March the reference period will not change.
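The mapping in Figure B1 from a scheduled interview month to its annual reference period can be sketched as follows. This assumes, in line with the hypothetical respondent above, that the reference period is the 12 calendar months ending in the month before the scheduled interview month. The function name is hypothetical.

```python
def reference_period(year, month):
    """Return ((start_year, start_month), (end_year, end_month)).

    year, month -- the month in which the household interview is scheduled.
    A late interview (say February instead of January) should still be
    passed in with its scheduled month, since the period does not move.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    start = (year - 1, month)  # same calendar month, one year earlier
    end = (year, month - 1) if month > 1 else (year - 1, 12)
    return start, end

# Household scheduled for interview in January of each wave.
wave1 = reference_period(2003, 1)  # January 2002 to December 2002
wave2 = reference_period(2004, 1)  # January 2003 to December 2003
wave3 = reference_period(2005, 1)  # January 2004 to December 2004
```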
Limitations and strengths of SoFIE
The SoFIE data has a few limitations. As with all surveys, there is potential for non-response error, that is, error that arises because not all potential respondents take part in the survey. Unlike in cross-sectional surveys, non-response in longitudinal surveys has a second element, as respondents can also choose whether to respond in each wave. If this non-response (known as attrition) is non-random (that is, the characteristics of those who do respond are systematically different from those who do not), then any inferences based on analyses of the data may be biased. In addition, where longitudinal data is linked to other sources, information is only observed for part of the sample (those who agree to the linkage), and this selection could also be non-random and potentially bias results. While there are differences in the response, consent and matching rates in SoFIE, there are no groups of interest that do not contain any respondents. The weights (both the standard weights provided by Statistics New Zealand and the adjusted weights that take account of non-consenters) go some way towards restoring the distribution of respondents over the variables of interest, and any resulting bias should be small when making inferences about the population as a whole.[43] However, it should be remembered that, because SoFIE is a longitudinal survey, those who are most unhealthy will die or move into institutions where they may not be able to be traced, meaning that the SoFIE population is likely to be healthier than the wider New Zealand population it represents.
A further limitation is that not all variables are available in all waves. An indicator for psychiatric conditions is only available in Wave 3, and an indicator for cancer is only available for the subset of respondents who agreed to their data being matched to the Cancer Registrations database and were successfully linked. This potentially reduces the sample size considerably if only Wave 3 matched consenters are considered. Making an assumption about the presence of psychiatric conditions for Waves 1 and 2, and coding the non-consenters' cancer status as “unknown” rather than missing, goes some way to countering this problem, allowing analysis to be undertaken on all three waves rather than the restricted sample.
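The coding of cancer status described above might look like the sketch below, in which respondents without a linked record get their own category instead of being dropped from the analysis. The argument names and category labels are hypothetical.

```python
def cancer_status(consented, matched, on_register):
    """Three-level cancer indicator that keeps unlinked respondents."""
    if not (consented and matched):
        # No linked record, so the diagnosis cannot be observed.
        return "unknown"
    return "diagnosed" if on_register else "not diagnosed"

statuses = [
    cancer_status(True, True, True),     # linked and on the Cancer Register
    cancer_status(True, True, False),    # linked, no registration found
    cancer_status(False, False, False),  # did not consent to linkage
]
```

Because "unknown" is a valid category rather than a missing value, these respondents stay in models that would otherwise exclude any case with missing data.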
Although SoFIE is a longitudinal survey, only three waves of information are currently available. Three waves provide a wealth of information, but for variables that do not change very frequently, such as the diagnosis of new diseases, modelling their impact with such a short span of data is difficult.
Lastly, if dependants of respondents have ill health or chronic diseases this may also affect the respondent's labour market participation. The SoFIE questionnaire does not allow “carers” to be identified except when the ill health of a family member is given as a reason for inactivity. In addition, when people do report the ill health of a family member as a reason for inactivity the cause of ill health cannot be identified or attributed to a specific chronic disease or illness. The effect of this on labour market participation is therefore not explored in this analysis.
Despite its limitations, SoFIE collects a wealth of information on respondents over time. This allows a range of respondents' labour market transitions, durations and repeat occurrences to be analysed. It allows comparison of labour market activity and disease presence at more than one point in time. Further, attempts to account for the presence of unobserved variables can be made, given that the same respondent is being monitored over time. The linking of SoFIE data to cancer and hospitalisation information adds further depth to the SoFIE data, and this additional information is subject to less reporting error than additional questioning of respondents.
While there are differences in response and consent rates by respondent characteristics, for a longitudinal survey of this kind the response and consent rates are high by international standards.
Notes
- [37] Full details of the sampling design for SoFIE can be found here: http://www2.stats.govt.nz/domino/external/pasfull/pasfull.nsf/84bf91b1a7b5d7204c256809000460a4/4c2567ef00247c6acc256fab0082e7fc?OpenDocument. There was no formal oversampling of specific groups; however, stratification was used in the first stage of the sample selection to try to ensure sufficient representation in the survey from specific groups. The strata were defined according to region; urban/rural; high/low Māori population density and other socio-economic variables derived from the most recent census.
- [38] The full SoFIE questionnaire can be found here: http://www2.stats.govt.nz/domino/external/quest/sddquest.nsf/12df43879eb9b25e4c256809001ee0fe/14d945bb95ab2bbbcc256fb70077b3bb?OpenDocument.
- [39] Around 80% of all SoFIE respondents agreed for their data to be linked. Of these, 97% were linked successfully.
- [40] More information on the adjusted weight is available from the author.
- [41] Those respondents with a missing value for any of the variables of interest in a particular wave are excluded from the models for data based on that wave. The number of missing values is small and analysis indicates they appear to be random.
- [42] Respondents can change status with regard to being a student or moving out of working age over the survey period. Therefore there are not always three responses for each respondent in the analysis, even though the balanced panel is the starting point for the analysis (ie, the student and working age criteria make the panel unbalanced).
- [43] More information on sample attrition and consent in SoFIE and the adjusted weights is available from the author.
