The Treasury

3 Methodology

The principal approach to the analysis of the survey of individuals is to estimate a series of multivariate regression models of the form:

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n + \varepsilon$$

in which a dependent variable Y is expressed as a linear function of a series of explanatory variables X1, ..., Xn plus an error term ε. Estimates are made of the coefficients βi together with their corresponding standard errors. The strength of this approach is that it allows the effect of a particular independent variable to be estimated while holding the other variables constant. For example, the expected level of retirement income (Y) might be associated with age and health status, amongst other independent variables. A bivariate analysis might show that older people expect lower retirement incomes. However, this apparent association may simply be due to the fact that self-reported health status declines with age, and those with poorer health, regardless of age, tend to have a lower expected retirement income. Only by including both age and health status in the model can the separate effect of each be revealed.
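
The confounding argument above can be illustrated with a minimal sketch. The data below are invented for illustration and are not taken from the survey: expected income is constructed to depend on health only, while health declines with age. The helper names `solve` and `ols` are hypothetical, and the fit uses the textbook normal equations rather than any particular statistics package.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Return OLS coefficients [intercept, b1, ...] for the given regressor columns."""
    rows = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented toy data: health (5 = excellent ... 1 = poor) declines with age.
age = [30, 30, 40, 40, 50, 50, 60, 60]
health = [5, 4, 4, 3, 3, 2, 2, 1]
# Assumption for illustration: income depends on health only, not on age.
income = [10 + 5 * h for h in health]

bivariate = ols([age], income)             # income regressed on age alone
multivariate = ols([age, health], income)  # income regressed on age and health

print("age slope, bivariate model:   ", bivariate[1])
print("age slope, multivariate model:", multivariate[1])
print("health slope, multivariate:   ", multivariate[2])
```

On this toy data the bivariate age slope is negative, whereas the model that also includes health attributes the whole association to health and leaves an age coefficient of essentially zero.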

The dependent variable may be a binary variable (eg, the respondent is a KiwiSaver member: Yes=1 or No=0) or a continuous variable (eg, the amount of any retirement income shortfall). The explanatory variables may be continuous (eg, net wealth), binary (eg, partner is a KiwiSaver member: Yes=1 or No=0), or categorical (eg, self-reported health status: Excellent, Very Good, Good, Fair, Poor).
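
As a rough sketch of how the three kinds of explanatory variable might enter a regression, the snippet below builds one row of regressors from a continuous, a binary and a categorical variable. The function name `encode_respondent` and the choice of "Excellent" as the omitted reference category are assumptions made for illustration; the source does not say which category was used as the reference.

```python
HEALTH_LEVELS = ["Excellent", "Very Good", "Good", "Fair", "Poor"]


def encode_respondent(net_wealth, partner_in_kiwisaver, health, reference="Excellent"):
    """Turn one respondent's answers into a row of numeric regressors."""
    if health not in HEALTH_LEVELS:
        raise ValueError(f"unknown health status: {health!r}")
    # Continuous variable enters as-is; binary variable is coded Yes=1, No=0.
    row = [float(net_wealth), 1.0 if partner_in_kiwisaver else 0.0]
    # Categorical variable becomes one 0/1 indicator per level, omitting the
    # reference level so the indicators are not collinear with the intercept.
    row += [1.0 if health == level else 0.0
            for level in HEALTH_LEVELS if level != reference]
    return row


example = encode_respondent(250000, True, "Good")
print(example)
```

Each coefficient on a health indicator is then read as a difference relative to the omitted reference category.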

In the case of a binary dependent variable, a logit model is fitted and marginal effects for each independent variable are estimated with all other variables held at their mean values. For example, to analyse the effect of employment status on whether a respondent is a KiwiSaver member, we fit a logit regression, hold all other variables (age, education, occupation, wealth, income, etc) at their mean levels, allow a change in labour force status (eg, from part-time to full-time employment), and derive an estimate of the marginal change in the probability of being a KiwiSaver member.
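
The calculation described above can be sketched as follows. This is a minimal illustration, not the estimation code used for the paper: the twelve observations are invented, the model has only two regressors (a full-time indicator and age), and the helper names `solve`, `fit_logit` and `predict` are hypothetical. The logit is fitted by Newton-Raphson, and the effect of moving from part-time to full-time is computed as the difference in predicted probability with age held at its sample mean.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def predict(beta, row):
    """Logit probability that the outcome equals 1 for one row of regressors."""
    return 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))


def fit_logit(rows, y, iterations=25):
    """Maximum-likelihood logit fit by Newton-Raphson; rows include a leading 1."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        p = [predict(beta, r) for r in rows]
        # Score vector X'(y - p) and information matrix X'WX with W = p(1 - p).
        grad = [sum((yi - pi) * r[j] for r, yi, pi in zip(rows, y, p)) for j in range(k)]
        info = [[sum(pi * (1 - pi) * r[i] * r[j] for r, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta


# Invented data: (full_time, age, kiwisaver_member).
data = [
    (0, 25, 0), (0, 35, 0), (0, 45, 1), (0, 55, 0), (0, 30, 1), (0, 50, 0),
    (1, 25, 1), (1, 35, 1), (1, 45, 0), (1, 55, 1), (1, 30, 0), (1, 50, 1),
]
rows = [[1.0, float(ft), float(age)] for ft, age, _ in data]
y = [float(member) for _, _, member in data]

beta = fit_logit(rows, y)
mean_age = sum(r[2] for r in rows) / len(rows)

# Change in predicted membership probability when full_time goes 0 -> 1,
# with the other regressor (age) held at its mean.
effect = predict(beta, [1.0, 1.0, mean_age]) - predict(beta, [1.0, 0.0, mean_age])
print("effect of full-time vs part-time at mean age:", round(effect, 3))
```

For a continuous regressor the analogous quantity is the derivative of the predicted probability with respect to that variable, again evaluated with the remaining variables at their means.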

In the case of a continuous dependent variable, the models are estimated using either Ordinary Least Squares or, where appropriate, a Heckman selection procedure.
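
For readers unfamiliar with the Heckman selection procedure, its standard textbook form is sketched below. The notation ($w_i$, $\gamma$, $u_i$, $\rho$, $\sigma$, $\lambda$) is introduced here for illustration; the source does not state which variables enter the selection equation or whether a two-step or maximum-likelihood variant was used.

$$
\begin{aligned}
\text{Selection:}\quad & s_i^{*} = w_i'\gamma + u_i, \qquad y_i \text{ is observed only if } s_i^{*} > 0 \\
\text{Outcome:}\quad & y_i = x_i'\beta + \varepsilon_i \\
\text{Implied:}\quad & E[\,y_i \mid s_i^{*} > 0\,] = x_i'\beta + \rho\,\sigma\,\lambda(w_i'\gamma), \qquad \lambda(z) = \frac{\phi(z)}{\Phi(z)}
\end{aligned}
$$

Here $(u_i, \varepsilon_i)$ are assumed bivariate normal with $\mathrm{Var}(u_i) = 1$, $\mathrm{Var}(\varepsilon_i) = \sigma^2$ and correlation $\rho$, and $\phi$ and $\Phi$ are the standard normal density and distribution function. When $\rho \neq 0$, Ordinary Least Squares on the observed sample omits the term in $\lambda$ and is therefore biased; the selection procedure corrects for this.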

In order to hold constant as many factors as possible, each regression contains an extensive set of conditioning variables. While the exact number of conditioning variables varies with the particular question being addressed, the overall set of variables used is listed in the Appendix. These include age, gender, income, wealth, number of children, labour force status, occupation, ethnicity, home-ownership, risk attitude, NZS as main source of retirement income, self-assessed health status, marital status, education, year joined KiwiSaver and the experience of traumatic event(s). Importantly, KiwiSaver membership is also included where appropriate. In general, when presenting the results of regression analysis, we have restricted the tables to include only those variables which are statistically significant.
