The Treasury

2 Data

This paper uses unit record data from two household surveys conducted by Statistics New Zealand. The first is the Survey of Family, Income and Employment (SoFIE) and the second is the Household Economic Survey (HES).

SoFIE, the primary data source for this study, is a longitudinal survey in which the original sample members are tracked and surveyed each year. It began in October 2002 with an original sample of about 11,500 households, amounting to over 22,000 individuals aged 15 years and over. It concluded in September 2010 after running annually for a total of eight years (waves). The core survey collects information on individual and family characteristics, as well as labour market and income spells. In addition, a health module and an assets and liabilities module are included in alternate years.

At the time of this analysis, only the first seven waves of SoFIE were available. The assets and liabilities module was included in three of these waves (waves 2, 4 and 6) and is required for our examination of house prices, ownership and affordability. Interviews for each wave were spread evenly over a 12-month period, so that some households were interviewed in October and others as late as the following September. To place values on a common basis, we index all asset values to the mid-point of the relevant wave. Asset values for wave 2 are therefore indexed to approximately 31 March 2004, wave 4 asset values to 31 March 2006 and wave 6 asset values to 31 March 2008.

Indexation was particularly important during this period, as strong house price growth could lead to non-trivial increases in individuals' net wealth even within the interview period of a single wave. Fortunately, respondents in SoFIE were asked not only for the value of any residential property they owned but also for a valuation date. We used this date, together with detailed regional house price indices from Quotable Value (QV), aggregated to the six major SoFIE regions, to index housing assets to the mid-point of the relevant wave.[1] For all other assets the Consumer Price Index (CPI) was used.
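The indexation step can be illustrated with a minimal sketch. It assumes the price index is available as a mapping from (year, quarter) to an index level; the function names, the quarterly frequency and the index levels are hypothetical and are not QV or CPI figures. Only the wave mid-points are taken from the text.

```python
from datetime import date

# Mid-points of the waves that carried the assets and liabilities module.
WAVE_MIDPOINTS = {2: date(2004, 3, 31), 4: date(2006, 3, 31), 6: date(2008, 3, 31)}


def quarter_key(d):
    """Map a date to a (year, quarter) key; quarterly frequency is an assumption."""
    return (d.year, (d.month - 1) // 3 + 1)


def index_to_midpoint(value, valuation_date, wave, price_index):
    """Scale a reported value by the ratio of the index at the wave
    mid-point to the index at the respondent's valuation date."""
    base = price_index[quarter_key(valuation_date)]
    target = price_index[quarter_key(WAVE_MIDPOINTS[wave])]
    return value * target / base


# Hypothetical regional house price index levels (illustrative only).
hypothetical_index = {(2005, 1): 1000.0, (2006, 1): 1150.0}

# A house valued at $400,000 in February 2005 and reported in wave 4.
adjusted = index_to_midpoint(400_000, date(2005, 2, 15), 4, hypothetical_index)
```

With these illustrative index levels, the reported value is scaled up by 15 per cent to reflect price growth between the valuation date and the wave 4 mid-point.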

Another issue is that only the total value of all mortgages is recorded in SoFIE. There is no information about the number of mortgages or the property against which each mortgage is held. Because of the associated tax benefits, investment properties usually have high loan-to-value ratios. Consistent with Le et al. (2012), we therefore allocate mortgages to investment properties up to their asset value, with any remaining mortgage value then allocated to the owner-occupied property.
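The allocation rule can be written as a short sketch. The function name and the household figures are hypothetical; the rule itself (investment property first, capped at its asset value, with the remainder going to the owner-occupied property) follows the description above.

```python
def allocate_mortgage(total_mortgage, investment_value, owner_occupied_value):
    """Split a single reported mortgage total across property types.

    Debt is assigned to investment property first, capped at its asset
    value; whatever remains is assigned to the owner-occupied property.
    owner_occupied_value is accepted for completeness but does not cap
    the remainder, since the text states no such cap.
    """
    to_investment = min(total_mortgage, investment_value)
    to_owner_occupied = total_mortgage - to_investment
    return to_investment, to_owner_occupied


# Hypothetical household: $500,000 of mortgage debt, a $300,000 rental
# property and a $600,000 owner-occupied home.
investment_debt, home_debt = allocate_mortgage(500_000, 300_000, 600_000)
```

In this example $300,000 of debt is assigned to the rental property and the remaining $200,000 to the home.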

SoFIE required a great deal of careful cleaning in order to minimise the loss of observations due to question non-response or apparent errors in the recording of individual information.[2] Wherever possible we used the longitudinal nature of the data to correct for these problems. For example, if we observed an individual owning a house worth just $1 in wave 4, we would examine their housing assets in other waves. If that same person owned a house worth, say, $900,000 in wave 2 and $1,100,000 in wave 6, we changed the value recorded in wave 4 to $1,000,000, the mid-point of the two. Similar anomalies or non-response were observed across most of the variables used in this analysis and are too numerous to list here. For more information about SoFIE and some of the problems researchers can expect to encounter, see, for example, Scobie and Henderson (2009) or Carter et al. (2009).
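The house value example can be expressed as a small sketch. It is an illustration of the idea only: the plausibility threshold `floor`, the function name and the rule of averaging the two adjacent module waves are assumptions made here, and the actual cleaning may well have been done case by case.

```python
def repair_implausible_value(values_by_wave, wave, floor=1_000):
    """Replace an implausible value with the mean of the adjacent module waves.

    `floor` is a hypothetical plausibility threshold, not one from the paper.
    Module waves are two waves apart (2, 4 and 6).
    """
    value = values_by_wave.get(wave)
    if value is not None and value >= floor:
        return value  # looks plausible; leave unchanged
    neighbours = [values_by_wave.get(wave - 2), values_by_wave.get(wave + 2)]
    usable = [v for v in neighbours if v is not None and v >= floor]
    if len(usable) < 2:
        return value  # not enough information to repair
    return sum(usable) / len(usable)


# The example from the text: $900,000 in wave 2, $1 in wave 4, $1,100,000 in wave 6.
house_values = {2: 900_000, 4: 1, 6: 1_100_000}
repaired = repair_implausible_value(house_values, 4)
```

Here the wave 4 value is replaced with $1,000,000, matching the correction described above. A value is left untouched when either neighbouring wave is missing or itself implausible.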

For most of our analysis using SoFIE the sample was restricted to individuals aged 25 years and older. For descriptive analysis, weighting of survey responses was necessary. However, Statistics New Zealand only provides longitudinal survey weights for respondents who were original in-scope sample members,[3] so a further restriction of the sample was required. For regression analysis we elected not to apply survey weights, which allowed the use of significantly more observations, because many of the control variables included in the regressions are those on which the construction of weights would be based.

Finally, as SoFIE was not designed to collect detailed expenditure data, we also make use of HES. This allows us to examine, for example, the pattern of rental or mortgage expenditures over time as well as patterns of detailed housing tenure.[4] We employ HES data going back to 1983. For more information about HES see, for example, Perry (2011).

Notes

  • [1]In a number of cases respondents failed to provide valuation dates. In these cases we assumed that the gap between the respondent's interview date and valuation date was equal to the average gap among respondents who did provide valuation dates. This gap was between two and three years, depending on the survey wave.
  • [2]To construct a usable panel data set for analysis, SoFIE also required a great deal of manipulation and formatting, as the data were originally stored in around 20 separate files with different (often incompatible) formats.
  • [3]Though preferred for the current analysis, cross-sectional weights were not provided. Longitudinal weights are for the 2002 New Zealand population, regardless of survey wave.
  • [4]SoFIE and HES are not linked in any way. In other words, different individuals are surveyed in each case. Therefore we are not able to link the respective respondents’ expenditures and assets, for example.