The Treasury

3.2  Methodology

Two descriptive methodologies are used to examine differences in economic performance and economic growth across the five ‘regional’ areas and four Auckland zones.

3.2.1  Graphical Analysis

Graphs are used to display the mean (average level) of each of the six measures of economic performance in each year for the five areas and four Auckland zones. These graphs indicate the relative level of economic performance in each area/year without controlling for any of the underlying differences in the population of these areas. A second set of graphs displays the mean of each measure in each year indexed to its 1997 level in the same area or zone. These indicate the relative change in economic performance in each area/year, again without controlling for any of the underlying differences in the population of these areas.
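The indexing step can be illustrated with a minimal sketch. The area label "Area B" and all wage figures are hypothetical, and scaling the base year to 100 is an assumed convention rather than one stated in the text.

```python
# Hypothetical mean hourly wages by area and year (illustrative values only).
mean_wage = {
    "Auckland": {1997: 16.0, 2000: 17.0, 2004: 20.0},
    "Area B": {1997: 15.0, 2000: 16.5, 2004: 18.0},
}


def index_to_base(series, base_year=1997):
    """Rescale a yearly series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Each area is indexed to its own 1997 level, so the indexed series
# show relative change rather than relative level.
indexed = {area: index_to_base(series) for area, series in mean_wage.items()}

for area, series in indexed.items():
    print(area, series)
```

In this made-up example Auckland has the higher level in every year, and the indexed series show that it also grew faster between 1997 and 2004 (25 percent against 20 percent).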

Graphs are also used to display the overall distribution (i.e. the 10th, 25th, 50th (median), 75th, and 90th percentiles) of the two Hourly Wage measures in each year for the five ‘regional’ areas. We compare wages across areas at different points in the distribution to give further insight into relative productivity performance in each area. Again, these graphs do not control for any of the underlying differences in the population of the different areas in New Zealand. Thus, we next turn to a regression analysis to examine whether the graphical differences are statistically significant and whether controlling for differences in the attributes of individuals living in different areas ‘explains’ differences in regional economic performance.[20]
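A short sketch of the distributional comparison follows. The wage values and the label "Area B" are hypothetical, the calculation is unweighted, and linear interpolation between order statistics is only one of several common percentile definitions.

```python
def percentile(values, p):
    """Percentile p (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical hourly wages for two areas in a single year.
wages = {
    "Auckland": [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30],
    "Area B": [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
}

points = [10, 25, 50, 75, 90]
distribution = {
    area: {p: percentile(values, p) for p in points}
    for area, values in wages.items()
}

# Gap relative to Auckland at each point in the distribution.
gap = {p: distribution["Area B"][p] - distribution["Auckland"][p] for p in points}
print(gap)
```

Here the gap relative to Auckland widens from the 10th to the 90th percentile, which a comparison of means alone would not show.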

3.2.2  Regression Analysis

Individual-level regression models are estimated using IS unit record data. Ordinary Least Squares (OLS) regression models are used to estimate mean differences in the six measures of economic performance across the five ‘regional’ areas and separately across the four Auckland zones in parallel to the graphical analysis. All of these estimates are weighted so that inferences apply to the full prime-age population. Quantile regression models, also known as Least Absolute Value (LAV) regressions, are used to estimate hourly wage differences at various points in the distribution across the five ‘regional’ areas. Sample weights cannot be incorporated into these analyses. Instead, bootstrapping (replication analysis) is used to estimate standard errors for these models.[21]
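The bootstrapping idea can be sketched as follows. This is not the paper's estimation code: the records, the area label "Other" and the function names are hypothetical, and resampling whole primary sampling units (PSUs) with replacement within each area is an assumed design. With no control variables, a median regression on an area indicator corresponds to the difference in area medians, so that simpler statistic stands in for the LAV estimate.

```python
import random
import statistics

# Hypothetical records: (psu_id, area, hourly_wage). Illustrative values only.
records = [
    (1, "Auckland", 20.0), (1, "Auckland", 24.0),
    (2, "Auckland", 18.0), (2, "Auckland", 30.0),
    (3, "Auckland", 22.0), (3, "Auckland", 26.0),
    (4, "Other", 15.0), (4, "Other", 19.0),
    (5, "Other", 17.0), (5, "Other", 25.0),
    (6, "Other", 16.0), (6, "Other", 21.0),
]


def median_gap(rows):
    """Median hourly wage in 'Other' minus the median in 'Auckland'."""
    auckland = [wage for _, area, wage in rows if area == "Auckland"]
    other = [wage for _, area, wage in rows if area == "Other"]
    return statistics.median(other) - statistics.median(auckland)


def cluster_bootstrap_se(rows, statistic, n_reps=100, seed=0):
    """Bootstrap standard error, resampling whole PSUs within each area."""
    rng = random.Random(seed)
    rows_by_psu = {}
    psus_by_area = {}
    for row in rows:
        psu, area, _ = row
        rows_by_psu.setdefault(psu, []).append(row)
        psus_by_area.setdefault(area, set()).add(psu)

    estimates = []
    for _ in range(n_reps):
        sample = []
        for area in sorted(psus_by_area):
            psus = sorted(psus_by_area[area])
            # Draw as many PSUs as the area has, with replacement, and
            # keep every record belonging to each drawn PSU.
            for psu in rng.choices(psus, k=len(psus)):
                sample.extend(rows_by_psu[psu])
        estimates.append(statistic(sample))
    # The spread of the replicate estimates approximates the standard error.
    return statistics.stdev(estimates)


point_estimate = median_gap(records)
standard_error = cluster_bootstrap_se(records, median_gap)
print(point_estimate, standard_error)
```

Resampling PSUs rather than individual records keeps people from the same cluster together in each replicate, which is what allows the resulting standard error to reflect clustering. The default of 100 replications matches the number reported in the notes.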

Two specifications of the OLS and LAV regression models are estimated for each measure of economic performance:

(1) The levels specification examines the relative level/distribution of economic performance in each area/zone. This specification estimates the relative difference in the average level (or a different point in the distribution) of each outcome measure during the sample period in each region compared to Auckland (or in each zone of Auckland compared to the Northern zone).

(2) The growth specification examines the relative change in economic performance in each area/zone between 1997 and 2004. This specification estimates the relative difference in the change in the outcome measure between 1997 and 2004 in each region compared to Auckland (or in each zone of Auckland compared to the Northern zone).[22]
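The distinction between the two specifications can be illustrated with simple cell means. This is a simplified reading that assumes only the two end years, no weights and no control variables; the records and the label "Area B" are hypothetical, and the regression models themselves may be parameterised differently.

```python
from statistics import mean

# Hypothetical records: (area, year, hourly_wage). Illustrative values only.
records = [
    ("Auckland", 1997, 15.0), ("Auckland", 1997, 17.0),
    ("Auckland", 2004, 19.0), ("Auckland", 2004, 21.0),
    ("Area B", 1997, 13.0), ("Area B", 1997, 15.0),
    ("Area B", 2004, 18.0), ("Area B", 2004, 20.0),
]


def cell_mean(area, year=None):
    """Mean wage for an area, optionally restricted to a single year."""
    return mean(
        wage for a, y, wage in records
        if a == area and (year is None or y == year)
    )


# Levels: average outcome over the sample period relative to Auckland.
levels_gap = cell_mean("Area B") - cell_mean("Auckland")

# Growth: change between 1997 and 2004 relative to the change in Auckland.
growth_gap = (
    (cell_mean("Area B", 2004) - cell_mean("Area B", 1997))
    - (cell_mean("Auckland", 2004) - cell_mean("Auckland", 1997))
)

print(levels_gap, growth_gap)
```

In this made-up example the area sits below Auckland in levels (a gap of -1.5) while growing faster over the period (a gap of +1.0), so the two specifications answer different questions.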

We present results from both:

(1) “unadjusted” regression models that include no control variables besides indicator variables controlling for aggregate differences in outcomes across years; and

(2) “adjusted” regression models that include additional control variables for age, gender, ethnicity, immigration status, educational qualifications, occupation, industry, and employment type.[23]

The “adjusted” regression models allow individuals with particular characteristics to have different outcomes than other individuals regardless of their location and reveal whether controlling for differences in the attributes of individuals living in different areas ‘explains’ differences in regional economic performance.[24] These models are purely descriptive models which partition the difference in economic performance across areas into components that are correlated with differences in attributes and those which are ‘unexplained’. These attributes account statistically for the ‘explained’ variation in outcomes across areas, but do not necessarily have a causal impact on economic performance.[25]
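A small sketch shows how an “adjusted” model can ‘explain’ a gap. The data are hypothetical and constructed so that wages depend only on holding a degree; the regression is unweighted, uses a single control variable, and is solved with a hand-written routine, none of which reflects the paper's actual estimation.

```python
def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical individuals: (hourly_wage, lives_outside_auckland, has_degree).
# Wages are 10 without a degree and 15 with one, in both areas.
people = [
    (15, 0, 1), (15, 0, 1), (15, 0, 1), (10, 0, 0),  # Auckland
    (15, 1, 1), (10, 1, 0), (10, 1, 0), (10, 1, 0),  # other area
]
wage = [p[0] for p in people]

# "Unadjusted": intercept and area indicator only.
unadjusted = ols([[1, p[1]] for p in people], wage)

# "Adjusted": add the qualification indicator as a control.
adjusted = ols([[1, p[1], p[2]] for p in people], wage)

print("unadjusted area gap:", unadjusted[1])
print("adjusted area gap:", adjusted[1])
```

The unadjusted gap of -2.5 falls to zero once the qualification indicator is included, because in this constructed example the areas differ only in the share of degree holders. As the text notes, this is a statistical partition of the gap and not evidence that qualifications cause the difference.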

Notes

  • [20]Statistical significance is an important criterion for judging whether observed differences in the data are ‘true’, but caution should be used in relying solely on this criterion because, with the smaller sample sizes used for this regional analysis, ‘true’ differences may not be statistically significant.  The overall magnitude of the differences and of trends over time are also useful criteria for judging the ‘economic’ significance of different results.
  • [21]The reported standard errors for both the OLS and LAV models are adjusted to account for the clustering of survey households at the primary sampling unit (PSU) level.  One hundred replications are used for bootstrapping standard errors.
  • [22]The ‘growth’ specification of the LAV model estimates the relative difference in the change in the outcome measure at a particular point in the distribution of that measure between 1997/98 and 2003/04 in each region compared to Auckland.
  • [23]Age is controlled for using seven indicator variables for five-year age groups.  Ethnicity is controlled for using ten indicator variables for complex combinations of multiple ethnicities (i.e. instead of using prioritised ethnicity, we account explicitly for multiple response).  Immigration status is controlled for using seven indicator variables capturing whether an individual is born in New Zealand, and if not, how long they have been in New Zealand.  The categories are: 5 years or less, 6 to 10 years, 11 to 20 years, 21 to 30 years, more than 30 years, and foreign born but unknown number of years in New Zealand.  Educational qualifications are controlled for using seven indicator variables combining information on highest school qualifications, vocational qualifications, and university qualifications to create a more complete picture of an individual’s educational background.  The categories are: no qualifications, only vocational qualifications, only school certificate (or other low school qualification), school certificate and vocational qualifications, only sixth-form (or other higher school qualification), sixth-form and vocational qualifications, university qualifications.  Occupation is controlled for using twenty-six indicator variables for each of the two-digit NZSOC90 classification groups.  Industry is controlled for using fifty-four indicator variables for each of the two-digit ANZSIC classification groups.  Employment type is controlled for using four indicator variables for: wage/salary employment, employer of others, self-employed, unpaid family worker.  Additional indicator variables are included to control for missing values in any of the above characteristics.  Industry, occupation, and employment type cannot be included as control variables when the examined outcome is employment.  Appendix Table 3 displays the percentage of the working-age population in each region/zone with certain characteristics.  This is displayed for each of the covariates used in the regression models, but some are simplified to more aggregate categories.
  • [24]We estimate no models where the ‘returns’ to characteristics are allowed to vary across areas; thus, we are assuming that regional differences in economic performance are homogeneous across the population.
  • [25]It is not possible to identify causal effects because differences in the attributes of the population in each area are likely to be endogenously related to past and expected future economic performance.