The Treasury

Executive summary

Purpose of the paper

This paper summarises findings from an analysis of integrated administrative data that aims to identify the characteristics of young people aged 15 to 24 who are most at risk of poor long-term outcomes. The work is part of a broader emphasis by government agencies on targeting services more effectively towards those most in need, and reflects the recognition that such an approach requires better evidence about who these at-risk groups are.

The Treasury has identified the need for the state sector to play a particular role in helping the most disadvantaged to participate in society and the economy and has noted the importance of agencies doing this through working innovatively and collaboratively across agency boundaries.[1] This has driven the development of a 'social investment' approach to decision making about government investment in social services. The social investment approach involves “using information and technology to better understand the people who need public services and what works, and then adjusting services accordingly”.[2]

This work represents one of a number of steps towards implementing a social investment approach. The overall work programme was led by the Ministry of Education. The analysis summarised in this paper was led by the Treasury's Analytics and Insights team, in collaboration with a number of other agencies, using integrated administrative data held in Statistics New Zealand's Integrated Data Infrastructure (IDI).

The results of this analysis are also described in the accompanying A3 document entitled 'Youth at risk: Identifying a target population', produced by the Ministry of Education. This paper provides a general description of the process adopted, presents a descriptive analysis of the populations of interest, summarises the results of the modelling work undertaken and describes the target populations identified through the project. It also provides some guidance to assist with interpreting the A3 document.

Research objectives

This work has three aims: to identify which risk factors between the ages of 15 and 24 are most strongly associated with poor long-term outcomes at ages 25 to 34; to identify target populations between the ages of 15 and 24 who are most at risk of experiencing poor long-term outcomes; and to identify some of the larger fiscal costs associated with those target populations.

Data and methods

The study uses the Integrated Data Infrastructure (IDI), which brings together information from a wide range of government departments. Records are linked using name and date of birth. The data is anonymised and used only for research purposes.

The main analysis is a birth cohort analysis, which focuses on those born between 1 July 1990 and 30 June 1991, who can be observed through to age 21 in the data set. The analysis describes the key characteristics and outcomes that could be observed for this cohort at various ages and the various service use patterns and outcomes that were experienced by different subgroups within this population. The future outcomes of this birth cohort out to age 35 are also estimated using a statistical record linkage technique, in which data for an older birth cohort is linked to that of the 1990/91 cohort.
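The cohort definition above can be illustrated with a minimal sketch. It assumes each record carries a full date of birth; the function name, identifiers and sample records are hypothetical and are not drawn from the IDI or from the paper's own code.

```python
from datetime import date

# Cohort window as stated in the text: born 1 July 1990 to 30 June 1991 inclusive.
COHORT_START = date(1990, 7, 1)
COHORT_END = date(1991, 6, 30)


def in_birth_cohort(birth_date: date) -> bool:
    """Return True if birth_date falls inside the 1990/91 cohort window."""
    return COHORT_START <= birth_date <= COHORT_END


# Hypothetical (id, date of birth) records, made up for illustration only.
people = [
    ("a", date(1990, 6, 30)),  # one day before the window opens
    ("b", date(1990, 7, 1)),   # first day of the window
    ("c", date(1991, 6, 30)),  # last day of the window
    ("d", date(1991, 7, 1)),   # one day after the window closes
]

cohort_ids = [pid for pid, dob in people if in_birth_cohort(dob)]
```

Treating both boundary dates as inclusive is an assumption based on the wording "between 1 July 1990 and 30 June 1991".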

This is complemented by an analysis of the characteristics and outcomes of the current youth population, defined as being aged 15 to 24 as at the end of December 2013. We are able to describe these young people's interactions with selected social services up to the end of December 2013. Projected future outcomes and selected service costs are also estimated for this population using data for other birth cohorts and statistical record linkage techniques.
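As a rough sketch of how membership of the current youth population might be determined, the following computes age in completed years at a reference date. It assumes "the end of December 2013" means 31 December 2013 and that age is measured in whole completed years; the function names are hypothetical.

```python
from datetime import date

# Assumed reference date for "the end of December 2013".
REFERENCE_DATE = date(2013, 12, 31)


def age_on(birth_date: date, on_date: date) -> int:
    """Age in completed years on on_date."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def in_youth_population(birth_date: date) -> bool:
    """Return True if the person is aged 15 to 24 inclusive at the reference date."""
    return 15 <= age_on(birth_date, REFERENCE_DATE) <= 24
```

Under these assumptions, someone born on 31 December 1998 is included (aged 15 on the reference date), while someone born on 31 December 1988 is not (aged 25).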

Limitations and caveats

The study has a number of limitations and caveats:

  • The scope of the study is limited by the nature and breadth of the information collected in agencies' administrative systems and included in the IDI. For example, the administrative data used in this work provides only a partial picture of childhood adversity, service use and service costs.
  • Population coverage errors, linkage errors and biases in the data mean that the results are unlikely to be completely accurate and should be viewed as broad estimates of scale.
  • The methods used to estimate future outcomes and costs are designed to provide a comparative picture of future outcomes and costs for different population subgroups, but they have some significant limitations. These estimates should not be viewed as forecasts of the actual outcomes and costs that will be incurred in the future.

While the results highlight the power of using integrated administrative data in new and innovative ways, this is the first time some of the data has been used in this way, and as such, these results should be considered as preliminary and will need further testing and development over time.

The caveats and limitations are discussed in more detail later in the paper.

Key findings

  • Integrated administrative data can be a powerful tool for government and other agencies to identify at-risk groups in the population. Limitations in some of the data mean that the findings of this analysis need to be treated with some caution. However, the results provide a useful insight into the lives of at-risk youth. The data used for this type of analysis will continue to improve over time.
  • A number of characteristics that are predictive of poor future outcomes can be identified throughout a person's early life. These include early contact with government agencies such as Child, Youth and Family (CYF), demographic characteristics and geographic location, characteristics of the young person's caregiver, and early outcomes evident in data from the education, corrections, welfare and health systems. These characteristics can be used to quantify risk at an individual level and to identify the size and characteristics of at-risk groups of young people at different ages.
  • The characteristics that are predictive of future outcomes change over time. As young people progress into early adulthood, poor future outcomes become directly evident through contact with the benefit, corrections and health systems. Whilst it becomes easier to predict poor outcomes as a young person ages, these outcomes may become more difficult to influence.
  • It is possible to identify groups of at-risk youth at different ages using a small set of identifying characteristics. However, these predictions are by no means perfect. Young people who are identified as being at risk are highly likely to have poor future outcomes, but a large number of people have poor outcomes despite not falling into one of these defined groupings.
  • In general, geographic location is strongly associated with risk of poor outcomes, with location-based measures such as the New Zealand Deprivation Index (NZDep) and territorial authority area being important predictors of risk, even controlling for other observed characteristics. Youth at risk of poor outcomes tend to be concentrated in specific areas such as the Far North, Kawerau, Opotiki and Wairoa. However, it is important to note that the largest numbers of at-risk youth still live in larger urban centres such as Manukau, Waitakere, Hamilton and Christchurch.
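
The point made above, that identified at-risk groups are highly likely to have poor outcomes while many people with poor outcomes fall outside those groups, corresponds in standard classification terms to high precision combined with low recall. The sketch below shows the arithmetic only. The counts are hypothetical, chosen purely for illustration, and are not results from this analysis; the paper itself does not use these terms.

```python
def precision_recall(flagged_poor: int, flagged_ok: int, unflagged_poor: int):
    """Return (precision, recall) for a simple at-risk flag.

    flagged_poor:   flagged as at risk and later had a poor outcome
    flagged_ok:     flagged as at risk but did not have a poor outcome
    unflagged_poor: not flagged but later had a poor outcome
    """
    precision = flagged_poor / (flagged_poor + flagged_ok)
    recall = flagged_poor / (flagged_poor + unflagged_poor)
    return precision, recall


# Hypothetical counts for illustration only; not taken from the study.
precision, recall = precision_recall(
    flagged_poor=80, flagged_ok=20, unflagged_poor=320
)
```

With these made-up counts, 80 of the 100 flagged people have poor outcomes, yet the flag captures only 80 of the 400 people with poor outcomes overall.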