The Treasury

5 Target populations

5.1  Approach

For each age group, a cluster analysis was undertaken to identify groups of individuals within the at-risk population, defined as those at extreme risk (in the top 5% of population risk) on at least one outcome measure. Multiple correspondence analysis was used to transform the key categorical predictors from the regression modelling into a smaller number of continuous variables, and these were then used to identify a number of clusters at each year of age, for females and males jointly.
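The at-risk definition above can be illustrated with a minimal sketch. It assumes each person has one modelled risk score per outcome measure and flags anyone who falls in the top 5% on at least one of them. The function name, the input layout and the handling of ties are assumptions made for illustration and are not taken from the analysis itself.

```python
def at_risk_flags(risk_scores, top_share=0.05):
    """Flag people in the top `top_share` of risk on at least one outcome.

    risk_scores: dict mapping an outcome name to a list of scores,
                 one score per person, in the same person order.
    Returns a list of booleans, one per person.
    """
    n = len(next(iter(risk_scores.values())))
    # Number of people treated as "extreme risk" for each outcome.
    k = max(1, round(n * top_share))
    flagged = [False] * n
    for scores in risk_scores.values():
        # Indices of the k highest scores; ties are broken by position
        # (an arbitrary choice made for this sketch).
        top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
        for i in top:
            flagged[i] = True
    return flagged


# Hypothetical scores for 20 people on two outcome measures.
example = {
    "welfare": [i / 20 for i in range(20)],
    "corrections": [(19 - i) / 20 for i in range(20)],
}
flags = at_risk_flags(example)
```

With 20 people, the top 5% is one person per outcome, so in this example the last person is flagged on the welfare score and the first on the corrections score.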

The youth population was then split into the late teen population (aged 15 to 19) and the early 20s population (aged 20 to 24), and we sought to identify a small number of target populations within each of these age groups. The aim was to identify clearly defined groups at risk of poor outcomes that aligned as closely as possible with the estimated risk from the regression analysis. Target population groupings were informed by the factors that were most predictive of poor outcomes in the regression analysis outlined in the previous section, as well as by the clusters identified through the correspondence and cluster analysis. They were constructed using the following guiding criteria:

  • Parsimony – target populations should be able to be identified using only a few criteria.
  • Separation – overlap between target populations should be minimised.
  • High sensitivity – most people identified as being at risk should fall into at least one target population.
  • High specificity – most people identified as not being at risk should fall outside of the target populations.
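Three of these criteria can be checked numerically once candidate groups are defined. The sketch below is illustrative only: it assumes a boolean at-risk flag per person and a boolean membership list per candidate target population, and the function and group names are hypothetical. Parsimony is a property of how the groups are defined, so it is not measured here.

```python
def evaluate_target_populations(at_risk, memberships):
    """Summarise how well candidate target populations match the at-risk flag.

    at_risk:     list of booleans, one per person.
    memberships: dict mapping a group name to a list of booleans
                 (True if the person belongs to that group).
    """
    n = len(at_risk)
    counts = [sum(m[i] for m in memberships.values()) for i in range(n)]
    in_any = [c > 0 for c in counts]

    positives = sum(at_risk)
    negatives = n - positives
    true_pos = sum(1 for i in range(n) if at_risk[i] and in_any[i])
    true_neg = sum(1 for i in range(n) if not at_risk[i] and not in_any[i])
    covered = sum(in_any)
    in_multiple = sum(1 for c in counts if c > 1)

    return {
        # Share of at-risk people who fall into at least one group.
        "sensitivity": true_pos / positives if positives else None,
        # Share of not-at-risk people who fall outside all groups.
        "specificity": true_neg / negatives if negatives else None,
        # Share of covered people who sit in more than one group
        # (lower values mean better separation).
        "overlap": in_multiple / covered if covered else None,
    }


# Hypothetical example: 10 people, 4 at risk, two candidate groups.
at_risk = [True, True, True, True, False, False, False, False, False, False]
groups = {
    "group_a": [True, True, False, False, True, False, False, False, False, False],
    "group_b": [False, True, True, False, False, False, False, False, False, False],
}
summary = evaluate_target_populations(at_risk, groups)
```

In this example three of the four at-risk people are covered (sensitivity 0.75), five of the six people not at risk fall outside both groups (specificity 5/6), and one of the four covered people belongs to both groups (overlap 0.25).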

In the end, five groups were identified in each age range. Between them, these groups covered a majority of the at-risk population, and there were no additional clearly identifiable groups that met the criteria above. For the early 20s population, risk was mainly defined using the welfare and corrections outcome measures, as health and education outcomes could have already occurred at these ages, which would conflate the risk and outcome measures.
