The Treasury

4 Predicting poor outcomes

In this section, regression analysis is used to identify more systematically which characteristics observed in the administrative data between ages 15 and 24 were most strongly associated with a higher likelihood of poor outcomes as an adult. The analysis is predictive in nature: it seeks to identify the factors most strongly associated with poor outcomes, and to use these to predict risk and identify target populations for investment decision making. It does not seek to understand the causes of poor outcomes and does not answer underlying questions of causality. A factor that predicts poor outcomes does not necessarily cause them.

We are restricted to using existing administrative data sources from each agency, and to the data collected in the IDI for the cohort we are studying. This meant, for example, that health data relating to early childhood could not be used in the study. The results are therefore less definitive about the relative importance of the various factors. Nevertheless, the analysis is useful for understanding which interactions with government agencies at particular stages of a young person's life are more strongly associated with poor outcomes later on.

As discussed in section 2, logistic regression models were run for the 1990/91 birth cohort for each year of age from the year they turned 15 to the year they turned 22 (ie, from the 2005/06 July-to-June year to the 2012/13 year). Predictive factors were selected for the models using a forward selection approach.[17] Models were run separately for males and females, on the basis that different risk factors were likely to be important for each gender. Models were also run separately for each of our four outcome measures. Because the health and educational outcomes are measured at an earlier age than the welfare and corrections outcomes, models for them were run only up to ages 20 and 21 respectively, to avoid the predictive factors becoming conflated with the outcome measures.
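The forward selection approach mentioned above can be illustrated with a minimal sketch. The source does not state which fit criterion or stopping rule was used, so the `score` function, the `min_gain` threshold and the factor names below are all hypothetical stand-ins; in practice `score` would be a fit statistic from a logistic regression estimated on the chosen factors.

```python
from typing import Callable, List, Sequence


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedy forward selection: repeatedly add the single factor that
    improves the score most, stopping when no factor improves it by more
    than min_gain. Both the scoring rule and the threshold are assumptions."""
    selected: List[str] = []
    remaining = list(candidates)
    best = score(selected)
    while remaining:
        # Score each candidate when added to the factors chosen so far.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top = max(trials, key=lambda t: t[0])
        if top_score - best <= min_gain:
            break  # no remaining factor adds enough predictive value
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected


# Toy stand-in for a model fit statistic (hypothetical numbers):
# each factor adds a fixed amount of fit, less a complexity penalty.
TOY_FIT = {"factor_a": 0.30, "factor_b": 0.10, "factor_c": 0.02}


def toy_score(factors: List[str]) -> float:
    return sum(TOY_FIT[f] for f in factors) - 0.05 * len(factors)


chosen = forward_select(list(TOY_FIT), toy_score)
```

With the toy numbers, the two stronger factors are added in order of their contribution and the weakest is left out, because adding it would lower the penalised score.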


  • [17] Appendix 1 summarises the number of models in which each factor was included at each age (out of a possible eight in each year up to age 20, six at age 21 and four at age 22).
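The model counts in the footnote follow from the design described above: four outcome measures, each modelled separately for males and females, with the health and educational models stopping at ages 20 and 21. A short sketch of that arithmetic is given below; the outcome labels are shorthand chosen for illustration, not names taken from the source data.

```python
# Last age at which each outcome is modelled (labels are illustrative shorthand).
LAST_AGE = {"welfare": 22, "corrections": 22, "education": 21, "health": 20}
GENDERS = ("male", "female")
AGES = range(15, 23)  # the year the cohort turned 15 through the year they turned 22


def models_at_age(age):
    """Return the (outcome, gender) pairs that have a model at this age."""
    return [
        (outcome, gender)
        for outcome, last in LAST_AGE.items()
        if age <= last
        for gender in GENDERS
    ]


counts = {age: len(models_at_age(age)) for age in AGES}
total_models = sum(counts.values())

for age, n in counts.items():
    print(f"age {age}: {n} models")
```

This gives eight models at each age from 15 to 20, six at age 21 once the health models drop out, and four at age 22 once the educational models also drop out.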