The Treasury

5.2 Model framework

5.2.1 A pooled model

This section outlines the modelling framework and assumptions. Assume that the desire to be in the labour force can be captured by a latent index denoted y*, which represents preference for work. The higher y*, the more likely an individual is to be in the labour force, and vice versa. When y* becomes positive, the preference for work is realised as yit = 1, and the i-th individual is observed to be in the labour force at time t.

This preference for work is likely to reflect such factors as age, health, household income, wealth, marital status, access to pensions and other financial benefits, and spousal work status. The first model employed is a pooled logit regression. This relates the probability of participation to a range of explanatory variables, whilst imposing uniform intercepts and slopes for all individuals, as in equation (1).
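Equation (1) can be written out from the definitions that accompany it. The form below is a reconstruction under stated assumptions: β and γ are assumed names for the common coefficient vectors, and the uniform intercept is taken to be absorbed into the covariates.

```latex
\begin{equation}
y^*_{it} = X_{it}\beta + Z_i\gamma + u_{it}, \qquad
y_{it} = \mathbf{1}\left[\, y^*_{it} > 0 \,\right]
\tag{1}
\end{equation}
```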

The latent index y* is a function of time-varying (X) and time-invariant (Z) covariates, in addition to an idiosyncratic error term (uit). This specification is susceptible to problems such as reverse causality and omitted-variable bias stemming from unobserved heterogeneity. As such, the resulting coefficients do not necessarily imply a causal relationship.
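Under a logit specification, the probability of participation is the logistic function of the systematic part of the latent index. The sketch below illustrates that mapping only; the function name, the coefficient values, and the example covariates are hypothetical and are not estimates from this paper.

```python
import math

def participation_probability(x_it, z_i, beta, gamma, intercept=0.0):
    """Logit probability that y*_it > 0, given the systematic part of the index."""
    index = intercept
    index += sum(b * x for b, x in zip(beta, x_it))   # time-varying covariates X
    index += sum(g * z for g, z in zip(gamma, z_i))   # time-invariant covariates Z
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical coefficients, shared by every individual (pooled model).
beta = [0.8]     # e.g. slope on a time-varying health score
gamma = [-0.5]   # e.g. slope on a fixed characteristic

p_low = participation_probability([0.0], [1.0], beta, gamma)
p_high = participation_probability([2.0], [1.0], beta, gamma)
```

Because the same `beta`, `gamma`, and intercept apply to everyone, any persistent difference between individuals that is not captured by X or Z ends up in the error term, which is the source of the bias described above.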

5.2.2 An extended model

A limitation of the pooled model is that it does not account for the potentially confounding effects of unobserved heterogeneity. In an attempt to address this issue, we theorise that participation in the labour force is a function of the characteristics that are observed and reported in the survey, as well as permanent unobservable characteristics which differ between individuals (eg, time preference, motivation).

This relationship can be estimated using a logit model with random effects, as in equation (2), where once again X and Z are time-varying and time-invariant regressors respectively, αi is a normally distributed individual-specific effect, and uit is an idiosyncratic error term. This model accounts for the unobserved effects αi; however, it rests on the restrictive assumption that the unobservables (αi) are random and independent of the covariates, which is unrealistic in this context. As such, a correlated random effects framework will be used, whereby the theorised relationship between time-varying covariates and permanent unobservables is explicitly parameterised, in an effort to account for these correlated unobservable differences between people (Chamberlain, 1984; Mundlak, 1978; Wooldridge, 2001). This is done by splitting the individual-specific term αi into a component that is correlated with time-varying observables, and a random error term ηit, which is assumed to be independent of covariates by construction.
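A plausible written form of equation (2) and of the split of αi is given below. It is a reconstruction under assumptions: β and γ are assumed coefficient names, and the projection on three periods of the time-varying covariates is inferred from the λ1, λ2, λ3 notation used in this section.

```latex
\begin{align}
y^*_{it} &= X_{it}\beta + Z_i\gamma + \alpha_i + u_{it} \tag{2} \\
\alpha_i &= X_{i1}\lambda_1 + X_{i2}\lambda_2 + X_{i3}\lambda_3 + \eta_{it} \tag{3}
\end{align}
```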

The coefficients λ represent the extent of the correlation between the unobserved heterogeneity and the time-varying covariates; that is, λ will equal zero only if the time-varying explanatory variables are unrelated to the unobservable effects. Following Mundlak (1978), λ is restricted to be the same over time, so λ1 = λ2 = λ3 = λ, leading to a simpler proxy for this relationship, given by:
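A reconstruction of this proxy, assuming X̄i denotes the individual-specific time average of Xit over the T observed periods:

```latex
\begin{equation}
\alpha_i = \bar{X}_i\lambda + \eta_{it}, \qquad
\bar{X}_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}
\tag{4}
\end{equation}
```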

Substituting (4) into (2) gives:
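Assuming equation (2) has the form y*it = Xitβ + Ziγ + αi + uit and equation (4) has the form αi = X̄iλ + ηit (β and γ being assumed coefficient names), the substitution would read:

```latex
\begin{equation}
y^*_{it} = X_{it}\beta + Z_i\gamma + \bar{X}_i\lambda + \eta_{it} + u_{it}
\tag{5}
\end{equation}
```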

Here, X̄i is the individual-specific average of a covariate, and ηit and Xit are assumed to be conditionally independent given X̄i.[4]

The final specification includes a term intended to capture differences between people (X̄i)[5], as well as a term which captures changes within individuals over time (Xit). As outlined earlier, there may be theoretical reasons to believe that the within and between effects differ, due to omitted person-specific explanatory variables which affect the mean level of health, yet do not imply a causal effect.
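In practice, the between-person term is built by attaching each person's time average of a covariate to every one of that person's observations before the model is estimated. The following is a minimal sketch of that data step only, not the authors' code; the panel values, the variable meanings, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical long-format panel: (person_id, wave, health_score).
panel = [
    (1, 1, 4.0), (1, 2, 3.0), (1, 3, 2.0),
    (2, 1, 5.0), (2, 2, 5.0), (2, 3, 5.0),
]

def add_individual_means(rows):
    """Append each person's time average of x (the X-bar_i term) to every row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for person, _wave, x in rows:
        totals[person] += x
        counts[person] += 1
    means = {person: totals[person] / counts[person] for person in totals}
    # Each output row carries both the time-varying value and the person mean.
    return [(person, wave, x, means[person]) for person, wave, x in rows]

augmented = add_individual_means(panel)
```

Both the original covariate and its person mean would then enter the random effects logit as regressors, so that the coefficient on the covariate reflects changes within individuals while the coefficient on the mean absorbs the correlated between-person component.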

Notes

  • [4] Note that any potential correlation between the errors and time-invariant regressors is not addressed, and this model relies on the assumption that the effects of unobserved heterogeneity are transmitted in a linear fashion through the individual-specific time averages.
  • [5] Specifically, the coefficients on time means represent the difference between the between and within effects.