The Treasury

Health and Labour Force Participation WP 10/03

4.3  Modelling the health effect

4.3.1  Modelling methods and issues

Standard logistic regressions were the starting point for this analysis. Binomial and multinomial logistic regression models were fitted to the data to quantify the relationship between: the presence of different chronic diseases and labour force status; and self-rated health and labour force status (while holding all other variables constant). The binomial and multinomial models use the available characteristics of people to predict the chance of being in each labour market state. All other characteristics can then be held constant to determine the impact of a small change in one characteristic on the chance of participating. In this cross-sectional analysis, the responses from each wave were pooled, so that each respondent had up to three responses in the data. Standard binomial or multinomial logistic regressions were then fitted to these pooled data (these models are hereafter referred to as pooled logistic regressions). This “pooling” maximises the data available for analysis. Correlation between the error terms for the same respondent across waves was allowed for by treating each person as a cluster. Full details of the model and methods used in this paper can be found in Appendix C.
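
The pooling step can be illustrated with a minimal Python sketch. The wave data, respondent identifiers and field names below are invented for illustration and are not taken from the survey; the sketch only shows how responses from up to three waves become person-wave rows that carry a cluster identifier.

```python
# Hypothetical wave data: respondent id -> (has_chronic_disease, participates).
# Respondent "r2" did not respond in wave 2.
waves = {
    1: {"r1": (0, 1), "r2": (1, 0), "r3": (0, 1)},
    2: {"r1": (0, 1), "r3": (1, 1)},
    3: {"r1": (1, 1), "r2": (1, 0), "r3": (1, 0)},
}


def pool_waves(waves):
    """Stack all waves into one list of person-wave rows."""
    rows = []
    for wave, responses in sorted(waves.items()):
        for person, (disease, participates) in sorted(responses.items()):
            rows.append({
                "cluster": person,  # same person across waves = same cluster
                "wave": wave,
                "disease": disease,
                "participates": participates,
            })
    return rows


pooled = pool_waves(waves)

# Count how many responses each person contributes to the pooled data.
responses_per_person = {}
for row in pooled:
    key = row["cluster"]
    responses_per_person[key] = responses_per_person.get(key, 0) + 1

print(len(pooled), responses_per_person)
```

In an actual estimation the `cluster` field would be handed to whichever routine computes cluster-robust standard errors, so that repeated responses from one person are not treated as independent.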

The results of binomial logistic regressions can be presented in two main ways:

  • Probability - This is the chance that a respondent with certain characteristics participates in the labour market. In a logit model a marginal effect is the relationship between a small change in a variable and the change in the probability of the outcome. As an example, where the characteristic of interest is a binary variable (such as disease present/not present), the difference between the probabilities of the outcome (participating) for two groups (which share all the same characteristics other than for the binary variable) is known as the marginal effect.
  • Odds ratio - This is defined as the ratio of the odds of an event occurring in one group to the odds of it occurring in another group.[6] For example, the ratio of the odds of participating for those with chronic diseases to the odds of participating for those with no chronic diseases. The odds ratio for a variable is equal to the exponential of its coefficient, with all other factors held constant. An odds ratio greater than one indicates a positive effect (higher odds of the outcome), whilst one between zero and one indicates a negative effect (lower odds). It is important to remember that a relative change in odds is not the same thing as a relative change in probabilities. In general, the magnitude of the odds ratios will be larger than that of the marginal effects because the two measures summarise the results on different scales.
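
The two presentations can be written out using the standard textbook form of the binomial logit model. The notation is an assumption made here for illustration (x is the vector of other characteristics, β its coefficients, d the binary disease indicator with coefficient β_d); the paper's own specification is the one given in Appendix C.

```latex
\begin{align*}
p(x, d) &= \Pr(\text{participating} \mid x, d)
         = \frac{\exp(x'\beta + \beta_d d)}{1 + \exp(x'\beta + \beta_d d)} \\
\text{odds}(x, d) &= \frac{p(x, d)}{1 - p(x, d)} = \exp(x'\beta + \beta_d d) \\
\text{OR}_d &= \frac{\text{odds}(x, 1)}{\text{odds}(x, 0)} = \exp(\beta_d) \\
\text{ME}_d(x) &= p(x, 1) - p(x, 0)
\end{align*}
```

The term in x cancels from the odds ratio but not from the marginal effect, so the marginal effect depends on the values chosen for the other characteristics.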

The relationship between probabilities, odds, odds ratios and marginal effects in a binomial logistic regression model can be seen in Figure 1, where the results from the first model described in Section 5.1 are presented. The benefit of using odds ratios is that all other variables can be held constant but a value for these variables does not have to be specified. This is not the case for probabilities (or marginal effects) where the values of the other variables need to be specified (these are usually set at their mean value for the whole sample).[7] However, the interpretation of marginal effects is more intuitive. For these reasons, both odds ratios and marginal effects are presented here.
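
A small Python sketch may help show why marginal effects need values for the other variables while odds ratios do not. The coefficients and the sample of ages are invented for illustration and are not estimates from this paper. The sketch compares the marginal effect evaluated at the sample mean with the average of person-specific marginal effects, one of the alternatives mentioned in note [7].

```python
import math


def logistic(z):
    """Inverse logit: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients: intercept, age in years, chronic disease dummy.
B0, B_AGE, B_DISEASE = 4.0, -0.04, -0.38

# Hypothetical sample of respondent ages.
ages = [25, 32, 41, 48, 55, 60, 63]


def disease_effect(age):
    """Change in P(participating) when the disease dummy goes from 0 to 1."""
    index = B0 + B_AGE * age
    return logistic(index + B_DISEASE) - logistic(index)


# Marginal effect with the other variable fixed at its sample mean.
mean_age = sum(ages) / len(ages)
effect_at_mean = disease_effect(mean_age)

# Average of the person-specific marginal effects.
average_effect = sum(disease_effect(a) for a in ages) / len(ages)

# The odds ratio needs no value for age at all.
odds_ratio = math.exp(B_DISEASE)

print(round(effect_at_mean, 4), round(average_effect, 4), round(odds_ratio, 4))
```

The two marginal effects differ because each is evaluated at different covariate values, whereas the odds ratio is the same for every respondent.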

Figure 1 - Relationship between results from binomial logistic regression - numeric example

When all other variables are fixed at their mean value the probability of participating in the labour force for people:

  • with a chronic disease = P1 = 0.865
  • without a chronic disease = P2 = 0.903.

The odds of participating in the labour force for people:

  • with a chronic disease = [P1/(1- P1)] = [0.865/(1-0.865)] = 6.40
  • without a chronic disease = [P2/(1- P2)] = [0.903/(1-0.903)] = 9.34.

That means that people with chronic diseases are 6.40 times as likely to participate in the labour force as not to participate, while people without chronic diseases are 9.34 times as likely to participate as not.

The odds ratio for those with chronic diseases is the ratio of the odds of participating for those with chronic diseases to the odds for those without chronic diseases. If this value is less than 1 then the odds of participating are lower for those with chronic diseases than for those without chronic diseases:

  • Odds ratio = [P1/(1- P1)] /[ P2/(1- P2)] = 6.40/9.34 = 0.685
  • Percentage change in odds = (0.685-1)*100 = -31.5%.

The marginal effect is the difference in the probability of participating for those with chronic diseases compared to those without chronic diseases:

  • Marginal effect = P1-P2 = 0.865-0.903 = -0.038
  • Percentage point (ppts) change in probability = -0.038*100 = -3.8ppts
  • Percentage change in probability = (-0.038/0.903)*100 = -4.3%.

This leads to the following conclusions:

1. The odds of participating (relative to not participating) are 31.5% lower for people with a chronic disease compared to people without a chronic disease.

2. The probability of participating in the labour force is 3.8 percentage points lower for people with a chronic disease compared to people without a chronic disease.

3. The probability of participating in the labour force is 4.3% lower for people with a chronic disease compared to people without a chronic disease.

Note: These results are derived from Appendix Tables D1 and D2. Probabilities are calculated using the formula outlined in Appendix Figure C1.
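
The arithmetic in Figure 1 can be retraced with a short Python sketch. It starts from the two rounded probabilities shown in the figure, so its results can differ slightly in the last digit from the published values, which were presumably computed from unrounded estimates. The function and variable names are illustrative only.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


p1 = 0.865  # probability of participating, with a chronic disease
p2 = 0.903  # probability of participating, without a chronic disease

odds1 = odds(p1)
odds2 = odds(p2)

# Odds ratio and the percentage change in odds it implies.
odds_ratio = odds1 / odds2
pct_change_odds = (odds_ratio - 1.0) * 100.0

# Marginal effect, in percentage points and as a relative change.
marginal_effect = p1 - p2
ppt_change = marginal_effect * 100.0
pct_change_prob = marginal_effect / p2 * 100.0

print(f"odds: {odds1:.2f} and {odds2:.2f}")
print(f"odds ratio: {odds_ratio:.3f} ({pct_change_odds:.1f}% change in odds)")
print(f"marginal effect: {marginal_effect:.3f} ({ppt_change:.1f} ppts)")
print(f"relative change in probability: {pct_change_prob:.1f}%")
```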

While a binomial logistic regression model predicts the chance of participating, multinomial models predict the chance of multiple states (ie, working full-time, part-time, being unemployed or being inactive). As with the binomial logistic regression the results from the multinomial logistic regression can be presented in various ways, including probabilities/marginal effects or odds ratios. However, there is a slight difference in how these are interpreted for the multinomial model which is important to understand. The interpretation of the results is explained below and a numeric example, based on the first multinomial model discussed in Section 5.2, can be found in Figure 2.

  • Probability - This is the chance that a respondent with certain characteristics is in each labour market state: that is full-time; part-time; unemployed; or inactive. Each respondent has a probability of being in each of the four labour market outcomes (although the probability for any state can be zero). These four probabilities always sum to one, as a person has to be in one of the four states. The marginal effect is the relationship between a small change in a variable and the change in the probabilities of being in each of the four labour market outcomes. As an example, where the characteristic of interest is a binary variable (disease present/no disease present), the differences between the probabilities of being in each labour market outcome (full-time/part-time/unemployed/inactive) for two groups (which share all the same characteristics other than for the binary variable) are known as the marginal effects. For each respondent, the marginal effects sum to zero across the four states. So if the chance of being in three of the four labour market states increases, then the chance of being in the fourth labour market state must decrease by the same total amount. Unlike the odds ratios, the marginal effects are not interpreted relative to a particular labour market category, but need to be interpreted across the labour market states.
  • Odds ratio - This is defined as the ratio of the odds of an event occurring in one group to the odds of it occurring in another group. The odds ratio for a variable is equal to the exponential of its coefficient, with all other factors held constant. In these results the reference labour market outcome is inactive. Taking part-time as an example, the odds ratio for those with chronic diseases is the ratio of the odds of working part-time (rather than being inactive) for those with one or more chronic diseases to the same odds for those without chronic diseases.[8] As with the binomial models an odds ratio greater than one indicates a positive effect, whilst one between zero and one indicates a negative effect.
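
The two interpretations for the multinomial model can be sketched in Python as follows. The coefficients are invented for illustration and are not the estimates discussed in Section 5.2; inactive is the reference outcome, so its coefficients are fixed at zero.

```python
import math

STATES = ["full-time", "part-time", "unemployed", "inactive"]

# Hypothetical (intercept, disease coefficient) pairs, relative to inactive.
COEFS = {
    "full-time": (1.8, -0.45),
    "part-time": (0.6, -0.20),
    "unemployed": (-1.2, -0.10),
    "inactive": (0.0, 0.0),  # reference outcome
}


def probabilities(disease):
    """Probability of each labour market state for disease = 0 or 1."""
    scores = {s: math.exp(a + b * disease) for s, (a, b) in COEFS.items()}
    total = sum(scores.values())
    return {s: scores[s] / total for s in STATES}


p_without = probabilities(0)
p_with = probabilities(1)

# Marginal effects: change in each state's probability; they sum to zero.
marginal_effects = {s: p_with[s] - p_without[s] for s in STATES}


def odds_vs_inactive(p, state):
    """Odds of being in `state` rather than inactive."""
    return p[state] / p["inactive"]


# Odds ratio for part-time relative to inactive; equals exp(-0.20).
or_part_time = (odds_vs_inactive(p_with, "part-time")
                / odds_vs_inactive(p_without, "part-time"))

for s in STATES:
    print(f"{s:>10}: {p_without[s]:.3f} -> {p_with[s]:.3f} "
          f"(marginal effect {marginal_effects[s]:+.3f})")
print(f"part-time odds ratio vs inactive: {or_part_time:.3f}")
```

With these invented coefficients the part-time odds ratio is below one, yet the probability of working part-time rises, because the odds ratio is measured only against the inactive state while the marginal effect reflects movements across all four states.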

Because odds ratios and marginal effects measure different things, and therefore differ in magnitude, it is perfectly plausible for the odds ratio for a specific category to be significantly different from the reference category while the marginal effect for the same group is not significant. When calculating the odds ratio, the baseline odds (the ratio of the probability of an event occurring to the probability of it not occurring) drop out, so the magnitude of the probability is not important in the odds ratio calculation. The test for significance indicates whether the odds ratio (which is not dependent on the baseline odds) is different from one. However, the magnitude of the probabilities is important in testing the significance of a marginal effect. The test here is whether the marginal effect is significantly different from zero, that is, whether it significantly changes the baseline probability. If the base probability for the sample is very small or very large then small marginal effects may not be significant. Another way of thinking about this is that a large-sounding odds ratio can easily correspond to a very small-sounding marginal effect.
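
The dependence of the marginal effect on the baseline probability can be seen in a short Python sketch. It applies the odds ratio from Figure 1 (0.685) to several baseline probabilities; apart from 0.903, the baselines are hypothetical values chosen for illustration.

```python
def apply_odds_ratio(p_base, odds_ratio):
    """Probability obtained after scaling the baseline odds by odds_ratio."""
    new_odds = p_base / (1.0 - p_base) * odds_ratio
    return new_odds / (1.0 + new_odds)


ODDS_RATIO = 0.685  # odds ratio from Figure 1

# 0.903 is the Figure 1 baseline; the other baselines are hypothetical.
baselines = [0.50, 0.903, 0.99, 0.999]

# Change in probability implied by the same odds ratio at each baseline.
effects = [apply_odds_ratio(p, ODDS_RATIO) - p for p in baselines]

for p, effect in zip(baselines, effects):
    print(f"baseline {p:.3f}: change in probability {effect * 100:+.2f} ppts")
```

The same odds ratio implies a change of several percentage points at a baseline of 0.5 but well under a tenth of a percentage point at a baseline of 0.999.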

Notes

  • [6]Where the odds is the ratio of the probability of an event occurring to the probability of it not occurring within a group; so the probability of participating to the probability of not participating.
  • [7]The marginal effects presented here use this method. Alternative methods include using the means for certain groups (ie, those with chronic diseases) or calculating the person-specific marginal effects and averaging them over the groups of interest. These methods were considered here but, as the differences in the resulting marginal effects using these methods were small, the mean for the whole sample was used.
  • [8]So the odds are the probability of working part-time to the probability of being inactive.