
Health and Retirement of Older New Zealanders WP 12/02

8 Appendix A

8.1 Statistical methods

This study makes extensive use of discrete choice models, specifically logit models. These are used to model discrete outcomes: for example, participation in the labour force (Y=1) as opposed to being retired (Y=0). The outcomes of interest (Y) are theorised to be a function of a set of explanatory variables, X. This section will provide a brief explanation of the binary outcome logit case.

The probability of “success” (Y=1) is denoted p and is assumed to be a logistic function of the explanatory variables.

Suppose the probability of some event, denoted p, is 0.8 (eg, the probability that a respondent is in the labour force). The probability that the individual is not in the labour force is then 1 - p = 1 - 0.8 = 0.2. The odds of the event are defined as the ratio of the probability of “success” to the probability of “failure”:

odds = p / (1 - p)

In the above example, this would be 0.8/(1-0.8) = 4. In other words, the odds of a person participating (relative to being retired) are four to one.

The estimated coefficients from this model describe the amount by which the log odds change in response to a one-unit change in the corresponding Xi.

By working with the logarithm of the odds, the problem of the restricted range of the probability is circumvented. The transformation to logarithmic odds maps the underlying probability, whose range is from zero to one, into a variable with a range from negative infinity to positive infinity. This is referred to as a logit transformation.
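To make the transformation concrete, the following Python snippet (illustrative only; not part of the original study) converts the probability from the example above into odds and log odds, and applies the inverse transformation to recover the probability.

    import numpy as np

    def logit(p):
        # Map a probability in (0, 1) to the log odds in (-inf, +inf).
        return np.log(p / (1 - p))

    def inverse_logit(z):
        # Map a log-odds value back to a probability in (0, 1).
        return np.exp(z) / (1 + np.exp(z))

    p = 0.8                          # probability of being in the labour force
    odds = p / (1 - p)               # 4.0, ie, odds of four to one
    log_odds = logit(p)              # ln(4), approximately 1.386
    print(odds, log_odds, inverse_logit(log_odds))   # the last value recovers 0.8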

We now sketch the use of this transformation in the estimation of the coefficients associated with the explanatory variables in the underlying model. Let p be the probability of success. Then:

p = e^Z / (1 + e^Z), where Z = α + β1X1 + β2X2 + ... + βkXk        (1)

We can now show that Z is equal to the log of the odds. Rearranging equation (1) to express the odds in terms of Z yields:

p / (1 - p) = e^Z        (2)

and taking natural logarithms of both sides gives:

ln(p / (1 - p)) = Z = α + β1X1 + β2X2 + ... + βkXk        (3)

Hence we can see a linear relationship between covariates and the log odds, contrasting with the non-linear relation between covariates and the probability of success.

We can now proceed to estimate equation (3), in this case using maximum likelihood estimation, an iterative procedure which searches for the values of α and the βi that maximise the probability of observing the dependent variable in the sample of data. The logit specification is advantageous in that it generates predicted probabilities which lie between zero and one, and it can be extended to incorporate panel-data methods; one potential limitation, however, is the somewhat arbitrary imposition of the logistic distribution for the error term.
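As a rough sketch of this iterative procedure (the study does not publish estimation code, and the data below are simulated purely for illustration), the following Python function fits a binary logit by Newton-Raphson maximum likelihood.

    import numpy as np

    def fit_logit(X, y, n_iter=25):
        # Fit a binary logit by Newton-Raphson maximum likelihood.
        # X must include a leading column of ones for the intercept (alpha).
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            p = 1 / (1 + np.exp(-X @ beta))        # predicted probabilities
            W = p * (1 - p)                        # logistic variance weights
            gradient = X.T @ (y - p)               # score of the log likelihood
            hessian = X.T @ (X * W[:, None])       # negative of the Hessian
            beta += np.linalg.solve(hessian, gradient)
        return beta

    # Hypothetical simulated example: recover known coefficients.
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(2000), rng.normal(size=2000)])
    y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([0.5, 1.2])))))
    print(fit_logit(X, y))                         # close to [0.5, 1.2]

In practice the same maximisation is performed by the packaged logit routines in standard statistical software.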

8.2 Interpreting the logit regression

We turn now to the interpretation of the coefficients in the logit equation. The estimated values of each of the coefficients describe the amount by which Z, the log odds, changes in response to a one-unit change in the corresponding Xi.

However, this interpretation is not especially intuitive. A preferable approach is to consider the impact of a unit change in a particular Xj on the odds rather than the log odds. The odds ratio is defined as

odds ratio = [p1 / (1 - p1)] / [p0 / (1 - p0)]

where p0 and p1 denote the probabilities of success before and after a change in an explanatory variable; ie, it is the ratio of two odds. Consider the following expression, which, using equation (2), gives the odds ratio for a one-unit change in, say, X1:

e^(α + β1(X1 + 1) + β2X2 + ... + βkXk) / e^(α + β1X1 + β2X2 + ... + βkXk)

This expression reduces to e^β1, as all other terms cancel out. This result simply states that, for a one-unit increase in X1, the ratio of the odds at the increased level to the odds at the base level is given by e^β1. It is constant; specifically, it does not depend on the values of the other variables (Xj). Note that while the odds ratio is constant, this does not imply that the odds themselves are constant at various values of the Xj. In fact, owing to the multiplicative effect, the actual change in the odds depends on the starting point. An odds ratio of two would increase odds of one to two, and odds of two to four.
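A small numerical illustration of this multiplicative effect, using a hypothetical coefficient rather than one estimated in the paper:

    import numpy as np

    beta1 = np.log(2)                 # hypothetical coefficient on X1
    odds_ratio = np.exp(beta1)        # 2.0, constant regardless of the other Xj

    # The same odds ratio acts multiplicatively on different starting odds:
    for baseline_odds in (1.0, 2.0):
        print(baseline_odds, "->", baseline_odds * odds_ratio)   # 1 -> 2, 2 -> 4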

Another way to interpret the output of logit models is to consider the probability of success with some explanators fixed at certain values of interest. This can be used to evaluate the marginal effect of a specific change; for example, the marginal effect on the probability of participation of moving from ill to good health. The baseline probability of success is denoted by ρ0 (eg, under ill health), and ρ1 represents the probability of success after a change (eg, to good health).

Average marginal effect = ρ1 - ρ0

Here, ρ0 is found by predicting the probability of participation for each individual, whilst fixing the variable of interest at a certain level, for example, setting health as “ill health”. Then the average predicted probability of participation over all individuals can be calculated.

The same procedure is then repeated to find ρ1, now setting health to represent “good health” for all individuals (for example). The difference between these average probabilities is the marginal effect of health, as health status is the only thing that has changed; all other explanators are left at their originally observed values.
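A minimal sketch of this counterfactual averaging in Python, using simulated data and the statsmodels package (the variable names, data, and package choice are illustrative assumptions, not those of the study):

    import numpy as np
    import statsmodels.api as sm

    # Simulated data: participation (0/1), age, and a good-health indicator.
    rng = np.random.default_rng(1)
    n = 1000
    age = rng.uniform(55, 75, n)
    good_health = rng.integers(0, 2, n)
    z = 10.0 - 0.15 * age + 0.8 * good_health
    participate = rng.binomial(1, np.exp(z) / (1 + np.exp(z)))

    X = sm.add_constant(np.column_stack([age, good_health]))
    result = sm.Logit(participate, X).fit(disp=0)

    # Fix health at each level for every individual, keeping all other
    # explanators at their observed values, then average the predictions.
    X_ill, X_good = X.copy(), X.copy()
    X_ill[:, 2], X_good[:, 2] = 0, 1
    rho0 = result.predict(X_ill).mean()      # average probability under "ill health"
    rho1 = result.predict(X_good).mean()     # average probability under "good health"
    print(rho1 - rho0)                       # average marginal effect of good health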
