5 Parameter estimation
The previous sections have examined the discrete choice model underlying an individual’s labour supply behaviour. The basic assumptions are that individuals maximise their utility and that utility depends on two arguments, income and hours of work. Utility is expected to increase with income and to decrease with hours of work (or increase with the complement of working hours, leisure time).
This section discusses how this model can be estimated with the help of data, using the method of maximum likelihood. An advantage of the discrete hours framework, in contrast to the continuous approach, is that it can be applied to any legitimate utility function. Hence, no explicit assumption about utility functions is made in the present section: their specification is discussed in section 7. The extreme value error distribution, examined in the previous section, is used. The construction of the likelihood function is described in subsection 5.1 and its maximisation is considered in subsection 5.2.
5.1 The likelihood function
The notation used in the previous sections did not need to distinguish between individuals, since only a single individual was examined. However, estimation uses information from a cross-section of individuals. Suppose there are $N$ individuals and the index $n$ is used to refer to individuals, $n = 1, \dots, N$. There are, as before, $H$ discrete hours levels $h_i$ for $i = 1, \dots, H$. It is first necessary to indicate the optimal hours level for the $n$th person; denote this by $h_{k_n}$, so that $k_n$ indicates the chosen value of $i$ (the hours index) for person $n$. Consistent with this notation, the probability of selecting this hours level is $p_{k_n}$ and the corresponding optimal utility level is $U^{*}_{k_n,n}$. All other utility levels (associated with other hours levels) are denoted $U^{*}_{i,n}$ for $i \neq k_n$.
Using this notation:
$$p_{k_n} = P\left(U^{*}_{k_n,n} > U^{*}_{i,n} \ \text{for all } i \neq k_n\right) \tag{16}$$
Thus when all error terms $v_{i,n}$, where $U^{*}_{i,n} = U_{i,n} + v_{i,n}$ and $U_{i,n}$ is the deterministic component of utility, are assumed to follow the extreme value distribution discussed in the previous section, the probability associated with the optimal hours chosen by person $n$ is expressed as:
$$p_{k_n} = \frac{\exp\left(U_{k_n,n}\right)}{\sum_{i=1}^{H} \exp\left(U_{i,n}\right)} \tag{17}$$
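The choice probability for one individual can be sketched in a few lines of Python. This is only an illustration: the function name `choice_probabilities` and the utility values are hypothetical, and the code assumes the standard multinomial logit form that follows from extreme value errors, with the deterministic utility at each hours level supplied as a plain list.

```python
import math

def choice_probabilities(utilities):
    """Return logit choice probabilities for one individual.

    `utilities` holds the deterministic utility at each discrete hours level.
    """
    # Subtracting the maximum leaves the ratios unchanged but avoids overflow in exp().
    u_max = max(utilities)
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical deterministic utilities for one person at H = 4 hours levels.
utilities = [1.0, 1.5, 2.0, 1.2]
probs = choice_probabilities(utilities)

k = 2                 # 0-based index of the hours level this person is observed to choose
p_chosen = probs[k]   # the probability attached to the observed choice
```

The probabilities sum to one across the hours levels, and the level with the highest deterministic utility receives the highest probability, although every level has a positive chance of being chosen.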
The joint probability that individual 1 selects $h_{k_1}$, individual 2 selects $h_{k_2}$, individual 3 selects $h_{k_3}$, and so on, is given, assuming that the decisions are made independently, by the product:
$$P\left(h_{k_1}, h_{k_2}, \dots, h_{k_N}\right) = \prod_{n=1}^{N} p_{k_n} \tag{18}$$
This joint probability concerns the probability of the set of hours levels, $h_{k_n}$ for $n = 1, \dots, N$, being chosen by the $N$ individuals, given their preferences and other personal characteristics, and assuming that all error terms $v_{i,n}$ follow identical extreme value distributions.
The situation facing researchers is that they do not know the parameters of (the assumed form of) preference functions, but have information about the hours worked by each individual in a random sample taken from the population. In addition, data are available on personal characteristics and net incomes of each individual at each discrete hours point. The net incomes are not observed directly but are obtained from knowledge of each individual’s wage rate and the details of the tax and transfer system.[17]
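As a rough illustration of how net incomes at each discrete hours point might be constructed, the sketch below applies a deliberately stylised tax and transfer system to gross earnings. Everything in it is hypothetical: the function `net_income`, the flat tax rate, the benefit level, the abatement rate, the wage and the hours points are invented for the example and do not describe any actual system or the rules used in this paper.

```python
def net_income(hours, wage, tax_rate=0.25, benefit=200.0, abatement=0.5):
    """Weekly net income at a given hours level under a stylised, hypothetical system."""
    gross = wage * hours
    tax = tax_rate * gross
    # The benefit is withdrawn at the abatement rate as gross earnings rise,
    # and cannot fall below zero.
    benefit_received = max(0.0, benefit - abatement * gross)
    return gross - tax + benefit_received

hours_levels = [0, 10, 20, 30, 40]   # discrete hours points (illustrative)
wage = 20.0                          # observed or imputed hourly wage (illustrative)

# One net income per discrete hours point for this individual.
net_incomes = [net_income(h, wage) for h in hours_levels]
```

With these invented values the benefit is fully withdrawn by 20 hours, so the gain in net income per extra hour is smaller over the first 20 hours than beyond them. Real tax and transfer rules produce the same kind of non-linearity, which is one reason net income has to be computed separately at every hours point.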
The probability in (18) can be viewed from another perspective. Given an assumption about the general form of the utility functions, it is possible to find parameter values that, if true, would produce the highest probability of observing the actual hours values. The expression in (18) is reinterpreted as being a function of the unknown parameter values, for a given set of observed hours. Since the framework is one in which a particular ‘true’ set of parameters is assumed to exist, and any variations are attributed to sampling variations, it is not appropriate, when discussing the function in terms of parameters, to refer to a ‘probability’ of parameters taking particular values. Rather, it is necessary to refer to the probability of observing this particular sample of individuals (with their combinations of characteristics and hours worked) conditional on the parameter values. Suppose that each individual’s utility function depends on a vector of coefficients $\beta$, with elements $\beta_j$, for $j = 1, \dots, J$. The probability statement in (18) can be rewritten as:
$$L(\beta) = \prod_{n=1}^{N} p_{k_n}(\beta) \tag{19}$$
where $L(\beta)$, a function of the unknown parameters (for a given sample of observed hours worked), is referred to as the likelihood function. Here the (fixed) parameters are effectively treated as if they were variables. The estimates, $\hat{\beta}$, produced by finding values for $\beta$ that maximise the value of this function are referred to as maximum likelihood estimates.
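The idea of treating the joint probability as a function of the parameters can be illustrated with a small sketch. It rests on several assumptions that are not part of the text: a linear utility function with one income coefficient and one hours coefficient (chosen only for simplicity, since this section imposes no functional form), a made-up sample of four individuals, and a crude grid search in place of a proper maximisation routine. All names (`likelihood`, `net_incomes`, `chosen`) are hypothetical.

```python
import math
from itertools import product

def choice_probabilities(utilities):
    """Logit probabilities from the deterministic utilities of one individual."""
    u_max = max(utilities)
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def likelihood(beta, hours_levels, net_incomes, chosen):
    """Product over individuals of the probability of the observed choice."""
    b_income, b_hours = beta
    value = 1.0
    for incomes_n, k_n in zip(net_incomes, chosen):
        # Hypothetical linear utility; income is scaled to hundreds and hours to tens.
        utilities = [b_income * y / 100.0 + b_hours * h / 10.0
                     for y, h in zip(incomes_n, hours_levels)]
        value *= choice_probabilities(utilities)[k_n]
    return value

hours_levels = [0, 20, 40]
net_incomes = [                 # net income of each person at each hours level
    [200.0, 300.0, 600.0],
    [200.0, 350.0, 500.0],
    [150.0, 400.0, 700.0],
    [250.0, 300.0, 380.0],
]
chosen = [2, 1, 2, 0]           # 0-based index of each person's observed hours level

# Crude grid search over both coefficients: -3.0, -2.5, ..., 3.0.
grid = [x / 2.0 for x in range(-6, 7)]
best = max(product(grid, grid),
           key=lambda b: likelihood(b, hours_levels, net_incomes, chosen))
```

The observed hours stay fixed throughout; only the candidate parameter values change, and each candidate is scored by the probability it assigns to the sample actually observed. With both coefficients set to zero every hours level is equally likely, so the likelihood for this sample of four is $(1/3)^4$, which gives a baseline against which other parameter values can be compared.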
Taking logarithms gives the log-likelihood for this model:
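$$\log L(\beta) = \sum_{n=1}^{N} \log p_{k_n} = \sum_{n=1}^{N} \left[ U_{k_n,n} - \log \sum_{i=1}^{H} \exp\left(U_{i,n}\right) \right] \tag{20}$$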
This monotonic transformation does not affect the maximum likelihood estimates but, by converting products into sums, makes analysis easier.
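A minimal sketch of the log-likelihood as a sum over individuals is given below, assuming the logit form of the choice probabilities, so that each person's contribution is the deterministic utility of the chosen hours level minus the log of the sum of exponentiated utilities. The function names and the utility values are hypothetical. The second half of the example illustrates a practical side benefit that the text does not discuss: in floating-point arithmetic the product of many probabilities underflows to zero, while the sum of their logarithms stays finite.

```python
import math

def log_sum_exp(values):
    """log(sum(exp(v))) computed stably by factoring out the maximum."""
    v_max = max(values)
    return v_max + math.log(sum(math.exp(v - v_max) for v in values))

def log_likelihood(utilities, chosen):
    """Sum over individuals of the log probability of the observed choice.

    utilities[n][i] is the deterministic utility of person n at hours level i;
    chosen[n] is the 0-based index of the hours level person n selected.
    """
    return sum(u_n[k_n] - log_sum_exp(u_n) for u_n, k_n in zip(utilities, chosen))

# Hypothetical utilities for three people at three hours levels.
utilities = [[0.0, 1.0, 2.0], [0.5, 0.5, 0.5], [2.0, 0.0, 1.0]]
chosen = [2, 1, 0]
loglik = log_likelihood(utilities, chosen)

# A larger artificial sample: 2000 people facing three equally attractive levels.
big_utilities = [[0.0, 0.0, 0.0]] * 2000
big_chosen = [0] * 2000
big_loglik = log_likelihood(big_utilities, big_chosen)  # equals 2000 * log(1/3)
big_product = (1.0 / 3.0) ** 2000                       # underflows to 0.0
```

Because the logarithm is monotonic, the parameter values that maximise `log_likelihood` are the same as those that maximise the product.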
Notes
- [17] The taxation and benefit rules are applied to the gross income of each individual at each of the discrete points to obtain the associated net income. Depending on the complexity of the rules and the data available, it may not be possible to include all benefits. Furthermore, the wage rates of those who are not in employment at the time of the survey cannot be observed, so it is necessary to impute wage rates using estimated wage functions. An alternative is to estimate a joint wage and labour supply model (see, for example, Gerfin, 1993).
