4.3 Some distance functions
One reason why the chi-squared distance function yields an explicit solution is that no constraints are placed on the size of the adjustment to each of the survey weights. It is therefore possible for the calibrated weights to become negative. Deville and Särndal (1992) suggested the following simple modification to the chi-squared function, although the explicit solution available in the chi-squared case is then lost and the iterative method must be used.
Suppose it is required to constrain the proportionate changes to certain limits, different for increases compared with decreases in the weights. Define $U$ and $L$ such that $0 < L < 1 < U$. The objective is to ensure that, for increases, the proportionate change, $(w_k - s_k)/s_k$, is less than $U - 1$, or that $w_k/s_k < U$. For decreases, the aim is to ensure that $(s_k - w_k)/s_k$ (the negative of the proportional change) is less than $1 - L$, so that $w_k/s_k > L$.
For the chi-squared distance function, it has been seen that $w_k = s_k F(u_k)$, where $F(u_k) = 1 + u_k$ and $u_k = x_k'\lambda$, solves for the calibrated weights. Hence if $w_k/s_k$ is outside the specified range, it is necessary to set it to the relevant limit, either $U$ or $L$, rather than allow it to take the value generated. Since $w_k/s_k = 1 + x_k'\lambda$, it is clear that the limits are exceeded if $x_k'\lambda > U - 1$ and if $x_k'\lambda < L - 1$. In each case where the value of $w_k/s_k$ has to be set to the relevant limit, the corresponding value of the derivative $dF/du_k$ is zero. This approach ensures that weights are kept within the range, $L s_k \le w_k \le U s_k$. Hence, negative values of $w_k$ are avoided simply by setting $L$ to be positive.[9]
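The truncation rule above can be sketched numerically. The following is an illustrative implementation (the function and variable names are my own, not the paper's), assuming the chi-squared inverse function $F(u) = 1 + u$ with limits $L$ and $U$:

```python
import numpy as np

def F_chi2_truncated(u, L=0.5, U=2.0):
    """Chi-squared inverse function F(u) = 1 + u, truncated to [L, U].

    Returns the ratio x = w_k / s_k set to the relevant limit where it
    would otherwise fall outside the range, together with dF/du, which
    is 1 inside the limits and 0 wherever the truncation binds.
    """
    raw = 1.0 + u                        # untruncated ratio w_k / s_k
    x = np.clip(raw, L, U)               # set to the relevant limit
    dF = np.where((raw > L) & (raw < U), 1.0, 0.0)
    return x, dF

# Example: the extreme adjustments are pulled back to L and U,
# and the derivative is zero exactly where a limit binds.
u = np.array([-0.8, -0.2, 0.0, 0.5, 1.5])
x, dF = F_chi2_truncated(u)
print(x)    # ratios kept within [0.5, 2.0]
print(dF)   # zero where the limit binds
```

Setting $dF/du_k = 0$ at a binding limit is what keeps the truncated units from contributing to the Hessian in the iterative solution.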
It has been seen above that the solution procedure requires only an explicit form for the inverse function $F(u)$, from which its derivative can be obtained. It is not necessary to start from a specification of the distance function itself. Deville and Särndal (1992) suggest the simple form:

$$F(u) = \frac{1}{1-u} \qquad (23)$$

The gradient function, $g(x)$, where $x = w_k/s_k$, is given by solving (23) for $u$, so that:

$$g(x) = 1 - \frac{1}{x} \qquad (24)$$
and the form of the distance function can be obtained by integrating (24).[10] This is referred to as Case A, and its properties are given in the first row of Table 3. The second row of the table provides details of Case B, where $F(u) = \exp(u)$, and the final row gives the corresponding properties of the basic chi-squared function.[11] A feature of these functions is that they do not require any parameters to be set.
| Case | $F(u)$ | $dF/du$ | $g(x)$ | $G(x)$ |
|---|---|---|---|---|
| A | $(1-u)^{-1}$ | $(1-u)^{-2}$ | $1 - x^{-1}$ | $x - \log x - 1$ |
| B | $\exp(u)$ | $\exp(u)$ | $\log x$ | $x \log x - x + 1$ |
| Chi-squared | $1 + u$ | $1$ | $x - 1$ | $(x-1)^2/2$ |
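The defining relationship between the columns of Table 3 can be checked numerically: in each case the gradient function $g(x)$ must invert $F(u)$. The following sketch (my own code, not from the paper) verifies $g(F(u)) = u$ for the three cases:

```python
import numpy as np

# Inverse functions F(u) and gradient functions g(x) for the three
# cases of Table 3; g should be the functional inverse of F.
F = {
    "A":    lambda u: 1.0 / (1.0 - u),
    "B":    np.exp,
    "chi2": lambda u: 1.0 + u,
}
g = {
    "A":    lambda x: 1.0 - 1.0 / x,
    "B":    np.log,
    "chi2": lambda x: x - 1.0,
}

u = np.linspace(-0.5, 0.5, 11)
for case in F:
    # g(F(u)) recovers u in every case
    assert np.allclose(g[case](F[case](u)), u)
print("g(F(u)) = u holds for cases A, B and chi-squared")
```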
Deville and Särndal (1992) also suggest the use of an inverse function $F(u)$ of the form:[12]

$$F(u) = \frac{L(U-1) + U(1-L)\exp(Au)}{(U-1) + (1-L)\exp(Au)} \qquad (25)$$

where $L$ and $U$ are as defined above and:

$$A = \frac{U - L}{(1-L)(U-1)} \qquad (26)$$

Thus $F(u) \rightarrow L$ as $u \rightarrow -\infty$ and $F(u) \rightarrow U$ as $u \rightarrow \infty$, so that the limits of $F(u)$ are $L$ and $U$. This function therefore has the property that adjustments to the weights are kept within the range, $L < w_k/s_k < U$, although, unlike the chi-squared modification, no checks have to be made during computation.
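A short sketch (my own code; the parameter values are illustrative) confirms the behaviour of (25)-(26): no adjustment at $u = 0$, and the ratio approaching $L$ and $U$ at the extremes without any explicit check:

```python
import numpy as np

def F_logit(u, L=0.5, U=2.0):
    """Logit-type inverse function of (25), with A defined as in (26)."""
    A = (U - L) / ((1.0 - L) * (U - 1.0))
    e = np.exp(A * u)
    return (L * (U - 1.0) + U * (1.0 - L) * e) / ((U - 1.0) + (1.0 - L) * e)

print(F_logit(0.0))                   # 1.0: no adjustment at u = 0
print(F_logit(-50.0), F_logit(50.0))  # approach the limits L and U
```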
The derivative required in the computation of the Hessian is therefore:

$$\frac{dF}{du} = \frac{(U-L)^2 \exp(Au)}{\left[(U-1) + (1-L)\exp(Au)\right]^2} \qquad (27)$$

Since $x = w_k/s_k = F(u)$, (25) can be rearranged, by collecting terms in $\exp(Au)$, to give:

$$\exp(Au) = \frac{(U-1)(x-L)}{(1-L)(U-x)} \qquad (28)$$

so that the gradient of the distance function is:

$$g(x) = \frac{1}{A}\log\left[\frac{(U-1)(x-L)}{(1-L)(U-x)}\right] \qquad (29)$$
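The rearrangement leading to (29) can be verified numerically: applying (29) to the output of (25) should recover $u$ exactly. A minimal check, with illustrative limits $L$ and $U$:

```python
import math

L, U = 0.5, 2.0
A = (U - L) / ((1.0 - L) * (U - 1.0))   # equation (26)

def F(u):
    # equation (25)
    e = math.exp(A * u)
    return (L * (U - 1.0) + U * (1.0 - L) * e) / ((U - 1.0) + (1.0 - L) * e)

def g(x):
    # equation (29), the gradient of the distance function
    return math.log((U - 1.0) * (x - L) / ((1.0 - L) * (U - x))) / A

for u in [-1.0, -0.3, 0.0, 0.7]:
    assert abs(g(F(u)) - u) < 1e-10
print("g(F(u)) = u for the D-S logit function")
```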
The special nature of this gradient function is illustrated by the line D-S in Figure 1, which shows the profile of (29) over a wide range of $x$ for given values of $L$ and $U$. The first characteristic of the D-S function that is evident is the restriction of $x$ to the range specified, $L < x < U$. Figure 1 also shows the function $g(x)$ for the other cases discussed above. In all cases, the gradient is zero (corresponding to a turning point of the distance function) when $x = 1$. Given the quadratic U-shaped nature of the chi-squared distance function, its gradient increases at a constant rate, being negative in the range $x < 1$. Cases A and B also imply U-shaped distance functions, but with the gradient increasing more sharply for $x < 1$ and more slowly than the chi-squared function in the range $x > 1$.
The distance function is given by integrating (29) with respect to $x$. It is most convenient to apply the variate transformation $y = x - L$, so that $U - x = (U - L) - y$ and $dy = dx$, and it is required to obtain:

$$\frac{1}{A}\int \left\{ \log y - \log\left[(U-L) - y\right] + \log\left(\frac{U-1}{1-L}\right) \right\} dy \qquad (30)$$

Using the result that:

$$\int \log y \, dy = y\left(\log y - 1\right) \qquad (31)$$

and:

$$\int \log\left[(U-L) - y\right] dy = -\left[(U-L) - y\right]\left\{\log\left[(U-L) - y\right] - 1\right\} \qquad (32)$$

substitution and rearrangement gives $1/A$ multiplied by:

$$(x-L)\log\left(\frac{x-L}{1-L}\right) + (U-x)\log\left(\frac{U-x}{U-1}\right) \qquad (33)$$

plus a term $(U-L)\left[\log(U-1) - 1\right]/A$, which, since it is a constant, may be dropped without loss.[13] Examples of this distance function are shown in Figure 2.[14]
Notes
- [9]This is much more convenient than imposing inequality constraints and applying the more complex Kuhn-Tucker conditions. Also, it is desirable to restrict the extent of proportional changes even where they produce positive weights.
- [10]Hence it is required to obtain $\int \left(1 - x^{-1}\right) dx$, which can be written as $x - \log x + c$, and dropping the last term, which is a constant, this is equal to $x - \log x$ (or, normalising so that the distance is zero at $x = 1$, $x - \log x - 1$).
- [11]Deville and Särndal (1992) discuss the use of a normalisation whereby $F'(0)$ is set to some specified value, but this is not necessary for the approach.
- [12]Singh and Mohl (1996), in reviewing alternative calibration estimators, refer to this ‘inverse logit-type transformation’ as a Generalised Modified Discrimination Information method.
- [13]Equation (33) is the result stated without proof by Deville and Särndal (1992, p. 378).
- [14]Folsom and Singh (2000) propose a variation on this, which they call a ‘generalised exponential model’, in which the limits are allowed to be unit-specific. In practice they suggest the use of three sets of bounds for low, medium and high initial weights.
