2 The problem
For each of $K$ individuals in a sample survey, information is available about $J$ variables; these are placed in the vector:
$$x_k = [x_{k,1}, \ldots, x_{k,J}]' \qquad (1)$$
For present purposes these vectors contain only the variables of interest for the calibration exercise (rather than all measured variables). Many of the elements of $x_k$ are likely to be $0/1$ variables. For example, $x_{k,j} = 1$ if the $k$th individual is in a particular age group (or receives a particular type of social transfer), and zero otherwise. The sum $\sum_{k=1}^{K} x_{k,j}$ therefore gives the number of individuals in the sample who are in the age group (or who receive the transfer payment).
Let the sample design weights (provided by the statistical agency responsible for data collection) be denoted $s_k$ for $k = 1, \ldots, K$. These weights can be used to produce estimated population totals, $\hat{t}_{x|s}$, based on the sample, given by the $J$-element vector:
$$\hat{t}_{x|s} = \sum_{k=1}^{K} s_k x_k \qquad (2)$$
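As a minimal illustration of the weighted totals in (2), the sketch below uses a made-up sample of four individuals and two 0/1 variables. The array names `X` and `s`, and all the numbers, are assumptions chosen for illustration; they are not survey data.

```python
import numpy as np

# Hypothetical sample: 4 individuals (rows) and 2 variables (columns).
# Column 0: 1 if the individual is in the age group, else 0.
# Column 1: 1 if the individual receives the transfer, else 0.
X = np.array([
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 0],
])

# Hypothetical design weights, one per individual.
s = np.array([100.0, 150.0, 200.0, 50.0])

# Unweighted column sums: number of sample members in each category.
sample_counts = X.sum(axis=0)

# Weighted column sums: estimated population totals, one per variable.
estimated_totals = X.T @ s

print(sample_counts)      # [2 2]
print(estimated_totals)   # [250. 350.]
```

Each individual's row of `X` is scaled by that individual's weight before summing, so a person with weight 150 counts as 150 members of the population.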
The problem examined in this paper can be stated as follows. Suppose that other data sources, for example census or social security administrative data, provide information about ‘true’ population totals, $t_x$. The problem is to compute new weights, $w_k$ for $k = 1, \ldots, K$, which are as close as possible to the design weights, $s_k$, while satisfying the set of $J$ calibration equations:
$$t_x = \sum_{k=1}^{K} w_k x_k \qquad (3)$$
It is thus necessary to specify a criterion by which to judge the closeness of the two sets of weights.
In general, denote the distance between $w_k$ and $s_k$ as $G(w_k, s_k)$. The aggregate distance between the design and calibrated weights is thus:[4]
$$D = \sum_{k=1}^{K} G(w_k, s_k) \qquad (4)$$
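The two ingredients of the problem, the calibration equations (3) and the aggregate distance (4), can be sketched numerically as follows. The text leaves the distance function general, so the chi-squared form used here is only an assumed example; the function names, the data, and the candidate weights are likewise hypothetical.

```python
import numpy as np

def chi_squared_distance(w, s):
    """Assumed example distance for each pair of weights: (w - s)^2 / (2 s)."""
    return (w - s) ** 2 / (2.0 * s)

def aggregate_distance(w, s, distance=chi_squared_distance):
    """Sum the per-individual distances over the whole sample."""
    w = np.asarray(w, dtype=float)
    s = np.asarray(s, dtype=float)
    return float(np.sum(distance(w, s)))

def calibration_gap(w, X, t):
    """Known totals minus weighted sample totals; all zeros when calibrated."""
    X = np.asarray(X, dtype=float)
    return np.asarray(t, dtype=float) - X.T @ np.asarray(w, dtype=float)

# Hypothetical data: 4 individuals, 2 variables, made-up external totals.
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
s = np.array([100.0, 150.0, 200.0, 50.0])
t = np.array([275.0, 350.0])

# A candidate set of weights that satisfies the constraints
# (not necessarily the closest such set).
w = np.array([125.0, 150.0, 200.0, 50.0])

gap_design = calibration_gap(s, X, t)      # design weights miss the first total
gap_candidate = calibration_gap(w, X, t)   # candidate weights hit both totals
dist = aggregate_distance(w, s)

print(gap_design)      # [25.  0.]
print(gap_candidate)   # [0. 0.]
print(dist)            # 3.125
```

Any weights with a zero gap are feasible; the calibration problem is to pick, among all feasible weights, the set with the smallest aggregate distance.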
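The problem is therefore to minimise (4) subject to (3). The Lagrangean for this problem is:
$$L = \sum_{k=1}^{K} G(w_k, s_k) + \sum_{j=1}^{J} \lambda_j \left( t_{x,j} - \sum_{k=1}^{K} w_k x_{k,j} \right) \qquad (5)$$
where $\lambda_j$ for $j = 1, \ldots, J$ are the Lagrange multipliers. The following two sections consider methods of obtaining values of $w_k$ that minimise (5).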
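As a sketch of the step that any such method starts from, the first-order conditions can be written out explicitly. The notation is assumed here: $G(w_k, s_k)$ is the distance between the calibrated weight $w_k$ and the design weight $s_k$, $x_{k,j}$ is the $j$th variable for individual $k$, $t_{x,j}$ is the $j$th known total, and each multiplier $\lambda_j$ is attached to the term $t_{x,j} - \sum_k w_k x_{k,j}$. Assuming also that $G$ is differentiable in $w_k$, setting the derivatives of the Lagrangean to zero gives:

```latex
\begin{align*}
\frac{\partial L}{\partial w_k}
  &= \frac{\partial G(w_k, s_k)}{\partial w_k}
     - \sum_{j=1}^{J} \lambda_j x_{k,j} = 0,
  & k &= 1, \ldots, K, \\
\frac{\partial L}{\partial \lambda_j}
  &= t_{x,j} - \sum_{k=1}^{K} w_k x_{k,j} = 0,
  & j &= 1, \ldots, J.
\end{align*}
```

This is a system of $K + J$ equations in the $K$ weights and $J$ multipliers; the second set simply reproduces the calibration equations.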
Notes
- [4] Some authors, such as Folsom and Singh (2000), write the distance to be minimised as $G(s_k, w_k)$, but the present paper follows Deville and Särndal (1992).
