New Methods for Estimating Detailed Fertility Schedules from Abridged Data
Pavel Grigoriev, Max Planck Institute for Demographic Research
Vladimir M. Shkolnikov, Max Planck Institute for Demographic Research
Anatoli Michalski, Institute for Control Sciences
Vasily Gorlishchev, Institute for Control Sciences
Dmitri A. Jdanov, National Research University Higher School of Economics
Occasionally there is a need to split aggregated fertility data into a fine grid of ages. As the existing disaggregation methods are not free of limitations, we seek a method that satisfies the following criteria: 1) Shape: the estimated fertility curves should be plausible and smooth; 2) Fit: predicted values should closely follow the observed ones; 3) Non-negativity: no negative values should be returned; 4) Balance: estimated five-year age group totals should match the input data; and, in the case of birth-order data, 5) Parity: the balance by parity has to be maintained. To our knowledge, none of the existing methods fully meets the first four criteria, and no attempt has been made to extend the restrictions to criterion (5). To address the disadvantages of the existing methods, we introduce two alternative approaches for splitting abridged fertility data: the Quadratic Optimization (QO) and Neural Network (NN) methods. We rely on the high-quality fertility data from the Human Fertility Database (HFD) as well as the large and heterogeneous data from the Human Fertility Collection (HFC). The QO and NN methods are tested against the current HFD splitting protocol (HFD method) and the Calibrated Spline (CS) method. The results of thorough testing suggest that both methods perform well. The main advantage of the QO approach is that it meets all five requirements; however, it does not fit the observed data as well as the NN and CS methods do. The NN method does not satisfy the balance and parity criteria, but it returns the best results in terms of fit. The QO method satisfies the needs of large databases such as the HFD and the HFC: despite its strict requirements, it returns plausible results. The NN splitting is a good alternative in cases where priority is given to the fit criterion.
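To illustrate how the smoothness (1), non-negativity (3) and balance (4) criteria can be written together as a constrained quadratic problem, a minimal sketch follows. It is not the QO method described in the abstract, whose objective function is not given here. The smoothness penalty (sum of squared second differences), the function name `split_groups`, the choice of SciPy's SLSQP solver and the input totals are all assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import minimize


def split_groups(group_totals, width=5):
    """Split age-group totals into single-year values (illustrative sketch).

    Assumed objective: minimise the sum of squared second differences
    (a smoothness penalty), subject to
      - balance: single-year values sum to each group total,
      - non-negativity: every single-year value is >= 0.
    """
    g = np.asarray(group_totals, dtype=float)
    n = g.size * width

    # Second-difference operator; rows look like [1, -2, 1].
    D = np.diff(np.eye(n), n=2, axis=0)
    P = D.T @ D

    # Aggregation matrix: row k sums the single ages inside group k.
    A = np.kron(np.eye(g.size), np.ones(width))

    # Feasible starting point: spread each total evenly within its group.
    x0 = np.repeat(g / width, width)

    res = minimize(
        fun=lambda x: x @ P @ x,
        x0=x0,
        jac=lambda x: 2.0 * (P @ x),
        bounds=[(0.0, None)] * n,
        constraints={"type": "eq",
                     "fun": lambda x: A @ x - g,
                     "jac": lambda x: A},
        method="SLSQP",
        options={"maxiter": 1000},
    )
    return res.x


# Hypothetical five-year totals for ages 15-19, ..., 45-49 (not real data).
totals = [50.0, 300.0, 500.0, 400.0, 200.0, 50.0, 5.0]
single_year = split_groups(totals)
```

Reshaping `single_year` to 7 rows of 5 and summing each row should reproduce `totals` up to solver tolerance, which corresponds to the balance criterion. A parity constraint (5) would add further equality rows linking order-specific schedules to the all-order total, and is not shown here.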
Session 1069: Data and Methods