© 2001 by Biometrika Trust
Alternative EM methods for nonparametric finite mixture models
1 Division of Epidemiology & Biostatistics, 2121 West Taylor Street, University of Illinois, Chicago, Illinois 60612, U.S.A. pillar{at}uic.edu
2 Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A. bgl{at}psu.edu
This research focuses on a general class of maximum likelihood problems in which the aim is to maximise a nonparametric mixture likelihood with finitely many known component densities over the set of unknown weight parameters. Convergence of the conventional EM algorithm for this problem is extremely slow when the component densities are poorly separated and when the maximum likelihood estimator requires some of the weights to be zero, as the algorithm can never reach such a boundary point. Alternative methods based on the principles of EM are developed using a two-stage approach. First, a new data augmentation scheme provides improved convergence rates in certain parameter directions. Secondly, two cyclic versions of this data augmentation are created by changing the missing data formulation between the EM-steps; these extend the acceleration directions to the whole parameter space, giving another order of magnitude increase in convergence rate. Examples indicate that the new cyclic versions of the data augmentation schemes can converge up to 500 times faster than the conventional EM algorithm for fitting nonparametric finite mixture models.
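As a point of reference, the conventional EM update for the weights, which is the baseline discussed above and not the new augmentation schemes, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names (em_step, em_mixture_weights, log_likelihood), the matrix F with entries F[i, j] = f_j(x_i), and the simulated normal components are all hypothetical choices made for the example and do not come from the paper.

```python
import numpy as np

def em_step(F, w):
    """One conventional EM update of the mixture weights.

    F[i, j] holds the known component density f_j evaluated at x_i
    (hypothetical layout chosen for this sketch); w is the current weight vector.
    """
    mix = F @ w                                  # mixture density at each observation
    return w * (F / mix[:, None]).mean(axis=0)   # average posterior membership

def log_likelihood(F, w):
    """Mixture log-likelihood for weights w."""
    return float(np.sum(np.log(F @ w)))

def em_mixture_weights(F, n_iter=200, tol=1e-12):
    """Iterate the EM update from uniform starting weights."""
    w = np.full(F.shape[1], 1.0 / F.shape[1])
    for _ in range(n_iter):
        w_new = em_step(F, w)
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    return w

# Illustrative data (an assumption, not from the paper): observations drawn
# from an equal mixture of N(0, 1) and N(1, 1), fitted against four
# unit-variance normal components with poorly separated means.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
means = np.array([-1.0, 0.0, 1.0, 3.0])
F = np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2.0 * np.pi)

w0 = np.full(4, 0.25)
w1 = em_step(F, w0)
w_hat = em_mixture_weights(F)
```

Because each update multiplies the current weight by a positive factor, weights started in the interior stay strictly positive, which illustrates why the conventional algorithm can approach but never reach a boundary point with zero weights.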
Key Words: Augmentation; Complete data; EM algorithm; Finite mixture distribution; High-dimensional; Maximum likelihood; Missing data; Rate of convergence; Nonparametric mixture; Zero-elimination
Received October 1998. Revised September 2000