Biometrika Advance Access originally published online on November 5, 2008
Biometrika 2008 95(4):875-889; doi:10.1093/biomet/asn047
Pairwise curve synchronization for functional data
Division of Biostatistics, Center for Devices and Radiological Health, Food and Drug Administration, Rockville, Maryland 20850, U.S.A. rong.tang@fda.hhs.gov
Department of Statistics, University of California, Davis, California 95616, U.S.A. mueller@wald.ucdavis.edu
Received for publication 1 May 2007. Revision received 1 April 2008.
Data collected by scientists are increasingly in the form of trajectories or curves. Often these can be viewed as realizations of a composite process driven by both amplitude and time variation. We consider the situation in which functional variation is dominated by time variation, and develop a curve-synchronization method that uses every trajectory in the sample as a reference to obtain pairwise warping functions in the first step. These initial pairwise warping functions are then used to create improved estimators of the underlying individual warping functions in the second step. A truncated averaging process is used to obtain robust estimation of individual warping functions. The method compares well with other available time-synchronization approaches and is illustrated with Berkeley growth data and gene expression data for multiple sclerosis.
Key Words: Alignment; Curve registration; Functional data analysis; Gene expression profile; Multiple sclerosis; Synchronization; Time warping
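The two-step procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several ingredients are assumptions made only for illustration: the one-parameter warping family t^a on [0, 1], the grid search over exponents, the mean squared difference used as the pairwise misfit, and the rule of truncating by dropping the worst-fitting pairs (the `keep_fraction` parameter). The function names `warp`, `pairwise_warp` and `synchronize` are hypothetical.

```python
import numpy as np


def warp(t, a):
    """Assumed one-parameter monotone warp of [0, 1] with fixed endpoints."""
    return t ** a


def pairwise_warp(y_i, y_k, t, exponents):
    """Step 1: grid-search the warp g with y_i(g(t)) closest to y_k(t)."""
    best_a, best_cost = None, np.inf
    for a in exponents:
        # Evaluate curve i on the warped time scale by linear interpolation.
        cost = np.mean((np.interp(warp(t, a), t, y_i) - y_k) ** 2)
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a, best_cost


def synchronize(Y, t, exponents, keep_fraction=0.8):
    """Step 2: truncated average of pairwise warps, then align each curve.

    Y has one curve per row, sampled on the increasing grid t in [0, 1].
    Returns the averaged warps (one per row) and the aligned curves.
    """
    n = Y.shape[0]
    avg_warps = np.empty_like(Y, dtype=float)
    aligned = np.empty_like(Y, dtype=float)
    for k in range(n):
        # Use every other trajectory in the sample as a reference for curve k.
        fits = [pairwise_warp(Y[i], Y[k], t, exponents)
                for i in range(n) if i != k]
        a_vals = np.array([f[0] for f in fits])
        costs = np.array([f[1] for f in fits])
        # Truncation (assumed rule): keep only the best-fitting pairs.
        n_keep = max(1, int(np.ceil(keep_fraction * len(fits))))
        kept = np.argsort(costs)[:n_keep]
        g_bar = np.mean([warp(t, a) for a in a_vals[kept]], axis=0)
        avg_warps[k] = g_bar
        # g_bar is increasing, so its inverse is obtained by swapping axes.
        g_bar_inv = np.interp(t, g_bar, t)
        aligned[k] = np.interp(g_bar_inv, t, Y[k])
    return avg_warps, aligned
```

Calling `synchronize(Y, t, np.linspace(0.4, 2.5, 211))` on an array `Y` of time-warped copies of a single bump should pull the peak locations of the aligned curves closer together. The averaging step relies on the assumption that the individual warping functions average to the identity, so that the mean of the pairwise warps for a given curve approximates that curve's own time distortion.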