| Statistical Methods for Estimating Mutation Rate and Effective Population Size from Samples of DNA Sequences |
| The parameter Theta = 4N*mu, where N is the effective population size and mu is the mutation rate, plays a central role in population genetics in explaining the statistical properties of genetic variation. For DNA sequences, several unbiased (or approximately unbiased) estimators of Theta are in common use under the Wright-Fisher model without recombination or population subdivision, such as Fu's UPBLUE (Fu, 1994) under the infinite-site model and Deng's UPBLUE (Deng and Fu, 1996) under the finite-site model. Both UPBLUE estimators were derived from samples of size n<=100, so, as our simulations indicate, their "unbiasedness" is not satisfactory when n is considerably larger than 100. In this study we first give an extensive review and comparison, in terms of bias and variance (MSE), of some existing estimators of Theta: the classical estimators (Watterson's estimator and Tajima's estimator) and the phylogenetic estimators (Griffiths' estimator and Kuhner's estimator), using simulated samples of 5,000 replicates each. Then, based on the BLUE algorithm underlying both UPBLUE estimators, we will modify the two UPBLUE estimators by generalized linear model regression, using simulated samples of 10,000 replicates each. The two modified UPBLUE estimators will be used to estimate Theta and compared with the original UPBLUE estimators, as well as with the two classical estimators, on the same simulated samples (10,000 replicates). |
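
As an illustration of the two classical estimators named in the abstract, the following is a minimal sketch, not the code used in this study. It assumes an aligned sample of equal-length sequences and the infinite-site model, and returns Theta per sequence (not per site). The function names `watterson_theta` and `tajima_theta` and the toy `sample` are hypothetical, introduced only for this example.

```python
from itertools import combinations


def watterson_theta(seqs):
    """Watterson's estimator: S / a_n, where S is the number of
    segregating sites and a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    # A site is segregating if more than one distinct base appears in its column.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return s / a_n


def tajima_theta(seqs):
    """Tajima's estimator: mean number of pairwise differences
    over all n(n-1)/2 pairs of sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    )
    return total_diffs / (n * (n - 1) / 2)


# Toy aligned sample (hypothetical data, for illustration only).
sample = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
theta_w = watterson_theta(sample)
theta_pi = tajima_theta(sample)
print(f"Watterson: {theta_w:.4f}  Tajima: {theta_pi:.4f}")
```

Both functions estimate the same parameter from different summaries of the data (the count of segregating sites versus the average pairwise difference), which is why comparing their bias and variance on common simulated samples is informative.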
| Feng Zhan, School of Public Health, University of Texas at Houston. Student Poster Session |
April 4-5, 2003
Texas A&M University
College Station, TX
Email: cots@stat.tamu.edu
Fax: (979) 845-3144
Phone: (979) 845-3141