The Department of Statistics at Texas A&M University, one of the premier statistics departments in the world, offers quality online statistics degree and certificate programs. This online statistics program is an integrated extension of the renowned on-campus program at Texas A&M University. It provides the same instruction, course materials, and exams, with the flexibility to fit your schedule. We offer a 36-hour non-thesis Master of Science in Statistics program and a 12-hour Certificate program. We can also provide individual Statistics courses.
Texas A&M Statistics is home once again to two of the top five teams in the national finals of the Capital One Modeling Competition, focused this year on search engine marketing. And the winner is . . . Texas A&M, placing first and third to claim its second title in the three-year-old event!
Texas A&M statistician Clifford Spiegelman is one of two Texas A&M University faculty members honored as 2014 Fellows of the American Association for the Advancement of Science (AAAS) for scientifically or socially distinguished efforts to advance science or its applications.
11:30 AM / 12:20 PM Blocker Building (BLOC), Room 113 979-845-3141
Department of Statistics
"Semi-Nonparametric Inference for Massive Data"
In this talk, we consider a partially linear framework for modelling (possibly heterogeneous) massive data. The major goal is to extract common features across all sub-populations while exploring the heterogeneity of each sub-population. In particular, we propose an aggregation-type estimator that possesses the (non-asymptotic) minimax optimal bound and the same asymptotic distribution as if there were no heterogeneity. Such an oracle result holds when the number of sub-populations does not grow too fast. A plug-in estimator for the heterogeneity parameter is further constructed and shown to possess the same asymptotic distribution as if the commonality information were available. Large-scale heterogeneity testing is also considered. Our general theory applies to the divide-and-conquer approach that is often used to deal with massive homogeneous data in a parallel computing environment. A technical by-product of this talk is statistical inference for general kernel ridge regression.
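As a rough illustration of the divide-and-conquer idea mentioned in the abstract above, the sketch below splits a homogeneous data set into blocks, fits a simple least-squares slope on each block, and averages the block estimates. It is not the speaker's estimator: the simple linear model, the simulated data, and the function names `ols_slope` and `divide_and_conquer_slope` are all assumptions made for illustration.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def divide_and_conquer_slope(xs, ys, n_blocks):
    """Average the slopes fitted separately on n_blocks contiguous chunks."""
    size = len(xs) // n_blocks
    slopes = []
    for b in range(n_blocks):
        lo = b * size
        # The last block absorbs any leftover observations.
        hi = len(xs) if b == n_blocks - 1 else lo + size
        slopes.append(ols_slope(xs[lo:hi], ys[lo:hi]))
    return sum(slopes) / n_blocks


# Simulated homogeneous data (hypothetical): y = 2x + noise.
random.seed(0)
true_slope = 2.0
xs = [random.gauss(0, 1) for _ in range(20000)]
ys = [true_slope * x + random.gauss(0, 0.5) for x in xs]

dc_slope = divide_and_conquer_slope(xs, ys, n_blocks=20)
full_slope = ols_slope(xs, ys)
print(dc_slope, full_slope)
```

In a parallel setting, each block fit could run on a separate worker, with only the per-block estimates sent back for averaging.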
01:00 PM / 02:00 PM Blocker Building (BLOC), Room 411 979-845-3141
PhD Candidate, Department of Statistics
Texas A&M University
"Statistical Inference for Medical Costs with Censored Data"
Cost-effectiveness analysis is widely conducted in the economic evaluation of new medical treatments. Censored cost data pose a unique problem for cost estimation because of the “induced informative censoring” problem. Thus, many standard approaches for survival analysis are not valid for the analysis of cost data. We first study how to intuitively explain some existing mean cost estimators, based on the generalized redistribute-to-the-right algorithm. Then we derive the confidence interval for the incremental cost-effectiveness ratio for a special case, when terminating events are different for survival time and costs. Finally, we present two improved survival estimators of costs, which are motivated by the generalized redistribute-to-the-right algorithm and the kernel method. The simulation studies and a real data example show that our methods perform very well in some practical settings.
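For readers unfamiliar with the redistribute-to-the-right idea, here is a minimal sketch of the classical version for right-censored survival times, in which each censored observation passes its probability mass equally to all later observations. This is not the generalized algorithm for costs discussed in the talk; the function name, the toy data, and the no-ties assumption are all illustrative.

```python
def redistribute_to_the_right(times, observed):
    """Return (time, mass) pairs for the observed event times.

    Every observation starts with mass 1/n. Moving from the smallest time to
    the largest, each censored observation hands its mass in equal shares to
    all observations to its right. Assumes no tied times.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(order)
    mass = [1.0 / n] * n
    for pos, i in enumerate(order):
        if not observed[i]:
            remaining = n - pos - 1
            if remaining > 0:
                share = mass[pos] / remaining
                for later in range(pos + 1, n):
                    mass[later] += share
                mass[pos] = 0.0
            # A censored largest observation has nowhere to send its mass,
            # so that mass stays unassigned to any event time.
    return [(times[i], mass[pos]) for pos, i in enumerate(order) if observed[i]]


# Toy data (hypothetical): True marks an observed event, False a censored time.
times = [1, 2, 3, 4, 5]
observed = [True, False, True, True, False]
jumps = redistribute_to_the_right(times, observed)
print(jumps)
```

With untied data, the resulting masses coincide with the jumps of the Kaplan-Meier survival curve at the observed event times.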
02:00 PM / 03:00 PM Blocker Building (BLOC), Room 503 979-845-3141
PhD Candidate, Department of Statistics
Texas A&M University
"Statistical Methods for Integrating Genomics Data"
This dissertation focuses on methodology to integrate multiplatform genomic data with cancer applications. Such integration facilitates the discovery of biological information crucial to the development of targeted treatments. We present iBAG (integrative Bayesian Analysis of Genomics data), a two-step hierarchical Bayesian model that uses the known biological relationships between genetic platforms to integrate an arbitrary number of platforms in a single model. This method identifies genes important to a clinical outcome, such as survival, and the integration approach also identifies which platforms are modulating the important gene effects. A glioblastoma multiforme (GBM) data set publicly available from The Cancer Genome Atlas (TCGA) is analyzed with both a linear and a nonlinear formulation of iBAG. Next we present a pathway iBAG model, piBAG, which includes gene pathway membership information and utilizes hierarchical shrinkage to simultaneously select important genes and assign pathway scores. The integration of multiple genomic platforms again allows us to determine which platform is regulating each important gene, and it also provides insight into the platform through which each pathway takes effect. We apply this method to a different subset of the TCGA GBM data. Finally, we present integrative heatmaps, a novel visualization tool for illustrating integrated data. We use a TCGA colorectal cancer data set to demonstrate the integrative heatmaps.