Course Calendar & Materials
STAT 605: Advanced Statistical Computations
David B. Dahl
Spring 2009
Wednesday, January 21, 2009
Computational numerical methods used by statisticians, including:
- Simulation (a.k.a., Monte Carlo) studies
- The evaluation of distributional functions for random variables
- Random number generation, including Markov chain Monte Carlo (MCMC)
- Numerical optimization & root finding methods, including the Newton-Raphson method and EM algorithm
- Resampling techniques, including the permutation test and the bootstrap
- Numerical treatment of linear algebra
- Algorithmic complexity
Computing world of a statistician
Computing resources:
- Graduate computer lab (room 421) has Linux workstations: g1.stat.tamu.edu to g8.stat.tamu.edu
- General purpose Linux login servers: s0.stat.tamu.edu to s1.stat.tamu.edu
- General purpose Linux computational servers: s7.stat.tamu.edu to s9.stat.tamu.edu
- Note: Only s0 and s1 are accessible outside of the university, unless you use VPN.
- Text-based login from Windows using PuTTY
- Graphical login from Windows using NX client
- File transfer using WinSCP
- Henrik's somewhat-dated orientation to computing infrastructure
Working at the command line:
VIM: Non-graphical text editor:
Friday, January 23, 2009
Subversion: Revision control system
Comprehensive documentation: Version Control with Subversion
Subversion clients:
- svn: Command-line client, recommended over graphical interfaces
- TortoiseSVN: Graphical interface for Subversion using Windows Explorer
- Available in KDE and GNOME desktop environments
- Available in many integrated development environments
In your home directory, check out the STAT 605 repository using: "svn co svn://dahl.stat.tamu.edu/stat605/2009a-spring stat605"
kompare4svn: Shell script to graphically show differences in revisions
Wednesday, January 28, 2009
Project Due
Computing topics:
- More on Subversion
- Configuration of VIM
- Line feeder in VIM
Simulation Studies:
- Represent a real-world process of interest on the computer to evaluate statistical properties associated with the process.
- Uses include:
- Evaluating point estimators
- Forming confidence intervals
- Testing hypotheses
- Describing distributions
- and many others...
- Especially useful when theoretical derivations are unavailable, difficult, or intractable.
- "Simulation study" and "Monte Carlo study" are synonymous.
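The ideas in the list above can be sketched with a small simulation study that evaluates two point estimators. This is a minimal sketch in Python (the course itself works in R); the function name `simulate_estimators`, the sample size, and the number of replications are illustrative assumptions, not taken from the course materials.

```python
import random
import statistics


def simulate_estimators(n_reps=2000, n=25, mu=0.0, sigma=1.0, seed=605):
    """Monte Carlo estimates of bias and MSE for the sample mean and median.

    All settings here are hypothetical defaults chosen for illustration.
    """
    rng = random.Random(seed)
    estimates = {"mean": [], "median": []}
    for _ in range(n_reps):
        # One replication: draw a sample from N(mu, sigma^2) and apply both estimators.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        estimates["mean"].append(statistics.fmean(sample))
        estimates["median"].append(statistics.median(sample))

    summary = {}
    for name, values in estimates.items():
        bias = statistics.fmean(values) - mu
        mse = statistics.fmean([(v - mu) ** 2 for v in values])
        summary[name] = {"bias": bias, "mse": mse}
    return summary


summary = simulate_estimators()
for name, stats in summary.items():
    print(f"{name}: bias = {stats['bias']:.4f}, MSE = {stats['mse']:.4f}")
```

For normal data the sample mean should show the smaller mean squared error, close to sigma^2 / n, which gives a quick sanity check on the simulation itself.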
Monte Carlo Studies in Statistics by James E. Gentle
Read Givens and Hoeting, pgs. 143-144
Notes on Monte Carlo integration
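As a rough illustration of Monte Carlo integration, the sketch below estimates the integral of exp(-x^2) over [0, 1] by averaging the integrand at uniform random points, and reports a Monte Carlo standard error. The integrand and the helper name `mc_integrate` are assumptions chosen for the example (in Python, not the course's R), not something taken from the linked notes.

```python
import math
import random


def mc_integrate(f, n=100_000, seed=605):
    """Estimate the integral of f over [0, 1] with n uniform draws.

    Returns the estimate and its Monte Carlo standard error.
    """
    rng = random.Random(seed)
    values = [f(rng.random()) for _ in range(n)]
    estimate = sum(values) / n
    variance = sum((v - estimate) ** 2 for v in values) / (n - 1)
    std_error = math.sqrt(variance / n)
    return estimate, std_error


estimate, std_error = mc_integrate(lambda x: math.exp(-x * x))

# Closed form for comparison: (sqrt(pi) / 2) * erf(1).
exact = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(f"estimate = {estimate:.5f} (s.e. {std_error:.5f}), exact = {exact:.5f}")
```

The standard error shrinks at the rate 1 / sqrt(n), so each additional digit of accuracy costs roughly 100 times as many draws.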
Friday, January 30, 2009
OpenSSH public key authentication
R Project for Statistical Computing
Wikipedia entry on R
Read "An Introduction to R"
R for MATLAB users
R example using Old Faithful data
Wednesday, February 4, 2009
Board game simulation
Probability distributions in R
Friday, February 6, 2009
Project Due
Welch's 1938 paper regarding two-sample tests for equality of means
Reminder of definition of power
Simulation studies comparing Welch's method to others in welch1938.R
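Power is the probability of rejecting the null hypothesis when a particular alternative is true, so it can be estimated as the fraction of simulated data sets in which the test rejects. The sketch below is an assumption-laden stand-in, not the contents of welch1938.R: it uses a one-sided, one-sample z-test with known variance (rather than Welch's two-sample method) because its exact power is available for comparison. It is written in Python, and all function names are illustrative.

```python
import random
from statistics import NormalDist, fmean


def simulated_power(mu_alt, n=20, sigma=1.0, alpha=0.05, n_reps=5000, seed=605):
    """Fraction of simulated samples in which H0: mu = 0 is rejected in favor of mu > 0."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(n_reps):
        sample = [rng.gauss(mu_alt, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / n ** 0.5)
        if z > critical:
            rejections += 1
    return rejections / n_reps


def exact_power(mu_alt, n=20, sigma=1.0, alpha=0.05):
    """Closed-form power of the same one-sided z-test."""
    critical = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(critical - mu_alt * n ** 0.5 / sigma)


size_estimate = simulated_power(0.0)   # under H0 this estimates the type I error rate
power_estimate = simulated_power(0.5)  # under the alternative mu = 0.5
print(f"type I error: {size_estimate:.3f} (nominal 0.05)")
print(f"power at mu = 0.5: {power_estimate:.3f} (exact {exact_power(0.5):.3f})")
```

The same loop structure applies to any test: replace the data generator and the rejection rule, and the rejection fraction estimates size under the null and power under an alternative.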
Simple linear regression illustrated in faithful.R
Wednesday, February 11, 2009
Project 2 reprise... Scripts are in "solutions" directory of the repository
Friday, February 13, 2009
Reminder of definition of p-value
Polya urn simulation to compute p-values and power in urn.R
New York Times articles regarding R: initial article and follow-up blog
Notes from Roger D. Peng on R functions
Wednesday, February 18, 2009
Project Due
Monte Carlo p-value for tack problem in 2005 examination
Permutation tests
Givens and Hoeting, Section 9.7
Fisher's exact test in dieting.R
Permutation test for correlation in gpa.R
Permutation test for earnings of brothers in 2005 examination
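A permutation test for correlation can be sketched as follows: shuffle one variable to break any association, recompute the correlation, and see how often the shuffled statistic is at least as extreme as the observed one. This is a minimal Python sketch and not the code in gpa.R; the data values and function names are hypothetical.

```python
import random


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_test(x, y, n_perm=5000, seed=605):
    """Two-sided Monte Carlo permutation p-value for H0: no association."""
    rng = random.Random(seed)
    observed = correlation(x, y)
    y_perm = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # permuting y breaks any pairing with x
        if abs(correlation(x, y_perm)) >= abs(observed):
            extreme += 1
    # Count the observed arrangement itself so the p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical paired measurements, invented for this example.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9]
observed, p_value = permutation_test(x, y)
print(f"observed correlation = {observed:.3f}, permutation p-value = {p_value:.4f}")
```

Because only a random subset of permutations is examined, the p-value is itself a Monte Carlo estimate and carries Monte Carlo error.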
Friday, February 20, 2009
Permutation test for positional dependence among protein backbone torsion angle distributions, in the densities directory
Wednesday, February 25, 2009
A collection of shell commands in an executable file is called a shell script
We are using the bash shell
Quick guide to bash shell scripts
Advanced Bash-Scripting Guide: When you are ready to dive in!
Dirsize: Shell script to recursively show directory sizes
Hunter: Shell script to check computer resources in the department
Shell script for least-squares clustering method
R uses a shell script to get itself started
Running jobs in the background using "&" and "screen"
Friday, February 27, 2009
Project Due
Inverse CDF method
Givens and Hoeting, pg. 145
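The inverse CDF method draws U from Uniform(0, 1) and returns F^{-1}(U). A minimal sketch for the exponential distribution, where F(x) = 1 - exp(-rate * x), follows; it is written in Python and the function name is an illustrative assumption.

```python
import math
import random


def sample_exponential(rate, rng):
    """Draw from Exponential(rate) by inverting its CDF.

    Solving u = 1 - exp(-rate * x) for x gives x = -log(1 - u) / rate.
    """
    u = rng.random()  # uniform on [0, 1), so 1 - u is in (0, 1] and the log is finite
    return -math.log(1.0 - u) / rate


rng = random.Random(605)
draws = [sample_exponential(2.0, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.4f} (theoretical mean = {1 / 2.0})")
```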
Box-Muller Transformation
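The Box-Muller transformation turns two independent Uniform(0, 1) draws into two independent standard normal draws by sampling a radius and an angle. A short Python sketch, with an illustrative function name:

```python
import math
import random


def box_muller(rng):
    """Return a pair of independent N(0, 1) draws from two uniform draws."""
    u1 = 1.0 - rng.random()  # shift to (0, 1] so that log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(605)
normals = []
for _ in range(25_000):
    normals.extend(box_muller(rng))

normal_mean = sum(normals) / len(normals)
normal_var = sum((z - normal_mean) ** 2 for z in normals) / (len(normals) - 1)
print(f"mean = {normal_mean:.4f}, variance = {normal_var:.4f}")
```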
Sampling from familiar distributions
Givens and Hoeting, pg. 146
LaTeX
The Not So Short Introduction to LaTeX 2e
LaTeX Tutorials: A Primer
Kile
Spell check a LaTeX file "foo.tex" using "aspell -t -c foo.tex"
Including figures in LaTeX documents processed by "pdflatex"
Beamer: PowerPoint-like presentations using pdfLaTeX
Wednesday, March 4, 2009
random-exponentials: An example of an executable R script
use-executable-script.R: Using an executable R script in R code
Rejection sampling
Givens and Hoeting, pgs. 147-150
Generic code for rejection sampling in rejection-sampler.R
Sampling from a beta distribution using rejection sampling in sample-via-rejection.R
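Rejection sampling proposes from an easy distribution and accepts each proposal with probability proportional to the ratio of the target density to a scaled envelope. The sketch below is a Python illustration and not the contents of rejection-sampler.R or sample-via-rejection.R; the choice of a Beta(2, 3) target with a Uniform(0, 1) envelope is an assumption made for the example.

```python
import random


def beta23_density(x):
    """Density of Beta(2, 3): f(x) = 12 * x * (1 - x)^2 on [0, 1]."""
    return 12.0 * x * (1.0 - x) ** 2


def rejection_sample(n, rng):
    """Draw n values from Beta(2, 3) using a Uniform(0, 1) envelope.

    The density peaks at x = 1/3 with height 16/9, so bound * 1 >= f(x) everywhere.
    """
    bound = 16.0 / 9.0
    draws = []
    proposals = 0
    while len(draws) < n:
        x = rng.random()  # proposal from the envelope
        u = rng.random()  # uniform used for the accept/reject decision
        proposals += 1
        if u * bound <= beta23_density(x):
            draws.append(x)
    return draws, n / proposals


rng = random.Random(605)
beta_draws, acceptance_rate = rejection_sample(20_000, rng)
beta_mean = sum(beta_draws) / len(beta_draws)
print(f"sample mean = {beta_mean:.4f} (theoretical 0.4)")
print(f"acceptance rate = {acceptance_rate:.4f} (theoretical 9/16 = 0.5625)")
```

The acceptance rate equals one over the bound when both densities are normalized, which is why a tight envelope matters for efficiency.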
Project 4 reprise
Friday, March 6, 2009
Project Due
Importance sampling, take I
Importance sampling, take II
Givens and Hoeting, pgs. 162-169
Demonstration of importance sampling in R
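Importance sampling estimates an expectation under a target density by sampling from a different proposal density and reweighting each draw by the ratio of target to proposal. As a hedged illustration (a Python sketch, not the course's R demonstration), the example below estimates the tail probability P(Z > 4) for a standard normal Z using a normal proposal centered at the threshold; the example problem and names are assumptions.

```python
import math
import random
from statistics import NormalDist


def tail_probability_is(threshold=4.0, n=50_000, seed=605):
    """Estimate P(Z > threshold), Z ~ N(0, 1), with a N(threshold, 1) proposal.

    The importance weight is phi(x) / phi(x - threshold)
    = exp(threshold^2 / 2 - threshold * x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            total += math.exp(threshold * threshold / 2.0 - threshold * x)
    return total / n


is_estimate = tail_probability_is()
exact_tail = 1.0 - NormalDist().cdf(4.0)
print(f"importance sampling estimate = {is_estimate:.3e}, exact = {exact_tail:.3e}")
```

Plain Monte Carlo with the same number of draws from N(0, 1) would typically see only one or two values beyond 4, so the reweighted estimate is far more precise here.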
Wednesday, March 11, 2009
Scripting Languages, Ruby, and R: Why?
Ruby homepage
Ruby documentation
Try Ruby! using Firefox (not Konqueror)
Ruby in Twenty Minutes
Ruby User's Guide
Ruby Core Reference
Friday, March 13, 2009
Reading and writing files in Ruby
RCon example
Hotdog example
Environments for regexp
Wednesday, March 25, 2009
Project Due
Regular expressions
Quick start on regexp
KDE's regular expression editor: kregexpeditor
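As a small illustration of regular expressions, the sketch below uses named groups to pull `name = value` settings out of text lines and to skip lines that do not match. It uses Python's `re` module rather than the Ruby used in class, and the input lines are invented for the example.

```python
import re

# ^ and $ anchor the whole line; (?P<name>...) and (?P<value>...) are named groups.
# The value is an optional minus sign, digits, and an optional decimal part.
pattern = re.compile(r"^(?P<name>\w+)\s*=\s*(?P<value>-?\d+(?:\.\d+)?)$")

# Hypothetical input lines.
lines = ["alpha = 0.05", "n=25", "not a setting", "mu = -1.5"]

parsed = {}
for line in lines:
    match = pattern.match(line)
    if match:
        parsed[match.group("name")] = float(match.group("value"))

print(parsed)
```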
Centroids example
Friday, March 27, 2009
Milano example
RinRuby: Accessing the R Interpreter from Pure Ruby
Wednesday, April 1, 2009
svnadmin: To manage (i.e. create, backup, etc.) Subversion repositories
Comprehensive documentation: Version Control with Subversion
Friday, April 3, 2009
Project Due
quota: shell and Ruby script
xtable: R contributed package to export tables to LaTeX or HTML
Writing R Extensions
CRAN Package Check Results
rsync: Utility that provides fast incremental file transfer
rsync example for updating your webpage
unison: Bidirectional file synchronizer
Review for midterm exam
Discussion of Project 8
Discussion of Project 9
Wednesday, April 8, 2009
Computer-based midterm exam in graduate student computer lab
Friday, April 10, 2009
University-wide "reading day" --- No class
Wednesday, April 15, 2009
Introduction to Markov chain Monte Carlo (MCMC) in simple-pmf.R
Markov chain Monte Carlo (MCMC), take I
Friday, April 17, 2009
MCMC for the beta distribution in R
Givens and Hoeting, Chapter 7 introduction, Section 7.1, and Section 7.3
Markov chain Monte Carlo (MCMC), take II
Article: Monte Carlo Sampling Methods Using Markov Chains and Their Applications
Article: Understanding the Metropolis-Hastings Algorithm
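A random-walk Metropolis sampler, the simplest special case of the Metropolis-Hastings algorithm, can be sketched in a few lines. The example below targets a Beta(2, 3) distribution through its unnormalized density; it is a Python illustration under assumed settings (step size, chain length, burn-in) and is not the course's R code.

```python
import math
import random


def log_target(x):
    """Log of the unnormalized Beta(2, 3) density, x * (1 - x)^2 on (0, 1)."""
    if x <= 0.0 or x >= 1.0:
        return -math.inf
    return math.log(x) + 2.0 * math.log(1.0 - x)


def metropolis(n_iter=50_000, step=0.3, start=0.5, seed=605):
    """Random-walk Metropolis with a symmetric Uniform(-step, step) proposal."""
    rng = random.Random(seed)
    x = start
    chain = []
    accepted = 0
    for _ in range(n_iter):
        proposal = x + rng.uniform(-step, step)
        # The proposal is symmetric, so the acceptance ratio is target(proposal) / target(x).
        log_ratio = log_target(proposal) - log_target(x)
        if math.log(1.0 - rng.random()) < log_ratio:
            x = proposal
            accepted += 1
        chain.append(x)  # a rejected proposal repeats the current state
    return chain, accepted / n_iter


chain, mh_acceptance = metropolis()
kept = chain[1000:]  # discard an assumed burn-in period
mh_mean = sum(kept) / len(kept)
print(f"posterior mean estimate = {mh_mean:.4f} (theoretical 0.4)")
print(f"acceptance rate = {mh_acceptance:.3f}")
```

Working on the log scale avoids underflow when densities are tiny, and only the unnormalized target is needed because the normalizing constant cancels in the ratio.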
Bayesian logistic regression via mcmc-sampler.R
Wednesday, April 22, 2009
Gibbs sampler for MCMC
Givens and Hoeting, Section 7.2
Gibbs sampler for mean and precision in mean-precision.R
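A Gibbs sampler alternates draws from each parameter's full conditional distribution. The sketch below handles normal data with unknown mean and precision under an assumed conjugate setup: mu ~ N(m0, 1/p0) and tau ~ Gamma(a, rate b), independently a priori. These priors, the simulated data, and all names are assumptions for illustration in Python; the model in mean-precision.R may differ.

```python
import random
import statistics


def gibbs_mean_precision(y, n_iter=6000, burn_in=1000,
                         m0=0.0, p0=1e-6, a=0.01, b=0.01, seed=605):
    """Gibbs sampler for y_i ~ N(mu, 1 / tau) with the priors described above."""
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    mu, tau = ybar, 1.0  # starting values
    mu_draws, tau_draws = [], []
    for i in range(n_iter):
        # mu | tau, y is normal with precision p0 + n * tau.
        post_prec = p0 + n * tau
        post_mean = (p0 * m0 + tau * n * ybar) / post_prec
        mu = rng.gauss(post_mean, post_prec ** -0.5)
        # tau | mu, y is gamma with shape a + n / 2 and rate b + SS / 2.
        ss = sum((yi - mu) ** 2 for yi in y)
        # random.gammavariate takes a scale parameter, so pass 1 / rate.
        tau = rng.gammavariate(a + n / 2.0, 1.0 / (b + ss / 2.0))
        if i >= burn_in:
            mu_draws.append(mu)
            tau_draws.append(tau)
    return mu_draws, tau_draws


# Hypothetical data: 50 draws from N(10, 2^2).
data_rng = random.Random(2009)
y = [data_rng.gauss(10.0, 2.0) for _ in range(50)]
ybar = statistics.fmean(y)
s2 = statistics.variance(y)

mu_draws, tau_draws = gibbs_mean_precision(y)
post_mu = statistics.fmean(mu_draws)
post_tau = statistics.fmean(tau_draws)
print(f"posterior mean of mu  = {post_mu:.3f} (sample mean {ybar:.3f})")
print(f"posterior mean of tau = {post_tau:.3f} (1 / sample variance {1 / s2:.3f})")
```

With priors this vague, the posterior means should sit close to the sample mean and the reciprocal of the sample variance, which gives a simple check on the sampler.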
Friday, April 24, 2009
Project Due
"There are 10 kinds of people in the world - those who understand binary and those who don't."
Binary integer arithmetic
Bits and bytes
Useful KDE programs: kcalc, khexedit/okteta
Endianness
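The binary topics above can be explored directly from an interpreter. The following Python sketch shows a base-2 conversion (including the "10" in the joke), an 8-bit two's complement pattern, and the byte order of a 16-bit integer under big-endian and little-endian layouts; the value 605 is just a convenient example.

```python
import struct

# "10" read as a binary numeral is the decimal number two.
two = int("10", 2)

value = 605
bits = format(value, "016b")         # 16-bit binary representation
hex_digits = format(value, "04x")    # the same value in hexadecimal

# Two's complement: -1 stored in 8 bits is all ones.
minus_one_bits = format(-1 & 0xFF, "08b")

# Endianness: the same 16-bit unsigned integer laid out in memory two ways.
big_endian = struct.pack(">H", value)     # most significant byte first
little_endian = struct.pack("<H", value)  # least significant byte first

print(two, bits, hex_digits, minus_one_bits)
print(big_endian.hex(), little_endian.hex())
print(int.from_bytes(little_endian, "little"))
```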
Notes for Dean Joe Newton
Wikipedia information on IEEE floating point standard
Underflow and overflow
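Overflow and underflow in IEEE double precision are easy to trigger, and working on the log scale is a common way to avoid them. The sketch below is a Python illustration; the `log_sum_exp` helper is a standard stabilization trick named here for the example and is not taken from the course notes.

```python
import math
import sys

largest = sys.float_info.max         # largest finite double
overflowed = largest * 2.0           # exceeds the representable range and becomes inf
smallest_subnormal = 5e-324          # smallest positive double
underflowed = smallest_subnormal / 2.0  # rounds to exactly zero
tiny = math.exp(-1000.0)             # also underflows to zero


def log_sum_exp(log_values):
    """Compute log(sum(exp(v))) without underflow or overflow.

    Subtracting the maximum first keeps every exponent at or below zero.
    """
    shift = max(log_values)
    return shift + math.log(sum(math.exp(v - shift) for v in log_values))


# A naive log(exp(-1000) + exp(-1001)) would fail because both terms underflow to zero.
stable = log_sum_exp([-1000.0, -1001.0])

print(overflowed, underflowed, tiny)
print(0.1 + 0.2 == 0.3)  # rounding error: decimal fractions are not exact in binary
print(stable)
```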
Wednesday, April 29, 2009
Potential topics for final exam:
- Simulation studies
- Monte Carlo error
- Permutation tests
- Monte Carlo integration
- Inverse CDF method
- Rejection sampling
- Importance sampling
- Metropolis-Hastings algorithm
- Gibbs sampling
- Numerical stability
- Binary arithmetic
- Machine representation of floating point numbers
- Regular expressions
- Implementing algorithms in pseudo-code
Previous exams:
Accessing R from scripting languages:
Accessing low-level languages (e.g., C, C++, Fortran, Java) from R:
Some other interesting items:
Friday, May 1, 2009
Written final (10:15am - 12:30pm in regular classroom)
Tuesday, May 5, 2009
Student presentations for Project 9
Wednesday, May 13, 2009
Project Due
Student presentations for Project 9 during exam time (10:30am - 12:30pm in regular classroom)