Review for Exam 1
Describing Univariate Data
Type of data: categorical vs. numerical, continuous vs. discrete.
Measuring Location (Center)
The three most commonly used measures of center: mean, median, and mode.
Percentiles. Quartiles (1st quartile = Q1, 3rd quartile = Q3).
Extremes: minimum and maximum.
The median, upper and lower quartiles, minimum and maximum are
collectively called the five-number summary.
Z-score: How to compute it? What does it mean?
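As a worked illustration of the five-number summary and the z-score,
here is a minimal sketch using Python's standard statistics module.
The data are made up, and the quartiles use the module's "inclusive"
interpolation method. Textbooks differ on the quartile convention, so
values computed by hand in the course may not match exactly.

```python
import statistics

# Made-up sample, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Quartiles under the "inclusive" convention (an assumption; other
# conventions can give slightly different Q1 and Q3).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, median, q3, max(data))

# z-score: how many standard deviations an observation lies from the mean.
mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (divisor n - 1)

def z_score(x):
    return (x - mean) / sd

print(five_number)
print(round(z_score(9), 3))
```

A positive z-score means the observation is above the mean, a negative
one means it is below, and the magnitude is the distance in standard
deviations.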
Measuring Scale (Spread)
Most commonly used: variance, standard deviation (square root of
variance), interquartile range (IQR), range (max - min).
Shapes of distribution: symmetric, skewed left, or skewed right.
Unimodal: has a single peak.
Tails: long-tailed (many outliers) or short-tailed (few outliers).
When are the median and IQR preferable to the mean and variance? When
outliers are present or the distribution is skewed.
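The point about outliers can be checked numerically. The sketch below
uses made-up data and replaces the largest value with an outlier; the
helper name summarize and the "inclusive" quartile convention are
assumptions chosen for illustration.

```python
import statistics

def summarize(xs):
    # Quartiles via the "inclusive" convention (an assumption).
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "mean": statistics.mean(xs),
        "sd": statistics.stdev(xs),
        "median": statistics.median(xs),
        "iqr": q3 - q1,
    }

clean = [10, 12, 13, 14, 15, 16, 18]   # made-up data
with_outlier = clean[:-1] + [180]      # largest value replaced by an outlier

for name, xs in [("clean", clean), ("with outlier", with_outlier)]:
    print(name, summarize(xs))
```

In this example the median and IQR are unchanged by the outlier, while
the mean and standard deviation shift substantially.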
Empirical Rule: what does it say? What assumption(s) do you need to
use it?
Chebychev's Rule: what does it say? What assumption(s) do you need
to use it?
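To compare the two rules side by side, the following sketch computes
the fraction of a normal distribution within k standard deviations of
the mean (the source of the Empirical Rule's roughly 68%, 95%, 99.7%)
and Chebyshev's lower bound 1 - 1/k^2, which holds for any distribution
with a finite variance. The function names are illustrative.

```python
from statistics import NormalDist

def chebyshev_lower_bound(k):
    """Minimum fraction within k standard deviations of the mean."""
    return 1 - 1 / k**2

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations."""
    z = NormalDist()  # standard normal: mean 0, sd 1
    return z.cdf(k) - z.cdf(-k)

# For k = 1 the Chebyshev bound is 0, so it says nothing useful there.
for k in (1, 2, 3):
    print(k, round(normal_fraction_within(k), 4), chebyshev_lower_bound(k))
```

The Chebyshev bound is always weaker than the normal-based fraction
because it makes no assumption about the shape of the distribution.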
Stem-and-leaf display: How to construct it? What does it show?
Advantages and disadvantages?
Histogram: What does it show? Advantages and disadvantages?
Boxplot: How to construct it? What does it show? Advantages and
disadvantages?
Normal quantile plot: What does it show?
Pearson's correlation coefficient: what does it measure? when is it
applicable? what values can it take on?
Spearman's rank correlation coefficient: what is it? why use it?
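One way to see why the rank correlation is useful is a relationship
that is perfectly monotone but not linear. The sketch below uses
made-up data (y = x cubed) and hand-written helpers; the ranking
helper assumes there are no ties, which is a simplification.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    # Rank 1 = smallest value. Assumes no ties (illustrative simplification).
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    # Spearman's coefficient is Pearson's coefficient applied to the ranks.
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]  # monotone increasing, but not linear

print(round(pearson(x, y), 4))
print(round(spearman(x, y), 4))
```

Here Spearman's coefficient equals 1 because the ranks agree exactly,
while Pearson's coefficient falls short of 1 because the points do not
lie on a straight line.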
Least Squares line: what is it? what's the relationship with the
correlation coefficient? what do the intercept and slope mean?
Coefficient of determination: What is it? what does it mean? what
values can it take on?
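The links among the least squares line, the correlation coefficient r,
and the coefficient of determination can be verified on a small
example. This is a sketch on made-up data using the usual sums of
squares; the variable names are illustrative.

```python
import math

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

slope = sxy / sxx                      # change in fitted y per unit of x
intercept = mean_y - slope * mean_x    # fitted y when x = 0
r = sxy / math.sqrt(sxx * syy)         # Pearson's correlation coefficient

# Coefficient of determination: share of the variation in y explained
# by the fitted line.
fitted = [intercept + slope * a for a in x]
sse = sum((b - f) ** 2 for b, f in zip(y, fitted))
r_squared = 1 - sse / syy

print(round(slope, 3), round(intercept, 3))
print(round(r ** 2, 6), round(r_squared, 6))
```

In simple linear regression the coefficient of determination equals
r squared, and the slope equals r times the ratio of the standard
deviation of y to that of x.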
Graphical Summaries: scatterplot.
Definition (or meaning of) sample space, event, random experiment,
union, intersection, complement, mutually exclusive, random variable,
The three basic rules of probability.
Conditional probability: what does it mean?
The difference between mutually exclusive and independent.
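The distinction between mutually exclusive and independent events can
be made concrete with one roll of a fair six-sided die. The events and
helper names below are illustrative choices, and exact fractions are
used to avoid rounding.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space: one roll of a fair die

def prob(event):
    # Equally likely outcomes: P(E) = |E| / |sample space|.
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    # P(A | B) = P(A and B) / P(B), defined when P(B) > 0.
    return prob(a & b) / prob(b)

def mutually_exclusive(a, b):
    return len(a & b) == 0

def independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
odd = {1, 3, 5}
at_most_4 = {1, 2, 3, 4}

print(cond_prob(even, at_most_4), prob(even))
print(independent(even, at_most_4), mutually_exclusive(even, at_most_4))
print(independent(even, odd), mutually_exclusive(even, odd))
```

The events "even" and "at most 4" are independent but not mutually
exclusive; "even" and "odd" are mutually exclusive but not independent,
since knowing one occurred rules out the other.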
What are the situations/setup for the following distributions:
binomial, hypergeometric, negative binomial, Poisson.
The difference between sampling with and without replacement.
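Sampling with replacement leads to the binomial distribution, and
sampling without replacement from a finite population leads to the
hypergeometric distribution. The sketch below compares the two for a
hypothetical lot of 20 items with 5 defectives and a sample of 4; the
numbers and function names are assumptions chosen for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(k successes in n independent trials, success probability p).
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeometric_pmf(k, N, K, n):
    # P(k successes in n draws without replacement from a population
    # of size N that contains K successes).
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 20, 5, 4   # hypothetical lot size, defectives, sample size
p = K / N            # success probability on each draw with replacement

for k in range(n + 1):
    with_repl = binomial_pmf(k, n, p)
    without_repl = hypergeometric_pmf(k, N, K, n)
    print(k, round(with_repl, 4), round(without_repl, 4))
```

The two sets of probabilities differ noticeably here because the
sample is a large share of the population; they get closer as the
population grows relative to the sample.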
Normal Distribution, Sampling Distribution of a Statistic
How does a generic normal distribution relate to the standard normal
distribution?
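The relationship is standardization: if X is normal with mean mu and
standard deviation sigma, then Z = (X - mu) / sigma is standard normal.
The sketch below checks this numerically with Python's
statistics.NormalDist; the values of mu, sigma, and x are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 100, 15   # hypothetical mean and standard deviation
x = 130

z = (x - mu) / sigma  # standardized value

p_direct = NormalDist(mu, sigma).cdf(x)   # P(X <= x) on the original scale
p_standard = NormalDist().cdf(z)          # P(Z <= z) on the standard scale

print(z, round(p_direct, 5), round(p_standard, 5))
```

Both probabilities agree, which is why a single standard normal table
suffices for any normal distribution.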
Definitions: population, random sample, parameter, statistic, sampling
distribution.
Basic situations for statistical inference: what are the parameters
and corresponding statistics?
Sampling from one, two or several continuous populations.
Sampling from one, two or several 0-1 populations.
Simple linear regression and correlation.
Central Limit Theorem
What does it say? What does it assume?
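A small simulation can make the theorem tangible. The sketch below
draws repeated samples from a strongly right-skewed population (an
exponential distribution with mean 1 and standard deviation 1, chosen
only for illustration) and looks at the distribution of the sample
means. The sample size and number of repetitions are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_mean(n):
    # Mean of n independent draws from an exponential population
    # with mean 1 and standard deviation 1.
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(2000)]

# Theory: the sample means center on the population mean (1) with
# standard deviation sigma / sqrt(n) = 1 / sqrt(50), about 0.141.
print(round(statistics.mean(means), 3))
print(round(statistics.stdev(means), 3))
```

A histogram of the simulated means would look roughly bell-shaped even
though the population itself is skewed.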
Normal approximation to the binomial probabilities: an application of
the Central Limit Theorem.
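To see how well the normal approximation works, the sketch below
compares an exact binomial probability with its normal approximation,
using a continuity correction of 0.5. The values n = 50, p = 0.4, and
k = 24 are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.4   # hypothetical number of trials and success probability
k = 24

# Exact P(X <= k) from the binomial probabilities.
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with matching mean and standard deviation,
# evaluated at k + 0.5 (continuity correction).
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(k + 0.5)

print(round(exact, 4), round(approx, 4))
```

The approximation tends to be good when both np and n(1 - p) are
reasonably large; the exact threshold varies by textbook.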
A transformed statistic measures how close a statistic is to the
corresponding parameter. The sampling distribution of a transformed
statistic tells us how likely it is for a random sample to yield a
statistic with the value we got from our particular sample.
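One common example of a transformed statistic is the z-statistic for a
sample mean, z = (x_bar - mu) / (sigma / sqrt(n)), which is standard
normal when the population is normal with known sigma, and
approximately so for large n. The sketch below is an illustration
under those assumptions; all the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50, 10   # hypothetical population mean and sd (assumed known)
n = 25               # sample size
x_bar = 54           # hypothetical observed sample mean

# Distance between the statistic and the parameter, in standard errors.
z = (x_bar - mu) / (sigma / sqrt(n))

# How likely is a sample mean at least this far above mu?
p_at_least = 1 - NormalDist().cdf(z)

print(z, round(p_at_least, 4))
```

A small probability here means a sample mean this far from mu would be
unusual if mu were really the population mean.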