Help for Random Sampling Laboratory
Objectives:
- Graphically illustrate what random sampling looks like.
- Motivate the idea of the distribution of a population.
- Preview the idea of sampling distribution of a statistic.
Description:
When you start this laboratory:
- 100 boxes are placed on the screen in 10 columns each containing 10
boxes. The boxes are numbered from 1 to 100 and each box in a column has the
value marked on the horizontal axis below that column of boxes.
- Think of the 100 boxes as being a population. It is discrete since
it only has 10 different values. It has the uniform distribution because
there are the same number of boxes having each value. In the How are Things
Distributed Lab we will see that populations can have many different distributions.
For example, the bell shaped or normal distribution, instead of having the
same number of boxes for each value, would have many boxes for the value
in the middle and then fewer boxes for each value away from the middle.
- The lab randomly selects 5 of the boxes to be a sample from the population.
These boxes are highlighted in red.
- The sample mean (average) of the 5 values is calculated and a yellow
box is placed on top of the column of boxes closest to the value of the
mean.
- A dialog box appears with the following choices.
- Reset: Clicking on this option starts the lab over again.
- Sample: This choice tells the lab to select another random sample.
The old sample (represented by red boxes) is redrawn in blue so that the
population is complete again. Then a new sample is selected and colored in
red, its mean is calculated, and another yellow box is drawn above the
appropriate column of boxes.
- Help: Clicking here causes this help screen to appear.
- Close: This option ends the Random Sampling Lab and returns control
to the Stataquest main menu.
- 5 radio buttons allowing you to select a sample size of 1, 2, 5, 10, or
20. Click on the desired sample size.
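The steps the lab carries out for each sample can be sketched in a few lines
of Python. This is only an illustration, not the lab's actual code. It assumes
that the column values are 1 through 10 (the help text does not say what is
marked on the axis), that a box cannot be chosen twice in one sample, and that
a mean exactly halfway between two columns goes to the lower one. The function
names are made up for this sketch.

```python
import random

# Assumed column values; the help text does not state what is on the axis.
VALUES = list(range(1, 11))

def make_population(values=VALUES, boxes_per_value=10):
    """Return 100 boxes: 10 boxes for each of the 10 column values."""
    return [v for v in values for _ in range(boxes_per_value)]

def draw_sample(population, n, rng):
    """Pick n distinct boxes (by position) and return their values."""
    indices = rng.sample(range(len(population)), n)
    return [population[i] for i in indices]

def nearest_column(mean, values=VALUES):
    """Column closest to the mean; ties go to the lower column (an assumption)."""
    return min(values, key=lambda v: (abs(v - mean), v))

rng = random.Random(0)                 # fixed seed so the run can be repeated
population = make_population()
sample = draw_sample(population, 5, rng)          # the red boxes
mean = sum(sample) / len(sample)                  # the sample mean
print(sample, mean, nearest_column(mean))         # column of the yellow box
```

Changing the 5 in draw_sample to 1, 2, 10 or 20 corresponds to choosing a
different sample size in the dialog box.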
Things to Notice and Suggestions on How to Use the Lab:
- Rapidly click on the Sample button to get a feel for what randomness
really looks like. Notice that samples that look `nonrandom' appear surprisingly
often (such as three boxes chosen from the same row or column). Many people
think random means `spread out,' but of course what it really means is
that every subset of size n has the same chance of being chosen as any
other, including some subsets that don't look `random.'
- Try n = 1 and notice how after a while the yellow boxes start
to even out across the columns, as they should, although it is surprising
how much variability there can be from column to column even though the
population is known to be uniformly distributed.
- For n = 5 (or larger), do you notice that the yellow boxes tend to
fall primarily above the middle columns? This is because for the mean to
fall near an end column, almost all boxes in the sample would have to come
from that end of the population, and this is very unlikely. This leads to
what is called the Central Limit Theorem which we will study in later labs.
- If instead of being uniformly distributed, the population were highly
skewed, that is, had lots of boxes at one end and few at the other, then
the sample means would tend to fall toward the end where most of the boxes
are rather than above the middle columns.
- What we are doing is showing the Sampling Distribution of Sample Means,
that is, the values we are likely to get when we sample repeatedly
from a population. We will study this further in the Sampling Distribution
concept lab.
- For each of n = 2, n = 5, and n = 10, generate many samples and notice
how the yellow boxes tend to fall above fewer and fewer columns. Why is
this?
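If you would like to repeat these experiments many more times than you can
click, the following Python sketch draws 2000 samples for each sample size and
prints a rough text version of the yellow boxes. It is an illustration only
and is not part of the lab. It assumes the column values are 1 through 10,
that boxes are chosen without replacement, and that a mean halfway between
two columns goes to the lower column. The number of repetitions and the
function names are arbitrary choices made for this sketch.

```python
import random
import statistics

VALUES = list(range(1, 11))                          # assumed column values
POPULATION = [v for v in VALUES for _ in range(10)]  # 100 boxes, uniform

def sample_means(n, repetitions, rng):
    """Means of `repetitions` samples of n distinct boxes each."""
    return [statistics.mean(rng.sample(POPULATION, n))
            for _ in range(repetitions)]

def column_counts(means):
    """Count how many means land nearest each column (ties go lower)."""
    counts = {v: 0 for v in VALUES}
    for m in means:
        nearest = min(VALUES, key=lambda v: (abs(v - m), v))
        counts[nearest] += 1
    return counts

rng = random.Random(1)        # fixed seed so the run can be repeated
spreads = {}
for n in (1, 2, 5, 10, 20):
    means = sample_means(n, 2000, rng)
    spreads[n] = statistics.pstdev(means)
    print(f"n = {n:2d}  standard deviation of the means = {spreads[n]:.2f}")
    for value, count in column_counts(means).items():
        # one '#' for every 20 yellow boxes above this column
        print(f"  {value:2d} {'#' * (count // 20)}")
```

Compare the printed columns for the different values of n with what you saw
on the screen.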
Homework Based on the Lab:
Do the sixth (last) suggestion above and hand in plots for each of the three
values of n. Also hand in an explanation of why the variability in the means
gets smaller as n increases across the three plots.