Next: Summary Up: Inference for Two-Way Tables Previous: Expected Counts

Significance Tests

To test $H_0$, that there is no relationship between the row and column classifications, we use a statistic that compares the sample counts with their expected counts. For each cell we take the difference between the sample count and its expected count, square it, and divide by the expected count; we then sum over all cells. The result is called the chi-square statistic, $X^2$. It is calculated from the following formula:

$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

where "observed" represents the sample counts, "expected" represents the expected counts, and the sum is over all $r \times c$ entries in the sample or expected count tables.
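The calculation can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the 2 x 3 table of counts are made up for this example, and the expected counts are assumed to be computed as (row total x column total) / n.

```python
def chi_square_statistic(observed):
    """Return (X2, expected) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # Expected count for each cell: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (observed - expected)^2 / expected over every cell of the table.
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return x2, expected


# Hypothetical 2 x 3 table of counts (invented for illustration).
table = [[20, 30, 50],
         [30, 30, 40]]

x2, expected = chi_square_statistic(table)
print(expected)      # expected counts, one list per row
print(round(x2, 3))  # the chi-square statistic X^2
```

Cells whose sample count is far from the expected count contribute most to the sum, so a large value of $X^2$ is evidence against $H_0$.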

To test $H_0$, we need a distribution to compare $X^2$ with, under the assumption that $H_0$ is true. This leads us to the chi-square distribution. The $\chi^2$ distribution is described by a single parameter, its degrees of freedom, and it is skewed to the right.

The data for an $r \times c$ table can be obtained by random sampling as described by either of the two models previously discussed.

The null hypothesis to be tested is that the row and column classifications are independent (first model) or that the row classification proportions for the c populations are all equal (second model). The alternative hypothesis is that the null hypothesis is not true.

The test statistic is the $X^2$ statistic:

$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

If $H_0$ is true, the statistic $X^2$ has approximately a $\chi^2$ distribution with $(r - 1)(c - 1)$ degrees of freedom.
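As a small worked illustration (the table dimensions are hypothetical, chosen only for this example), a table with $r = 2$ rows and $c = 3$ columns gives:

```latex
\[
  \text{degrees of freedom} = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2 .
\]
```

One way to read this: once the row and column totals are fixed, only 2 of the 6 cell counts can be chosen freely; the rest are then determined.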

The p-value for the test is $P(\chi^2 \ge X^2)$, where $\chi^2$ is a random variable having the $\chi^2$ distribution with $(r - 1)(c - 1)$ degrees of freedom. The approximation is based on having a large sample. The sample is judged large enough if the average of the expected counts is 5 or more, and the smallest expected count is 1 or more.
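The p-value and the sample-size check can be sketched in Python using only the standard library. Everything named here is an assumption made for illustration: `chi2_sf` is a hand-written upper-tail probability based on the regularized incomplete gamma function (a statistics package would normally supply this), `large_enough` simply encodes the rule stated above, and the numbers come from a hypothetical 2 x 3 table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi2 >= x) for df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    log_prefactor = -z + a * math.log(z) - math.lgamma(a)

    if z < a + 1.0:
        # Series for the lower regularized gamma function; return 1 - P(a, z).
        term = 1.0 / a
        total = term
        n = 0
        while abs(term) > abs(total) * 1e-15:
            n += 1
            term *= z / (a + n)
            total += term
        return 1.0 - total * math.exp(log_prefactor)

    # Continued fraction (modified Lentz method) for the upper tail Q(a, z).
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefactor)


def large_enough(expected):
    """Rule from the text: average expected count >= 5 and smallest >= 1."""
    cells = [e for row in expected for e in row]
    return sum(cells) / len(cells) >= 5 and min(cells) >= 1


# Hypothetical values for a 2 x 3 table (invented for illustration).
expected = [[25.0, 30.0, 45.0],
            [25.0, 30.0, 45.0]]
x2 = 28 / 9                     # X^2 computed from that table
df = (2 - 1) * (3 - 1)          # (r - 1)(c - 1)

p_value = chi2_sf(x2, df)
print(large_enough(expected))   # is the chi-square approximation reasonable?
print(round(p_value, 4))        # P(chi2 >= X^2)
```

With these made-up numbers the p-value is about 0.21, so such a table would not give strong evidence against $H_0$.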



Jan Lethen
Wed Nov 13 16:20:46 CST 1996