The F-Test by Hand Calculator - Department of Statistics


The F-Test by Hand Calculator

Where possible, one-way analysis of variance summaries should be obtained using a statistical computer package. Hand-calculation is a tedious and error-prone process, especially with large data sets. This section gives formulae for calculating the F-test statistic and introduces tables of the F-distribution for those situations in which computers with statistical software are not easily available.

We take each group in turn, calculating and recording its mean and standard deviation, as in the table below [1]. We then use the following formulae directly [2]. The degrees of freedom are df1 = k - 1 and df2 = ntot - k.

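$$\bar{x}_{..} = \frac{n_1\bar{x}_{1.} + n_2\bar{x}_{2.} + \cdots + n_k\bar{x}_{k.}}{n_1 + n_2 + \cdots + n_k},$$

$$s_B^2 = \frac{n_1(\bar{x}_{1.} - \bar{x}_{..})^2 + n_2(\bar{x}_{2.} - \bar{x}_{..})^2 + \cdots + n_k(\bar{x}_{k.} - \bar{x}_{..})^2}{k - 1},$$

$$s_W^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2}{n_{tot} - k},$$

and

$$f_0 = \frac{s_B^2}{s_W^2}.$$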

[Substantial simplifications can be made when all group sizes are equal; see below.]

Example 1 For the reading-methods data of Example 10.3.1 in the text we have k = 4 groups with summary statistics given in Table 1.

Table 1: Summary Statistics for the Reading Methods Data

Group           Size (ni)    Mean (xi.)    Std. dev. (si)
1. Both            22         1.4590909      1.543545
2. Map Only        12         1.2333333      1.441170
3. Scan Only        7         0.9142857      1.301830
4. Neither          9        -0.5555556      1.134803

Here, ntot = 22 + 12 + 7 + 9 = 50 and the degrees of freedom are df1 = k - 1 = 3 and df2 = ntot - k = 46.

[1] This subsection assumes the availability of a calculator which automatically calculates means and standard deviations.
[2] Recall that the grand mean, x.., is the average of all ntot observations (regardless of group).

$$\bar{x}_{..} = \frac{22 \times 1.4590909 + 12 \times 1.2333333 + 7 \times 0.9142857 + 9 \times (-0.5555556)}{50} = 0.966$$

$$s_B^2 = \frac{22 \times (1.4590909 - 0.966)^2 + 12 \times (1.2333333 - 0.966)^2 + 7 \times (0.9142857 - 0.966)^2 + 9 \times (-0.5555556 - 0.966)^2}{3} = 9.02052$$

$$s_W^2 = \frac{21 \times 1.543545^2 + 11 \times 1.441170^2 + 6 \times 1.301830^2 + 8 \times 1.134803^2}{46} = 2.029361$$

$$f_0 = 9.02052 / 2.029361 = 4.445.$$

This agrees with the computer-generated value of f0 given in Fig. 10.3.2 in the text.
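The hand calculation of Example 1 can be cross-checked with a few lines of code. The following is only a minimal sketch using the Python standard library; the function name f_statistic and its argument names are illustrative choices, not anything defined in the text, and the inputs are the Table 1 summary statistics.

```python
from math import fsum


def f_statistic(sizes, means, sds):
    """Return (grand mean, s2_B, s2_W, f0) from per-group summary statistics.

    sizes, means and sds hold the group sizes, group means and group
    standard deviations (hypothetical argument names).
    """
    k = len(sizes)
    n_tot = sum(sizes)
    # Grand mean: group means weighted by group size.
    grand_mean = fsum(n * m for n, m in zip(sizes, means)) / n_tot
    # Between-groups mean square, divided by df1 = k - 1.
    s2_between = fsum(n * (m - grand_mean) ** 2
                      for n, m in zip(sizes, means)) / (k - 1)
    # Within-groups mean square, divided by df2 = n_tot - k.
    s2_within = fsum((n - 1) * s ** 2
                     for n, s in zip(sizes, sds)) / (n_tot - k)
    return grand_mean, s2_between, s2_within, s2_between / s2_within


# Summary statistics from Table 1 (reading-methods data).
sizes = [22, 12, 7, 9]
means = [1.4590909, 1.2333333, 0.9142857, -0.5555556]
sds = [1.543545, 1.441170, 1.301830, 1.134803]

grand_mean, s2_b, s2_w, f0 = f_statistic(sizes, means, sds)
print(grand_mean, s2_b, s2_w, f0)
```

Up to rounding, the printed values should match the grand mean, the two mean squares and f0 obtained by hand above.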

Simplifications for equal sample sizes

In many examples, all of the individual sample sizes ni are the same, i.e. n1 = n2 = ... = nk (= n, say). Here, we can obtain $s_B^2$ as follows. Enter the individual sample means x1., x2., ..., xk. as k numbers into the calculator and obtain their sample standard deviation; we will write the result as $s_{\bar{x}}$. Calculate $s_B^2 = n s_{\bar{x}}^2$. Obtain $s_W^2$ as the sample mean of the k numbers $s_1^2, s_2^2, \ldots, s_k^2$ (note that it is the squares of the standard deviations which are being averaged).

Example 2 The summary statistics in Table 2 relate to the cell ratio data described in Review Exercises 10, problem 12 and given in Table 10 there. There are k = 6 treatment groups and n = 50 observations in each group. The degrees of freedom are df1 = k - 1 = 5 and df2 = ntot - k = 6 × 50 - 6 = 294.

Table 2: Summary Statistics for the Cell Ratio Data

Group                  Size (ni)    Mean (xi.)    Std. dev. (si)
1. Control                50         0.2366        0.1124243
2. Chloral hydrate        50         0.2686        0.1406111
3. Hydroquinone           50         0.2812        0.1204692
4. Diazepam               50         0.3116        0.1761535
5. Econidazole            50         0.2646        0.1258475
6. Colchicine             50         0.4482        0.1755189

The 6 numbers in the means column have sample standard deviation $s_{\bar{x}} = 0.07575022$. Thus, $s_B^2 = n s_{\bar{x}}^2 = 50 \times 0.07575022^2 = 0.2869048$. The sample mean of the squares of the 6 numbers in the standard deviation column is 0.02076634 [3]. Thus, $s_W^2 = 0.02076634$. Finally, $f_0 = s_B^2 / s_W^2 = 0.2869048 / 0.02076634 = 13.81586$.

[3] i.e. the sample mean of 0.1124243^2, 0.1406111^2, ..., and 0.1755189^2 is 0.02076634.
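The equal-sample-size shortcut maps directly onto the sample variance and mean functions that most environments provide. The sketch below is an illustration only, using Python's standard statistics module in place of the calculator; the variable names are assumptions, and the data are the means and standard deviations from Table 2.

```python
from statistics import mean, variance

n = 50  # common group size
means = [0.2366, 0.2686, 0.2812, 0.3116, 0.2646, 0.4482]
sds = [0.1124243, 0.1406111, 0.1204692, 0.1761535, 0.1258475, 0.1755189]

# s2_B = n times the sample variance of the k group means.
s2_between = n * variance(means)

# s2_W = sample mean of the squared group standard deviations.
s2_within = mean([s ** 2 for s in sds])

f0 = s2_between / s2_within
print(s2_between, s2_within, f0)
```

Note that variance returns the sample variance (divisor k - 1), which is the square of the sample standard deviation the calculator would report.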


Use of F-distribution Tables

Having obtained the F-test statistic using a hand calculator, we need tables of the F-distribution in order to obtain the corresponding P-values. The F-distribution is very similar in shape to the Chi-square distribution [4]. However, since the F-distribution depends upon two "degrees of freedom" parameters, we need a complete page of tables for each upper tail area of the distribution. The "10%" table in Appendix 1 at the end of this module gives us the value f such that pr(F ≥ f) = 0.10. For example, if we look up the entry in the column of the 10% table defined by df1 = 4 and the row defined by df2 = 8, we find that it is 2.81. Thus, for df1 = 4 and df2 = 8, pr(F ≥ 2.81) = 0.10. The 5% table in Appendix 2 and the 1% table in Appendix 3 work in the same way. From the df1 = 4 column and df2 = 8 row of each of the latter tables we find that pr(F ≥ 3.84) = 0.05 (5% table) and pr(F ≥ 7.01) = 0.01 (1% table). Thus, when F ~ F(df1 = 4, df2 = 8), we have

pr(F ≥ 2.81) = 0.10, pr(F ≥ 3.84) = 0.05, and pr(F ≥ 7.01) = 0.01.

Let us now use this information to bracket a P-value. Suppose that f0 = 3.12. Since f0 lies between 2.81 and 3.84, the P-value lies between 0.05 and 0.10, and it is closer to 0.10 than it is to 0.05.
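The bracketing argument amounts to comparing f0 with each tabulated critical value in turn. As a rough sketch (the function name bracket_p_value and the dictionary layout are assumptions made for illustration), using the three F(4, 8) entries quoted above:

```python
def bracket_p_value(f0, critical_values):
    """Bracket the P-value of f0 using tabulated upper-tail critical values.

    critical_values maps an upper-tail area to its tabulated f,
    e.g. {0.10: 2.81, 0.05: 3.84, 0.01: 7.01}.
    Returns (lower, upper) bounds on the P-value.
    """
    lower, upper = 0.0, 1.0
    for area, f in critical_values.items():
        if f0 >= f:
            # f0 is at or beyond this critical value: P-value <= area.
            upper = min(upper, area)
        else:
            # f0 falls short of this critical value: P-value > area.
            lower = max(lower, area)
    return lower, upper


f_4_8 = {0.10: 2.81, 0.05: 3.84, 0.01: 7.01}
bounds = bracket_p_value(3.12, f_4_8)
print(bounds)
```

When f0 is smaller than every tabulated value the sketch can only report that the P-value exceeds 0.10, and when it is larger than every value only that the P-value is at most 0.01.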

Suppose now that df1 = 13, df2 = 25 and the F-test statistic is f0 = 2.5. When we look up the tables, say Appendix 1, we find that we have df1 = 12 with f = 1.82 and df1 = 15 with f = 1.77, but there is no entry for df1 = 13. Without access to more detailed tables we have to choose one of these two entries. Which one? Since we always act conservatively, we choose the tabulated df1 with the bigger f-value (which makes it harder to get significance), namely df1 = 12. We therefore use the tabulated value of df1 immediately less than the value required. What happens if df2 is not tabulated? Inspecting the tables we see that we do the same thing with df2 as with df1. For example, if df2 = 33 we enter the table with df2 = 30.
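The conservative rule, "use the tabulated value immediately less than the value required", is a simple sorted-list lookup. The sketch below is illustrative; the names conservative_df, DF1_TABULATED and DF2_TABULATED are assumptions, and the two lists are the column and row headings of the 10% table in Appendix 1.

```python
from bisect import bisect_right

# Degrees of freedom tabulated in the 10% table (Appendix 1).
DF1_TABULATED = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 30, 60, 1000]
DF2_TABULATED = list(range(1, 31)) + [40, 60, 80, 100, 120, 1000]


def conservative_df(required, tabulated):
    """Return the largest tabulated df that does not exceed `required`."""
    i = bisect_right(tabulated, required)
    if i == 0:
        raise ValueError("required df is below the smallest tabulated value")
    return tabulated[i - 1]


print(conservative_df(13, DF1_TABULATED))  # df1 = 13 is not tabulated
print(conservative_df(33, DF2_TABULATED))  # df2 = 33 is not tabulated
```

A value that is tabulated is returned unchanged, so the same function can be applied to every lookup.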

Exercises

1. For the F-distribution, obtain upper and lower values that the P-value lies between in the following cases:
   (a) f0 = 2.13, df1 = 10, df2 = 6
   (b) f0 = 2.41, df1 = 5, df2 = 25
   (c) f0 = 3.83, df1 = 8, df2 = 18
   (d) f0 = 3.41, df1 = 4, df2 = 30
   (e) f0 = 2.98, df1 = 6, df2 = 45

2. If 0.01 < P-value < 0.05 for an F-test with df1 = 5 and df2 = 23, between what two values did f0 lie?

[4] In fact, if F ~ F(df1, df2 = ∞) then df1 × F ~ Chi-square(df1).


3. Recompute the analysis of variance table for the variable DISPERSION in Exercises 10.3 in the text with the outlier in the Black group omitted. (The F-ratio is 17.08.) Were your intuitions about the effect of the outlier confirmed?

Appendix 1: F-distribution, 10% Table

For fixed df1, df2, the tabulated value is the number f = F_{df1,df2}(0.10) such that for F ~ F(df1, df2), pr(F ≥ f) = 0.10.

[Figure: F(df1, df2) density curve, with upper-tail area prob = 0.10 to the right of f = F_{df1,df2}(0.10).]

Columns: df1. Rows: df2.

 df2     1     2     3     4     5     6     7     8     9    10    12    15    20    30    60  1000

   1  39.9  49.5  53.6  55.8  57.2  58.2  58.9  59.4  59.9  60.2  60.7  61.2  61.7  62.3  62.8  63.3
   2  8.53  9.00  9.16  9.24  9.29  9.33  9.35  9.37  9.38  9.39  9.41  9.42  9.44  9.46  9.47  9.49
   3  5.54  5.46  5.39  5.34  5.31  5.28  5.27  5.25  5.24  5.23  5.22  5.20  5.18  5.17  5.15  5.13
   4  4.54  4.32  4.19  4.11  4.05  4.01  3.98  3.95  3.94  3.92  3.90  3.87  3.84  3.82  3.79  3.76
   5  4.06  3.78  3.62  3.52  3.45  3.40  3.37  3.34  3.32  3.30  3.27  3.24  3.21  3.17  3.14  3.10

   6  3.78  3.46  3.29  3.18  3.11  3.05  3.01  2.98  2.96  2.94  2.90  2.87  2.84  2.80  2.76  2.72
   7  3.59  3.26  3.07  2.96  2.88  2.83  2.78  2.75  2.72  2.70  2.67  2.63  2.59  2.56  2.51  2.47
   8  3.46  3.11  2.92  2.81  2.73  2.67  2.62  2.59  2.56  2.54  2.50  2.46  2.42  2.38  2.34  2.29
   9  3.36  3.01  2.81  2.69  2.61  2.55  2.51  2.47  2.44  2.42  2.38  2.34  2.30  2.25  2.21  2.16
  10  3.29  2.92  2.73  2.61  2.52  2.46  2.41  2.38  2.35  2.32  2.28  2.24  2.20  2.16  2.11  2.06

  11  3.23  2.86  2.66  2.54  2.45  2.39  2.34  2.30  2.27  2.25  2.21  2.17  2.12  2.08  2.03  1.97
  12  3.18  2.81  2.61  2.48  2.39  2.33  2.28  2.24  2.21  2.19  2.15  2.10  2.06  2.01  1.96  1.90
  13  3.14  2.76  2.56  2.43  2.35  2.28  2.23  2.20  2.16  2.14  2.10  2.05  2.01  1.96  1.90  1.85
  14  3.10  2.73  2.52  2.39  2.31  2.24  2.19  2.15  2.12  2.10  2.05  2.01  1.96  1.91  1.86  1.80
  15  3.07  2.70  2.49  2.36  2.27  2.21  2.16  2.12  2.09  2.06  2.02  1.97  1.92  1.87  1.82  1.76

  16  3.05  2.67  2.46  2.33  2.24  2.18  2.13  2.09  2.06  2.03  1.99  1.94  1.89  1.84  1.78  1.72
  17  3.03  2.64  2.44  2.31  2.22  2.15  2.10  2.06  2.03  2.00  1.96  1.91  1.86  1.81  1.75  1.69
  18  3.01  2.62  2.42  2.29  2.20  2.13  2.08  2.04  2.00  1.98  1.93  1.89  1.84  1.78  1.72  1.66
  19  2.99  2.61  2.40  2.27  2.18  2.11  2.06  2.02  1.98  1.96  1.91  1.86  1.81  1.76  1.70  1.63
  20  2.97  2.59  2.38  2.25  2.16  2.09  2.04  2.00  1.96  1.94  1.89  1.84  1.79  1.74  1.68  1.61

  21  2.96  2.57  2.36  2.23  2.14  2.08  2.02  1.98  1.95  1.92  1.87  1.83  1.78  1.72  1.66  1.59
  22  2.95  2.56  2.35  2.22  2.13  2.06  2.01  1.97  1.93  1.90  1.86  1.81  1.76  1.70  1.64  1.57
  23  2.94  2.55  2.34  2.21  2.11  2.05  1.99  1.95  1.92  1.89  1.84  1.80  1.74  1.69  1.62  1.55
  24  2.93  2.54  2.33  2.19  2.10  2.04  1.98  1.94  1.91  1.88  1.83  1.78  1.73  1.67  1.61  1.53
  25  2.92  2.53  2.32  2.18  2.09  2.02  1.97  1.93  1.89  1.87  1.82  1.77  1.72  1.66  1.59  1.52

  26  2.91  2.52  2.31  2.17  2.08  2.01  1.96  1.92  1.88  1.86  1.81  1.76  1.71  1.65  1.58  1.50
  27  2.90  2.51  2.30  2.17  2.07  2.00  1.95  1.91  1.87  1.85  1.80  1.75  1.70  1.64  1.57  1.49
  28  2.89  2.50  2.29  2.16  2.06  2.00  1.94  1.90  1.87  1.84  1.79  1.74  1.69  1.63  1.56  1.48
  29  2.89  2.50  2.28  2.15  2.06  1.99  1.93  1.89  1.86  1.83  1.78  1.73  1.68  1.62  1.55  1.47
  30  2.88  2.49  2.28  2.14  2.05  1.98  1.93  1.88  1.85  1.82  1.77  1.72  1.67  1.61  1.54  1.46

  40  2.84  2.44  2.23  2.09  2.00  1.93  1.87  1.83  1.79  1.76  1.71  1.66  1.61  1.54  1.47  1.38
  60  2.79  2.39  2.18  2.04  1.95  1.87  1.82  1.77  1.74  1.71  1.66  1.60  1.54  1.48  1.40  1.29
  80  2.77  2.37  2.15  2.02  1.92  1.85  1.79  1.75  1.71  1.68  1.63  1.57  1.51  1.44  1.36  1.24
 100  2.76  2.36  2.14  2.00  1.91  1.83  1.78  1.73  1.69  1.66  1.61  1.56  1.49  1.42  1.34  1.21
 120  2.75  2.35  2.13  1.99  1.90  1.82  1.77  1.72  1.68  1.65  1.60  1.55  1.48  1.41  1.32  1.19

1000  2.71  2.30  2.08  1.94  1.85  1.77  1.72  1.67  1.63  1.60  1.55  1.49  1.42  1.34  1.24  1.00
