Analysis of Variance
Lecture 8
Mar 24th, 2009

A. Introduction
B. One-Way Analysis of Variance
C. Computing Contrasts
D. Analysis of Variance: Two Independent Variables
E. Interpreting Significant Interactions
F. N-Way Factorial Designs
G. Unbalanced Designs: PROC GLM

A. Introduction

When you have more than two groups, a t-test (or its nonparametric equivalent) is no longer applicable. Instead, we use a technique called analysis of variance. This chapter covers analysis of variance designs with one or more independent variables, as well as more advanced topics such as interpreting significant interactions and unbalanced designs.

B. One-Way Analysis of Variance

The method used today for comparing three or more groups is called analysis of variance (ANOVA). Its advantage is that it tests whether there are any differences among the groups with a single probability associated with the test. The hypothesis tested is that all groups have the same mean.

Before we present an example, note that several assumptions should be met before an analysis of variance is used. Essentially, we must have independence between groups (unless a repeated measures design is used); the sampling distributions of the sample means must be normally distributed; and the groups should come from populations with equal variances (called homogeneity of variance).

Example: 15 subjects in three treatment groups, X, Y, and Z. Each group follows a different reading program, and each subject's reading speed is recorded.

      X      Y      Z
    700    480    500
    850    460    550
    820    500    480
    640    570    600
    920    580    610

The null hypothesis is that mean(X) = mean(Y) = mean(Z). The alternative hypothesis is that the means are not all equal. How do we know whether the observed means differ because of differences among the reading programs (X, Y, Z) or because of random sampling error? By chance, the five subjects we chose for group X might be faster readers than those chosen for groups Y and Z.
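As a quick worked check on the data above, the three sample means and the grand mean (the mean of all 15 scores) can be computed by hand. The bar notation is ours, not the lecture's; the numbers come directly from the table.

```latex
\begin{align*}
\bar{X} &= \frac{700 + 850 + 820 + 640 + 920}{5} = \frac{3930}{5} = 786 \\
\bar{Y} &= \frac{480 + 460 + 500 + 570 + 580}{5} = \frac{2590}{5} = 518 \\
\bar{Z} &= \frac{500 + 550 + 480 + 600 + 610}{5} = \frac{2740}{5} = 548 \\
\bar{G} &= \frac{3930 + 2590 + 2740}{15} = \frac{9260}{15} \approx 617.33
\end{align*}
```

The sample means clearly differ from one another; the question ANOVA answers is whether they differ by more than sampling error alone would be expected to produce.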
We might now ask the question: what causes scores to vary from the grand mean? In this example, there are two possible sources of variation. The first source is the training method (X, Y, or Z). The second source of variation is due to the fact that individuals are different. The total variation is partitioned accordingly:

    SUM OF SQUARES total = SUM OF SQUARES between groups + SUM OF SQUARES error (within groups)

    F ratio = MEAN SQUARE between groups / MEAN SQUARE error
            = (SS between groups / (k-1)) / (SS error / (N-k))

where k is the number of groups and N is the total number of subjects (here k = 3 and N = 15).

SAS code:

    DATA READING;
       INPUT GROUP $ WORDS @@;
    DATALINES;
    X 700 X 850 X 820 X 640 X 920
    Y 480 Y 460 Y 500 Y 570 Y 580
    Z 500 Z 550 Z 480 Z 600 Z 610
    ;
    PROC ANOVA DATA=READING;
       TITLE 'ANALYSIS OF READING DATA';
       CLASS GROUP;
       MODEL WORDS=GROUP;
       MEANS GROUP;
    RUN;

    The ANOVA Procedure

    Dependent Variable: words

                                   Sum of
    Source             DF         Squares     Mean Square    F Value    Pr > F
    Model               2     215613.3333     107806.6667      16.78    0.0003
    Error              12      77080.0000       6423.3333
    Corrected Total    14     292693.3333

With F = 16.78 and p = 0.0003, we reject the null hypothesis that the three group means are equal.

Now that we know the reading methods are different, we want to know what the differences are. Is X better than Y or Z? Are the means of groups Y and Z so close that we cannot consider them different? In general, methods used to find group differences after the null hypothesis has been rejected are called post hoc, or multiple-comparison, tests. These include Duncan's multiple-range test, the Student-Newman-Keuls multiple-range test, the least-significant-difference test, Tukey's studentized range test, Scheffe's multiple-comparison procedure, and others. To request a post hoc test, place the SAS option name for the test you want after a slash (/) on the MEANS statement. The SAS names for the post hoc tests listed above are DUNCAN, SNK, LSD, TUKEY, and SCHEFFE, respectively. For our example we have:

    MEANS GROUP / DUNCAN;

or

    MEANS GROUP / SCHEFFE ALPHA=.1;

At the far left of the resulting output is a column labeled Duncan Grouping. Any groups that are not significantly different from one another will have the same letter in the Grouping column.
    The ANOVA Procedure

    Duncan's Multiple Range Test for words

    NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

    Alpha                        0.05
    Error Degrees of Freedom       12
    Error Mean Square        6423.333

    Number of Means        2        3
    Critical Range     110.4    115.6

    Means with the same letter are not significantly different.

    Duncan Grouping      Mean    N    group
                  A    786.00    5    x
                  B    548.00    5    z
                  B
                  B    518.00    5    y

Here group X differs significantly from both Y and Z, while Y and Z share the letter B and so do not differ significantly from each other.

C. Computing Contrasts

Suppose you want to make some specific comparisons. For example, if method X is a new method and methods Y and Z are more traditional methods, you may decide to compare method X to the mean of methods Y and Z to see if there is a difference between the new and traditional methods. You may also want to compare method Y to method Z to see if there is a difference. These comparisons are called contrasts, planned comparisons, or a priori comparisons.

To specify comparisons using SAS software, you need to use PROC GLM (General Linear Model) instead of PROC ANOVA. PROC GLM is similar to PROC ANOVA and uses many of the same options and statements. However, PROC GLM is a more general procedure and can be used to compute contrasts or to analyze unbalanced designs. The coefficients in each CONTRAST statement are listed in the sorted order of the GROUP levels (X, Y, Z) and sum to zero.

    PROC GLM DATA=READING;
       TITLE 'ANALYSIS OF READING DATA -- PLANNED COMPARISONS';
       CLASS GROUP;
       MODEL WORDS = GROUP;
       CONTRAST 'X VS. Y AND Z' GROUP -2 1 1;
       CONTRAST 'METHOD Y VS Z' GROUP 0 1 -1;
    RUN;

    The GLM Procedure

    Contrast          DF    Contrast SS    Mean Square    F Value    Pr > F
    X VS. Y AND Z      1    213363.3333    213363.3333      33.22    <.0001
    METHOD Y VS Z      1      2250.0000      2250.0000       0.35    0.5649

The new method X differs significantly from the traditional methods (p < .0001), whereas methods Y and Z do not differ significantly from each other (p = 0.5649).

D. Analysis of Variance: Two Independent Variables

Suppose we ran the same experiment for comparing reading methods, but using 15 male and 15 female subjects. In addition to comparing reading-instruction methods, we could compare male versus female reading speeds. Finally, we might want to see if the effects of the reading methods are the same for males and females.
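One common way to write down these three questions is the standard two-way ANOVA model. The notation below is an assumption of ours for illustration; the lecture itself does not state the model explicitly.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,2,3 \ (\text{method}), \quad j = 1,2 \ (\text{gender}), \quad k = 1,\dots,5 \ (\text{subject})

\begin{align*}
H_0^{\text{method}} &: \alpha_1 = \alpha_2 = \alpha_3 = 0 \\
H_0^{\text{gender}} &: \beta_1 = \beta_2 = 0 \\
H_0^{\text{interaction}} &: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\end{align*}
```

Each null hypothesis gets its own F test: the first two are the main effects of method and gender, and the third asks whether the method effect is the same for males and females.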
    DATA TWOWAY;
       INPUT GROUP $ GENDER $ WORDS @@;
    DATALINES;
    X M 700 X M 850 X M 820 X M 640 X M 920
    Y M 480 Y M 460 Y M 500 Y M 570 Y M 580
    Z M 500 Z M 550 Z M 480 Z M 600 Z M 610
    X F 900 X F 880 X F 899 X F 780 X F 899
    Y F 590 Y F 540 Y F 560 Y F 570 Y F 555
    Z F 520 Z F 660 Z F 525 Z F 610 Z F 645
    ;
    PROC ANOVA DATA=TWOWAY;
       TITLE 'ANALYSIS OF READING DATA';
       CLASS GROUP GENDER;
       MODEL WORDS=GROUP | GENDER;
       MEANS GROUP | GENDER / DUNCAN;
    RUN;

The term GROUP | GENDER is shorthand for GROUP GENDER GROUP*GENDER.

    Source          DF       Anova SS    Mean Square    F Value    Pr > F
    group            2    503215.2667    251607.6333      56.62    <.0001
    gender           1     25404.3000     25404.3000       5.72    0.0250
    group*gender     2      2816.6000      1408.3000       0.32    0.7314

In a two-way analysis of variance, when we look at GROUP effects, we are comparing GROUP levels without regard to GENDER. That is, when the groups are compared, we combine the data from both genders. Conversely, when we compare males to females, we combine data from the three treatment groups.

The term GROUP*GENDER is called an interaction term. If group differences were not the same for males and females, we could have a significant interaction. In this example the interaction is not significant (p = 0.7314), so the GROUP and GENDER main effects can be interpreted directly.

E. Interpreting Significant Interactions

Now consider an example that has a significant interaction term. We have two groups of children. One group is considered normal; the other, hyperactive.
    data ritalin;
       do group = 'normal', 'hyper';
          do drug = 'placebo', 'ritalin';
             do subj = 1 to 4;
                input activity @;
                output;
             end;
          end;
       end;
    datalines;
    50 45 55 52 67 60 58 65 70 72 68 75 51 57 48 55
    ;
    proc anova data=ritalin;
       title 'activity study';
       class group drug;
       model activity=group | drug;
       means group | drug;
    run;

    Source        DF       Anova SS    Mean Square    F Value    Pr > F
    group          1    121.0000000    121.0000000       8.00    0.0152
    drug           1     42.2500000     42.2500000       2.79    0.1205
    group*drug     1    930.2500000    930.2500000      61.50    <.0001

Because the group*drug interaction is significant, the main effects cannot be interpreted on their own. One way to see what is happening is to plot the cell means:

    proc means data=ritalin nway noprint;
       class group drug;
       var activity;
       output out=means mean=;
    run;
    proc plot data=means;
       plot activity*drug=group;
    run;

Another is to treat the four group-by-drug combinations as the levels of a single factor and run a one-way ANOVA with a post hoc test:

    data ritalin_new;
       set ritalin;
       * join group and drug with a single blank, e.g. 'normal ritalin';
       cond = catx(' ', group, drug);
    run;
    proc anova data=ritalin_new;
       title 'one-way anova ritalin study';
       class cond;
       model activity = cond;
       means cond / duncan;
    run;

    Duncan Grouping      Mean    N    cond
                  A    71.250    4    hyper placebo
                  B    62.500    4    normal ritalin
                  C    52.750    4    hyper ritalin
                  C
                  C    50.500    4    normal placebo

F. N-Way Factorial Designs

With three independent variables, we have three main effects, three two-way interactions, and one three-way interaction. One usually hopes that the higher-order interactions are not significant, since they complicate the interpretation of the main effects and the lower-order interactions.

    PROC ANOVA DATA=THREEWAY;
       TITLE 'THREE WAY ANALYSIS OF VARIANCE';
       CLASS GROUP GENDER DOSE;
       MODEL ACTIVITY = GROUP | GENDER | DOSE;
       MEANS GROUP | GENDER | DOSE;
    RUN;

G. Unbalanced Designs: PROC GLM

Designs with an unequal number of subjects per cell are called unbalanced designs. For all designs that are unbalanced (except for one-way designs), we cannot use PROC ANOVA; PROC GLM (general linear model) is used instead. The LSMEANS statement produces least-squares (adjusted) means for main effects. The PDIFF option computes probabilities for pairwise differences. Notice that the PROC GLM output contains two sets of values for SUM OF SQUARES, F VALUES, and probabilities (Type I and Type III sums of squares). Also notice the new TITLE statements:
TITLE2, TITLE3, TITLE4.

    data pudding;
       input sweet flavor : $9. rating @@;
    datalines;
    1 vanilla 9 1 vanilla 7 1 vanilla 8 1 vanilla 7
    2 vanilla 8 2 vanilla 7 2 vanilla 8
    3 vanilla 6 3 vanilla 5 3 vanilla 7
    1 chocolate 9 1 chocolate 9 1 chocolate 7 1 chocolate 7 1 chocolate 8
    2 chocolate 8 2 chocolate 7 2 chocolate 6 2 chocolate 8
    3 chocolate 4 3 chocolate 5 3 chocolate 6 3 chocolate 4 3 chocolate 4
    ;
    proc glm data=pudding;
       title 'pudding taste evaluation';
       title3 'two-way ANOVA - unbalanced design';
       title5 '---------------------------------';
       class sweet flavor;
       model rating = sweet | flavor;
       means sweet | flavor;
       lsmeans sweet | flavor / pdiff;
    run;