STAT 516 --- STATISTICAL METHODS II

STAT 516 is primarily about linear models.

Model: A mathematical equation describing (approximating) the relationship between two (or more) variables.

• Any assumptions we make about the variables are also part of the model.

Simple Linear Regression (SLR) Modeling

• Analyzes the relationship between two quantitative variables.
• We have a sample, and for each observation, we have data observed for two variables:

Dependent (Response) Variable: Measures the major outcome of interest in the study (often denoted Y)

Independent (Predictor) Variable: Another variable whose value may explain, predict, or affect the value of the dependent variable (often denoted X)

Example:

• In SLR, we assume the relationship between Y and X can be mathematically approximated by a straight-line equation.
• We assume this is a statistical relationship: not a perfect linear relationship, but an approximately linear one.

Example: Consider the relationship between
X = distance traveled on a trip
Y = amount spent on gas for the trip

We might expect that gas spending changes with distance traveled, maybe nearly linearly.

• If we took a sample of trips and measured X and Y for each, would the data fall exactly along a line?

Picture:

• Our goal is often to predict Y (or to estimate the mean of Y) based on a given value of X.

Examples:

Simple Linear Regression Model: (expressed mathematically)

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

Deterministic Component: $\beta_0 + \beta_1 X$
Random Component: $\varepsilon$

Regression Coefficients:
$\beta_0$ = intercept (the mean of Y when X = 0)
$\beta_1$ = slope (the change in the mean of Y for a one-unit increase in X)
$\varepsilon$ = random error term

We assume $\varepsilon$ has a normal distribution with mean 0 and variance $\sigma^2$.

Since $\varepsilon$ has mean 0, the mean (expected value) of Y, for a given X-value, is:

$$E(Y) = \beta_0 + \beta_1 X$$

• This is called the conditional mean of Y.
• The deterministic part of the SLR model is simply the mean of Y for any value of X.

Example: Suppose $\beta_0 = 2$, $\beta_1 = 1$.

Picture:

• When X = 1, E(Y) = 2 + 1(1) = 3
• When X = 2, E(Y) = 2 + 1(2) = 4
• The actual Y values we observe for these X values are a little different: they vary along with the random error component $\varepsilon$.
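The example with intercept 2 and slope 1 can be mimicked on the computer. The following is only a sketch: the error standard deviation `sigma = 0.5`, the seed, and the function names `conditional_mean` and `simulate_y` are assumptions made for illustration and do not come from the notes.

```python
import random


def conditional_mean(x, beta0=2.0, beta1=1.0):
    """Deterministic component of the SLR model: E(Y) = beta0 + beta1 * x."""
    return beta0 + beta1 * x


def simulate_y(x, beta0=2.0, beta1=1.0, sigma=0.5, rng=None):
    """Draw one Y value: conditional mean plus a normal error with mean 0.

    sigma is an assumed error standard deviation (not given in the notes).
    """
    rng = rng or random.Random()
    error = rng.gauss(0.0, sigma)  # random component
    return conditional_mean(x, beta0, beta1) + error


rng = random.Random(516)  # fixed seed so the sketch is repeatable
for x in (1, 2):
    draws = [round(simulate_y(x, rng=rng), 2) for _ in range(5)]
    print(f"X = {x}: E(Y) = {conditional_mean(x)}, simulated Y values: {draws}")
```

Each simulated Y scatters around its conditional mean (3 when X = 1, 4 when X = 2), which is the sense in which the relationship is statistical rather than perfectly linear.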
Assumptions for the SLR model:

• The linear model is correctly specified
• The error terms are independent across observations
• The error terms are normally distributed
• The error terms have the same variance, $\sigma^2$, across observations

Notes:

• Even if Y is linearly related to X, we rarely conclude that X causes Y.
-- This would require eliminating all unobserved factors as possible causes for Y.
• We should not use the regression line for extrapolation: that is, predicting Y for any X values outside the range of our observed X values.
-- We have no evidence that a linear relationship is appropriate outside the observed range.

Picture:

Example: Data gathered on 58 houses (Table 7.2, p. 328)
X = size of house (in thousands of square feet)
Y = selling price of house (in thousands of dollars)

• Is a linear relationship between X and Y appropriate? On computer, examine a scatter plot of the sample data.
• How to choose the "best" slope and intercept for these data?

Estimating Parameters

• $\beta_0$ and $\beta_1$ are unknown parameters.
• We use the sample data to find estimates $\hat{\beta}_0$ and $\hat{\beta}_1$.
• Typically done by choosing $\hat{\beta}_0$ and $\hat{\beta}_1$ to produce the least-squares regression line:

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$$

Picture:

For each data point, the predicted Y-value is denoted $\hat{Y}$.

Picture:

• Residual (or error) = $Y - \hat{Y}$ for each data point.
• We want our line to make these residuals as small as possible.

Least-squares line: The line chosen so that the sum of squared residuals (SSE) is minimized.

• Choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize:

$$SSE = \sum (Y - \hat{Y})^2 = \sum (Y - \hat{\beta}_0 - \hat{\beta}_1 X)^2$$

Example: (House Price data): The following can be calculated from the sample:

So the estimates are:

Our estimated regression line is:

• Typically, we calculate the least-squares estimates on the computer.

Interpretations of estimated slope and intercept:
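As a rough illustration of calculating least-squares estimates on the computer, the sketch below uses the standard closed-form solution (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X). The five data points are made up for illustration and are NOT the Table 7.2 house data; the function names `least_squares` and `sse` are likewise hypothetical.

```python
def least_squares(xs, ys):
    """Return (b0_hat, b1_hat), the intercept and slope that minimize SSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1_hat = sxy / sxx               # estimated slope
    b0_hat = y_bar - b1_hat * x_bar  # estimated intercept
    return b0_hat, b1_hat


def sse(xs, ys, b0, b1):
    """Sum of squared residuals, sum of (Y - Y_hat)^2, for the line b0 + b1 * X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: size in thousands of sq. ft., price in thousands of dollars
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [80.0, 110.0, 135.0, 170.0, 190.0]

b0_hat, b1_hat = least_squares(sizes, prices)
print(f"Estimated line: Y_hat = {b0_hat:.1f} + {b1_hat:.1f} X")
print(f"SSE at the least-squares line: {sse(sizes, prices, b0_hat, b1_hat):.1f}")
```

For these made-up numbers the fitted line works out to Y-hat = 25 + 56 X. The slope would be read as an estimated increase of 56 thousand dollars in mean selling price per additional thousand square feet. The intercept would be the estimated mean price at X = 0, which lies outside the observed range and so should not be taken literally.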