Application - Loudoun County Public Schools / Overview



RESEARCH STATISTICS TUTORIAL

Introducing Experimental Design
Many experiments compare two sets of measurements of the same variable. There are many ways of setting up an experiment to produce such data, but when we talk about experimental design, we are referring to a specific aspect of the experiment: whether or not the two sets of measurements can be sensibly paired off with each other. If you can pair off each measurement from one sample with a natural partner from the other sample, then you have a paired design. This may be because you are measuring something twice under different conditions (sometimes called a repeated measures design) or because the two things you are measuring are naturally related.
EX: Comparing the running speed of horses for a week of eating one type of feed with the same horses for a week on a different type of feed would be a paired design, as you can pair off measurements from the same horse.
If there is no sensible way of pairing off the values from the two samples, then you have an independent design.
EX: Comparing the running speeds of horses and zebras would be an independent design, as there is no sensible way to pair off each horse with a zebra.

What is a Hypothesis?
A hypothesis is designed to be tested as being either supported or not supported (refuted) by experimental data. This means that every hypothesis has an opposite: the statement that the hypothesis should be rejected. There is a vocabulary that comes with all of this. Here it is:
The experimental (or research) hypothesis is the prediction that your theory makes, or the effect you suspect you will see. This is referred to as H1.
The null hypothesis is the statement that the effect described in the experimental hypothesis does not exist. This is referred to as H0.
One thing to remember when wording your hypotheses is that it is important to decide whether or not you expect to see a difference in a particular direction.
If you think that coffee improves memory, then you expect memory scores to be better for coffee drinkers, so there is an expected direction.
Here is a list of sentences that each represent either an experimental hypothesis or a null hypothesis. Work through them, saying which category each falls into and whether the hypothesis has a direction or not.
CHOOSE EITHER: EXPERIMENTAL HYPOTHESIS OR NULL
Alcohol consumption does not affect reaction time _________ Does the hypothesis have a direction? _______
Sports drinks improve recovery time after exercise _________ Does the hypothesis have a direction? _______
There is a difference between the ability of girls and boys to learn statistics. ___________ Does the above hypothesis have a direction? ________

Application
Here are the two hypotheses from your study. Which is the null and which is the experimental hypothesis?
Post-ninth grade student heights will be significantly higher than pre-ninth grade student heights. ________
There is no difference in Pre/Post ninth grade heights ______________________
Is your experimental hypothesis looking for an effect in a certain direction? __________

Introducing Central Tendency
The measure of central tendency of your data is the single value that best represents all of the data. It is the value that you would pick if you had to guess which of your data points somebody had chosen at random. This value often (but certainly not always) lies in the 'middle' of the data, in the sense that it has as many values above it as it has below. There are three main measures of central tendency:
The mean is the result of adding all the values in your data together and dividing the total by the number of data points you have.
The mean is the measure that people often refer to as the average. For example, the mean height is 180 centimeters;
The median is the result of arranging the values in order and finding the middle value in the resulting list. For example, the median age is 45;
The mode is the most commonly occurring value in your data. This corresponds to the highest bar in the frequency histogram. For example, the most common number of children in a family is 2.
The most appropriate measure for a given data set depends on the data itself.
Continuous values such as height are suitable for using the mean, for example 'The mean height is 32.5 cm'. The mode is not a good measure to use with continuous values measured to high accuracy, as such data may not contain any repeated values. For example, if you measured the height of ten people to the nearest millimeter, you might get ten different values;
Discrete values such as number of children are better suited to the mode or median, thus avoiding 'The average is 2.3 children';
Categorical data such as color of cars sold should use the mode, for example, 'Red is the most common car color'.

Application
Your experiment generated data describing two variables.
The independent variable separates your experimental samples into pre-test and post-test.
The dependent variable takes discrete numeric values.

Standard Deviation
The average of your data summarizes it all in a single value. That certainly throws away a lot of information. If you were to know one more thing about the data, after the average, what would be the most useful thing? The range (largest and smallest) might be useful, but there is a different measure that is even better: the standard deviation.
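As a rough illustration of the three measures of central tendency, together with the standard deviation that the tutorial turns to next, here is a minimal Python sketch using only the standard library. The function name describe and the five heights are made up for this example and are not part of the tutorial's data.

```python
import statistics


def describe(values):
    """Return the mean, median, mode and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),      # sum of values / number of values
        "median": statistics.median(values),  # middle value once sorted
        "mode": statistics.mode(values),      # most commonly occurring value
        "stdev": statistics.stdev(values),    # sample standard deviation (divides by n - 1)
    }


# Hypothetical heights in cm, invented purely for illustration.
heights = [160, 162, 162, 165, 171]
summary = describe(heights)
print(summary)
```

With heights measured to the nearest millimeter there might be no repeated value at all, which is why the mode is a poor choice for that kind of data.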
The standard deviation measures how much the data varies:
A large number means the data varies a lot
A small number means the data varies a little
A standard deviation of zero indicates that all the values in the data are identical
The standard deviation tells you something more about the average too, as it measures variation in terms of how far from the mean all the values in the sample fall. Values are further from the mean on average when standard deviation is large than they are when it is small.

Application
A. The standard deviation for the pre-test is 5.88
B. The standard deviation for the post-test is 7.88
Which sample has the most variation between its values?

Samples and Populations
Explanation
When You Cannot Measure Everyone
We have already seen that data from experiments is generated by taking measurements from a number of different experimental units. For example, we might measure the height of 20 people or the acidity level in 30 soil samples. It is very rare indeed that we will have measured every possible unit (every person in the world or every bit of soil). To make a distinction between the few we have measured and all that we might measure, we use the following words:
The population refers to every unit in existence;
A sample refers to those units that we have measured.
Here are the key points to remember about sampling:
What makes up a population depends on the definition you choose for your study. It might be as broad as all people or as narrow as the male members of class 3B;
A very common method of collecting samples, known as simple random sampling, attempts to ensure that each member of the population has an equal chance of being picked as part of a sample;
Descriptive statistics are used to describe certain aspects of a sample (for example, 'The sample mean is 5.4');
Inferential statistics are used to make statements about the whole population based only on what we know about a given sample.
The difference between a statistic inferred from the sample and the true population statistic is known as sampling error;
The larger a sample gets, the smaller sampling error is likely to get.

Application
Now we will look at the data from your study. You have collected 30 paired measurements, so you have two samples of 30 each. One is from students in the pre-ninth grade sample and the other is from students in the post-ninth grade sample.
Does the data you have collected represent a sample of a larger population, or have you collected a measurement from every possible student there is? ____________________________
What kind of statistics would you use to describe the sample we have collected? Descriptive or Inferential
What kind of statistics would you use to infer things about the population based on that sample? Descriptive or Inferential
If you doubled your sample size, how would that affect sampling error? Increase or Reduce the sampling error?

Choosing a T-Test
Explanation
Paired or Independent t-test?
There are two types of t-test, the paired t-test and the independent t-test. This page tells you how to pick the right one for your data. We have already seen that when comparing two samples, it is important to know whether or not the samples are paired. The section on experimental design covers this in more detail, but here is a quick recap:
With paired (dependent) samples, it is possible to take each measurement in one sample and pair it sensibly with one measurement in the other sample.
This might be because measurements were taken from the same group twice (repeated measures) or because there is some other way to join measurements, for example, comparing the IQ of older and younger brothers;
With independent samples, there is no sensible way to pair off the measurements.
One of the reasons that you need to identify the type of experimental design that you are dealing with is that you need to use the right t-test for the right design:
The paired t-test is used when you have a paired design
The independent t-test is used when you have an independent design
The other thing you need to decide at this point is easy to decide, but can be slightly harder to understand. You need to decide which of the following types of effect you expect to find:
The first mean to be larger than the second
The first mean to be smaller than the second
The first mean to be different from the second in either direction
You will see this choice referred to in literature and textbooks as the number of tails of the test. The tail is the extreme end of the distribution of the data and your experiment can be one of two types:
One tailed tests expect the effect to be in a certain direction, so the first two points above are examples of 1-tailed experiments
Two tailed tests are used when you have no idea which sample will be larger than the other, but you are looking for any difference. The third point above is such a case.
If you have stated your experimental hypothesis with care, it will tell you which type of effect you are looking for. For example, the hypothesis that "Coffee improves memory" is one tailed because you expect an improvement. The hypothesis, "Men weigh a different amount from women" suggests a two tailed test as no direction is implied. So remember, don't be vague with your hypothesis if you are looking for a specific effect!

Exploration
Here are a few questions to test yourself to make sure you understand the choice of t-test type and tail number.
An experiment measures people's lung capacity before and then after an exercise program to see if their fitness has improved. Which t-test would you use? ______________ How many tails does the test have? ________
A different experiment measures the lung capacity of one group who took one exercise program and another group who took a different exercise program to see if there was a difference. Which t-test would you use? ____________ How many tails does the test have? __________

Application
Your experiment compares pre-ninth grade heights with post-ninth grade heights and is measured from the same students under both conditions, Pre- and Post-. Your experimental hypothesis is "Post-ninth grade heights will be significantly higher than Pre-ninth grade heights.", so you know in which direction you expect the difference to be.
What is your experimental design? Paired or Independent
Which t-test should you choose? Paired t-test or Independent t-test
How many tails does your experiment have? 1-tailed or 2-tailed

Paired t-test
Explanation
What a Paired T-Test Does
A paired t-test compares two samples in cases where each value in one sample has a natural partner in the other. The concept of paired samples is covered in more detail in the section on choosing a t-test.
What a Paired T-Test Measures
A paired t-test looks at the difference between paired values in two samples, takes into account the variation of values within each sample, and produces a single number known as a t-value. You can find out how likely it is that two samples from the same population (i.e. where there should be no difference) would produce a t-value as big as, or bigger than, yours. This value is called a p-value. So, a t-test measures how different two samples are (the t-value) and tells you how likely it is that such a difference would appear in two samples from the same population (the p-value).
How to use a Paired T-Test
You will use a software package (EXCEL) to perform a t-test.
Software can perform the calculations to produce t-values and p-values, but it is your responsibility to do the following:
Pick the right kind of t-test, in this case, a paired t-test, and the right direction of test (one or two tailed). See the sections on choosing a t-test for more on this;
Ensure the distribution of your data is suitable for a t-test. See the sections on the normal distribution for more on this;
Know how to interpret the results of doing a t-test. See the sections on t-values and p-values for more on this.
One final practical point: each value in one sample is paired with a single value in the other. When you enter your data into a computer for analysis by a software package, make sure the paired values are lined up. This usually means having data in two columns where each row represents a single pair. The fact that values are paired is very important!

T-TEST = A WAY TO COMPARE THE MEANS OF SETS OF DATA USING STATISTICS.
P-VALUE = THE MEASURE OF STATISTICAL SIGNIFICANCE BETWEEN THE MEANS. IF THE P-VALUE IS BELOW 0.05 (p<0.05), the difference between the means is statistically significant. The standard benchmark is 5% (0.05) and this is called the significance level.

Information from:
T-tests are a statistical means of understanding the differences between measured means. Let's say you are trying to see if on average your class is taller than another class. The best way to find out would be to measure the height of every student in each class, find the two means and compare them.
But in the real world, when trying to compare two means, it is often impossible or unreasonable to measure every member of each of the sets of interest. For example, consider national polls.
Every time one wished to judge the public pulse on an issue, one would have to ask every citizen, on the scale of an election. To get around this, statistics offers powerful tools that help us understand whether the difference in the means (mean opinion in our example) is large enough that it is not just a fluke of measurement. This is called statistical significance. If the difference is statistically significant, we can say with high confidence that the means differ and that the result was not just random luck. Note that we say confidence because we can only say with very high confidence that the means are different. We cannot guarantee it because we did not ask every citizen of the United States.

Concept:
When comparing two means, conceptually a bell-curve distribution is imagined around one of the means. The shape of this curve is affected by the number of samples you take. The important concept is that the center of the curve is the first mean. Consider that as the actual mean. When you take measurements, say by flipping a coin, you don't always get 50% heads and 50% tails if you take 10 measurements. Even though the actual mean is 0.5 (50% of 1) for heads, there is a probability distribution around that 0.5 mean for the mean you would observe over multiple measurements. This distribution says that results at the far ends are very, very unlikely. For example, if you take 100 measurements of the coin flip, it is extremely unlikely (low probability) that you will get 0% heads or 100% heads. If for some reason you do, it means that the difference is significant enough for you to suspect that something else was affecting the coin flip, like a biased coin.

Making Sense of the Difference:
While the difference between the means is being stressed for explaining the concept, the actual nature of the difference can also be seen through this test.
Whether the mean being compared is significantly greater or lower than the actual (assumed) mean matters more for the projects on the website. To figure this out, first find the means of the individual lists and compare them to see which one is greater than the other. Now you need to perform the t-test to see if this specific difference is significant. Follow the protocol listed below for finding the p-value. If the p-value you get (when you follow this protocol) is less than or equal to 0.05, it means that whatever specific difference you observed between the means (in terms of which mean is greater, not the actual size of the difference) is statistically significant.
Important Note: Just because your results don't show statistical significance does not mean there is no real difference. Your methods might not have been able to detect it. The interpretation of such a case is not a simple yes or no. If the p-value is high, it gives confidence in your conclusion that there is no statistically significant difference between the two means. On the other hand, if there is statistical significance (p < 0.05), it means that, for your methods, the difference between the means is statistically significant.

Variance:
Going back to the example of the heights of students in your class, it would be easier for you to judge the difference by using a t-test. You would randomly select 'n' students from each of the classes and measure their heights. Now you have two lists of heights, one from each class, for n students each. The more students you sample, the better your results, in the sense that you will be more likely to show significance if there is such a difference.
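The claim that larger samples give better results can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the population of 2000 heights is randomly generated (mean 165 cm, standard deviation 8 cm, both invented), and the helper name mean_sampling_error is hypothetical. It draws many random samples of a given size and reports how far, on average, the sample mean lands from the true population mean.

```python
import random
import statistics


def mean_sampling_error(population, sample_size, trials, rng):
    """Average absolute gap between the sample mean and the population mean."""
    true_mean = statistics.mean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # simple random sample
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)


rng = random.Random(42)  # fixed seed so the run is repeatable
# Hypothetical population: 2000 invented heights in cm.
population = [rng.gauss(165.0, 8.0) for _ in range(2000)]

small = mean_sampling_error(population, 10, 300, rng)
large = mean_sampling_error(population, 100, 300, rng)
print(f"average error with n=10:  {small:.2f} cm")
print(f"average error with n=100: {large:.2f} cm")
```

The larger samples should land noticeably closer to the population mean, which is the sense in which sampling error shrinks as the sample grows.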
WE HAVE TWO LISTS OF HEIGHTS: PRE-NINTH GRADE (SEPT) AND POST-NINTH GRADE (JUNE)
Go to Microsoft Excel.
Make sure you match the heights in September (pre-ninth grade) with the same person's height in June (post-ninth grade). SEE THE DATA TABLE ON Mrs. Wright's WEBSITE.
In Microsoft Excel, type (or copy and paste) your data into the appropriate cells like so:
B1 - HEIGHT (cm)
B2 - Sept Heights
C2 - June Heights
Save your file.
To save your file for the first time, click on "File" in the top left corner of the screen and select the option "Save As". SAVE IT ON YOUR FLASH DRIVE!
It is a good idea to save your file often during the process so that you won't lose your work if your computer crashes.
To save your file after you have already saved it once, just click the floppy disk icon in the top left corner in Excel.
To find the mean of a list:
Left click once on the cell where you want to have the mean displayed (below the last entry).
Double click in cell B53.
Type in that cell =AVERAGE(B3:B52)
Hit ENTER and the mean for pre-ninth grade should be displayed.
Repeat the procedure to find the mean for JUNE HEIGHTS, using C3 as your first cell and C52 as your last cell.
Look and compare the means.
Notice that the two means are close to each other, with the mean height of POST-NINTH GRADE GREATER THAN PRE-NINTH GRADE, like your prediction.
To see if this difference is significant a t-test is needed.
To do the t-test:
Type in t-test in cell B56, and type in p-value in B57.
In the next cell, C57, double click.
Type in =TTEST(B3:B52,C3:C52,1,1)
In this formula the third value (1) asks for a one-tailed test and the fourth value (1) asks for a paired test.
Hit ENTER and your p-value should be displayed. Note the p-value (of course yours will be different).
If it is equal to or less than 0.05, the difference in the means (that the average height of students in your class is greater than that of the other class) is statistically significant.
In the example of class versus class, it is about 0.1. This means that the difference is not statistically significant. But your claim does have some support because the p-value is still relatively low. The support weakens as the p-value increases. This does not mean that on average your class is not taller than the other class. It just means that for the given lists that you used, the difference was not big enough to be statistically significant. This could change if you used more students in the lists.
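For readers who want to see what a paired, one-tailed t-test computes, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of how Excel works internally: the function names (t_pdf, upper_tail, paired_t_test) are hypothetical, the five pairs of heights are invented, and the p-value is obtained by numerically integrating the t distribution with Simpson's rule rather than from a statistics package.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail(t, df, steps=2000):
    """P(T >= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


def paired_t_test(pre, post):
    """One-tailed paired t-test of the hypothesis that post is larger than pre."""
    if len(pre) != len(post):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]  # one difference per pair
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, upper_tail(t, n - 1)


# Hypothetical September and June heights in cm, lined up pair by pair.
pre = [150.0, 155.0, 160.0, 165.0, 170.0]
post = [152.0, 156.0, 163.0, 166.0, 173.0]
t_value, p_value = paired_t_test(pre, post)
print(f"t = {t_value:.3f}, one-tailed p = {p_value:.4f}")
print("significant at 0.05" if p_value <= 0.05 else "not significant at 0.05")
```

The test works on the list of differences within each pair, which is why the two columns must be lined up row by row before the calculation.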
Your experimental hypothesis is "Post-ninth grade heights will be significantly higher than Pre-ninth grade heights."
Your NULL hypothesis is "There will be no significant difference between Pre and Post-ninth grade heights"

GRAPHING THE RESULTS
Highlight columns starting at B2 and C2 and drag to the end of the cells B52 AND C52. Do not include the mean.
Click on INSERT, CHOOSE BAR, use the 2D first diagram.
The graph should be displayed.
Click on the PLUS SIGN TO THE RIGHT OF THE GRAPH. A series of choices will show up. Click ON the following: AXES, AXES TITLES, CHART TITLE, GRIDLINES, and LEGEND.
Click on the words CHART TITLE on your graph and type in your graph title (use an appropriate title and then below the title put YOUR name).
AXES TITLES: Make sure you label them correctly (click ON the graph as you did for the title).
SAVE AND OPEN A WORD DOCUMENT. COPY THE GRAPH ON TO YOUR WORD DOCUMENT AND WRITE A CONCLUSION ANALYZING THE DATA. This can be typed BELOW THE GRAPH on your Word document. Print out the Word document with the graph.

T-TEST = A WAY TO COMPARE THE MEANS OF SETS OF DATA USING STATISTICS.
P-VALUE = STATISTICAL SIGNIFICANCE BETWEEN THE MEANS. IF THERE IS A STATISTICAL SIGNIFICANCE (p<0.05), the means are statistically significant.
The standard benchmark is 5% (0.05) and this is called the significance level.

REMEMBER: One final practical point: each value in one sample is paired with a single value in the other. When you enter your data into a computer for analysis by a software package, make sure the paired values are lined up. This usually means having data in two columns where each row represents a single pair. The fact that values are paired is very important!

PRODUCTS TO TURN IN - Once you have completed your graph and copied it on to a Word document you must type a concluding paragraph. Follow this format:
Topic sentence should restate the hypotheses.
Discuss the results (USE DATA) and analyze the data to draw a conclusion. When you draw a conclusion you should USE DATA to back it up. *Can you tell that a discussion of the DATA is needed?*
Conclusion sentence that restates the results and analysis.
Suggested terms that could be found in your conclusion: Means, t-test, p-value, significance, tails, null, experimental
For advanced students: you could try making a graph of just the pre and post means. ;-)
2017