Minitab Module 4: Nonlinear Regression (Part 2)



Logarithmic and Quadratic Modeling

Nonlinear functions arise in a variety of settings, including many types of growth (populations, cells, immune system response, compounding interest) and decay (radioactive, population), and are therefore very useful. In this section we demonstrate how to “linearize” these curvilinear functions so that the techniques from Module 3 are easily adapted to these more complex models.

In Part 1, you were asked to create scatterplots for several pairs of nonlinear data sets. Go to my website and download the bear data. Make a scatterplot of bear age (explanatory) versus bear length (response).

1. Which two of the following curves represent a plausible shape for this data set (bear age vs. bear length)? Circle two. (See worksheet.)
exponential growth; exponential decay; logarithmic growth; or one of the quadratic functions (concave up or concave down).

The scatterplot for the Bear data shows a __________________________________ shape, and from what we learned in Part 1, we can deal with this in the same manner as exponential decay (by taking the log10 of each of the x values (Bear Age)). We will examine how well the logarithmic model fits.

Logarithm Model

Minitab produces the following regression curves with associated scatterplots and residual plots.
Recall that you can display the original data with the logarithmic regression curve, or the linearized data with a regression line, by performing the following steps.

To create the logarithmic regression model:
- Choose Stat > Regression > Fitted Line Plot, then choose “Bear Length” as the response (Y) and “Bear Age” as the predictor (X).
- For the Type of Regression Model, choose Linear.
- Choose Options.
- Under Transformations, check Log 10 of X.

Then, to create the Residual Plot:
- Click Graphs.
- Residuals for Plots should be automatically selected as Regular.
- In the Residuals versus the variables box, choose the same variable you chose as your predictor (X).
- Click OK twice.

If, instead of the above graph with the fitted curve, you want to display the linearized data set using the log transformation:
- Choose Options.
- Under Transformations, check Log 10 of X and also check Display logscale for X-variable.
- Click OK twice.

The two scatterplots represent the same data, with the second one showing the data linearized by taking the log10 of each x. They produce the same regression equation:

Predicted Bear length (inches) = 19.13 + 26.19 log10(Bear age (months))

Note that the r² of 75.4% is reasonably high and the Se of 5.4 inches is low. The Residual Plot shows no major concerns that would cause us to reject this logarithmic model.

2. Interpret the meaning of r² in context: ____________________.

3. Interpret the meaning of Se in context: ______________________.

Since data that appears logarithmic is occasionally just the left side of a quadratic opening downward, it is a good idea to also assess the quadratic regression model for this data to see whether it is a better fit.
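For readers who want to check the logarithmic fit outside Minitab, the following is a minimal Python sketch of the same idea: take log10 of each x value, fit an ordinary least-squares line, and compute r² and Se from the residuals. The function names, the sample ages, and the offsets are hypothetical illustrations (they are NOT the actual bear data); only the coefficients 19.13 and 26.19 come from the regression equation above.

```python
import math

def fit_log10_model(x_values, y_values):
    """Least-squares fit of y = a + b * log10(x). Returns (a, b, r2, se)."""
    log_x = [math.log10(x) for x in x_values]    # linearize the x values
    n = len(y_values)
    mean_x = sum(log_x) / n
    mean_y = sum(y_values) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (y - mean_y) for lx, y in zip(log_x, y_values))
    b = sxy / sxx                     # slope on the log10 scale
    a = mean_y - b * mean_x           # intercept
    sse = sum((y - (a + b * lx)) ** 2 for lx, y in zip(log_x, y_values))
    sst = sum((y - mean_y) ** 2 for y in y_values)
    r2 = 1.0 - sse / sst              # proportion of variation explained
    se = math.sqrt(sse / (n - 2))     # typical size of a residual
    return a, b, r2, se

def predict_bear_length(age_months, a=19.13, b=26.19):
    """Predicted length (inches) using the handout's fitted equation."""
    return a + b * math.log10(age_months)

# Hypothetical ages (months); lengths are the handout's curve plus made-up offsets.
ages = [8, 12, 20, 33, 45, 58, 70, 83, 100, 120, 150, 177]
offsets = [2.0, -3.0, 4.0, -1.5, 0.5, -4.0, 3.0, 1.0, -2.0, 2.5, -3.5, 1.0]
lengths = [predict_bear_length(x) + d for x, d in zip(ages, offsets)]

a, b, r2, se = fit_log10_model(ages, lengths)
print(f"length = {a:.2f} + {b:.2f} * log10(age); r2 = {r2:.3f}, Se = {se:.2f} in")
print(f"predicted length at 100 months: {predict_bear_length(100):.2f} in")
```

Because log10(100) = 2, the last line evaluates 19.13 + 26.19 × 2 = 71.51 inches. The fitted values for the made-up data will differ from Minitab's output for the real bear data.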
Quadratic Model

Create the Fitted Line Plot and Residual Plot

WARNING!!!: Be sure you deselect Log 10 of X and also verify that you deselect Display logscale for X-variable before generating other models.

- Choose Stat > Regression > Fitted Line Plot, then choose “Bear length” as the response (Y) and “Bear age” as the predictor (X).
- For the Type of Regression Model, choose Quadratic.

Then, to create the Residual Plot:
- Click Graphs.
- Residuals for Plots should be automatically selected as Regular.
- In the Residuals versus the variables box, choose the same variable you chose as your predictor (X).
- Click OK twice.

The scatterplot and Residual Plot follow:

4. Compare r² and Se for the logarithmic and quadratic graphs. Which is the better model of the two? What did you base it on?

The results of the quadratic curve are close to those of the logarithmic curve, with a lower r² of 68.8% and a slightly higher Se of 6.1 inches. The Residual Plot for the quadratic model indicates the residuals have slightly more of a curvilinear pattern than those of the logarithmic model. In the final analysis, we would select the logarithmic model because it has the better values of r² and Se. We should also be concerned that the quadratic model implies a reduction in length after the age of approximately 115 months. (Therefore, the error would be high if we predict beyond 150 months.)

When writing an analysis that assesses the best fitting model, it helps to create a chart that organizes the information. Since your exam and project will require you to include a linear model in the comparative analysis, the following are the scatterplot, regression line, and Residual Plot for the linear model.

As a final refinement to the model selection process described above, if the linear model is reasonably close to the nonlinear model, then we choose the linear model.
This general guideline, known as Occam’s Razor, states that "simpler explanations are, other things being equal, generally better than more complex ones.” This applies when deciding which statistical model to select. (For a nonlinear model to be chosen over the linear model, its r² should be at least 4% higher.)

Let’s summarize the results in the following table:

Organizational Chart for Comparing Linear and Nonlinear Models

In the Residual Plot column, look at the residuals versus the x-values and record what you see (oval, band, fan shape, curvilinear pattern, influential outliers); you can also include the number of crosses on the horizontal axis. (Addresses Criteria 2, 3)

Model        | Residual Plot                                 | r²    | Se++
Linear**     | Curvilinear; u-shape, opening downward        | 51.7% | 7.5 in
Quadratic    | Curvilinear; u-shape, opening downward        | 68.8% | 6.1 in
Logarithmic  | Curvilinear; slight u-shape, opening downward | 75.4% | 5.4 in

** According to Occam’s Razor, the linear model is considered the preferred model unless one of the nonlinear models is significantly better (over a 4% increase in the value of r²).

++ If one of your best fit options is the exponential growth model (when using transformations), Se is no longer meaningful and cannot be used to compare models. You can compare values of r².

Reminder: When assessing how well the linear or nonlinear regression model fits the data, we examine the following criteria:
(1) The linear (nonlinear) regression model must have two quantitative variables.
(2) The scatterplot does not contain any overly influential outliers.
(3) The form of the scatterplot is linear (nonlinear).

When writing your analysis about which model provides the best, reasonable fit for the given data, please include the following:
- Write an introduction.
- Describe the data set, clearly state the explanatory and response variables, and explicitly state whether they are quantitative variables.
- After examining the scatterplot, describe the relationship between the two variables by addressing the four key features of the scatterplot (form, direction, strength, outliers).
  Identify any overly influential outliers in the form of an ordered pair.
- State which regression models are appropriate to analyze.
- For your conclusion, clearly state which model provides the best, reasonable fit for the given data and the reasons that support your decision. For the model you choose as the best one:
  -- state the value of r² and write your interpretation of r² in the context of the data.
  -- state the value of Se and write your interpretation of Se in the context of the data (if applicable).
- Be sure you include relevant observations about the Residual Plot, related criteria, and r².

5. With your partner: On a separate paper, write an analysis (short essay) of the Bear Data using the guidelines above. (Finish for homework.)
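The r² part of the Occam’s Razor guideline from this module can be written as a short decision rule. The sketch below is one possible reading, not an official procedure: the function name `choose_model` is hypothetical, and it assumes that “4% higher” means four percentage points of r². The r² values are the ones reported in this module for the bear data.

```python
def choose_model(r2_by_model, linear_name="Linear", min_gain=4.0):
    """Pick a model from r2 values given in percent.

    The linear model is preferred unless the best nonlinear model's r2
    exceeds the linear r2 by at least min_gain percentage points
    (an assumed reading of the handout's 4% guideline).
    """
    linear_r2 = r2_by_model[linear_name]
    nonlinear = {name: r2 for name, r2 in r2_by_model.items() if name != linear_name}
    best_name = max(nonlinear, key=nonlinear.get)   # highest nonlinear r2
    if nonlinear[best_name] - linear_r2 >= min_gain:
        return best_name
    return linear_name

# r2 values (percent) reported in this module for the bear data
bear_r2 = {"Linear": 51.7, "Quadratic": 68.8, "Logarithmic": 75.4}
print(choose_model(bear_r2))  # Logarithmic
```

This rule only covers r². Your written analysis still needs the Residual Plot observations, the related criteria, and Se where applicable.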