Multiple Linear Regression Folio: Plots

The following plots may be available on the Analysis Plot tab of the multiple linear regression folio. Note that the contour plot and residual vs. factor plot are available only when there are at least two factors included in the analysis. For general information on working with plots, see Plot Utilities.

Effect Plots

Effect plots allow you to visually evaluate the effects of factors and factorial interactions on the selected response.

Residual Plots

Residuals are the differences between the observed response values and the response values predicted by the model at each combination of factor values. Residual plots help you determine the validity of the model for the currently selected response. When applicable, a residual plot allows you to select the type of residual to use:

  • Regular Residual is the difference between the observed Y and the predicted Y.

  • Standardized Residual is the regular residual divided by the constant standard deviation.

  • Studentized Residual is the regular residual divided by an estimate of its standard deviation.

  • External Studentized Residual is the regular residual divided by an estimate of its standard deviation, where the observation in question is omitted from the estimation.
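The four residual types can be sketched numerically as follows. This is a minimal illustration, not the software's own routine: it assumes the usual textbook definitions, in which the "constant standard deviation" is the square root of the mean square error (MSE) and the per-observation estimates use the leverage values from the hat matrix. The function name `residual_types` and its arguments are hypothetical.

```python
import numpy as np


def residual_types(X, y):
    """Return regular, standardized, studentized and externally
    studentized residuals for a least squares fit.

    X is the (n, p) model matrix, including the intercept column.
    y is the response vector of length n.
    Assumption: textbook definitions based on MSE and leverage.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Regular residual: observed Y minus predicted Y.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta

    # Leverage h_ii: diagonal of the hat matrix H = X X^+.
    h = np.einsum("ij,ji->i", X, np.linalg.pinv(X))

    # Standardized: divide by one constant standard deviation, sqrt(MSE).
    mse = e @ e / (n - p)
    standardized = e / np.sqrt(mse)

    # Studentized: divide by the estimated standard deviation of each residual.
    studentized = e / np.sqrt(mse * (1.0 - h))

    # External studentized: same, but MSE is re-estimated with run i omitted.
    mse_omit = ((n - p) * mse - e**2 / (1.0 - h)) / (n - p - 1)
    external = e / np.sqrt(mse_omit * (1.0 - h))

    return e, standardized, studentized, external
```

The external studentized form needs at least p + 2 observations, because omitting a run leaves n - p - 1 degrees of freedom for the error estimate.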

The plots are described next.

  • The Residual Histogram* shows whether the residuals are approximately normally distributed. It divides the residuals into equally spaced groups and plots the frequency of each group. The Residual Histogram Settings area allows you to:

    • Select Custom Bins to specify the number of groups, or bins, into which the residuals will be divided. Otherwise, the software will automatically select a default number of bins based on the number of observations.

    • Select Superimpose pdf to display the probability density function line on top of the bins.

  • The Residual Autocorrelation* plot shows a measure of the correlation between the residual values for the series of runs (sorted by run order) and one or more lagged versions of the series of runs. The default number of lags is the number of observations, n, divided by 4. If you select Custom Lags in the Auto-Correlation Options area, you can specify up to n - 1 lags. The correlation is calculated as follows:

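$$r_k = \frac{\sum_{i=1}^{n-k} (e_i - \bar{e})(e_{i+k} - \bar{e})}{\sum_{i=1}^{n} (e_i - \bar{e})^2}$$

where:

    • k is the lag.

    • e_i is the residual for run i in the series of runs, sorted by run order.

    • \bar{e} is the mean value of the original series of runs.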

For example, lag 1 shows the autocorrelation of the residuals when run 1 is compared with run 2, run 2 is compared with run 3, and so on. Lag 3 shows the autocorrelation of the residuals when run 1 is compared with run 4, run 2 is compared with run 5, and so on. Any lag that is displayed in red is considered significant; in other words, there is a correlation within the data set at that lag. This could be caused by a factor that is not included in the model or design, and may warrant further investigation.
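The lag comparison described above can be sketched in a few lines. This is an illustrative example only: it assumes the standard sample autocorrelation estimate (lagged products of deviations from the series mean, divided by the total sum of squared deviations), and it assumes that the default lag count n / 4 is rounded down. The function name `residual_autocorrelation` is hypothetical.

```python
def residual_autocorrelation(residuals, max_lag=None):
    """Sample autocorrelation of run-ordered residuals for lags 1..max_lag.

    residuals must already be sorted by run order.
    Assumption: standard sample autocorrelation estimate.
    """
    n = len(residuals)
    if max_lag is None:
        # Default from the text: n divided by 4 (rounding down is an assumption).
        max_lag = max(1, n // 4)
    if not 1 <= max_lag <= n - 1:
        raise ValueError("max_lag must be between 1 and n - 1")

    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)

    # Lag k pairs run i with run i + k (run 1 with run 1 + k, and so on).
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


# Residuals that alternate in sign are strongly negatively correlated at lag 1.
print(residual_autocorrelation([1, -1, 1, -1, 1, -1, 1, -1]))  # [-0.875, 0.75]
```

The sketch returns only the coefficients; the rule the software uses to flag a lag as significant is not stated in this topic, so no threshold is applied here.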

Diagnostic Plots

* These plots are available only when there is error in the design, indicated by a positive value for sum of squares for Residual in the ANOVA table of the analysis results.
