An Introduction to Multiple Regression Flashcards Preview

Flashcards in An Introduction to Multiple Regression Deck (24)

How do you calculate a residual?

It is the observed value minus the predicted value.
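
As a minimal sketch of this subtraction, the snippet below computes one residual per case. The data values and the function name `residuals` are made up for illustration.

```python
# Hypothetical observed outcomes and model predictions (illustrative values only).
observed = [10.0, 12.0, 15.0, 11.0]
predicted = [9.5, 12.5, 14.0, 11.0]


def residuals(observed, predicted):
    """Return observed minus predicted for each case."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


resid = residuals(observed, predicted)
print(resid)  # [0.5, -0.5, 1.0, 0.0]
```

A positive residual means the model under-predicted that case; a negative residual means it over-predicted.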


What is a Partial Correlation?

It is the correlation between two variables while controlling for a third.


What is a Semi-Partial Correlation?

It is the correlation between two variables while controlling for the effect of a third variable on only one of those two variables.
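
The difference between the two can be sketched with the standard first-order formulas, which work from the three pairwise correlations. The function names and the example correlations are assumptions chosen for illustration, with `z` as the third (controlled) variable.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z removed from BOTH x and y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def semipartial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z removed from x ONLY."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_xz**2)


# Hypothetical pairwise correlations.
r_xy, r_xz, r_yz = 0.5, 0.6, 0.4
partial = partial_corr(r_xy, r_xz, r_yz)
semipartial = semipartial_corr(r_xy, r_xz, r_yz)
print(round(partial, 3), round(semipartial, 3))
```

The two formulas share a numerator; the partial correlation divides by an extra term no larger than 1, so its absolute value is never smaller than that of the semi-partial correlation.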


What are the 3 main things we can learn from a Multiple Regression model?

How well the model explains the outcome.
How much variance in the outcome our model explains.
The importance of each individual predictor.


What are the 3 main types of Multiple Regression?

Forced entry (all data in at once).
Hierarchical (researcher decides variable order).
Stepwise (SPSS decides the variable order based on statistical criteria).


What program should you use to determine the sample size needed (which depends on the effect size)?



What is R-Squared?

It is the variance accounted for by the model (the amount of variance in the DV the model explains).


How do we know if our model generalises well?

The closer the Adjusted R-Squared is to R-Squared, the better our model is likely to generalise to other samples.
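
A minimal sketch of both quantities follows, assuming a least-squares model with an intercept and using the common adjustment formula 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the sample size and k the number of predictors. The function names and example numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the outcome accounted for by the model."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Shrink R-squared to account for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical model: R-squared of 0.50 from 21 cases and 2 predictors.
r2 = 0.50
adj = adjusted_r_squared(r2, n=21, k=2)
print(round(adj, 3))  # 0.444
```

The gap between the two values shrinks as the sample grows relative to the number of predictors.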


Why is R not useful?

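This is because in Multiple Regression we have several predictors, so R (the multiple correlation between the observed and predicted outcome values) does not describe the relationship between the outcome and any single predictor. R-Squared is interpreted instead.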


Why is the Standardised Coefficients Beta important?

It puts all predictors on the same scale (standard deviation units), which allows us to compare them to decide which are the most important. The higher the absolute value, the more important the variable is as a predictor.
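
As a rough sketch, a standardised Beta can be obtained from an unstandardised B by rescaling with the sample standard deviations of the predictor and the outcome. The function name and the data are assumptions for illustration.

```python
import statistics


def standardised_beta(b, x_values, y_values):
    """Convert an unstandardised B into a standardised Beta."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)


# Hypothetical predictor and outcome scores, with an unstandardised B of 2.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
beta = standardised_beta(2.0, x, y)
print(round(beta, 3))  # 1.0
```

A Beta of 1.0 here means a one standard deviation increase in the predictor goes with a one standard deviation increase in the outcome.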


When reporting the regression equation what are the coefficients also known as in SPSS?

Unstandardised B.
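
To illustrate how the Unstandardised B values are used, the sketch below plugs a constant (intercept) and two B coefficients into a regression equation to predict an outcome. All numbers and the function name `predict` are hypothetical.

```python
def predict(intercept, coefficients, predictor_values):
    """Predicted outcome = intercept + sum of (B * predictor value)."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictor_values))


# Hypothetical equation: outcome = 2.0 + 0.5 * x1 + 1.5 * x2
y_hat = predict(2.0, [0.5, 1.5], [4.0, 2.0])
print(y_hat)  # 7.0
```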


What are the three assumptions of Multiple Regression pre-experiment?

The outcome variable should be continuous.
The predictor variables should be continuous or dichotomous.
There should be reasonable theoretical ground for including variables.


What are the four assumptions of Multiple Regression post-experiment?

Linearity.
Homoscedasticity.
Normal distribution of residuals.
No multicollinearity.


What is meant by linearity?

There should be a linear relationship between each predictor and the outcome. Partial plots should be checked for this.


What is meant by homoscedasticity?

The variance of the residuals should be constant across all predicted values.


What shape indicates heteroscedasticity?

Funnel/cone shape.


What graph should be looked at when checking for homoscedasticity?

Graph of standardised residuals by standardised predicted values (ZRESID by ZPRED).


What two graphs should be looked at when checking for normal distribution of residuals?

Histogram (should be bell shaped) + normal probability plot (points should be close to the diagonal).


What two statistics should you look at to check for no multicollinearity?

Tolerance + VIF statistic.


What are the tolerance + VIF statistic rules in order for there to be no multicollinearity?

VIF value should not be larger than 10.
Tolerance value should not be less than 0.1 (although 0.2 is already a concern).
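
The two statistics are reciprocals of each other: tolerance is 1 minus the R-Squared from regressing one predictor on the others, and VIF is 1 divided by tolerance. The sketch below assumes only two predictors, in which case that R-Squared is simply their squared correlation. The function names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for the two-predictor case only.

    Perfectly collinear predictors give a tolerance of 0, so the VIF
    is undefined (division by zero).
    """
    tolerance = 1 - pearson_r(x1, x2) ** 2
    return tolerance, 1 / tolerance


# Hypothetical scores on two predictors (they correlate at r = 0.8).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
tolerance, vif = tolerance_and_vif(x1, x2)
print(round(tolerance, 2), round(vif, 2))  # 0.36 2.78
```

With these made-up values both statistics fall inside the rules above (VIF below 10, tolerance above 0.2).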


Why is multicollinearity an issue?

A good predictor might be rejected.
It may lead to errors in estimation of regression coefficients.


What are two possible solutions for multicollinearity?

Combine predictors.
Remove one of the variables.


What is an alternative indication of multicollinearity (not including the VIF + tolerance statistics)?

A high R-Squared with non-significant beta coefficients.


Why must all the assumptions in Multiple Regression be met?

This is because violations of the assumptions could affect the fit and generalisability of the model.