Multiple Correlation and Regression Flashcards (19 cards)

1
Q

Multiple Regression and Multiple Correlation

A

Tools that can be used to examine combined relations between multiple predictors and a dependent variable

2
Q

Expand on the ideas of correlation and bivariate regression to now include

A

multiple independent variables in the prediction of a single dependent variable; a major advantage of using multiple IVs is that, if we choose our variables wisely, we can increase the amount of information available for understanding and predicting the dependent (criterion) variable

3
Q

Multiple Correlation Coefficient

A

Symbolized using a capital R to distinguish it from the bivariate Pearson r; can only vary from 0 to 1.0 (as opposed to r, which ranges from -1.00 to +1.00)

4
Q

What does R = 0 indicate?

A

That the IVs have no relationship with the DV/Criterion

5
Q

What does R = 1.0 indicate?

A

That the X variables (IVs) have a perfect relationship with Y (DV/criterion)

6
Q

The squared multiple correlation coefficient (R2)

A

Typically more useful than the multiple correlation coefficient (R) because we can interpret R2 values as the multivariate coefficient of determination; if R2 between Y and the X variables is .50, we can interpret that as 50% of the variance in Y being accounted for by (or shared with) the X variables
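
A minimal sketch of this R2 interpretation in Python (the data here are invented for illustration; only the 1 - SSres/SStot logic is the point):

import numpy as np

# Invented data: observed Y and the predictions from some fitted MR equation
y = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
y_pred = np.array([11.0, 13.0, 14.0, 17.0, 21.0, 24.0])

ss_res = np.sum((y - y_pred) ** 2)    # variance in Y left unexplained
ss_tot = np.sum((y - y.mean()) ** 2)  # total variance in Y
r_squared = 1 - ss_res / ss_tot       # multivariate coefficient of determination

# An R2 of .50 would mean 50% of the variance in Y is shared with the Xs
print(f"R2 = {r_squared:.2f}")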

7
Q

Partial Correlation Coefficient

A

Allows us to examine the relationship between Y and X1 after removing the influence of X2 from both Y and X1
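
A minimal sketch of the residualizing idea behind a partial correlation (the data and coefficients are invented; the logic is: regress X2 out of both Y and X1, then correlate what is left):

import numpy as np

rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)            # X1 partly driven by X2
y = 0.5 * x1 + 0.6 * x2 + rng.normal(size=n)  # Y driven by both

def residuals(target, covariate):
    # Remove the linear influence of the covariate from the target
    X = np.column_stack([np.ones_like(covariate), covariate])
    b, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ b

# Partial correlation of Y and X1, controlling for X2
r_partial = np.corrcoef(residuals(y, x2), residuals(x1, x2))[0, 1]
print(f"partial r(Y, X1 | X2) = {r_partial:.2f}")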

8
Q

Multiple Regression Formula

A

Yp = a + b1X1 + b2X2 + … + bkXk (b1, b2, b3, …, bk are the slope coefficients that give weight to the IV/predictor variables according to their relative contributions to the prediction of Y; k represents the number of predictors; a is a constant that is similar to the Y-intercept in bivariate regression)
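
A minimal sketch of solving for a, b1, and b2 by ordinary least squares in Python (the data are invented; numpy's lstsq does the fitting):

import numpy as np

# Invented data: two predictors (X1, X2) and one criterion (Y)
X1 = np.array([60.0, 70.0, 80.0, 90.0, 100.0, 110.0])
X2 = np.array([20.0, 22.0, 25.0, 24.0, 30.0, 28.0])
Y = np.array([95.0, 120.0, 150.0, 165.0, 200.0, 210.0])

# Design matrix: a column of 1s for the constant a, then the predictors
X = np.column_stack([np.ones_like(X1), X1, X2])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
a, b1, b2 = coef

print(f"Yp = {a:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")
Yp = X @ coef  # predicted Y values from the equation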

9
Q

Multiple Regression

A

Used to find the most satisfactory solution to the prediction of Y, which is the solution that produces the lowest standard error of the estimate (SEE); each predictor variable is weighted so that the b values reflect each predictor's relative contribution to the overall equation
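
A minimal sketch of the SEE that multiple regression minimizes (the y values are invented; k is the number of predictors, matching the deck's notation):

import numpy as np

def standard_error_of_estimate(y, y_pred, k):
    # SEE = sqrt(SS_residual / (n - k - 1))
    n = len(y)
    ss_res = np.sum((y - y_pred) ** 2)
    return np.sqrt(ss_res / (n - k - 1))

# Invented values: 6 observations, 2 predictors
y = np.array([95.0, 120.0, 150.0, 165.0, 200.0, 210.0])
y_pred = np.array([98.0, 118.0, 148.0, 170.0, 196.0, 210.0])
print(f"SEE = {standard_error_of_estimate(y, y_pred, k=2):.2f}")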

10
Q

Two different approaches for developing MR equations

A

1. Let the computer do the work using a pre-existing algorithm (Forward Selection, Backward Elimination, Stepwise)
2. Specifically tell the computer how to construct the equation (hierarchical multiple regression: set up a hierarchical order for inclusion of the IVs/predictors; useful when some IVs are easier to measure than others, or when some are more acceptable to use than others)

11
Q

Forward selection

A

A computer-generated method that starts with no variables in the model; X variables are then added one at a time into the regression equation (in the SPSS output, the bottom row of the correlation table displays the bivariate correlations between the DV and the IVs)
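
A minimal sketch of the forward-selection loop in Python (the min_gain threshold is an invented stand-in for the significance test a package such as SPSS applies at each step; predictors is assumed to map variable names to 1-D arrays):

import numpy as np

def r_squared(cols, y):
    # R2 of an OLS fit of y on the given predictor columns plus a constant
    X = np.column_stack([np.ones(len(y))] + cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def forward_selection(predictors, y, min_gain=0.02):
    # Enter predictors one at a time, always taking the largest R2 gain
    selected, best_r2 = [], 0.0
    remaining = dict(predictors)
    while remaining:
        trial = {name: r_squared([predictors[s] for s in selected] + [col], y)
                 for name, col in remaining.items()}
        name = max(trial, key=trial.get)
        if trial[name] - best_r2 < min_gain:
            break  # no remaining variable adds enough unique variance
        selected.append(name)
        best_r2 = trial[name]
        del remaining[name]
    return selected, best_r2

With the deck's torque data, the first pass would pick body weight (the highest bivariate r with the DV) and the second would pick age, mirroring Steps #1 and #2 in the cards that follow.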

12
Q

At Step #1, SPSS will select the X variable to be entered into the equation first by picking the variable with the highest Pearson’s r with the DV
So what variable do we anticipate will be selected first?

A

Body weight, because it has the highest bivariate Pearson's r with isokinetic torque (Step #1 in the next card confirms that weight enters first)

13
Q

Analysis of Step #1 shows that the y-intercept (a) has a value of -27.92 Nm, and the slope coefficient (b) has a value of 2.18 Nm/kg
So the prediction equation becomes: isokinetic torque (Nm) = -27.92 + 2.18 (weight) (a quick worked example follows this card)

A

xxx
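
A quick worked check of the Step #1 equation (the body weight is hypothetical): for an 80 kg subject, predicted isokinetic torque = -27.92 + 2.18 × 80 = 146.48 Nm.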

14
Q

isokinetic torque (Nm) = -27.92 + 2.18 (weight)
At step #1, we do not yet have an MR equation because we only have one X variable
However, we've accounted for ~86% of the variance in isokinetic torque with body weight as the only predictor, so ~14% of the variance is left unexplained

A

xxx

15
Q

At step #2, the selection algorithm selects the variable that increases R2 the most (the one that adds the most unique variance to the equation)
So, after removing the effects of weight, how much variance in isokinetic torque is accounted for by fat-free weight, age, percent fat, etc.?
We find that age is the variable that adds the most unique variance

A

xxx

16
Q

At step #2, the new equation is: isokinetic torque = -94.36 + 1.45 (weight) + 7.61 (age)
By adding age to the prediction model, we’ve increased R2 by .06 (6%) and decreased the standard error of the estimate by 3.7 Nm
We then need to determine whether adding this variable significantly improves the equation
We do this by assessing the significance of the R2 change (a sketch of this test follows this card)

A

xxx
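
A minimal sketch of testing the significance of the R2 change with a partial F-test (the formula is the standard one for nested models; the sample size n = 30 below is invented, while the R2 values come from this deck's Step #1 and Step #2):

from scipy.stats import f as f_dist

def r2_change_test(r2_reduced, r2_full, n, k_full, m):
    # F = (delta_R2 / m) / ((1 - R2_full) / (n - k_full - 1)),
    # where m predictors were added to reach the full model
    df1, df2 = m, n - k_full - 1
    F = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return F, f_dist.sf(F, df1, df2)

# Step #1 -> Step #2: R2 went from ~.86 to ~.92 by adding age (m = 1)
F, p = r2_change_test(r2_reduced=0.86, r2_full=0.92, n=30, k_full=2, m=1)
print(f"F(1, 27) = {F:.2f}, p = {p:.4f}")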

17
Q

After step #2, the combination of weight and age has accounted for 92% of the variance in isokinetic torque, and 4 X variables are left
But the forward selection procedure stops at step #2 because none of the remaining X variables add enough unique information to significantly improve the equation

Additional Note: The order in which variables are entered into the prediction model does not indicate importance

A

xxx

18
Q

Methods of Multiple Regression – The Backward Elimination Method
Backward Elimination is a computer-generated method that starts by first forcing all of the IVs into the model
Then the algorithm looks for the variable that, when removed, decreases the R2 the least
If the decrease in R2 is not statistically significant, the variable is removed and the algorithm moves to the next step
At this step, the algorithm then selects the next variable that, when removed, decreases the R2 the least
If the decrease in R2 is not statistically significant, the variable is removed and the process is repeated by looking for the third variable to remove
If the decrease in R2 is statistically significant, that variable is not removed and the process stops
“The backward elimination process peels independent variables off one at a time until peeling off the next variable makes the equation significantly worse” (a sketch of this procedure follows this card)

A

xxx
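
A minimal sketch of the backward-elimination loop (the max_loss threshold is an invented stand-in for the significance test on the R2 decrease; the r_squared helper is the same one used in the forward-selection sketch):

import numpy as np

def r_squared(cols, y):
    # R2 of an OLS fit of y on the given predictor columns plus a constant
    X = np.column_stack([np.ones(len(y))] + cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def backward_elimination(predictors, y, max_loss=0.02):
    # Start with every predictor in the model, then peel them off one at a
    # time until removing the next one makes the equation significantly worse
    kept = dict(predictors)
    while len(kept) > 1:
        full_r2 = r_squared(list(kept.values()), y)
        losses = {name: full_r2 - r_squared(
                      [col for other, col in kept.items() if other != name], y)
                  for name in kept}
        name = min(losses, key=losses.get)  # removal that hurts R2 the least
        if losses[name] > max_loss:
            break
        del kept[name]
    return list(kept)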

19
Q

Stepwise follows the same procedure as forward selection, with the addition that, at each step, the algorithm can remove variables that were previously selected
This can occur when subsequently added variables make a previously included variable no longer useful
i.e., the variance accounted for by the subsequently added variables overlaps so much with the earlier variable's that its removal doesn't significantly decrease the R2 of the equation (a sketch of this removal check follows this card)

A

xxx
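
A minimal sketch of the extra removal check that stepwise adds after each forward step (this assumes the r_squared helper and the selected/predictors structures from the forward-selection sketch above; max_loss is an invented stand-in for the significance test):

def prune_redundant(selected, predictors, y, max_loss=0.02):
    # Drop any already-entered variable whose removal no longer significantly
    # decreases R2, i.e. its variance now overlaps with later additions
    current_r2 = r_squared([predictors[s] for s in selected], y)
    for name in list(selected):
        reduced = [predictors[s] for s in selected if s != name]
        if reduced and current_r2 - r_squared(reduced, y) < max_loss:
            selected.remove(name)  # variable became redundant
            current_r2 = r_squared(reduced, y)
    return selected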