Flashcards in Midterm 2.... Deck (46)
Convert odds of 1:8 to a probability.
1/8 = 0.125, so 0.125/(0.125 + 1) = 0.1111, an 11.11% probability. Odds of 1:8 mean 1 success for every 8 failures, i.e. 1 out of every 9 trials.
Equation for converting odds (for/against) to a probability
odds/(odds + 1), where odds = for/against
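A minimal sketch of the odds-to-probability conversion, using the 1:8 example from the first card. The function names are hypothetical, chosen only for illustration.

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds in favor (for/against, e.g. 1:8 -> 1/8) to a probability."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (odds + 1)


def probability_to_odds(p: float) -> float:
    """Inverse conversion; undefined at p = 1, where the odds are unbounded."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


p = odds_to_probability(1 / 8)
print(round(p, 4))  # 0.1111
```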
What does a logistic regression model predict?
LogOdds! This will have a range of (-infinity, infinity)
How do you convert logOdds to odds?
e^logOdds = odds
How do you convert logOdds to probability?
e^logOdds/(e^logOdds + 1)
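The two conversions above can be sketched as follows. The function names are hypothetical, and the code follows the card's formula literally, so very large logOdds values would overflow math.exp.

```python
import math


def log_odds_to_odds(log_odds: float) -> float:
    """odds = e^logOdds"""
    return math.exp(log_odds)


def log_odds_to_probability(log_odds: float) -> float:
    """probability = e^logOdds / (e^logOdds + 1)"""
    odds = math.exp(log_odds)
    return odds / (odds + 1)


# logOdds of 0 means even odds (1:1), so the probability is one half.
print(log_odds_to_odds(0.0))         # 1.0
print(log_odds_to_probability(0.0))  # 0.5
```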
What does logOdds equal in terms of ln(x)?
ln(odds) = logOdds. Here "log" means the natural logarithm (base e), so log(odds) = logOdds is the same statement.
What is the range of odds (what are they bound by)?
[0, infinity)
What is the range of logOdds (what are they bound by)?
(-infinity, infinity)
What type of estimation model is logistic regression, and why?
A class probability estimation model. It outputs a numeric value that estimates the probability of a class of a categorical variable. Ex. What is the chance Marc goes to class? 0.3
What loss function does support vector machine use?
Hinge loss (loss function)
Hinge loss
An instance does not incur a penalty merely for being on the wrong side of the line. It is penalized ONLY when it is on the wrong side and beyond the margin.
Zero-one loss
An instance incurs a loss of 0 for a correct decision and 1 for an incorrect decision.
Squared error
Specifies a loss equal to the square of the distance from the boundary, so an instance further away has a greater error. Usually used for numeric value prediction rather than classification.
Loss function
Determines how much penalty should be assigned to an instance based on the model's predicted value.
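A rough sketch of the three loss functions described above. It assumes class labels coded as -1/+1 and a hypothetical `score` giving the model's signed output for an instance. Hinge loss is written in the common max(0, 1 - y * score) form with a margin of 1. That form is an assumption here and is slightly stricter than the card's wording, since it also charges a small loss to correct instances that fall inside the margin.

```python
def zero_one_loss(y: int, score: float) -> int:
    """0 for a correct decision, 1 for an incorrect one (y is -1 or +1)."""
    return 0 if y * score > 0 else 1


def hinge_loss(y: int, score: float) -> float:
    """Common hinge form (assumed): zero beyond the margin on the correct
    side, then growing linearly as the instance moves the wrong way."""
    return max(0.0, 1.0 - y * score)


def squared_error(y_true: float, y_pred: float) -> float:
    """Square of the distance between the target and the prediction."""
    return (y_true - y_pred) ** 2


print(zero_one_loss(+1, -0.5))  # 1
print(hinge_loss(+1, 2.0))      # 0.0
print(hinge_loss(+1, -1.0))     # 2.0
print(squared_error(3.0, 1.0))  # 4.0
```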
Finish this sentence. Accuracy on the training data is sometimes called...
In-sample accuracy (training data), as opposed to out-of-sample accuracy (test data).
When is logistic regression more accurate vs. decision tree and vice versa?
Logistic regression tends to be more accurate on smaller data sets; decision trees tend to be more accurate on larger ones.
What's the point of regularization?
It penalizes more complicated models, because those are more prone to overfitting.
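One way to picture the penalty is a regularized objective: the fit to the data plus a complexity term. The sketch below assumes an L2 penalty (the sum of squared weights), which is only one common choice. `penalized_loss`, `lam`, and all numbers are hypothetical.

```python
def penalized_loss(data_loss: float, weights: list, lam: float) -> float:
    """Data loss plus an L2 complexity penalty: lam * sum(w^2)."""
    penalty = lam * sum(w * w for w in weights)
    return data_loss + penalty


# The second model fits the training data slightly better (9.0 < 10.0)
# but uses much larger weights, so its penalized loss is worse.
simple_model = penalized_loss(10.0, [0.5, -0.5], lam=1.0)
complex_model = penalized_loss(9.0, [3.0, -4.0], lam=1.0)
print(simple_model)   # 10.5
print(complex_model)  # 34.0
```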
In a confusion matrix, what are the column headers? Row headers?
Columns: actual y and n
Rows: predicted y and n
False positive
Predicted positive, actual negative
False negative
Predicted negative, actual positive
True negative
Predicted negative, actual negative
True positive
Predicted positive, actual positive
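The four confusion matrix cells can be tallied from paired actual and predicted labels. This is a minimal sketch; `confusion_counts` and the sample labels are made up for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="y"):
    """Count TP, FP, FN, and TN from paired actual/predicted labels."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1  # predicted positive, actual positive
        elif p == positive:
            counts["FP"] += 1  # predicted positive, actual negative
        elif a == positive:
            counts["FN"] += 1  # predicted negative, actual positive
        else:
            counts["TN"] += 1  # predicted negative, actual negative
    return counts


actual = ["y", "y", "n", "n", "y", "n"]
predicted = ["y", "n", "y", "n", "y", "n"]
counts = confusion_counts(actual, predicted)
print(counts["TP"], counts["FP"], counts["FN"], counts["TN"])  # 2 1 1 2
```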
True positive rate
True positives / all actual positives (true positives + false negatives)
False positive rate
False positives / all actual negatives (false positives + true negatives)
Positive predictive value (PPV)
True positives / all predicted positives (true positives + false positives)
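The three rates above follow directly from the confusion matrix counts. A small sketch with invented counts; it does not guard against a zero denominator.

```python
def classifier_rates(tp: int, fp: int, fn: int, tn: int):
    """Return (true positive rate, false positive rate, PPV)."""
    tpr = tp / (tp + fn)  # share of actual positives that were caught
    fpr = fp / (fp + tn)  # share of actual negatives flagged as positive
    ppv = tp / (tp + fp)  # share of predicted positives that were right
    return tpr, fpr, ppv


# Hypothetical counts, for illustration only.
tpr, fpr, ppv = classifier_rates(tp=40, fp=10, fn=20, tn=30)
print(round(tpr, 4), fpr, ppv)  # 0.6667 0.25 0.8
```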
What's the expected value of a $100 bet on black in roulette, if the probability of hitting black is 48%?
EV = (0.48)(100) + (1 - 0.48)(-100) = 48 - 52 = -4, i.e. an expected loss of $4 per bet.
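The same calculation as a short sketch: expected value is the sum of each outcome's probability times its value. `expected_value` is a hypothetical helper name.

```python
def expected_value(outcomes) -> float:
    """outcomes is an iterable of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


# Roulette card: win $100 with probability 0.48, lose $100 otherwise.
ev = expected_value([(0.48, 100), (0.52, -100)])
print(round(ev, 2))  # -4.0
```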
What are the two uses for expected value?
1. Inform how to use our classifier for individual predictions.
2. Compare classifiers.
Class priors
The proportion of positive and negative instances in your data set. Ex. 40 of 100 people would buy a new car next year if they could. p(p) = 0.4, p(n) = 0.6
Two critical conditions underlying profit calculations:
1. Class priors
2. Costs and benefits
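A sketch of how the two conditions might combine into an expected profit per instance: weight each class's outcomes by its prior, then by the classifier's rates and the cost or benefit of each outcome. The priors come from the car example above; the rates, the dollar values, and the function name are assumptions made up for illustration. Costs are entered as negative numbers.

```python
def expected_profit(prior_pos, tpr, fpr,
                    benefit_tp, cost_fn, benefit_tn, cost_fp):
    """Expected profit per instance from class priors, rates, and values."""
    prior_neg = 1 - prior_pos
    # Actual positives: either caught (TP) or missed (FN).
    pos_part = prior_pos * (tpr * benefit_tp + (1 - tpr) * cost_fn)
    # Actual negatives: either correctly ignored (TN) or wrongly flagged (FP).
    neg_part = prior_neg * ((1 - fpr) * benefit_tn + fpr * cost_fp)
    return pos_part + neg_part


profit = expected_profit(prior_pos=0.4, tpr=0.6, fpr=0.25,
                         benefit_tp=100, cost_fn=0,
                         benefit_tn=0, cost_fp=-10)
print(round(profit, 2))  # 22.5
```

Changing the priors or the cost and benefit values changes the result even when the classifier's rates stay the same, which is why both conditions matter when comparing classifiers.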