Flashcards in Midterm 2.... Deck (46)

1

## Convert odds of 1:8 to a probability.

### Odds of 1:8 mean 1 favorable outcome for every 8 unfavorable, i.e. 1 out of every 9. As a ratio, 1/8 = 0.125, so probability = 0.125 / (0.125 + 1) ≈ 0.1111, or 11.11%.

2

## Equation for converting odds (for/against) to a probability

### probability = odds(for/against) / (odds(for/against) + 1)
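As a minimal sketch of this conversion (the function name `odds_to_probability` is an illustrative choice, not from the deck), the odds are passed as a single ratio, so 1:8 becomes 1/8:

```python
def odds_to_probability(odds: float) -> float:
    """Convert odds expressed as a ratio (1:8 -> 1/8) to a probability."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    # probability = odds / (odds + 1); the parentheses matter.
    return odds / (odds + 1)


# Odds of 1:8 from the first card: 1 favorable outcome out of every 9.
p = odds_to_probability(1 / 8)
print(round(p, 4))
```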

3

## What does a logistic regression model predict?

### LogOdds! This will have a range of (-infinity, infinity)

4

## How do you convert logOdds to odds?

### e^logOdds = odds

5

## How do you convert logOdds to probability?

### e^logOdds/(e^logOdds + 1)
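The two log-odds conversions can be sketched together in Python. The function names are illustrative, and the sketch assumes moderate log-odds values (very large inputs would overflow `math.exp`):

```python
import math


def log_odds_to_odds(log_odds: float) -> float:
    """odds = e^logOdds."""
    return math.exp(log_odds)


def log_odds_to_probability(log_odds: float) -> float:
    """probability = e^logOdds / (e^logOdds + 1)."""
    odds = math.exp(log_odds)
    return odds / (odds + 1)


# Log-odds of 0 correspond to even odds (1:1), i.e. a probability of 0.5.
print(log_odds_to_odds(0.0), log_odds_to_probability(0.0))
```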

6

## What does logOdds equal in terms of ln(x)?

### ln(odds) = logOdds. Here "log" means the natural logarithm, so log(odds) = logOdds says the same thing.

7

## What is the range of odds (what are they bound by?)

### [0, infinity)

8

## What is the range of logOdds (what are they bound by?)

### (-infinity, infinity)

9

## What type of estimation model is logistic regression, and why?

### Class probability estimation model. It outputs a numeric value that estimates the probability of a categorical outcome. Ex. What is the chance Marc goes to class? 0.3

10

## What loss function does support vector machine use?

### Hinge loss

11

## Hinge loss (loss function)

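### An instance on the correct side of the boundary and outside the margin incurs no penalty. The loss becomes positive only once an instance falls inside the margin or on the wrong side of the boundary, and it then grows linearly with distance from the margin.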

12

## Zero-one loss

### An instance incurs a loss of 0 for a correct decision and 1 for an incorrect decision.

13

## Squared error

### Specifies a loss equal to the square of the distance from the boundary, so an instance farther away incurs a greater error. Usually used for numeric value prediction rather than classification.
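The three loss functions above can be compared with a short Python sketch. It assumes labels `y` coded as -1 or +1 and a signed model score (positive means the positive class); that coding, the function names, and the unit margin in the hinge loss are conventional assumptions rather than details from the deck. The squared error is shown in its usual numeric-prediction form.

```python
def zero_one_loss(y: int, score: float) -> int:
    """0 for a correct decision, 1 for an incorrect one."""
    return 0 if y * score > 0 else 1


def hinge_loss(y: int, score: float) -> float:
    """Zero beyond the margin on the correct side, then linear growth."""
    return max(0.0, 1.0 - y * score)


def squared_error(target: float, prediction: float) -> float:
    """Square of the difference between target and prediction."""
    return (target - prediction) ** 2


# A positive instance scored at +2.0, +0.5 and -1.0 respectively.
for score in (2.0, 0.5, -1.0):
    print(score, zero_one_loss(1, score), hinge_loss(1, score))
```

Note how hinge loss keeps increasing as the score moves further onto the wrong side, while zero-one loss stays at 1.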

14

## Loss function

### Determines how much penalty should be assigned to an instance based on the model's predicted value.

15

## Finish this sentence. Accuracy of training data is sometimes called...

### In-sample accuracy (train), as opposed to out-of-sample accuracy (test)

16

## When is logistic regression more accurate vs. decision tree and vice versa?

### LR tends to be more accurate on smaller data sets; DT tends to be more accurate on larger ones.

17

## What's the point of regularization?

### It gives a penalty to more complicated models because those are more prone to overfitting.
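One common way to express such a penalty is to add a term to the quantity being minimized. The sketch below assumes an L2 penalty (sum of squared weights) scaled by a strength `lam`; the deck does not say which penalty is meant, so both the penalty form and the names are assumptions.

```python
def l2_penalized_objective(data_loss: float, weights: list[float], lam: float) -> float:
    """Training loss plus an L2 complexity penalty (an assumed penalty form)."""
    penalty = lam * sum(w * w for w in weights)
    return data_loss + penalty


# Same fit to the data, but larger weights make the objective worse.
small = l2_penalized_objective(1.0, [0.3, 0.4], lam=0.1)
large = l2_penalized_objective(1.0, [3.0, 4.0], lam=0.1)
print(small < large)
```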

18

## In a confusion matrix what are the column headers? Row headers?

### Columns: actual class (y and n). Rows: predicted class (y and n).

19

## False positive

### Predicted positive, actual negative

20

## False negative

### Predicted negative, actual positive

21

## True negative

### Predicted negative, actual negative

22

## True positive

### Predicted positive, actual positive

23

## True positive rate

### True positives / all actual positives = TP / (TP + FN)

24

## False positive rate

### False positives / all actual negatives = FP / (FP + TN)

25

## Positive predictive value (PPV)

### True positives / all predicted positives (both correct and incorrect) = TP / (TP + FP)
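The three rates can be computed directly from the four confusion-matrix cells. This is a minimal sketch: the function name is illustrative and the counts are made-up example numbers, not data from the course.

```python
def confusion_rates(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (true positive rate, false positive rate, positive predictive value)."""
    tpr = tp / (tp + fn)  # share of actual positives predicted positive
    fpr = fp / (fp + tn)  # share of actual negatives predicted positive
    ppv = tp / (tp + fp)  # share of predicted positives that are actually positive
    return tpr, fpr, ppv


# Hypothetical counts for illustration only.
tpr, fpr, ppv = confusion_rates(tp=40, fp=10, tn=45, fn=5)
print(round(tpr, 3), round(fpr, 3), round(ppv, 3))
```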

26

## What's the expected value of a game of roulette? Probability of hitting black = 48%. Bet = $100

### EV = (0.48)(100) + (1 - 0.48)(-100) = -4, i.e. an expected loss of $4 per $100 bet.
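The roulette calculation is a probability-weighted sum of outcomes, which can be sketched as follows (the helper name `expected_value` is illustrative):

```python
def expected_value(outcomes: list[tuple[float, float]]) -> float:
    """Sum of probability * value over all (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


# Win $100 with probability 0.48, lose $100 otherwise.
ev = expected_value([(0.48, 100), (1 - 0.48, -100)])
print(round(ev, 2))
```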

27

## What are the two uses for expected value?

### 1. Inform how to use our classifier for individual predictions. 2. Compare classifiers.

28

## Class priors

### The proportion of positive and negative instances in your data set. Ex. 40 of 100 people would buy a new car next year if they could. p(p) = .4, p(n) = .6
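Class priors are simply label frequencies, so they can be estimated by counting. The sketch below rebuilds the 40-of-100 example from this card; the label strings "p" and "n" and the function name are illustrative choices.

```python
from collections import Counter


def class_priors(labels: list[str]) -> dict[str, float]:
    """Proportion of each class label in the data set."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


# 40 positives and 60 negatives, as in the car-buyer example.
labels = ["p"] * 40 + ["n"] * 60
priors = class_priors(labels)
print(priors)
```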

29

## Two critical conditions underlying profit calculations:

### 1. Class priors. 2. Costs and benefits.
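One way to see how these two ingredients combine is to weight each confusion-matrix outcome by its probability and its cost or benefit. The sketch below is an assumption-laden illustration: the function name, the rates, and the dollar values are hypothetical, and only the 0.4 prior echoes the class-priors card.

```python
def expected_profit(p_pos: float, tpr: float, fpr: float,
                    b_tp: float, c_fn: float, b_tn: float, c_fp: float) -> float:
    """Expected profit per instance from class priors, rates, and costs/benefits."""
    p_neg = 1 - p_pos
    # Actual positives: either caught (true positive) or missed (false negative).
    pos_part = p_pos * (tpr * b_tp + (1 - tpr) * c_fn)
    # Actual negatives: either correctly rejected or wrongly flagged.
    neg_part = p_neg * ((1 - fpr) * b_tn + fpr * c_fp)
    return pos_part + neg_part


# Hypothetical values: $100 benefit per true positive, $10 cost per false positive.
profit = expected_profit(p_pos=0.4, tpr=0.8, fpr=0.1,
                         b_tp=100, c_fn=0, b_tn=0, c_fp=-10)
print(round(profit, 2))
```

Changing either the prior or the cost/benefit values changes the result, which is why both must be pinned down before comparing classifiers on profit.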

30