Flashcards in Probability/Statistical Significance Deck (52)
What are the two ways studies can go wrong?
1. Caused by chance = random error
2. Not caused by chance = bias or systematic error
What deals with random error in studies?
Tests of statistical inference (statistics)
If a study has a random error, is it likely to happen again if/when the study is repeated?
No. Random errors are caused by chance and are not predictable or repeatable
An error that is inherent to the study method being used and results in a predictable and repeatable error for each observation is labeled a _____ error. What is it due to?
Systematic error due to bias
T/F: If you repeat a study that had a systematic error, it is likely to happen again
True. These errors are not caused by chance, and there is no formal method to deal with them
What tests will estimate the likelihood that a study result was caused by chance?
Tests of statistical inference
**a study result is called "statistically significant" if it is unlikely to be caused by chance
If a study result is statistically significant, is it also clinically significant?
Not necessarily. Those terms have two different meanings
*even very small measures of association that are not large enough to matter can be statistically significant
What is a chance occurrence?
Something that happens unpredictably without discernible human intention or with no observable cause: caused by chance or random variation
What is random variation?
There is error in every measurement. If we measure something over and over again, we will get slightly different measurements each time AND a few measurements may be extreme
What is statistical inference?
Tells us: if we measure something only once, how likely it is that our measurement was caused by chance
What two methods are used for estimating how much random variation there is in our study and whether our result was likely to have been caused by chance?
1. Confidence intervals
2. P-values
_______ estimates how much random variation there is in our measurement
-the range of values where the true value of our measurement could be found
_____ are used to estimate whether the measure was likely to have been caused by chance or not
Will small sample sizes have a large 95% Confidence interval or small CI?
What about large sample sizes?
The larger the sample size, the smaller the confidence interval will be = more precise
*small samples have large CIs
*Large samples have small CIs
How do you interpret this statement?
"prevalence of disease was 8% (95% CI: 4%-12%)"
The estimate of the prevalence from the study was 8%, but we are 95% confident that the true prevalence lies somewhere between 4% and 12%
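The two cards above can be sketched numerically. This is a minimal sketch using the normal-approximation (Wald) interval for a proportion; the deck does not name a CI method, so the formula and the sample sizes here are illustrative assumptions:

```python
import math

def prevalence_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a prevalence estimate."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return (p_hat - z * se, p_hat + z * se)

# Same 8% prevalence estimate at two hypothetical sample sizes:
print(prevalence_ci(0.08, 100))   # small sample: wide CI
print(prevalence_ci(0.08, 1000))  # large sample: narrow CI (more precise)
```

Growing n tenfold shrinks the interval by a factor of sqrt(10), which is the "larger sample = smaller CI" rule above.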
T/F: If the 95% CI for the odds ratio (OR) does NOT include one, the OR is statistically significant
Ex: The odds ratio was 3 (95% CI: 0.5 - 6)
**since this includes that the OR could have the value of ONE = it is NOT statistically significant
How do you interpret 95% confidence intervals (95% CI) for odds ratios (OR)?
1. OR greater than one, 95% CI does NOT include one : Positive association; statistically significant
2. OR greater than one, 95% CI includes one : NO association, NOT statistically significant
3. OR less than one, 95% CI does NOT include one : Negative association, statistically significant
4. OR less than one, 95% CI includes one : No association, NOT statistically significant
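The four rules above reduce to one check: does the 95% CI include one? A minimal sketch of that logic (the function name and return strings are hypothetical, not from the deck):

```python
def interpret_ratio_ci(ratio, ci_low, ci_high):
    """Apply the four OR/CI rules above (the same rules work for RR)."""
    if ci_low <= 1 <= ci_high:    # CI includes one
        return "no association, NOT statistically significant"
    if ratio > 1:                 # CI excludes one, ratio above one
        return "positive association, statistically significant"
    return "negative association, statistically significant"

print(interpret_ratio_ci(3, 0.5, 6))     # the card's example: CI includes one
print(interpret_ratio_ci(3, 1.5, 6))     # CI excludes one, OR above one
print(interpret_ratio_ci(0.5, 0.2, 0.8)) # CI excludes one, OR below one
```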
If the 95% CI for the relative risk (RR) does NOT include one, the RR (is / is not) statistically significant
*remember, when the RR = one, there is no association between the two test groups
How do you interpret a RR greater than one, combined with a 95% CI that does NOT include one?
Positive association; statistically significant
How do you interpret a RR less than one, combined with a 95% CI that includes one?
Not statistically significant
How do you interpret a RR less than one, combined with a 95% CI that does NOT include one?
Negative association; statistically significant
T/F: P-value gives you information about the size of the test sample
False
**it also does NOT give you any info about the range where you can expect to find the true value
To be statistically significant, the p-value must be less than _____
*if the p-value is greater than 0.05, the association is NOT statistically significant and could have been caused by chance
How do you interpret p-values that are less than 0.05?
We are 95% confident that an association as large as the one in our study was NOT caused by chance
How do you interpret the following value?
OR or RR or PR = 3.0 (p = 0.02)
Statistically significant. There is an association. We are 95% certain that an OR of 3.0 could NOT have been caused by chance.
T/F: No matter how large the RR or OR, if the p-value is greater than 0.05, we must say there is no association
True
How are p-values calculated?
Using statistical tests - tests for statistical inference:
1. Chi-squared test
2. Student's t test
(need to know when/where to use these tests - do not worry about calculations)
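The deck says calculations are not required, but a short sketch of the chi-squared test on a 2x2 table shows where a p-value comes from. The counts below are hypothetical, and the closed-form p-value applies only to 1 degree of freedom:

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for the 2x2 table
    [[a, b], [c, d]] (e.g. exposed/unexposed x disease/no disease)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # P(X > chi2) for chi-square with 1 df
    return chi2, p

chi2, p = chi2_2x2(30, 70, 10, 90)    # hypothetical counts
print(f"chi2 = {chi2}, p = {p:.4f}")  # chi2 = 12.5, p < 0.05 -> significant
```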
When testing a hypothesis, can you prove something is true, untrue, or both?
You cannot prove that something is true
You can't prove an association is true
But you CAN prove that something is NOT true --> hence the use of a Null hypothesis
What is a "Null" hypothesis?
A hypothesis that suggests NO association
It is set up to be proven untrue and rejected, which confirms an association
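The null-hypothesis logic of the last few cards can be sketched as a single decision rule; the function name and wording are hypothetical:

```python
def decide(p_value, alpha=0.05):
    """Reject the null hypothesis (no association) when p < alpha."""
    if p_value < alpha:
        return "reject the null: the association is statistically significant"
    return "fail to reject the null: no association demonstrated"

print(decide(0.02))  # like the OR = 3.0 (p = 0.02) card: significant
print(decide(0.20))  # p > 0.05: cannot claim an association
```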