6+7: Data analysis I & II: descriptive & inferential statistics Flashcards Preview

1

6 (7) basic statistics, grouped by level of measurement, used to measure 2 properties

  • nominal scales
    • central tendency --> mode
  • ordinal scales
    • central tendency --> median
    • dispersion --> range (but interquartile range in the presence of extreme values!)
  • interval & ratio scales
    • central tendency --> mean
    • dispersion --> variance & standard deviation
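
As a rough illustration of the statistics listed above, here is a minimal sketch using Python's standard `statistics` module. The data sets (`colors`, `ranks`, `heights`) are hypothetical values made up for this example, and the quartiles use the module's default method, which is only one of several conventions.

```python
import statistics

# Hypothetical example data, invented only for illustration.
colors = ["red", "blue", "blue", "green"]        # nominal scale
ranks = [1, 2, 2, 3, 4, 5, 40]                   # ordinal scale, with one extreme value
heights = [160.0, 165.0, 170.0, 175.0, 180.0]    # interval/ratio scale

# Nominal: central tendency is the mode.
mode_color = statistics.mode(colors)

# Ordinal: median for central tendency, range or interquartile range for dispersion.
median_rank = statistics.median(ranks)
value_range = max(ranks) - min(ranks)            # inflated by the extreme value 40
q1, _q2, q3 = statistics.quantiles(ranks, n=4)   # default quartile convention
iqr = q3 - q1                                    # much less sensitive to the extreme value

# Interval/ratio: mean, sample variance and sample standard deviation.
mean_height = statistics.mean(heights)
var_height = statistics.variance(heights)
sd_height = statistics.stdev(heights)

print(mode_color, median_rank, value_range, iqr)
print(mean_height, var_height, sd_height)
```

The single extreme value pushes the range to 39, while the interquartile range stays small, which is why the card recommends it when extreme values are present.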

2

statistical difference

  • def=
  • it is like...
  • a diff could...

  • A statistical difference is a function of the difference between means relative to the variability.
  • Like a signal-to-noise ratio.
  • a diff could be due to chance, esp. if variability is high

3

t-test def=

+ 3 associated values

The t-test assesses whether the means of two groups are statistically different from each other
 

+ 3 associated values

  • t-value = ( avg_T - avg_C ) / SE( avg_T - avg_C )
  • p-value = probability of obtaining a t-value at least as extreme as the observed one if only randomness is at work (i.e. if the null hypothesis is true)
  • alpha level = significance level, often 0.05, used as threshold
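
The t-value formula above can be sketched in a few lines of Python. This is only an illustration: the card does not say how the standard error is computed, so the sketch assumes the unpooled (Welch) form, and the `treatment` and `control` samples are made-up numbers.

```python
import math
import statistics


def t_value(treatment, control):
    """t = (avg_T - avg_C) / SE(avg_T - avg_C).

    Assumption: SE uses the unpooled (Welch) formula
    sqrt(var_T / n_T + var_C / n_C) with sample variances.
    """
    diff = statistics.mean(treatment) - statistics.mean(control)
    se = math.sqrt(
        statistics.variance(treatment) / len(treatment)
        + statistics.variance(control) / len(control)
    )
    return diff / se


# Hypothetical measurements for a treatment group and a control group.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]

t = t_value(treatment, control)
print(t)
```

Turning the t-value into a p-value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) is the usual route. The resulting p-value is then compared with the alpha level.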

4

null VS alternative hypothesis

null H is conservative (no effect, no difference); alternative H represents a change compared to the current state of knowledge

5

Type I error

& alpha level 

in hypothesis testing

= falsely rejecting a null H that is actually true

with probability = alpha level (commonly set at 5%)
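
A small simulation can make the link between the alpha level and Type I errors concrete. The sketch below is an assumption-laden illustration, not part of the original card: it swaps the t-test for a simpler two-sided z-test with known standard deviation, so that the standard library suffices, and it draws every sample from a population in which the null hypothesis is true.

```python
import random
from statistics import NormalDist, mean


def false_rejection_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Share of trials in which a true null hypothesis (mean = 0) is rejected.

    Assumption: two-sided z-test with known standard deviation 1.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true here: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1                        # a Type I error
    return rejections / trials


rate = false_rejection_rate()
print(rate)
```

The printed rate should land close to the chosen alpha level, since every rejection in this setup is a false one.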

 

6

  • test statistic 
  • used for
  • properties
  • examples

  • a statistic that reflects the ratio of systematic over unsystematic variation
  • used for statistical tests
  • it has a known distribution, so that you can calculate the p-value
  • t, F statistics

7

3 important test types and when to use them

numerical dep. variable VS numerical indep. variable => regression analysis

 

numerical dep. variable VS categorical indep. variable

=> t-test (2 groups) or ANOVA (typically more than 2 groups)

8

ANOVA meaning

ANalysis Of VAriance

9

regression analysis w a linear model:

  • model
  • method

  • = linear equation + probabilistic error term
  • => minimize the distance between empirical measurements & predictions; typically, the sum of squared distances (least squares method)
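
For a single independent variable, the least squares method has a well-known closed-form solution. The sketch below assumes that simple one-predictor case; `fit_line` and the data points are hypothetical names and values chosen for illustration.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x + error (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope that minimizes the sum of squared vertical distances.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical measurements scattered around the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_line(xs, ys)
# Residuals are the estimates of the probabilistic error term.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(intercept, slope, residuals)
```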

10

attention point w large datasets

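with a large sample, almost every parameter tends to come out statistically significant, even when its effect is tiny; to check for that, it is necessary to consider the R squared value (the share of variance the model actually explains).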
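
R squared can be computed directly from observed values and model predictions. A minimal sketch, assuming the usual definition 1 - SS_residual / SS_total; the function name and the numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observations and the predictions of some fitted model.
observed = [3.1, 4.9, 7.2, 8.8]
predicted = [3.09, 5.03, 6.97, 8.91]

r2 = r_squared(observed, predicted)
print(r2)
```

A value near 1 means the model explains most of the variance; a significant parameter paired with a value near 0 suggests the effect is real but practically small.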

11

standardizing variables =

what for?

subtract mean, then divide by stdev

to make them comparable!
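
Standardization as described above, sketched in Python. The card does not say which standard deviation is meant, so this assumes the sample standard deviation (n - 1 in the denominator); the variables are made-up examples.

```python
import statistics


def standardize(values):
    """Subtract the mean, then divide by the (sample) standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)   # assumption: sample stdev, n - 1 denominator
    return [(v - m) / s for v in values]


# Hypothetical variables measured in different units.
heights_cm = [160.0, 170.0, 180.0]
weights_kg = [55.0, 70.0, 85.0]

z_heights = standardize(heights_cm)
z_weights = standardize(weights_kg)
print(z_heights, z_weights)
```

After standardizing, both variables are expressed in standard deviations from their own mean, so they share a common scale regardless of the original units.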

12

ANOVA compares 2 elements:

+ name of independent variables in ANOVA

  • variation between the groups
  • variation within the groups

+ independent variables in ANOVA are called factors
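
The comparison of between-group and within-group variation can be sketched as an F statistic. This is an illustrative one-factor (one-way) version under the usual textbook definitions; the function name and the three groups are hypothetical.

```python
def f_statistic(groups):
    """One-way ANOVA F = mean square between groups / mean square within groups."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation between the groups: group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation within the groups: values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical measurements for three levels of a single factor.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

f = f_statistic(groups)
print(f)
```

A large F means the group means differ by much more than the spread inside each group would suggest; as with the t-value, a p-value then requires the corresponding (F) distribution.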