4.1 Hypothesis Tests - Z-Test Flashcards

single observation, multiple observations, comparing the mean of populations, one-sided test, large sample size, z-test in R

1
Q

What are statistical tests used for?

A

-a statistical test answers the question of whether or not the data are compatible with a given statistical model

2
Q

Z-Test

Definition

A

-let z be an observation of Z~N(μ,1)
-the z-test, with significance level α for the null hypothesis
Ho : μ=0
-with alternative:
H1 : μ≠0
-rejects Ho if and only if:
|z| > q_α/2
-where q_α/2 is the (1-α/2) quantile of N(0,1)
-i.e. the null hypothesis Ho is rejected if the observation z falls in the critical region at either the lower or upper end of the expected values
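
The rejection rule above can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's `statistics.NormalDist` for the quantile; the function name `z_test_two_sided` and the example values are made up for this sketch.

```python
from statistics import NormalDist

def z_test_two_sided(z, alpha=0.05):
    """Return True if H0: mu = 0 is rejected for one observation z of N(mu, 1)."""
    # q_{alpha/2} is the (1 - alpha/2)-quantile of the standard normal distribution
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical_value

print(z_test_two_sided(2.3))   # |z| lies above the critical value of about 1.96
print(z_test_two_sided(-1.2))  # |z| lies below it
```

For α = 0.05 the critical value is roughly 1.96, so the first observation falls in the critical region and the second does not.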

3
Q

Z-Test

Wrong Rejection Probability Lemma

A
  • assume that Ho is true, i.e. that the observed data is a random sample from N(0,1)
  • then the z-test with significance level α wrongly rejects Ho with a probability of α
4
Q

Z-Test

Wrong Rejection Probability Proof

A

-assume Z~N(0,1)
-then:
P(Ho is rejected) = P(|Z|>q_α/2)
= P(Z < -q_α/2 OR Z > q_α/2)
= P(Z < -q_α/2) + P(Z > q_α/2)
-since N(0,1) is symmetrical, these two probabilities are equal so we can just write:
P(Ho is rejected) = 2P(Z>q_α/2)
= 2(1 - P(Z≤q_α/2))
= 2(1 - (1-α/2))
= α
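
The chain of equalities above can be checked numerically. The sketch below assumes Python's `statistics.NormalDist` for the standard normal cdf and quantile function; the helper name `wrong_rejection_probability` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

def wrong_rejection_probability(alpha):
    """P(|Z| > q_{alpha/2}) for Z ~ N(0, 1), computed from the two tails."""
    q = std_normal.inv_cdf(1 - alpha / 2)
    lower_tail = std_normal.cdf(-q)      # P(Z < -q)
    upper_tail = 1 - std_normal.cdf(q)   # P(Z > q)
    return lower_tail + upper_tail

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(wrong_rejection_probability(alpha), 6))
```

Up to floating-point rounding, each computed probability should agree with the α that was passed in.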

5
Q

Z-Test

Test Statistic, Critical Value and Critical Region

A
  • for the z-test the modulus of the observation |z| is called the test statistic
  • the critical value is q_α/2
  • the interval (q_α/2,∞) is called the critical region
  • using these terms, Ho is rejected if the test statistic exceeds the critical value or equivalently, if the test statistic falls into the critical region
6
Q

What do we need to know to apply the z-test?

A

-the numerical values for the (1-α/2) quantiles of the standard normal distribution

7
Q

Z-Test

Type I Errors

A
  • if Ho is true, the test might wrongly reject Ho
  • this outcome is a type I error
  • they occur with a probability of α
8
Q

Z-Test

Type II Errors

A
  • an error occurs if H1 is true but the test does not reject Ho (despite Ho being false)
  • this outcome is a type II error
  • the probability of this error depends on the value of μ
  • if μ≈0 these errors occur with a probability of approximately 1-α
  • as μ gets further away from 0, the probability of hitting the interval [-q_α/2,q_α/2] and thus the probability of a type II error decreases to 0
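
The dependence of the type II error probability on μ can be made concrete. The following is a minimal sketch, assuming `statistics.NormalDist`; the function name and the chosen values of μ are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

def type_two_error_probability(mu, alpha=0.05):
    """P(|Z| <= q_{alpha/2}) for Z ~ N(mu, 1), i.e. H0: mu = 0 is not rejected."""
    q = std_normal.inv_cdf(1 - alpha / 2)
    # Z - mu is standard normal, so P(-q <= Z <= q) = Phi(q - mu) - Phi(-q - mu)
    return std_normal.cdf(q - mu) - std_normal.cdf(-q - mu)

for mu in (0.01, 1.0, 2.0, 4.0):
    print(mu, round(type_two_error_probability(mu), 4))
```

The printed probabilities start close to 1-α for μ near 0 and shrink towards 0 as μ moves away from 0.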
9
Q

Z-Test

Errors and Choosing α

A
  • choosing small values of α reduces the probability of type I errors
  • but this also reduces the size of the critical region and increases the chance of type II errors
  • the two types of error probabilities must be balanced when choosing α
10
Q

Z-Test

Multiple Observations Description

A
  • data sets being tested in practice will consist of more than one observation
  • in this case we can apply the z-test by considering the sample average
  • assume that we have observed a sample x1,…,xn which can be described by the statistical model X1,…,Xn~N(μ,σ²) i.i.d for known variance σ²
  • we want to test the hypothesis Ho : μ=μo against the alternative H1:μ≠μo
11
Q

Z-Test

Multiple Observations Lemma

A

-let μ, μo ∈ R and X1,…,Xn~N(μ,σ²) be i.i.d
-define:
Z = 1/√n Σ (Xi-μo)/σ
-then:
Z~N(√n(μ-μo)/σ , 1)

12
Q

Z-Test

Multiple Observations Lemma Proof

A

-we have X1,…,Xn~N(μ,σ²)
-and we know then that:
Σ(Xi-μo)~N(n(μ-μo),nσ²)
-dividing by σ√n gives the result
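
A quick simulation can make the result of dividing by σ√n tangible: the standardised sum should have mean √n(μ-μo)/σ and variance 1. This is a rough sketch, not a proof; the parameter values, the seed, and the helper `draw_z` are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

random.seed(0)  # fixed seed so the run is reproducible
n, mu, mu0, sigma = 25, 1.2, 1.0, 2.0  # arbitrary illustrative values

def draw_z():
    """Simulate X1..Xn ~ N(mu, sigma^2) and return Z = 1/sqrt(n) * sum((Xi - mu0)/sigma)."""
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    return sum((x - mu0) / sigma for x in xs) / sqrt(n)

zs = [draw_z() for _ in range(20000)]
expected_mean = sqrt(n) * (mu - mu0) / sigma  # about 0.5 for these values

print(fmean(zs), expected_mean)  # empirical mean vs. sqrt(n)(mu - mu0)/sigma
print(pvariance(zs))             # empirical variance, should be close to 1
```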

13
Q

Z-Test

Multiple Observations Null Hypothesis

A
  • using the multiple observations lemma, we see that the hypothesis Ho:μ=μo is equivalent to the hypothesis that Z has mean 0
  • we also know that Z has variance 1 and thus we can apply the z-test on Z to decide whether to reject Ho or not
14
Q

Z-Test

Multiple Observations Applying the Test to Z

A

-we reject Ho at significance level α if and only if:
|Z| = |1/√n Σ (Xi-μo)/σ| > q_α/2
-where q_α/2 is the (1-α/2) quantile of the standard normal distribution
-for this test we need to know the value of the variance σ² in order to compute the test statistic |Z|

15
Q

Z-Test

Multiple Observations Manually Computing the Test Statistic

A

-an alternative representation can make the test statistic easier to compute:
z = 1/√n Σ (xi-μo)/σ
= √n * (x^-μo)/σ
-where x^ is the sample average of xi
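
The two representations of the test statistic can be compared directly. The sketch below uses made-up data and an assumed known σ; the function names are hypothetical and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_statistic_sum(xs, mu0, sigma):
    """z = 1/sqrt(n) * sum((xi - mu0) / sigma)."""
    return sum((x - mu0) / sigma for x in xs) / sqrt(len(xs))

def z_statistic_mean(xs, mu0, sigma):
    """z = sqrt(n) * (sample average - mu0) / sigma."""
    return sqrt(len(xs)) * (fmean(xs) - mu0) / sigma

xs = [5.1, 4.7, 5.6, 5.3, 4.9, 5.8, 5.2, 5.5]  # made-up observations
mu0, sigma, alpha = 5.0, 0.4, 0.05             # sigma is assumed to be known

z = z_statistic_mean(xs, mu0, sigma)
reject = abs(z) > NormalDist().inv_cdf(1 - alpha / 2)
print(z_statistic_sum(xs, mu0, sigma), z, reject)
```

Both forms give the same value up to rounding. For these numbers |z| is about 1.86, which is below the critical value of about 1.96, so Ho is not rejected.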

16
Q

Z-Test

Single Observation Summary

A

data: z∈R
model: Z~N(μ,1)
test: Ho:μ=0 H1:μ≠0
test statistic: |z|
critical value: q_α/2, the (1-α/2)-quantile of N(0,1)

17
Q

Z-Test

Multiple Observations Summary

A

data: x1,…,xn∈R
model: X1,…,Xn~N(μ,σ²) i.i.d with known σ
test: Ho:μ=μo H1:μ≠μo
test statistic: |z| = |1/√n Σ(xi-μo)/σ| = √n * |x^-μo|/σ
critical value: q_α/2, the (1-α/2)-quantile of N(0,1)

18
Q

Z-Test

Comparing the Mean of Two Populations Overview

A
  • assume that we have observed data x1,…,xn∈R and y1,…,ym∈R
  • these are non-paired data and we write n and m for the sample sizes to allow for them to be different sizes
  • assume that the data can be described by the model X1,…,Xn~N(μx,σx²) and Y1,…,Ym~N(μy,σy²)
  • where the random variables X1,…,Xn,Y1,…,Ym are independent of each other and where the variances σx² and σy² are known
  • we want to test the hypothesis Ho:μx=μy against the alternative H1:μx≠μy
19
Q

Z-Test

Comparing the Mean of Two Populations Lemma

A

-let X1,…,Xn~N(μx,σx²) and Y1,…,Ym~N(μy,σy²) be independent
-define:
Z = (X^-Y^) / √(σx²/n + σy²/m)
-where X^ and Y^ are the sample means, then:
Z~N( (μx-μy)/√(σx²/n + σy²/m),1)

20
Q

Z-Test

Comparing the Mean of Two Populations Lemma Proof

A
E(X^ - Y^) = E(X^) - E(Y^)
= μx-μy
-and since X^ and Y^ are independent:
Var(X^-Y^) = Var(X^) + Var(-Y^)
= Var(X^) + (-1)²Var(Y^)
= σx²/n + σy²/m
-since X^-Y^ is a linear combination of independent normally distributed random variables:
X^-Y^ ~ N(μx-μy , σx²/n + σy²/m)
-finally dividing by the standard deviation √(σx²/n + σy²/m) gives the result:
Z = (X^-Y^) / √(σx²/n + σy²/m)
~ N((μx-μy)/√(σx²/n + σy²/m),1)
21
Q

Z-Test

Comparing the Mean of Two Populations Summary

A

data: x1,…,xn,y1,…,ym∈R
model: X1,…,Xn~N(μx,σx²) , Y1,…,Ym~N(μy,σy²) independent
test: Ho:μx=μy H1:μx≠μy
test statistic: |z| = |x^-y^|/√(σx²/n + σy²/m)
critical value: q_α/2, the (1-α/2)-quantile of N(0,1)
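
The summary above translates into a short function. This is a sketch under the stated model (independent samples, known variances); the data, the variances, and the name `two_sample_z_test` are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean

def two_sample_z_test(xs, ys, var_x, var_y, alpha=0.05):
    """Two-sided z-test of H0: mu_x = mu_y with known variances var_x and var_y."""
    standard_error = sqrt(var_x / len(xs) + var_y / len(ys))
    z = (fmean(xs) - fmean(ys)) / standard_error
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > critical_value

xs = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4]  # made-up sample of size n = 6
ys = [9.1, 9.6, 9.3, 9.0]                # made-up sample of size m = 4
z, reject = two_sample_z_test(xs, ys, var_x=0.09, var_y=0.16)
print(z, reject)
```

Here the sample means differ by 0.9 and the standard error is √0.055, giving z of about 3.84, so Ho is rejected at α = 0.05. Note that the sample sizes do not need to match.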

22
Q

Z-Test

Comparing the Mean of Two Populations Paired Data Overview

A
  • so far we have looked at whether samples from two different populations have the same mean
  • instead we can also consider the case of paired samples, i.e. where we have observed two variables for each individual
  • the idea is simple: for paired data (xi,yi) we consider the differences zi=xi-yi and then apply the test we already know
  • zi has mean μo=0 if and only if xi and yi have the same mean
23
Q

Z-Test

Comparing the Mean of Two Populations Paired Data Test Statistic

A

-for paired data (xi,yi) and zi=xi-yi, the test statistic is given by:
1/√n Σ(zi-μo)/σz = √n z^/σz
= (x^-y^) / √(σz²/n)
-the variance σz² can either be computed from the variances of x and y (taking any correlation into account) or estimated from the data
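
When σz² is computed from the variances of x and y, the correlation enters through the identity Var(X-Y) = σx² + σy² - 2ρσxσy. The sketch below assumes σx, σy and the correlation ρ are known; the data and the function name `paired_z_test` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def paired_z_test(pairs, sigma_x, sigma_y, rho, alpha=0.05):
    """Two-sided z-test of H0: mean(x) = mean(y) for paired data."""
    diffs = [x - y for x, y in pairs]
    # Var(X - Y) = sigma_x^2 + sigma_y^2 - 2 * rho * sigma_x * sigma_y
    var_z = sigma_x**2 + sigma_y**2 - 2 * rho * sigma_x * sigma_y
    z = sqrt(len(diffs)) * fmean(diffs) / sqrt(var_z)
    return z, abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

# made-up paired measurements (x_i, y_i)
pairs = [(12.1, 11.4), (10.8, 10.9), (11.5, 10.7), (12.3, 11.8), (11.0, 10.6)]
z, reject = paired_z_test(pairs, sigma_x=1.0, sigma_y=1.0, rho=0.8)
print(z, reject)
```

With these values σz² = 0.4 and z is about 1.63, so Ho is not rejected at α = 0.05. A positive correlation shrinks σz², which is why pairing can make the test more sensitive than treating the samples as independent.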

24
Q

One-Sided Z-Test

Definition

A
  • let z be an observation of Z~N(μ,1)
  • the one-sided z-test with significance level α for the hypothesis Ho:μ≤0 with alternative H1:μ>0 rejects Ho if and only if z>q_α
  • where q_α is the (1-α)-quantile of N(0,1)
25
Q

One-Sided Z-Test

Wrong Rejection of Ho Lemma

A
  • assume that Ho is true, i.e. that Z is a random sample from N(μ,1) with μ≤0
  • then the one-sided z-test with significance level α wrongly rejects Ho with probability of at most α
26
Q

One-Sided Z-Test

Wrong Rejection of Ho Lemma Proof

A

-let μ≤0 and Z~N(μ,1)
-define Zo=Z-μ
-then Zo~N(0,1) and since μ≤0 we have Zo≥Z
-thus we find:
P(Z > q_α) ≤ P(Zo > q_α)
P(Zo > q_α) = 1 - P(Zo ≤ q_α)
= 1 - (1-α) = α
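
The bound can also be seen numerically: for Z~N(μ,1) the rejection probability is 1 - Φ(q_α - μ), which equals α at μ=0 and is smaller for μ<0. A minimal sketch, assuming `statistics.NormalDist`; the function name is made up.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1)

def one_sided_rejection_probability(mu, alpha=0.05):
    """P(Z > q_alpha) for Z ~ N(mu, 1)."""
    q = std_normal.inv_cdf(1 - alpha)
    # Z - mu is standard normal, so P(Z > q) = 1 - Phi(q - mu)
    return 1 - std_normal.cdf(q - mu)

for mu in (0.0, -0.5, -1.0, -2.0):
    print(mu, round(one_sided_rejection_probability(mu), 4))
```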

27
Q

One-Sided Z-Test

Summary

A

data: x1,…,xn∈R
model: X1,…,Xn~N(μ,σ²) i.i.d with known σ
test: Ho:μ≤μo H1:μ>μo
test statistic: z = 1/√n Σ(xi-μo)/σ = √n (x^-μo)/σ
critical value: q_α, the (1-α)-quantile of N(0,1)
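
As an illustration of this summary, the sketch below runs the one-sided test on made-up data with an assumed known σ. The function name `one_sided_z_test` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def one_sided_z_test(xs, mu0, sigma, alpha=0.05):
    """One-sided z-test of H0: mu <= mu0 against H1: mu > mu0."""
    z = sqrt(len(xs)) * (fmean(xs) - mu0) / sigma
    critical_value = NormalDist().inv_cdf(1 - alpha)  # (1 - alpha)-quantile
    return z, z > critical_value

xs = [5.1, 4.7, 5.6, 5.3, 4.9, 5.8, 5.2, 5.5]  # made-up observations
z, reject = one_sided_z_test(xs, mu0=5.0, sigma=0.4)
print(z, reject)
```

Here z is about 1.86, which exceeds the one-sided critical value of about 1.645, so Ho is rejected. The same z would not exceed the two-sided critical value of about 1.96, which shows how the choice of alternative affects the decision.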

28
Q

One-Sided Z-Test

Alternative Ho

A
  • we can derive the one-sided z-test for testing the hypothesis Ho:μ≥0 with alternative H1:μ<0 by exploiting symmetry
  • replacing z with -z, we see that for this case we should reject Ho if and only if z < -q_α, where q_α is the (1-α)-quantile of N(0,1)
29
Q

Z-Test for Large Sample Size Overview

A
  • assume that we have observed x1,…,xn from a model where X1,…,Xn are i.i.d but not necessarily normally distributed
  • we will see that, if n is sufficiently large, we can still apply the z-test
30
Q

Z-Test for Large Sample Sizes

Central Limit Theorem

A

-Let X1,…,Xn be i.i.d with E(Xi)=μ and Var(Xi)=σ²
-then:
Z = 1/√n Σ(Xi-μ)/σ -> N(0,1)
-as n ->∞, here -> indicates convergence in the distribution
-this statement of the central limit theorem implies that, for large sample size, the test statistic Z is approximately standard normally distributed even if the individual observations are not normally distributed
-so we can still apply the z-test
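
A small simulation can illustrate why the z-test remains usable: with exponentially distributed (clearly non-normal) data and Ho true, the rejection rate should still be close to α for large n. This is a rough sketch; the choice of the Exp(1) distribution, the sample size, the number of trials, and the seed are all arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is reproducible
alpha, n, trials = 0.05, 200, 2000
mu0 = sigma = 1.0  # an Exp(1) variable has mean 1 and standard deviation 1
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

def rejects_h0():
    """Draw n Exp(1) observations and apply the two-sided z-test of H0: mu = mu0."""
    xs = [random.expovariate(1.0) for _ in range(n)]
    z = sqrt(n) * (fmean(xs) - mu0) / sigma
    return abs(z) > critical_value

rejection_rate = sum(rejects_h0() for _ in range(trials)) / trials
print(rejection_rate)  # fraction of wrong rejections under H0
```

The printed rate is a Monte Carlo estimate, so it fluctuates around α rather than matching it exactly.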

31
Q

Z-Test for Large Sample Sizes

Summary

A

data: x1,…,xn∈R, where n is large
model: X1,…,Xn i.i.d with E(Xi)=μ and Var(Xi)=σ²
test: Ho:μ=μo H1:μ≠μo
test statistic: |z| = |1/√n Σ(xi-μo)/σ| = √n |x^-μo|/σ
critical value: q_α/2, the (1-α/2)-quantile of N(0,1)

32
Q

Z-Test in R

A
  • to perform a z-test in R, only the test statistic and the critical value must be computed
  • this is easily done using standard R commands like mean() and qnorm()