Misc econometrics Flashcards Preview


Flashcards in Misc econometrics Deck (39)


p-value

(In statistical hypothesis testing) The probability of obtaining test results at least as extreme as the results observed, assuming that the null hypothesis is correct.

In other words, the p-value is the largest significance level at which we could carry out our test and still fail to reject H₀ (equivalently, the smallest significance level at which H₀ would be rejected).

In still other words, it is the tail probability associated with our calculated test statistic (for example, the Z-statistic corresponding to our observed value), computed from the distribution that holds when H₀ is true.
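As a rough illustration of the definition, the sketch below turns an observed Z-statistic into a p-value using Python's standard library. The function name `p_value_from_z` and the `tail` argument are hypothetical names chosen for this example, and the test statistic is assumed to be standard normal under H₀.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two-sided"):
    """Tail probability of an observed Z-statistic, assuming Z ~ N(0, 1) under H0."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    if tail == "lower":
        # Probability of a value at least as far below the mean as z
        return std_normal.cdf(z)
    if tail == "upper":
        # Probability of a value at least as far above the mean as z
        return 1.0 - std_normal.cdf(z)
    if tail == "two-sided":
        # Probability of a value at least as extreme as z in either direction
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'lower', 'upper' or 'two-sided'")


p = p_value_from_z(1.96)
print(round(p, 3))  # close to 0.05 for a two-sided test
```

A p-value just under 0.05 here means H₀ would be rejected at the 5% significance level but not at, say, the 1% level.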



Z-statistic (or Z-score or Standard score) is a number representing how many standard deviations an observed value (raw score) is away from the mean (of what is being observed).

Raw scores above the mean have positive standard scores, while those below the mean have negative standard scores.

If the observed variable is normally distributed, the Z-statistic is distributed normally with mean = 0 and variance = 1 (ie, it has a Standard Normal Distribution)

Z ~ N(0, 1)

So Z = (Observed Sample Value - Assumed Population Mean) / Standard Deviation of the Sampling Distribution

(Note: in hypothesis tests, the observed value is often the mean observed from our sample. In other words, we are testing a claim about the population mean. The Z-statistic may also be used to find the probability that X takes a value at least as extreme as the observed value, x, given the assumed population mean.)

(Note: Calculating z using this formula requires the population mean and the population standard deviation, not the sample mean or sample standard deviation. But knowing the true mean and standard deviation of a population is often unrealistic except in cases such as standardized testing, where the entire population is measured.

When the population mean and the population standard deviation are unknown, the standard score may be calculated using the sample mean and sample standard deviation as estimates of the population values.)
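A minimal sketch of the Z-score calculation described above. The function name `z_score` and the numbers (a test with mean 100 and standard deviation 15) are made up for illustration.

```python
def z_score(observed, mean, std_dev):
    """Number of standard deviations that `observed` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (observed - mean) / std_dev


# Hypothetical raw scores on a test with mean 100 and standard deviation 15
above = z_score(130, mean=100, std_dev=15)  # raw score above the mean
below = z_score(85, mean=100, std_dev=15)   # raw score below the mean
print(above, below)  # positive for the first score, negative for the second
```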


Z-statistic (conversion)

If X ~ N (μ, σ²) then

Z = (𝐗−𝛍)/𝛔 ~ N(𝟎, 𝟏)

In other words: for any normally distributed random variable X with mean μ and variance σ² (X ~ N(μ, σ²)), all probabilities can be converted to the Standard Normal Distribution using the Z Normal (0, 1) transformation: Z = (X − μ)/σ

The Z-statistic is distributed normally with mean = 0 and variance = 1 (ie, it has a Standard Normal Distribution)

Therefore the Z Normal transformation, Z = (𝐗−𝛍)/𝛔 converts the variable X into a Standard Normal distribution. We can thus use standard normal tables to find relevant probabilities for P(X ≤ x).

Note: the Z Normal (0, 1) transformation is so called because it transforms the distribution of X into a normal distribution centred on 0 with a variance of 1. Subtracting μ shifts the distribution by μ units so that its mean sits at 0 (leftward if μ > 0), and dividing by σ rescales it horizontally by a factor of σ (compressing if σ > 1, stretching if σ < 1) so that the variance becomes 1.

Thus, Z ~ N(0, 1)

So, to convert any value of X to its corresponding Z value, subtract the value of the mean and divide by the standard deviation.
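The conversion can be checked numerically. In this sketch `NormalDist().cdf` from the standard library stands in for the standard normal table, and the helper name `prob_at_most` and the values μ = 100, σ = 15, x = 115 are hypothetical.

```python
from statistics import NormalDist


def prob_at_most(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), found by converting x to a Z value."""
    z = (x - mu) / sigma          # subtract the mean, divide by the standard deviation
    return NormalDist().cdf(z)    # look up the probability under N(0, 1)


via_z = prob_at_most(115, mu=100, sigma=15)
direct = NormalDist(mu=100, sigma=15).cdf(115)  # same probability without standardising
print(round(via_z, 4), round(direct, 4))        # the two routes agree
```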


Z-statistic (hypothesis tests)

is a number representing how many (of the sampling distribution's) standard deviations the observed (sample) value is away from the assumed (population) mean.


Random Sample

A sample of n observations of a RV Y, denoted Y₁, Y₂, …, Yₙ, is said to be a random sample if the n observations are drawn independently from the same population and each element in the population is equally likely to be selected.


Random Sample as 'A set of Independently and Identically Distributed (IID) RVs'

We describe such a sample as being a set of Independent and Identically distributed (IID) Random Variables (RVs)

So, if a random sample of n elements is taken,
the sample elements constitute a set of IID RVs, Y₁, Y₂, …, Yₙ, each of which has the same PDF as that of Y

The random nature of Y₁, Y₂, …, Yₙ reflects the fact that many different outcomes are possible before the sampling is actually carried out. In other words, each element of the sample is an (IID) RV with the same PDF as Y (the population) BECAUSE it is randomly (and independently) selected from the population.


Sample data

Once the sample is obtained, we have a set of numbers, say y₁, y₂, …, yₙ which constitute the data we work with.

There are different types of data:
• Cross-sectional data
• Time-series data
• Panel data


Sample Statistics

A sample statistic is any quantity computed from values in a sample that is used for a statistical purpose.

(Statistical purposes include estimating a population parameter, describing a sample, or evaluating a hypothesis)

The two most often used sample statistics are the sample mean, denoted by Y̅, and the sample variance, denoted by S².


Sampling Distribution

A sample statistic (eg the sample mean) will have its own probability distribution called the sampling distribution.

Since each observation in a random sample is itself a RV, any statistic calculated from a sample, called a sample statistic, is also a RV.

And since the sample statistic is an RV, it will have its own probability distribution

The sampling distribution reflects the fact that a random sample (of size n) drawn from the population could materialise in many different ways, each with a corresponding probability. The sampling distribution is the probability distribution of the statistic over all the possible samples of size n that we could draw from the population (and we will see shortly that, for the sample mean of a normal population, it is normal with mean μ and variance σ²/n)


Population > Sampling Distributions

So, to tie sampling distributions in with their wider context,

• There is a POPULATION (of size N)
• Y is a RV representing this population, with a PDF
• θ is an unknown population parameter (such as the expected value E(Y) = μ or the variance V(Y) = σ²)
• Note: these population parameters are unknown, fixed values
• A random sample (of n observations) of the RV Y is drawn, denoted Y₁, Y₂, …, Yₙ
• (Once the sample is obtained, we have a set of numbers, say y₁, y₂, …, yₙ, which constitute the data we work with)
• Each Yᵢ has a PDF (identical to the PDF of Y)
• From the sample we can calculate sample statistics
• (Two sample statistics of interest: sample mean, Y̅ and the sample variance, S²)
• Note: these sample statistics are RVs, with their own probability distribution, the SAMPLING DISTRIBUTIONS


The sampling distribution of the sample mean (Y̅)

Suppose Y ~ N (μ, σ²) and we have an IID sample of n observations from it: {Y₁, Y₂, …, Yₙ},

Then we say that Yᵢ ~ IIDN (μ, σ²)

In other words, each element of the sample is a RV with the same PDF as Y.

From these observations we can calculate the sample mean, Y̅, as: Y̅ = (1/n) Σ Yᵢ

Since Y̅ is a RV itself, it has a probability distribution.

It turns out that the sampling distribution of the sample mean is: Y̅ ~ N (μ, σ²/n)

(We'll break this down in the next three cards)


The mean (or expected value) of the sampling distribution of Y̅

The mean of the sampling distribution of Y̅ is:

E[Y̅] = μ

If samples of n independent observations are repeatedly and independently drawn from a population, then as the number of samples becomes very large (approaches infinity), the average of the sample means (Y̅) approaches the population mean


The variance of the sampling distribution of Y̅

The variance of the sampling distribution of Y̅ is:

V[Y̅] = σ²/n

As the sample size (n) increases, the variance of Y̅ decreases. So the sampling distribution of the sample mean will have lower variance the larger the sample size.


The Sampling Distribution of Y̅

Thus, if we assume that the samples are taken from a normal RV, Y, we can deduce that:

Y̅ ~ N (μ, σ²/n)
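The result Y̅ ~ N(μ, σ²/n) can be illustrated with a small simulation. This is only a sketch: the population values μ = 10, σ = 2, the sample size n = 25 and the function name `simulate_sample_means` are all assumptions made for the example.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw `num_samples` IID normal samples of size n and return their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


means = simulate_sample_means(mu=10.0, sigma=2.0, n=25, num_samples=20_000)

# The average of the sample means should sit close to mu = 10
print(statistics.fmean(means))
# Their variance should sit close to sigma^2 / n = 4 / 25 = 0.16
print(statistics.pvariance(means))
```

Increasing `n` in the call shrinks the second figure, in line with V[Y̅] = σ²/n.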


Standardisation of Y̅

We can standardise Y̅ to calculate probabilities using the standard normal distribution:

Z = [ ( Y̅ - μ ) / ( σ/√n ) ] ~ N(0, 1)
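A short sketch of using this standardisation to find a probability for the sample mean. The helper name `prob_sample_mean_at_most` and the numbers (μ = 10, σ = 2, n = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_at_most(y_bar, mu, sigma, n):
    """P(sample mean <= y_bar) when the sample mean is N(mu, sigma^2 / n)."""
    standard_error = sigma / sqrt(n)      # standard deviation of the sample mean
    z = (y_bar - mu) / standard_error     # standardised value of y_bar
    return NormalDist().cdf(z)


# Here the standard error is 2 / 5 = 0.4, so z = 0.5 / 0.4 = 1.25
p = prob_sample_mean_at_most(y_bar=10.5, mu=10.0, sigma=2.0, n=25)
print(round(p, 4))
```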


The Central Limit Theorem

What about the shape of the sampling distribution of Y̅ if the population from which it is constructed is not normally distributed?

Use the Central Limit Theorem (CLT): as the sample size gets large enough, the sampling distribution of Y̅ can be approximated by the normal distribution even if the population itself is not normal (provided the population variance is finite).

Therefore, given the CLT, we can apply rules about normal distribution to the sampling distribution of the sample mean even when the population is not distributed normally
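To see the CLT at work, the sketch below draws samples from a clearly non-normal population and checks how often the standardised sample mean falls within ±1.96, which would happen about 95% of the time under a standard normal distribution. The exponential population with rate 1 (mean 1, standard deviation 1), the sample size n = 50 and the function name are assumptions chosen for the example.

```python
import random
from math import sqrt
from statistics import fmean


def standardised_means(n, num_samples, seed=1):
    """Standardised sample means from an Exponential(rate=1) population."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    values = []
    for _ in range(num_samples):
        y_bar = fmean(rng.expovariate(1.0) for _ in range(n))
        values.append((y_bar - mu) / (sigma / sqrt(n)))
    return values


z_values = standardised_means(n=50, num_samples=10_000)

# Share of standardised means inside +/-1.96; roughly 0.95 if the normal approximation is good
share_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
print(share_within)
```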


Population variance/sample variance unknown

It thus follows that we can make inferences about the population mean based on the sample mean using the standard normal distribution (Z-statistic)
Z = [ ( Y̅ - μ ) / ( σ/√n ) ] ~ N(0, 1)

However, notice how the distribution of the sample mean depends on the population mean (which is also the mean of the sampling distribution of Y̅) but also on the population variance (divided by the sample size).

It is quite likely that we will not know the population variance

If this is the case we can use the sample variance, S², as an estimate, and it can be shown that, when the population is normally distributed:

T = [ ( Y̅ - μ ) / ( S/√n ) ] ~ t (n - 1)

Thus, we can use the sample variance and the tables from the t distribution to make inferences when σ² is unknown.
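A minimal sketch of computing the T statistic from a sample. The data and the hypothesised mean μ = 5.0 are made up, and `t_statistic` is a hypothetical helper. The standard library has no t distribution, so the critical value would still come from t tables with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, mu_0):
    """Return (T, degrees of freedom) for the hypothesised population mean mu_0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate S")
    y_bar = fmean(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    return (y_bar - mu_0) / (s / sqrt(n)), n - 1


sample = [4.8, 5.1, 5.4, 4.9, 5.3]  # made-up observations
t, df = t_statistic(sample, mu_0=5.0)
print(round(t, 3), df)  # compare t with the t-table value for df degrees of freedom
```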



A sample statistic which is constructed to provide information about the unknown population parameters of a probability distribution is called an estimator, and we denote it by θ̂ (theta-hat)

(To place in context:)

Let Y be a RV representing a population with a PDF, f (y; θ), which depends on the unknown population parameter θ

Example: if Y ~ N(μ, σ²), then θ = (μ, σ²)

Note: we will generally assume that there is only one parameter

If we can obtain certain random samples, then we can learn something about θ

(Refer to first point)

So an estimator is a sample statistic, and the probability distribution of the estimator is therefore its sampling distribution


Estimator as a rule

More generally, an estimator θ̂ (theta-hat) of a population parameter θ can be expressed as a mathematical formula (rule):

θ̂ = g(Y₁, Y₂, … , Yₙ)

In other words, regardless of the outcome of the RVs (the sample that happens to be drawn from the population), we apply this same rule to estimate the population parameter

An estimator of θ is a rule that assigns each possible outcome of the sample an estimate of θ
(remember any sample drawn is one manifestation of the many possible samples that could have been drawn from the population (with corresponding probabilities))

For example: a natural estimator of µ (population mean) is Y̅ (sample mean)
where Y̅ = (1/n) Σ Yᵢ

• Given any outcome of the RVs {Y₁, Y₂, … , Yₙ } (ie: the sample drawn) the rule to estimate the population mean is the same: we simply take the average of {Y₁, Y₂, … , Yₙ }
• For a particular outcome of the RVs {y₁, y₂, …, yₙ }, the estimate is just the average in the sample y̅ = (1/n) Σ yᵢ
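The distinction between the rule (estimator) and the number it produces (estimate) can be sketched in code. The population values μ = 50 and σ = 10 are hypothetical and would be unknown in practice; they are used here only to generate samples.

```python
import random


def sample_mean(sample):
    """The estimator: a fixed rule g(Y1, ..., Yn) = (1/n) * sum of the Yi."""
    return sum(sample) / len(sample)


rng = random.Random(42)
population_mu, population_sigma = 50.0, 10.0  # hypothetical; unknown in practice

# Three different realised samples, one rule, three different estimates
estimates = [
    sample_mean([rng.gauss(population_mu, population_sigma) for _ in range(30)])
    for _ in range(3)
]
print(estimates)
```

The function never changes between samples; only the data passed to it does, which is why the estimates differ.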


Quality of Estimate vs Quality of Estimator

Suppose that we want to estimate the average salary of university graduates in the UK. Suppose that we take one sample from the population and use the sample mean to estimate the average population salary. Suppose that we find that the sample mean is y̅ = £15,000. How close is this value (estimate) to the true population mean, µ?

We don't know, as µ is unknown!

➢ Instead of asking about the quality of the estimate, we should ask about the quality of the estimation procedure or estimator!
➢ ie How good is the sample mean as an estimator of the population mean?

➢ What are some (desirable) properties that an estimator may (or may not) possess?

Such properties are most often divided into:
• small sample (or finite) properties - desirable properties for when the sample size is finite
• large sample (or asymptotic) properties - desirable properties for when the sample size grows without bound

We will briefly consider the two main properties for estimators of 'Finite or Small Samples':
1) Unbiasedness
2) Minimum variance



An estimator is unbiased if:
E[θ̂] = θ

So if the mean of the sampling distribution of the estimator (which reflects all the different possible values that the sample statistic could assume when the estimating procedure is applied to whatever sample happens to be drawn, with corresponding probabilities) is equal to the population parameter θ, then the estimator is unbiased.

In other words, if you independently draw a large number of random samples from the population, compute the sample statistic for each, and then find the mean of these sample statistics, for an unbiased estimator this will be equal to the population parameter (for example, the population mean when estimating μ).

(Part 1; topic 5 shows really clear graph to demonstrate)
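Unbiasedness can be illustrated by simulation. The sketch below is not taken from the course notes: it assumes the usual definition of the sample variance S² with divisor n − 1 and compares it with the same formula using divisor n, for a hypothetical normal population with σ² = 4. The function names are made up for the example.

```python
import random
from statistics import fmean


def variance_with_divisor(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    y_bar = sum(sample) / len(sample)
    return sum((y - y_bar) ** 2 for y in sample) / divisor


def average_estimate(estimator, n, num_samples, seed=0):
    """Average value of `estimator` over many independent samples of size n."""
    rng = random.Random(seed)
    return fmean(
        estimator([rng.gauss(0.0, 2.0) for _ in range(n)])  # population variance is 4
        for _ in range(num_samples)
    )


avg_s2 = average_estimate(lambda s: variance_with_divisor(s, len(s) - 1), 5, 50_000)
avg_biased = average_estimate(lambda s: variance_with_divisor(s, len(s)), 5, 50_000)

print(avg_s2)      # close to the true variance 4: consistent with an unbiased estimator
print(avg_biased)  # close to 4 * (n - 1) / n = 3.2: systematically too small
```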


Minimum Variance Unbiased Estimator

Consider the set of all possible unbiased estimators for θ, which we will label θ̂₁, θ̂₂, …, θ̂ₖ. One of these, θ̂ⱼ, is said to be the Minimum Variance Unbiased Estimator if:

V(θ̂ⱼ) < V(θ̂ᵢ)

for i = 1, … , k and i ≠ j

(Part 1; topic 5 shows really clear graph to demonstrate)
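As a hedged illustration of comparing variances, the sketch below pits two estimators of μ against each other for a hypothetical standard normal population: the sample mean and the sample median. Both are centred on μ here, but the sample mean varies less from sample to sample. It compares only these two candidates, not the full set of unbiased estimators.

```python
import random
from statistics import fmean, median, pvariance


def estimator_spread(estimator, n, num_samples, seed=0):
    """Mean and variance of an estimator across many simulated samples of size n."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])  # population mean is 0
        for _ in range(num_samples)
    ]
    return fmean(estimates), pvariance(estimates)


mean_avg, mean_var = estimator_spread(fmean, n=25, num_samples=10_000)
median_avg, median_var = estimator_spread(median, n=25, num_samples=10_000)

print(mean_avg, median_avg)  # both close to the population mean 0
print(mean_var, median_var)  # the sample mean has the smaller variance
```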



If an estimator is unbiased AND minimum variance, we say it is efficient (or the best)


How to construct estimators with good properties for unknown parameters?

There are various approaches based on observed samples. Three common methods are:
• Least Squares
• Method of Moments
• Maximum Likelihood

➢ In this course we will focus on Least Squares estimation.


Hypothesis testing

See LUBS2570; Part 1; topic 6 for really good notes


Critical value

The Z-score that cuts the distribution off at the significance level; the z-score that corresponds to the significance level of the test


Test statistic

The Z-statistic associated with the observed sample mean, calculated assuming H₀ is correct

The decision to reject or not reject H₀ can be made solely by comparing the test statistic with the critical value


Decision rule can be based on one of 2 methods:

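1) Reject H₀ if the calculated test statistic falls in the rejection region defined by the critical value. For a lower-tailed test:

z < z꜀

(For an upper-tailed test the rule is z > z꜀, and for a two-tailed test it is |z| > z꜀, where z꜀ is the critical value for the chosen test.)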

2) Reject H₀ if the p-value associated with the test statistic is less than the significance level:

p_value < α
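For a lower-tailed Z-test, where the rejection region is z < z꜀, the two methods can be sketched side by side to show that they always lead to the same decision. The function name `z_test_lower_tail` and the default α = 0.05 are assumptions made for the example.

```python
from statistics import NormalDist


def z_test_lower_tail(z, alpha=0.05):
    """Apply both decision rules to a lower-tailed Z-test; returns two booleans."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(alpha)  # z_c, negative for a lower-tailed test
    p_value = std_normal.cdf(z)                 # probability of a value at or below z

    reject_by_critical_value = z < critical_value  # method 1
    reject_by_p_value = p_value < alpha            # method 2
    return reject_by_critical_value, reject_by_p_value


print(z_test_lower_tail(-2.0))  # both rules reject H0
print(z_test_lower_tail(-1.0))  # neither rule rejects H0
```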


Hypothesis testing when σ is unknown

If
a) the population variance σ² is unknown AND
b) the sample size is small

then the t distribution must be used rather than the standard normal (Z) distribution, so a t-test should be conducted instead of a Z-test.

➢ Hypothesis testing with small samples and an unknown population variance proceeds as before, but now we need to consult the t distribution to obtain the critical values.

➢ For large samples t is typically not required; the Z-test can be used instead.

(Recall that if n is large, the t distribution approaches the standard normal distribution (diagram in LecNotes))


Confidence Interval

There are two ways in which an estimate of a population parameter (using random samples) can be presented:

1. As a Point Estimate: a single value is used to estimate an unknown population parameter (like how we have seen that we can use the sample mean to estimate the population mean)

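2. As an Interval Estimate or Confidence Interval: a range of values is used to estimate an unknown population parameter. The range is constructed so that, at a chosen confidence level (eg 95%), intervals built this way from repeated samples would contain the population parameter that proportion of the time.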

(Part 1; Topic 7 explains confidence intervals of population mean µ (when σ is known), the general random interval estimator, interpretations, and confidence intervals of population mean µ when σ is not known)
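The referenced notes are not reproduced here, so the following is only a sketch of the standard textbook interval for µ when σ is known, Y̅ ± z·σ/√n, where z is the standard normal value for the chosen confidence level. The function name and the numbers (a sample mean of £15,000, σ = £4,000, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(y_bar, sigma, n, confidence=0.95):
    """Interval y_bar +/- z * sigma / sqrt(n), assuming sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95% confidence
    margin = z * sigma / sqrt(n)
    return y_bar - margin, y_bar + margin


low, high = mean_confidence_interval(y_bar=15_000.0, sigma=4_000.0, n=100)
print(round(low), round(high))  # interval centred on the point estimate 15,000
```

When σ is not known, the same structure applies with S in place of σ and a t-table value with n − 1 degrees of freedom in place of z.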