Flashcards in Stat - Exam #1 Deck (134)

91

## What is the method to find the “kth” percentile?

###
1. RANK, from low to high (MUST);

2. Table, fill in (Index, Position, Value)

-Index: i = (k/100) x n

-Position:

-If i is an integer: average the i-th and (i+1)-th data points;

-If i is a decimal: round up to the next larger position

3. Value: read from that position in the ranked set of data
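The rank, index, and position steps on this card can be sketched in Python. This is only a minimal sketch of the card's method (other textbooks and software use different percentile conventions). The function name `kth_percentile` and the sample scores are hypothetical, and the sketch assumes 0 < k < 100.

```python
import math

def kth_percentile(data, k):
    """k-th percentile by the index/position method (assumes 0 < k < 100)."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ranked = sorted(data)          # Step 1: rank from low to high
    n = len(ranked)
    i = k * n / 100                # Step 2: index i = (k/100) x n
    if i == int(i):
        # Whole-number index: average the i-th and (i+1)-th ranked values
        # (positions are 1-based on the card, Python lists are 0-based)
        i = int(i)
        return (ranked[i - 1] + ranked[i]) / 2
    # Decimal index: round up to the next larger position
    return ranked[math.ceil(i) - 1]

scores = [15, 20, 25, 25, 27, 28, 30, 34]   # made-up data, n = 8
print(kth_percentile(scores, 25))  # index 2 is whole: (20 + 25) / 2 = 22.5
print(kth_percentile(scores, 90))  # index 7.2 rounds up to position 8: 34
```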

92

## What do the INDEX and POSITION show for the “kth” percentile?

###
-INDEX takes you to just below the desired percentile;

-POSITION takes you the rest of the way

93

## What are Quartiles?

###
-Quartiles are the most common percentiles;

-Divide a set of ranked data points into four equal parts; each part contains 25% of the data points

94

## What is the First Quartile?

###
-The number such that 25% of the ranked data points are smaller, and 75% are greater;

-Denoted: Q1

95

## What is the Third Quartile?

###
-The number such that 75% of the ranked data points are smaller, 25% are greater;

-Denoted: Q3

96

## What is the Second Quartile?

### -Actually the MEDIAN (and called such)

97

## What is the main disadvantage for using RANGE for a summary number for spread?

###
-It is NOT a RESISTANT stat and is strongly affected by extreme data points;

-May not represent the bulk of the data, especially if the two extremes are considerable outliers;

-Correct for this by measuring the range of only the middle 50% of the data points, which removes the influence of the extremes = IQR

98

## What are Fences?

###
-Check for extreme observations or outliers;

-Do NOT automatically throw outliers out; check them out first;

-Use the fences to determine which points are outliers;

-OUTLIERS = Smaller than the LOWER fence or larger than the UPPER fence

99

## Formula for LOWER Fence?

### Q1 - 1.5(IQR)

100

## Formula for UPPER Fence?

### Q3 + 1.5(IQR)
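The two fence formulas and the outlier check can be sketched together. The helper names `fences` and `flag_outliers`, the quartile values, and the data below are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1                      # IQR = Q3 - Q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(data, q1, q3):
    """Return the points that fall outside the fences (to check, not to discard)."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]

# Illustrative values only: Q1 = 20 and Q3 = 30 give IQR = 10
print(fences(20, 30))                                       # (5.0, 45.0)
print(flag_outliers([2, 15, 20, 25, 30, 34, 60], 20, 30))   # [2, 60]
```

Flagged points are candidates for a closer look, in line with the "do not automatically kick out" advice on the Fences card.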

101

## What are the important RESISTANT measures of location and spread?

###
1. Median = resistant for location;

2. IQR = resistant for spread

102

## What 3 numbers give the most information about data?

###
(Q1, M, Q3);

-Only missing the tails of data (min and max)

103

## What is the Five-Number Summary?

###
-A set of numbers consisting of the smallest data value (min), Q1, the median (M), Q3, and the largest data value (max)

-{Min, Q1, M, Q3, Max}
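A five-number summary can be computed in a few lines. This sketch assumes the quartiles are found with the index/position percentile method from this deck (software packages often use other conventions and may give slightly different quartiles). The function names and the data are hypothetical.

```python
import math

def percentile(ranked, k):
    """k-th percentile of already-ranked data, index/position method (0 < k < 100)."""
    i = k * len(ranked) / 100
    if i == int(i):
        # Whole-number index: average the i-th and (i+1)-th values
        return (ranked[int(i) - 1] + ranked[int(i)]) / 2
    # Decimal index: round up to the next position
    return ranked[math.ceil(i) - 1]

def five_number_summary(data):
    """Return (Min, Q1, M, Q3, Max)."""
    ranked = sorted(data)
    return (ranked[0], percentile(ranked, 25), percentile(ranked, 50),
            percentile(ranked, 75), ranked[-1])

scores = [15, 20, 25, 25, 27, 28, 30, 34]   # made-up data
print(five_number_summary(scores))  # (15, 22.5, 26.0, 29.0, 34)
```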

104

## What is a Boxplot?

### -Picture of the 5-number summary

105

## How can the SHAPE of a column of data be seen from a boxplot?

###
(Shape - Median - Tails)

1. Symmetric = center median = equal tails;

2. Skew Right = median left = right tail longer;

3. Skew Left = median right = left tail longer

106

## Properties of Normal Distribution

###
-Probability = Area under curve;

-z-Transformation: z = (x - u)/sigma, where x = value, u = population mean, sigma = population standard deviation;

-Normal probability plot

107

## What are the data characteristics for Normal Distribution?

###
Data type = Continuous

Data Distribution = Normal

108

## What is Probability Density Function (PDF)?

###
-Equation of a curve used to compute probabilities of a continuous, random variable, which satisfies 2 conditions:

1. Area under ENTIRE curve must equal 1;

2. Curve must be greater than, or equal to, zero at every point — CANNOT be negative
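Both conditions can be checked numerically for a specific curve. The sketch below uses the standard normal curve as its example (an assumption made for illustration; any valid PDF would do) and approximates the total area with a simple midpoint rule. The names `normal_pdf` and `area_under` are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def area_under(f, lo, hi, steps=100_000):
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (j + 0.5) * width) for j in range(steps)) * width

# Condition 1: total area is 1 (the tails beyond +/- 10 are negligible here)
total = area_under(normal_pdf, -10, 10)
# Condition 2: the curve is never negative (spot check on a grid of points)
never_negative = all(normal_pdf(x / 10) >= 0 for x in range(-100, 101))

print(round(total, 6), never_negative)
```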

109

## What is the Normal Probability Density Function?

###
-Equation (don’t have to integrate);

-Describes a symmetric, bell-shaped curve;

-Completely defined by the mean and variance (standard deviation)

110

## What defines the shape of normal curves?

### -Defined by the equation; the only difference between normal curves will ever be the LOCATION or the SPREAD

111

## What are the properties of Normal Distribution?

###
1. Symmetric about the mean (u) =

— Mode, median, and mean are the same point;

— Area under the curve to RIGHT of the mean (u) is equal to the area under the curve to the left of the mean (area=0.5);

2. Curve approaches but never touches zero;

3. Area under the curve is exactly 1 by definition

112

## What are the Two Symmetries?

###
-These properties lead to two symmetries of the normal curve =

1. If the area under the curve to the left of point -a is A;

2. Then the area under the curve to the right of:

Symmetry 1: Point -a is (1-A);

Symmetry 2: Point a is (A)
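Both symmetries can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8+). The value a = 1.2 is an arbitrary choice for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()          # standard normal: mean 0, standard deviation 1
a = 1.2                            # arbitrary illustrative point

A = std_normal.cdf(-a)             # area to the LEFT of -a

right_of_minus_a = 1 - std_normal.cdf(-a)   # Symmetry 1: should equal 1 - A
right_of_a = 1 - std_normal.cdf(a)          # Symmetry 2: should equal A

print(round(A, 4), round(right_of_minus_a, 4), round(right_of_a, 4))
```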

113

## What does the Area Under the Curve give?

### -Area under the curve for an event gives the probability of the event happening, if the curve is a PDF.

114

## What are the types of Probability?

###
1. PROPORTION of population described by the event;

2. PROBABILITY that a randomly selected individual from the population will be described by the event

115

## What is the Empirical Rule?

###
For any NORMAL Curve:

-Between pop.mean(u) +/- 1SD = 68% Area

-Between pop.mean(u) +/- 2SD = 95% Area

-Between pop.mean(u) +/- 3SD= 99.7% Area;

Also, for the quartiles:

-Between pop.mean(u) +/- 0.67SD = 50% Area
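These percentages are rounded values and can be checked with `statistics.NormalDist` (Python 3.8+). The helper name `area_within` and the example mean and standard deviation (100 and 15) are hypothetical.

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Area under a normal curve between mu - k*sigma and mu + k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3, 0.67):
    # The same area results for any mean and standard deviation
    print(k, round(area_within(k), 4), round(area_within(k, mu=100, sigma=15), 4))
```

The printed areas come out close to 68%, 95%, 99.7%, and 50%, and they do not depend on which mean and standard deviation are used.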

116

## How will we find the area under a certain part of any normal curve?

### -Convert any other normal curve into the STANDARD NORMAL CURVE and use one table

117

## What is Standardizing of a normal random variable?

### -Means to convert a column of data from x-values (normal distribution) to z-scores (standard normal distribution) = Z-transformation
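A minimal sketch of standardizing a column, assuming the population mean and standard deviation are known. The function name `standardize` and the numbers (mean 70, standard deviation 5) are made up for illustration.

```python
def standardize(xs, mu, sigma):
    """Convert x-values to z-scores with z = (x - mu) / sigma."""
    return [(x - mu) / sigma for x in xs]

heights = [60, 65, 70, 75]                    # hypothetical x-values
print(standardize(heights, mu=70, sigma=5))   # [-2.0, -1.0, 0.0, 1.0]
```

Each z-score says how many standard deviations the value lies above (positive) or below (negative) the mean.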

118

## P (z < a)

###
Probability that a standard normal random variable is...

LESS than a

119

## P (a < z)

###
Probability that a standard normal random variable is...

GREATER than a
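Instead of a printed z-table, both probabilities can be looked up with `statistics.NormalDist` (Python 3.8+). The helper names `p_less_than` and `p_greater_than` and the cutoff 1.96 are hypothetical choices for this sketch.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def p_less_than(a):
    """P(z < a): area under the standard normal curve to the left of a."""
    return STANDARD_NORMAL.cdf(a)

def p_greater_than(a):
    """P(a < z): area to the right of a, which is 1 minus the area to the left."""
    return 1.0 - STANDARD_NORMAL.cdf(a)

a = 1.96   # arbitrary illustrative cutoff
print(round(p_less_than(a), 4), round(p_greater_than(a), 4))
```

For any cutoff the two results add up to 1, because the total area under the curve is 1.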

120