Generalised Linear Models Flashcards
MATH5824M Generalised Linear and Additive Models

Flashcards in Generalised Linear Models Deck (22)
1

GLM Components

1. Random part
2. Systematic part
3. Link function

2

Random Part
Definition

f(y;θ,φ) = exp[ (yθ-b(θ)) / φ + c(y,φ) ]

θ = canonical (or natural) parameter
φ > 0 , the scale parameter
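
-worked example (added for illustration): the Poisson distribution with mean μ has
f(y;μ) = μ^y e^(-μ) / y! = exp[ y ln(μ) - μ - ln(y!) ]
-so θ = ln(μ), b(θ) = e^θ, φ = 1 and c(y,φ) = -ln(y!)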

3

Systematic Part
Definition

-the linear predictor:
η = Σ βj xj
-sum from j=1 to j=p, where p is the number of explanatory variables (regression parameters)

4

Link Function
Definition

η = g(μ) where μ=E(y)
=>
μ = g^(-1) (η) = h(η)
-where g is the link function and h is the inverse link function
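
-example (added for illustration): a Poisson response with the log link has
g(μ) = ln(μ) = η and μ = h(η) = e^η
-so the fitted mean is positive for any value of the linear predictor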

5

How to find the random part?

-write out the probability density (or mass) function, f
-rewrite in terms of an exponential
-use the form of the random part to find θ, φ, b and c
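
-worked example of these steps (added for illustration), for a Bernoulli(p) response:
f(y;p) = p^y (1-p)^(1-y) = exp[ y ln(p/(1-p)) + ln(1-p) ]
-so θ = ln(p/(1-p)), b(θ) = ln(1+e^θ), φ = 1 and c(y,φ) = 0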

6

GLM
Expectation

E(y) = b'(θ)

7

GLM
Variance

Var(y) = φ b''(θ)
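
-quick check of the last two results (added for illustration), using the Poisson where b(θ) = e^θ and φ = 1:
E(y) = b'(θ) = e^θ = μ and Var(y) = φ b''(θ) = e^θ = μ
-the familiar Poisson mean-variance relationship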

8

GLM
Expectation Proof

-by definition of the probability density function:
1 = ∫ f(y;θ,φ) dy
-differentiate both sides wrt θ
-sub in f for a GLM to do the differentiation
-revert back to f
-separate into two integrals
-sub in E(y) from the definition of expectation into the first
-use 1 = ∫ f(y;θ,φ) dy in the second
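
-writing these steps out (added as a sketch, assuming differentiation under the integral sign is valid):
0 = d/dθ ∫ f(y;θ,φ) dy = ∫ [ (y - b'(θ)) / φ ] f(y;θ,φ) dy = [ E(y) - b'(θ) ] / φ
=>
E(y) = b'(θ)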

9

GLM
Variance Proof

-by definition of the probability density function:
1 = ∫ f(y;θ,φ) dy
-differentiate both sides wrt θ
-sub in f for a GLM to do the differentiation
-revert back to f
-differentiate wrt θ a second time
-sub in the definition of Var(y) and 1 = ∫ f(y;θ,φ) dy
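
-writing these steps out (added as a sketch, again assuming differentiation under the integral sign is valid):
0 = d/dθ ∫ [ (y - b'(θ)) / φ ] f(y;θ,φ) dy
= -b''(θ)/φ + ∫ [ (y - b'(θ))² / φ² ] f(y;θ,φ) dy
= -b''(θ)/φ + Var(y)/φ²
=>
Var(y) = φ b''(θ)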

10

Systematic Part in Matrix Form

η = Σ βj xj
-but for each observation, yi, the explanatory variables may differ:
ηi = Σ βj xij
-or in matrix notation:
η = X β
-where η = (η1,...,ηn)^T is the vector of linear predictors, β = (β1,...,βp)^T is the vector of regression parameters and X is the n×p design matrix
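
-a minimal numerical illustration of η = Xβ (added sketch; assumes NumPy is available, the toy numbers are invented):

import numpy as np

# invented toy data: n = 4 observations, p = 2 parameters (intercept + one covariate)
X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 2.0],
              [1.0, 3.5]])       # n x p design matrix
beta = np.array([0.2, 0.8])      # vector of regression parameters

eta = X @ beta                   # linear predictors, eta_i = sum_j beta_j * x_ij
mu = np.exp(eta)                 # inverse log link: mu_i = h(eta_i) = exp(eta_i)
print(eta, mu)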

11

How to find the canonical link function?

-the canonical link function is a mathematically and computationally convenient choice of link function
-set θ=η
-since E(y) = b'(θ) = μ
=>
θ = b'^(-1) (μ) = η
-so the canonical link is g = b'^(-1)
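
-examples (added for illustration): for the Poisson, b'(θ) = e^θ, so θ = ln(μ) and the canonical link is the log;
for the Bernoulli, b'(θ) = e^θ / (1+e^θ), so θ = ln(μ/(1-μ)) and the canonical link is the logit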

12

Derivative of the Canonical Link Function, g'(μ)

g'(μ) = 1/b''(θ)

13

Maximum Likelihood Estimation for GLMs

f(y;θ,φ) = exp[ (yθ-b(θ)) / φ + c(y,φ) ]
-then the likelihood is:
L(θ,φ) = Π f(yi;θ,φ)
-take natural log:
l(θ,φ) = Σ [ (yiθ - b(θ)) / φ + c(yi,φ) ]
= n [ (ȳθ - b(θ)) / φ ] + const.
-maximise wrt θ
-where ȳ = (1/n) Σ yi is the sample mean
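
-a minimal numerical sketch of this maximisation for the Poisson case (added; assumes NumPy and SciPy are available, the data are invented):

import numpy as np
from scipy.optimize import minimize_scalar

# invented toy sample; for the Poisson, theta = ln(mu), b(theta) = exp(theta), phi = 1
y = np.array([2, 0, 3, 1, 4, 2])

def neg_loglik(theta):
    # minus the log-likelihood up to a constant: -sum_i [ y_i*theta - b(theta) ]
    return -np.sum(y * theta - np.exp(theta))

res = minimize_scalar(neg_loglik)
print(np.exp(res.x), y.mean())   # b'(theta_hat) = exp(theta_hat) equals the sample mean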

14

The Saturated Model
Definition

-the model with one parameter per observation, so the data are fitted exactly:
μi^ = b'(θi^) = yi

15

Deviance
Definition

D = 2φ [ l(θ~,y,φ) - l(θ^,y,φ) ]
-where the first likelihood term l(θ~,y,φ) refers to the saturated model and the second, l(θ^,y,φ), refers to the fitted model

16

Deviance
Steps to Find

-express f(y) as an exponential
-log-likelihood is ln(f(yi)) summed over i
-find μi = b'(θi) to express parameters in terms of the mean estimate
-for the fitted model term, replace parameters with mean
-do the same for the saturated model term only with μi=yi
-sub into formula for D
-cancel any remaining terms that can be cancelled
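
-applying these steps to the Poisson (worked example added for illustration, φ = 1):
l = Σ [ yi ln(μi) - μi - ln(yi!) ]
-saturated model: μi = yi ; fitted model: μi = μi^
D = 2 Σ [ yi ln(yi/μi^) - (yi - μi^) ]
-the ln(yi!) terms cancel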

17

Deviance
Distribution, φ=1 - goodness of fit

-as a goodness of fit test for a model with r fitted parameters and n observations:
D ~ χ²_(n-r) , approximately

18

Deviance
Distribution, φ=1 - model comparison

-model comparison of nested models M1 and M2:
M1 = 1 + E
M2 = 1 + E + F
-i.e. model two includes one extra explanatory variable / effect
(D1 - D2) ~ χ²_(r2-r1)
-where r1 and r2 are the numbers of fitted parameters in M1 and M2
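
-a minimal numerical sketch of this comparison (added; assumes SciPy is available, the deviances and parameter counts are invented):

from scipy.stats import chi2

D1, r1 = 45.2, 3                        # deviance and parameter count of M1
D2, r2 = 38.7, 4                        # deviance and parameter count of the larger model M2
lr_stat = D1 - D2                       # deviance difference
p_value = chi2.sf(lr_stat, df=r2 - r1)  # upper tail of chi-squared on r2 - r1 df
print(lr_stat, p_value)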

19

Deviance
Distribution, φ unknown

-estimate φ using a third, larger model M3:
φ^ = D3 / (n-r3)
-then test M1 against M2 with:
[ (D1-D2)/(r2-r1) ] / φ^ ~ F_(r2-r1, n-r3)
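
-a minimal numerical sketch of the F test (added; assumes SciPy is available, all numbers are invented):

from scipy.stats import f

D1, r1 = 45.2, 3          # deviance and parameter count of M1
D2, r2 = 38.7, 4          # deviance and parameter count of M2
D3, r3, n = 30.1, 6, 40   # third, larger model M3 used to estimate phi

phi_hat = D3 / (n - r3)
F_stat = ((D1 - D2) / (r2 - r1)) / phi_hat
p_value = f.sf(F_stat, dfn=r2 - r1, dfd=n - r3)
print(phi_hat, F_stat, p_value)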

20

Raw (response) Residuals

ei = yi - μi^

21

Standardised Residuals

ei = (yi - μi^) / √b''(θi^)

22

Deviance Residuals

ei = sign(yi-μi^) √di
-where di is the contribution of observation i to the deviance, so that D = Σ di
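
-a minimal sketch computing all three residual types for an assumed Poisson fit (added; NumPy and SciPy assumed available, data and fitted values invented):

import numpy as np
from scipy.special import xlogy   # xlogy(a, b) = a*ln(b), with 0*ln(0) taken as 0

y = np.array([2.0, 0.0, 3.0, 1.0, 4.0])       # invented observed counts
mu_hat = np.array([1.8, 0.6, 2.5, 1.4, 3.1])  # invented fitted means

raw = y - mu_hat                               # raw (response) residuals
standardised = raw / np.sqrt(mu_hat)           # for the Poisson, b''(theta_hat) = mu_hat
d = 2 * (xlogy(y, y / mu_hat) - (y - mu_hat))  # individual deviance contributions d_i
deviance_resid = np.sign(raw) * np.sqrt(d)
print(raw, standardised, deviance_resid)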