5 - Supervised Learning, Classification

Flashcards in this deck (26)
1
Q

Classification: Assign instances to predefined classes

A

Data basis:

  • several (independent) attributes
  • one (dependent) attribute

Condition:
- a priori knowledge of classification for some instances (supervised learning)

Model building:

  • generate rules from classified instances
    - first: generate the best fit
    - then: prune based on a validation set

Generalization:
- apply rules to new instances

2
Q

Classification: Assign instances to predefined classes

Exemplary methods

A
  • logistic regression
  • support vector machines
  • decision trees, regression trees
  • random forest
  • neural networks, nearest neighbor
3
Q

Classification Examples

Can you think of binary vs. nominal classes for these examples?

  • credit scoring
  • marketing responses
  • geo-temporal events
A

Credit scoring:
- nominal

Marketing responses:
- binary: response vs. no response

Geo-temporal events:
- can be both

4
Q

Decision Tree Terminology

Which types of trees are there?

A

Binary tree:
- each node splits the data into at most 2 sets

Classification tree:
- splits can lead to more than 2 branches

Decision tree:
- classes are nominal (categorical) or ordinal

Regression tree:
- classes are cardinal (continuous values)

5
Q

Decision Tree Terminology

Input

A

instance pool ((x1, …, xn), c)

with
x1, …, xn = independent attributes
c = class attribute
6
Q

Decision Tree Terminology

Output

A

Full tree

7
Q

Decision Tree Terminology

Objective

A

Formulate rules of the type:

If (condition 1) AND … AND (condition n) then c

8
Q

Decision Tree Terminology

Rule

A

Path from root to leaf

9
Q

Generating a decision tree

Algorithm steps

A
  1. Start: all objects are in a single node
  2. Search for the best classification criterion
  3. Classify all objects according to this criterion
  4. Recursively apply steps 2 and 3 until a stop criterion is met
  5. Go back and prune the tree
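
A minimal Python sketch of these steps, assuming instances are (attribute_dict, class_label) pairs and using a simple misclassification count as the split score; the concrete score, stop rule, and split arity are exactly the design choices listed on the next card:

```python
from collections import Counter

def misclassified(instances, attr):
    """Illustrative split score: instances a split on `attr` would
    misclassify if every subset predicted its majority class."""
    groups = {}
    for x, c in instances:
        groups.setdefault(x[attr], []).append(c)
    return sum(len(g) - Counter(g).most_common(1)[0][1]
               for g in groups.values())

def build_tree(instances, attributes, score=misclassified, min_size=5):
    """Grow a tree from a pool of (attribute_dict, class_label) pairs."""
    classes = [c for _, c in instances]
    majority = Counter(classes).most_common(1)[0][0]
    # Stop: pure node, too few instances, or no attributes left.
    if len(set(classes)) == 1 or len(instances) < min_size or not attributes:
        return {"leaf": True, "class": majority}
    # Step 2: search for the best classification criterion under `score`.
    best = min(attributes, key=lambda a: score(instances, a))
    # Step 3: classify all objects according to this criterion.
    partitions = {}
    for x, c in instances:
        partitions.setdefault(x[best], []).append((x, c))
    rest = [a for a in attributes if a != best]
    # Step 4: recurse; pruning (step 5) would run on the finished tree.
    return {"leaf": False, "attribute": best,
            "branches": {v: build_tree(sub, rest, score, min_size)
                         for v, sub in partitions.items()}}
```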
10
Q

Generating a decision tree

Algorithm design varies in …

A
  • stop criteria: e.g. number of instances per class, tree depth, homogeneity measures (e.g. the Gini index)
  • pruning strategy
  • choice of attributes as classification criterion (split quality)
  • number of splits per node
  • scales of measurement
11
Q

Which decision tree algorithms are there?

A

(CH)AID
- (chi-squared) automatic interaction detection

CART
- classification and regression trees

ID3
- iterative dichotomizer 3

12
Q

Which decision tree algorithms are there?

(CH)AID

A

(chi-squared) automatic interaction detection

  • objective: find significantly different subsets of data
  • select attributes that generate significantly different subsets
13
Q

Which decision tree algorithms are there?

CART

A

Classification and regression trees

  • objective: maximize the information content I
  • select attributes that split the data with the best success quota
  • only binary trees
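
CART implementations commonly score candidate binary splits with the Gini index mentioned in card 10 (a lower weighted Gini means a better success quota); a small sketch with made-up class labels:

```python
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class shares (0 = pure set)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def binary_split_gini(left, right):
    """Weighted Gini of a candidate binary split (CART trees are binary)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# A split that isolates a pure subset scores better than a mixed one.
print(binary_split_gini(["yes"] * 4, ["yes"] * 2 + ["no"] * 4))  # ~0.27
print(binary_split_gini(["yes"] * 3 + ["no"] * 2,
                        ["yes"] * 3 + ["no"] * 2))               # 0.48
```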
14
Q

Which decision tree algorithms are there?

ID3

A

Iterative dichotomizer 3

  • objective: minimize entropy
  • split on attribute that produces subsets with minimal entropy
15
Q

ID3: classification by entropy

A
  • compute the entropy of a given set of instances according to the target attribute c
  • entropy measures the homogeneity according to c: it is 0 for a perfectly homogeneous set and grows as the classes mix
16
Q

Evaluating splits by information entropy

A
  • compute the information entropy of a split as the weighted sum of the entropies of the new subsets
  • choose the split that results in the lowest entropy (-> this provides the highest information gain)
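
A sketch of the entropy and split-entropy computations from this and the previous card (the example class labels are made up):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy of a set of instances w.r.t. the target attribute c:
    0 for a homogeneous set, log2(k) for k equally likely classes."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def split_entropy(subsets):
    """Information entropy of a split: weighted sum of subset entropies."""
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * entropy(s) for s in subsets)

# The split isolating the "no" instances wins: lower split entropy
# means higher information gain.
pool = ["yes"] * 6 + ["no"] * 4
print(entropy(pool))                             # ~0.971
print(split_entropy([["yes"] * 6, ["no"] * 4]))  # 0.0
print(split_entropy([["yes"] * 3 + ["no"] * 2,
                     ["yes"] * 3 + ["no"] * 2]))  # ~0.971 (no gain)
```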
17
Q

Decision tree pruning

What is pruning?

A

Simplify complicated decision trees to increase efficiency and avoid over-fitting or under-fitting

18
Q

Decision tree pruning

What kinds of pruning are there?

A

Top-down pruning:
-> apply stopping criteria while building the tree

Bottom-up pruning (ex post):

  • prune splits that do not increase subset homogeneity sufficiently
  • prune to undo over-fitting, based on a validation set: remove tree parts that do not significantly increase the success quota
19
Q

Decision tree quality

Which indicators do you use to evaluate decision tree quality?

A

number of leaves
- number of generated rules (can be too many or too few)

depth of tree
- maximum rule length

external path length
- sum of all path lengths from root to leaf, determines memory requirements

weighted external path length
- sum of path lengths from root to leaf multiplied by the number of instances represented, measures classification costs
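
All four indicators fall out of one tree traversal; a sketch over an assumed nested-dict tree in which each leaf stores the number of instances it represents:

```python
def tree_metrics(node, depth=0):
    """Return (leaves, max_depth, external_path_length, weighted_epl);
    the node format is an illustrative assumption."""
    if node["leaf"]:
        return 1, depth, depth, depth * node["n_instances"]
    totals = (0, 0, 0, 0)
    for child in node["branches"].values():
        l, d, e, w = tree_metrics(child, depth + 1)
        totals = (totals[0] + l, max(totals[1], d),
                  totals[2] + e, totals[3] + w)
    return totals

# Made-up example: a large shallow leaf and two small deep leaves.
tree = {"leaf": False, "branches": {
    "a": {"leaf": True, "n_instances": 90},
    "b": {"leaf": False, "branches": {
        "x": {"leaf": True, "n_instances": 8},
        "y": {"leaf": True, "n_instances": 2}}}}}
print(tree_metrics(tree))  # (3, 2, 5, 110)
```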

20
Q

Decision trees: conclusion

A
  • rules are easily understood and interpreted, relatively easy to determine
  • can model non-linear relationships
  • rule set can become very complex, therefore careful pruning is important
21
Q

Random forests

What are random forests?

A

Random forests are an example of approaches that generate several randomized instances of a model and classify data based on the aggregated results of this model set

22
Q

Random forests

How do random forests work?

A

Generation: Generate k trees by

  • drawing a number of instances (with replacement) to generate a tree-specific training set
  • generating the tree based on this specific training set

Generalisation: For each instance in the test set

  • classify the instance according to each of the k trees
  • assign the most frequently determined class (or assign probabilities)
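
A bagging sketch of both phases using scikit-learn trees (the data arrays and integer class labels are assumptions; full random forests additionally draw a random subset of attributes at each split, e.g. via max_features):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_forest_predict(X_train, y_train, X_test, k=100, seed=0):
    """k bootstrapped trees, majority vote (assumes integer class labels)."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(k):
        # Generation: draw instances with replacement -> specific training set.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        # Generalisation: classify each test instance with this tree.
        votes.append(tree.predict(X_test))
    # Assign the most frequently determined class per test instance.
    return np.array([np.bincount(col).argmax() for col in np.array(votes).T])
```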
23
Q

Gradient boosted trees

Steps

A
  1. Initialise a prediction regression model with a constant value
  2. compute so-called pseudo-residuals
  3. extend the model by creating a regression tree to predict the pseudo residuals
    - for any region defined by a rule, the tree predicts the outcome with a constant coefficient
    - classical boosting: find one multiplier to adjust the coefficients in order to minimize the loss function
    - TreeBoost algorithm: one multiplier per region
  4. apply the model and repeat from step 2 for M iterations
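
A minimal sketch of these steps for squared-error loss, where the pseudo-residuals are simply the current prediction errors; the tree depth, learning rate, and M are illustrative choices:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, M=100, learning_rate=0.1):
    # Step 1: initialise the model with a constant value (here: the mean).
    f0 = y.mean()
    prediction = np.full(len(y), f0)
    trees = []
    for _ in range(M):
        # Step 2: compute the pseudo-residuals.
        residuals = y - prediction
        # Step 3: fit a regression tree to the pseudo-residuals; in each
        # leaf region the tree predicts one constant value.
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
        # Extend the model; one shared shrinkage factor here, whereas
        # TreeBoost would fit one multiplier per leaf region.
        prediction = prediction + learning_rate * tree.predict(X)
        trees.append(tree)
        # Step 4: repeat from step 2 for M iterations.
    return f0, trees
```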
24
Q

Support Vector Machines

A
  • build a linear discriminant function to separate two classes as widely as possible
  • critical boundary instances are termed support vectors
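
A short scikit-learn illustration on synthetic data (the dataset is made up; kernel="linear" keeps the discriminant function linear):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic classes.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# Fit the maximum-margin linear discriminant between the two classes.
clf = SVC(kernel="linear").fit(X, y)

# The critical boundary instances are exposed as the support vectors.
print(clf.support_vectors_.shape[0], "support vectors define the margin")
```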
25
Q

Neural Networks

What are neural networks?

A
  • neural networks are a tool of “artificial intelligence”, as they imitate some concepts of the brain

Neural networks connect several simple models in a hierarchical structure

  • the simple models are termed units, perceptrons, or artificial neurons
  • neurons are massively interconnected, decomposing problems and forwarding and altering signals
26
Q

Neural Networks

Structure

A

The structure of the network is chosen by the modeller; the weights can then be learned via backpropagation
- modify each weight based on its unit’s contribution to accurate solutions

We also differentiate neural networks by the number of layers involved
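
A minimal scikit-learn illustration (the XOR data and layer size are illustrative choices; fitting adjusts the weights using backpropagated gradients):

```python
from sklearn.neural_network import MLPClassifier

# XOR is not linearly separable, so a hidden layer of units is required.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# One hidden layer of 8 units; training learns the connection weights.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=5000, random_state=1)
net.fit(X, y)
print(net.predict(X))  # ideally recovers [0 1 1 0]
```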