

  • Decision Trees: Introduction

How do we train a tree?

  • Goodness function
    • Used to select the attribute to split on at a tree node during the tree generation phase
  • Different algorithms may use different goodness functions:

    • Information gain (used in ID3)

    • Gain ratio (used in C4.5)

    • Gini index (used in CART)

  • Expected Information (Entropy)

 Entropy: a measure of disorder (how mixed the class labels are)
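
As a rough illustration, entropy over class labels is usually written H(S) = -Σ p_i log2(p_i), where p_i is the fraction of samples in class i. The sketch below is a minimal pure-Python version; the function name `entropy` and the toy labels are made up for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    # Sum -p * log2(p) over every class that actually appears.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

pure = entropy(["yes", "yes", "yes", "yes"])   # one class only: no disorder
mixed = entropy(["yes", "yes", "no", "no"])    # 50/50 split: maximum disorder for two classes
print(pure, mixed)
```

A node whose samples all share one class has entropy 0, and an evenly mixed two-class node has entropy 1 bit.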


Information gain
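
Information gain is the entropy of the labels before a split minus the size-weighted entropy of the groups after the split. A minimal sketch, assuming categorical attribute values; `information_gain` and the toy `outlook` / `windy` columns are hypothetical names for illustration only.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of labels minus the weighted entropy after grouping by attribute value."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    total = len(labels)
    # Each group's entropy is weighted by the share of samples that fall into it.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rainy", "rainy"]  # separates the classes perfectly
windy = [True, False, True, False]              # tells us nothing about the class

gain_outlook = information_gain(outlook, play)
gain_windy = information_gain(windy, play)
print(gain_outlook, gain_windy)
```

ID3 would pick the attribute with the larger gain, here `outlook`.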


Gain ratio

Gain ratio can be thought of as a normalized version of information gain.
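
To make the normalization concrete: gain ratio divides information gain by the split information, which is the entropy of the attribute's own value distribution. The sketch below assumes categorical attributes; the function names and the toy `row_id` / `outlook` columns are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(items):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(items)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(items).values())

def information_gain(values, labels):
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # a single-valued attribute cannot split anything
    return information_gain(values, labels) / split_info

labels = ["no", "no", "yes", "yes"]
row_id = ["a", "b", "c", "d"]                   # unique per row, like an ID column
outlook = ["sunny", "sunny", "rainy", "rainy"]

ratio_row_id = gain_ratio(row_id, labels)
ratio_outlook = gain_ratio(outlook, labels)
print(ratio_row_id, ratio_outlook)
```

Both attributes have the same information gain on this toy data, but the ID-like column is penalized for splitting into many small groups, so its gain ratio is lower.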

Gini index
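
The Gini index of a node is 1 minus the sum of squared class proportions, and a split is scored by the size-weighted Gini of its child nodes. A minimal sketch under those definitions; `gini` and `gini_split` are names chosen for this example, not taken from any library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_split(left_labels, right_labels):
    """Size-weighted Gini impurity of a binary split (lower is better)."""
    total = len(left_labels) + len(right_labels)
    return (len(left_labels) / total) * gini(left_labels) \
         + (len(right_labels) / total) * gini(right_labels)

pure_node = gini(["yes", "yes", "yes", "yes"])
mixed_node = gini(["yes", "yes", "no", "no"])
clean_split = gini_split(["no", "no"], ["yes", "yes"])
useless_split = gini_split(["no", "yes"], ["no", "yes"])
print(pure_node, mixed_node, clean_split, useless_split)
```

A split that leaves both children pure scores 0, while a split that leaves both children evenly mixed scores the same as the unsplit node.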

Decision Trees Overfitting


misc

Quiz

  • If all features are numerical, which of the following classifiers requires the most time in the prediction phase: KNN, logistic regression, or a decision tree classifier?


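  • Explanation:
    KNN has to look at all of the training data at prediction time, so it takes the longest.
    Logistic regression only reads the training set when it is fitted; a prediction is a single weighted sum over the features, so it is fast, especially when the dataset is small.
    A decision tree may be slightly slower than logistic regression, because with numerical features the same feature can be tested at several nodes along the path from the root to a leaf.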
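
The difference in prediction cost can be sketched with three toy predictors. Everything here is an assumption for illustration: the training points, the logistic regression weights, and the hand-built tree are made up, and the KNN is a brute-force version with no index structure.

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    # Brute-force KNN: one distance per training point, so cost grows with the training set.
    dists = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))
    top = [y for _, y in dists[:k]]
    return max(set(top), key=top.count)  # majority vote among the k nearest

def logreg_predict(weights, bias, x):
    # One weighted sum over the features; cost does not depend on the training set size.
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if 1 / (1 + math.exp(-z)) >= 0.5 else 0

def tree_predict(node, x):
    # One threshold comparison per level; the same feature may be tested more than once.
    while isinstance(node, dict):
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node

train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
weights, bias = (1.0, 1.0), -1.0  # hypothetical fitted parameters
tree = {"feature": 0, "threshold": 0.5, "left": 0,
        "right": {"feature": 0, "threshold": 0.8, "left": 0, "right": 1}}

query = (0.95, 1.0)
print(knn_predict(train_X, train_y, query),
      logreg_predict(weights, bias, query),
      tree_predict(tree, query))
```

Note that the toy tree tests feature 0 at two different depths, which is the situation the explanation describes.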