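Yesterday (2019/06/24) my progress slipped a bit, because I spent 5 hours standing in the rain at the march on 2019/06/23!!! Haha, today I'm working hard to catch up :)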


Gradient Descent in practice II - Learning Rate : article

Making sure gradient descent is working correctly:


  • For sufficiently small α, J(θ) should decrease on every iteration.
  • But if α is too small, gradient descent can be slow to converge.



  • If α is too small: slow convergence.
  • If α is too large: J(θ) may not decrease on every iteration and thus may not converge.
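
To see these rules in action, here is a minimal sketch (not from the course material) that runs batch gradient descent for a one-feature linear regression and records J(θ) after every iteration. The toy data, the function names `cost` and `gradient_descent`, and the three α values are all assumptions chosen just for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(θ) = 1/(2m) * Σ (hθ(x) - y)^2."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, iterations):
    """Run batch gradient descent; return J(θ) before and after each iteration."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    history = [cost(theta0, theta1, xs, ys)]
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
        history.append(cost(theta0, theta1, xs, ys))
    return history


# Toy data (made up for illustration): y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

for alpha in (0.001, 0.1, 1.0):
    history = gradient_descent(xs, ys, alpha, iterations=50)
    decreasing = all(b <= a for a, b in zip(history, history[1:]))
    print(f"alpha={alpha}: final J={history[-1]:.4g}, decreased every step: {decreasing}")
```

On this toy data, α = 0.001 decreases J(θ) on every iteration but has barely moved after 50 steps, α = 0.1 decreases it much faster, and α = 1.0 makes J(θ) blow up instead of converging. Which values count as "too small" or "too large" depends on the data, so the usual advice is to plot J(θ) against the iteration number and try a few values.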

Feature and Polynomial Regression : article

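OK… to be honest! I don't really get... what this article is mainly for XDD, so for now I'll just paste the notes from people who know better~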

  • We can improve our features and the form of our hypothesis function in a couple different ways.

We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x1 * x2.
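
As a rough sketch of what that looks like in code, the snippet below appends x3 = x1 * x2 to every example. The helper name `add_product_feature` and the house numbers (x1 as frontage, x2 as depth, so x3 is the land area) are hypothetical, chosen only to make the idea concrete.

```python
def add_product_feature(rows):
    """Append the combined feature x3 = x1 * x2 to every (x1, x2) row."""
    return [(x1, x2, x1 * x2) for x1, x2 in rows]


# Hypothetical data: x1 = frontage, x2 = depth, so x3 = land area.
houses = [(10.0, 20.0), (15.0, 30.0)]
print(add_product_feature(houses))
# prints [(10.0, 20.0, 200.0), (15.0, 30.0, 450.0)]
```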

Polynomial Regression:

  • In other words, when the curve produced by the hypothesis function doesn't match what we expected, it's time to "fix" things~ (the master fixers: 老柯 & 老王)~!

We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic or square root function (or any other form).

  • EX: if our hypothesis function is
     hθ(x) = θ0 + θ1x1
    • then we can create additional features based on x1 to get a quadratic function:
       hθ(x) = θ0 + θ1x1 + θ2x1^2
    • or a cubic function:
       hθ(x) = θ0 + θ1x1 + θ2x1^2 + θ3x1^3

      In the cubic version, we have created new features x2 and x3, where x2 = x1^2 and x3 = x1^3.

    • To make it a square root function, we can use:
       hθ(x) = θ0 + θ1x1 + θ2√x1

      Watch out!! Once the features are reshaped like this -> feature scaling becomes very important! E.g., if x1 has range 1 - 1000, then the range of x1^2 becomes 1 - 1000000 and that of x1^3 becomes 1 - 1000000000.
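
To make the range problem concrete, here is a small sketch that builds the cubic features x1, x1^2, x1^3 and then rescales each column. It uses mean normalization, (value - mean) / (max - min), which is only one common way to scale features. The function names and sample values are assumptions for illustration, and the code assumes no column is constant (otherwise max - min would be zero).

```python
def polynomial_features(x1_values, degree=3):
    """Return one row [x1, x1^2, ..., x1^degree] per value of x1."""
    return [[x ** p for p in range(1, degree + 1)] for x in x1_values]


def mean_normalize(rows):
    """Scale every column to (value - mean) / (max - min)."""
    scaled_columns = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        spread = max(col) - min(col)  # assumed non-zero for this sketch
        scaled_columns.append([(v - mean) / spread for v in col])
    # Transpose back so each row is one training example again.
    return [list(row) for row in zip(*scaled_columns)]


x1 = [1, 10, 100, 1000]  # sample values spanning the range 1 - 1000
raw = polynomial_features(x1)
for col in zip(*raw):
    print("raw range:", min(col), "-", max(col))

scaled = mean_normalize(raw)
for col in zip(*scaled):
    print("scaled range: %.2f to %.2f" % (min(col), max(col)))
```

After scaling, all three columns lie between -1 and 1, so gradient descent no longer has to deal with one feature that is a million times larger than another.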