# Machine Learning - multivariate linear regression 2

## Gradient Descent in practice II - Learning Rate : article

• For sufficiently small α, J(θ) should decrease on every iteration.
• But if α is too small, gradient descent can be slow to converge.

Summary:

• If α is too small: slow convergence.
• If α is too large: J(θ) may not decrease on every iteration and thus may not converge.
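To see the summary above in action, here is a minimal sketch that runs batch gradient descent on a tiny single-feature dataset and records J(θ) after every iteration. The data, the function names (`cost`, `gradient_descent`) and the three α values are assumptions made up for illustration, not part of the course material.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta) for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha, iterations):
    """Run batch gradient descent from theta = (0, 0); return the J history."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    history = [cost(theta0, theta1, xs, ys)]
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
        history.append(cost(theta0, theta1, xs, ys))
    return history


# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

for alpha in (0.001, 0.1, 0.5):
    history = gradient_descent(xs, ys, alpha, 50)
    decreasing = all(b <= a for a, b in zip(history, history[1:]))
    print(f"alpha={alpha}: final J={history[-1]:.4g}, decreased every step: {decreasing}")
```

On this particular dataset the smallest α should still be far from the minimum after 50 iterations, the middle one should get close, and the largest one should make J(θ) grow instead of shrink. Which values count as "too small" or "too large" depends on the data, so printing or plotting J(θ) per iteration is the practical check.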

## Feature and Polynomial Regression : article

• We can improve our features and the form of our hypothesis function in a couple of different ways.

We can combine multiple features into one. For example, we can combine x1 and x2 into a new feature x3 by taking x1 * x2.
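As a small illustration of combining two features, the sketch below appends x3 = x1 * x2 to each row. The helper name `add_product_feature` and the sample numbers (read here as a lot's frontage and depth, so the product is an area) are hypothetical.

```python
def add_product_feature(rows):
    """Append x3 = x1 * x2 to each (x1, x2) row."""
    return [(x1, x2, x1 * x2) for x1, x2 in rows]


# Hypothetical data: (frontage, depth) of three lots.
lots = [(20.0, 30.0), (15.0, 40.0), (25.0, 25.0)]
with_area = add_product_feature(lots)
print(with_area)
```

Whether to keep x1 and x2 alongside x3 or to train on x3 alone is a modelling choice; the code keeps all three so either option stays open.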

Polynomial Regression:

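• In other words, when the curve produced by the hypothesis function doesn't match what we expect, it's time to do some fixing~ (kings of fixing: Old Ko & Old Wang)~!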

We can change the behavior or curve of our hypothesis function by making it a quadratic, cubic or square root function (or any other form).

• EX: if our hypothesis function is
```
hθ(x) = θ0 + θ1x1
```
• then we can create additional features based on x1 to get the quadratic function:
```
hθ(x) = θ0 + θ1x1 + θ2x1^2
```
• or the cubic function:
```
hθ(x) = θ0 + θ1x1 + θ2x1^2 + θ3x1^3
```

In the cubic version, we have created new features x2 and x3, where x2 = x1^2 and x3 = x1^3.

• For a square root function, we can use:
```
hθ(x) = θ0 + θ1x1 + θ2√x1
```
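The hypothesis forms above are still linear in θ; only the features change. A minimal sketch of that idea follows, with hypothetical helper names (`polynomial_features`, `sqrt_features`, `hypothesis`) and made-up θ values:

```python
import math


def polynomial_features(x1, degree):
    """Return [1, x1, x1^2, ..., x1^degree]; the leading 1 pairs with theta0."""
    return [x1 ** p for p in range(degree + 1)]


def sqrt_features(x1):
    """Return [1, x1, sqrt(x1)] for the square root form (needs x1 >= 0)."""
    return [1.0, x1, math.sqrt(x1)]


def hypothesis(theta, features):
    """h_theta(x) = sum of theta_j * feature_j."""
    return sum(t * f for t, f in zip(theta, features))


# Made-up parameters, only to show how the pieces fit together.
cubic = hypothesis([1.0, 2.0, 3.0, 4.0], polynomial_features(2.0, 3))
root = hypothesis([1.0, 2.0, 3.0], sqrt_features(9.0))
print(cubic, root)
```

Because the model stays linear in θ, the same gradient descent update used for ordinary multivariate linear regression applies once the new feature columns are built.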

Heads up!! If you reshape the features like this, feature scaling becomes very important! EX: if x1 has range 1 - 1000, then the range of x1^2 becomes 1 - 1000000 and that of x1^3 becomes 1 - 1000000000.
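One common way to bring such columns onto a comparable scale is mean normalization, (value - mean) / (max - min). The sketch below applies it to x1, x1^2 and x1^3; the sample values and the helper name `mean_normalize` are assumptions for illustration.

```python
def mean_normalize(values):
    """Scale values to (v - mean) / (max - min); assumes they are not all equal."""
    spread = max(values) - min(values)
    if spread == 0:
        raise ValueError("cannot scale a constant feature")
    mean = sum(values) / len(values)
    return [(v - mean) / spread for v in values]


# Hypothetical samples of x1 spanning 1 to 1000.
x1 = [1.0, 10.0, 100.0, 1000.0]
columns = {
    "x1": x1,
    "x1^2": [v ** 2 for v in x1],
    "x1^3": [v ** 3 for v in x1],
}

scaled_columns = {name: mean_normalize(col) for name, col in columns.items()}
for name, col in columns.items():
    scaled = scaled_columns[name]
    print(f"{name}: raw max={max(col):.0f}, scaled range=[{min(scaled):.3f}, {max(scaled):.3f}]")
```

After scaling, every column spans a range of width 1 centered near zero, so no single polynomial term dominates the gradient descent updates.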
