Nail the fundamentals first!! I've found this exercise is really great!

EX1: gradient descent

  • Octave
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.

    % predictions = X * theta
    % errors = predictions - y
    % delta = (1/m) * X' * errors
    % theta = theta - alpha * delta


    x = X(:,2);
    h = theta(1) + (theta(2)*x);

    theta_zero = theta(1) - alpha * (1/m) * sum(h-y);
    theta_one  = theta(2) - alpha * (1/m) * sum((h - y) .* x);

    theta = [theta_zero; theta_one];

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
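
The commented hint and the explicit two-parameter code above describe the same step. As a sketch, assuming the first column of X is all ones and the second column is the single feature x (which is what `X(:,2)` suggests), the vectorized update expands component by component as follows:

```latex
\begin{aligned}
h_\theta(x) &= \theta_0 + \theta_1 x \\
\theta &:= \theta - \frac{\alpha}{m}\, X^{\top}(X\theta - y) \\
\theta_0 &:= \theta_0 - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr) \\
\theta_1 &:= \theta_1 - \frac{\alpha}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x^{(i)}
\end{aligned}
```

Both components are computed from the old theta before either is overwritten, which is why the code stores them in `theta_zero` and `theta_one` first.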


  • python
import numpy as np


def gradientDescent(X, y, theta, alpha, num_iters):
    """
    Performs gradient descent to learn `theta`. Updates theta by taking `num_iters`
    gradient steps with learning rate `alpha`.

    Parameters
    ----------
    X : array_like
        The input dataset of shape (m x n+1).
    y : array_like
        Value at given features. A vector of shape (m, ).
    theta : array_like
        Initial values for the linear regression parameters.
        A vector of shape (n+1, ).
    alpha : float
        The learning rate.
    num_iters : int
        The number of iterations for gradient descent.

    Returns
    -------
    theta : array_like
        The learned linear regression parameters. A vector of shape (n+1, ).
    J_history : list
        A python list for the values of the cost function after each iteration.

    Instructions
    ------------
    Perform a single gradient step on the parameter vector theta.

    While debugging, it can be useful to print out the values of
    the cost function (computeCost) and gradient here.
    """
    # Initialize some useful values
    m = y.shape[0]  # number of training examples
    # make a copy of theta, to avoid changing the original array, since numpy arrays
    # are passed by reference to functions
    theta = theta.copy()
    J_history = [] # Use a python list to save cost in every iteration
    for i in range(num_iters):
        # ==================== YOUR CODE HERE =================================
        # Vectorized step; the sum over rows equals X.T @ (X @ theta - y)
        theta = theta - (alpha/m) * np.sum((np.dot(X, theta)-y)[:, None]*X, axis=0)
        # =====================================================================
        # save the cost J in every iteration
        J_history.append(computeCost(X, y, theta))
        print('Cost function: ', J_history[-1])
    return theta, J_history
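
To try the function outside the exercise notebook, here is a minimal self-contained sketch. `computeCost` is not shown in this post, so the `compute_cost` below is an assumption: the usual squared-error cost J = 1/(2m) * sum((X theta - y)^2). The names `compute_cost` and `gradient_descent`, the synthetic data, and the values of `alpha` and `num_iters` are all made up for illustration.

```python
import numpy as np


def compute_cost(X, y, theta):
    """Assumed cost (not from the post): J = 1/(2m) * sum((X @ theta - y)**2)."""
    m = y.shape[0]
    residuals = X @ theta - y
    return float(residuals @ residuals) / (2 * m)


def gradient_descent(X, y, theta, alpha, num_iters):
    """Same update as gradientDescent above, without the per-iteration print."""
    m = y.shape[0]
    theta = theta.copy()  # do not modify the caller's array
    J_history = []
    for _ in range(num_iters):
        # X.T @ errors sums (h - y) * x_j over all examples for every j
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
        J_history.append(compute_cost(X, y, theta))
    return theta, J_history


# Hypothetical noise-free data generated from y = 1 + 2x
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term
y = 1.0 + 2.0 * x

theta, J_history = gradient_descent(X, y, np.zeros(2), alpha=0.5, num_iters=2000)
```

With a learning rate this small relative to the data, the cost should shrink as the iterations proceed and `theta` should approach the generating parameters; a cost that grows instead usually means `alpha` is too large.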