
Andrew Ng Machine Learning, EX1 Assignment, Part 2: Linear Regression with Multiple Variables

程序员文章站 2022-06-16 20:18:12

2 Linear regression with multiple variables

2.1 Assignment overview

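In this part, you will implement linear regression with multiple variables to predict house prices. Suppose you are selling a house and want to know what a good market price would be. One way to find out is to collect information on recently sold houses and build a model of house prices.
The file ex1data2.txt (please download the dataset yourself) contains a training set of house prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house.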

2.2 Importing modules

import matplotlib.pyplot as plt
import numpy as np
from featureNormalize import *  # feature normalization module
from gradientDescent import *  # batch gradient descent module
from normalEqn import *  # normal equation module

2.3 Loading the data

plt.ion()

# ===================== Part 1: Feature Normalization =====================
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size

2.4 Viewing the first 10 training examples and their target values

# Print out some data points
print('First 10 examples from the dataset: ')
for i in range(0, 10):
    print('x = {}, y = {}'.format(X[i], y[i]))
First 10 examples from the dataset: 
x = [2104    3], y = 399900
x = [1600    3], y = 329900
x = [2400    3], y = 369000
x = [1416    2], y = 232000
x = [3000    4], y = 539900
x = [1985    4], y = 299900
x = [1534    3], y = 314900
x = [1427    3], y = 198999
x = [1380    3], y = 212000
x = [1494    3], y = 242500

2.5 Feature normalization function (featureNormalize.py)

import numpy as np

def feature_normalize(X):
    # Standardize each feature (column) to zero mean and unit standard deviation.
    mu = np.mean(X, axis=0)  # per-feature mean, taken over all samples (axis=0)
    sigma = np.std(X, axis=0)  # per-feature standard deviation (NumPy default, ddof=0)
    X_norm = (X - mu) / sigma  # standardize every sample feature by feature

    return X_norm, mu, sigma

2.6 Normalizing the training features

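a. Normalization does not apply to the bias unit; add the bias column after normalizing.
b. Only the input features are normalized; the target values y are left unchanged.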

# Scale features and set them to zero mean
X, mu, sigma = feature_normalize(X)
X = np.c_[np.ones(m), X]  # Add a column of ones to X
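
As a quick sanity check on the two rules above, the sketch below applies the same standardization to a tiny made-up matrix (the values are illustrative and not taken from ex1data2.txt). It confirms that each feature column ends up with mean 0 and standard deviation 1, and that the column of ones is added only afterwards. The function body mirrors `feature_normalize` from section 2.5; the `*_demo` names are introduced here for illustration only.

```python
import numpy as np

def feature_normalize(X):
    # Same computation as featureNormalize.py in section 2.5
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    return (X - mu) / sigma, mu, sigma

# Made-up example values (size in sq ft, bedrooms); not real data
X_demo = np.array([[1000.0, 2.0],
                   [2000.0, 3.0],
                   [3000.0, 4.0]])

X_demo_norm, mu_demo, sigma_demo = feature_normalize(X_demo)

# Add the bias column only after normalizing, so it stays all ones
X_demo_b = np.c_[np.ones(X_demo.shape[0]), X_demo_norm]

print(X_demo_norm.mean(axis=0))  # approximately 0 for every feature
print(X_demo_norm.std(axis=0))   # approximately 1 for every feature
print(X_demo_b.shape)            # (3, 3)
```

If the bias column were normalized together with the features, its standard deviation would be 0 and the division would fail, which is why it is added last.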

2.7 Computing the cost and updating theta with batch gradient descent

Batch gradient descent and the cost function are the same for multiple variables as for a single variable; see Part 1 of EX1 for details.

# Choose some alpha value
alpha = 0.03
num_iters = 400

# Init theta and Run Gradient Descent
theta = np.zeros(3)
theta, J_history = gradient_descent_multi(X, y, theta, alpha, num_iters)
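
The post does not show gradientDescent.py, so the following is only a minimal sketch of what `gradient_descent_multi` might look like, assuming the usual vectorized update and the squared-error cost from the course. The helper name `compute_cost` and the small demo data are assumptions made for illustration, not the author's code.

```python
import numpy as np

def compute_cost(X, y, theta):
    # Hypothetical helper: J(theta) = 1/(2m) * sum((X.theta - y)^2)
    m = y.size
    error = X.dot(theta) - y
    return error.dot(error) / (2 * m)

def gradient_descent_multi(X, y, theta, alpha, num_iters):
    # Assumed implementation: vectorized batch gradient descent.
    # Every iteration uses all m samples to update all parameters at once.
    m = y.size
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        error = X.dot(theta) - y
        theta = theta - (alpha / m) * X.T.dot(error)
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history

# Made-up data that follows y = 1 + 2 * x exactly
x_demo = np.array([-1.0, 0.0, 1.0])
X_demo = np.c_[np.ones(x_demo.size), x_demo]
y_demo = 1 + 2 * x_demo

theta_demo, J_demo = gradient_descent_multi(X_demo, y_demo, np.zeros(2), 0.1, 1000)
print(theta_demo)  # close to [1, 2]
```

Because the update is written with matrix products, the same function handles one feature or many without modification.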

2.8 Plotting the cost over the training iterations

# Plot the convergence graph
plt.figure()
plt.plot(np.arange(J_history.size), J_history)
plt.xlabel('Number of iterations')
plt.ylabel('Cost J')

[Figure: convergence plot, cost J versus number of iterations]

2.9 Printing theta after batch gradient descent

# Display gradient descent's result
print('Theta computed from gradient descent : \n{}'.format(theta))
Theta computed from gradient descent : 
[340410.91897274 109162.68848142  -6293.24735132]

2.10 Predicting the house price with the learned theta

Normalize the sample to predict, using the mu and sigma computed from the training set:

# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
# Recall that the first column of X is all-ones. Thus, it does
# not need to be normalized.
x_p = np.array([1650, 3])
x_p_nor = (x_p - mu) / sigma

Prepend the bias unit (1) to the normalized sample and predict:

price = np.r_[1, x_p_nor].dot(theta)  # bias term first, then the normalized features

Print the predicted price:

print('Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : %0.3f' % (price))
Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : 293142.433

2.11 Computing theta with the normal equation

# Load data
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size
# Add intercept term to X
X = np.c_[np.ones(m), X]

2.12 Normal equation function

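The normal equation is a good alternative for computing theta as long as the number of features is not large. As a rule of thumb, when there are fewer than about ten thousand features, the normal equation is usually preferred over gradient descent, because it needs no learning rate and no iterations.
The normal equation is:
$$\theta = (X^T X)^{-1} X^T y$$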
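
For readers who want to see where the normal equation comes from, here is a short derivation sketch. It assumes the squared-error cost function used in the course (it is not restated in this post) and that $X^T X$ is invertible. Setting the gradient of the cost to zero gives the closed-form solution:

```latex
\begin{aligned}
J(\theta) &= \frac{1}{2m}\,(X\theta - y)^T (X\theta - y) \\
\nabla_\theta J(\theta) &= \frac{1}{m}\, X^T (X\theta - y) = 0 \\
X^T X\,\theta &= X^T y \\
\theta &= (X^T X)^{-1} X^T y
\end{aligned}
```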

import numpy as np

def normal_eqn(X, y):
    # Closed-form least-squares solution: theta = (X^T X)^(-1) X^T y
    theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

    return theta
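
Explicitly inverting X^T X works for this small dataset, but it fails when that matrix is singular, for example when two features are exact copies of each other. As an alternative sketch, and not the method used in this post, the hypothetical function `normal_eqn_lstsq` below solves the same least-squares problem with `np.linalg.lstsq`, which does not need the inverse. The demo data is made up.

```python
import numpy as np

def normal_eqn_lstsq(X, y):
    # Hypothetical alternative to normal_eqn: solve the least-squares
    # problem directly instead of forming and inverting X^T X.
    theta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return theta

# Made-up data that follows y = 3 + 2 * x1 - x2 exactly
X_raw = np.array([[1.0, 2.0],
                  [2.0, 1.0],
                  [3.0, 5.0],
                  [4.0, 3.0]])
y_demo = 3 + 2 * X_raw[:, 0] - X_raw[:, 1]
X_demo = np.c_[np.ones(X_raw.shape[0]), X_raw]  # add the intercept column

# Same formula as normal_eqn, for comparison
theta_inv = np.linalg.inv(X_demo.T.dot(X_demo)).dot(X_demo.T).dot(y_demo)
theta_lstsq = normal_eqn_lstsq(X_demo, y_demo)

print(theta_inv)
print(theta_lstsq)
```

On well-conditioned data like this, both approaches should return the same parameters up to floating-point error.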

Compute theta with the normal equation:

theta = normal_eqn(X, y)

# Display normal equation's result
print('Theta computed from the normal equations : \n{}'.format(theta))
Theta computed from the normal equations : 
[89597.9095428    139.21067402 -8738.01911233]

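The price predicted with the normal-equation theta (293081.464) is close to the one predicted with the gradient-descent theta (293142.433): they differ by about 61, roughly 0.02%. The two theta vectors themselves are not comparable, because gradient descent was run on normalized features while the normal equation used the raw features.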

# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
price = np.dot(np.array([1, 1650, 3]), theta)  # raw features: no normalization was used here
# ==========================================================

print('Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : {:0.3f}'.format(price))
Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : 293081.464
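
To illustrate why the two methods give nearly the same price even though their theta vectors look so different, the sketch below repeats both procedures on synthetic data (randomly generated, not ex1data2.txt). Gradient descent runs on normalized features and the normal equation on raw features, yet both describe the same fitted line, so their predictions agree once gradient descent has converged. The learning rate and iteration count are assumptions chosen for this demo; the small gap in the results above is presumably due to gradient descent not having fully converged after 400 iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "houses": size in sq ft and bedroom count (made-up values)
size = rng.uniform(800, 4000, 50)
beds = rng.integers(1, 6, 50).astype(float)
X_raw = np.c_[size, beds]
y = 50000 + 120 * size - 5000 * beds + rng.normal(0, 10000, 50)
m = y.size

# Gradient descent on normalized features
mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)
X_gd = np.c_[np.ones(m), (X_raw - mu) / sigma]
theta_gd = np.zeros(3)
for _ in range(2000):
    theta_gd -= (0.1 / m) * X_gd.T.dot(X_gd.dot(theta_gd) - y)

# Normal equation on raw features
X_ne = np.c_[np.ones(m), X_raw]
theta_ne = np.linalg.inv(X_ne.T.dot(X_ne)).dot(X_ne.T).dot(y)

# Predict the same house with both models
x_new = np.array([1650.0, 3.0])
price_gd = np.r_[1, (x_new - mu) / sigma].dot(theta_gd)  # normalize first
price_ne = np.r_[1, x_new].dot(theta_ne)                 # raw features
print(price_gd, price_ne)
```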
