
TensorFlow实战 2: Multi-Layer Perceptron (MLP)


Common problems in machine learning and how to address them:

        1. Overfitting: the model's accuracy keeps improving on the training set but drops on the test set. In other words, the model generalizes poorly; it has merely memorized the features of the current data and cannot extrapolate to new data.

      [Solution] Use dropout: randomly discard the output nodes of one or more layers during training. This is roughly equivalent to creating new random samples, increasing the effective number of samples while reducing the number of features, which helps prevent overfitting (see the sketch below).
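A minimal sketch of dropout in TensorFlow 1.x; the hidden tensor and its size of 300 are hypothetical, only the keep_prob placeholder pattern matters:

import tensorflow as tf

hidden = tf.placeholder(tf.float32, [None, 300])   # hypothetical hidden-layer output
keep_prob = tf.placeholder(tf.float32)             # probability of keeping each node
hidden_drop = tf.nn.dropout(hidden, keep_prob)     # zero random nodes, scale the rest by 1/keep_prob
# feed keep_prob < 1.0 while training and keep_prob = 1.0 when evaluating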

        2. Parameter tuning: training a neural network is usually not a convex optimization problem and is full of local optima. Different learning rates can have a large impact on accuracy, and different machine-learning problems call for different parameter settings.

      [Solution] In most cases, start from the default parameters; when tuning is needed, make small adjustments guided by experience and by the training diagnostics (a simple sweep is sketched below).
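As an illustration of such tuning, the sketch below sweeps a few candidate learning rates on a toy one-variable problem; the candidate rates and the toy cost are assumptions for illustration, not values from this article:

import tensorflow as tf

# toy problem: minimize (w - 2)^2, purely to compare learning rates
w = tf.Variable(5.0)
cost = tf.square(w - 2.0)

for lr in (0.01, 0.1, 0.3):                        # candidate learning rates (illustrative)
    train_step = tf.train.AdagradOptimizer(lr).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(100):
            sess.run(train_step)
        print("lr =", lr, "final cost =", sess.run(cost))
# on a real model, keep the rate that gives the best validation accuracy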

        3. Vanishing gradients: in a multi-layer network, the gradient of the sigmoid function keeps shrinking during backpropagation; after passing through many layers it decays exponentially, so the network's parameters update very slowly.

      [Solution] Use the ReLU function, max(0, x). Compared with sigmoid, ReLU's main changes are: 1. one-sided suppression; 2. a relatively wide excitation boundary; 3. sparse activation (see the gradient comparison sketched below).
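A small sketch comparing the backpropagated gradient of ReLU and sigmoid at a single point (the input value 3.0 is arbitrary): ReLU's gradient is 1 for any positive input, whereas sigmoid's is at most 0.25 and shrinks further as it is multiplied across layers.

import tensorflow as tf

x = tf.placeholder(tf.float32)
relu_grad = tf.gradients(tf.nn.relu(x), x)[0]        # 1 for x > 0, 0 for x < 0
sigmoid_grad = tf.gradients(tf.nn.sigmoid(x), x)[0]  # sigmoid(x) * (1 - sigmoid(x)), at most 0.25

with tf.Session() as sess:
    print(sess.run([relu_grad, sigmoid_grad], feed_dict={x: 3.0}))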

Note: the output layer of a neural network is generally still sigmoid, because it maps naturally onto a probability distribution. In models such as CNNs and RNNs, the activation functions are mostly sigmoid, tanh, and hard sigmoid.

The following TensorFlow code implements a multi-layer perceptron that reaches roughly 98% accuracy on MNIST handwritten digit recognition:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#load data sets
mnist = input_data.read_data_sets("MNIST_data/", one_hot = True)

#define input nodes and hidden nodes
n_input = 784
n_hidden_1 = 300

#define parameters
batch_size = 100
training_epochs = 10
display_step = 1

#define input placeholder
x = tf.placeholder(tf.float32, [None, n_input])

#define keep probability for dropout
keep_prob = tf.placeholder(tf.float32)

#define weights and biases dict
weights = {
	'W_1' : tf.Variable(tf.truncated_normal([n_input, n_hidden_1], stddev = 0.1)),
	'W_2' : tf.Variable(tf.zeros([n_hidden_1, 10])),
}

biases = {
	'b_1' : tf.Variable(tf.zeros([n_hidden_1])),
	'b_2' : tf.Variable(tf.zeros([10])),
}

#build model
layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['W_1']), biases['b_1']))
layer_1_drop = tf.nn.dropout(layer_1, keep_prob)
y_pred = tf.nn.softmax(tf.add(tf.matmul(layer_1_drop, weights['W_2']), biases['b_2']))

#define output placeholder
y_true = tf.placeholder(tf.float32, [None, 10])

#define cost function (cross-entropy) and optimizer
cost = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_pred), axis = [1]))
optimizer = tf.train.AdagradOptimizer(0.3).minimize(cost)

#create session and initialize variables
sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

#train the model
total_batch = int(mnist.train.num_examples / batch_size)
for epoch in range(training_epochs):
	for i in range(total_batch):
		batch_xs, batch_ys = mnist.train.next_batch(batch_size)
		_, c = sess.run([optimizer, cost], feed_dict = {x:batch_xs, y_true:batch_ys, keep_prob:0.75})
	if epoch % display_step == 0:
		print("Epoch : ", '%04d'%(epoch + 1), "Cost = ", '{:.9f}'.format(c))
print("Optimization finished!")

#calculate accuracy
correct_pred = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

#test model accuracy
print(accuracy.eval({x:mnist.test.images, y_true:mnist.test.labels, keep_prob:1.0}))
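Note that training feeds keep_prob:0.75 so that dropout is active, while the evaluation above feeds keep_prob:1.0 so that every hidden node is kept at test time.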

practice makes perfect!