
TensorFlow实战 2: Multi-Layer Perceptron (MLP)


Common problems in machine learning and how to address them:

        1. Overfitting: the model's accuracy keeps improving on the training set but drops on the test set. In other words, the model generalizes poorly; it has merely memorized the features of the current data and cannot extrapolate to new data.

      [Solution] Use dropout: randomly discard the output nodes of one or more layers during training. This is roughly equivalent to creating new random samples, increasing the effective number of samples while reducing the number of features, which helps prevent overfitting (see the sketch below).
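A minimal sketch of dropout in TensorFlow 1.x; the hidden tensor and its size of 300 are hypothetical, only the keep_prob placeholder pattern matters:

import tensorflow as tf

hidden = tf.placeholder(tf.float32, [None, 300])   # hypothetical hidden-layer output
keep_prob = tf.placeholder(tf.float32)             # probability of keeping each node
hidden_drop = tf.nn.dropout(hidden, keep_prob)     # zero random nodes, scale the rest by 1/keep_prob
# feed keep_prob < 1.0 while training and keep_prob = 1.0 when evaluating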

        2. Parameter tuning: training a neural network is usually not a convex optimization problem and is full of local optima. Different learning rates can have a large impact on accuracy, and different machine-learning problems call for different parameter settings.

      [Solution] In most cases, start from the default parameters; when tuning is needed, make small adjustments guided by experience and by the training diagnostics (a simple sweep is sketched below).
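As an illustration of such tuning, the sketch below sweeps a few candidate learning rates on a toy one-variable problem; the candidate rates and the toy cost are assumptions for illustration, not values from this article:

import tensorflow as tf

# toy problem: minimize (w - 2)^2, purely to compare learning rates
w = tf.Variable(5.0)
cost = tf.square(w - 2.0)

for lr in (0.01, 0.1, 0.3):                        # candidate learning rates (illustrative)
    train_step = tf.train.AdagradOptimizer(lr).minimize(cost)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(100):
            sess.run(train_step)
        print("lr =", lr, "final cost =", sess.run(cost))
# on a real model, keep the rate that gives the best validation accuracy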

        3. Vanishing gradients: in a multi-layer network, the gradient of the sigmoid function keeps shrinking during backpropagation; after passing through many layers it decays exponentially, so the network's parameters update very slowly.

      [Solution] Use the ReLU function, max(0, x). Compared with sigmoid, ReLU's main changes are: 1. one-sided suppression; 2. a relatively wide excitation boundary; 3. sparse activation (see the gradient comparison sketched below).
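A small sketch comparing the backpropagated gradient of ReLU and sigmoid at a single point (the input value 3.0 is arbitrary): ReLU's gradient is 1 for any positive input, whereas sigmoid's is at most 0.25 and shrinks further as it is multiplied across layers.

import tensorflow as tf

x = tf.placeholder(tf.float32)
relu_grad = tf.gradients(tf.nn.relu(x), x)[0]        # 1 for x > 0, 0 for x < 0
sigmoid_grad = tf.gradients(tf.nn.sigmoid(x), x)[0]  # sigmoid(x) * (1 - sigmoid(x)), at most 0.25

with tf.Session() as sess:
    print(sess.run([relu_grad, sigmoid_grad], feed_dict={x: 3.0}))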

Note: the output layer of a neural network is generally still sigmoid, because it maps naturally onto a probability distribution. In models such as CNNs and RNNs, the activation functions are mostly sigmoid, tanh, and hard sigmoid.

The following TensorFlow code implements a multi-layer perceptron that reaches roughly 98% accuracy on MNIST handwritten digit recognition:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#load data sets
mnist = input_data.read_data_sets("MNIST_data/", one_hot = True)

#define input nodes and hidden nodes
n_input = 784
n_hidden_1 = 300

#define parameters
batch_size = 100
training_epochs = 10
display_step = 1

#define input placeholder
x = tf.placeholder(tf.float32, [None, n_input])

#define keep probability for dropout
keep_prob = tf.placeholder(tf.float32)

#define weights and biases dict
weights = {
	'W_1' : tf.Variable(tf.truncated_normal([n_input, n_hidden_1], stddev = 0.1)),
	'W_2' : tf.Variable(tf.zeros([n_hidden_1, 10])),
}

biases = {
	'b_1' : tf.Variable(tf.zeros([n_hidden_1])),
	'b_2' : tf.Variable(tf.zeros([10])),
}

#build model
layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['W_1']), biases['b_1']))
layer_1_drop = tf.nn.dropout(layer_1, keep_prob)
y_pred = tf.nn.softmax(tf.add(tf.matmul(layer_1_drop, weights['W_2']), biases['b_2']))

#define output placeholder
y_true = tf.placeholder(tf.float32, [None, 10])

#define cost function (cross-entropy) and optimizer
cost = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_pred), axis = [1]))
optimizer = tf.train.AdagradOptimizer(0.3).minimize(cost)

#create session and initialize variables
sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

#train the model
total_batch = int(mnist.train.num_examples / batch_size)
for epoch in range(training_epochs):
	for i in range(total_batch):
		batch_xs, batch_ys = mnist.train.next_batch(batch_size)
		_, c = sess.run([optimizer, cost], feed_dict = {x:batch_xs, y_true:batch_ys, keep_prob:0.75})
	if epoch % display_step == 0:
		print("Epoch : ", '%04d'%(epoch + 1), "Cost = ", '{:.9f}'.format(c))
print("Optimization finished!")

#calculate accuracy
correct_pred = tf.equal(tf.argmax(y_pred, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

#test model accuracy
print(accuracy.eval({x:mnist.test.images, y_true:mnist.test.labels, keep_prob:1.0}))
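Note that training feeds keep_prob:0.75 so that dropout is active, while the evaluation above feeds keep_prob:1.0 so that every hidden node is kept at test time.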

practice makes perfect!