
TensorFlow MNIST: Batch Normalization and Multi-Channel Convolution


Multi-Channel Convolution

Multi-channel convolution places several filters of different sizes in a single convolutional layer, which makes the resulting feature maps more diverse.

W_conv2_5x5 = weight_variable([5, 5, 32, 32]) 
b_conv2_5x5 = bias_variable([32]) 
W_conv2_7x7 = weight_variable([7, 7, 32, 32]) 
b_conv2_7x7 = bias_variable([32])
h_conv2_5x5 = tf.nn.relu(conv2d(h_pool1, W_conv2_5x5) + b_conv2_5x5)
h_conv2_7x7 = tf.nn.relu(conv2d(h_pool1, W_conv2_7x7) + b_conv2_7x7)
h_conv2 = tf.concat([h_conv2_5x5, h_conv2_7x7], 3)

h_pool2 = max_pool_2x2(h_conv2)

Both the 5x5 and the 7x7 convolutions take h_pool1 as input, and each produces 32 feature maps. concat then joins them along the channel axis into a single tensor with 64 channels; after the 2x2 max pooling the shape is (batch, 7, 7, 64).
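As a quick sanity check (a minimal sketch with made-up zero tensors, not part of the original model), concatenating two 32-channel maps along axis 3 and then max-pooling reproduces these shapes:

import tensorflow as tf

# two hypothetical feature maps of shape (batch, 14, 14, 32)
a = tf.zeros([10, 14, 14, 32])
b = tf.zeros([10, 14, 14, 32])

merged = tf.concat([a, b], 3)  # channel counts add up: 32 + 32 = 64
pooled = tf.nn.max_pool(merged, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

print(merged.shape)  # (10, 14, 14, 64)
print(pooled.shape)  # (10, 7, 7, 64)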

Batch Normalization (BN)

tf.contrib.layers.batch_norm(
    inputs,
    decay=0.999,
    center=True,
    scale=False,
    epsilon=0.001,
    activation_fn=None,
    param_initializers=None,
    param_regularizers=None,
    updates_collections=tf.GraphKeys.UPDATE_OPS,
    is_training=True,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    batch_weights=None,
    fused=None,
    data_format=DATA_FORMAT_NHWC,
    zero_debias_moving_mean=False,
    scope=None,
    renorm=False,
    renorm_clipping=None,
    renorm_decay=0.99,
    adjustment=None
)


1 inputs: the input tensor

2 decay: decay coefficient for the moving averages. Good values are close to 1.0, typically with several nines: 0.999, 0.99, 0.9. If the training set does well but the validation/test set does not, choose a smaller value (0.9 is recommended). A value that is too small makes the moving mean and variance update too quickly, while a value that is too large gives almost no decay and the model overfits more easily. To improve stability, set zero_debias_moving_mean=True.

3 center: if True, a beta offset is added; if False, no beta offset is used.

4 scale: if True, multiply by gamma; if False, gamma is not used. When the next layer is linear (e.g. nn.relu), the scaling can be done by that layer, so it can be disabled here.

5 epsilon: a small value added to the variance to avoid division by zero.

6 activation_fn: activation function applied to the output; defaults to a linear (identity) activation.

7 param_initializers: optional initializers for beta, gamma, moving mean and moving variance.

8 param_regularizers: optional regularizers for beta and gamma.

9 updates_collections: collections into which the computed update ops are placed; these update ops then need to be run together with train_op. With the default setting, the mean and variance are only updated after the current batch has finished training, so each batch always uses the mean and variance from the previous step. If set to None, control dependencies are added instead, so the mean and variance are updated immediately; this is slightly slower but helps training considerably.

10 is_training: whether the layer is in training mode. In training mode it accumulates the moving_mean and moving_variance statistics using the given exponential moving-average decay. When not in training mode it uses the stored moving_mean and moving_variance values. Set it to False at test time so that the statistics accumulated from the training set are used.

11 scope: optional variable_scope.


Note: during training, moving_mean and moving_variance need to be updated. By default the update ops are placed in tf.GraphKeys.UPDATE_OPS, so they must be added as dependencies of train_op. For example:
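A minimal sketch of the usual pattern (a hypothetical two-layer network, not the article's model; assumes the default updates_collections and a bool is_training placeholder):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
is_training = tf.placeholder(tf.bool)

net = tf.contrib.layers.fully_connected(x, 256, activation_fn=tf.nn.relu)
net = tf.contrib.layers.batch_norm(net, decay=0.9, is_training=is_training)
logits = tf.contrib.layers.fully_connected(net, 10, activation_fn=None)

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))

# the moving-average update ops live in tf.GraphKeys.UPDATE_OPS;
# make train_op depend on them so they run at every training step
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# feed is_training=True when training and is_training=False when evaluating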


Simplifying the code with tf's built-in convolution functions

import tensorflow as tf
import numpy as np
from tensorflow.contrib.layers.python.layers import batch_norm
# Import the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data

data_dir = "MNIST_data/"
batch_size = 100
train_size = 10000
mnist = input_data.read_data_sets(data_dir, one_hot=True)
# exponential_decay(learning_rate, global_step, decay_steps, decay_rate);
# the step here is the constant 20000, so this evaluates to a fixed rate (~0.0049)
decaylearning_rate = tf.train.exponential_decay(0.04, 20000, 1000, 0.9)

# tf Graph Input
in_x = tf.placeholder(tf.float32, [None, 784])  # MNIST data dimensions: 28*28 = 784
in_image = tf.reshape(in_x, [-1, 28, 28, 1])
in_y = tf.placeholder(tf.float32, [None, 10])  # digits 0-9 => 10 classes
is_training = tf.placeholder(tf.bool)
# h_conv1 = tf.contrib.layers.conv2d(x_image, 64, [5, 5], 1, 'SAME', activation_fn=tf.nn.relu)
conv1 = tf.contrib.layers.conv2d(in_image, 32, [5, 5], 1, "SAME", activation_fn=tf.nn.relu)
pool1 = tf.contrib.layers.max_pool2d(conv1, [2, 2], stride=2, padding="SAME")

conv2 = tf.contrib.layers.conv2d(pool1, 32, [5, 5], 1, "SAME", activation_fn=tf.nn.relu)
pool2 = tf.contrib.layers.max_pool2d(conv2, [2, 2], stride=2, padding="SAME")

conv3_3x3 = tf.contrib.layers.conv2d(pool2, 32, [3, 3], 1, "SAME", activation_fn=tf.nn.relu)
conv3_5x5 = tf.contrib.layers.conv2d(pool2, 32, [5, 5], 1, "SAME", activation_fn=tf.nn.relu)
conv3_7x7 = tf.contrib.layers.conv2d(pool2, 32, [7, 7], 1, "SAME", activation_fn=tf.nn.relu)
conv3 = tf.concat([conv3_3x3, conv3_5x5, conv3_7x7], 3)
# add BN
conv3_bn = batch_norm(conv3, decay=0.9, updates_collections=None, is_training=is_training)
pool3 = tf.contrib.layers.avg_pool2d(conv3_bn, [6, 6], stride=7, padding='SAME')

# pool3 = tf.contrib.layers.avg_pool2d(conv3, [6, 6], stride=7, padding='SAME')

pool3_flat = tf.reshape(pool3, [-1, 96])
out_y = tf.contrib.layers.fully_connected(pool3_flat, 10, activation_fn=tf.nn.softmax)

# note: out_y already has softmax applied, so the tensor passed as logits below is
# actually a probability distribution; the network still trains, but the more
# standard form is activation_fn=None above, applying softmax only for prediction
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=in_y, logits=out_y)
train_step = tf.train.AdamOptimizer(decaylearning_rate).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(out_y, 1), tf.argmax(in_y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = mnist.train.next_batch(batch_size)
    print(sess.run(conv1, feed_dict={in_x: batch[0], in_y: batch[1], }).shape)
    print(sess.run(pool1, feed_dict={in_x: batch[0], in_y: batch[1], }).shape)
    print(sess.run(conv2, feed_dict={in_x: batch[0], in_y: batch[1], }).shape)
    print(sess.run(pool2, feed_dict={in_x: batch[0], in_y: batch[1], }).shape)
    print(sess.run(conv3, feed_dict={in_x: batch[0], in_y: batch[1], }).shape)
    print(sess.run(pool3, feed_dict={in_x: batch[0], in_y: batch[1], is_training: True}).shape)
    print(sess.run(pool3_flat, feed_dict={in_x: batch[0], in_y: batch[1], is_training: True}).shape)
    print(sess.run(out_y, feed_dict={in_x: batch[0], in_y: batch[1], is_training: True}).shape)

    for i in range(train_size):
        batch = mnist.train.next_batch(batch_size)
        train_step.run(feed_dict={
            in_x: batch[0],
            in_y: batch[1],
            is_training: True,
        })

        if (i + 1) % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                in_x: batch[0],
                in_y: batch[1],
                is_training: False,
            })
            print("step %d, train %g" % (i + 1, train_accuracy),
                  "test %g" % accuracy.eval(
                      feed_dict={
                          in_x: mnist.test.images,
                          in_y: mnist.test.labels,
                          is_training: False
                      }))

Shape of each layer:
(100, 28, 28, 32)
(100, 14, 14, 32)
(100, 14, 14, 32)
(100, 7, 7, 32)
(100, 7, 7, 96)
(100, 1, 1, 96)
(100, 96)
(100, 10)

Without BN:

step 9800, train 0.99 test 0.9861
step 9900, train 0.97 test 0.9804
step 10000, train 0.98 test 0.9859

With BN:

step 9800, train 1 test 0.9939
step 9900, train 1 test 0.9929
step 10000, train 1 test 0.9933

Note that dropout is not used here. With the same amount of training, the BN network is clearly more accurate: its test error (about 0.6-0.7%) is roughly half that of the plain network (about 1.4-2%).

Shapes of each layer in the MNIST network (second implementation, built from tf.nn primitives):

h_pool1 (50, 14, 14, 32)
h_conv2_5x5 (50, 14, 14, 32)
h_conv2_7x7 (50, 14, 14, 32)
h_pool2 (50, 7, 7, 64)
nt_hpool3 (50, 1, 1, 10)

import tensorflow as tf
import numpy as np
from tensorflow.contrib.layers.python.layers import batch_norm
# Import the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


def avg_pool_7x7(x):
    return tf.nn.avg_pool(x, ksize=[1, 7, 7, 1],
                          strides=[1, 7, 7, 1], padding='SAME')


def batch_norm_layer(value, train=None, name='batch_norm'):
    # note: this `if` runs at graph-construction time, so passing the `train`
    # placeholder always selects the first branch (is_training=True); the
    # placeholder is never used as a tensor inside the graph
    if train is not None:
        return batch_norm(value, decay=0.9, updates_collections=None, is_training=True)
    else:
        return batch_norm(value, decay=0.9, updates_collections=None, is_training=False)


# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784])  # MNIST data dimensions: 28*28 = 784
y = tf.placeholder(tf.float32, [None, 10])  # digits 0-9 => 10 classes
train = tf.placeholder(tf.float32)  # training flag (only inspected in Python, see batch_norm_layer above)

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(batch_norm_layer((conv2d(x_image, W_conv1) + b_conv1), train))
h_pool1 = max_pool_2x2(h_conv1)
###################################################### multi-kernel convolution
W_conv2_5x5 = weight_variable([5, 5, 32, 32])
b_conv2_5x5 = bias_variable([32])
W_conv2_7x7 = weight_variable([7, 7, 32, 32])
b_conv2_7x7 = bias_variable([32])
h_conv2_5x5 = tf.nn.relu(batch_norm_layer((conv2d(h_pool1, W_conv2_5x5) + b_conv2_5x5), train))
h_conv2_7x7 = tf.nn.relu(batch_norm_layer((conv2d(h_pool1, W_conv2_7x7) + b_conv2_7x7), train))
h_conv2 = tf.concat([h_conv2_5x5, h_conv2_7x7], 3)

h_pool2 = max_pool_2x2(h_conv2)
######################################################### new pooling layer

W_conv3 = weight_variable([5, 5, 64, 10])
b_conv3 = bias_variable([10])
h_conv3 = tf.nn.relu(conv2d(h_pool2, W_conv3) + b_conv3)

nt_hpool3 = avg_pool_7x7(h_conv3)  # 10
nt_hpool3_flat = tf.reshape(nt_hpool3, [-1, 10])
y_conv = tf.nn.softmax(nt_hpool3_flat)

# keep_prob = tf.placeholder("float")

cross_entropy = -tf.reduce_sum(y * tf.log(y_conv))

decaylearning_rate = tf.train.exponential_decay(0.04, 20000, 1000, 0.9)
train_step = tf.train.AdamOptimizer(decaylearning_rate).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# start the session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10000):  # 20000
        batch = mnist.train.next_batch(50)

        train_step.run(feed_dict={
            x: batch[0],
            y: batch[1],
            # keep_prob: 0.5,
            train: 1.,
        })

        if (i + 1) % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x: batch[0],
                y: batch[1],
                # keep_prob: 1.0,

            })
            print("step %d, training accuracy %g" % (i, train_accuracy))

            print("test accuracy %g" % accuracy.eval(feed_dict={
                x: mnist.test.images,
                y: mnist.test.labels,
                # keep_prob: 1.0
            }))

This achieves very good results (the output below is from a run of 20000 iterations, matching the commented-out value in the loop above):

step 19700, training accuracy 1
step 19800, training accuracy 1
step 19900, training accuracy 1
test accuracy 0.9944

Batch-normalization comparison, this time using tf.layers.batch_normalization

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

# load the dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


def get_net(x):
    print(x.shape)
    x = tf.reshape(x, (-1, 28, 28, 1))
    print(x.shape)
    net = tf.layers.conv2d(x, 16, 3, 2, "SAME")
    print(net.shape)
    net = tf.layers.conv2d(net, 16, 3, 2, "SAME")
    print(net.shape)
    net = tf.layers.conv2d(net, 10, 7, 7, "SAME")
    print(net.shape)
    net = tf.layers.flatten(net)
    print(net.shape)
    return net


def get_net2(x):
    print(x.shape)
    x = tf.reshape(x, (-1, 28, 28, 1))
    # training=True is hardcoded, so BN always normalizes with per-batch statistics
    # (adequate for this quick comparison; see the note after the results)
    x = tf.layers.batch_normalization(x, training=True)
    print(x.shape)
    net = tf.layers.conv2d(x, 16, 3, 2, "SAME")
    net = tf.layers.batch_normalization(net, training=True)
    print(net.shape)
    net = tf.layers.conv2d(net, 16, 3, 2, "SAME")
    net = tf.layers.batch_normalization(net, training=True)
    print(net.shape)
    net = tf.layers.conv2d(net, 10, 7, 7, "SAME")
    print(net.shape)
    net = tf.layers.flatten(net)
    print(net.shape)
    return net


train_num = 10000
batch_size = 100
show_num = 200
learning_rate = .0001
in_x = tf.placeholder(tf.float32, (None, 784))
in_y = tf.placeholder(tf.float32, (None, 10))

out_y = get_net(in_x)
loss = tf.nn.softmax_cross_entropy_with_logits(labels=in_y, logits=out_y)
train_opt = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
predict = tf.equal(tf.argmax(out_y, 1), tf.argmax(in_y, 1))
accuracy = tf.reduce_mean(tf.cast(predict, tf.float32))

out_y2 = get_net2(in_x)
loss2 = tf.nn.softmax_cross_entropy_with_logits(labels=in_y, logits=out_y2)
train_opt2 = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss2)
predict2 = tf.equal(tf.argmax(out_y2, 1), tf.argmax(in_y, 1))
accuracy2 = tf.reduce_mean(tf.cast(predict2, tf.float32))

with tf.Session()  as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(train_num):
        batch = mnist.train.next_batch(batch_size)

        sess.run(
            [train_opt, train_opt2], feed_dict={
                in_x: batch[0],
                in_y: batch[1],
            })

        if not (i + 1) % show_num:
            acc_train1, acc_train2 = sess.run(
                [accuracy, accuracy2], feed_dict={
                    in_x: mnist.train.images,
                    in_y: mnist.train.labels,
                }
            )
            acc_test1, acc_test2 = sess.run(
                [accuracy, accuracy2], feed_dict={
                    in_x: mnist.test.images,
                    in_y: mnist.test.labels,
                }
            )
            print(acc_train1, acc_test1, acc_train2, acc_test2)

We can see that the batch-normalized network converges faster than the plain one and ends up slightly more accurate. The four columns below are: train accuracy and test accuracy without BN, then train accuracy and test accuracy with BN.

0.6965455 0.7094 0.7228182 0.7342
0.81363636 0.8296 0.82674545 0.8371
0.8600364 0.8716 0.8582364 0.8691
0.8773636 0.8867 0.8737091 0.8842
0.8854909 0.8929 0.88281816 0.893
0.8921273 0.8992 0.8897091 0.8996
0.89574546 0.9034 0.8935818 0.903
0.89843637 0.905 0.89763635 0.9053
0.90056366 0.9066 0.89967275 0.9083
0.9021818 0.9078 0.9018546 0.9109
0.9038 0.9087 0.90425456 0.9112
0.9055455 0.9103 0.90590906 0.9129
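
One caveat about the comparison above: training=True is hardcoded, so the BN layers always normalize with per-batch statistics and the moving averages are never actually used at evaluation time. If the network also needs a proper inference mode, the usual wiring for tf.layers.batch_normalization mirrors the earlier tf.contrib.layers.batch_norm example (a sketch with a hypothetical is_training placeholder and a small dense network, not part of the script above):

import tensorflow as tf

x = tf.placeholder(tf.float32, (None, 784))
in_y = tf.placeholder(tf.float32, (None, 10))
is_training = tf.placeholder(tf.bool)

net = tf.layers.dense(x, 256, activation=tf.nn.relu)
net = tf.layers.batch_normalization(net, training=is_training)
out_y = tf.layers.dense(net, 10)

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=in_y, logits=out_y))

# tf.layers.batch_normalization registers its moving-average updates in
# tf.GraphKeys.UPDATE_OPS, so the training op must depend on them
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_opt = tf.train.AdamOptimizer(1e-4).minimize(loss)

# feed is_training=True during training steps and is_training=False for evaluation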