Notes on building neural network models with TensorFlow

Date: 2019-01-01 19:57:20


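Leaving the background aside: I recently implemented and trained a few models with TensorFlow for the first time, ran into some errors along the way, and am recording them here.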

Preface

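All the code mentioned below can be found on GitHub, in the repository https://github.com/spxcds/neural_network_code/
The code in this repository goes from the simplest logistic regression (LR), to fully connected neural networks, and then to convolutional neural networks. It starts with a hand-written cross-entropy loss and L2 regularization and later switches to calling library functions directly, moving from simple to complex. So far only MLR, MLP, LeNet-5, AlexNet, and VGG-16 are implemented.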

Network architectures

LeNet-5
[Figure: LeNet-5 network architecture]

AlexNet
[Figure: AlexNet network architecture]

Code implementation

A few important functions

Convolution

import tensorflow as tf  # TensorFlow 1.x API

def conv(self, input_tensor, name, kh, kw, dh, dw, n_output, padding='SAME'):
    # Number of input channels: the last dimension of an NHWC tensor
    n_input = input_tensor.get_shape()[-1].value

    # Kernel of height kh and width kw mapping n_input channels to n_output channels
    kernel = tf.get_variable(
        name=name + 'kernel',
        shape=[kh, kw, n_input, n_output],
        dtype=tf.float32,
        initializer=tf.truncated_normal_initializer(stddev=0.05))
    bias = tf.get_variable(
        name=name + 'bias', shape=[n_output], dtype=tf.float32, initializer=tf.constant_initializer(0.0))

    # Strides are (batch, height, width, channels); padding is 'SAME' or 'VALID'
    c = tf.nn.conv2d(input_tensor, kernel, (1, dh, dw, 1), padding=padding)
    return tf.nn.relu(tf.nn.bias_add(c, bias), name=name)
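
The `padding` argument decides the spatial size of the output. As a rough illustration (the helper name `conv_output_size` is made up here and is not part of the original repository), the following sketch applies the output-size rules that TensorFlow documents for `SAME` and `VALID` padding along one dimension:

```python
import math


def conv_output_size(in_size, kernel, stride, padding="SAME"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "SAME":
        # The input is zero-padded, so only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((in_size - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# A 5x5 kernel with stride 1 applied to a 28x28 MNIST image
same_size = conv_output_size(28, 5, 1, "SAME")
valid_size = conv_output_size(28, 5, 1, "VALID")
print(same_size, valid_size)
```

With `SAME` the 28-pixel side is preserved, while `VALID` shrinks it to 24, which matters when sizing the first fully connected layer that follows the convolutions.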

Fully connected layer

def fc(self, input_tensor, name, n_output):
    # Number of input features: the last dimension of the (flattened) input
    n_input = input_tensor.get_shape()[-1].value
    weights = tf.get_variable(
        name=name + 'weights',
        shape=[n_input, n_output],
        dtype=tf.float32,
        initializer=tf.truncated_normal_initializer(stddev=0.05))
    # Collect the L2 penalty of this layer; the regularization term is
    # l2_lambda * tf.add_n(tf.get_collection('losses'))
    tf.add_to_collection('losses', tf.nn.l2_loss(weights))
    bias = tf.get_variable(
        name=name + 'bias', shape=[n_output], dtype=tf.float32, initializer=tf.constant_initializer(0.0))

    # No activation here: the caller decides what to apply to the result
    return tf.nn.bias_add(tf.matmul(input_tensor, weights), bias)
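
The comment in `fc` hints at how the collected L2 terms are combined with the data loss. The NumPy sketch below mirrors that arithmetic outside TensorFlow; it relies on the fact that `tf.nn.l2_loss(t)` computes `sum(t ** 2) / 2`. The function name `total_loss` and the two weight matrices are hypothetical stand-ins, not code from the repository:

```python
import numpy as np


def l2_loss(t):
    """Same quantity as tf.nn.l2_loss: half the sum of squared entries."""
    return np.sum(np.square(t)) / 2.0


def total_loss(cross_entropy, weight_list, l2_lambda):
    """Data loss plus the weighted sum of the per-layer L2 penalties."""
    penalty = sum(l2_loss(w) for w in weight_list)
    return cross_entropy + l2_lambda * penalty


# Two hypothetical weight matrices standing in for the fc layers
w1 = np.ones((4, 3))        # l2_loss = 12 / 2 = 6
w2 = 2.0 * np.ones((3, 2))  # l2_loss = 24 / 2 = 12
loss = total_loss(cross_entropy=0.5, weight_list=[w1, w2], l2_lambda=1e-4)
print(loss)
```

Because the penalty grows with the number and size of the weights, `l2_lambda` has to be small enough that the penalty does not swamp the cross-entropy term.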

Cross-entropy

cost_cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(tf.clip_by_value(p, 1e-10, 1.0)), axis=1))
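
To see what the clipping in this expression buys, here is a small NumPy re-implementation of the same formula (a sketch for illustration only; the array values are made up). Without clipping, a predicted probability of exactly 0 yields `0 * log(0)`, which evaluates to NaN:

```python
import numpy as np


def cross_entropy(y, p, eps=1e-10):
    """Mean cross-entropy over a batch; y is one-hot, p holds probabilities."""
    p = np.clip(p, eps, 1.0)  # keeps log() away from log(0) = -inf
    return np.mean(-np.sum(y * np.log(p), axis=1))


y = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[1.0, 0.0], [0.5, 0.5]])  # the first row contains an exact zero

# Same formula without clipping: 0 * -inf produces nan
with np.errstate(divide="ignore", invalid="ignore"):
    unclipped = np.mean(-np.sum(y * np.log(p), axis=1))

clipped = cross_entropy(y, p)
print(unclipped, clipped)
```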

Plotting

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

def plot(self, save_path):
    df = pd.DataFrame(self.train_history, columns=['iterations', 'train_acc', 'val_acc', 'train_loss', 'val_loss'])

    # Loss curves
    fig = plt.figure(figsize=(20, 10))
    ax = fig.add_subplot(121)
    ax.grid(True)
    ax.plot(df.iterations, df.train_loss, 'k', label='Training loss', linewidth=1.2, alpha=0.4)
    ax.plot(df.iterations, df.val_loss, 'k--', label='Validation loss', linewidth=2)
    ax.legend(fontsize=16)
    ax.set_xlabel('Iterations', fontsize=16)
    ax.set_ylabel('Loss', fontsize=16)
    ax.set_xlim(np.min(df.iterations), np.max(df.iterations) + 0.1, auto=True)
    ax.tick_params(axis='both', which='major')
    ax.set_title('Loss curves', fontsize=22)

    # Confusion matrix (mnist is the dataset object loaded elsewhere in the script)
    fig_matrix_confusion = plt.figure(figsize=(10, 10))
    ax = fig_matrix_confusion.add_subplot(111)
    confusion_matrix = self.get_confusion_matrix(mnist.test.images, mnist.test.labels)
    sns.heatmap(
        confusion_matrix,
        fmt='',
        cmap=plt.cm.Greys,
        square=True,
        cbar=False,
        ax=ax,
        annot=True,
        xticklabels=np.arange(10),
        yticklabels=np.arange(10),
        annot_kws={'fontsize': 20})
    ax.set_xlabel('Predicted', fontsize=16)
    ax.set_ylabel('True', fontsize=16)
    ax.tick_params(labelsize=14)
    ax.set_title('Confusion matrix', fontsize=22)
    # plt.savefig writes the current figure, which here is the confusion matrix
    plt.savefig(save_path + '_confusion_matrix')
    plt.close()
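
The `get_confusion_matrix` method called above is not shown in the post. A minimal sketch of how such a matrix could be computed with NumPy follows; the function name, signature, and the assumption that the labels are one-hot encoded are all guesses for illustration, not the author's implementation:

```python
import numpy as np


def confusion_matrix(true_labels, predicted_labels, n_classes=10):
    """Count matrix: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly even when an index pair repeats
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix


# One-hot labels are turned into class ids with argmax first
one_hot = np.eye(3)[[0, 1, 2, 2]]
true_ids = np.argmax(one_hot, axis=1)
pred_ids = np.array([0, 2, 2, 2])
cm = confusion_matrix(true_ids, pred_ids, n_classes=3)
print(cm)
```

The row and column convention matches the axis labels used in the heatmap: the y axis is the true class and the x axis is the predicted class.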

Problems encountered

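The network loss barely converges
- The learning rate was set wrong; raising it slightly fixed this
- batch_size was set too large
- Use a more advanced optimizer. I originally used tf.train.GradientDescentOptimizer, which needed several thousand batches before showing any effect; after switching to tf.train.AdamOptimizer, the loss started to converge within a few dozen batches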
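
One reason an optimizer switch can matter this much is that Adam rescales each step by a running estimate of the gradient magnitude, so tiny gradients still produce steps of roughly `lr`. The toy comparison below is only a sketch under that assumption (a one-dimensional loss with made-up constants, written in plain Python); it is not a reproduction of the author's experiment:

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * 1e-3 * w**2 (tiny gradients)."""
    return 1e-3 * w


def run_gd(w, lr, steps):
    """Plain gradient descent: the step is proportional to the gradient."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def run_adam(w, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam update rule with bias-corrected moment estimates."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g       # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_gd = run_gd(1.0, lr=0.01, steps=200)
w_adam = run_adam(1.0, lr=0.01, steps=200)
print(w_gd, w_adam)
```

With the same learning rate, gradient descent barely moves away from the starting point, while Adam gets much closer to the minimum at 0.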
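After training for a while, the network loss becomes NaN
- Exploding gradients push training off its normal trajectory; lowering the learning rate fixes this
- The cross-entropy computation hits a log(0)*0 case; use tf.clip_by_value(t=value, clip_value_min=1e-8, clip_value_max=1.0) to avoid it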
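
The link between a large learning rate and NaN can be reproduced on a one-dimensional toy problem. In the sketch below (plain Python, made-up constants), gradient descent on `f(w) = w**2` shrinks `w` when the learning rate is small, but with a learning rate above 1 every step enlarges `w` until it overflows to infinity and then turns into NaN:

```python
import math


def gradient_descent(lr, steps=5000, w=1.0):
    """Minimise f(w) = w**2 with plain gradient descent; returns the final w."""
    for _ in range(steps):
        w = w - lr * 2.0 * w  # the gradient of w**2 is 2w
    return w


w_small_lr = gradient_descent(lr=0.1)  # each step multiplies w by 0.8
w_large_lr = gradient_descent(lr=1.1)  # each step multiplies w by -1.2
print(w_small_lr, math.isnan(w_large_lr))
```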
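Training and validation accuracy stay at around 0.1: the regularization parameter l2_lambda may be too large; try setting it to about 1e-4
Do not apply relu to the final fully connected (output) layer; apply softmax to its output directly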
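
A small NumPy example shows why a ReLU on the output layer hurts. ReLU clamps every negative logit to 0, so whenever all logits are negative the softmax receives identical inputs and returns a uniform distribution. The logit values here are made up for illustration:

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


logits = np.array([-3.0, -1.0, -2.0])    # raw output of the last fc layer
p_direct = softmax(logits)               # class 1 gets the highest probability
p_relu = softmax(np.maximum(logits, 0))  # ReLU turns every logit into 0
print(p_direct, p_relu)
```

Applied directly, softmax keeps the ranking of the logits; after ReLU the three classes become indistinguishable.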

Do not repost without permission. http://spxcds.com/2019/01/01/first_deep_learning/


Original article: https://www.cnblogs.com/spxcds/p/10205562.html
