A Movie Recommendation System Based on a Convolutional Neural Network (CNN)

Posted: 2019-04-09 14:03:02

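This project uses a text convolutional neural network and the MovieLens dataset to build a movie recommender.

Recommender systems are everywhere in everyday web applications: online shopping, online bookstores, news apps, social networks, music sites, movie sites, and so on. Wherever there are users, there are recommendations. They deliver personalized content based on a user's own preferences and on the habits of people with similar tastes. Open a news app, for example, and personalization means every user sees a different front page.

This is clearly useful. In today's information explosion there are many ways to obtain information, and people no longer spend most of their time figuring out where to find it; they spend it picking out what interests them from an overwhelming supply. This is the information overload problem, and recommender systems emerged to address it.

Collaborative filtering is one of the most widely used recommendation techniques. It collects users' history and preferences, computes the similarity between users, and uses the ratings of similar users to predict how much the target user will like a given item. Its strength is that it can recommend items the user has never browsed. Its weakness is the cold-start problem: a new user has no interaction history or preference data, so the model cannot find similar users or items.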
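
To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not part of this project's code: the toy ratings, the user names, and the choice of cosine similarity over co-rated items are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        num += sim * other_ratings[item]
        den += abs(sim)
    return num / den if den else None  # None: no usable neighbour (cold start)

# Toy ratings (hypothetical): user -> {movie: rating}
toy_ratings = {
    "alice": {"m1": 5, "m2": 3},
    "bob": {"m1": 5, "m2": 3, "m3": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
    "dave": {},  # new user with no history
}

alice_m3 = predict_rating(toy_ratings, "alice", "m3")  # leans towards bob's 4
dave_m3 = predict_rating(toy_ratings, "dave", "m3")    # None: cold start
```

The prediction for alice is pulled towards bob's rating because their histories agree, while dave, who has rated nothing, gets no prediction at all. That is the cold-start problem in miniature.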

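A common way to handle cold start is to ask newly registered users to pick the topics, groups, products, personality traits, music genres, and so on that interest them. Douban FM is one example:

[Figure: Douban FM]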

Downloading the dataset

Run the code below to download the dataset.

import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
from collections import Counter
import tensorflow as tf

import os
import pickle
import re
from tensorflow.python.ops import math_ops

from urllib.request import urlretrieve
from os.path import isfile, isdir
from tqdm import tqdm
import zipfile
import hashlib
import shutil  # used by download_extract to clean up after a failed extraction

def _unzip(save_path, _, database_name, data_path):
    """
    Unzip the archive.
    :param save_path: The path of the zip file
    :param database_name: Name of database
    :param data_path: Path to extract to
    :param _: HACK - Used to have the same interface as _ungzip
    """
    print('Extracting {}...'.format(database_name))
    with zipfile.ZipFile(save_path) as zf:
        zf.extractall(data_path)

def download_extract(database_name, data_path):
    """
    Download and extract the dataset.
    :param database_name: Database name
    :param data_path: Directory to download and extract into
    """
    DATASET_ML1M = 'ml-1m'

    if database_name == DATASET_ML1M:
        url = 'http://files.grouplens.org/datasets/movielens/ml-1m.zip'
        hash_code = 'c4d9eecfca2ab87c1945afe126590906'
        extract_path = os.path.join(data_path, 'ml-1m')
        save_path = os.path.join(data_path, 'ml-1m.zip')
        extract_fn = _unzip

    if os.path.exists(extract_path):
        print('Found {} Data'.format(database_name))
        return

    if not os.path.exists(data_path):
        os.makedirs(data_path)

    if not os.path.exists(save_path):
        with DLProgress(unit='B', unit_scale=True, miniters=1, desc='Downloading {}'.format(database_name)) as pbar:
            urlretrieve(
                url,
                save_path,
                pbar.hook)

    assert hashlib.md5(open(save_path, 'rb').read()).hexdigest() == hash_code, \
        '{} file is corrupted.  Remove the file and try again.'.format(save_path)

    os.makedirs(extract_path)
    try:
        extract_fn(save_path, extract_path, database_name, data_path)
    except Exception as err:
        shutil.rmtree(extract_path)  # Remove extraction folder if there is an error
        raise err

    print('Done.')
    # Remove compressed data
#     os.remove(save_path)

class DLProgress(tqdm):
    """
    Progress bar handler used while downloading.
    """
    last_block = 0

    def hook(self, block_num=1, block_size=1, total_size=None):
        """
        A hook function that will be called once on establishment of the network connection and
        once after each block read thereafter.
        :param block_num: A count of blocks transferred so far
        :param block_size: Block size in bytes
        :param total_size: The total size of the file. This may be -1 on older FTP servers which do not return
                            a file size in response to a retrieval request.
        """
        self.total = total_size
        self.update((block_num - self.last_block) * block_size)
        self.last_block = block_num

data_dir = './'
download_extract('ml-1m', data_dir)

Extracting ml-1m...
Done.
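
The integrity check in the download code reads the whole archive into memory before hashing it. A possible alternative, sketched below with the standard library only, hashes the file in chunks. The helper name md5_of_file and the chunk size are my own choices, not part of the original code.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Self-check on a small temporary file: chunked and one-shot hashes must agree
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'movielens')
    tmp_path = tmp.name
chunked = md5_of_file(tmp_path, chunk_size=4)
whole = hashlib.md5(b'movielens').hexdigest()
os.remove(tmp_path)
```

With such a helper, the assertion could compare md5_of_file(save_path) against hash_code instead, and the file handle is closed as soon as hashing finishes.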

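A first look at the data

This project uses the MovieLens 1M dataset, which contains about 1 million ratings from roughly 6,000 users on nearly 4,000 movies.

The dataset consists of three files:

User data: users.dat
Movie data: movies.dat
Rating data: ratings.dat

User data

User ID
Gender
Age
Occupation ID
Zip code

Format in the file: UserID::Gender::Age::Occupation::Zip-code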

Gender is denoted by a "M" for male and "F" for female
Age is chosen from the following ranges:
- 1: "Under 18"
- 18: "18-24"
- 25: "25-34"
- 35: "35-44"
- 45: "45-49"
- 50: "50-55"
- 56: "56+"
Occupation is chosen from the following choices:
- 0: "other" or not specified
- 1: "academic/educator"
- 2: "artist"
- 3: "clerical/admin"
- 4: "college/grad student"
- 5: "customer service"
- 6: "doctor/health care"
- 7: "executive/managerial"
- 8: "farmer"
- 9: "homemaker"
- 10: "K-12 student"
- 11: "lawyer"
- 12: "programmer"
- 13: "retired"
- 14: "sales/marketing"
- 15: "scientist"
- 16: "self-employed"
- 17: "technician/engineer"
- 18: "tradesman/craftsman"
- 19: "unemployed"
- 20: "writer"

users_title = ['UserID', 'Gender', 'Age', 'OccupationID', 'Zip-code']
users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
users.head()

	UserID	Gender	Age	OccupationID	Zip-code
0	1	F	1	10	48067
1	2	M	56	16	70072
2	3	M	25	15	55117
3	4	M	45	7	02460
4	5	M	25	20	55455

UserID, Gender, Age, and Occupation are all categorical fields. The Zip-code field is not used.

Movie data

Movie ID
Title
Genres

Format in the file: MovieID::Title::Genres

Titles are identical to titles provided by the IMDB (including
year of release)
Genres are pipe-separated and are selected from the following genres:
- Action
- Adventure
- Animation
- Children's
- Comedy
- Crime
- Documentary
- Drama
- Fantasy
- Film-Noir
- Horror
- Musical
- Mystery
- Romance
- Sci-Fi
- Thriller
- War
- Western

movies_title = ['MovieID', 'Title', 'Genres']
movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
movies.head()

	MovieID	Title	Genres
0	1	Toy Story (1995)	Animation|Children's|Comedy
1	2	Jumanji (1995)	Adventure|Children's|Fantasy
2	3	Grumpier Old Men (1995)	Comedy|Romance
3	4	Waiting to Exhale (1995)	Comedy|Drama
4	5	Father of the Bride Part II (1995)	Comedy

MovieID is a categorical field, Title is text, and Genres is also categorical.

Rating data

User ID
Movie ID
Rating
Timestamp

Format in the file: UserID::MovieID::Rating::Timestamp

UserIDs range between 1 and 6040
MovieIDs range between 1 and 3952
Ratings are made on a 5-star scale (whole-star ratings only)
Timestamp is represented in seconds since the epoch as returned by time(2)
Each user has at least 20 ratings

ratings_title = ['UserID','MovieID', 'Rating', 'timestamps']
ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
ratings.head()

	UserID	MovieID	Rating	timestamps
0	1	1193	5	978300760
1	1	661	3	978302109
2	1	914	3	978301968
3	1	3408	4	978300275
4	1	2355	5	978824291

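The Rating field is the target we want to learn. The timestamp field is not used.

Data preprocessing

UserID, Occupation, and MovieID are left unchanged.
Gender: convert 'F' and 'M' to 0 and 1.
Age: convert to 7 consecutive integers, 0 to 6.
Genres: a categorical field that must be converted to numbers. First build a dictionary from genre strings to integers, then convert each movie's Genres field to a list of integers, since some movies combine several genres.
Title: handled the same way as Genres. First build a dictionary from words to integers, then convert each title to a list of integers. The release year in the title is also removed.
Genres and Title must be padded to a uniform length so the neural network can process them easily. Empty positions are filled with the integer that corresponds to '<PAD>'.

Implementing the preprocessing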
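
Before the real implementation, the steps above can be sketched on a two-movie sample in plain Python. This is an illustration under assumptions, not the project's code: the helper names (strip_year, build_vocab, encode), the sample rows, and the padded lengths are invented here, and unlike load_data this sketch sorts the vocabulary and gives '<PAD>' the id 0 so the result is reproducible.

```python
import re

PAD = '<PAD>'

def strip_year(title):
    """'Toy Story (1995)' -> 'Toy Story'."""
    return re.sub(r'\s*\(\d{4}\)$', '', title)

def build_vocab(token_lists):
    """Map every distinct token (plus PAD) to an integer id."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    return {tok: i for i, tok in enumerate([PAD] + vocab)}

def encode(tokens, vocab, length):
    """Convert tokens to ids and right-pad with the PAD id."""
    ids = [vocab[t] for t in tokens][:length]
    return ids + [vocab[PAD]] * (length - len(ids))

# Categorical user fields: plain dictionary lookups
gender_map = {'F': 0, 'M': 1}
age_map = {age: i for i, age in enumerate([1, 18, 25, 35, 45, 50, 56])}

# Hypothetical two-movie sample in the movies.dat layout
rows = ['1::Toy Story (1995)::Animation|Comedy', '2::Jumanji (1995)::Adventure']
titles = [strip_year(r.split('::')[1]).split() for r in rows]
genres = [r.split('::')[2].split('|') for r in rows]

title_vocab = build_vocab(titles)
genre_vocab = build_vocab(genres)
encoded_titles = [encode(t, title_vocab, 4) for t in titles]
encoded_genres = [encode(g, genre_vocab, 3) for g in genres]
```

Here encoded_titles becomes [[3, 2, 0, 0], [1, 0, 0, 0]]: every title has the same length, and the trailing zeros are the '<PAD>' id.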

def load_data():
    """
    Load the dataset from files
    """
    # Read the user data
    users_title = ['UserID', 'Gender', 'Age', 'JobID', 'Zip-code']
    users = pd.read_table('./ml-1m/users.dat', sep='::', header=None, names=users_title, engine = 'python')
    users = users.filter(regex='UserID|Gender|Age|JobID')
    users_orig = users.values
    
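    # Re-encode gender and age in the user data (Age ids follow set iteration order, so they are consecutive but not sorted by age)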
    gender_map = {'F':0, 'M':1}
    users['Gender'] = users['Gender'].map(gender_map)

    age_map = {val:ii for ii,val in enumerate(set(users['Age']))}
    users['Age'] = users['Age'].map(age_map)

    # Read the movie data
    movies_title = ['MovieID', 'Title', 'Genres']
    movies = pd.read_table('./ml-1m/movies.dat', sep='::', header=None, names=movies_title, engine = 'python')
    movies_orig = movies.values
    # Remove the year from Title
    pattern = re.compile(r'^(.*)\((\d+)\)$')

    title_map = {val:pattern.match(val).group(1) for ii,val in enumerate(set(movies['Title']))}
    movies['Title'] = movies['Title'].map(title_map)

    # Dictionary from genre to integer
    genres_set = set()
    for val in movies['Genres'].str.split('|'):
        genres_set.update(val)

    genres_set.add('<PAD>')
    genres2int = {val:ii for ii, val in enumerate(genres_set)}

    # Convert genres to equal-length integer lists of length 18
    genres_map = {val:[genres2int[row] for row in val.split('|')] for ii,val in enumerate(set(movies['Genres']))}

    for key in genres_map:
        for cnt in range(max(genres2int.values()) - len(genres_map[key])):
            genres_map[key].insert(len(genres_map[key]) + cnt,genres2int['<PAD>'])
    
    movies['Genres'] = movies['Genres'].map(genres_map)

    # Dictionary from title word to integer
    title_set = set()
    for val in movies['Title'].str.split():
        title_set.update(val)
    
    title_set.add('<PAD>')
    title2int = {val:ii for ii, val in enumerate(title_set)}

    # Convert titles to equal-length integer lists of length 15
    title_count = 15
    title_map = {val:[title2int[row] for row in val.split()] for ii,val in enumerate(set(movies['Title']))}
    
    for key in title_map:
        for cnt in range(title_count - len(title_map[key])):
            title_map[key].insert(len(title_map[key]) + cnt,title2int['<PAD>'])
    
    movies['Title'] = movies['Title'].map(title_map)

    # Read the rating data
    ratings_title = ['UserID','MovieID', 'ratings', 'timestamps']
    ratings = pd.read_table('./ml-1m/ratings.dat', sep='::', header=None, names=ratings_title, engine = 'python')
    ratings = ratings.filter(regex='UserID|MovieID|ratings')

    # Merge the three tables
    data = pd.merge(pd.merge(ratings, users), movies)
    
    # Split the data into features X and targets y
    target_fields = ['ratings']
    features_pd, targets_pd = data.drop(target_fields, axis=1), data[target_fields]
    
    features = features_pd.values
    targets_values = targets_pd.values
    
    return title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig

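Loading the data and saving it locally

title_count: length of the Title field (15)
title_set: the set of words in the titles
genres2int: dictionary from genre to integer
features: the input X
targets_values: the learning target y
ratings: pandas object for the rating data
users: pandas object for the user data
movies: pandas object for the movie data
data: pandas object with the three datasets merged
movies_orig: raw movie data without preprocessing
users_orig: raw user data without preprocessing

# Load the data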
title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = load_data()

# Save to a file
pickle.dump((title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig), open('preprocess.p', 'wb'))

The preprocessed data

users.head()

	UserID	Gender	Age	JobID
0	1	0	0	10
1	2	1	5	16
2	3	1	6	15
3	4	1	2	7
4	5	1	6	20

movies.head()

	MovieID	Title	Genres
0	1	[310, 2184, 634, 634, 634, 634, 634, 634, 634,...	[0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
1	2	[1182, 634, 634, 634, 634, 634, 634, 634, 634,...	[3, 18, 8, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
2	3	[5011, 4744, 2629, 634, 634, 634, 634, 634, 63...	[7, 9, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
3	4	[4095, 1535, 1886, 634, 634, 634, 634, 634, 63...	[7, 5, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17,...
4	5	[3563, 1725, 3790, 3727, 838, 343, 634, 634, 6...	[7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17...

movies.values[0]

array([1,
       list([310, 2184, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634, 634]),
       list([0, 18, 7, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17])],
      dtype=object)

Reading the data back from disk

title_count, title_set, genres2int, features, targets_values, ratings, users, movies, data, movies_orig, users_orig = pickle.load(open('preprocess.p', mode='rb'))

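Model design

[Figure: model architecture]

Looking at the field types in the dataset, some are categorical. The usual approach is one-hot encoding, but fields such as UserID and MovieID would become extremely sparse and blow up the input dimension, which we want to avoid. My small laptop, after all, cannot handle inputs with hundreds of millions of dimensions the way big companies can :)

So during preprocessing these fields were converted to integers, which serve as indices into embedding matrices. The first layer of the network is an embedding layer, with matrices of shape (N, 32) and (N, 16).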
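
To see why an integer id can stand in for a one-hot vector, here is a minimal plain-Python sketch. The 4x3 matrix and the helper names are made up for illustration and are not taken from the project. Multiplying a one-hot row vector by the embedding matrix selects exactly one row, so indexing the row directly gives the same result without ever building the sparse vector.

```python
def one_hot(index, size):
    """Sparse representation: all zeros except a one at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def matvec(vec, matrix):
    """Row vector times matrix."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]

# Hypothetical embedding matrix: 4 ids, embedding dimension 3
embed_matrix = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

idx = 2
via_lookup = embed_matrix[idx]  # what an embedding lookup does
via_one_hot = matvec(one_hot(idx, len(embed_matrix)), embed_matrix)
```

Both routes return the third row of the matrix; the lookup simply skips the multiplications by zero.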

Movie genres need one extra step. A movie can have several genres, so the lookup returns an (n, 32) matrix, one row per genre. We sum the rows to obtain a single (1, 32) vector.
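
The summation can be sketched in a few lines of plain Python. The tiny 2-dimensional embedding table and the id layout are assumptions for illustration only. One detail worth noticing: because the genre lists are padded to a fixed length, the '<PAD>' row is added once for every padded position as well.

```python
def sum_embeddings(ids, embed_matrix):
    """Look up each id and add the vectors element-wise."""
    dim = len(embed_matrix[0])
    total = [0.0] * dim
    for i in ids:
        for d in range(dim):
            total[d] += embed_matrix[i][d]
    return total

# Hypothetical table: ids 0 and 1 are genres, id 2 plays the role of <PAD>
genre_embed = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
movie_genres = [0, 1, 2, 2]  # two real genres, two padded positions

genre_vector = sum_embeddings(movie_genres, genre_embed)
```

However many genres a movie has, the result is always one vector of the embedding dimension, which is what the following fully connected layer expects.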

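The movie title is handled differently: instead of a recurrent neural network it uses a text convolutional network, described below.

After the features are looked up in the embedding layers, each is passed through a fully connected layer, and the outputs are passed through a second fully connected layer, yielding two (1, 200) feature vectors: one for the user and one for the movie.

Our goal is to learn the user features and movie features, which are used later to make recommendations. Once we have the two feature vectors, the rating can be fitted in any way we like. I tried two. The first, shown in the figure above, multiplies the two feature vectors (a dot product) and regresses the result against the true rating with an MSE loss, since this is essentially a regression problem. The second feeds both feature vectors into another fully connected layer that outputs a single value, which is likewise regressed against the true rating with an MSE loss.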
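
The first approach boils down to a dot product followed by a mean squared error. Here is a toy version in plain Python, with invented 2-dimensional feature vectors standing in for the 200-dimensional ones:

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mse(predictions, targets):
    """Mean squared error over a batch."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Hypothetical batch of two (user, movie) pairs
user_features = [[1.0, 2.0], [0.5, 0.5]]
movie_features = [[2.0, 1.0], [4.0, 2.0]]
true_ratings = [5.0, 3.0]

predicted = [dot(u, m) for u, m in zip(user_features, movie_features)]
loss = mse(predicted, true_ratings)
```

The predictions are 4.0 and 3.0, so the loss is ((4 - 5)^2 + (3 - 3)^2) / 2 = 0.5. Training adjusts the embeddings and layer weights so that this number shrinks.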

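In practice, after 5 epochs the MSE loss of the second approach is around 0.8, and that of the first is around 1.

Text convolutional network

The network looks like this:

[Figure: text CNN architecture]

Figure from Yoon Kim's paper: Convolutional Neural Networks for Sentence Classification

For an introduction to applying convolutional neural networks to text, see Understanding Convolutional Neural Networks for NLP.

The first layer is a word embedding layer: an embedding matrix made up of the embedding vector of each word. The next layer convolves the embedding matrix with several kernels of different sizes (window sizes), where the window size is the number of words each convolution covers. This differs from image convolution, which typically uses 2x2, 3x3, or 5x5 kernels; a text kernel must span the full embedding vector of each word, so its shape is (number of words, embedding dimension), sliding over 3, 4, or 5 words at a time, for example. The third layer is max pooling, whose outputs are concatenated into one long vector. Finally dropout is applied for regularization, giving the feature vector of the movie title.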
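
As a rough illustration of one kernel, the plain-Python sketch below slides a (window size x embedding dimension) kernel along the word axis, applies ReLU, and keeps the largest response. The 4-word title, the 2-dimensional embeddings, and the kernel values are invented for the example.

```python
def conv_max_pool(embedded, kernel, bias=0.0):
    """Slide `kernel` (window_size x embed_dim) over the word axis,
    apply ReLU, and keep the maximum response (max-over-time pooling)."""
    window = len(kernel)
    responses = []
    for start in range(len(embedded) - window + 1):
        total = bias
        for k in range(window):
            total += sum(w * x for w, x in zip(kernel[k], embedded[start + k]))
        responses.append(max(total, 0.0))  # ReLU
    return max(responses), len(responses)

# Hypothetical title of 4 words with 2-dimensional embeddings
title = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # window size 2

feature, positions = conv_max_pool(title, kernel)
```

A window of 2 over 4 words gives 4 - 2 + 1 = 3 positions, and pooling reduces them to one number per kernel. In this project the titles have 15 words, so a window of size w yields 15 - w + 1 positions, and 4 window sizes with 8 filters each produce a 32-dimensional title feature.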

Helper functions

import tensorflow as tf
import os
import pickle

def save_params(params):
    """
    Save parameters to a file
    """
    with open('params.p', 'wb') as f:
        pickle.dump(params, f)


def load_params():
    """
    Load parameters from a file
    """
    with open('params.p', mode='rb') as f:
        return pickle.load(f)

Implementation

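# Dimension of the embedding vectors
embed_dim = 32
# Number of user IDs
uid_max = max(features.take(0,1)) + 1 # 6040 + 1 = 6041
# Number of genders
gender_max = max(features.take(2,1)) + 1 # 1 + 1 = 2
# Number of age categories
age_max = max(features.take(3,1)) + 1 # 6 + 1 = 7
# Number of occupations
job_max = max(features.take(4,1)) + 1 # 20 + 1 = 21

# Number of movie IDs
movie_id_max = max(features.take(1,1)) + 1 # 3952 + 1 = 3953
# Number of movie genres
movie_categories_max = max(genres2int.values()) + 1 # 18 + 1 = 19
# Number of distinct words in movie titles
movie_title_max = len(title_set) # 5216

# How to combine the genre embedding vectors; "mean" was considered but not implemented
combiner = "sum"

# Length of a movie title
sentences_size = title_count # = 15
# Window sizes for the text convolution: 2, 3, 4, and 5 words
window_sizes = {2, 3, 4, 5}
# Number of filters per window size
filter_num = 8

# Dictionary from movie ID to row index; the two differ in the dataset, e.g. the movie in row 5 does not necessarily have ID 5
movieid2idx = {val[0]:i for i, val in enumerate(movies.values)}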

Hyperparameters

# Number of Epochs
num_epochs = 5
# Batch Size
batch_size = 256

dropout_keep = 0.5
# Learning Rate
learning_rate = 0.0001
# Show stats for every n number of batches
show_every_n_batches = 20

save_dir = './save'

Inputs

Define the input placeholders

def get_inputs():
    uid = tf.placeholder(tf.int32, [None, 1], name="uid")
    user_gender = tf.placeholder(tf.int32, [None, 1], name="user_gender")
    user_age = tf.placeholder(tf.int32, [None, 1], name="user_age")
    user_job = tf.placeholder(tf.int32, [None, 1], name="user_job")
    
    movie_id = tf.placeholder(tf.int32, [None, 1], name="movie_id")
    movie_categories = tf.placeholder(tf.int32, [None, 18], name="movie_categories")
    movie_titles = tf.placeholder(tf.int32, [None, 15], name="movie_titles")
    targets = tf.placeholder(tf.int32, [None, 1], name="targets")
    LearningRate = tf.placeholder(tf.float32, name = "LearningRate")
    dropout_keep_prob = tf.placeholder(tf.float32, name = "dropout_keep_prob")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, LearningRate, dropout_keep_prob

Building the neural network

Define the user embedding matrices

def get_user_embedding(uid, user_gender, user_age, user_job):
    with tf.name_scope("user_embedding"):
        uid_embed_matrix = tf.Variable(tf.random_uniform([uid_max, embed_dim], -1, 1), name = "uid_embed_matrix")
        uid_embed_layer = tf.nn.embedding_lookup(uid_embed_matrix, uid, name = "uid_embed_layer")
    
        gender_embed_matrix = tf.Variable(tf.random_uniform([gender_max, embed_dim // 2], -1, 1), name= "gender_embed_matrix")
        gender_embed_layer = tf.nn.embedding_lookup(gender_embed_matrix, user_gender, name = "gender_embed_layer")
        
        age_embed_matrix = tf.Variable(tf.random_uniform([age_max, embed_dim // 2], -1, 1), name="age_embed_matrix")
        age_embed_layer = tf.nn.embedding_lookup(age_embed_matrix, user_age, name="age_embed_layer")
        
        job_embed_matrix = tf.Variable(tf.random_uniform([job_max, embed_dim // 2], -1, 1), name = "job_embed_matrix")
        job_embed_layer = tf.nn.embedding_lookup(job_embed_matrix, user_job, name = "job_embed_layer")
    return uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer

Fully connect the user embeddings to produce the user features

def get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer):
    with tf.name_scope("user_fc"):
        #First fully connected layer
        uid_fc_layer = tf.layers.dense(uid_embed_layer, embed_dim, name = "uid_fc_layer", activation=tf.nn.relu)
        gender_fc_layer = tf.layers.dense(gender_embed_layer, embed_dim, name = "gender_fc_layer", activation=tf.nn.relu)
        age_fc_layer = tf.layers.dense(age_embed_layer, embed_dim, name ="age_fc_layer", activation=tf.nn.relu)
        job_fc_layer = tf.layers.dense(job_embed_layer, embed_dim, name = "job_fc_layer", activation=tf.nn.relu)
        
        #Second fully connected layer
        user_combine_layer = tf.concat([uid_fc_layer, gender_fc_layer, age_fc_layer, job_fc_layer], 2)  #(?, 1, 128)
        user_combine_layer = tf.contrib.layers.fully_connected(user_combine_layer, 200, tf.tanh)  #(?, 1, 200)
    
        user_combine_layer_flat = tf.reshape(user_combine_layer, [-1, 200])
    return user_combine_layer, user_combine_layer_flat

Define the movie ID embedding matrix

def get_movie_id_embed_layer(movie_id):
    with tf.name_scope("movie_embedding"):
        movie_id_embed_matrix = tf.Variable(tf.random_uniform([movie_id_max, embed_dim], -1, 1), name = "movie_id_embed_matrix")
        movie_id_embed_layer = tf.nn.embedding_lookup(movie_id_embed_matrix, movie_id, name = "movie_id_embed_layer")
    return movie_id_embed_layer

Sum the embedding vectors of a movie's genres

def get_movie_categories_layers(movie_categories):
    with tf.name_scope("movie_categories_layers"):
        movie_categories_embed_matrix = tf.Variable(tf.random_uniform([movie_categories_max, embed_dim], -1, 1), name = "movie_categories_embed_matrix")
        movie_categories_embed_layer = tf.nn.embedding_lookup(movie_categories_embed_matrix, movie_categories, name = "movie_categories_embed_layer")
        if combiner == "sum":
            movie_categories_embed_layer = tf.reduce_sum(movie_categories_embed_layer, axis=1, keep_dims=True)
    #     elif combiner == "mean":

    return movie_categories_embed_layer

Text convolutional network for the movie title

def get_movie_cnn_layer(movie_titles):
    #Look up the embedding vector of each word in the movie title
    with tf.name_scope("movie_embedding"):
        movie_title_embed_matrix = tf.Variable(tf.random_uniform([movie_title_max, embed_dim], -1, 1), name = "movie_title_embed_matrix")
        movie_title_embed_layer = tf.nn.embedding_lookup(movie_title_embed_matrix, movie_titles, name = "movie_title_embed_layer")
        movie_title_embed_layer_expand = tf.expand_dims(movie_title_embed_layer, -1)
    
    #Convolve the text embedding layer with kernels of different sizes, then max-pool
    pool_layer_lst = []
    for window_size in window_sizes:
        with tf.name_scope("movie_txt_conv_maxpool_{}".format(window_size)):
            filter_weights = tf.Variable(tf.truncated_normal([window_size, embed_dim, 1, filter_num],stddev=0.1),name = "filter_weights")
            filter_bias = tf.Variable(tf.constant(0.1, shape=[filter_num]), name="filter_bias")
            
            conv_layer = tf.nn.conv2d(movie_title_embed_layer_expand, filter_weights, [1,1,1,1], padding="VALID", name="conv_layer")
            relu_layer = tf.nn.relu(tf.nn.bias_add(conv_layer,filter_bias), name ="relu_layer")
            
            maxpool_layer = tf.nn.max_pool(relu_layer, [1,sentences_size - window_size + 1 ,1,1], [1,1,1,1], padding="VALID", name="maxpool_layer")
            pool_layer_lst.append(maxpool_layer)

    #Dropout layer
    with tf.name_scope("pool_dropout"):
        pool_layer = tf.concat(pool_layer_lst, 3, name ="pool_layer")
        max_num = len(window_sizes) * filter_num
        pool_layer_flat = tf.reshape(pool_layer , [-1, 1, max_num], name = "pool_layer_flat")
    
        dropout_layer = tf.nn.dropout(pool_layer_flat, dropout_keep_prob, name = "dropout_layer")
    return pool_layer_flat, dropout_layer

Fully connect the movie layers together

def get_movie_feature_layer(movie_id_embed_layer, movie_categories_embed_layer, dropout_layer):
    with tf.name_scope("movie_fc"):
        #First fully connected layer
        movie_id_fc_layer = tf.layers.dense(movie_id_embed_layer, embed_dim, name = "movie_id_fc_layer", activation=tf.nn.relu)
        movie_categories_fc_layer = tf.layers.dense(movie_categories_embed_layer, embed_dim, name = "movie_categories_fc_layer", activation=tf.nn.relu)
    
        #Second fully connected layer
        movie_combine_layer = tf.concat([movie_id_fc_layer, movie_categories_fc_layer, dropout_layer], 2)  #(?, 1, 96)
        movie_combine_layer = tf.contrib.layers.fully_connected(movie_combine_layer, 200, tf.tanh)  #(?, 1, 200)
    
        movie_combine_layer_flat = tf.reshape(movie_combine_layer, [-1, 200])
    return movie_combine_layer, movie_combine_layer_flat

Building the computation graph

tf.reset_default_graph()
train_graph = tf.Graph()
with train_graph.as_default():
    #Get the input placeholders
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob = get_inputs()
    #Get the 4 user embedding vectors
    uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer = get_user_embedding(uid, user_gender, user_age, user_job)
    #Build the user features
    user_combine_layer, user_combine_layer_flat = get_user_feature_layer(uid_embed_layer, gender_embed_layer, age_embed_layer, job_embed_layer)
    #Get the movie ID embedding vector
    movie_id_embed_layer = get_movie_id_embed_layer(movie_id)
    #Get the movie genre embedding vector
    movie_categories_embed_layer = get_movie_categories_layers(movie_categories)
    #Get the movie title feature vector
    pool_layer_flat, dropout_layer = get_movie_cnn_layer(movie_titles)
    #Build the movie features
    movie_combine_layer, movie_combine_layer_flat = get_movie_feature_layer(movie_id_embed_layer,
                                                                            movie_categories_embed_layer,
                                                                            dropout_layer)
    #Compute the rating. Note that the two approaches give inference a different name; the tensor is fetched by name later when making recommendations
    with tf.name_scope("inference"):
        #Approach that feeds the user and movie features through a fully connected layer and outputs a single value
#         inference_layer = tf.concat([user_combine_layer_flat, movie_combine_layer_flat], 1)  #(?, 400)
#         inference = tf.layers.dense(inference_layer, 1,
#                                     kernel_initializer=tf.truncated_normal_initializer(stddev=0.01), 
#                                     kernel_regularizer=tf.nn.l2_loss, name="inference")
        #Simply multiply the user and movie feature vectors (row-wise dot product) to get a predicted rating
#        inference = tf.matmul(user_combine_layer_flat, tf.transpose(movie_combine_layer_flat))
        inference = tf.reduce_sum(user_combine_layer_flat * movie_combine_layer_flat, axis=1)
        inference = tf.expand_dims(inference, axis=1)

    with tf.name_scope("loss"):
        # MSE loss: regress the computed value onto the rating
        cost = tf.losses.mean_squared_error(targets, inference )
        loss = tf.reduce_mean(cost)
    # Optimize the loss
#     train_op = tf.train.AdamOptimizer(lr).minimize(loss)  #cost
    global_step = tf.Variable(0, name="global_step", trainable=False)
    optimizer = tf.train.AdamOptimizer(lr)
    gradients = optimizer.compute_gradients(loss)  #cost
    train_op = optimizer.apply_gradients(gradients, global_step=global_step)

WARNING:tensorflow:From <ipython-input-20-559a1ee9ce9e>:6: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead

inference

<tf.Tensor 'inference/ExpandDims:0' shape=(?, 1) dtype=float32>

Getting batches

def get_batches(Xs, ys, batch_size):
    for start in range(0, len(Xs), batch_size):
        end = min(start + batch_size, len(Xs))
        yield Xs[start:end], ys[start:end]

Training the network

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import time
import datetime

losses = {'train':[], 'test':[]}

with tf.Session(graph=train_graph) as sess:
    
    #Collect data for TensorBoard
    # Keep track of gradient values and sparsity
    grad_summaries = []
    for g, v in gradients:
        if g is not None:
            grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name.replace(':', '_')), g)
            sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name.replace(':', '_')), tf.nn.zero_fraction(g))
            grad_summaries.append(grad_hist_summary)
            grad_summaries.append(sparsity_summary)
    grad_summaries_merged = tf.summary.merge(grad_summaries)
        
    # Output directory for models and summaries
    timestamp = str(int(time.time()))
    out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
    print("Writing to {}\n".format(out_dir))
     
    # Summaries for loss and accuracy
    loss_summary = tf.summary.scalar("loss", loss)

    # Train Summaries
    train_summary_op = tf.summary.merge([loss_summary, grad_summaries_merged])
    train_summary_dir = os.path.join(out_dir, "summaries", "train")
    train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

    # Inference summaries
    inference_summary_op = tf.summary.merge([loss_summary])
    inference_summary_dir = os.path.join(out_dir, "summaries", "inference")
    inference_summary_writer = tf.summary.FileWriter(inference_summary_dir, sess.graph)

    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    for epoch_i in range(num_epochs):
        
        #Split the dataset into training and test sets; random_state is fixed at 0, so every epoch uses the same split
        train_X,test_X, train_y, test_y = train_test_split(features,  
                                                           targets_values,  
                                                           test_size = 0.2,  
                                                           random_state = 0)  
        
        train_batches = get_batches(train_X, train_y, batch_size)
        test_batches = get_batches(test_X, test_y, batch_size)
    
        #Training iterations; record the training loss
        for batch_i in range(len(train_X) // batch_size):
            x, y = next(train_batches)

            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]

            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: dropout_keep, #dropout_keep
                lr: learning_rate}

            step, train_loss, summaries, _ = sess.run([global_step, loss, train_summary_op, train_op], feed)  #cost
            losses['train'].append(train_loss)
            train_summary_writer.add_summary(summaries, step)  #
            
            # Show every <show_every_n_batches> batches
            if (epoch_i * (len(train_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                time_str = datetime.datetime.now().isoformat()
                print('{}: Epoch {:>3} Batch {:>4}/{}   train_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(train_X) // batch_size),
                    train_loss))
                
        #Iterations over the test data
        for batch_i  in range(len(test_X) // batch_size):
            x, y = next(test_batches)
            
            categories = np.zeros([batch_size, 18])
            for i in range(batch_size):
                categories[i] = x.take(6,1)[i]

            titles = np.zeros([batch_size, sentences_size])
            for i in range(batch_size):
                titles[i] = x.take(5,1)[i]

            feed = {
                uid: np.reshape(x.take(0,1), [batch_size, 1]),
                user_gender: np.reshape(x.take(2,1), [batch_size, 1]),
                user_age: np.reshape(x.take(3,1), [batch_size, 1]),
                user_job: np.reshape(x.take(4,1), [batch_size, 1]),
                movie_id: np.reshape(x.take(1,1), [batch_size, 1]),
                movie_categories: categories,  #x.take(6,1)
                movie_titles: titles,  #x.take(5,1)
                targets: np.reshape(y, [batch_size, 1]),
                dropout_keep_prob: 1,
                lr: learning_rate}
            
            step, test_loss, summaries = sess.run([global_step, loss, inference_summary_op], feed)  #cost

            #Record the test loss
            losses['test'].append(test_loss)
            inference_summary_writer.add_summary(summaries, step)  #

            time_str = datetime.datetime.now().isoformat()
            if (epoch_i * (len(test_X) // batch_size) + batch_i) % show_every_n_batches == 0:
                print('{}: Epoch {:>3} Batch {:>4}/{}   test_loss = {:.3f}'.format(
                    time_str,
                    epoch_i,
                    batch_i,
                    (len(test_X) // batch_size),
                    test_loss))

    # Save Model
    saver.save(sess, save_dir)  #, global_step=epoch_i
    print('Model Trained and Saved')

Writing to F:\jupyter\work\movie_recommender-master\runs\1554780412

2019-04-09T11:26:53.633627: Epoch   0 Batch    0/3125   train_loss = 8.810
2019-04-09T11:26:54.052240: Epoch   0 Batch   20/3125   train_loss = 3.457
2019-04-09T11:26:54.466181: Epoch   0 Batch   40/3125   train_loss = 2.563
2019-04-09T11:26:54.890814: Epoch   0 Batch   60/3125   train_loss = 1.962
2019-04-09T11:26:55.315803: Epoch   0 Batch   80/3125   train_loss = 1.852
2019-04-09T11:26:55.730125: Epoch   0 Batch  100/3125   train_loss = 1.826
2019-04-09T11:26:56.146734: Epoch   0 Batch  120/3125   train_loss = 1.781
2019-04-09T11:26:56.559145: Epoch   0 Batch  140/3125   train_loss = 1.630
2019-04-09T11:26:56.971689: Epoch   0 Batch  160/3125   train_loss = 1.652
2019-04-09T11:26:57.394125: Epoch   0 Batch  180/3125   train_loss = 1.361
2019-04-09T11:26:57.810824: Epoch   0 Batch  200/3125   train_loss = 1.715
2019-04-09T11:26:58.227455: Epoch   0 Batch  220/3125   train_loss = 1.430
2019-04-09T11:26:58.643714: Epoch   0 Batch  240/3125   train_loss = 1.342
2019-04-09T11:26:59.056816: Epoch   0 Batch  260/3125   train_loss = 1.512
2019-04-09T11:26:59.468409: Epoch   0 Batch  280/3125   train_loss = 1.678
2019-04-09T11:26:59.882126: Epoch   0 Batch  300/3125   train_loss = 1.482
2019-04-09T11:27:00.294685: Epoch   0 Batch  320/3125   train_loss = 1.463
2019-04-09T11:27:00.826546: Epoch   0 Batch  340/3125   train_loss = 1.333
2019-04-09T11:27:01.239302: Epoch   0 Batch  360/3125   train_loss = 1.318
2019-04-09T11:27:01.652219: Epoch   0 Batch  380/3125   train_loss = 1.253
2019-04-09T11:27:02.067588: Epoch   0 Batch  400/3125   train_loss = 1.155
2019-04-09T11:27:02.483490: Epoch   0 Batch  420/3125   train_loss = 1.341
2019-04-09T11:27:02.892079: Epoch   0 Batch  440/3125   train_loss = 1.429
2019-04-09T11:27:03.305331: Epoch   0 Batch  460/3125   train_loss = 1.315
2019-04-09T11:27:03.721028: Epoch   0 Batch  480/3125   train_loss = 1.351
2019-04-09T11:27:04.130622: Epoch   0 Batch  500/3125   train_loss = 1.043
2019-04-09T11:27:04.549775: Epoch   0 Batch  520/3125   train_loss = 1.340
2019-04-09T11:27:04.963936: Epoch   0 Batch  540/3125   train_loss = 1.258
2019-04-09T11:27:05.378772: Epoch   0 Batch  560/3125   train_loss = 1.474
2019-04-09T11:27:05.790245: Epoch   0 Batch  580/3125   train_loss = 1.399
2019-04-09T11:27:06.202342: Epoch   0 Batch  600/3125   train_loss = 1.374
2019-04-09T11:27:06.616239: Epoch   0 Batch  620/3125   train_loss = 1.429
2019-04-09T11:27:07.027259: Epoch   0 Batch  640/3125   train_loss = 1.346
2019-04-09T11:27:07.443480: Epoch   0 Batch  660/3125   train_loss = 1.377
2019-04-09T11:27:07.857450: Epoch   0 Batch  680/3125   train_loss = 1.191
2019-04-09T11:27:08.269326: Epoch   0 Batch  700/3125   train_loss = 1.302
2019-04-09T11:27:08.685203: Epoch   0 Batch  720/3125   train_loss = 1.171
2019-04-09T11:27:09.098769: Epoch   0 Batch  740/3125   train_loss = 1.403
2019-04-09T11:27:09.519383: Epoch   0 Batch  760/3125   train_loss = 1.369
2019-04-09T11:27:09.931100: Epoch   0 Batch  780/3125   train_loss = 1.402
2019-04-09T11:27:10.343018: Epoch   0 Batch  800/3125   train_loss = 1.250
2019-04-09T11:27:10.755994: Epoch   0 Batch  820/3125   train_loss = 1.292
2019-04-09T11:27:11.169596: Epoch   0 Batch  840/3125   train_loss = 1.215
2019-04-09T11:27:11.583017: Epoch   0 Batch  860/3125   train_loss = 1.201
2019-04-09T11:27:11.997121: Epoch   0 Batch  880/3125   train_loss = 1.189
2019-04-09T11:27:12.411392: Epoch   0 Batch  900/3125   train_loss = 1.240
2019-04-09T11:27:12.824492: Epoch   0 Batch  920/3125   train_loss = 1.220
2019-04-09T11:27:13.238173: Epoch   0 Batch  940/3125   train_loss = 1.414
2019-04-09T11:27:13.649014: Epoch   0 Batch  960/3125   train_loss = 1.332
2019-04-09T11:27:14.058947: Epoch   0 Batch  980/3125   train_loss = 1.345
2019-04-09T11:27:14.491861: Epoch   0 Batch 1000/3125   train_loss = 1.275
2019-04-09T11:27:14.920000: Epoch   0 Batch 1020/3125   train_loss = 1.341
2019-04-09T11:27:15.337096: Epoch   0 Batch 1040/3125   train_loss = 1.281
2019-04-09T11:27:15.760618: Epoch   0 Batch 1060/3125   train_loss = 1.478
2019-04-09T11:27:16.174406: Epoch   0 Batch 1080/3125   train_loss = 1.158
2019-04-09T11:27:16.591839: Epoch   0 Batch 1100/3125   train_loss = 1.268
2019-04-09T11:27:17.013498: Epoch   0 Batch 1120/3125   train_loss = 1.270
2019-04-09T11:27:17.438626: Epoch   0 Batch 1140/3125   train_loss = 1.280
2019-04-09T11:27:17.852226: Epoch   0 Batch 1160/3125   train_loss = 1.205
2019-04-09T11:27:18.273478: Epoch   0 Batch 1180/3125   train_loss = 1.274
2019-04-09T11:27:18.696339: Epoch   0 Batch 1200/3125   train_loss = 1.284
2019-04-09T11:27:19.117179: Epoch   0 Batch 1220/3125   train_loss = 1.155
2019-04-09T11:27:19.524543: Epoch   0 Batch 1240/3125   train_loss = 1.143
2019-04-09T11:27:19.938738: Epoch   0 Batch 1260/3125   train_loss = 1.247
2019-04-09T11:27:20.350656: Epoch   0 Batch 1280/3125   train_loss = 1.223
2019-04-09T11:27:20.761388: Epoch   0 Batch 1300/3125   train_loss = 1.267
2019-04-09T11:27:21.177496: Epoch   0 Batch 1320/3125   train_loss = 1.183
2019-04-09T11:27:21.590091: Epoch   0 Batch 1340/3125   train_loss = 1.047
2019-04-09T11:27:22.004788: Epoch   0 Batch 1360/3125   train_loss = 1.149
2019-04-09T11:27:22.414416: Epoch   0 Batch 1380/3125   train_loss = 1.114
2019-04-09T11:27:22.827015: Epoch   0 Batch 1400/3125   train_loss = 1.282
2019-04-09T11:27:23.236719: Epoch   0 Batch 1420/3125   train_loss = 1.256
2019-04-09T11:27:23.645758: Epoch   0 Batch 1440/3125   train_loss = 1.174
2019-04-09T11:27:24.063386: Epoch   0 Batch 1460/3125   train_loss = 1.251
2019-04-09T11:27:24.477184: Epoch   0 Batch 1480/3125   train_loss = 1.180
2019-04-09T11:27:24.890286: Epoch   0 Batch 1500/3125   train_loss = 1.322
2019-04-09T11:27:25.300422: Epoch   0 Batch 1520/3125   train_loss = 1.277
2019-04-09T11:27:25.709640: Epoch   0 Batch 1540/3125   train_loss = 1.270
2019-04-09T11:27:26.122241: Epoch   0 Batch 1560/3125   train_loss = 1.122
2019-04-09T11:27:26.534862: Epoch   0 Batch 1580/3125   train_loss = 1.138
2019-04-09T11:27:26.947461: Epoch   0 Batch 1600/3125   train_loss = 1.274
2019-04-09T11:27:27.359900: Epoch   0 Batch 1620/3125   train_loss = 1.169
2019-04-09T11:27:27.769969: Epoch   0 Batch 1640/3125   train_loss = 1.235
2019-04-09T11:27:28.180519: Epoch   0 Batch 1660/3125   train_loss = 1.282
2019-04-09T11:27:28.592653: Epoch   0 Batch 1680/3125   train_loss = 1.174
2019-04-09T11:27:29.003519: Epoch   0 Batch 1700/3125   train_loss = 1.009
2019-04-09T11:27:29.414262: Epoch   0 Batch 1720/3125   train_loss = 1.149
2019-04-09T11:27:29.828869: Epoch   0 Batch 1740/3125   train_loss = 1.221
2019-04-09T11:27:30.238773: Epoch   0 Batch 1760/3125   train_loss = 1.288
2019-04-09T11:27:30.648342: Epoch   0 Batch 1780/3125   train_loss = 1.067
2019-04-09T11:27:31.188925: Epoch   0 Batch 1800/3125   train_loss = 1.196
2019-04-09T11:27:31.603231: Epoch   0 Batch 1820/3125   train_loss = 1.142
2019-04-09T11:27:32.010926: Epoch   0 Batch 1840/3125   train_loss = 1.256
2019-04-09T11:27:32.425741: Epoch   0 Batch 1860/3125   train_loss = 1.345
2019-04-09T11:27:32.839345: Epoch   0 Batch 1880/3125   train_loss = 1.215
2019-04-09T11:27:33.248900: Epoch   0 Batch 1900/3125   train_loss = 1.048
2019-04-09T11:27:33.663116: Epoch   0 Batch 1920/3125   train_loss = 1.211
2019-04-09T11:27:34.074400: Epoch   0 Batch 1940/3125   train_loss = 1.070
2019-04-09T11:27:34.484302: Epoch   0 Batch 1960/3125   train_loss = 1.131
2019-04-09T11:27:34.894396: Epoch   0 Batch 1980/3125   train_loss = 1.196
2019-04-09T11:27:35.306864: Epoch   0 Batch 2000/3125   train_loss = 1.347
2019-04-09T11:27:35.722043: Epoch   0 Batch 2020/3125   train_loss = 1.297
2019-04-09T11:27:36.135143: Epoch   0 Batch 2040/3125   train_loss = 1.180
2019-04-09T11:27:36.543475: Epoch   0 Batch 2060/3125   train_loss = 1.025
2019-04-09T11:27:36.953066: Epoch   0 Batch 2080/3125   train_loss = 1.265
2019-04-09T11:27:37.370478: Epoch   0 Batch 2100/3125   train_loss = 1.094
2019-04-09T11:27:37.782974: Epoch   0 Batch 2120/3125   train_loss = 1.069
2019-04-09T11:27:38.190560: Epoch   0 Batch 2140/3125   train_loss = 1.132
2019-04-09T11:27:38.604746: Epoch   0 Batch 2160/3125   train_loss = 1.122
2019-04-09T11:27:39.019245: Epoch   0 Batch 2180/3125   train_loss = 1.166
2019-04-09T11:27:39.431946: Epoch   0 Batch 2200/3125   train_loss = 1.137
2019-04-09T11:27:39.847258: Epoch   0 Batch 2220/3125   train_loss = 1.118
2019-04-09T11:27:40.256398: Epoch   0 Batch 2240/3125   train_loss = 1.011
2019-04-09T11:27:40.665478: Epoch   0 Batch 2260/3125   train_loss = 1.160
2019-04-09T11:27:41.078758: Epoch   0 Batch 2280/3125   train_loss = 1.164
2019-04-09T11:27:41.489744: Epoch   0 Batch 2300/3125   train_loss = 1.163
2019-04-09T11:27:41.901845: Epoch   0 Batch 2320/3125   train_loss = 1.288
2019-04-09T11:27:42.312713: Epoch   0 Batch 2340/3125   train_loss = 1.177
2019-04-09T11:27:42.725320: Epoch   0 Batch 2360/3125   train_loss = 1.130
2019-04-09T11:27:43.132848: Epoch   0 Batch 2380/3125   train_loss = 1.163
2019-04-09T11:27:43.541373: Epoch   0 Batch 2400/3125   train_loss = 1.231
2019-04-09T11:27:43.947189: Epoch   0 Batch 2420/3125   train_loss = 1.133
2019-04-09T11:27:44.355782: Epoch   0 Batch 2440/3125   train_loss = 1.272
2019-04-09T11:27:44.768420: Epoch   0 Batch 2460/3125   train_loss = 1.128
2019-04-09T11:27:45.177740: Epoch   0 Batch 2480/3125   train_loss = 1.184
2019-04-09T11:27:45.584471: Epoch   0 Batch 2500/3125   train_loss = 1.161
2019-04-09T11:27:45.993960: Epoch   0 Batch 2520/3125   train_loss = 1.055
2019-04-09T11:27:46.402164: Epoch   0 Batch 2540/3125   train_loss = 1.108
2019-04-09T11:27:46.812056: Epoch   0 Batch 2560/3125   train_loss = 0.977
2019-04-09T11:27:47.230169: Epoch   0 Batch 2580/3125   train_loss = 1.101
2019-04-09T11:27:47.639261: Epoch   0 Batch 2600/3125   train_loss = 1.141
2019-04-09T11:27:48.047294: Epoch   0 Batch 2620/3125   train_loss = 1.098
2019-04-09T11:27:48.457188: Epoch   0 Batch 2640/3125   train_loss = 1.096
2019-04-09T11:27:48.870683: Epoch   0 Batch 2660/3125   train_loss = 1.241
2019-04-09T11:27:49.282413: Epoch   0 Batch 2680/3125   train_loss = 1.001
2019-04-09T11:27:49.690957: Epoch   0 Batch 2700/3125   train_loss = 1.266
2019-04-09T11:27:50.103555: Epoch   0 Batch 2720/3125   train_loss = 1.158
2019-04-09T11:27:50.514897: Epoch   0 Batch 2740/3125   train_loss = 1.210
2019-04-09T11:27:50.924909: Epoch   0 Batch 2760/3125   train_loss = 1.234
2019-04-09T11:27:51.336251: Epoch   0 Batch 2780/3125   train_loss = 1.121
2019-04-09T11:27:51.748175: Epoch   0 Batch 2800/3125   train_loss = 1.377
2019-04-09T11:27:52.164028: Epoch   0 Batch 2820/3125   train_loss = 1.417
2019-04-09T11:27:52.583020: Epoch   0 Batch 2840/3125   train_loss = 1.146
2019-04-09T11:27:53.001214: Epoch   0 Batch 2860/3125   train_loss = 1.067
2019-04-09T11:27:53.413084: Epoch   0 Batch 2880/3125   train_loss = 1.160
2019-04-09T11:27:53.830194: Epoch   0 Batch 2900/3125   train_loss = 1.134
2019-04-09T11:27:54.242290: Epoch   0 Batch 2920/3125   train_loss = 1.188
2019-04-09T11:27:54.657395: Epoch   0 Batch 2940/3125   train_loss = 1.103
2019-04-09T11:27:55.066253: Epoch   0 Batch 2960/3125   train_loss = 1.222
2019-04-09T11:27:55.476481: Epoch   0 Batch 2980/3125   train_loss = 1.197
2019-04-09T11:27:55.891054: Epoch   0 Batch 3000/3125   train_loss = 1.123
2019-04-09T11:27:56.299092: Epoch   0 Batch 3020/3125   train_loss = 1.213
2019-04-09T11:27:56.709737: Epoch   0 Batch 3040/3125   train_loss = 1.128
2019-04-09T11:27:57.121834: Epoch   0 Batch 3060/3125   train_loss = 1.174
2019-04-09T11:27:57.537893: Epoch   0 Batch 3080/3125   train_loss = 1.253
2019-04-09T11:27:57.945981: Epoch   0 Batch 3100/3125   train_loss = 1.169
2019-04-09T11:27:58.355315: Epoch   0 Batch 3120/3125   train_loss = 1.011
2019-04-09T11:27:58.525868: Epoch   0 Batch    0/781   test_loss = 1.003
2019-04-09T11:27:58.655211: Epoch   0 Batch   20/781   test_loss = 1.118
2019-04-09T11:27:58.785057: Epoch   0 Batch   40/781   test_loss = 0.975
2019-04-09T11:27:58.914903: Epoch   0 Batch   60/781   test_loss = 1.317
2019-04-09T11:27:59.043746: Epoch   0 Batch   80/781   test_loss = 1.261
2019-04-09T11:27:59.172589: Epoch   0 Batch  100/781   test_loss = 1.333
2019-04-09T11:27:59.301431: Epoch   0 Batch  120/781   test_loss = 1.186
2019-04-09T11:27:59.429434: Epoch   0 Batch  140/781   test_loss = 1.192
2019-04-09T11:27:59.557775: Epoch   0 Batch  160/781   test_loss = 1.259
2019-04-09T11:27:59.685114: Epoch   0 Batch  180/781   test_loss = 1.189
2019-04-09T11:27:59.813455: Epoch   0 Batch  200/781   test_loss = 1.093
2019-04-09T11:27:59.939791: Epoch   0 Batch  220/781   test_loss = 0.963
2019-04-09T11:28:00.066629: Epoch   0 Batch  240/781   test_loss = 1.173
2019-04-09T11:28:00.194468: Epoch   0 Batch  260/781   test_loss = 1.160
2019-04-09T11:28:00.321306: Epoch   0 Batch  280/781   test_loss = 1.354
2019-04-09T11:28:00.448551: Epoch   0 Batch  300/781   test_loss = 1.140
2019-04-09T11:28:00.576892: Epoch   0 Batch  320/781   test_loss = 1.270
2019-04-09T11:28:00.705735: Epoch   0 Batch  340/781   test_loss = 0.836
2019-04-09T11:28:00.832572: Epoch   0 Batch  360/781   test_loss = 1.297
2019-04-09T11:28:00.961415: Epoch   0 Batch  380/781   test_loss = 1.141
2019-04-09T11:28:01.090257: Epoch   0 Batch  400/781   test_loss = 1.135
2019-04-09T11:28:01.217095: Epoch   0 Batch  420/781   test_loss = 0.986
2019-04-09T11:28:01.344936: Epoch   0 Batch  440/781   test_loss = 1.153
2019-04-09T11:28:01.472184: Epoch   0 Batch  460/781   test_loss = 1.084
2019-04-09T11:28:01.599021: Epoch   0 Batch  480/781   test_loss = 1.101
2019-04-09T11:28:01.726862: Epoch   0 Batch  500/781   test_loss = 0.917
2019-04-09T11:28:01.854702: Epoch   0 Batch  520/781   test_loss = 1.127
2019-04-09T11:28:01.980536: Epoch   0 Batch  540/781   test_loss = 1.025
2019-04-09T11:28:02.108377: Epoch   0 Batch  560/781   test_loss = 1.267
2019-04-09T11:28:02.235214: Epoch   0 Batch  580/781   test_loss = 1.131
2019-04-09T11:28:02.362552: Epoch   0 Batch  600/781   test_loss = 1.179
2019-04-09T11:28:02.490387: Epoch   0 Batch  620/781   test_loss = 1.140
2019-04-09T11:28:02.617224: Epoch   0 Batch  640/781   test_loss = 1.194
2019-04-09T11:28:02.744563: Epoch   0 Batch  660/781   test_loss = 1.135
2019-04-09T11:28:02.875411: Epoch   0 Batch  680/781   test_loss = 1.403
2019-04-09T11:28:03.002248: Epoch   0 Batch  700/781   test_loss = 1.109
2019-04-09T11:28:03.130089: Epoch   0 Batch  720/781   test_loss = 1.243
2019-04-09T11:28:03.256926: Epoch   0 Batch  740/781   test_loss = 1.118
2019-04-09T11:28:03.383769: Epoch   0 Batch  760/781   test_loss = 1.098
2019-04-09T11:28:03.510695: Epoch   0 Batch  780/781   test_loss = 1.155
2019-04-09T11:28:04.289124: Epoch   1 Batch   15/3125   train_loss = 1.266
2019-04-09T11:28:04.711410: Epoch   1 Batch   35/3125   train_loss = 1.142
2019-04-09T11:28:05.124010: Epoch   1 Batch   55/3125   train_loss = 1.165
2019-04-09T11:28:05.539135: Epoch   1 Batch   75/3125   train_loss = 1.079
2019-04-09T11:28:05.955033: Epoch   1 Batch   95/3125   train_loss = 0.929
2019-04-09T11:28:06.374924: Epoch   1 Batch  115/3125   train_loss = 1.166
2019-04-09T11:28:06.784549: Epoch   1 Batch  135/3125   train_loss = 1.015
2019-04-09T11:28:07.202663: Epoch   1 Batch  155/3125   train_loss = 1.129
2019-04-09T11:28:07.622296: Epoch   1 Batch  175/3125   train_loss = 1.051
2019-04-09T11:28:08.044004: Epoch   1 Batch  195/3125   train_loss = 1.215
2019-04-09T11:28:08.464873: Epoch   1 Batch  215/3125   train_loss = 1.127
2019-04-09T11:28:08.882758: Epoch   1 Batch  235/3125   train_loss = 1.092
2019-04-09T11:28:09.302399: Epoch   1 Batch  255/3125   train_loss = 1.211
2019-04-09T11:28:09.718143: Epoch   1 Batch  275/3125   train_loss = 1.005
2019-04-09T11:28:10.135755: Epoch   1 Batch  295/3125   train_loss = 0.973
2019-04-09T11:28:10.556105: Epoch   1 Batch  315/3125   train_loss = 1.039
2019-04-09T11:28:10.968219: Epoch   1 Batch  335/3125   train_loss = 0.990
2019-04-09T11:28:11.382497: Epoch   1 Batch  355/3125   train_loss = 1.110
2019-04-09T11:28:11.792475: Epoch   1 Batch  375/3125   train_loss = 1.187
2019-04-09T11:28:12.203571: Epoch   1 Batch  395/3125   train_loss = 1.056
2019-04-09T11:28:12.616848: Epoch   1 Batch  415/3125   train_loss = 1.314
2019-04-09T11:28:13.031510: Epoch   1 Batch  435/3125   train_loss = 1.136
2019-04-09T11:28:13.442848: Epoch   1 Batch  455/3125   train_loss = 1.054
2019-04-09T11:28:13.860246: Epoch   1 Batch  475/3125   train_loss = 1.144
2019-04-09T11:28:14.274154: Epoch   1 Batch  495/3125   train_loss = 1.056
2019-04-09T11:28:14.692507: Epoch   1 Batch  515/3125   train_loss = 1.161
2019-04-09T11:28:15.109092: Epoch   1 Batch  535/3125   train_loss = 1.140
2019-04-09T11:28:15.524725: Epoch   1 Batch  555/3125   train_loss = 1.257
2019-04-09T11:28:15.938088: Epoch   1 Batch  575/3125   train_loss = 1.070
2019-04-09T11:28:16.350862: Epoch   1 Batch  595/3125   train_loss = 1.285
2019-04-09T11:28:16.761759: Epoch   1 Batch  615/3125   train_loss = 1.101
2019-04-09T11:28:17.182378: Epoch   1 Batch  635/3125   train_loss = 1.138
2019-04-09T11:28:17.599235: Epoch   1 Batch  655/3125   train_loss = 1.057
2019-04-09T11:28:18.019362: Epoch   1 Batch  675/3125   train_loss = 0.876
2019-04-09T11:28:18.438108: Epoch   1 Batch  695/3125   train_loss = 1.045
2019-04-09T11:28:18.849900: Epoch   1 Batch  715/3125   train_loss = 1.098
2019-04-09T11:28:19.261195: Epoch   1 Batch  735/3125   train_loss = 0.914
2019-04-09T11:28:19.812365: Epoch   1 Batch  755/3125   train_loss = 1.162
2019-04-09T11:28:20.222217: Epoch   1 Batch  775/3125   train_loss = 0.998
2019-04-09T11:28:20.645987: Epoch   1 Batch  795/3125   train_loss = 1.218
2019-04-09T11:28:21.064302: Epoch   1 Batch  815/3125   train_loss = 1.102
2019-04-09T11:28:21.482799: Epoch   1 Batch  835/3125   train_loss = 1.071
2019-04-09T11:28:21.907954: Epoch   1 Batch  855/3125   train_loss = 1.297
2019-04-09T11:28:22.327483: Epoch   1 Batch  875/3125   train_loss = 1.248
2019-04-09T11:28:22.741550: Epoch   1 Batch  895/3125   train_loss = 1.080
2019-04-09T11:28:23.157659: Epoch   1 Batch  915/3125   train_loss = 1.059
2019-04-09T11:28:23.571202: Epoch   1 Batch  935/3125   train_loss = 1.163
2019-04-09T11:28:23.984586: Epoch   1 Batch  955/3125   train_loss = 1.102
2019-04-09T11:28:24.396511: Epoch   1 Batch  975/3125   train_loss = 1.100
2019-04-09T11:28:24.824835: Epoch   1 Batch  995/3125   train_loss = 0.890
2019-04-09T11:28:25.242948: Epoch   1 Batch 1015/3125   train_loss = 1.077
2019-04-09T11:28:25.659444: Epoch   1 Batch 1035/3125   train_loss = 1.090
2019-04-09T11:28:26.076601: Epoch   1 Batch 1055/3125   train_loss = 1.154
2019-04-09T11:28:26.489531: Epoch   1 Batch 1075/3125   train_loss = 1.004
2019-04-09T11:28:26.897455: Epoch   1 Batch 1095/3125   train_loss = 1.012
2019-04-09T11:28:27.320553: Epoch   1 Batch 1115/3125   train_loss = 1.165
2019-04-09T11:28:27.739517: Epoch   1 Batch 1135/3125   train_loss = 1.029
2019-04-09T11:28:28.156628: Epoch   1 Batch 1155/3125   train_loss = 1.117
2019-04-09T11:28:28.570595: Epoch   1 Batch 1175/3125   train_loss = 1.103
2019-04-09T11:28:28.980586: Epoch   1 Batch 1195/3125   train_loss = 1.250
2019-04-09T11:28:29.393619: Epoch   1 Batch 1215/3125   train_loss = 0.930
2019-04-09T11:28:29.809238: Epoch   1 Batch 1235/3125   train_loss = 1.077
2019-04-09T11:28:30.219331: Epoch   1 Batch 1255/3125   train_loss = 1.089
2019-04-09T11:28:30.627580: Epoch   1 Batch 1275/3125   train_loss = 1.000
2019-04-09T11:28:31.035136: Epoch   1 Batch 1295/3125   train_loss = 1.006
2019-04-09T11:28:31.448626: Epoch   1 Batch 1315/3125   train_loss = 1.210
2019-04-09T11:28:31.948769: Epoch   1 Batch 1335/3125   train_loss = 1.045
2019-04-09T11:28:32.356933: Epoch   1 Batch 1355/3125   train_loss = 1.058
2019-04-09T11:28:32.771030: Epoch   1 Batch 1375/3125   train_loss = 1.110
2019-04-09T11:28:33.184133: Epoch   1 Batch 1395/3125   train_loss = 1.008
2019-04-09T11:28:33.596132: Epoch   1 Batch 1415/3125   train_loss = 1.086
2019-04-09T11:28:34.007114: Epoch   1 Batch 1435/3125   train_loss = 1.221
2019-04-09T11:28:34.419967: Epoch   1 Batch 1455/3125   train_loss = 1.241
2019-04-09T11:28:34.829988: Epoch   1 Batch 1475/3125   train_loss = 1.154
2019-04-09T11:28:35.241458: Epoch   1 Batch 1495/3125   train_loss = 1.102
2019-04-09T11:28:35.650228: Epoch   1 Batch 1515/3125   train_loss = 0.990
2019-04-09T11:28:36.060708: Epoch   1 Batch 1535/3125   train_loss = 0.907
2019-04-09T11:28:36.472293: Epoch   1 Batch 1555/3125   train_loss = 1.079
2019-04-09T11:28:36.880701: Epoch   1 Batch 1575/3125   train_loss = 0.986
2019-04-09T11:28:37.298235: Epoch   1 Batch 1595/3125   train_loss = 1.052
2019-04-09T11:28:37.710706: Epoch   1 Batch 1615/3125   train_loss = 1.025
2019-04-09T11:28:38.118793: Epoch   1 Batch 1635/3125   train_loss = 1.146
2019-04-09T11:28:38.533452: Epoch   1 Batch 1655/3125   train_loss = 1.123
2019-04-09T11:28:38.948779: Epoch   1 Batch 1675/3125   train_loss = 0.976
2019-04-09T11:28:39.359489: Epoch   1 Batch 1695/3125   train_loss = 1.035
2019-04-09T11:28:39.766989: Epoch   1 Batch 1715/3125   train_loss = 0.945
2019-04-09T11:28:40.179589: Epoch   1 Batch 1735/3125   train_loss = 1.174
2019-04-09T11:28:40.590375: Epoch   1 Batch 1755/3125   train_loss = 1.027
2019-04-09T11:28:40.998865: Epoch   1 Batch 1775/3125   train_loss = 1.026
2019-04-09T11:28:41.408017: Epoch   1 Batch 1795/3125   train_loss = 0.981
2019-04-09T11:28:41.821620: Epoch   1 Batch 1815/3125   train_loss = 0.966
2019-04-09T11:28:42.229169: Epoch   1 Batch 1835/3125   train_loss = 1.074
2019-04-09T11:28:42.642918: Epoch   1 Batch 1855/3125   train_loss = 0.959
2019-04-09T11:28:43.154530: Epoch   1 Batch 1875/3125   train_loss = 1.213
2019-04-09T11:28:43.560385: Epoch   1 Batch 1895/3125   train_loss = 0.935
2019-04-09T11:28:43.974210: Epoch   1 Batch 1915/3125   train_loss = 0.973
2019-04-09T11:28:44.393618: Epoch   1 Batch 1935/3125   train_loss = 1.016
2019-04-09T11:28:44.808725: Epoch   1 Batch 1955/3125   train_loss = 1.006
2019-04-09T11:28:45.224542: Epoch   1 Batch 1975/3125   train_loss = 1.036
2019-04-09T11:28:45.638372: Epoch   1 Batch 1995/3125   train_loss = 1.130
2019-04-09T11:28:46.050876: Epoch   1 Batch 2015/3125   train_loss = 1.092
2019-04-09T11:28:46.466638: Epoch   1 Batch 2035/3125   train_loss = 1.163
2019-04-09T11:28:46.877782: Epoch   1 Batch 2055/3125   train_loss = 0.961
2019-04-09T11:28:47.297977: Epoch   1 Batch 2075/3125   train_loss = 1.154
2019-04-09T11:28:47.707362: Epoch   1 Batch 2095/3125   train_loss = 1.007
2019-04-09T11:28:48.119961: Epoch   1 Batch 2115/3125   train_loss = 1.150
2019-04-09T11:28:48.536958: Epoch   1 Batch 2135/3125   train_loss = 1.026
2019-04-09T11:28:48.955579: Epoch   1 Batch 2155/3125   train_loss = 1.008
2019-04-09T11:28:49.371992: Epoch   1 Batch 2175/3125   train_loss = 1.028
2019-04-09T11:28:49.785513: Epoch   1 Batch 2195/3125   train_loss = 1.013
2019-04-09T11:28:50.199116: Epoch   1 Batch 2215/3125   train_loss = 1.034
2019-04-09T11:28:50.609969: Epoch   1 Batch 2235/3125   train_loss = 1.184
2019-04-09T11:28:51.023581: Epoch   1 Batch 2255/3125   train_loss = 1.135
2019-04-09T11:28:51.436197: Epoch   1 Batch 2275/3125   train_loss = 0.936
2019-04-09T11:28:51.854318: Epoch   1 Batch 2295/3125   train_loss = 1.230
2019-04-09T11:28:52.266593: Epoch   1 Batch 2315/3125   train_loss = 1.180
2019-04-09T11:28:53.027310: Epoch   1 Batch 2335/3125   train_loss = 1.068
2019-04-09T11:28:53.443572: Epoch   1 Batch 2355/3125   train_loss = 1.021
2019-04-09T11:28:53.859233: Epoch   1 Batch 2375/3125   train_loss = 1.241
2019-04-09T11:28:54.268702: Epoch   1 Batch 2395/3125   train_loss = 1.022
2019-04-09T11:28:54.684586: Epoch   1 Batch 2415/3125   train_loss = 1.062
2019-04-09T11:28:55.104188: Epoch   1 Batch 2435/3125   train_loss = 0.978
2019-04-09T11:28:55.517661: Epoch   1 Batch 2455/3125   train_loss = 1.075
2019-04-09T11:28:55.940375: Epoch   1 Batch 2475/3125   train_loss = 0.997
2019-04-09T11:28:56.355446: Epoch   1 Batch 2495/3125   train_loss = 0.991
2019-04-09T11:28:56.767784: Epoch   1 Batch 2515/3125   train_loss = 1.057
2019-04-09T11:28:57.185487: Epoch   1 Batch 2535/3125   train_loss = 1.064
2019-04-09T11:28:57.599402: Epoch   1 Batch 2555/3125   train_loss = 0.883
2019-04-09T11:28:58.012436: Epoch   1 Batch 2575/3125   train_loss = 0.914
2019-04-09T11:28:58.427098: Epoch   1 Batch 2595/3125   train_loss = 0.934
2019-04-09T11:28:58.836389: Epoch   1 Batch 2615/3125   train_loss = 1.151
2019-04-09T11:28:59.262074: Epoch   1 Batch 2635/3125   train_loss = 1.017
2019-04-09T11:28:59.680762: Epoch   1 Batch 2655/3125   train_loss = 1.036
2019-04-09T11:29:00.094884: Epoch   1 Batch 2675/3125   train_loss = 0.960
2019-04-09T11:29:00.510614: Epoch   1 Batch 2695/3125   train_loss = 1.031
2019-04-09T11:29:00.925679: Epoch   1 Batch 2715/3125   train_loss = 1.011
2019-04-09T11:29:01.343105: Epoch   1 Batch 2735/3125   train_loss = 0.876
2019-04-09T11:29:01.762199: Epoch   1 Batch 2755/3125   train_loss = 1.087
2019-04-09T11:29:02.171790: Epoch   1 Batch 2775/3125   train_loss = 1.101
2019-04-09T11:29:02.585480: Epoch   1 Batch 2795/3125   train_loss = 1.064
2019-04-09T11:29:02.995887: Epoch   1 Batch 2815/3125   train_loss = 0.981
2019-04-09T11:29:03.414306: Epoch   1 Batch 2835/3125   train_loss = 1.123
2019-04-09T11:29:03.824405: Epoch   1 Batch 2855/3125   train_loss = 1.069
2019-04-09T11:29:04.236239: Epoch   1 Batch 2875/3125   train_loss = 1.006
2019-04-09T11:29:04.644747: Epoch   1 Batch 2895/3125   train_loss = 1.013
2019-04-09T11:29:05.058545: Epoch   1 Batch 2915/3125   train_loss = 0.985
2019-04-09T11:29:05.473539: Epoch   1 Batch 2935/3125   train_loss = 1.152
2019-04-09T11:29:05.881997: Epoch   1 Batch 2955/3125   train_loss = 1.015
2019-04-09T11:29:06.294405: Epoch   1 Batch 2975/3125   train_loss = 0.977
2019-04-09T11:29:06.707933: Epoch   1 Batch 2995/3125   train_loss = 0.928
2019-04-09T11:29:07.122537: Epoch   1 Batch 3015/3125   train_loss = 1.033
2019-04-09T11:29:07.534921: Epoch   1 Batch 3035/3125   train_loss = 1.097
2019-04-09T11:29:07.945410: Epoch   1 Batch 3055/3125   train_loss = 1.058
2019-04-09T11:29:08.355520: Epoch   1 Batch 3075/3125   train_loss = 1.009
2019-04-09T11:29:08.775390: Epoch   1 Batch 3095/3125   train_loss = 0.946
2019-04-09T11:29:09.190497: Epoch   1 Batch 3115/3125   train_loss = 0.919
2019-04-09T11:29:09.605177: Epoch   1 Batch   19/781   test_loss = 1.005
2019-04-09T11:29:09.737030: Epoch   1 Batch   39/781   test_loss = 0.844
2019-04-09T11:29:09.863600: Epoch   1 Batch   59/781   test_loss = 0.955
2019-04-09T11:29:09.991439: Epoch   1 Batch   79/781   test_loss = 0.980
2019-04-09T11:29:10.118778: Epoch   1 Batch   99/781   test_loss = 0.997
2019-04-09T11:29:10.246117: Epoch   1 Batch  119/781   test_loss = 0.996
2019-04-09T11:29:10.374962: Epoch   1 Batch  139/781   test_loss = 0.988
2019-04-09T11:29:10.503975: Epoch   1 Batch  159/781   test_loss = 0.970
2019-04-09T11:29:10.630812: Epoch   1 Batch  179/781   test_loss = 0.950
2019-04-09T11:29:10.758151: Epoch   1 Batch  199/781   test_loss = 0.939
2019-04-09T11:29:10.885992: Epoch   1 Batch  219/781   test_loss = 0.993
2019-04-09T11:29:11.014332: Epoch   1 Batch  239/781   test_loss = 1.237
2019-04-09T11:29:11.141671: Epoch   1 Batch  259/781   test_loss = 0.976
2019-04-09T11:29:11.270013: Epoch   1 Batch  279/781   test_loss = 1.069
2019-04-09T11:29:11.399713: Epoch   1 Batch  299/781   test_loss = 1.209
2019-04-09T11:29:11.531062: Epoch   1 Batch  319/781   test_loss = 0.913
2019-04-09T11:29:11.661408: Epoch   1 Batch  339/781   test_loss = 0.906
2019-04-09T11:29:11.787744: Epoch   1 Batch  359/781   test_loss = 0.924
2019-04-09T11:29:11.914581: Epoch   1 Batch  379/781   test_loss = 1.030
2019-04-09T11:29:12.043424: Epoch   1 Batch  399/781   test_loss = 0.912
2019-04-09T11:29:12.171264: Epoch   1 Batch  419/781   test_loss = 0.959
2019-04-09T11:29:12.300107: Epoch   1 Batch  439/781   test_loss = 1.026
2019-04-09T11:29:12.428123: Epoch   1 Batch  459/781   test_loss = 1.085
2019-04-09T11:29:12.553965: Epoch   1 Batch  479/781   test_loss = 1.054
2019-04-09T11:29:12.683302: Epoch   1 Batch  499/781   test_loss = 0.919
2019-04-09T11:29:12.810139: Epoch   1 Batch  519/781   test_loss = 1.083
2019-04-09T11:29:12.939483: Epoch   1 Batch  539/781   test_loss = 0.888
2019-04-09T11:29:13.066822: Epoch   1 Batch  559/781   test_loss = 1.165
2019-04-09T11:29:13.195164: Epoch   1 Batch  579/781   test_loss = 1.014
2019-04-09T11:29:13.321500: Epoch   1 Batch  599/781   test_loss = 0.975
2019-04-09T11:29:13.449045: Epoch   1 Batch  619/781   test_loss = 1.152
2019-04-09T11:29:13.578390: Epoch   1 Batch  639/781   test_loss = 0.881
2019-04-09T11:29:13.706229: Epoch   1 Batch  659/781   test_loss = 1.086
2019-04-09T11:29:13.834069: Epoch   1 Batch  679/781   test_loss = 1.149
2019-04-09T11:29:13.964416: Epoch   1 Batch  699/781   test_loss = 0.888
2019-04-09T11:29:14.094763: Epoch   1 Batch  719/781   test_loss = 0.940
2019-04-09T11:29:14.223606: Epoch   1 Batch  739/781   test_loss = 1.001
2019-04-09T11:29:14.350443: Epoch   1 Batch  759/781   test_loss = 0.925
2019-04-09T11:29:14.479091: Epoch   1 Batch  779/781   test_loss = 0.786
2019-04-09T11:29:15.169929: Epoch   2 Batch   10/3125   train_loss = 0.962
2019-04-09T11:29:15.585033: Epoch   2 Batch   30/3125   train_loss = 0.921
2019-04-09T11:29:16.090936: Epoch   2 Batch   50/3125   train_loss = 1.098
2019-04-09T11:29:16.504056: Epoch   2 Batch   70/3125   train_loss = 1.066
2019-04-09T11:29:16.916616: Epoch   2 Batch   90/3125   train_loss = 1.065
2019-04-09T11:29:17.335995: Epoch   2 Batch  110/3125   train_loss = 0.908
2019-04-09T11:29:17.744923: Epoch   2 Batch  130/3125   train_loss = 0.927
2019-04-09T11:29:18.156518: Epoch   2 Batch  150/3125   train_loss = 1.094
2019-04-09T11:29:18.572814: Epoch   2 Batch  170/3125   train_loss = 1.062
2019-04-09T11:29:18.979180: Epoch   2 Batch  190/3125   train_loss = 1.043
2019-04-09T11:29:19.392758: Epoch   2 Batch  210/3125   train_loss = 0.920
2019-04-09T11:29:19.806360: Epoch   2 Batch  230/3125   train_loss = 0.990
2019-04-09T11:29:20.213864: Epoch   2 Batch  250/3125   train_loss = 0.956
2019-04-09T11:29:20.624843: Epoch   2 Batch  270/3125   train_loss = 0.816
2019-04-09T11:29:21.034399: Epoch   2 Batch  290/3125   train_loss = 1.029
2019-04-09T11:29:21.450506: Epoch   2 Batch  310/3125   train_loss = 1.039
2019-04-09T11:29:21.860168: Epoch   2 Batch  330/3125   train_loss = 0.981
2019-04-09T11:29:22.268774: Epoch   2 Batch  350/3125   train_loss = 0.927
2019-04-09T11:29:22.681125: Epoch   2 Batch  370/3125   train_loss = 1.157
2019-04-09T11:29:23.092834: Epoch   2 Batch  390/3125   train_loss = 1.131
2019-04-09T11:29:23.503543: Epoch   2 Batch  410/3125   train_loss = 0.945
2019-04-09T11:29:23.913894: Epoch   2 Batch  430/3125   train_loss = 1.121
2019-04-09T11:29:24.324622: Epoch   2 Batch  450/3125   train_loss = 0.925
2019-04-09T11:29:24.740883: Epoch   2 Batch  470/3125   train_loss = 0.952
2019-04-09T11:29:25.150474: Epoch   2 Batch  490/3125   train_loss = 1.031
2019-04-09T11:29:25.566388: Epoch   2 Batch  510/3125   train_loss = 1.045
2019-04-09T11:29:25.981499: Epoch   2 Batch  530/3125   train_loss = 0.936
2019-04-09T11:29:26.427824: Epoch   2 Batch  550/3125   train_loss = 1.041
2019-04-09T11:29:26.844394: Epoch   2 Batch  570/3125   train_loss = 1.175
2019-04-09T11:29:27.262411: Epoch   2 Batch  590/3125   train_loss = 1.093
2019-04-09T11:29:27.677138: Epoch   2 Batch  610/3125   train_loss = 0.941
2019-04-09T11:29:28.088132: Epoch   2 Batch  630/3125   train_loss = 1.067
2019-04-09T11:29:28.504546: Epoch   2 Batch  650/3125   train_loss = 1.015
2019-04-09T11:29:28.919901: Epoch   2 Batch  670/3125   train_loss = 0.921
2019-04-09T11:29:29.332525: Epoch   2 Batch  690/3125   train_loss = 0.946
2019-04-09T11:29:29.752401: Epoch   2 Batch  710/3125   train_loss = 0.958
2019-04-09T11:29:30.169512: Epoch   2 Batch  730/3125   train_loss = 0.833
2019-04-09T11:29:30.581918: Epoch   2 Batch  750/3125   train_loss = 0.983
2019-04-09T11:29:30.990078: Epoch   2 Batch  770/3125   train_loss = 0.882
2019-04-09T11:29:31.401819: Epoch   2 Batch  790/3125   train_loss = 0.922
2019-04-09T11:29:31.821438: Epoch   2 Batch  810/3125   train_loss = 0.843
2019-04-09T11:29:32.231582: Epoch   2 Batch  830/3125   train_loss = 0.875
2019-04-09T11:29:32.646142: Epoch   2 Batch  850/3125   train_loss = 1.077
2019-04-09T11:29:33.064808: Epoch   2 Batch  870/3125   train_loss = 0.952
2019-04-09T11:29:33.477008: Epoch   2 Batch  890/3125   train_loss = 0.888
2019-04-09T11:29:33.887466: Epoch   2 Batch  910/3125   train_loss = 1.012
2019-04-09T11:29:34.298086: Epoch   2 Batch  930/3125   train_loss = 0.959
2019-04-09T11:29:34.715677: Epoch   2 Batch  950/3125   train_loss = 0.975
2019-04-09T11:29:35.130281: Epoch   2 Batch  970/3125   train_loss = 1.050
2019-04-09T11:29:35.544737: Epoch   2 Batch  990/3125   train_loss = 0.864
2019-04-09T11:29:35.958160: Epoch   2 Batch 1010/3125   train_loss = 1.084
2019-04-09T11:29:36.371777: Epoch   2 Batch 1030/3125   train_loss = 0.946
2019-04-09T11:29:36.780334: Epoch   2 Batch 1050/3125   train_loss = 1.009
2019-04-09T11:29:37.193936: Epoch   2 Batch 1070/3125   train_loss = 0.981
2019-04-09T11:29:37.603917: Epoch   2 Batch 1090/3125   train_loss = 1.081
2019-04-09T11:29:38.014688: Epoch   2 Batch 1110/3125   train_loss = 1.080
2019-04-09T11:29:38.435423: Epoch   2 Batch 1130/3125   train_loss = 0.920
2019-04-09T11:29:38.848851: Epoch   2 Batch 1150/3125   train_loss = 0.949
2019-04-09T11:29:39.260649: Epoch   2 Batch 1170/3125   train_loss = 0.944
2019-04-09T11:29:39.676982: Epoch   2 Batch 1190/3125   train_loss = 1.046
2019-04-09T11:29:40.089421: Epoch   2 Batch 1210/3125   train_loss = 0.873
2019-04-09T11:29:40.501075: Epoch   2 Batch 1230/3125   train_loss = 0.862
2019-04-09T11:29:40.912917: Epoch   2 Batch 1250/3125   train_loss = 0.963
2019-04-09T11:29:41.331306: Epoch   2 Batch 1270/3125   train_loss = 1.041
2019-04-09T11:29:41.745589: Epoch   2 Batch 1290/3125   train_loss = 0.935
2019-04-09T11:29:42.155682: Epoch   2 Batch 1310/3125   train_loss = 1.011
2019-04-09T11:29:42.565230: Epoch   2 Batch 1330/3125   train_loss = 1.089
2019-04-09T11:29:42.972821: Epoch   2 Batch 1350/3125   train_loss = 0.929
2019-04-09T11:29:43.384313: Epoch   2 Batch 1370/3125   train_loss = 0.871
2019-04-09T11:29:43.800679: Epoch   2 Batch 1390/3125   train_loss = 1.056
2019-04-09T11:29:44.212277: Epoch   2 Batch 1410/3125   train_loss = 0.956
2019-04-09T11:29:44.622595: Epoch   2 Batch 1430/3125   train_loss = 0.991
2019-04-09T11:29:45.030926: Epoch   2 Batch 1450/3125   train_loss = 1.019
2019-04-09T11:29:45.446118: Epoch   2 Batch 1470/3125   train_loss = 1.018
2019-04-09T11:29:45.858249: Epoch   2 Batch 1490/3125   train_loss = 1.025
2019-04-09T11:29:46.264877: Epoch   2 Batch 1510/3125   train_loss = 0.987
2019-04-09T11:29:46.680210: Epoch   2 Batch 1530/3125   train_loss = 1.077
2019-04-09T11:29:47.097122: Epoch   2 Batch 1550/3125   train_loss = 0.871
2019-04-09T11:29:47.505701: Epoch   2 Batch 1570/3125   train_loss = 0.963
2019-04-09T11:29:47.915740: Epoch   2 Batch 1590/3125   train_loss = 0.935
2019-04-09T11:29:48.325191: Epoch   2 Batch 1610/3125   train_loss = 1.024
2019-04-09T11:29:48.741050: Epoch   2 Batch 1630/3125   train_loss = 1.033
2019-04-09T11:29:49.303637: Epoch   2 Batch 1650/3125   train_loss = 0.892
2019-04-09T11:29:49.716688: Epoch   2 Batch 1670/3125   train_loss = 0.828
2019-04-09T11:29:50.127782: Epoch   2 Batch 1690/3125   train_loss = 0.886
2019-04-09T11:29:50.541466: Epoch   2 Batch 1710/3125   train_loss = 1.033
2019-04-09T11:29:50.952638: Epoch   2 Batch 1730/3125   train_loss = 0.990
2019-04-09T11:29:51.366254: Epoch   2 Batch 1750/3125   train_loss = 0.851
2019-04-09T11:29:51.779575: Epoch   2 Batch 1770/3125   train_loss = 1.130
2019-04-09T11:29:52.189667: Epoch   2 Batch 1790/3125   train_loss = 0.970
2019-04-09T11:29:52.600989: Epoch   2 Batch 1810/3125   train_loss = 1.004
2019-04-09T11:29:53.010986: Epoch   2 Batch 1830/3125   train_loss = 1.035
2019-04-09T11:29:53.428213: Epoch   2 Batch 1850/3125   train_loss = 0.935
2019-04-09T11:29:53.847839: Epoch   2 Batch 1870/3125   train_loss = 1.039
2019-04-09T11:29:54.260999: Epoch   2 Batch 1890/3125   train_loss = 0.822
2019-04-09T11:29:54.670587: Epoch   2 Batch 1910/3125   train_loss = 0.885
2019-04-09T11:29:55.079904: Epoch   2 Batch 1930/3125   train_loss = 1.038
2019-04-09T11:29:55.492941: Epoch   2 Batch 1950/3125   train_loss = 0.887
2019-04-09T11:29:55.909977: Epoch   2 Batch 1970/3125   train_loss = 0.998
2019-04-09T11:29:56.321669: Epoch   2 Batch 1990/3125   train_loss = 0.864
2019-04-09T11:29:56.731912: Epoch   2 Batch 2010/3125   train_loss = 0.792
2019-04-09T11:29:57.143008: Epoch   2 Batch 2030/3125   train_loss = 0.907
2019-04-09T11:29:57.555451: Epoch   2 Batch 2050/3125   train_loss = 0.952
2019-04-09T11:29:57.967763: Epoch   2 Batch 2070/3125   train_loss = 0.882
2019-04-09T11:29:58.396253: Epoch   2 Batch 2090/3125   train_loss = 0.831
2019-04-09T11:29:58.810290: Epoch   2 Batch 2110/3125   train_loss = 1.050
2019-04-09T11:29:59.220382: Epoch   2 Batch 2130/3125   train_loss = 0.973
2019-04-09T11:29:59.638177: Epoch   2 Batch 2150/3125   train_loss = 1.009
2019-04-09T11:30:00.054094: Epoch   2 Batch 2170/3125   train_loss = 0.862
2019-04-09T11:30:00.465054: Epoch   2 Batch 2190/3125   train_loss = 0.967
2019-04-09T11:30:00.875581: Epoch   2 Batch 2210/3125   train_loss = 0.950
2019-04-09T11:30:01.283669: Epoch   2 Batch 2230/3125   train_loss = 0.843
2019-04-09T11:30:01.702253: Epoch   2 Batch 2250/3125   train_loss = 0.933
2019-04-09T11:30:02.116357: Epoch   2 Batch 2270/3125   train_loss = 0.917
2019-04-09T11:30:02.530943: Epoch   2 Batch 2290/3125   train_loss = 0.856
2019-04-09T11:30:02.942953: Epoch   2 Batch 2310/3125   train_loss = 0.851
2019-04-09T11:30:03.360388: Epoch   2 Batch 2330/3125   train_loss = 1.097
2019-04-09T11:30:03.770799: Epoch   2 Batch 2350/3125   train_loss = 0.989
2019-04-09T11:30:04.189427: Epoch   2 Batch 2370/3125   train_loss = 0.886
2019-04-09T11:30:04.602910: Epoch   2 Batch 2390/3125   train_loss = 1.017
2019-04-09T11:30:05.013193: Epoch   2 Batch 2410/3125   train_loss = 1.025
2019-04-09T11:30:05.426136: Epoch   2 Batch 2430/3125   train_loss = 0.885
2019-04-09T11:30:05.834048: Epoch   2 Batch 2450/3125   train_loss = 0.968
2019-04-09T11:30:06.246209: Epoch   2 Batch 2470/3125   train_loss = 1.042
2019-04-09T11:30:06.661647: Epoch   2 Batch 2490/3125   train_loss = 1.003
2019-04-09T11:30:07.071296: Epoch   2 Batch 2510/3125   train_loss = 1.084
2019-04-09T11:30:07.490192: Epoch   2 Batch 2530/3125   train_loss = 0.793
2019-04-09T11:30:07.904515: Epoch   2 Batch 2550/3125   train_loss = 0.954
2019-04-09T11:30:08.315032: Epoch   2 Batch 2570/3125   train_loss = 0.957
2019-04-09T11:30:08.733158: Epoch   2 Batch 2590/3125   train_loss = 0.984
2019-04-09T11:30:09.146760: Epoch   2 Batch 2610/3125   train_loss = 1.043
2019-04-09T11:30:09.564414: Epoch   2 Batch 2630/3125   train_loss = 0.660
2019-04-09T11:30:09.977708: Epoch   2 Batch 2650/3125   train_loss = 0.913
2019-04-09T11:30:10.392227: Epoch   2 Batch 2670/3125   train_loss = 1.051
2019-04-09T11:30:10.803323: Epoch   2 Batch 2690/3125   train_loss = 0.980
2019-04-09T11:30:11.221892: Epoch   2 Batch 2710/3125   train_loss = 0.845
2019-04-09T11:30:11.636832: Epoch   2 Batch 2730/3125   train_loss = 1.067
2019-04-09T11:30:12.048855: Epoch   2 Batch 2750/3125   train_loss = 1.020
2019-04-09T11:30:12.466622: Epoch   2 Batch 2770/3125   train_loss = 0.894
2019-04-09T11:30:12.877228: Epoch   2 Batch 2790/3125   train_loss = 0.881
2019-04-09T11:30:13.292940: Epoch   2 Batch 2810/3125   train_loss = 0.958
2019-04-09T11:30:13.707370: Epoch   2 Batch 2830/3125   train_loss = 0.816
2019-04-09T11:30:14.115458: Epoch   2 Batch 2850/3125   train_loss = 1.005
2019-04-09T11:30:14.527402: Epoch   2 Batch 2870/3125   train_loss = 0.792
2019-04-09T11:30:14.941006: Epoch   2 Batch 2890/3125   train_loss = 0.779
2019-04-09T11:30:15.351115: Epoch   2 Batch 2910/3125   train_loss = 1.007
2019-04-09T11:30:15.761429: Epoch   2 Batch 2930/3125   train_loss = 0.813
2019-04-09T11:30:16.174529: Epoch   2 Batch 2950/3125   train_loss = 1.069
2019-04-09T11:30:16.592845: Epoch   2 Batch 2970/3125   train_loss = 0.993
2019-04-09T11:30:17.005062: Epoch   2 Batch 2990/3125   train_loss = 0.862
2019-04-09T11:30:17.425470: Epoch   2 Batch 3010/3125   train_loss = 0.936
2019-04-09T11:30:17.837640: Epoch   2 Batch 3030/3125   train_loss = 0.968
2019-04-09T11:30:18.248424: Epoch   2 Batch 3050/3125   train_loss = 0.980
2019-04-09T11:30:18.666115: Epoch   2 Batch 3070/3125   train_loss = 0.896
2019-04-09T11:30:19.074163: Epoch   2 Batch 3090/3125   train_loss = 0.774
2019-04-09T11:30:19.491628: Epoch   2 Batch 3110/3125   train_loss = 0.837
2019-04-09T11:30:19.895275: Epoch   2 Batch   18/781   test_loss = 0.808
2019-04-09T11:30:20.023969: Epoch   2 Batch   38/781   test_loss = 0.915
2019-04-09T11:30:20.152310: Epoch   2 Batch   58/781   test_loss = 0.851
2019-04-09T11:30:20.280151: Epoch   2 Batch   78/781   test_loss = 0.905
2019-04-09T11:30:20.408187: Epoch   2 Batch   98/781   test_loss = 0.903
2019-04-09T11:30:20.536028: Epoch   2 Batch  118/781   test_loss = 0.884
2019-04-09T11:30:20.663366: Epoch   2 Batch  138/781   test_loss = 1.000
2019-04-09T11:30:20.791206: Epoch   2 Batch  158/781   test_loss = 0.904
2019-04-09T11:30:20.918545: Epoch   2 Batch  178/781   test_loss = 0.785
2019-04-09T11:30:21.045884: Epoch   2 Batch  198/781   test_loss = 0.922
2019-04-09T11:30:21.177736: Epoch   2 Batch  218/781   test_loss = 0.997
2019-04-09T11:30:21.310087: Epoch   2 Batch  238/781   test_loss = 0.998
2019-04-09T11:30:21.437625: Epoch   2 Batch  258/781   test_loss = 0.959
2019-04-09T11:30:21.565465: Epoch   2 Batch  278/781   test_loss = 1.074
2019-04-09T11:30:21.692804: Epoch   2 Batch  298/781   test_loss = 0.915
2019-04-09T11:30:21.821646: Epoch   2 Batch  318/781   test_loss = 0.889
2019-04-09T11:30:21.952495: Epoch   2 Batch  338/781   test_loss = 0.941
2019-04-09T11:30:22.081338: Epoch   2 Batch  358/781   test_loss = 0.913
2019-04-09T11:30:22.210686: Epoch   2 Batch  378/781   test_loss = 0.890
2019-04-09T11:30:22.344036: Epoch   2 Batch  398/781   test_loss = 0.833
2019-04-09T11:30:22.471957: Epoch   2 Batch  418/781   test_loss = 0.941
2019-04-09T11:30:22.599296: Epoch   2 Batch  438/781   test_loss = 1.013
2019-04-09T11:30:22.728139: Epoch   2 Batch  458/781   test_loss = 0.919
2019-04-09T11:30:22.855992: Epoch   2 Batch  478/781   test_loss = 0.965
2019-04-09T11:30:22.982816: Epoch   2 Batch  498/781   test_loss = 0.813
2019-04-09T11:30:23.110155: Epoch   2 Batch  518/781   test_loss = 0.919
2019-04-09T11:30:23.238497: Epoch   2 Batch  538/781   test_loss = 0.795
2019-04-09T11:30:23.366838: Epoch   2 Batch  558/781   test_loss = 0.830
2019-04-09T11:30:23.495883: Epoch   2 Batch  578/781   test_loss = 0.915
2019-04-09T11:30:23.623225: Epoch   2 Batch  598/781   test_loss = 1.055
2019-04-09T11:30:23.751062: Epoch   2 Batch  618/781   test_loss = 0.850
2019-04-09T11:30:23.879905: Epoch   2 Batch  638/781   test_loss = 0.845
2019-04-09T11:30:24.007243: Epoch   2 Batch  658/781   test_loss = 1.026
2019-04-09T11:30:24.138091: Epoch   2 Batch  678/781   test_loss = 0.926
2019-04-09T11:30:24.266433: Epoch   2 Batch  698/781   test_loss = 0.875
2019-04-09T11:30:24.395604: Epoch   2 Batch  718/781   test_loss = 1.006
2019-04-09T11:30:24.523445: Epoch   2 Batch  738/781   test_loss = 0.850
2019-04-09T11:30:24.651786: Epoch   2 Batch  758/781   test_loss = 0.892
2019-04-09T11:30:24.779626: Epoch   2 Batch  778/781   test_loss = 0.913
2019-04-09T11:30:25.360700: Epoch   3 Batch    5/3125   train_loss = 0.900
2019-04-09T11:30:25.776594: Epoch   3 Batch   25/3125   train_loss = 0.995
2019-04-09T11:30:26.190195: Epoch   3 Batch   45/3125   train_loss = 0.823
2019-04-09T11:30:26.605221: Epoch   3 Batch   65/3125   train_loss = 0.936
2019-04-09T11:30:27.017575: Epoch   3 Batch   85/3125   train_loss = 0.811
2019-04-09T11:30:27.433325: Epoch   3 Batch  105/3125   train_loss = 0.735
2019-04-09T11:30:27.845489: Epoch   3 Batch  125/3125   train_loss = 0.883
2019-04-09T11:30:28.255902: Epoch   3 Batch  145/3125   train_loss = 0.946
2019-04-09T11:30:28.676186: Epoch   3 Batch  165/3125   train_loss = 0.907
2019-04-09T11:30:29.086028: Epoch   3 Batch  185/3125   train_loss = 0.843
2019-04-09T11:30:29.498049: Epoch   3 Batch  205/3125   train_loss = 0.782
2019-04-09T11:30:29.910137: Epoch   3 Batch  225/3125   train_loss = 0.818
2019-04-09T11:30:30.321717: Epoch   3 Batch  245/3125   train_loss = 1.094
2019-04-09T11:30:30.732822: Epoch   3 Batch  265/3125   train_loss = 0.907
2019-04-09T11:30:31.144919: Epoch   3 Batch  285/3125   train_loss = 0.899
2019-04-09T11:30:31.564878: Epoch   3 Batch  305/3125   train_loss = 0.886
2019-04-09T11:30:31.986450: Epoch   3 Batch  325/3125   train_loss = 0.900
2019-04-09T11:30:32.402943: Epoch   3 Batch  345/3125   train_loss = 0.966
2019-04-09T11:30:32.817756: Epoch   3 Batch  365/3125   train_loss = 0.897
2019-04-09T11:30:33.231358: Epoch   3 Batch  385/3125   train_loss = 0.854
2019-04-09T11:30:33.642523: Epoch   3 Batch  405/3125   train_loss = 0.854
2019-04-09T11:30:34.052009: Epoch   3 Batch  425/3125   train_loss = 0.950
2019-04-09T11:30:34.463651: Epoch   3 Batch  445/3125   train_loss = 0.963
2019-04-09T11:30:34.877612: Epoch   3 Batch  465/3125   train_loss = 0.840
2019-04-09T11:30:35.291041: Epoch   3 Batch  485/3125   train_loss = 1.043
2019-04-09T11:30:35.701510: Epoch   3 Batch  505/3125   train_loss = 0.820
2019-04-09T11:30:36.113107: Epoch   3 Batch  525/3125   train_loss = 0.977
2019-04-09T11:30:36.526067: Epoch   3 Batch  545/3125   train_loss = 0.785
2019-04-09T11:30:36.938504: Epoch   3 Batch  565/3125   train_loss = 1.138
2019-04-09T11:30:37.354627: Epoch   3 Batch  585/3125   train_loss = 0.877
2019-04-09T11:30:37.769480: Epoch   3 Batch  605/3125   train_loss = 0.865
2019-04-09T11:30:38.180576: Epoch   3 Batch  625/3125   train_loss = 0.931
2019-04-09T11:30:38.595414: Epoch   3 Batch  645/3125   train_loss = 1.007
2019-04-09T11:30:39.007112: Epoch   3 Batch  665/3125   train_loss = 0.960
2019-04-09T11:30:39.427161: Epoch   3 Batch  685/3125   train_loss = 0.908
2019-04-09T11:30:39.841768: Epoch   3 Batch  705/3125   train_loss = 1.001
2019-04-09T11:30:40.258352: Epoch   3 Batch  725/3125   train_loss = 0.888
2019-04-09T11:30:40.672977: Epoch   3 Batch  745/3125   train_loss = 0.834
2019-04-09T11:30:41.090307: Epoch   3 Batch  765/3125   train_loss = 0.864
2019-04-09T11:30:41.504196: Epoch   3 Batch  785/3125   train_loss = 1.046
2019-04-09T11:30:41.912423: Epoch   3 Batch  805/3125   train_loss = 0.816
2019-04-09T11:30:42.328090: Epoch   3 Batch  825/3125   train_loss = 0.904
2019-04-09T11:30:42.740677: Epoch   3 Batch  845/3125   train_loss = 0.932
2019-04-09T11:30:43.153777: Epoch   3 Batch  865/3125   train_loss = 1.004
2019-04-09T11:30:43.566946: Epoch   3 Batch  885/3125   train_loss = 0.968
2019-04-09T11:30:43.981050: Epoch   3 Batch  905/3125   train_loss = 0.998
2019-04-09T11:30:44.394270: Epoch   3 Batch  925/3125   train_loss = 0.896
2019-04-09T11:30:44.807669: Epoch   3 Batch  945/3125   train_loss = 0.978
2019-04-09T11:30:45.224278: Epoch   3 Batch  965/3125   train_loss = 0.731
2019-04-09T11:30:45.644716: Epoch   3 Batch  985/3125   train_loss = 1.003
2019-04-09T11:30:46.056218: Epoch   3 Batch 1005/3125   train_loss = 0.794
2019-04-09T11:30:46.465616: Epoch   3 Batch 1025/3125   train_loss = 0.879
2019-04-09T11:30:46.878718: Epoch   3 Batch 1045/3125   train_loss = 1.127
2019-04-09T11:30:47.297579: Epoch   3 Batch 1065/3125   train_loss = 0.875
2019-04-09T11:30:47.709534: Epoch   3 Batch 1085/3125   train_loss = 0.834
2019-04-09T11:30:48.125642: Epoch   3 Batch 1105/3125   train_loss = 0.842
2019-04-09T11:30:48.538103: Epoch   3 Batch 1125/3125   train_loss = 0.859
2019-04-09T11:30:48.952197: Epoch   3 Batch 1145/3125   train_loss = 0.905
2019-04-09T11:30:49.366261: Epoch   3 Batch 1165/3125   train_loss = 0.964
2019-04-09T11:30:49.774853: Epoch   3 Batch 1185/3125   train_loss = 0.869
2019-04-09T11:30:50.190392: Epoch   3 Batch 1205/3125   train_loss = 0.836
2019-04-09T11:30:50.605998: Epoch   3 Batch 1225/3125   train_loss = 1.002
2019-04-09T11:30:51.020181: Epoch   3 Batch 1245/3125   train_loss = 1.006
2019-04-09T11:30:51.434899: Epoch   3 Batch 1265/3125   train_loss = 0.896
2019-04-09T11:30:51.850872: Epoch   3 Batch 1285/3125   train_loss = 0.960
2019-04-09T11:30:52.265731: Epoch   3 Batch 1305/3125   train_loss = 0.802
2019-04-09T11:30:53.236710: Epoch   3 Batch 1325/3125   train_loss = 0.886
2019-04-09T11:30:53.650278: Epoch   3 Batch 1345/3125   train_loss = 0.928
2019-04-09T11:30:54.066153: Epoch   3 Batch 1365/3125   train_loss = 0.761
2019-04-09T11:30:54.481716: Epoch   3 Batch 1385/3125   train_loss = 0.779
2019-04-09T11:30:54.890807: Epoch   3 Batch 1405/3125   train_loss = 0.857
2019-04-09T11:30:55.303205: Epoch   3 Batch 1425/3125   train_loss = 1.106
2019-04-09T11:30:55.713796: Epoch   3 Batch 1445/3125   train_loss = 1.002
2019-04-09T11:30:56.127899: Epoch   3 Batch 1465/3125   train_loss = 0.887
2019-04-09T11:30:56.544126: Epoch   3 Batch 1485/3125   train_loss = 0.920
2019-04-09T11:30:56.952476: Epoch   3 Batch 1505/3125   train_loss = 0.745
2019-04-09T11:30:57.370433: Epoch   3 Batch 1525/3125   train_loss = 0.759
2019-04-09T11:30:57.781531: Epoch   3 Batch 1545/3125   train_loss = 0.843
2019-04-09T11:30:58.194632: Epoch   3 Batch 1565/3125   train_loss = 0.983
2019-04-09T11:30:58.613587: Epoch   3 Batch 1585/3125   train_loss = 0.827
2019-04-09T11:30:59.029585: Epoch   3 Batch 1605/3125   train_loss = 0.971
2019-04-09T11:30:59.443109: Epoch   3 Batch 1625/3125   train_loss = 0.950
2019-04-09T11:30:59.862969: Epoch   3 Batch 1645/3125   train_loss = 0.978
2019-04-09T11:31:00.280054: Epoch   3 Batch 1665/3125   train_loss = 0.916
2019-04-09T11:31:00.697972: Epoch   3 Batch 1685/3125   train_loss = 0.893
2019-04-09T11:31:01.120406: Epoch   3 Batch 1705/3125   train_loss = 0.883
2019-04-09T11:31:01.540523: Epoch   3 Batch 1725/3125   train_loss = 0.834
2019-04-09T11:31:01.957635: Epoch   3 Batch 1745/3125   train_loss = 0.775
2019-04-09T11:31:02.372311: Epoch   3 Batch 1765/3125   train_loss = 0.825
2019-04-09T11:31:02.786676: Epoch   3 Batch 1785/3125   train_loss = 1.015
2019-04-09T11:31:03.204288: Epoch   3 Batch 1805/3125   train_loss = 0.958
2019-04-09T11:31:03.616851: Epoch   3 Batch 1825/3125   train_loss = 1.031
2019-04-09T11:31:04.029497: Epoch   3 Batch 1845/3125   train_loss = 0.922
2019-04-09T11:31:04.442097: Epoch   3 Batch 1865/3125   train_loss = 0.753
2019-04-09T11:31:04.856887: Epoch   3 Batch 1885/3125   train_loss = 0.986
2019-04-09T11:31:05.271825: Epoch   3 Batch 1905/3125   train_loss = 0.799
2019-04-09T11:31:05.688152: Epoch   3 Batch 1925/3125   train_loss = 0.830
2019-04-09T11:31:06.097059: Epoch   3 Batch 1945/3125   train_loss = 0.865
2019-04-09T11:31:06.510931: Epoch   3 Batch 1965/3125   train_loss = 0.867
2019-04-09T11:31:06.924666: Epoch   3 Batch 1985/3125   train_loss = 0.840
2019-04-09T11:31:07.341276: Epoch   3 Batch 2005/3125   train_loss = 0.881
2019-04-09T11:31:07.755738: Epoch   3 Batch 2025/3125   train_loss = 0.951
2019-04-09T11:31:08.168337: Epoch   3 Batch 2045/3125   train_loss = 0.754
2019-04-09T11:31:08.583280: Epoch   3 Batch 2065/3125   train_loss = 0.727
2019-04-09T11:31:08.998421: Epoch   3 Batch 2085/3125   train_loss = 1.058
2019-04-09T11:31:09.415818: Epoch   3 Batch 2105/3125   train_loss = 0.891
2019-04-09T11:31:09.827917: Epoch   3 Batch 2125/3125   train_loss = 0.976
2019-04-09T11:31:10.237408: Epoch   3 Batch 2145/3125   train_loss = 1.002
2019-04-09T11:31:10.652222: Epoch   3 Batch 2165/3125   train_loss = 0.862
2019-04-09T11:31:11.061610: Epoch   3 Batch 2185/3125   train_loss = 0.948
2019-04-09T11:31:11.476691: Epoch   3 Batch 2205/3125   train_loss = 0.958
2019-04-09T11:31:11.893028: Epoch   3 Batch 2225/3125   train_loss = 0.811
2019-04-09T11:31:12.428069: Epoch   3 Batch 2245/3125   train_loss = 0.798
2019-04-09T11:31:12.840171: Epoch   3 Batch 2265/3125   train_loss = 0.896
2019-04-09T11:31:13.254127: Epoch   3 Batch 2285/3125   train_loss = 1.099
2019-04-09T11:31:13.671868: Epoch   3 Batch 2305/3125   train_loss = 0.812
2019-04-09T11:31:14.083559: Epoch   3 Batch 2325/3125   train_loss = 0.788
2019-04-09T11:31:14.499758: Epoch   3 Batch 2345/3125   train_loss = 0.885
2019-04-09T11:31:14.912859: Epoch   3 Batch 2365/3125   train_loss = 0.702
2019-04-09T11:31:15.331776: Epoch   3 Batch 2385/3125   train_loss = 0.915
2019-04-09T11:31:15.749019: Epoch   3 Batch 2405/3125   train_loss = 0.908
2019-04-09T11:31:16.161618: Epoch   3 Batch 2425/3125   train_loss = 0.875
2019-04-09T11:31:16.583581: Epoch   3 Batch 2445/3125   train_loss = 1.002
2019-04-09T11:31:17.000198: Epoch   3 Batch 2465/3125   train_loss = 0.748
2019-04-09T11:31:17.420234: Epoch   3 Batch 2485/3125   train_loss = 0.880
2019-04-09T11:31:17.834288: Epoch   3 Batch 2505/3125   train_loss = 0.852
2019-04-09T11:31:18.247812: Epoch   3 Batch 2525/3125   train_loss = 0.849
2019-04-09T11:31:18.663700: Epoch   3 Batch 2545/3125   train_loss = 1.010
2019-04-09T11:31:19.076134: Epoch   3 Batch 2565/3125   train_loss = 0.851
2019-04-09T11:31:19.490451: Epoch   3 Batch 2585/3125   train_loss = 0.768
2019-04-09T11:31:19.905388: Epoch   3 Batch 2605/3125   train_loss = 0.867
2019-04-09T11:31:20.318355: Epoch   3 Batch 2625/3125   train_loss = 1.004
2019-04-09T11:31:20.732786: Epoch   3 Batch 2645/3125   train_loss = 0.906
2019-04-09T11:31:21.146894: Epoch   3 Batch 2665/3125   train_loss = 0.984
2019-04-09T11:31:21.566102: Epoch   3 Batch 2685/3125   train_loss = 0.920
2019-04-09T11:31:21.981681: Epoch   3 Batch 2705/3125   train_loss = 0.784
2019-04-09T11:31:22.399609: Epoch   3 Batch 2725/3125   train_loss = 0.916
2019-04-09T11:31:22.817940: Epoch   3 Batch 2745/3125   train_loss = 0.925
2019-04-09T11:31:23.266133: Epoch   3 Batch 2765/3125   train_loss = 0.837
2019-04-09T11:31:23.679262: Epoch   3 Batch 2785/3125   train_loss = 0.935
2019-04-09T11:31:24.097862: Epoch   3 Batch 2805/3125   train_loss = 0.839
2019-04-09T11:31:24.511944: Epoch   3 Batch 2825/3125   train_loss = 0.844
2019-04-09T11:31:24.926787: Epoch   3 Batch 2845/3125   train_loss = 0.858
2019-04-09T11:31:25.347381: Epoch   3 Batch 2865/3125   train_loss = 0.853
2019-04-09T11:31:25.764592: Epoch   3 Batch 2885/3125   train_loss = 0.939
2019-04-09T11:31:26.184209: Epoch   3 Batch 2905/3125   train_loss = 0.969
2019-04-09T11:31:26.601925: Epoch   3 Batch 2925/3125   train_loss = 0.868
2019-04-09T11:31:27.016711: Epoch   3 Batch 2945/3125   train_loss = 0.900
2019-04-09T11:31:27.435058: Epoch   3 Batch 2965/3125   train_loss = 0.939
2019-04-09T11:31:27.848061: Epoch   3 Batch 2985/3125   train_loss = 0.843
2019-04-09T11:31:28.261955: Epoch   3 Batch 3005/3125   train_loss = 0.860
2019-04-09T11:31:28.677308: Epoch   3 Batch 3025/3125   train_loss = 0.917
2019-04-09T11:31:29.091668: Epoch   3 Batch 3045/3125   train_loss = 0.883
2019-04-09T11:31:29.505770: Epoch   3 Batch 3065/3125   train_loss = 0.864
2019-04-09T11:31:29.920149: Epoch   3 Batch 3085/3125   train_loss = 0.867
2019-04-09T11:31:30.335191: Epoch   3 Batch 3105/3125   train_loss = 0.929
2019-04-09T11:31:30.978022: Epoch   3 Batch   17/781   test_loss = 0.866
2019-04-09T11:31:31.112380: Epoch   3 Batch   37/781   test_loss = 0.868
2019-04-09T11:31:31.248741: Epoch   3 Batch   57/781   test_loss = 0.894
2019-04-09T11:31:31.387784: Epoch   3 Batch   77/781   test_loss = 0.898
2019-04-09T11:31:31.519144: Epoch   3 Batch   97/781   test_loss = 0.790
2019-04-09T11:31:31.648478: Epoch   3 Batch  117/781   test_loss = 0.950
2019-04-09T11:31:31.787347: Epoch   3 Batch  137/781   test_loss = 0.922
2019-04-09T11:31:31.934742: Epoch   3 Batch  157/781   test_loss = 0.919
2019-04-09T11:31:32.076115: Epoch   3 Batch  177/781   test_loss = 0.873
2019-04-09T11:31:32.206462: Epoch   3 Batch  197/781   test_loss = 0.928
2019-04-09T11:31:32.347500: Epoch   3 Batch  217/781   test_loss = 0.699
2019-04-09T11:31:32.483362: Epoch   3 Batch  237/781   test_loss = 0.752
2019-04-09T11:31:32.612205: Epoch   3 Batch  257/781   test_loss = 1.014
2019-04-09T11:31:32.754584: Epoch   3 Batch  277/781   test_loss = 0.979
2019-04-09T11:31:32.897965: Epoch   3 Batch  297/781   test_loss = 0.961
2019-04-09T11:31:33.031821: Epoch   3 Batch  317/781   test_loss = 1.030
2019-04-09T11:31:33.166680: Epoch   3 Batch  337/781   test_loss = 0.906
2019-04-09T11:31:33.308477: Epoch   3 Batch  357/781   test_loss = 0.883
2019-04-09T11:31:33.450355: Epoch   3 Batch  377/781   test_loss = 0.932
2019-04-09T11:31:33.580701: Epoch   3 Batch  397/781   test_loss = 0.918
2019-04-09T11:31:33.721075: Epoch   3 Batch  417/781   test_loss = 0.842
2019-04-09T11:31:33.859944: Epoch   3 Batch  437/781   test_loss = 0.808
2019-04-09T11:31:33.988286: Epoch   3 Batch  457/781   test_loss = 0.690
2019-04-09T11:31:34.116627: Epoch   3 Batch  477/781   test_loss = 0.923
2019-04-09T11:31:34.256500: Epoch   3 Batch  497/781   test_loss = 0.807
2019-04-09T11:31:34.394868: Epoch   3 Batch  517/781   test_loss = 0.805
2019-04-09T11:31:34.522207: Epoch   3 Batch  537/781   test_loss = 0.802
2019-04-09T11:31:34.650046: Epoch   3 Batch  557/781   test_loss = 1.050
2019-04-09T11:31:34.792425: Epoch   3 Batch  577/781   test_loss = 0.912
2019-04-09T11:31:34.930292: Epoch   3 Batch  597/781   test_loss = 0.875
2019-04-09T11:31:35.058634: Epoch   3 Batch  617/781   test_loss = 0.862
2019-04-09T11:31:35.184973: Epoch   3 Batch  637/781   test_loss = 0.781
2019-04-09T11:31:35.314815: Epoch   3 Batch  657/781   test_loss = 1.008
2019-04-09T11:31:35.444363: Epoch   3 Batch  677/781   test_loss = 0.931
2019-04-09T11:31:35.578721: Epoch   3 Batch  697/781   test_loss = 0.907
2019-04-09T11:31:35.712076: Epoch   3 Batch  717/781   test_loss = 0.812
2019-04-09T11:31:35.841921: Epoch   3 Batch  737/781   test_loss = 0.764
2019-04-09T11:31:35.983800: Epoch   3 Batch  757/781   test_loss = 1.099
2019-04-09T11:31:36.119660: Epoch   3 Batch  777/781   test_loss = 0.960
2019-04-09T11:31:36.666392: Epoch   4 Batch    0/3125   train_loss = 0.960
2019-04-09T11:31:37.108038: Epoch   4 Batch   20/3125   train_loss = 0.848
2019-04-09T11:31:37.523644: Epoch   4 Batch   40/3125   train_loss = 0.929
2019-04-09T11:31:37.940279: Epoch   4 Batch   60/3125   train_loss = 0.729
2019-04-09T11:31:38.360397: Epoch   4 Batch   80/3125   train_loss = 0.870
2019-04-09T11:31:38.783226: Epoch   4 Batch  100/3125   train_loss = 0.972
2019-04-09T11:31:39.208774: Epoch   4 Batch  120/3125   train_loss = 1.008
2019-04-09T11:31:39.670500: Epoch   4 Batch  140/3125   train_loss = 0.932
2019-04-09T11:31:40.130223: Epoch   4 Batch  160/3125   train_loss = 0.786
2019-04-09T11:31:40.578223: Epoch   4 Batch  180/3125   train_loss = 0.829
2019-04-09T11:31:40.994831: Epoch   4 Batch  200/3125   train_loss = 1.105
2019-04-09T11:31:41.423976: Epoch   4 Batch  220/3125   train_loss = 0.862
2019-04-09T11:31:41.847103: Epoch   4 Batch  240/3125   train_loss = 0.981
2019-04-09T11:31:42.273237: Epoch   4 Batch  260/3125   train_loss = 0.926
2019-04-09T11:31:42.696015: Epoch   4 Batch  280/3125   train_loss = 0.991
2019-04-09T11:31:43.118928: Epoch   4 Batch  300/3125   train_loss = 1.056
2019-04-09T11:31:43.543558: Epoch   4 Batch  320/3125   train_loss = 0.991
2019-04-09T11:31:43.963668: Epoch   4 Batch  340/3125   train_loss = 0.723
2019-04-09T11:31:44.405001: Epoch   4 Batch  360/3125   train_loss = 0.811
2019-04-09T11:31:44.837830: Epoch   4 Batch  380/3125   train_loss = 0.903
2019-04-09T11:31:45.256898: Epoch   4 Batch  400/3125   train_loss = 0.788
2019-04-09T11:31:45.684205: Epoch   4 Batch  420/3125   train_loss = 0.845
2019-04-09T11:31:46.114850: Epoch   4 Batch  440/3125   train_loss = 0.845
2019-04-09T11:31:46.554569: Epoch   4 Batch  460/3125   train_loss = 0.917
2019-04-09T11:31:46.990729: Epoch   4 Batch  480/3125   train_loss = 0.982
2019-04-09T11:31:47.417146: Epoch   4 Batch  500/3125   train_loss = 0.671
2019-04-09T11:31:47.851802: Epoch   4 Batch  520/3125   train_loss = 0.905
2019-04-09T11:31:48.283919: Epoch   4 Batch  540/3125   train_loss = 0.806
2019-04-09T11:31:48.718582: Epoch   4 Batch  560/3125   train_loss = 1.032
2019-04-09T11:31:49.138201: Epoch   4 Batch  580/3125   train_loss = 0.989
2019-04-09T11:31:49.559825: Epoch   4 Batch  600/3125   train_loss = 0.909
2019-04-09T11:31:49.989670: Epoch   4 Batch  620/3125   train_loss = 0.941
2019-04-09T11:31:50.406780: Epoch   4 Batch  640/3125   train_loss = 0.862
2019-04-09T11:31:50.859348: Epoch   4 Batch  660/3125   train_loss = 0.912
2019-04-09T11:31:51.275455: Epoch   4 Batch  680/3125   train_loss = 0.932
2019-04-09T11:31:51.691919: Epoch   4 Batch  700/3125   train_loss = 0.911
2019-04-09T11:31:52.107926: Epoch   4 Batch  720/3125   train_loss = 0.782
2019-04-09T11:31:52.527656: Epoch   4 Batch  740/3125   train_loss = 0.911
2019-04-09T11:31:52.969684: Epoch   4 Batch  760/3125   train_loss = 0.782
2019-04-09T11:31:53.409955: Epoch   4 Batch  780/3125   train_loss = 0.905
2019-04-09T11:31:53.832580: Epoch   4 Batch  800/3125   train_loss = 0.798
2019-04-09T11:31:54.247683: Epoch   4 Batch  820/3125   train_loss = 0.871
2019-04-09T11:31:54.668933: Epoch   4 Batch  840/3125   train_loss = 0.808
2019-04-09T11:31:55.088550: Epoch   4 Batch  860/3125   train_loss = 0.828
2019-04-09T11:31:55.506010: Epoch   4 Batch  880/3125   train_loss = 0.811
2019-04-09T11:31:55.953370: Epoch   4 Batch  900/3125   train_loss = 0.888
2019-04-09T11:31:56.475762: Epoch   4 Batch  920/3125   train_loss = 0.953
2019-04-09T11:31:56.895627: Epoch   4 Batch  940/3125   train_loss = 0.898
2019-04-09T11:31:57.314926: Epoch   4 Batch  960/3125   train_loss = 0.927
2019-04-09T11:31:57.736404: Epoch   4 Batch  980/3125   train_loss = 1.019
2019-04-09T11:31:58.155519: Epoch   4 Batch 1000/3125   train_loss = 0.972
2019-04-09T11:31:58.571659: Epoch   4 Batch 1020/3125   train_loss = 0.885
2019-04-09T11:31:58.987239: Epoch   4 Batch 1040/3125   train_loss = 0.766
2019-04-09T11:31:59.407857: Epoch   4 Batch 1060/3125   train_loss = 0.975
2019-04-09T11:31:59.827189: Epoch   4 Batch 1080/3125   train_loss = 0.890
2019-04-09T11:32:00.250485: Epoch   4 Batch 1100/3125   train_loss = 0.794
2019-04-09T11:32:00.665686: Epoch   4 Batch 1120/3125   train_loss = 0.830
2019-04-09T11:32:01.076280: Epoch   4 Batch 1140/3125   train_loss = 0.850
2019-04-09T11:32:01.495207: Epoch   4 Batch 1160/3125   train_loss = 0.826
2019-04-09T11:32:01.909009: Epoch   4 Batch 1180/3125   train_loss = 0.813
2019-04-09T11:32:02.325685: Epoch   4 Batch 1200/3125   train_loss = 1.011
2019-04-09T11:32:02.747689: Epoch   4 Batch 1220/3125   train_loss = 0.964
2019-04-09T11:32:03.171817: Epoch   4 Batch 1240/3125   train_loss = 0.782
2019-04-09T11:32:03.593569: Epoch   4 Batch 1260/3125   train_loss = 0.848
2019-04-09T11:32:04.011798: Epoch   4 Batch 1280/3125   train_loss = 0.908
2019-04-09T11:32:04.430913: Epoch   4 Batch 1300/3125   train_loss = 0.794
2019-04-09T11:32:04.846453: Epoch   4 Batch 1320/3125   train_loss = 0.872
2019-04-09T11:32:05.263562: Epoch   4 Batch 1340/3125   train_loss = 0.716
2019-04-09T11:32:05.679810: Epoch   4 Batch 1360/3125   train_loss = 0.847
2019-04-09T11:32:06.099427: Epoch   4 Batch 1380/3125   train_loss = 0.831
2019-04-09T11:32:06.515033: Epoch   4 Batch 1400/3125   train_loss = 0.932
2019-04-09T11:32:06.932977: Epoch   4 Batch 1420/3125   train_loss = 0.911
2019-04-09T11:32:07.349584: Epoch   4 Batch 1440/3125   train_loss = 0.767
2019-04-09T11:32:07.768391: Epoch   4 Batch 1460/3125   train_loss = 0.885
2019-04-09T11:32:08.186503: Epoch   4 Batch 1480/3125   train_loss = 0.855
2019-04-09T11:32:08.610562: Epoch   4 Batch 1500/3125   train_loss = 0.890
2019-04-09T11:32:09.027935: Epoch   4 Batch 1520/3125   train_loss = 0.807
2019-04-09T11:32:09.448052: Epoch   4 Batch 1540/3125   train_loss = 0.970
2019-04-09T11:32:09.864802: Epoch   4 Batch 1560/3125   train_loss = 0.786
2019-04-09T11:32:10.279906: Epoch   4 Batch 1580/3125   train_loss = 0.913
2019-04-09T11:32:10.694227: Epoch   4 Batch 1600/3125   train_loss = 0.830
2019-04-09T11:32:11.113843: Epoch   4 Batch 1620/3125   train_loss = 0.764
2019-04-09T11:32:11.535264: Epoch   4 Batch 1640/3125   train_loss = 0.948
2019-04-09T11:32:11.951873: Epoch   4 Batch 1660/3125   train_loss = 1.003
2019-04-09T11:32:12.368324: Epoch   4 Batch 1680/3125   train_loss = 0.899
2019-04-09T11:32:12.877578: Epoch   4 Batch 1700/3125   train_loss = 0.787
2019-04-09T11:32:13.293848: Epoch   4 Batch 1720/3125   train_loss = 0.872
2019-04-09T11:32:13.710885: Epoch   4 Batch 1740/3125   train_loss = 0.929
2019-04-09T11:32:14.120976: Epoch   4 Batch 1760/3125   train_loss = 0.887
2019-04-09T11:32:14.538451: Epoch   4 Batch 1780/3125   train_loss = 0.851
2019-04-09T11:32:14.959239: Epoch   4 Batch 1800/3125   train_loss = 0.820
2019-04-09T11:32:15.374844: Epoch   4 Batch 1820/3125   train_loss = 0.807
2019-04-09T11:32:15.787555: Epoch   4 Batch 1840/3125   train_loss = 0.903
2019-04-09T11:32:16.206090: Epoch   4 Batch 1860/3125   train_loss = 0.977
2019-04-09T11:32:16.620547: Epoch   4 Batch 1880/3125   train_loss = 0.887
2019-04-09T11:32:17.036185: Epoch   4 Batch 1900/3125   train_loss = 0.734
2019-04-09T11:32:17.454960: Epoch   4 Batch 1920/3125   train_loss = 0.883
2019-04-09T11:32:17.870896: Epoch   4 Batch 1940/3125   train_loss = 0.792
2019-04-09T11:32:18.287611: Epoch   4 Batch 1960/3125   train_loss = 0.756
2019-04-09T11:32:18.708944: Epoch   4 Batch 1980/3125   train_loss = 0.856
2019-04-09T11:32:19.124550: Epoch   4 Batch 2000/3125   train_loss = 0.989
2019-04-09T11:32:19.539524: Epoch   4 Batch 2020/3125   train_loss = 0.987
2019-04-09T11:32:19.955392: Epoch   4 Batch 2040/3125   train_loss = 0.793
2019-04-09T11:32:20.373002: Epoch   4 Batch 2060/3125   train_loss = 0.851
2019-04-09T11:32:20.788365: Epoch   4 Batch 2080/3125   train_loss = 0.980
2019-04-09T11:32:21.207642: Epoch   4 Batch 2100/3125   train_loss = 0.782
2019-04-09T11:32:21.628621: Epoch   4 Batch 2120/3125   train_loss = 0.808
2019-04-09T11:32:22.042255: Epoch   4 Batch 2140/3125   train_loss = 0.840
2019-04-09T11:32:22.456976: Epoch   4 Batch 2160/3125   train_loss = 0.829
2019-04-09T11:32:22.867969: Epoch   4 Batch 2180/3125   train_loss = 0.917
2019-04-09T11:32:23.281501: Epoch   4 Batch 2200/3125   train_loss = 0.803
2019-04-09T11:32:23.696260: Epoch   4 Batch 2220/3125   train_loss = 0.832
2019-04-09T11:32:24.112367: Epoch   4 Batch 2240/3125   train_loss = 0.797
2019-04-09T11:32:24.528127: Epoch   4 Batch 2260/3125   train_loss = 0.872
2019-04-09T11:32:24.944427: Epoch   4 Batch 2280/3125   train_loss = 0.880
2019-04-09T11:32:25.362539: Epoch   4 Batch 2300/3125   train_loss = 0.847
2019-04-09T11:32:25.776624: Epoch   4 Batch 2320/3125   train_loss = 0.908
2019-04-09T11:32:26.191315: Epoch   4 Batch 2340/3125   train_loss = 0.849
2019-04-09T11:32:26.607493: Epoch   4 Batch 2360/3125   train_loss = 0.881
2019-04-09T11:32:27.021723: Epoch   4 Batch 2380/3125   train_loss = 0.835
2019-04-09T11:32:27.440410: Epoch   4 Batch 2400/3125   train_loss = 0.915
2019-04-09T11:32:27.850694: Epoch   4 Batch 2420/3125   train_loss = 0.794
2019-04-09T11:32:28.265448: Epoch   4 Batch 2440/3125   train_loss = 0.800
2019-04-09T11:32:28.684222: Epoch   4 Batch 2460/3125   train_loss = 0.852
2019-04-09T11:32:29.103336: Epoch   4 Batch 2480/3125   train_loss = 0.954
2019-04-09T11:32:29.520448: Epoch   4 Batch 2500/3125   train_loss = 0.811
2019-04-09T11:32:29.941087: Epoch   4 Batch 2520/3125   train_loss = 0.885
2019-04-09T11:32:30.357195: Epoch   4 Batch 2540/3125   train_loss = 0.845
2019-04-09T11:32:30.780301: Epoch   4 Batch 2560/3125   train_loss = 0.665
2019-04-09T11:32:31.195065: Epoch   4 Batch 2580/3125   train_loss = 0.825
2019-04-09T11:32:31.604654: Epoch   4 Batch 2600/3125   train_loss = 0.868
2019-04-09T11:32:32.018648: Epoch   4 Batch 2620/3125   train_loss = 0.813
2019-04-09T11:32:32.435706: Epoch   4 Batch 2640/3125   train_loss = 0.826
2019-04-09T11:32:32.853230: Epoch   4 Batch 2660/3125   train_loss = 1.017
2019-04-09T11:32:33.270841: Epoch   4 Batch 2680/3125   train_loss = 0.769
2019-04-09T11:32:33.692010: Epoch   4 Batch 2700/3125   train_loss = 0.922
2019-04-09T11:32:34.136192: Epoch   4 Batch 2720/3125   train_loss = 0.796
2019-04-09T11:32:34.551979: Epoch   4 Batch 2740/3125   train_loss = 0.870
2019-04-09T11:32:34.968683: Epoch   4 Batch 2760/3125   train_loss = 0.799
2019-04-09T11:32:35.385795: Epoch   4 Batch 2780/3125   train_loss = 0.842
2019-04-09T11:32:35.803009: Epoch   4 Batch 2800/3125   train_loss = 1.050
2019-04-09T11:32:36.220554: Epoch   4 Batch 2820/3125   train_loss = 1.034
2019-04-09T11:32:36.638668: Epoch   4 Batch 2840/3125   train_loss = 0.822
2019-04-09T11:32:37.057100: Epoch   4 Batch 2860/3125   train_loss = 0.789
2019-04-09T11:32:37.477429: Epoch   4 Batch 2880/3125   train_loss = 0.858
2019-04-09T11:32:37.894122: Epoch   4 Batch 2900/3125   train_loss = 0.833
2019-04-09T11:32:38.309463: Epoch   4 Batch 2920/3125   train_loss = 0.849
2019-04-09T11:32:38.727701: Epoch   4 Batch 2940/3125   train_loss = 0.879
2019-04-09T11:32:39.142808: Epoch   4 Batch 2960/3125   train_loss = 0.877
2019-04-09T11:32:39.560118: Epoch   4 Batch 2980/3125   train_loss = 0.827
2019-04-09T11:32:39.978247: Epoch   4 Batch 3000/3125   train_loss = 0.920
2019-04-09T11:32:40.396863: Epoch   4 Batch 3020/3125   train_loss = 1.001
2019-04-09T11:32:40.812059: Epoch   4 Batch 3040/3125   train_loss = 0.956
2019-04-09T11:32:41.228167: Epoch   4 Batch 3060/3125   train_loss = 0.814
2019-04-09T11:32:41.643774: Epoch   4 Batch 3080/3125   train_loss = 1.017
2019-04-09T11:32:42.059833: Epoch   4 Batch 3100/3125   train_loss = 1.032
2019-04-09T11:32:42.478235: Epoch   4 Batch 3120/3125   train_loss = 0.816
2019-04-09T11:32:42.674176: Epoch   4 Batch   16/781   test_loss = 0.830
2019-04-09T11:32:42.806027: Epoch   4 Batch   36/781   test_loss = 0.903
2019-04-09T11:32:42.936875: Epoch   4 Batch   56/781   test_loss = 0.934
2019-04-09T11:32:43.067222: Epoch   4 Batch   76/781   test_loss = 0.974
2019-04-09T11:32:43.197569: Epoch   4 Batch   96/781   test_loss = 1.000
2019-04-09T11:32:43.326913: Epoch   4 Batch  116/781   test_loss = 0.887
2019-04-09T11:32:43.457535: Epoch   4 Batch  136/781   test_loss = 0.811
2019-04-09T11:32:43.588383: Epoch   4 Batch  156/781   test_loss = 0.876
2019-04-09T11:32:43.716224: Epoch   4 Batch  176/781   test_loss = 0.865
2019-04-09T11:32:43.846583: Epoch   4 Batch  196/781   test_loss = 0.786
2019-04-09T11:32:43.975413: Epoch   4 Batch  216/781   test_loss = 0.974
2019-04-09T11:32:44.105258: Epoch   4 Batch  236/781   test_loss = 0.793
2019-04-09T11:32:44.235605: Epoch   4 Batch  256/781   test_loss = 0.827
2019-04-09T11:32:44.367456: Epoch   4 Batch  276/781   test_loss = 1.097
2019-04-09T11:32:44.496146: Epoch   4 Batch  296/781   test_loss = 0.813
2019-04-09T11:32:44.625489: Epoch   4 Batch  316/781   test_loss = 0.820
2019-04-09T11:32:44.754834: Epoch   4 Batch  336/781   test_loss = 0.760
2019-04-09T11:32:44.884178: Epoch   4 Batch  356/781   test_loss = 0.885
2019-04-09T11:32:45.013021: Epoch   4 Batch  376/781   test_loss = 0.872
2019-04-09T11:32:45.141362: Epoch   4 Batch  396/781   test_loss = 0.807
2019-04-09T11:32:45.273213: Epoch   4 Batch  416/781   test_loss = 0.935
2019-04-09T11:32:45.402058: Epoch   4 Batch  436/781   test_loss = 0.955
2019-04-09T11:32:45.533244: Epoch   4 Batch  456/781   test_loss = 0.735
2019-04-09T11:32:45.666098: Epoch   4 Batch  476/781   test_loss = 0.931
2019-04-09T11:32:45.795442: Epoch   4 Batch  496/781   test_loss = 0.966
2019-04-09T11:32:45.925789: Epoch   4 Batch  516/781   test_loss = 0.760
2019-04-09T11:32:46.054130: Epoch   4 Batch  536/781   test_loss = 0.990
2019-04-09T11:32:46.183474: Epoch   4 Batch  556/781   test_loss = 0.868
2019-04-09T11:32:46.312818: Epoch   4 Batch  576/781   test_loss = 0.940
2019-04-09T11:32:46.441522: Epoch   4 Batch  596/781   test_loss = 0.959
2019-04-09T11:32:46.571869: Epoch   4 Batch  616/781   test_loss = 0.930
2019-04-09T11:32:46.702216: Epoch   4 Batch  636/781   test_loss = 0.809
2019-04-09T11:32:46.831560: Epoch   4 Batch  656/781   test_loss = 0.876
2019-04-09T11:32:46.960904: Epoch   4 Batch  676/781   test_loss = 1.057
2019-04-09T11:32:47.092254: Epoch   4 Batch  696/781   test_loss = 0.856
2019-04-09T11:32:47.222600: Epoch   4 Batch  716/781   test_loss = 0.852
2019-04-09T11:32:47.351944: Epoch   4 Batch  736/781   test_loss = 1.075
2019-04-09T11:32:47.480286: Epoch   4 Batch  756/781   test_loss = 0.809
2019-04-09T11:32:47.610131: Epoch   4 Batch  776/781   test_loss = 0.753
Model Trained and Saved

View the visualization in TensorBoard

tensorboard --logdir=/PATH_TO_CODE/runs/1513402825/summaries/


Save parameters

Save save_dir so it can be reused when generating predictions.

save_params(save_dir)

load_dir = load_params()

Plot the training loss

plt.plot(losses['train'], label='Training loss')
plt.legend()
_ = plt.ylim()


Plot the test loss

With more training iterations, the downward trend would be more pronounced.

plt.plot(losses['test'], label='Test loss')
plt.legend()
_ = plt.ylim()
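
Per-batch losses are noisy, which can hide a slow downward trend. One way to make the trend easier to read is to smooth the curve with a moving average before plotting. The sketch below is only an illustration: moving_average is a hypothetical helper that is not part of the original notebook, and the synthetic fake_losses array stands in for losses['test'].

```python
import numpy as np

def moving_average(values, window=50):
    """Return the simple moving average of `values` over `window` points."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode="valid")

# Hypothetical stand-in for losses['test']: noisy values with a slow downward drift
rng = np.random.default_rng(0)
fake_losses = 1.0 - 0.0002 * np.arange(1000) + rng.normal(0.0, 0.1, size=1000)
smoothed = moving_average(fake_losses, window=50)
print(len(smoothed))  # 951
```

In the notebook, the smoothed series could be plotted with plt.plot(moving_average(losses['test'])) next to the raw curve.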


Get the tensors

Use get_tensor_by_name() to fetch the tensors from loaded_graph; the recommendation functions below need them.

def get_tensors(loaded_graph):

    uid = loaded_graph.get_tensor_by_name("uid:0")
    user_gender = loaded_graph.get_tensor_by_name("user_gender:0")
    user_age = loaded_graph.get_tensor_by_name("user_age:0")
    user_job = loaded_graph.get_tensor_by_name("user_job:0")
    movie_id = loaded_graph.get_tensor_by_name("movie_id:0")
    movie_categories = loaded_graph.get_tensor_by_name("movie_categories:0")
    movie_titles = loaded_graph.get_tensor_by_name("movie_titles:0")
    targets = loaded_graph.get_tensor_by_name("targets:0")
    dropout_keep_prob = loaded_graph.get_tensor_by_name("dropout_keep_prob:0")
    lr = loaded_graph.get_tensor_by_name("LearningRate:0")
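    # The two schemes for computing the predicted rating use different tensor names for inference
#     inference = loaded_graph.get_tensor_by_name("inference/inference/BiasAdd:0")
    inference = loaded_graph.get_tensor_by_name("inference/ExpandDims:0") # Previously MatMul:0; the inference code changed, so the name here had to change too. Thanks to reader @清歌 for pointing this out.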
    movie_combine_layer_flat = loaded_graph.get_tensor_by_name("movie_fc/Reshape:0")
    user_combine_layer_flat = loaded_graph.get_tensor_by_name("user_fc/Reshape:0")
    return uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference, movie_combine_layer_flat, user_combine_layer_flat

Predict the rating for a given user and movie

This part runs a forward pass through the network to compute the predicted rating.

def rating_movie(user_id_val, movie_id_val):
    loaded_graph = tf.Graph()  #
    with tf.Session(graph=loaded_graph) as sess:  #
        # Load saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)
    
        # Get Tensors from loaded model
        uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, inference,_, __ = get_tensors(loaded_graph)  #loaded_graph
    
        categories = np.zeros([1, 18])
        categories[0] = movies.values[movieid2idx[movie_id_val]][2]
    
        titles = np.zeros([1, sentences_size])
        titles[0] = movies.values[movieid2idx[movie_id_val]][1]
    
        feed = {
              uid: np.reshape(users.values[user_id_val-1][0], [1, 1]),
              user_gender: np.reshape(users.values[user_id_val-1][1], [1, 1]),
              user_age: np.reshape(users.values[user_id_val-1][2], [1, 1]),
              user_job: np.reshape(users.values[user_id_val-1][3], [1, 1]),
              movie_id: np.reshape(movies.values[movieid2idx[movie_id_val]][0], [1, 1]),
              movie_categories: categories,  #x.take(6,1)
              movie_titles: titles,  #x.take(5,1)
              dropout_keep_prob: 1}
    
        # Get Prediction
        inference_val = sess.run([inference], feed)  
    
        return (inference_val)

rating_movie(234, 1401)

INFO:tensorflow:Restoring parameters from ./save

[array([[3.1157281]], dtype=float32)]

Generate the movie feature matrix

Combine the trained movie features into a movie feature matrix and save it to disk.

loaded_graph = tf.Graph()  #
movie_matrics = []
with tf.Session(graph=loaded_graph) as sess:  #
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, movie_combine_layer_flat, __ = get_tensors(loaded_graph)  #loaded_graph

    for item in movies.values:
        categories = np.zeros([1, 18])
        categories[0] = item.take(2)

        titles = np.zeros([1, sentences_size])
        titles[0] = item.take(1)

        feed = {
            movie_id: np.reshape(item.take(0), [1, 1]),
            movie_categories: categories,  #x.take(6,1)
            movie_titles: titles,  #x.take(5,1)
            dropout_keep_prob: 1}

        movie_combine_layer_flat_val = sess.run([movie_combine_layer_flat], feed)  
        movie_matrics.append(movie_combine_layer_flat_val)

pickle.dump((np.array(movie_matrics).reshape(-1, 200)), open('movie_matrics.p', 'wb'))
movie_matrics = pickle.load(open('movie_matrics.p', mode='rb'))

INFO:tensorflow:Restoring parameters from ./save


Generate the user feature matrix

Combine the trained user features into a user feature matrix and save it to disk.

loaded_graph = tf.Graph()  #
users_matrics = []
with tf.Session(graph=loaded_graph) as sess:  #
    # Load saved model
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)

    # Get Tensors from loaded model
    uid, user_gender, user_age, user_job, movie_id, movie_categories, movie_titles, targets, lr, dropout_keep_prob, _, __,user_combine_layer_flat = get_tensors(loaded_graph)  #loaded_graph

    for item in users.values:

        feed = {
            uid: np.reshape(item.take(0), [1, 1]),
            user_gender: np.reshape(item.take(1), [1, 1]),
            user_age: np.reshape(item.take(2), [1, 1]),
            user_job: np.reshape(item.take(3), [1, 1]),
            dropout_keep_prob: 1}

        user_combine_layer_flat_val = sess.run([user_combine_layer_flat], feed)  
        users_matrics.append(user_combine_layer_flat_val)

pickle.dump((np.array(users_matrics).reshape(-1, 200)), open('users_matrics.p', 'wb'))
users_matrics = pickle.load(open('users_matrics.p', mode='rb'))

INFO:tensorflow:Restoring parameters from ./save


Start recommending movies

Use the generated user feature matrix and movie feature matrix to recommend movies.
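
Once both matrices are on disk, candidate movies can be ranked for a user without reopening a TensorFlow session. The sketch below assumes that a user-movie pair is scored by the dot product of the two 200-dimensional feature vectors, which is what the tf.matmul calls in the recommendation function suggest. The names score_all_movies and top_n_movies are hypothetical, and the random toy_users and toy_movies arrays stand in for users_matrics and movie_matrics.

```python
import numpy as np

# Hypothetical stand-ins for users_matrics and movie_matrics (rows are 200-d features)
rng = np.random.default_rng(42)
toy_users = rng.normal(size=(6, 200))    # 6 users
toy_movies = rng.normal(size=(10, 200))  # 10 movies

def score_all_movies(user_vec, movie_mat):
    """Dot product of one user vector with every movie row."""
    return movie_mat @ user_vec

def top_n_movies(user_vec, movie_mat, n=3):
    """Row indices of the n highest-scoring movies, best first."""
    scores = score_all_movies(user_vec, movie_mat)
    return np.argsort(scores)[::-1][:n]

scores = score_all_movies(toy_users[0], toy_movies)
best = top_n_movies(toy_users[0], toy_movies, n=3)
print(scores.shape, best)
```

The returned values are row indices into the movie matrix, not MovieIDs; in the notebook the two are related through movieid2idx.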

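People who watched this movie also watched (liked)

First, pick the top_k users who like a given movie and get their user feature vectors.
Then compute these users' ratings for all movies.
Pick each user's highest-rated movie as a recommendation.
Random selection is added here as well.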

import random

def recommend_other_favorite_movie(movie_id_val, top_k = 20):
    loaded_graph = tf.Graph()
    with tf.Session(graph=loaded_graph) as sess:
        # Load saved model
        loader = tf.train.import_meta_graph(load_dir + '.meta')
        loader.restore(sess, load_dir)

        # Score every user against the chosen movie and keep the top_k highest scores
        probs_movie_embeddings = (movie_matrics[movieid2idx[movie_id_val]]).reshape([1, 200])
        probs_user_favorite_similarity = tf.matmul(probs_movie_embeddings, tf.transpose(users_matrics))
        favorite_user_id = np.argsort(probs_user_favorite_similarity.eval())[0][-top_k:]

        print("The movie you watched: {}".format(movies_orig[movieid2idx[movie_id_val]]))

        # Note: np.argsort returns 0-based row indices, so the "- 1" below appears to
        # select the row before each match (and wraps to the last row for index 0).
        print("People who like this movie: {}".format(users_orig[favorite_user_id-1]))
        probs_users_embeddings = (users_matrics[favorite_user_id-1]).reshape([-1, 200])
        # Score all movies for each of these users
        probs_similarity = tf.matmul(probs_users_embeddings, tf.transpose(movie_matrics))
        sim = (probs_similarity.eval())

        # Each user's highest-scoring movie (row index into movie_matrics)
        p = np.argmax(sim, 1)
        print("People who like this movie also like:")

        # Randomly pick up to 5 distinct movies; capping at the number of distinct
        # candidates avoids looping forever when fewer than 5 exist
        results = set()
        num_results = min(5, len(set(p)))
        while len(results) != num_results:
            c = p[random.randrange(top_k)]
            results.add(c)
        for val in (results):
            print(val)
            print(movies_orig[val])

        return results

recommend_other_favorite_movie(1401, 20)

INFO:tensorflow:Restoring parameters from ./save
The movie you watched: [1401 'Ghosts of Mississippi (1996)' 'Drama']
People who like this movie: [[1568 'F' 1 10]
 [4814 'M' 18 14]
 [5217 'M' 25 17]
 [1745 'M' 45 0]
 [1763 'M' 35 7]
 [5861 'F' 50 1]
 [493 'M' 50 7]
 [3031 'M' 18 4]
 [2144 'M' 18 0]
 [1644 'M' 18 12]
 [3833 'M' 25 1]
 [5678 'M' 35 17]
 [1701 'F' 25 4]
 [3297 'M' 18 4]
 [4800 'M' 18 4]
 [1109 'M' 18 10]
 [2496 'M' 50 1]
 [100 'M' 35 17]
 [2154 'M' 25 12]
 [4085 'F' 25 6]]
People who like this movie also like:
1132
[1148 'Wrong Trousers, The (1993)' 'Animation|Comedy']
1133
[1149 'JLG/JLG - autoportrait de décembre (1994)' 'Documentary|Drama']
847
[858 'Godfather, The (1972)' 'Action|Crime|Drama']
763
[773 'Touki Bouki (Journey of the Hyena) (1973)' 'Drama']
1950
[2019
 'Seven Samurai (The Magnificent Seven) (Shichinin no samurai) (1954)'
 'Action|Drama']

{763, 847, 1132, 1133, 1950}
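
The four steps listed for this recommendation can also be sketched in plain NumPy, since the saved matrices are ordinary arrays and no TensorFlow session is needed for the matrix products. This is a minimal sketch under stated assumptions, not the notebook's implementation: also_liked is a hypothetical name, the toy matrices stand in for users_matrics and movie_matrics, the row indices returned by argsort are used directly, and the random pick is done with random.sample so it cannot loop when fewer than five distinct candidates exist.

```python
import random
import numpy as np

def also_liked(movie_row, user_mat, movie_mat, top_k=20, n_results=5, seed=None):
    """Return up to n_results movie row indices favoured by the top_k users
    whose feature vectors score highest against the given movie."""
    rng = random.Random(seed)
    # Step 1: score every user against the movie, keep the top_k user rows
    user_scores = user_mat @ movie_mat[movie_row]
    fav_users = np.argsort(user_scores)[-top_k:]  # 0-based rows, used as-is
    # Step 2: score all movies for each of those users
    sim = user_mat[fav_users] @ movie_mat.T
    # Step 3: each user's single highest-scoring movie, de-duplicated
    candidates = sorted(set(int(i) for i in np.argmax(sim, axis=1)))
    # Step 4: random choice, capped at the number of distinct candidates
    return set(rng.sample(candidates, min(n_results, len(candidates))))

# Hypothetical toy matrices standing in for users_matrics and movie_matrics
gen = np.random.default_rng(0)
toy_users = gen.normal(size=(50, 200))
toy_movies = gen.normal(size=(30, 200))
picks = also_liked(3, toy_users, toy_movies, top_k=20, n_results=5, seed=1)
print(picks)
```

As in the function above, the results are row indices into the movie matrix, so a lookup such as movies_orig[val] is still needed to show titles.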

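Conclusion

That covers the common recommendation functions implemented here. The network is trained as a regression problem, and the resulting user and movie feature matrices are then used to make recommendations.

Further reading

If you are interested in personalized recommendation, the following materials are worth a look:

That is all for today. Comments and corrections are welcome!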


Original article: https://www.cnblogs.com/chenxiangzhen/p/10676348.html
