
2015-2-19 log


I checked the combine2vec code and found a bug: in negative sampling mode, I did not update the input word vector when the sample was positive. I fixed the bug, but the two objective functions still do not converge together.
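For reference, a minimal sketch of what the fixed update should look like, written in Python with word2vec-style names (syn0 for input vectors, syn1neg for output vectors). The actual combine2vec code differs, so the variable names and learning rate here are only assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_update(syn0, syn1neg, input_word, target_word,
                        neg_samples, lr=0.025):
    """One negative-sampling step, word2vec-style names (assumed, not the
    real combine2vec code). The fix: the input vector syn0[input_word]
    must accumulate gradient from the positive pair too, not only from
    the negative samples."""
    grad_in = np.zeros_like(syn0[input_word])
    # Positive sample has label 1, negative samples have label 0.
    for target, label in [(target_word, 1.0)] + [(w, 0.0) for w in neg_samples]:
        score = sigmoid(np.dot(syn0[input_word], syn1neg[target]))
        g = lr * (label - score)
        grad_in += g * syn1neg[target]           # accumulate for input vector
        syn1neg[target] += g * syn0[input_word]  # update output vector in place
    syn0[input_word] += grad_in  # previously skipped for the positive sample
```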

I adapted the word2vec training strategy in the sentence branch. There may still be bugs, because too many biword count values are zero. I need to check the cooccurrence file, the bigram code, and the sentence-branch combine2vec code.
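A quick sanity check on the cooccurrence file could look like the sketch below, which reports how many biword entries are zero. The file format (one "word1 word2 count" triple per line) is an assumption here; the parsing would need to match the real format:

```python
def count_zero_biwords(path):
    """Count biword (bigram) entries with a zero count in a cooccurrence
    file. Assumes one 'word1 word2 count' triple per line; adjust the
    parsing if the real file format differs."""
    total = zeros = 0
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue
            total += 1
            if int(parts[2]) == 0:
                zeros += 1
    return zeros, total

# Example: zeros, total = count_zero_biwords("cooccurrence.txt")
```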

I collected gradient statistics yesterday. Interesting phenomena:

1. If I exchange "word" and "last_word" in the skip-gram model in the word2vec code, the training loss becomes much smaller and training is almost twice as fast. This is weird (see the sketch after this list).

2. The gradient variation trend is different between the above two approaches.

3. Polysemous words may not converge well, so their gradients are large. In practice, rarely occurring words also produce large gradients.
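To make point 1 concrete, here is a small Python sketch of how the (input, target) pairs come out of the skip-gram loop and what swapping "word" (the center word) and "last_word" (the context word) changes. This is only my reading of word2vec.c, reduced to pair generation:

```python
def skipgram_pairs(sentence, window, swap=False):
    """Generate (input, target) index pairs as in word2vec's skip-gram loop.

    swap=False mimics the original code, where 'last_word' (a context word)
    is the input and 'word' (the center word) is the prediction target.
    swap=True exchanges them, so the center word becomes the input.
    """
    for pos, word in enumerate(sentence):
        lo = max(0, pos - window)
        hi = min(len(sentence), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx == pos:
                continue
            last_word = sentence[ctx]
            if swap:
                yield word, last_word
            else:
                yield last_word, word
```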

Read the Sequence to Sequence Learning paper.

Configured PyCharm.

Checked the GroundHog code. It contains encoder-decoder machine translation code, but it is more than 3000 lines, so I will implement the sequence to sequence learning code myself first.
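As a starting outline, a rough forward-pass skeleton of the encoder-decoder structure in plain numpy, with a vanilla RNN in place of the paper's LSTM and greedy decoding instead of beam search; all names and shapes are placeholders, not the planned implementation:

```python
import numpy as np

def rnn_step(W, U, b, x, h):
    """One vanilla RNN step (the paper uses LSTMs; simplified here)."""
    return np.tanh(W @ x + U @ h + b)

def encode_decode(src, tgt_len, params, tgt_embed):
    """Encoder-decoder forward sketch: compress the source sequence into one
    state vector, then unroll the decoder from that state, feeding back the
    embedding of the greedily chosen token."""
    We, Ue, be, Wd, Ud, bd, Wo = params
    h = np.zeros(Ue.shape[0])
    for x in src:                         # encoder: consume source embeddings
        h = rnn_step(We, Ue, be, x, h)
    y = np.zeros(Wd.shape[1])             # decoder starts from a "begin" input
    outputs = []
    for _ in range(tgt_len):              # decoder: unroll from the final state
        h = rnn_step(Wd, Ud, bd, y, h)
        logits = Wo @ h
        outputs.append(logits)
        y = tgt_embed[np.argmax(logits)]  # greedy feedback (no beam search)
    return outputs
```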

 

Today I still wasted a lot of time.

5:00-6:00 Browsed websites.

7:30-8:30 Went out and ate a little pizza. Honestly, I did not need the supper.

8:30-10:00 Wasted a lot of time wandering. Checked the GroundHog code and configured PyCharm, but never faced the real problem. Just implement the sequence to sequence learning first.

Original post: http://www.cnblogs.com/peng-ge/p/4282675.html
