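Deep Traffic is a (just-for-fun) assignment from an earlier MIT course. If you're interested, go and play this little driving game.
What struck me as strangest, though, is that a linear model works remarkably well here.
Here are some of my test results: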
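Single hidden layer, 32-node fc: 70.36
Three hidden layers, 32-16-8-node fc: 71.1
Three hidden layers, 32-16-8-node fc, plus some random tinkering: 71.2
(I couldn't find anything like conv.)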
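But here is the key point:
Linear sigmoid: 71.88
Perhaps this is because what is being modeled is simple enough that a linear model already fits it well.
A linear model also trains really fast, so you can run many more iterations.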
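One plausible contributor to the speed difference is simply the number of weights. The sketch below counts weights and biases for each architecture listed above, assuming every variant used the same input settings as the code posted further down (lanesSide = 1, patchesAhead = 6, patchesBehind = 7, temporal_window = 3). The helpers `inputSize` and `countParams` are hypothetical names for illustration only; they are not part of Deep Traffic or ConvNetJS.

```javascript
// Hypothetical helpers for illustration; not part of Deep Traffic or ConvNetJS.

// Size of the network input, using the same formula as network_size in the posted code.
function inputSize(lanesSide, patchesAhead, patchesBehind, temporalWindow, numActions) {
  var numInputs = (lanesSide * 2 + 1) * (patchesAhead + patchesBehind);
  return numInputs * temporalWindow + numActions * temporalWindow + numInputs;
}

// Weights plus biases of a fully connected stack: input -> hidden layers -> outputs.
function countParams(inSize, hiddenSizes, numActions) {
  var sizes = [inSize].concat(hiddenSizes, [numActions]);
  var total = 0;
  for (var i = 0; i + 1 < sizes.length; i++) {
    total += sizes[i] * sizes[i + 1] + sizes[i + 1];
  }
  return total;
}

var size = inputSize(1, 6, 7, 3, 5);

console.log(size);                               // 171
console.log(countParams(size, [], 5));           // 860  (linear)
console.log(countParams(size, [32], 5));         // 5669 (one hidden layer of 32)
console.log(countParams(size, [32, 16, 8], 5));  // 6213 (32-16-8)
```

Under these assumptions the linear model has roughly one seventh of the parameters of the hidden-layer variants, which is consistent with it training faster, though this count alone does not explain the higher score.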
Finally, here is my best code so far (there are still many parameters I don't understand or haven't tuned):
//<![CDATA[

// a few things don't have var in front of them - they update already existing variables the game needs
lanesSide = 1;
patchesAhead = 6;
patchesBehind = 7;
trainIterations = 50000;

var num_inputs = (lanesSide * 2 + 1) * (patchesAhead + patchesBehind);
var num_actions = 5;
var temporal_window = 3;
var network_size = num_inputs * temporal_window + num_actions * temporal_window + num_inputs;

// no hidden layers: the regression output sits directly on the input
var layer_defs = [];
layer_defs.push({
    type: 'input',
    out_sx: 1,
    out_sy: 1,
    out_depth: network_size
});
layer_defs.push({
    type: 'regression',
    num_neurons: num_actions
});

// options for the trainer that updates the network
var tdtrainer_options = {
    learning_rate: 0.001,
    momentum: 0.2,
    batch_size: 64,
    l2_decay: 0.01
};

// options for the Q-learning brain
var opt = {};
opt.temporal_window = temporal_window;
opt.experience_size = 3000;
opt.start_learn_threshold = 500;
opt.gamma = 0.7;
opt.learning_steps_total = 10000;
opt.learning_steps_burnin = 1000;
opt.epsilon_min = 0.0;
opt.epsilon_test_time = 0.0;
opt.layer_defs = layer_defs;
opt.tdtrainer_options = tdtrainer_options;

brain = new deepqlearn.Brain(num_inputs, num_actions, opt);

// called by the game on every step: learn from the last reward, then pick the next action
learn = function (state, lastReward) {
    brain.backward(lastReward);
    var action = brain.forward(state);

    draw_net();
    draw_stats();

    return action;
}

//]]>
Try pasting this into the Deep Traffic code box.
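To reproduce the hidden-layer variants from the results list, the layer definitions could be built along the following lines. This is only a sketch: `buildLayerDefs` is a hypothetical helper, and the `'relu'` activation is an assumption, since the post does not say which activation the fc layers used. The `fc` layer type with `num_neurons` and `activation` fields follows the ConvNetJS layer definition format.

```javascript
// Hypothetical helper for illustration; the activation choice is an assumption.
function buildLayerDefs(networkSize, hiddenSizes, numActions, activation) {
  var defs = [];
  defs.push({ type: 'input', out_sx: 1, out_sy: 1, out_depth: networkSize });
  // one fully connected layer per entry in hiddenSizes
  hiddenSizes.forEach(function (n) {
    defs.push({ type: 'fc', num_neurons: n, activation: activation });
  });
  defs.push({ type: 'regression', num_neurons: numActions });
  return defs;
}

// 171 is the network_size produced by the settings in the posted code.
var linearDefs = buildLayerDefs(171, [], 5, 'relu');                 // same shape as the posted code
var oneHiddenDefs = buildLayerDefs(171, [32], 5, 'relu');            // single hidden layer of 32
var threeHiddenDefs = buildLayerDefs(171, [32, 16, 8], 5, 'relu');   // 32-16-8

console.log(threeHiddenDefs.length);
```

To try one of these in the game, assign it to `opt.layer_defs` in place of `layer_defs` before the `deepqlearn.Brain` is created.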
I'm an ML novice, so if you have any comments or suggestions, please share them :)
Original post: http://www.cnblogs.com/codeiscode/p/7763857.html