Below is a short program for testing grid_mdp.py. When run, it steps through the environment using randomly chosen actions.
import gym
import random
from gym import wrappers

env = gym.make('GridWorld-v0')

# Record results under ./outputs/; force=True clears earlier recordings there
env = wrappers.Monitor(env, './outputs/grid_mdp-experiment-', force=True)

for episode in range(100):
    env.reset()
    for i in range(100):
        env.render()
        # take a random action
        next_state, reward, done, _ = env.step(random.choice(env.action_space))

        if done:
            break

    print('episode: ', episode)

# env is the Monitor wrapper, so this also closes the recording
env.close()
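To try the same random-action loop without installing gym or registering GridWorld-v0, a minimal sketch such as the following may help. `TinyGridEnv` and `run_random_episode` are hypothetical names invented for this illustration; the class is a one-dimensional corridor and is not the environment defined in grid_mdp.py. It only mimics the `reset()` / `step()` interface that the loop above relies on.

```python
import random


class TinyGridEnv:
    """Hypothetical stand-in for GridWorld-v0: a 1-D corridor with cells 0..4.

    The last cell is the goal. This is NOT the grid_mdp.py environment,
    only an illustration of the reset/step interface.
    """

    actions = ["left", "right"]  # assumed action set, for illustration only

    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == "right" else -1
        # Clamp the position so the agent stays inside the corridor.
        self.state = max(0, min(self.size - 1, self.state + move))
        done = self.state == self.size - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def run_random_episode(env, max_steps=100, rng=random):
    """Run one episode with uniformly random actions.

    Returns (steps_taken, total_reward).
    """
    env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        _, reward, done, _ = env.step(rng.choice(env.actions))
        total_reward += reward
        if done:
            return step, total_reward
    return max_steps, total_reward


if __name__ == "__main__":
    for episode in range(3):
        steps, total = run_random_episode(TinyGridEnv())
        print("episode:", episode, "steps:", steps, "reward:", total)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing a random policy against a learned one.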
Original article: https://www.cnblogs.com/lishyhan/p/9052161.html