
Building an RL Environment with gym

Date: 2020-07-21 09:54:32


Using gym

A gym environment is used in the following order:

  1. env = gym.make('x')
  2. observation = env.reset()
  3. for i in range(time_steps):
         env.render()
         action = policy(observation)
         observation, reward, done, info = env.step(action)
         if done:
             ……
             break
  4. env.close()

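Example

The example implements a simple policy: when the pole leans left, the cart is pushed left; when it leans right, the cart is pushed right. While the pole is nearly upright (angle within 0.1 rad), the push direction simply alternates.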

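import gym
import numpy as np

# Written for the classic gym API (before gym 0.26): reset() returns the
# observation and step() returns (observation, reward, done, info).
env = gym.make('CartPole-v0')
t_all = []       # time step at which each episode ended
action_bef = 0   # previous action, used to alternate near the upright position
for i_episode in range(5):
    observation = env.reset()
    for t in range(100):
        env.render()
        # cart position, cart velocity, pole angle, pole angular velocity
        cp, cv, pa, pv = observation
        if abs(pa) <= 0.1:
            # pole is nearly upright: alternate between left and right
            action = 1 - action_bef
        elif pa >= 0:
            action = 1   # pole leans right: push the cart right
        else:
            action = 0   # pole leans left: push the cart left
        observation, reward, done, info = env.step(action)
        action_bef = action
        if done:
            # print("Episode finished after {} timesteps".format(t+1))
            t_all.append(t)
            break
        if t == 99:
            # the episode lasted all 100 steps without finishing; recorded as 0
            t_all.append(0)
env.close()
print(t_all)
print(np.mean(t_all))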
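
The decision rule in the example above can also be pulled out into a small function, which makes it easy to check without opening a render window. This is only a sketch: the name choose_action and the dead_band parameter are illustrative and do not appear in the original code.

```python
def choose_action(pole_angle, previous_action, dead_band=0.1):
    """Return 0 (push left) or 1 (push right) for CartPole.

    choose_action and dead_band are hypothetical names used for illustration.
    """
    if abs(pole_angle) <= dead_band:
        # Pole is nearly upright: alternate the push direction.
        return 1 - previous_action
    # Positive angle means the pole leans right, so push right; otherwise left.
    return 1 if pole_angle > 0 else 0


if __name__ == "__main__":
    print(choose_action(0.2, previous_action=0))    # leaning right
    print(choose_action(-0.2, previous_action=1))   # leaning left
    print(choose_action(0.05, previous_action=0))   # inside the dead band
```

Inside the episode loop this would be called as action = choose_action(pa, action_bef).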


Building a gym environment

Functions that make up a gym environment

A complete gym environment consists of a class definition, an initializer, and the methods listed below:

  • class CartPoleEnv(gym.Env):
    • def __init__(self):
    • def reset(self):
    • def seed(self, seed=None): return [seed]
    • def step(self, action): return self.state, reward, done, {}
    • def render(self, mode='human'): return self.viewer.render()
    • def close(self):
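
To show how these methods fit together, here is a minimal sketch of a custom environment. Everything specific to it is an assumption made for illustration: the class name LineWalkEnv, the one-dimensional task (walk a point to x = 0), the reward values, and the rng attribute. It follows the classic return conventions listed above and prints the state in render() instead of using a viewer.

```python
import gym
import numpy as np
from gym import spaces


class LineWalkEnv(gym.Env):
    """Hypothetical toy task: move a point along a line until it reaches x = 0."""

    def __init__(self):
        self.low = np.array([-10.0], dtype=np.float32)
        self.high = np.array([10.0], dtype=np.float32)
        self.action_space = spaces.Discrete(3)  # 0: left, 1: stay, 2: right
        self.observation_space = spaces.Box(self.low, self.high, dtype=np.float32)
        self.rng = np.random.RandomState()  # illustrative name for the RNG
        self.state = None

    def seed(self, seed=None):
        # Re-create the random generator so that resets are reproducible.
        self.rng = np.random.RandomState(seed)
        return [seed]

    def reset(self):
        # Start at a random position and return the first observation.
        self.state = self.rng.uniform(-5.0, 5.0, size=1).astype(np.float32)
        return self.state

    def step(self, action):
        assert self.action_space.contains(action), "invalid action"
        position = self.state[0] + (action - 1)  # move by -1, 0 or +1
        # Keep the state inside the observation bounds.
        self.state = np.clip(np.array([position], dtype=np.float32),
                             self.low, self.high)
        done = bool(abs(self.state[0]) <= 0.5)
        reward = 0.0 if done else -1.0
        return self.state, reward, done, {}

    def render(self, mode='human'):
        # Text output only; a real environment might draw to a viewer here.
        print("position: {:.2f}".format(float(self.state[0])))

    def close(self):
        pass
```

Instantiating the class directly (env = LineWalkEnv()) gives an object that can be driven with the same reset / step / close sequence used for built-in environments.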

Utility functions

  • Clamping a parameter to its limits: vel = np.clip(vel, vel_min, vel_max)
  • Validating the action input:
    self.action_space.contains(action)

  • Defining the action and observation spaces
    Discrete(3) allows the actions 0, 1, 2
    self.low = np.array([min_0, min_1], dtype=np.float32)
    self.high = np.array([max_0, max_1], dtype=np.float32)

    self.action_space = spaces.Discrete(3)
    self.observation_space = spaces.Box(
        self.low, self.high, dtype=np.float32)
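
The helpers above can be tried out on their own, outside of an environment class. The following is a small sketch; the numeric bounds and the variable names are made up for illustration and are not taken from the original post.

```python
import numpy as np
from gym import spaces

# Hypothetical limits for a two-component observation (e.g. position, velocity).
low = np.array([-1.2, -0.07], dtype=np.float32)
high = np.array([0.6, 0.07], dtype=np.float32)

action_space = spaces.Discrete(3)  # valid actions: 0, 1, 2
observation_space = spaces.Box(low, high, dtype=np.float32)

# Action validation: only 0, 1 and 2 belong to Discrete(3).
valid = action_space.contains(2)
invalid = action_space.contains(3)

# Clamp a raw state into the bounds before treating it as an observation.
raw_state = np.array([0.9, -0.2], dtype=np.float32)
state = np.clip(raw_state, low, high)

print(valid, invalid)
print(observation_space.contains(raw_state), observation_space.contains(state))
```

Clamping before returning the state from step() keeps every observation inside the declared Box, so contains() checks on the observation space pass.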


Original article: https://www.cnblogs.com/tolshao/p/gym-da-jian-rl-huan-jing.html
