I am attempting to create a custom environment for reinforcement learning with OpenAI Gym. I need to represent all possible values that the environment will see in a variable called observation_space. There are 3 possible actions for the agent to take, held in action_space.
To be more specific, the observation_space is a temperature sensor which will see values in the range of 50 to 150 degrees, and I think I can represent all of this by:
EDIT: I had the action_space numpy array wrong.
import numpy as np
action_space = np.array([0, 1, 2])
observation_space = np.arange(50, 151)  # 50, 51, ..., 150; arange excludes the stop value
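As an aside, since this is ultimately for a Gym environment, I believe Gym's built-in space classes are the usual way to declare these. A sketch of what I assume the equivalent would be (note the Box is continuous, so it wouldn't index a tabular Q table directly):

import numpy as np
from gym import spaces

action_space = spaces.Discrete(3)  # actions 0, 1, 2
observation_space = spaces.Box(low=50.0, high=150.0, shape=(1,), dtype=np.float32)  # one temperature reading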
Is there a better method that I could use for the observation_space where I could bin the data? I.e., make 20 bins: 50-55, 55-60, 60-65, etc.
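Something along these lines is what I'm imagining (just a sketch; I'm assuming NumPy's digitize and 5-degree bins here):

import numpy as np

# 20 bins of 5 degrees: [50, 55), [55, 60), ..., [145, 150]
bin_edges = np.linspace(50, 150, 21)              # 21 edges define 20 bins
reading = 72.3
state = int(np.digitize(reading, bin_edges)) - 1  # bin index 0-19
state = min(state, 19)                            # keep a reading of exactly 150 in the last bin
print(state)                                      # -> 4, since 72.3 falls in [70, 75)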
I think what I have will work, but it seems sort of cumbersome, and I am sure there is a better practice, as there is not a lot of wisdom on my end on this subject. This will print out a Q table:
action_size = action_space.shape[0]        # 3 actions
state_size = observation_space.shape[0]    # one state per degree
qtable = np.zeros((state_size, action_size))
print(qtable)
This is not really related to programming, so you may get better answers on stats.stackexchange. Anyway, it just depends on how much accuracy you want. I guess you want to change the temperature (increase, decrease, don't change) according to the sensor readings. Is there much difference (in terms of the optimal action) between 50 and 51 degrees? If not, then you can discretize the state space every 2 degrees, and so on.
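For example, a rough sketch of a 2-degree discretization with NumPy (the bin width is just an assumption; pick whatever granularity your problem needs):

import numpy as np

bin_width = 2                        # degrees per state; coarser bins -> smaller Q table
n_states = (150 - 50) // bin_width   # 50 states instead of one per degree
n_actions = 3                        # increase, decrease, don't change

qtable = np.zeros((n_states, n_actions))

def state_index(temp):
    # map a sensor reading in [50, 150] to a row of the Q table
    return min(int((temp - 50) // bin_width), n_states - 1)

print(qtable.shape)        # (50, 3)
print(state_index(101.7))  # 25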
More generally, by doing so you are using what in RL are called "features". A discretization over intervals of the state space is called tile coding and usually works well.
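To make the idea concrete, here is a minimal tile-coding sketch (the function and its parameters are mine, not from any library): several overlapping tilings, each shifted by a fraction of the tile width, so nearby temperatures share some tiles but not all:

def tile_code(temp, n_tilings=4, n_tiles=10, low=50.0, high=150.0):
    # return one active tile index per tiling for a temperature reading
    tile_width = (high - low) / n_tiles
    active = []
    for t in range(n_tilings):
        offset = t * tile_width / n_tilings     # shift each tiling slightly
        idx = int((temp - low + offset) // tile_width)
        idx = min(max(idx, 0), n_tiles)         # the offset can push past the top edge
        active.append(t * (n_tiles + 1) + idx)  # flatten (tiling, tile) into one index
    return active

print(tile_code(72.3))  # -> [2, 13, 24, 35], one active tile per tiling

A learner then keeps one weight per tile and sums the weights of the active tiles, which generalizes more smoothly than a single hard binning.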
If you are new to RL, I really advise reading this book, or at least Chapters 1, 3, and 4, which are related to what you are doing.