I am attempting to create a custom environment for reinforcement learning with OpenAI Gym. I need to represent all possible values that the environment will see in a variable called observation_space. There are 3 possible actions for the agent to use, stored in action_space.
To be more specific, the observation_space is a temperature sensor that will see values ranging from 50 to 150 degrees, and I think I can represent all of this by:
EDIT: I had the action_space NumPy array wrong.
import numpy as np
action_space = np.array([0, 1, 2])
# Note: np.arange excludes the stop value, so this covers 50..149 (100 states)
observation_space = np.arange(50, 150, 1)
Is there a better method that I could use for the observation_space where I could bin the data, i.e., make 20 bins: 50-55, 55-60, 60-65, etc.?
I think what I have will work, but it seems cumbersome, and I am sure there is a better practice, since I do not have much experience with this subject. This will print out a Q table:
action_size = action_space.shape[0]
state_size = observation_space.shape[0]
qtable = np.zeros((state_size, action_size))
print(qtable)
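To make the binning idea concrete, here is a minimal sketch of what I have in mind. It assumes 20 equal-width bins of 5 degrees and uses np.linspace, np.digitize and np.clip. The name temperature_to_state is something I made up for illustration, and I am not sure this is the recommended practice.

```python
import numpy as np

# Assumption: 20 equal-width bins of 5 degrees covering 50 to 150.
n_bins = 20
bin_edges = np.linspace(50, 150, n_bins + 1)  # 50, 55, ..., 150

def temperature_to_state(temperature):
    """Map a temperature reading to a bin index in 0..n_bins-1 (hypothetical helper)."""
    # np.digitize returns i such that bin_edges[i-1] <= temperature < bin_edges[i]
    index = np.digitize(temperature, bin_edges) - 1
    # Clip so out-of-range readings, and exactly 150, fall in the first or last bin
    return int(np.clip(index, 0, n_bins - 1))

action_space = np.array([0, 1, 2])

# The Q table now has one row per bin instead of one row per degree
qtable = np.zeros((n_bins, action_space.shape[0]))

print(temperature_to_state(57.3))
print(qtable.shape)
```

With this layout the Q table shrinks from 100 rows to 20, and a raw sensor reading is converted to a row index before each lookup or update.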