In deep reinforcement learning with continuous action spaces, why does it seem to be common practice to clamp (clip) the action to the valid range right before it is executed?
Examples:
OpenAI Gym Mountain Car https://github.com/openai/gym/blob/master/gym/envs/classic_control/continuous_mountain_car.py#L57
Unity 3DBall https://github.com/Unity-Technologies/ml-agents/blob/master/unity-environment/Assets/ML-Agents/Examples/3DBall/Scripts/Ball3DAgent.cs#L29
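Both follow roughly the same pattern. Here is a minimal Python sketch of what I mean (the function and variable names are my own, not taken from either codebase):

```python
import numpy as np

def clamp_action(raw_action: np.ndarray, low: float = -1.0, high: float = 1.0) -> np.ndarray:
    """Clip the policy's raw output to the environment's valid action range."""
    return np.clip(raw_action, low, high)

# A raw network output of +10 saturates to the upper bound:
print(clamp_action(np.array([10.0])))  # -> [1.]
```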
Isn't information lost by doing so? For example, if the model outputs +10 for the velocity, which is then clamped to +1, the executed action effectively behaves like a discrete one (as far as its execution is concerned). For fine-grained movement, wouldn't it make more sense to scale the output instead, e.g. multiply it by something like 0.1?
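To make the contrast concrete, a small sketch of the two alternatives (illustrative values only): clamping collapses all outputs beyond the bound to the same action, while scaling preserves their relative magnitudes.

```python
import numpy as np

raw_outputs = np.array([0.5, 2.0, 10.0])   # raw policy outputs

# Clamping: 2.0 and 10.0 both saturate to the same action.
clipped = np.clip(raw_outputs, -1.0, 1.0)  # -> [0.5, 1.0, 1.0]

# Scaling: relative magnitudes of the outputs are preserved.
scaled = 0.1 * raw_outputs                 # -> [0.05, 0.2, 1.0]

print(clipped, scaled)
```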