sumo - What does "rl_actions" mean in flow/tutorials/tutorial09_environments.ipynb?

Question

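While working through tutorial 9, I got confused about rl_actions, because it is never initialized or defined anywhere in the program. Why do both the _apply_rl_actions function and the compute_reward function take an 'rl_actions' parameter? I also checked the vehicle kernel code for the apply_acceleration function. The original one is: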

def apply_acceleration(self, veh_ids, acc):
    """See parent class."""
    # to handle the case of a single vehicle
    if isinstance(veh_ids, str):
        veh_ids = [veh_ids]
        acc = [acc]

    for i, vid in enumerate(veh_ids):
        # skip vehicles with no requested acceleration or no longer in the network
        if acc[i] is not None and vid in self.get_ids():
            this_vel = self.get_speed(vid)
            # integrate the acceleration over one simulation step; speed cannot go negative
            next_vel = max([this_vel + acc[i] * self.sim_step, 0])
            self.kernel_api.vehicle.slowDown(vid, next_vel, 1e-3)

score 1 · Accepted Answer

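Look in flow/envs/base_env.py, in the step method: this is where apply_rl_actions and compute_reward are called. All three of these methods take as a parameter the actions rl_actions to apply to the agents. These actions are provided by the RL algorithm. The shape of rl_actions is the one specified in the action_space method of your environment.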
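To make that call chain concrete, here is a minimal sketch of the pattern. ToyEnv is a hypothetical stand-in written for illustration, not Flow's actual base_env.py; the reward (mean speed) and the two-vehicle setup are assumptions. The point is only that rl_actions is not defined inside the environment: it arrives as an argument to step, which forwards it to the other two methods.

```python
class ToyEnv:
    """Hypothetical stand-in for a Flow environment (not Flow's real code)."""

    def __init__(self, num_vehicles=2, sim_step=0.1):
        self.sim_step = sim_step
        self.speeds = [0.0] * num_vehicles

    def _apply_rl_actions(self, rl_actions):
        # Assumption: each action is an acceleration for one vehicle.
        for i, acc in enumerate(rl_actions):
            self.speeds[i] = max(self.speeds[i] + acc * self.sim_step, 0.0)

    def compute_reward(self, rl_actions):
        # Assumption: reward is the mean speed. rl_actions is passed in so a
        # reward could also penalise large actions if desired.
        return sum(self.speeds) / len(self.speeds)

    def step(self, rl_actions):
        # The caller (the RL algorithm) supplies rl_actions; step forwards it.
        self._apply_rl_actions(rl_actions)
        reward = self.compute_reward(rl_actions)
        return list(self.speeds), reward


env = ToyEnv()
state, reward = env.step([1.0, -1.0])  # one action per vehicle
```

Here the list passed to step has one entry per vehicle, which mirrors how the shape of rl_actions follows from the action space the environment declares.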

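The RL algorithm automatically calls your step method at each step, giving it the actions to apply. Flow's environments are actually wrapped in a Gym environment, which is what gets handed to the RL algorithm. The RL algorithm can work with any Gym environment, which makes it very general, since all Gym environments have methods such as step, reset, etc. If you want to know more about how this works, look up how to train a custom Gym environment.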
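The loop an RL library runs looks roughly like the sketch below. Everything in it is hypothetical and for illustration only: CountingEnv, random_policy and run_episode are made-up names, the random policy stands in for a learned one, and the four-value return from step follows the older Gym convention (observation, reward, done, info). It uses only the standard library so it runs without Gym installed.

```python
import random


class CountingEnv:
    """Hypothetical Gym-style environment; only reset() and step() matter."""

    # Assumed action space: 3 continuous values in [-1, 1].
    action_low, action_high, action_dim = -1.0, 1.0, 3

    def reset(self):
        self.t = 0
        return [0.0] * self.action_dim

    def step(self, rl_actions):
        # The environment never creates rl_actions; it only receives them.
        assert len(rl_actions) == self.action_dim
        self.t += 1
        observation = list(rl_actions)
        reward = -sum(a * a for a in rl_actions)
        done = self.t >= 5  # end the episode after 5 steps
        return observation, reward, done, {}


def random_policy(env, rng):
    # Stand-in for the RL algorithm: samples an action of the declared shape.
    return [rng.uniform(env.action_low, env.action_high)
            for _ in range(env.action_dim)]


def run_episode(env, policy, seed=0):
    rng = random.Random(seed)
    env.reset()
    done, steps, total = False, 0, 0.0
    while not done:
        rl_actions = policy(env, rng)              # produced by the algorithm
        _, reward, done, _ = env.step(rl_actions)  # consumed by the env
        steps += 1
        total += reward
    return steps, total


steps, total = run_episode(CountingEnv(), random_policy)
```

Because the loop only relies on reset and step, the same driver would work with any environment exposing that interface, which is the generality the answer describes.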
