python - Using Unity ML-Agents with the DQN algorithm

Asked 2021-11-06T17:53:09.953

Viewed 47 times

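I am having trouble training an agent by connecting an external Python API to a Unity environment I created.

I am looking at DQN code written for an earlier ml-agents version and would like to use the following code. How should I write it for the current version?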

    # send the action to the environment and receive resultant environment information
    env_info = env.step(action)[brain_name]

    next_state = env_info.vector_observations[0]   # get the next state
    reward = env_info.rewards[0]                   # get the reward
    done = env_info.local_done[0]                  # see if episode has finished

and

    # reset the unity environment at the beginning of each episode
    env_info = env.reset(train_mode=True)[brain_name]

    # get initial state of the unity environment
    state = env_info.vector_observations[0]

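like this.

I have been trying to convert it to the current version while reading the official documentation, but I have not been able to work it out.

How should I use it in the current ml-agents version? And what exactly does this mean? That is, why does this become the "state" of the environment?

The code below is what I wrote after reading the official documentation, but I do not know whether it is correct or what it means.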

    decision_steps, terminal_steps = env.get_steps(behavior_name)
    
    state = decision_steps.obs[index][0,:]
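
For reference, here is a rough sketch of how the old calls might map onto the current `mlagents_envs` low-level API, assuming a single agent and a single vector observation. In the current API `env.step()` returns nothing. Instead, `env.get_steps(behavior_name)` returns a `DecisionSteps` object (agents waiting for an action) and a `TerminalSteps` object (agents whose episode just ended). `obs` is a list with one array per observation, and the first dimension of each array indexes agents, so `decision_steps.obs[index][0, :]` is observation `index` of the first agent, which plays the role of the old `env_info.vector_observations[0]`. The helper name `read_transition` and the defaults `agent_id=0` and `obs_index=0` are hypothetical choices for illustration. Real agent ids are listed in `decision_steps.agent_id`.

```python
def read_transition(decision_steps, terminal_steps, agent_id=0, obs_index=0):
    """Return (next_state, reward, done) for one agent, like the old env_info fields.

    Hypothetical helper: it only relies on `agent_id in steps`,
    `steps[agent_id].obs` and `steps[agent_id].reward`, which both
    DecisionSteps and TerminalSteps provide.
    """
    if agent_id in terminal_steps:
        # The episode ended on this step: the final observation and reward
        # are reported in terminal_steps, so done is True.
        step = terminal_steps[agent_id]
        return step.obs[obs_index], step.reward, True
    # Otherwise the agent is waiting for its next action.
    step = decision_steps[agent_id]
    return step.obs[obs_index], step.reward, False


# Sketch of where this sits in a DQN loop. It needs a running Unity build, so it
# is left as comments; `env`, `behavior_name` and `action` are assumed to exist:
#   env.reset()
#   decision_steps, terminal_steps = env.get_steps(behavior_name)
#   state = decision_steps[agent_id].obs[0]
#   env.set_actions(behavior_name, action)   # action is an ActionTuple
#   env.step()
#   decision_steps, terminal_steps = env.get_steps(behavior_name)
#   next_state, reward, done = read_transition(decision_steps, terminal_steps, agent_id)
```

Checking `terminal_steps` first matters because an agent that finished its episode may not appear in `decision_steps` on that step, so reading only `decision_steps` would miss the last reward.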
