jupyter-notebook - Is it possible to run a custom OpenAI Gym environment entirely from a Jupyter Notebook?

Question

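TL;DR: I was given some Python code for a custom OpenAI Gym environment. I can run the code successfully from the command line via ExperimentGrid, but I would like to run the whole experiment from within a Jupyter notebook rather than calling a script. That would be more convenient for some experiments I plan to run later on.

My question: is it possible to run an experiment in a custom OpenAI Gym environment entirely from a Jupyter Notebook, and if so, how? I have seen many examples of people running Gym's standard environments (such as SpaceInvaders-v0 or CartPole-v0) from Jupyter, but even then they use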

env=gym.make('SpaceInvaders-v0')

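which essentially runs that environment's script behind the scenes.

Below is a basic description of how my code is set up to run from the command line, and of the error I get in Jupyter.

Any input would be appreciated. Admittedly, I am fairly new to Gym, Python, and Linux.

My basic environment code is in envs/mygames/Custom_Env.py and is structured as follows: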

various import statements (numpy, gym, pyglet, copy)
class Entity()
class State()
class The_Custom_Env(core.Env) # This is the main environment class
class Shell_Class # This class calls The_Custom_Env and provides some arguments

In mygames/__init__.py, I import Shell_Class:

from gym.envs.mygames.Custom_Env import Shell_Class

In envs/__init__.py, I have registered the environment:

register(
    id='TEST-v0',
    entry_point='gym.envs.mygames:Shell_Class',
    max_episode_steps=200,
    reward_threshold=25.0,
)

Finally, the experiment works fine if I run a script containing this code from the command line:

from spinup.utils.run_utils import ExperimentGrid
from spinup import ppo_pytorch
import torch

if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--cpu', type=int, default=4)
    parser.add_argument('--num_runs', type=int, default=1)
    args = parser.parse_args()

    eg = ExperimentGrid(name='super-cool-test')
    eg.add('env_name', 'TEST-v0', '', True)
    eg.add('seed', [10*i for i in range(args.num_runs)])
    eg.add('epochs', [10])
    eg.add('steps_per_epoch', 4000)
    eg.add('ac_kwargs:hidden_sizes', [(32, 32)], 'hid')
    eg.add('ac_kwargs:activation', [torch.nn.ReLU], '')
    eg.add('pi_lr', [0.001])
    eg.add('clip_ratio', 0.3)
    eg.run(ppo_pytorch, num_cpu=args.cpu)

My Jupyter attempt

I put all the code from Custom_Env.py in cell #1. Then I registered the environment in cell #2:

gym.register(
    id='TEST-v1',
    entry_point='__main__:Shell_Class',
    max_episode_steps=200,
    reward_threshold=25.0,
)

Based on this Q/A (registering a gym environment defined inside a Jupyter notebook cell), I create the environment in cell #3:

gym.make('TEST-v1')

and get this nondescript output:

<TimeLimit<Shell_Class< TEST-v1 >>>

In cell #4, I try to run the ExperimentGrid code directly in Jupyter, as follows:

from spinup.utils.run_utils import ExperimentGrid
from spinup import ppo_pytorch
import torch

num_runs=1
cpu=4
env_name='TEST-v1'
eg = ExperimentGrid(name='Jupyter-test')
eg.add('env_name', env_name, '', True)
eg.add('seed', [10*i for i in range(num_runs)])
eg.add('epochs', 500)
eg.add('steps_per_epoch', 4000)
eg.add('ac_kwargs:hidden_sizes', [(32, 32)], 'hid')
eg.add('ac_kwargs:activation', [torch.nn.ReLU], '')
eg.add('pi_lr', 0.001)
eg.add('clip_ratio', 0.3)
eg.run(ppo_pytorch, num_cpu=cpu)

The experiment starts as usual, but then runs into some kind of error:

> ================================================================================
ExperimentGrid [Jupyter-test] runs over parameters:

 env_name                                 [] 

    TEST-v1

 seed                                     [see] 

    0

 epochs                                   [epo] 

    500

 steps_per_epoch                          [ste] 

    4000

 ac_kwargs:hidden_sizes                   [hid] 

    (32, 32)

 ac_kwargs:activation                     [] 

    ReLU

 pi_lr                                    [pi] 

    0.001

 clip_ratio                               [cli] 

    0.3

 Variants, counting seeds:               1
 Variants, not counting seeds:           1

================================================================================

Preparing to run the following experiments...

Jupyter-test_test-v1

================================================================================

Launch delayed to give you a few seconds to review your experiments.

To customize or disable this behavior, change WAIT_BEFORE_LAUNCH in
spinup/user_config.py.

================================================================================
                                                                                
Running experiment:

Jupyter-test_test-v1

with kwargs:

{
    "ac_kwargs":    {
        "activation":   "ReLU",
        "hidden_sizes": [
            32,
            32
        ]
    },
    "clip_ratio":   0.3,
    "env_name": "TEST-v1",
    "epochs":   500,
    "pi_lr":    0.001,
    "seed": 0,
    "steps_per_epoch":  4000
}





================================================================================


There appears to have been an error in your experiment.

Check the traceback above to see what actually went wrong. The 
traceback below, included for completeness (but probably not useful
for diagnosing the error), shows the stack leading up to the 
experiment launch.

================================================================================



---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
<ipython-input-14-de843fd528cf> in <module>
     15 eg.add('pi_lr', 0.001)
     16 eg.add('clip_ratio', 0.3)
---> 17 eg.run(ppo_pytorch, num_cpu=cpu)

~/Downloads/spinningup/spinup/utils/run_utils.py in run(self, thunk, num_cpu, data_dir, datestamp)
    544 
    545             call_experiment(exp_name, thunk_, num_cpu=num_cpu, 
--> 546                             data_dir=data_dir, datestamp=datestamp, **var)
    547 
    548 

~/Downloads/spinningup/spinup/utils/run_utils.py in call_experiment(exp_name, thunk, seed, num_cpu, data_dir, datestamp, **kwargs)
    169     cmd = [sys.executable if sys.executable else 'python', entrypoint, encoded_thunk]
    170     try:
--> 171         subprocess.check_call(cmd, env=os.environ)
    172     except CalledProcessError:
    173         err_msg = '\n'*3 + '='*DIV_LINE_WIDTH + '\n' + dedent("""

~/anaconda3/envs/spinningup/lib/python3.6/subprocess.py in check_call(*popenargs, **kwargs)
    309         if cmd is None:
    310             cmd = popenargs[0]
--> 311         raise CalledProcessError(retcode, cmd)
    312     return 0
    313
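
One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: call_experiment builds a command from sys.executable and hands it to subprocess.check_call, so the experiment appears to run in a new Python process. A class defined in a notebook cell lives in the kernel's __main__ namespace, and a fresh interpreter would not have it. The sketch below uses only the standard library and a placeholder Shell_Class (it does not touch gym or spinup) to show that a name defined in one process is absent from the __main__ of a child process.

```python
import subprocess
import sys


class Shell_Class:
    """Placeholder standing in for the notebook-defined environment class."""


# In the current process the class exists, so an entry point such as
# '__main__:Shell_Class' has something to resolve to (assuming this
# process plays the role of the notebook kernel).
parent_has_class = callable(Shell_Class)

# A new interpreter starts with its own, empty __main__ module.
# The child exits with code 0 only if it can see Shell_Class there.
child_code = (
    "import __main__, sys\n"
    "sys.exit(0 if hasattr(__main__, 'Shell_Class') else 1)"
)
child = subprocess.run([sys.executable, "-c", child_code])
child_has_class = child.returncode == 0

print("parent sees Shell_Class:", parent_has_class)
print("child sees Shell_Class:", child_has_class)
```

If this reading is right, it would also be consistent with the command-line setup working: there, Shell_Class is importable from gym.envs.mygames in any process, whereas the notebook registration points at __main__.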
