
I have a basic custom model that is essentially just a copy-paste of the default RLlib fully connected model (https://github.com/ray-project/ray/blob/master/rllib/models/tf/fcnet.py). I am passing custom model parameters through a config file via a "custom_model_config": {} dictionary. The config file looks like this:

# Custom RLLib model
custom_model: test_model

# Custom options
custom_model_config:
  ## Default fully connected network settings
  # Nonlinearity for fully connected net (tanh, relu)
  "fcnet_activation": "tanh"
  # Number of hidden layers for fully connected net
  "fcnet_hiddens": [256, 256]
  # For DiagGaussian action distributions, make the second half of the model
  # outputs floating bias variables instead of state-dependent. This only
  # has an effect if using the default fully connected net.
  "free_log_std": False
  # Whether to skip the final linear layer used to resize the hidden layer
  # outputs to size `num_outputs`. If True, then the last hidden layer
  # should already match num_outputs.
  "no_final_linear": False
  # Whether layers should be shared for the value function.
  "vf_share_layers": True

  ## Additional settings
  # L2 regularization value for fully connected layers
  "l2_reg_value": 0.1
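For reference, this is a sketch (not from the question) of the same settings expressed as the plain Python dict that RLlib ultimately receives under config["model"] once the YAML file is loaded:

```python
# Equivalent Python form of the YAML config above; RLlib reads the custom
# options from the nested "custom_model_config" dict.
model_config = {
    "custom_model": "test_model",
    "custom_model_config": {
        # Default fully connected network settings
        "fcnet_activation": "tanh",
        "fcnet_hiddens": [256, 256],
        "free_log_std": False,
        "no_final_linear": False,
        "vf_share_layers": True,
        # Additional settings
        "l2_reg_value": 0.1,
    },
}

print(model_config["custom_model_config"]["l2_reg_value"])  # 0.1
```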

When I start the training process with this setup, RLlib gives me the following warning:

Custom ModelV2 should accept all custom options as **kwargs, instead of expecting them in config['custom_model_config']!

I understand what **kwargs does, but I am not sure how to implement it in a custom RLlib model to fix this warning. Any ideas?


2 Answers


TL;DR: add **customized_model_kwargs to your network's __init__, then retrieve your custom config from it.


Let me explain how to get rid of this warning.

When you use a custom network, you are almost certainly doing something like this:

policy.target_q_model = ModelCatalog.get_model_v2(
        obs_space=obs_space,
        action_space=action_space,
        num_outputs=1,
        model_config=config["model"],
        framework="torch",
        name=Q_TARGET_SCOPE)

The model is instantiated by Ray like this (see ModelCatalog, https://docs.ray.io/en/master/_modules/ray/rllib/models/catalog.html):

instance = model_cls(obs_space, action_space, num_outputs,
                     model_config, name,
                     **customized_model_kwargs)

So you should declare your network like this:

  def __init__(self, obs_space: gym.spaces.Space,
               action_space: gym.spaces.Space, num_outputs: int,
               model_config: ModelConfigDict, name: str, **customized_model_kwargs):
    TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                          model_config, name)
    nn.Module.__init__(self)

Note the customized_model_kwargs parameter.

You can then access your custom config with customized_model_kwargs["your_key"].

Note: the same applies to TF.
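To see why this silences the warning, here is a dependency-free sketch of the forwarding that ModelCatalog performs: every key in custom_model_config is passed to the model's constructor as a keyword argument. ModelV2Stub and MyModel are illustrative stand-ins, not real RLlib classes.

```python
# Stand-in for the ModelV2 base class, just enough to show the pattern.
class ModelV2Stub:
    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        self.model_config = model_config
        self.name = name

class MyModel(ModelV2Stub):
    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name, **customized_model_kwargs):
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        # Custom options arrive here as keyword arguments.
        self.l2_reg_value = customized_model_kwargs.get("l2_reg_value", 0.0)

model_config = {"custom_model_config": {"l2_reg_value": 0.1}}
# This mirrors what ModelCatalog.get_model_v2 does internally:
customized_model_kwargs = dict(model_config.get("custom_model_config", {}))
instance = MyModel(None, None, 1, model_config, "q_model",
                   **customized_model_kwargs)
print(instance.l2_reg_value)  # 0.1
```

Because the constructor now declares **customized_model_kwargs, the extra keys are absorbed cleanly and RLlib no longer warns.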

Answered 2021-01-19T16:00:01.707

You can pass custom model parameters by setting "custom_model_config" as part of the model config; it is empty by default.

From the docs:

# Name of a custom model to use
"custom_model": None,
# Extra options to pass to the custom classes. These will be available to
# the Model's constructor in the model_config field. Also, they will be
# attempted to be passed as **kwargs to ModelV2 models. For an example,
# see rllib/models/[tf|torch]/attention_net.py.
"custom_model_config": {},

Your custom model receives a model_config argument in its constructor. You can access your model parameters via model_config["custom_model_config"].


Example:

# setting custom params
config = ppo.DEFAULT_CONFIG.copy()
config["model"] = {
  "custom_model": MyModel,
  "custom_model_config": {
    "my_param": 42
  }
}
...
trainer = ppo.PPOTrainer(config=config, env=MyEnv)

Inside MyModel:

class MyModel(TFModelV2):
  def __init__(self, obs_space, action_space, num_outputs, model_config, name, **kwargs):
    super().__init__(obs_space, action_space, num_outputs, model_config, name)
    self.my_param = model_config["custom_model_config"]["my_param"]
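
Since RLlib passes the custom options both ways, the same parameter can be read either from model_config["custom_model_config"] or from **kwargs. A dependency-free sketch (TFModelV2Stub is an illustrative stand-in, not the real RLlib class) showing that both reads agree:

```python
# Stand-in base class, just enough to demonstrate the two access styles.
class TFModelV2Stub:
    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        self.model_config = model_config

class MyModel(TFModelV2Stub):
    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name, **kwargs):
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        # Both reads yield the same value:
        self.from_config = model_config["custom_model_config"]["my_param"]
        self.from_kwargs = kwargs["my_param"]

cfg = {"custom_model_config": {"my_param": 42}}
m = MyModel(None, None, 2, cfg, "my_model", **cfg["custom_model_config"])
print(m.from_config == m.from_kwargs == 42)  # True
```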
Answered 2021-11-15T21:12:02.747