我正在尝试训练代理玩Connect4游戏。我找到了一个如何训练它的例子。板的表示是 1x6x7 数组:
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 2]
[0 0 0 0 0 0 1]]]
使用了这种神经网络架构:
class Net(BaseFeaturesExtractor):
def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
super(Net, self).__init__(observation_space, features_dim)
# We assume CxHxW images (channels first)
# Re-ordering will be done by pre-preprocessing or wrapper
n_input_channels = observation_space.shape[0]
self.cnn = nn.Sequential(
nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
nn.ReLU(),
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
nn.ReLU(),
nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=0),
nn.ReLU(),
nn.Flatten(),
)
# Compute shape by doing one forward pass
with th.no_grad():
n_flatten = self.cnn(th.as_tensor(observation_space.sample()[None]).float()).shape[1]
self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())
def forward(self, observations: th.Tensor) -> th.Tensor:
return self.linear(self.cnn(observations))
在与随机移动的智能体 2 的游戏中,它的得分还不错:
Agent 1 Win Percentage: 0.59
Agent 2 Win Percentage: 0.38
Number of Invalid Plays by Agent 1: 3
Number of Invalid Plays by Agent 2: 0
Number of Draws (in 100 game rounds): 0
这里建议将 3 层表示作为改进代理的方法之一:
我试图实现它,这是新的 3 层板表示的示例:
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 1]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 0]
[1 1 1 1 1 1 0]]]
当我使用当前的神经网络架构运行它时,代理无法正确训练:
Agent 1 Win Percentage: 0.0
Agent 2 Win Percentage: 0.0
Number of Invalid Plays by Agent 1: 100
Number of Invalid Plays by Agent 2: 0
Number of Draws (in 100 game rounds): 0
在这里你可以看到我的代码。
正如你现在所看到的,我有 3 层而不是 1 层。这就是我尝试使用 Conv3d 的原因:
class Net(BaseFeaturesExtractor):
def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
super(Net, self).__init__(observation_space, features_dim)
# We assume CxHxW images (channels first)
# Re-ordering will be done by pre-preprocessing or wrapper
n_input_channels = observation_space.shape[0]
self.cnn = nn.Sequential(
nn.Conv3d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
nn.ReLU(),
nn.Conv3d(32, 64, kernel_size=3, stride=1, padding=1),
nn.ReLU(),
nn.Conv3d(64, 128, kernel_size=3, stride=1, padding=0),
nn.ReLU(),
nn.Flatten(),
)
# Compute shape by doing one forward pass
with th.no_grad():
n_flatten = self.cnn(th.as_tensor(observation_space.sample()[None]).float()).shape[1]
self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())
当我尝试运行此代码时,它显示此错误:
RuntimeError: 5 维权重 [32, 1, 3, 3, 3] 的预期 5 维输入,但得到了大小为 [1, 3, 6, 7] 的 4 维输入
我的问题:如何使用具有 3x6x7 形状输入的 Conv3D 层?