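I am using the following:
- CUDA 10.0
- PyTorch 1.2
- https://github.com/ruotianluo/pytorch-faster-rcnn
- The test weights are different from the training weights.
- The training weights come from a Caffe-pretrained ResNet101 backbone.
I took this repo and converted it to use Kitti data. To do that, I added a new Kitti class to the datasets and did the necessary conversions. Testing and evaluation both use the following class set from PASCAL VOC: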
self._classes = (
'__background__', # always index 0
'aeroplane',
'bicycle',
'bird',
'boat',
'bottle',
'bus',
'car',
'cat',
'chair',
'cow',
'diningtable',
'dog',
'horse',
'motorbike',
'person',
'pottedplant',
'sheep',
'sofa',
'train',
'tvmonitor')
I changed the class set to:
self._classes = (
'dontcare', # always index 0
'pedestrian',
'car',
'truck',
'cyclist')
#-----------------------------
N.B.: Classes should NOT matter here, as the result out of the backbone is simply a featureset, not a classification
#-----------------------------
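For reference, a minimal sketch of how a class tuple like the one above can be turned into label indices. The `dict(zip(...))` mapping and the helper `label_to_index` are my own illustration, not code from the repo. Kitti label files capitalize their type names ('Car', 'Pedestrian', 'DontCare') and contain types that are not in my tuple (for example 'Van'), so the lookup lowercases the label and returns None for anything unknown.

```python
# Class tuple from the post; index 0 takes the place of '__background__'.
classes = ('dontcare', 'pedestrian', 'car', 'truck', 'cyclist')
class_to_ind = dict(zip(classes, range(len(classes))))


def label_to_index(raw_label):
    """Map a raw annotation label to its class index, or None if it is not in the class set.

    Hypothetical helper: lowercases the label so that 'Car' matches 'car'.
    """
    return class_to_ind.get(raw_label.strip().lower())


print(label_to_index('Car'))   # a type that is in the tuple
print(label_to_index('Van'))   # a type that is not in the tuple
```

Annotations that map to None would need to be skipped or remapped explicitly rather than silently indexed.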
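On seemingly random images (removing these "problem" images from the training set just seems to change which image the program fails on), the training code produces NaNs from the region proposal network. I am at a loss as to why.
- Tried changing the normalization to Kitti-specific normalization values
- Tried resizing the images to 224x224
- Tried dividing the normalization values by the mean standard deviation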
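One check that is not among the attempts above is validating the ground-truth boxes after the Kitti conversion. Box-regression targets typically involve the log of a width or height ratio, so a zero-size or out-of-range box could produce an infinite or NaN loss, which could then corrupt the weights and surface as NaNs on a later image. The sketch below is only an illustration: `find_bad_boxes` is a hypothetical helper, the boxes are made up, and it assumes (x1, y1, x2, y2) pixel coordinates.

```python
import math


def find_bad_boxes(boxes, width, height):
    """Return (index, reason) pairs for boxes that are invalid for the given image size.

    Each box is assumed to be (x1, y1, x2, y2) in pixel coordinates.
    """
    bad = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if any(math.isnan(v) or math.isinf(v) for v in (x1, y1, x2, y2)):
            bad.append((i, "non-finite coordinate"))
        elif x2 <= x1 or y2 <= y1:
            bad.append((i, "zero or negative size"))
        elif x1 < 0 or y1 < 0 or x2 > width - 1 or y2 > height - 1:
            bad.append((i, "outside image"))
    return bad


# Made-up boxes for a 1242x375 image (a common Kitti frame size).
boxes = [
    (100.0, 120.0, 180.0, 200.0),  # valid
    (300.0, 150.0, 300.0, 190.0),  # zero width
    (-1.0, 10.0, 50.0, 60.0),      # negative x1
]
print(find_bad_boxes(boxes, width=1242, height=375))
```

Running this over every annotation before training would show whether any converted box is degenerate, independent of which image the training loop happens to fail on.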
-----------------
Network definition
-----------------
self.conv1 = conv3x3(inplanes, planes, stride)
self.bn1 = norm_layer(planes)
self.relu = nn.ReLU(inplace=True)
self.conv2 = conv3x3(planes, planes)
self.bn2 = norm_layer(planes)
self.downsample = downsample
self.stride = stride
self._layers['head'] = nn.Sequential(self.resnet.conv1, self.resnet.bn1, self.resnet.relu,
                                     self.resnet.maxpool, self.resnet.layer1, self.resnet.layer2,
                                     self.resnet.layer3)
self.rpn_net = nn.Conv2d(self._net_conv_channels, cfg.RPN_CHANNELS, [3, 3], padding=1)
-----------------
Preparing the image
-----------------
self._image = torch.from_numpy(image.transpose([0, 3, 1, 2])).to(self._device)
self.net.train_step(blob, self.optimizer)
-----------------
Computation graph
-----------------
(1) self.forward(blobs['data'], blobs['im_info'], blobs['gt_boxes'])
(2) rois, cls_prob, bbox_pred = self._predict()
(3) net_conv = self._image_to_head()
(4) net_conv = self._layers['head'](self._image)
(5) rpn = F.relu(self.rpn_net(net_conv))
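To narrow down which layer in this chain first emits a non-finite value, forward hooks are one option. The sketch below is an assumption on my part, not code from the repo: `add_nan_hooks` is a hypothetical helper, and a toy `nn.Sequential` with a deliberately corrupted BatchNorm weight stands in for `self._layers['head']`.

```python
import torch
import torch.nn as nn


def add_nan_hooks(model):
    """Register forward hooks that record submodules whose output is non-finite.

    Returns the list that the hooks append to (in call order) and the hook
    handles, so the hooks can be removed afterwards.
    """
    bad = []

    def make_hook(name):
        def hook(module, inputs, output):
            if torch.is_tensor(output) and not torch.isfinite(output).all():
                bad.append(name)
        return hook

    # Skip the root module (empty name) so only submodules are reported.
    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if n]
    return bad, handles


# Toy stand-in for the real head.
toy = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.BatchNorm2d(4), nn.ReLU())
with torch.no_grad():
    toy[1].weight.fill_(float('nan'))  # simulate corrupted BatchNorm weights

bad, handles = add_nan_hooks(toy)
toy(torch.randn(1, 3, 8, 8))
for h in handles:
    h.remove()
print(bad)  # the first entry is the earliest submodule with a non-finite output
```

If the first flagged module is the very first convolution, the input image or the loaded weights are suspect; if the weights are already NaN before the forward pass, the damage more likely happened in an earlier backward pass or optimizer step.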
------------------
Useful functions for solving the problem
------------------
def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)

def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)
I don't know why this is happening, but obviously I expect real numbers out of the ResNet101 backbone. I may have to switch to vgg16.
Output of (3):
tensor([[[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
...,
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]]], device='cuda:0'
Does anyone know what is going on here?