opencv - How to use readNet (or readFromDarknet) instead of readNetFromCaffe?

Question

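I did object detection with opencv by loading a pretrained MobileNet SSD model, following this post. It reads a video and detects objects without any problem. But I want to use readNet (or readFromDarknet) instead of readNetFromCaffe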

net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

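because I only have pretrained weights and a cfg file for my own objects, trained in the Darknet framework. So I simply changed readNetFromCaffe to readNet in the post above and got an error: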

Traceback (most recent call last):
  File "people_counter.py", line 124, in <module>
    for i in np.arange(0, detections.shape[2]):
IndexError: tuple index out of range

Here detections is the output of

blob = cv2.dnn.blobFromImage(frame, 1.0/255.0, (416, 416), True, crop=False)
net.setInput(blob)
detections = net.forward()

Its shape is the tuple (1, 1, 100, 7) (when using readNetFromCaffe).

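I half expected that it would not work just by swapping the model. So I looked for object detector code that uses readNet, and I found it here. I read through the code and found the following equivalent lines: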

blob = cv2.dnn.blobFromImage(image, scale, (416,416), (0,0,0), True, crop=False)
net.setInput(blob)
outs = net.forward(get_output_layers(net))

Here, outs is a list with shape (1, 845, 6). But for me to use it directly (here), outs would need the same shape as detections. I got as far as this part but don't know how to proceed.

In case anything is unclear: I just need help using readNet (or readFromDarknet) instead of readNetFromCaffe in this post.

score 0 · Accepted Answer

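If we look at the code carefully, we can see that everything depends on the detections output on line 121, and we need to adapt the rest of the code to the format of the outs on line 63 of the other script. After spending almost a day on it, I arrived at a reasonable (not perfect) solution. Basically, it all comes down to the output blobs of readNetFromCaffe and readFromDarknet (named readNetFromDarknet in the cv2.dnn API), which have the shapes 1x1xNx7 and NxC respectively. N is the number of detections in both cases, but the per-detection vectors differ. In 1x1xNx7, each detection is a vector of values [batchId, classId, confidence, left, top, right, bottom]. In NxC, each row starts with the 4 numbers [center_x, center_y, width, height]. The reference describes C as the number of classes + 4, but each row also carries an objectness score at index 4, which is why the replacement code reads the class scores from index 5 onward (and why the question sees 6 columns for a single class). With that in mind, we can replace (lines 124-130)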

for i in np.arange(0, detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > args["confidence"]:
        idx = int(detections[0, 0, i, 1])
        if CLASSES[idx] != "person":
            continue
        box = detections[0, 0, i, 3:7] * np.array([W, H, W, H])
        (startX, startY, endX, endY) = box.astype("int")

with the equivalent lines

for i in np.arange(0, detections.shape[0]):
    # each row: [center_x, center_y, width, height, objectness, class scores...]
    scores = detections[i][5:]
    classId = np.argmax(scores)
    confidence = scores[classId]
    if confidence > args["confidence"]:
        idx = int(classId)
        if CLASSES[idx] != "person":
            continue

        # box values are relative to the image, so scale by the frame size (W, H)
        center_x = int(detections[i][0] * W)
        center_y = int(detections[i][1] * H)
        width = int(detections[i][2] * W)
        height = int(detections[i][3] * H)
        left = int(center_x - width / 2)
        top = int(center_y - height / 2)
        right = width + left - 1
        bottom = height + top - 1

        # the rest of the script expects corner coordinates, not width/height
        box = [left, top, right, bottom]
        (startX, startY, endX, endY) = box

This way we can track the "person" class using Darknet's cfg and weights, and count up/down using the visualization line.
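The loop above indexes detections as a 2-D array of rows, while the question reports outs as a list with shape (1, 845, 6). A minimal sketch of collapsing either form into one (N, C) array, assuming every output block has the same number of columns; as_rows is a hypothetical helper name, not part of OpenCV:

```python
import numpy as np

def as_rows(outs):
    """Collapse the result of net.forward(...) into one 2-D (N, C) array.

    Hypothetical helper: accepts a single array or a list/tuple of arrays,
    each of which may carry leading singleton axes.
    """
    if isinstance(outs, np.ndarray):
        outs = [outs]
    blocks = []
    for out in outs:
        out = np.asarray(out)
        # Keep the last axis (one detection per row), flatten everything else.
        blocks.append(out.reshape(-1, out.shape[-1]))
    # Stack the per-layer blocks on top of each other.
    return np.vstack(blocks)

# Stand-in for the list reported in the question: one block of 845 rows, 6 columns.
outs = [np.zeros((845, 6), dtype=np.float32)]
detections = as_rows(outs)
print(detections.shape)
```

With detections in this form, detections.shape[0] is the number of candidate rows regardless of how many output layers the network has.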

Again, there may be other, simpler ways to track detections from Darknet weight files, but this works for this particular case.
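One possible alternative, sketched here under assumptions rather than taken from the original script, is to repack the Darknet rows into the 1x1xNx7 layout so that the original readNetFromCaffe loop can stay unchanged. The function name darknet_to_ssd_layout and the default threshold are hypothetical; the sketch assumes rows of [center_x, center_y, width, height, objectness, class scores...] with box values relative to the image, and it does no non-maximum suppression.

```python
import numpy as np

def darknet_to_ssd_layout(rows, conf_threshold=0.5):
    """Repack (N, C) Darknet-style rows into a (1, 1, M, 7) array.

    Each output row is [batchId, classId, confidence, left, top, right, bottom]
    with box corners kept relative to the image (0..1).
    Hypothetical helper, not an OpenCV function.
    """
    rows = np.asarray(rows, dtype=np.float32)
    if rows.size == 0:
        return np.zeros((1, 1, 0, 7), dtype=np.float32)

    scores = rows[:, 5:]                      # class scores start at index 5
    class_ids = scores.argmax(axis=1)         # best class per row
    confidences = scores[np.arange(len(rows)), class_ids]
    keep = confidences > conf_threshold       # drop weak candidates

    cx, cy = rows[keep, 0], rows[keep, 1]
    w, h = rows[keep, 2], rows[keep, 3]

    out = np.zeros((1, 1, int(keep.sum()), 7), dtype=np.float32)
    out[0, 0, :, 1] = class_ids[keep]
    out[0, 0, :, 2] = confidences[keep]
    out[0, 0, :, 3] = cx - w / 2              # left
    out[0, 0, :, 4] = cy - h / 2              # top
    out[0, 0, :, 5] = cx + w / 2              # right
    out[0, 0, :, 6] = cy + h / 2              # bottom
    return out

# Two made-up rows for a two-class model; only the first passes the threshold.
rows = [[0.5, 0.5, 0.2, 0.4, 0.9, 0.1, 0.8],
        [0.1, 0.1, 0.1, 0.1, 0.2, 0.05, 0.02]]
ssd_like = darknet_to_ssd_layout(rows, conf_threshold=0.5)
print(ssd_like.shape)
```

Because the corners stay relative to the image, the original line that multiplies detections[0, 0, i, 3:7] by np.array([W, H, W, H]) would still apply. CLASSES would have to list the Darknet class names in training order for the index lookup to be meaningful.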

Reference: more about the output blobs of readNetFromCaffe and readFromDarknet
