python - Delay in displaying images when using cv2.CascadeClassifier() in real time

Question

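I am working on a ROS project for the Tello drone, using this driver. When I simply subscribe to the CompressedImage messages from the drone's camera and display the images on screen, I have no problems and everything works fine.

However, as soon as I add face detection with cv2.CascadeClassifier, the frames arrive with a huge delay of about 30 seconds, so each image only appears on screen roughly 30 seconds late. Does anyone know how to minimize this delay and get good real-time results?

Here is the code so far: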

#!/usr/bin/env python

import rospy
from sensor_msgs.msg import CompressedImage
import av
import cv2
import numpy
import threading
import traceback


class StandaloneVideoStream(object):
    def __init__(self):
        self.cond = threading.Condition()
        self.queue = []
        self.closed = False

    def read(self, size):
        self.cond.acquire()
        try:
            if len(self.queue) == 0 and not self.closed:
                self.cond.wait(2.0)
            data = bytes()
            while 0 < len(self.queue) and len(data) + len(self.queue[0]) < size:
                data = data + self.queue[0]
                del self.queue[0]
        finally:
            self.cond.release()
        return data

    def seek(self, offset, whence):
        return -1

    def close(self):
        self.cond.acquire()
        self.queue = []
        self.closed = True
        self.cond.notifyAll()
        self.cond.release()

    def add_frame(self, buf):
        self.cond.acquire()
        self.queue.append(buf)
        self.cond.notifyAll()
        self.cond.release()


stream = StandaloneVideoStream()


def callback(msg):
    stream.add_frame(msg.data)


def main():
    rospy.init_node('face_detection')

    rospy.Subscriber('/tello/image_raw/h264', CompressedImage, callback)

    container = av.open(stream)

    for frame in container.decode(video=0):
        image_msg = cv2.cvtColor(numpy.array(frame.to_image()), cv2.COLOR_RGB2BGR) 

        stop_data = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
        found = stop_data.detectMultiScale(image_msg, minSize =(20, 20))

        amount_found = len(found)

        if amount_found != 0:
            for (x, y, width, height) in found:
                # Bottom-right corner is (x + width, y + height)
                cv2.rectangle(image_msg, (x, y), (x + width, y + height), (0, 255, 0), 5)

        cv2.imshow('Frame', image_msg)
        cv2.waitKey(1)


if __name__ == '__main__':
    try:
        main()
    except BaseException:
        traceback.print_exc()
    finally:
        stream.close()
        cv2.destroyAllWindows()

Edit:

When I print the shape of the image (image_msg), I get (720, 960, 3): height, width, and 3 channels.

This shows the size of the stream in bytes:

...
    Tello: 15:54:16.106:  Info: video data 599118 bytes 290.2KB/sec
    Tello: 15:54:18.106:  Info: video data 502212 bytes 245.2KB/sec
    Tello: 15:54:20.108:  Info: video data 503748 bytes 245.7KB/sec
    Tello: 15:54:22.109:  Info: video data 503182 bytes 245.6KB/sec
    Tello: 15:54:22.446:  Info: video recv: 1460 bytes 1b00 +103
    Tello: 15:54:22.813:  Info: video recv: 1460 bytes 2400 +173
    Tello: 15:54:23.190:  Info: video recv: 1460 bytes 2f00 +177
    Tello: 15:54:23.554:  Info: video recv: 1460 bytes 3a00 +178
    Tello: 15:54:23.918:  Info: video recv: 1460 bytes 4500 +176
    Tello: 15:54:24.268:  Info: video recv: 1460 bytes 5000 +160
    Tello: 15:54:24.268:  Info: video data 502157 bytes 227.1KB/sec
    Tello: 15:54:24.585:  Info: video recv: 1460 bytes 5c00 +140
    Tello: 15:54:24.917:  Info: video recv: 1460 bytes 6600 +142
    Tello: 15:54:25.266:  Info: video recv: 1460 bytes 7000 +157
    Tello: 15:54:25.545:  Info: video recv: 1460 bytes 7a00 +102
    Tello: 15:54:25.878:  Info: video recv: 1460 bytes 8201 +140
    Tello: 15:54:26.178:  Info: video recv: 1460 bytes 8d00 +102
    Tello: 15:54:26.271:  Info: video data 534194 bytes 260.5KB/sec
...

score 0 · Accepted Answer

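If this code has a 30-second loop time, and it is significantly faster with the "stop_data =" and "found =" lines commented out, then those lines are the bottleneck. You have three options, in order of severity: (1) change the parameters, (2) change the input data, or (3) change the algorithm. I assume you have already tried (1) and do not want to do (3), so your only remaining option is (2), changing the input data.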
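A minimal sketch of how that premise could be checked without commenting lines in and out: time each step of the loop and compare. The helpers `timed` and `slowest_step` are hypothetical names introduced here for illustration; they are not part of ROS, PyAV, or OpenCV.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.time()
    result = fn(*args, **kwargs)
    return result, time.time() - start


def slowest_step(timings):
    """Return the name of the slowest step in a {name: seconds} dict."""
    return max(timings, key=timings.get)
```

Inside the question's loop this might be used as `found, t_detect = timed(stop_data.detectMultiScale, image_msg, minSize=(20, 20))`, collecting the per-step times in a dict and printing `slowest_step(...)` every few frames. If the detection step does not dominate, the delay likely comes from somewhere else and shrinking the input will help less.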

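Try downsampling the image to 240x360 or 480x720. It should be faster. You can use cv2.pyrDown(), which performs Gaussian-smoothed downsampling and gives a smoother image than simply keeping every n-th pixel. Each call halves the width and height, so a (720, 960, 3) frame like the one in the question becomes (360, 480, 3) after one call.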
