
I'm trying to add positional tracking to the Oculus Rift development kit using a depth sensor. However, I'm having trouble with the sequence of operations needed to produce usable results.

I start with a 16-bit depth image whose values roughly (but not exactly) correspond to millimeters. Undefined values in the image have been set to 0.

First, I eliminate everything outside a certain near/far distance by updating a mask image to exclude it:

  cv::Mat result = cv::Mat::zeros(depthImage.size(), CV_8UC3);
  cv::Mat depthMask;
  // Saturating 16->8 bit conversion: every defined (nonzero) depth becomes a
  // nonzero mask value, so the mask starts out covering all defined pixels
  depthImage.convertTo(depthMask, CV_8U);
  for_each_pixel<DepthImagePixel, uint8_t>(depthImage, depthMask,
    [&](DepthImagePixel & depthPixel, uint8_t & maskPixel){
      if (!maskPixel) {
        return;
      }
      // Reject anything closer than 200 or farther than 1200 (roughly mm)
      static const uint16_t depthMax = 1200;
      static const uint16_t depthMin = 200;
      if (depthPixel < depthMin || depthPixel > depthMax) {
        maskPixel = 0;
      }
  });

Next, since the feature I want is likely to be closer to the camera than the overall scene average, I update the mask again to exclude anything that isn't within a certain range of the median:

  const float depthAverage = cv::mean(depthImage, depthMask)[0];
  // Keep only pixels between 75% and 100% of the scene average
  const uint16_t depthMax = depthAverage * 1.0;
  const uint16_t depthMin = depthAverage * 0.75;
  for_each_pixel<DepthImagePixel, uint8_t>(depthImage, depthMask, 
    [&](DepthImagePixel & depthPixel, uint8_t & maskPixel){
      if (!maskPixel) {
        return;
      }
      if (depthPixel < depthMin || depthPixel > depthMax) {
        maskPixel = 0;
      }
  });

Finally, I zero out everything outside the mask, scale the remaining values to between 10 and 255, and convert the image format to 8-bit:

  cv::Mat outsideMask;
  cv::bitwise_not(depthMask, outsideMask);
  // Zero out everything outside the mask
  cv::subtract(depthImage, depthImage, depthImage, outsideMask);
  // Within the mask, shift the values down to start at zero...
  cv::subtract(depthImage, depthMin, depthImage, depthMask);
  // ...then scale them into [10, 255] and convert to 8 bit
  float range = depthMax - depthMin;
  float scale = (float)(UINT8_MAX - 10) / range;
  depthImage *= scale;
  cv::add(depthImage, 10, depthImage, depthMask);
  depthImage.convertTo(depthImage, CV_8U);

The result looks like this:

[image: source depth image]

I'm pretty happy with this section of the code, since it produces very clear visual features.

I then apply a couple of smoothing operations to get rid of the ridiculous amount of noise from the depth camera:

cv::medianBlur(depthImage, depthImage, 9);
cv::Mat blurred;
// bilateralFilter does not work in place, hence the temporary
cv::bilateralFilter(depthImage, blurred, 5, 250, 250);
depthImage = blurred;
cv::Mat result = cv::Mat::zeros(depthImage.size(), CV_8UC3);
cv::insertChannel(depthImage, result, 0);

Again, the features look visually pretty clear, but I wonder whether they couldn't be sharpened somehow:

[image: smoothed depth image]

Next I do edge detection with Canny:

  cv::Mat canny_output;
  {
    cv::Canny(depthImage, canny_output, 20, 80, 3, true);
    cv::insertChannel(canny_output, result, 1);
  }

The lines I'm looking for are there, but the corners are not well represented:

[image: Canny edge output]

Finally I use the probabilistic Hough transform to identify lines:

  std::vector<cv::Vec4i> lines;
  cv::HoughLinesP(canny_output, lines, pixelRes, degreeRes * CV_PI / 180, hughThreshold, hughMinLength, hughMaxGap);
  for (size_t i = 0; i < lines.size(); i++)
  {
    cv::Vec4i l = lines[i];
    // Note: the doubled parentheses in glm::vec2 a((l[0], l[1])) invoked the
    // comma operator, constructing the vector from l[1] alone
    glm::vec2 a(l[0], l[1]);
    glm::vec2 b(l[2], l[3]);
    float length = glm::length(a - b);
    cv::line(result, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(0, 0, 255), 3, CV_AA);
  }

This results in this image:

[image: detected Hough lines]

At this point I feel like I've gone off the rails: I can't find a good set of Hough parameters that produces a reasonable number of candidate lines in which to search for my shape, and I'm not sure whether I should keep fiddling with Hough or look at improving the output of the earlier steps.

Is there a good way to objectively validate my results at each stage, rather than just fiddling with the input values until I think it "looks good"? Is there a better way of finding the rectangle given the starting image (and given that it won't necessarily be oriented in any particular direction)?


1 Answer


Very cool project!

That said, I feel like your approach does not use all the information you could get from the depth map (e.g. 3D points, normals, etc.), which would help a lot.

The Point Cloud Library (PCL), a C++ library dedicated to the processing of RGB-D data, has a tutorial on plane segmentation using RANSAC which could inspire you. You might not want to use PCL in your program due to its numerous dependencies, but since it is open-source you can find the algorithm implementation on GitHub (PCL SAC segmentation). Be aware, though, that RANSAC can be slow and can produce unwanted results depending on the scene.

You could also try to use the approach presented in "Real-Time Plane Segmentation using RGB-D Cameras" by Holz, Holzer, Rusu and Behnke, 2011 (PDF), which suggests fast normal estimation using integral images followed by plane detection using clustering of normals.

Answered 2014-03-17T08:21:35.417