c++ - Matching an image in a set of images with OpenCV for the purpose of identification in C++

Question

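EDIT: I've gained enough reputation from this post to edit it with more links, which should help me get my point across better.

People who play The Binding of Isaac often come across important items sitting on small pedestals.

The goal is to let a user who is confused about an item press a button, which then prompts them to "box" the item (think of dragging a selection box on the Windows desktop). The box gives us the region of interest (the actual item plus some background environment) to compare against what will be an entire grid of items.

Theoretical user-boxed item: [image]

Theoretical grid of items (there aren't many more; I just ripped this from The Binding of Isaac wiki): [image]

The location in the item grid identified as the item the user boxed would represent a certain area of the image, and that area correlates to the proper link on The Binding of Isaac wiki giving information about the item.

In the grid, the item is in the 1st column, 3rd row from the bottom. I use these two images in everything I tried below.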
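To make the "grid location maps to a wiki link" idea concrete, here is a minimal sketch of turning a matched pixel position into a grid cell. It assumes the grid image is made of equally sized cells; `GridCell`, `cellFromPixel`, the cell size and the column count are all hypothetical names and values chosen for illustration, not anything taken from the wiki image.

```cpp
#include <stdexcept>

// Position of one item inside the item-grid image.
struct GridCell {
    int row;    // 0-based, counted from the top
    int col;    // 0-based, counted from the left
    int index;  // row-major index, usable as a key into a table of wiki URLs
};

// Map the centre of a matched rectangle (pixel coordinates in the grid image)
// to a grid cell. cellWidth, cellHeight and columns are assumed values that
// depend on how the grid image is actually laid out.
GridCell cellFromPixel(int x, int y, int cellWidth, int cellHeight, int columns) {
    if (x < 0 || y < 0 || cellWidth <= 0 || cellHeight <= 0 || columns <= 0)
        throw std::invalid_argument("bad grid geometry or pixel position");
    int col = x / cellWidth;
    int row = y / cellHeight;
    if (col >= columns)
        throw std::out_of_range("pixel lies to the right of the last column");
    return GridCell{row, col, row * columns + col};
}
```

With 32x32-pixel cells and 20 columns, for example, a match centred at (10, 70) lands in row 2, column 0, which is index 40. That index could then be used to look up the wiki URL in a separately maintained list.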

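My goal is to create a program that can take a manual crop of an item from the game The Binding of Isaac, identify the cropped item by comparing it against an image of a table of the game's items, and then display the proper wiki page.

This would be my first "real project" in the sense that it requires a lot of library learning to get done what I want. It's been a bit overwhelming.

I've tried a few options just from googling around. (You can quickly find the tutorials I used by searching for the method name plus "opencv". For some reason my account is heavily restricted in posting links.)

Using the brute-force matcher: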

http://docs.opencv.org/doc/tutorials/features2d/feature_description/feature_description.html

#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"

using namespace cv;

void readme();

/** @function main */
int main( int argc, char** argv )
{
  if( argc != 3 )
   { readme(); return -1; }

  Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
  Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

  if( !img_1.data || !img_2.data )
   { return -1; }

  //-- Step 1: Detect the keypoints using SURF Detector
  int minHessian = 400;

  SurfFeatureDetector detector( minHessian );

  std::vector<KeyPoint> keypoints_1, keypoints_2;

  detector.detect( img_1, keypoints_1 );
  detector.detect( img_2, keypoints_2 );

  //-- Step 2: Calculate descriptors (feature vectors)
  SurfDescriptorExtractor extractor;

  Mat descriptors_1, descriptors_2;

  extractor.compute( img_1, keypoints_1, descriptors_1 );
  extractor.compute( img_2, keypoints_2, descriptors_2 );

  //-- Step 3: Matching descriptor vectors with a brute force matcher
  BruteForceMatcher< L2<float> > matcher;
  std::vector< DMatch > matches;
  matcher.match( descriptors_1, descriptors_2, matches );

  //-- Draw matches
  Mat img_matches;
  drawMatches( img_1, keypoints_1, img_2, keypoints_2, matches, img_matches );

  //-- Show detected matches
  imshow("Matches", img_matches );

  waitKey(0);

  return 0;
  }

 /** @function readme */
 void readme()
 { std::cout << " Usage: ./SURF_descriptor <img1> <img2>" << std::endl; }

[image]

This results in something that doesn't look very useful. Using FLANN gives cleaner but equally unreliable results.

http://docs.opencv.org/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html

#include <stdio.h>
#include <iostream>
#include "opencv2/core/core.hpp"
#include <opencv2/legacy/legacy.hpp>
#include <opencv2/nonfree/features2d.hpp>
#include "opencv2/highgui/highgui.hpp"

using namespace cv;

void readme();

/** @function main */
int main( int argc, char** argv )
{
  if( argc != 3 )
  { readme(); return -1; }

  Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
  Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

  if( !img_1.data || !img_2.data )
  { std::cout<< " --(!) Error reading images " << std::endl; return -1; }

  //-- Step 1: Detect the keypoints using SURF Detector
  int minHessian = 400;

  SurfFeatureDetector detector( minHessian );

  std::vector<KeyPoint> keypoints_1, keypoints_2;

  detector.detect( img_1, keypoints_1 );
  detector.detect( img_2, keypoints_2 );

  //-- Step 2: Calculate descriptors (feature vectors)
  SurfDescriptorExtractor extractor;

  Mat descriptors_1, descriptors_2;

  extractor.compute( img_1, keypoints_1, descriptors_1 );
  extractor.compute( img_2, keypoints_2, descriptors_2 );

  //-- Step 3: Matching descriptor vectors using FLANN matcher
  FlannBasedMatcher matcher;
  std::vector< DMatch > matches;
  matcher.match( descriptors_1, descriptors_2, matches );

  double max_dist = 0; double min_dist = 100;

  //-- Quick calculation of max and min distances between keypoints
  for( size_t i = 0; i < matches.size(); i++ )
  { double dist = matches[i].distance;
    if( dist < min_dist ) min_dist = dist;
    if( dist > max_dist ) max_dist = dist;
  }

  printf("-- Max dist : %f \n", max_dist );
  printf("-- Min dist : %f \n", min_dist );

  //-- Draw only "good" matches (i.e. whose distance is less than 2*min_dist )
  //-- PS.- radiusMatch can also be used here.
  std::vector< DMatch > good_matches;

  for( size_t i = 0; i < matches.size(); i++ )
  { if( matches[i].distance < 2*min_dist )
    { good_matches.push_back( matches[i]); }
  }

  //-- Draw only "good" matches
  Mat img_matches;
  drawMatches( img_1, keypoints_1, img_2, keypoints_2,
               good_matches, img_matches, Scalar::all(-1), Scalar::all(-1),
               vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS );

  //-- Show detected matches
  imshow( "Good Matches", img_matches );

  for( int i = 0; i < good_matches.size(); i++ )
  { printf( "-- Good Match [%d] Keypoint 1: %d  -- Keypoint 2: %d  \n", i, good_matches[i].queryIdx, good_matches[i].trainIdx ); }

  waitKey(0);

  return 0;
 }

 /** @function readme */
 void readme()
 { std::cout << " Usage: ./SURF_FlannMatcher <img1> <img2>" << std::endl; }

[image]

Template matching has been my best method so far. Across its 6 methods, it gets anywhere from 0 to 4 correct identifications.

http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>

using namespace std;
using namespace cv;

/// Global Variables
Mat img; Mat templ; Mat result;
const char* image_window = "Source Image";
const char* result_window = "Result window";

int match_method;
int max_Trackbar = 5;

/// Function Headers
void MatchingMethod( int, void* );

/** @function main */
int main( int argc, char** argv )
{
  /// Load image and template
  img = imread( argv[1], 1 );
  templ = imread( argv[2], 1 );

  /// Create windows
  namedWindow( image_window, CV_WINDOW_AUTOSIZE );
  namedWindow( result_window, CV_WINDOW_AUTOSIZE );

  /// Create Trackbar
  const char* trackbar_label = "Method: \n 0: SQDIFF \n 1: SQDIFF NORMED \n 2: TM CCORR \n 3: TM CCORR NORMED \n 4: TM COEFF \n 5: TM COEFF NORMED";
  createTrackbar( trackbar_label, image_window, &match_method, max_Trackbar, MatchingMethod );

  MatchingMethod( 0, 0 );

  waitKey(0);
  return 0;
}

/**
 * @function MatchingMethod
 * @brief Trackbar callback
 */
void MatchingMethod( int, void* )
{
  /// Source image to display
  Mat img_display;
  img.copyTo( img_display );

  /// Create the result matrix
  int result_cols =  img.cols - templ.cols + 1;
  int result_rows = img.rows - templ.rows + 1;

  result.create( result_rows, result_cols, CV_32FC1 );  // Mat::create takes rows first, then cols

  /// Do the Matching and Normalize
  matchTemplate( img, templ, result, match_method );
  normalize( result, result, 0, 1, NORM_MINMAX, -1, Mat() );

  /// Localizing the best match with minMaxLoc
  double minVal; double maxVal; Point minLoc; Point maxLoc;
  Point matchLoc;

  minMaxLoc( result, &minVal, &maxVal, &minLoc, &maxLoc, Mat() );

  /// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
  if( match_method  == CV_TM_SQDIFF || match_method == CV_TM_SQDIFF_NORMED )
    { matchLoc = minLoc; }
  else
    { matchLoc = maxLoc; }

  /// Show me what you got
  rectangle( img_display, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
  rectangle( result, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );

  imshow( image_window, img_display );
  imshow( result_window, result );

  return;
}

http://imgur.com/pIRBPQM,h0wkqer,1JG0QY0,haLJzRF,CmrlTeL,DZuW73V#3

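Of the 6 methods: fail, pass, fail, pass, pass, pass.

This was something of a best-case result, though. The next item I tried was

[image] and it resulted in fail, fail, fail, fail, fail, fail.

From item to item, each of these methods works well for some and terribly for others.

So I'll ask: is template matching my best bet, or is there a method I'm not considering that will be my holy grail?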

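How can I get the user to create the crop manually? OpenCV's documentation on this is really bad, and the examples I find online are extremely old C++ or straight C.

Thanks for any help. This venture has been an interesting experience so far. I had to strip all of the links that would better show how everything has been working out, because the site says I am posting more than 10 links even when I'm not.

Some more examples of items throughout the game:

The rock is a rare item and one of the few that can be "anywhere" on the screen. Items like the rock are the reason why having the user crop the item is the best way to isolate it; otherwise items only appear in a couple of specific places.

[image]

An item after a boss fight, with lots of stuff everywhere and transparency in the middle. I would imagine this to be one of the harder ones to get working correctly.

[image]

A rare room. Simple background. No item transparency.

[image]

Here are the two tables containing all the items in the game. I'll make them into one image eventually, but for now they were taken directly from the Isaac wiki.

[image]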

score 2 · Accepted Answer

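An important detail here is that you have a pure image of every item in your table. You know the colour of the background and can separate the item from the rest of the picture. For example, in addition to the matrix representing the image itself, you can store a matrix of 1s and 0s of the same size, where 1 corresponds to the item area and 0 to the background. Let's call this matrix the "mask" and the pure image of the item the "pattern".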
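As a rough illustration of building such a mask, the sketch below marks every pixel that differs from a known background colour as part of the item. The `Rgb` type, the `buildMask` name and the flat row-major pixel layout are assumptions made for this example; with OpenCV you would work on a `Mat` instead.

```cpp
#include <cstddef>
#include <vector>

// One colour pixel; a flat row-major vector of these stands in for an image.
struct Rgb {
    unsigned char r, g, b;
};

bool sameColour(const Rgb& a, const Rgb& b) {
    return a.r == b.r && a.g == b.g && a.b == b.b;
}

// Returns a mask with one entry per pixel: 1 where the pattern shows the item,
// 0 where it shows the known background colour.
std::vector<unsigned char> buildMask(const std::vector<Rgb>& pattern, const Rgb& background) {
    std::vector<unsigned char> mask(pattern.size());
    for (std::size_t i = 0; i < pattern.size(); ++i)
        mask[i] = sameColour(pattern[i], background) ? 0 : 1;
    return mask;
}
```

One caveat of this simple rule: any item pixel that happens to have exactly the background colour is treated as background, so the background colour of the item table should be one the items themselves do not use.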

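There are two ways to compare images: match the image against the pattern, or match the pattern against the image. What you have described is matching the image against the pattern: you have some cropped image and want to find a similar pattern. Instead, think about searching for the pattern on the image.

Let's first define a function match() that takes a pattern, a mask, and an image of the same size, and checks whether the area of the pattern under the mask is exactly the same as in the image (pseudocode):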

def match(pattern, mask, image):
    """Return True if image equals pattern at every pixel where mask is 1."""
    height, width = len(pattern), len(pattern[0])
    for y in range(height):
        for x in range(width):
            # Only compare pixels that belong to the item, not the background.
            if mask[y][x] == 1 and pattern[y][x] != image[y][x]:
                return False
    return True

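But the pattern and the cropped image may differ in size. The standard solution for this (used, for example, in cascade classifiers) is a sliding window: move the pattern "window" across the image and check whether the pattern matches the selected region. This is pretty much how image detection works in OpenCV.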
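The sliding-window idea can be sketched in plain C++ as follows. This is only an illustration under simplifying assumptions: images are single-channel, stored in a hypothetical `Gray` struct, and the comparison is the exact masked match described above. `matchAt` and `slidingWindow` are names invented for this example.

```cpp
#include <utility>
#include <vector>

// Minimal grayscale image: row-major pixels, one byte each.
struct Gray {
    int width;
    int height;
    std::vector<unsigned char> px;
    unsigned char at(int x, int y) const { return px[y * width + x]; }
};

// True if the pattern, placed with its top-left corner at (ox, oy) in the image,
// equals the image at every pixel where the mask is non-zero.
bool matchAt(const Gray& pattern, const Gray& mask, const Gray& image, int ox, int oy) {
    for (int y = 0; y < pattern.height; ++y)
        for (int x = 0; x < pattern.width; ++x)
            if (mask.at(x, y) != 0 && pattern.at(x, y) != image.at(ox + x, oy + y))
                return false;
    return true;
}

// Slide the pattern over every position where it fits entirely inside the image
// and collect the top-left corners of all exact masked matches.
std::vector<std::pair<int, int> > slidingWindow(const Gray& pattern, const Gray& mask,
                                                const Gray& image) {
    std::vector<std::pair<int, int> > hits;
    for (int oy = 0; oy + pattern.height <= image.height; ++oy)
        for (int ox = 0; ox + pattern.width <= image.width; ++ox)
            if (matchAt(pattern, mask, image, ox, oy))
                hits.push_back(std::make_pair(ox, oy));
    return hits;
}
```

The cost grows with image area times pattern area, which is acceptable here because the user's crop is small.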

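Of course, this solution is not very robust. Cropping, resizing, or any other image transformation may change some pixels, and in that case match() will always return false. To overcome this, you can replace the boolean answer with a distance between the image and the pattern. In that case match() should return some similarity value, say between 0 and 1, where 1 stands for "exactly the same" and 0 for "completely different". Then you either set a similarity threshold (e.g. the image should be at least 85% similar to the pattern) or simply pick the pattern with the highest similarity value.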
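One possible way to turn the boolean match into a similarity score is sketched below. The choice of metric is an assumption on my part: it uses one minus the mean absolute pixel difference under the mask, scaled to the range 0 to 1. The `Gray` struct, `similarity` and `bestPattern` are hypothetical names, and the region is assumed to have already been resized to the pattern's dimensions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal grayscale image: row-major pixels, one byte each.
struct Gray {
    int width;
    int height;
    std::vector<unsigned char> px;
    unsigned char at(int x, int y) const { return px[y * width + x]; }
};

// Similarity in [0, 1] between a pattern and an equally sized image region,
// measured only under the mask: 1 means identical, 0 means maximally different.
double similarity(const Gray& pattern, const Gray& mask, const Gray& region) {
    long totalDiff = 0;
    long counted = 0;
    for (int y = 0; y < pattern.height; ++y) {
        for (int x = 0; x < pattern.width; ++x) {
            if (mask.at(x, y) != 0) {
                totalDiff += std::abs(int(pattern.at(x, y)) - int(region.at(x, y)));
                ++counted;
            }
        }
    }
    if (counted == 0) return 0.0;  // an empty mask carries no evidence
    return 1.0 - double(totalDiff) / (255.0 * counted);
}

// Index of the most similar pattern, or -1 if none reaches the threshold.
int bestPattern(const std::vector<Gray>& patterns, const std::vector<Gray>& masks,
                const Gray& region, double threshold) {
    int best = -1;
    double bestScore = threshold;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        double s = similarity(patterns[i], masks[i], region);
        if (s >= bestScore) {
            bestScore = s;
            best = int(i);
        }
    }
    return best;
}
```

Calling `bestPattern(patterns, masks, region, 0.85)` combines both options from the paragraph above: it picks the highest-scoring pattern but rejects the result if even the best one is less than 85% similar.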

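Since the items in the game are artificial images with very little variation, this approach should be enough. For more complicated cases, however, you will need features other than just the pixels under the mask. As I already suggested in the comments, methods such as Eigenfaces, cascade classifiers using Haar-like features, or even Active Appearance Models may be more efficient for these tasks. As for SURF, as far as I know it is better suited to tasks with varying angles and object sizes, but not to varying backgrounds and the like.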

score 2 · Accepted Answer

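I came across your question while trying to figure out my own template-matching issue, and now I'm back to share what I think may be your best bet, based on my own experience. You have probably long since abandoned this, but someone else may be in similar shoes one day.

None of the items you shared is a solid rectangle, and since template matching in OpenCV (at least in the 2.x API the question uses) cannot work with a mask, you will always be comparing your reference image against what I must assume are at least several different backgrounds (not to mention items found at varying locations on different backgrounds, which makes the template match worse). It will always compare the background pixels and confuse your match, unless you can collect a crop of every single situation in which the reference image can be found. If decals of blood and the like introduce even more variability into the backgrounds around the items, then template matching probably won't give great results.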

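So the two things I would try if I were you, depending on some details, are:

1. If possible, crop a reference template for every situation in which the item is found (this will not be a good time), then compare the user-specified area against every template of every item. Take the best result from these comparisons and, if you are lucky, you will have a correct match.
2. The example screenshots you shared have no dark or black lines in the background, so the outlines of all the items stand out. If this is consistent throughout the game, you can find edges within the user-specified area and detect the outer contours. Ahead of time, you would process the outer contours of every reference item and store them. Then you can compare the contours in the user's crop against every contour in your database, taking the best match as the answer.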
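To give a feel for the contour-comparison option, here is a dependency-free sketch. With OpenCV this would more likely be done with `findContours` and `matchShapes`; the code below swaps in a much simpler technique, a histogram of outline-pixel distances from the outline's centroid, and assumes the item has already been separated from the background into a binary silhouette. `Silhouette`, `outlineSignature`, `signatureDistance` and the bin count are all hypothetical choices for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Binary silhouette: non-zero = item, 0 = background (row-major).
struct Silhouette {
    int width;
    int height;
    std::vector<unsigned char> px;
    bool filled(int x, int y) const {
        return x >= 0 && y >= 0 && x < width && y < height && px[y * width + x] != 0;
    }
};

const int kBins = 8;  // assumed descriptor resolution

// Histogram of outline-pixel distances from the outline centroid. Distances are
// divided by the largest one, so the descriptor is roughly scale-invariant.
std::vector<double> outlineSignature(const Silhouette& s) {
    std::vector<int> xs, ys;
    for (int y = 0; y < s.height; ++y)
        for (int x = 0; x < s.width; ++x)
            // Outline pixel: an item pixel with a background (or out-of-image) 4-neighbour.
            if (s.filled(x, y) && !(s.filled(x - 1, y) && s.filled(x + 1, y) &&
                                    s.filled(x, y - 1) && s.filled(x, y + 1))) {
                xs.push_back(x);
                ys.push_back(y);
            }
    std::vector<double> hist(kBins, 0.0);
    if (xs.empty()) return hist;
    double cx = 0.0, cy = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) { cx += xs[i]; cy += ys[i]; }
    cx /= xs.size();
    cy /= xs.size();
    std::vector<double> dist(xs.size());
    double maxDist = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        dist[i] = std::sqrt((xs[i] - cx) * (xs[i] - cx) + (ys[i] - cy) * (ys[i] - cy));
        maxDist = std::max(maxDist, dist[i]);
    }
    for (std::size_t i = 0; i < xs.size(); ++i) {
        double r = maxDist > 0.0 ? dist[i] / maxDist : 0.0;
        int bin = std::min(int(r * kBins), kBins - 1);
        hist[bin] += 1.0 / xs.size();
    }
    return hist;
}

// L1 distance between two signatures: 0 for identical outlines, at most 2.
double signatureDistance(const std::vector<double>& a, const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) d += std::fabs(a[i] - b[i]);
    return d;
}
```

The reference signatures would be computed once for every item in the table and stored; at run time only the user's crop needs a signature, and the item with the smallest `signatureDistance` is the candidate answer. This descriptor ignores rotation and fine detail, so it is a starting point rather than a substitute for a proper shape-matching routine.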

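I'm confident either of those could work for you, depending on whether your screenshots represent the game well.

Note: contour matching will be far faster than template matching. Fast enough to run in real time, and it may remove the need for the user to crop anything at all.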
