
I have a large collection of card images, and a photo of one particular card. What tools can I use to find which image in my collection is most similar to the photo?

Here are samples from the collection:

Here is what I'm looking for:


5 Answers


New method!

It seems the following ImageMagick command, or a variant of it depending on how a wider selection of your images looks, will extract the wording at the top of each card:

convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png

It takes the top 10% of the image and 80% of its width (starting 10% in from the top-left corner) and stores the result in crop.png.

If you run that through tesseract OCR, like this:

tesseract crop.png agg

you will get a file called agg.txt containing:

E‘ Aggressive Urge \L® E

You can clean that up by passing it through grep, looking only for sequences of upper- and lower-case letters adjacent to each other:

grep -Eo "\<[A-Za-z]+\>" agg.txt

to get:

Aggressive Urge

:-)
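
Putting the crop, OCR, and grep steps together, here is a minimal sketch of how a whole collection could be indexed this way. It assumes the card images are JPEGs in the current directory and that ImageMagick and tesseract are installed; the filenames and output format are illustrative only:

# Crop the title strip from each card, OCR it, and print the cleaned-up
# wording next to the filename
for f in *.jpg; do
   convert "$f" -crop 80%x10%+10%+10% crop.png
   tesseract crop.png title >/dev/null 2>&1
   echo "$f: $(grep -Eo '\<[A-Za-z]+\>' title.txt | tr '\n' ' ')"
done

A new card's photo can then be matched by OCRing it the same way and grepping the resulting index for its title.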

Answered 2014-08-13T21:35:21.023

Thank you for posting some photos.

I coded up the Perceptual Hashing algorithm described by Dr Neal Krawetz. Comparing your images with the card, I get the following percentage measures of similarity:

Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%

So, it is not an ideal discriminator for your type of image, but it works to some extent. You may wish to play around with it to tailor it for your use case.

I would calculate a hash for each of the images in your collection, one at a time, storing the hash for each image just once. Then, when you get a new card, calculate its hash and compare it against the stored ones; there is a usage sketch after the script below.

#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert the resulting 64-bit string of 1s and 0s into a 16-hex-digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){

   TEMP="tmp$$.png"

   # Force image to 8x8 pixels and greyscale
   convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"

   # Calculate mean brightness and correct to range 0..255
   MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)

   # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
   hash=""
   for i in {0..7}; do
      for j in {0..7}; do
         # Note: \d is not valid in POSIX ERE, so match digits explicitly
         pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
         bit="0"
         [ $pixel -gt $MEAN ] && bit="1"
         hash="$hash$bit"
      done
   done
   hex=$(echo "obase=16;ibase=2;$hash" | bc)
   printf "%016s\n" $hex
   #rm "$TEMP" > /dev/null 2>&1
}

function HammingDistance(){
   # Convert input hex strings to upper case like bc requires
   STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
   STR2=$(tr '[a-z]' '[A-Z]' <<< $2)

   # Convert hex to binary and zero left pad to 64 binary digits
   # (%s pads with spaces, not zeros, so translate the padding afterwards)
   STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc) | tr ' ' '0')
   STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc) | tr ' ' '0')

   # Calculate Hamming distance between two strings, each differing bit adds 1
   hamming=0
   for i in {0..63};do
      a=${STR1:i:1}
      b=${STR2:i:1}
      [ "$a" != "$b" ] && ((hamming++))
   done

   # Hamming distance is in range 0..64 and small means more similar
   # We want percentage similarity, so we do a little maths
   similarity=$((100-(hamming*100/64)))
   echo $similarity
}

function Usage(){
   echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
   exit 1
}

################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
   # Expecting a single image file for which to generate hash
   if [ ! -f "$1" ]; then
      echo "ERROR: File $1 does not exist" >&2
      exit 1
   fi
   PerceptualHash "$1" 
   exit 0
fi

if [ $# -eq 2 ]; then
   # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
   if [ -f "$1" ]; then
      hash1=$(PerceptualHash "$1")
   else
      hash1=$1
   fi
   if [ -f "$2" ]; then
      hash2=$(PerceptualHash "$2")
   else
      hash2=$2
   fi
   HammingDistance $hash1 $hash2
   exit 0
fi

Usage
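
A minimal usage sketch of the store-once workflow, assuming the script above is saved as Similarity and made executable, and that the collection is a directory of JPEGs called collection/ (all names here are illustrative):

# Cache each collection image's hash once
for f in collection/*.jpg; do
   ./Similarity "$f" > "$f.phash"
done

# Rank the collection against a new card by comparing stored hashes
newhash=$(./Similarity newcard.jpg)
for f in collection/*.jpg; do
   echo "$(./Similarity "$newhash" "$(cat "$f.phash")") $f"
done | sort -rn

Because only hashes are compared in the second loop, each collection image is read and resized exactly once, when its hash is first cached.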
Answered 2014-08-08T13:18:08.633

I also tried a normalised cross-correlation of each of your images with the card, like this:

#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do 
   cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
   compare - card.jpg -metric NCC null: 2>&1)
   echo "$cc:$i"
done | sort -n

and I got this output (sorted by match quality):

0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg

which shows that the card correlates best with demystify.jpg.

Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.

Answered 2014-08-08T15:49:26.203

If I understand you correctly, you need to compare them as pictures. There is one very simple but effective solution here: it is called Sikuli.

"What tools can I use to find which image in my collection is most similar to the photo?"

This tool is very good at image processing and can not only find whether your card (image) is similar to what you have already defined as a pattern, but can also search for partial image content (so-called rectangles).

Out of the box, its functionality can be extended via Python. Any ImageObject can be set to accept a similarity pattern as a percentage, so that you find exactly what you are looking for.

Another big advantage of this tool is that you can learn the basics within a day.

Hope this helps.

Answered 2014-08-19T15:30:38.117

I tried this by arranging the image data as vectors and taking the inner product between each collection image vector and the searched image vector; the most similar vectors give the highest inner product. I resized all the images to the same size to get vectors of equal length, so that the inner product can be taken. This resizing additionally reduces the cost of computing the inner products and gives a coarse approximation of the actual images. (Note that because the vectors are not normalized, overall brightness also influences the score; dividing each inner product by the two vector norms would give the cosine similarity instead.)

You can check this quickly with Matlab or Octave. Below is the Matlab/Octave script; I have added comments there. I tried changing the variable mult from 1 to 8 (you can try any integer value), and in all of those cases the image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:

ip =

   683007892
   558305537
   604013365

As you can see, it gives the highest inner product of 683007892 for the image Demystify.

% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');

% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;

% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);

% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
    double(smallAggressiveUrge(:)) ...
    double(smallAbundance(:))];

% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));

% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;

Edit:

I tried another approach, basically taking the Euclidean distance (l2 norm) between each reference image and the card image, and it gave me very good results. I found a large collection of reference images (383 images) at this link to test the card image against.

Here, instead of taking the whole card, I extracted the upper part that contains the picture and used it for the comparison.

In the steps below, all training images and the test image are resized to a predefined size before any processing.

  • Extract the picture region from the training images
  • Perform a morphological closing on these regions to obtain a coarse approximation (this step may not be necessary)
  • Vectorize these regions and store them in a training set (I call it a training set even though there is no training in this approach)
  • Load the test card image, extract its picture region of interest (ROI), apply the closing, then vectorize
  • Calculate the Euclidean distance between each reference image vector and the test image vector
  • Pick the minimum-distance item (or the top k items)

I did this in C++ using OpenCV. I am also including some test results using different scales below.

#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string>
#include <windows.h>

using namespace cv;
using namespace std;

#define INPUT_FOLDER_PATH       string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH   string("Your training image folder path")

void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;

    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2;  // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));

    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind) 
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    } 
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            Mat re;
            resize(im, re, imgSize, 0, 0);  // resize the image

            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);

            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }

    }
    while (FindNextFile(hFind, &ffd) != 0);

    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);

    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i], 
            norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }

    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(), 
        [] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}

Results:

scale = 1.0:

demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6

scale = 0.8:

demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36

scale = 0.6:

demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05

scale = 0.4:

demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67

scale = 0.2:

demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42

Answered 2014-08-14T09:01:23.013