python - Detecting a specific watermark in a photo with Python (without SciPy)

Question

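I have a large number of images (hundreds of thousands), and for each one I need to determine whether it has a watermark in the top-right corner. The watermark is always the same and always in the same position. It takes the form of a ribbon with a symbol and some text. I'm looking for a simple and fast way to do this, ideally without SciPy (it isn't available on the server I'm working with, but NumPy is).

So far I have tried using PIL's crop function to isolate the region of the image where the watermark should be, and then comparing histograms with an RMS function (see http://snipplr.com/view/757/compare-two-pil-images-in-python/). That doesn't work very well: there are a lot of errors in both directions.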
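For reference, the kind of comparison described here can be sketched roughly as follows. This is only an illustration of the approach, not the asker's actual code: the function names `rms_histogram_difference` and `top_right_box` are made up for this example, and the histograms are passed in as plain lists of counts such as those returned by PIL's `Image.histogram()`.

```python
import math


def rms_histogram_difference(hist_a, hist_b):
    """Root-mean-square difference between two equal-length histograms."""
    if len(hist_a) != len(hist_b):
        raise ValueError('histograms must have the same number of bins')
    squared = sum((a - b) ** 2 for a, b in zip(hist_a, hist_b))
    return math.sqrt(squared / len(hist_a))


def top_right_box(width, crop_width, crop_height):
    """Crop box (left, upper, right, lower) for the top-right corner."""
    return (width - crop_width, 0, width, crop_height)


# With PIL, the histograms would come from the cropped corners, e.g.
# (path, reference and the 100x100 size are hypothetical):
#   from PIL import Image
#   image = Image.open(path)
#   corner = image.crop(top_right_box(image.size[0], 100, 100))
#   score = rms_histogram_difference(corner.histogram(), reference.histogram())

identical = rms_histogram_difference([4, 0, 0, 4], [4, 0, 0, 4])
shifted = rms_histogram_difference([0, 0, 0, 0], [2, 2, 2, 2])
```

One likely reason for errors in both directions is that a histogram discards all spatial information: two crops with the same mix of pixel values score as identical no matter where those pixels sit.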

Any ideas would be greatly appreciated. Thanks.

score 15 · Accepted Answer

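Another possibility is to use machine learning. My background is in natural language processing (not computer vision), but I tried creating a training and test set based on your description of the problem, and it seems to work (100% accuracy on unseen data).

Training set

The training set consists of the same images with the watermark (positive examples) and without it (negative examples).

Test set

The test set consists of images that are not in the training set.

Sample data

If you're interested, you can try it out with the sample training and test images.

Code:

The full version is available as a gist. An excerpt: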

import glob

from classify import MultinomialNB
from PIL import Image


TRAINING_POSITIVE = 'training-positive/*.jpg'
TRAINING_NEGATIVE = 'training-negative/*.jpg'
TEST_POSITIVE = 'test-positive/*.jpg'
TEST_NEGATIVE = 'test-negative/*.jpg'

# How many pixels to grab from the top-right of image.
CROP_WIDTH, CROP_HEIGHT = 100, 100
RESIZED = (16, 16)


def get_image_data(infile):
    # Force RGB so every pixel is a 3-tuple, even for grayscale or CMYK files.
    image = Image.open(infile).convert('RGB')
    width, height = image.size
    # Crop box is (left, upper, right, lower): the top-right corner.
    box = width - CROP_WIDTH, 0, width, CROP_HEIGHT
    region = image.crop(box)
    resized = region.resize(RESIZED)
    data = resized.getdata()
    # Convert RGB to a simple averaged value. Integer division keeps the
    # result an int so it can be used as a repeat count below.
    data = [sum(pixel) // 3 for pixel in data]
    # Combine location and value: repeat each pixel index once per unit of
    # brightness, so brighter pixels contribute more "tokens".
    values = []
    for location, value in enumerate(data):
        values.extend([location] * value)
    return values


def main():
    watermark = MultinomialNB()
    # Training
    count = 0
    for infile in glob.glob(TRAINING_POSITIVE):
        data = get_image_data(infile)
        watermark.train((data, 'positive'))
        count += 1
        print('Training', count)
    for infile in glob.glob(TRAINING_NEGATIVE):
        data = get_image_data(infile)
        watermark.train((data, 'negative'))
        count += 1
        print('Training', count)
    # Testing
    correct, total = 0, 0
    for infile in glob.glob(TEST_POSITIVE):
        data = get_image_data(infile)
        prediction = watermark.classify(data)
        if prediction.label == 'positive':
            correct += 1
        total += 1
        print('Testing ({0} / {1})'.format(correct, total))
    for infile in glob.glob(TEST_NEGATIVE):
        data = get_image_data(infile)
        prediction = watermark.classify(data)
        if prediction.label == 'negative':
            correct += 1
        total += 1
        print('Testing ({0} / {1})'.format(correct, total))
    print('Got', correct, 'out of', total, 'correct')


if __name__ == '__main__':
    main()

Sample output

Training 1
Training 2
Training 3
Training 4
Training 5
Training 6
Training 7
Training 8
Training 9
Training 10
Training 11
Training 12
Training 13
Training 14
Testing (1 / 1)
Testing (2 / 2)
Testing (3 / 3)
Testing (4 / 4)
Testing (5 / 5)
Testing (6 / 6)
Testing (7 / 7)
Testing (8 / 8)
Testing (9 / 9)
Testing (10 / 10)
Got 10 out of 10 correct
[Finished in 3.5s]
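The excerpt imports `MultinomialNB` from a `classify` module that belongs to the answer's gist and is not shown here. For readers without that module, the following is a minimal sketch of a multinomial Naive Bayes classifier with the same `train((data, label))` and `classify(data)` shape, as inferred from the excerpt. It is a hypothetical stand-in, not the gist's implementation: the names `TinyMultinomialNB`, `Prediction` and `encode`, and the choice of Laplace smoothing, are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict, namedtuple

Prediction = namedtuple('Prediction', ['label', 'log_score'])


def encode(gray_values):
    """Repeat each pixel index once per unit of brightness,
    mirroring the loop at the end of get_image_data."""
    tokens = []
    for location, value in enumerate(gray_values):
        tokens.extend([location] * value)
    return tokens


class TinyMultinomialNB:
    """Hypothetical stand-in; not the `classify` module from the answer."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.example_counts = Counter()           # label -> examples seen
        self.vocabulary = set()

    def train(self, example):
        tokens, label = example
        self.token_counts[label].update(tokens)
        self.example_counts[label] += 1
        self.vocabulary.update(tokens)

    def classify(self, tokens):
        total_examples = sum(self.example_counts.values())
        vocab_size = len(self.vocabulary)
        best = None
        for label, counts in self.token_counts.items():
            label_total = sum(counts.values())
            # Log prior plus Laplace-smoothed log likelihood of the tokens.
            score = math.log(self.example_counts[label] / total_examples)
            for token, n in Counter(tokens).items():
                p = (counts[token] + 1) / (label_total + vocab_size)
                score += n * math.log(p)
            if best is None or score > best.log_score:
                best = Prediction(label, score)
        return best


# Toy 2x2 "crops": the watermark brightens pixel 0, the plain image pixel 3.
model = TinyMultinomialNB()
model.train((encode([9, 1, 1, 1]), 'positive'))
model.train((encode([1, 1, 1, 9]), 'negative'))
with_mark = model.classify(encode([8, 2, 1, 1]))
without_mark = model.classify(encode([1, 1, 2, 8]))
```

In this encoding each pixel position acts like a word and its brightness like a word count, which is what lets a text-style classifier work on image data.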

score 2 · Accepted Answer

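Is the watermark's position exact? How was the watermark applied to the background image?

I'm assuming the watermark is applied by a partly additive or multiplicative function. The watermarked image was probably computed like this:

resultPixel = imagePixel + (watermarkPixel*mixinValue)

mixinValue would be between 0.0 and 1.0, so you could complete the blend by re-applying the watermark with a multiplier of (1-mixinValue). This should result in a pixel that matches the watermark. Then just test the resulting image's colors against the original watermark.

testPixel = resultPixel + (watermarkPixel*(1-mixinValue))
assert testPixel == watermarkPixel

Of course, compression of the watermarked image may cause some variation in your testPixel.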
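To make the arithmetic concrete, here is a minimal sketch of the check on single grayscale values, taking the additive formula above at face value. The function names and the `tolerance` parameter are illustrative assumptions, not part of the answer. Substituting the first formula into the second gives imagePixel + watermarkPixel, so under this model the re-blended value equals the watermark only where the underlying image pixel is close to black. A tolerance absorbs compression noise but not a bright background.

```python
def apply_watermark(image_pixel, watermark_pixel, mixin):
    """Assumed additive blend: image + watermark * mixin."""
    return image_pixel + watermark_pixel * mixin


def reblend(result_pixel, watermark_pixel, mixin):
    """Re-apply the watermark with the remaining weight (1 - mixin)."""
    return result_pixel + watermark_pixel * (1.0 - mixin)


def matches_watermark(result_pixel, watermark_pixel, mixin, tolerance=8.0):
    """Compare within a tolerance instead of asserting exact equality."""
    test_pixel = reblend(result_pixel, watermark_pixel, mixin)
    return abs(test_pixel - watermark_pixel) <= tolerance


# Black background: the re-blended value recovers the watermark exactly.
on_black = apply_watermark(0, 200, 0.5)
black_ok = matches_watermark(on_black, 200, 0.5)

# Brighter background: the re-blended value overshoots by the image value.
on_gray = apply_watermark(60, 200, 0.5)
gray_ok = matches_watermark(on_gray, 200, 0.5)
```

Here `black_ok` is true and `gray_ok` is false, which suggests the test is most reliable when the blend parameters are known and the region under the watermark is dark.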

score 2 · Accepted Answer

You can always use restb.ai's Specialized Image Recognition API to automate the watermark detection process.

import requests

url = "https://api.restb.ai/segmentation"

querystring = {"client_key":"your-free-key-here","model_id":"re_logo","image_url":"http://demo.restb.ai/img/gallery/realestate/logos-watermarks/re_logo-1.jpg"}

response = requests.request("GET", url, params=querystring)

print(response.text)

(Screenshot: Logo & Watermark detection demo)
