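Edit:
After playing around a bit more, I found that removing the conditional and the extra Vector access inside the loop saves roughly 100ms on my machine.
This is the previous XOR loop: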
// Original Vector XOR code:
for (var i: int = 0; i < len; i++) {
// XOR.
result[i] = vec1[i] ^ vec2[i];
if (ignoreAlpha) {
// Force alpha of FF so we can see the result.
result[i] |= 0xFF000000;
}
}
This is the updated XOR loop for the Vector solution:
var alphaMask: uint = 0;
if (ignoreAlpha) {
// Force alpha of FF so we can see the result.
alphaMask = 0xFF000000;
}
// Fewer Vector accessors makes it quicker:
for (var i: int = 0; i < len; i++) {
// XOR.
result[i] = alphaMask | (vec1[i] ^ vec2[i]);
}
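To check that the rewritten loop really is equivalent to the original, here is a minimal JavaScript sketch of both versions. It assumes the pixels are held in Uint32Array instances standing in for Vector.<uint>, and the function names xorWithBranch and xorWithMask are made up for this illustration.

```javascript
// Original shape: test the flag on every iteration and access result[i] twice.
function xorWithBranch(a, b, ignoreAlpha) {
    var result = new Uint32Array(a.length);
    for (var i = 0; i < a.length; i++) {
        result[i] = a[i] ^ b[i];
        if (ignoreAlpha) {
            // Force alpha of FF so we can see the result.
            result[i] |= 0xFF000000;
        }
    }
    return result;
}

// Updated shape: pick the mask once, then do a single write per pixel.
function xorWithMask(a, b, ignoreAlpha) {
    var alphaMask = ignoreAlpha ? 0xFF000000 : 0;
    var result = new Uint32Array(a.length);
    for (var i = 0; i < a.length; i++) {
        // The Uint32Array store converts the signed 32-bit result back to unsigned.
        result[i] = alphaMask | (a[i] ^ b[i]);
    }
    return result;
}
```

OR-ing with a mask of 0 leaves the value unchanged, which is why the branch can be hoisted out of the loop without altering the output.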
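Answer:
Here are the solutions I tested for XORing two images in Flash.
I found the PixelBender solution to be about 6-10 times slower than doing it in plain ActionScript.
I don't know whether that is because my algorithm is slow or whether it is simply a limitation of trying to fake bitwise operations in PixelBender.
Results:
- PixelBender: ~6500ms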
- BitmapData.getVector(): ~480-500ms
- BitmapData.getPixel32(): ~1200ms
- BitmapData.getPixels(): ~1200ms
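The clear winner is using BitmapData.getVector() to get two streams of pixel data and then XORing the two streams.
1. PixelBender solution
This is how I implemented bitwise XOR in PixelBender, based on the formula given on Wikipedia: http://en.wikipedia.org/wiki/Bitwise_operation#Mathematical_equivalents
Here is a gist of the final PBK: https://gist.github.com/Coridyn/67a0ff75afaa0163f673
Running the XOR on two 3200x1400 images takes about 6500-6700ms on my machine.
I first converted the formula to JavaScript to check that it was correct: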
// Do it for each RGBA channel.
// Each channel is assumed to be 8bits.
function XOR(x, y){
var result = 0;
var bitCount = 8; // log2(x) + 1
for (var n = 0; n < bitCount; n++) {
var pow2 = pow(2, n);
var x1 = mod(floor(x / pow2), 2);
var y1 = mod(floor(y / pow2), 2);
var z1 = mod(x1 + y1, 2);
result += pow2 * z1;
}
console.log('XOR(%s, %s) = %s', x, y, result);
console.log('%s ^ %s = %s', x, y, (x ^ y));
return result;
}
// Split out these functions so it's
// easier to convert to PixelBender.
function mod(x, y){
return x % y;
}
function pow(x, y){
return Math.pow(x, y);
}
function floor(x){
return Math.floor(x);
}
Confirming that it is correct:
// Test the manual XOR is correct.
XOR(255, 85); // 170
XOR(170, 85); // 255
XOR(170, 170); // 0
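The three spot checks above can be extended to every pair of 8-bit values. The following is a small sketch of such an exhaustive check. It repeats the same arithmetic formula without the logging, and the names arithmeticXor8 and countMismatches are introduced here purely for illustration.

```javascript
// Arithmetic XOR of two 8-bit values, using the same per-bit formula as XOR() above.
function arithmeticXor8(x, y) {
    var result = 0;
    for (var n = 0; n < 8; n++) {
        var pow2 = Math.pow(2, n);
        var xBit = Math.floor(x / pow2) % 2; // bit n of x
        var yBit = Math.floor(y / pow2) % 2; // bit n of y
        result += pow2 * ((xBit + yBit) % 2); // sum mod 2 is XOR of the two bits
    }
    return result;
}

// Compare against the native ^ operator for all 256 * 256 input pairs.
function countMismatches() {
    var mismatches = 0;
    for (var x = 0; x < 256; x++) {
        for (var y = 0; y < 256; y++) {
            if (arithmeticXor8(x, y) !== (x ^ y)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

If countMismatches() returns 0, the formula agrees with the native operator across the whole channel range, so any remaining discrepancy in the kernel would come from the float conversion rather than from the formula.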
Then I converted the JavaScript to PixelBender, unrolling the loop with a series of macros:
// Bitwise algorithm was adapted from the "mathematical equivalents" formula on Wikipedia:
// http://en.wikipedia.org/wiki/Bitwise_operation#Mathematical_equivalents
// Macro for 2^n (it needs to be done a lot).
#define POW2(n) pow(2.0, n)
// Slight optimisation for the zeroth case - 2^0 = 1 is redundant so remove it.
#define XOR_i_0(x, y) ( mod( mod(floor(x), 2.0) + mod(floor(y), 2.0), 2.0 ) )
// Calculations for a given "iteration".
#define XOR_i(x, y, i) ( POW2(i) * ( mod( mod(floor(x / POW2(i)), 2.0) + mod(floor(y / POW2(i)), 2.0), 2.0 ) ) )
// PixelBender kernels running in Flash can't use loops.
// Unroll the loop by defining macros that call the next macro in the sequence.
// Adapted from: http://www.simppa.fi/blog/category/pixelbender/
// http://www.simppa.fi/source/LoopMacros2.pbk
#define XOR_0(x, y) XOR_i_0(x, y)
#define XOR_1(x, y) XOR_i(x, y, 1.0) + XOR_0(x, y)
#define XOR_2(x, y) XOR_i(x, y, 2.0) + XOR_1(x, y)
#define XOR_3(x, y) XOR_i(x, y, 3.0) + XOR_2(x, y)
#define XOR_4(x, y) XOR_i(x, y, 4.0) + XOR_3(x, y)
#define XOR_5(x, y) XOR_i(x, y, 5.0) + XOR_4(x, y)
#define XOR_6(x, y) XOR_i(x, y, 6.0) + XOR_5(x, y)
#define XOR_7(x, y) XOR_i(x, y, 7.0) + XOR_6(x, y)
// Entry point for XOR function.
// This will calculate the XOR of the current pixels.
#define XOR(x, y) XOR_7(x, y)
// PixelBender uses floats from 0.0 to 1.0 to represent 0 to 255
// but the bitwise operations above work on ints.
// These macros convert between float and int values.
// Parenthesised so the expansion stays intact inside larger expressions.
#define FLOAT_TO_INT(x) (float(x) * 255.0)
#define INT_TO_FLOAT(x) (float(x) / 255.0)
XOR each channel of the current pixel in the evaluatePixel function:
void evaluatePixel()
{
// Acquire the pixel values from both images at the current location.
float4 frontPixel = sampleNearest(inputImage, outCoord());
float4 backPixel = sampleNearest(diffImage, outCoord());
// Set up the output variable - RGBA.
pixel4 result = pixel4(0.0, 0.0, 0.0, 1.0);
// XOR each channel.
result.r = INT_TO_FLOAT ( XOR(FLOAT_TO_INT(frontPixel.r), FLOAT_TO_INT(backPixel.r)) );
result.g = INT_TO_FLOAT ( XOR(FLOAT_TO_INT(frontPixel.g), FLOAT_TO_INT(backPixel.g)) );
result.b = INT_TO_FLOAT ( XOR(FLOAT_TO_INT(frontPixel.b), FLOAT_TO_INT(backPixel.b)) );
// Return the result for this pixel.
dst = result;
}
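The per-channel round trip (normalised float to 0-255 integer, XOR, back to float) can be sketched in JavaScript as follows. This is only an illustration of the idea, not a port of the kernel. The function names are made up, and it uses Math.round for the float-to-int step as a defensive assumption of mine, whereas the macros above rely on floor inside the XOR formula.

```javascript
// Convert a normalised channel value (0.0 to 1.0) to an integer (0 to 255).
// Math.round is an assumption of this sketch, to guard against values
// such as 84.99999 produced by floating-point error.
function floatToInt(value) {
    return Math.round(value * 255);
}

// Convert an integer channel value (0 to 255) back to a normalised float.
function intToFloat(value) {
    return value / 255;
}

// XOR a single channel of two pixels.
function xorChannel(front, back) {
    return intToFloat(floatToInt(front) ^ floatToInt(back));
}

// XOR the RGB channels and force the alpha to fully opaque,
// mirroring what evaluatePixel does with its result variable.
function xorPixel(front, back) {
    return {
        r: xorChannel(front.r, back.r),
        g: xorChannel(front.g, back.g),
        b: xorChannel(front.b, back.b),
        a: 1.0
    };
}
```

For example, xorPixel({r: 1, g: 0, b: 0, a: 1}, {r: 1, g: 0, b: 0, a: 1}) yields a black, opaque pixel, because identical channels XOR to zero.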
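ActionScript solutions
2. BitmapData.getVector()
The fastest solution I found was to extract a Vector of pixels from each of the two images and perform the XOR in ActionScript.
For the same two 3200x1400 images, this takes about 480-500ms.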
package diff
{
import flash.display.Bitmap;
import flash.display.DisplayObject;
import flash.display.IBitmapDrawable;
import flash.display.BitmapData;
import flash.geom.Rectangle;
import flash.utils.ByteArray;
/**
* @author Coridyn
*/
public class BitDiff
{
/**
* Perform a binary diff between two images.
*
* Return the result as a Vector of uints (as used by BitmapData).
*
* @param image1
* @param image2
* @param ignoreAlpha
* @return
*/
public static function diffImages(image1: DisplayObject,
image2: DisplayObject,
ignoreAlpha: Boolean = true): Vector.<uint> {
// For simplicity get the smallest common width and height of the two images
// to perform the XOR.
var w: Number = Math.min(image1.width, image2.width);
var h: Number = Math.min(image1.height, image2.height);
var rect: Rectangle = new Rectangle(0, 0, w, h);
var vec1: Vector.<uint> = BitDiff.getVector(image1, rect);
var vec2: Vector.<uint> = BitDiff.getVector(image2, rect);
var resultVec: Vector.<uint> = BitDiff.diffVectors(vec1, vec2, ignoreAlpha);
return resultVec;
}
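/**
* Extract a portion of an image as a Vector of uints.
*
* @param drawable
* @param rect
* @return
*/
public static function getVector(drawable: DisplayObject, rect: Rectangle): Vector.<uint> {
var data: BitmapData = BitDiff.getBitmapData(drawable);
var vec: Vector.<uint> = data.getVector(rect);
data.dispose();
return vec;
}
/**
* NOTE: getBitmapData() is called by this class but is not shown in
* this listing. The version below is an assumed minimal implementation
* that draws the display object into a new transparent BitmapData.
*/
public static function getBitmapData(drawable: DisplayObject): BitmapData {
var data: BitmapData = new BitmapData(drawable.width, drawable.height, true, 0x00000000);
data.draw(drawable);
return data;
}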
/**
* Perform a binary diff between two streams of pixel data.
*
* If `ignoreAlpha` is false then will not normalise the
* alpha to make sure the pixels are opaque.
*
* @param vec1
* @param vec2
* @param ignoreAlpha
* @return
*/
public static function diffVectors(vec1: Vector.<uint>,
vec2: Vector.<uint>,
ignoreAlpha: Boolean): Vector.<uint> {
var larger: Vector.<uint> = vec1;
if (vec1.length < vec2.length) {
larger = vec2;
}
var len: Number = Math.min(vec1.length, vec2.length),
result: Vector.<uint> = new Vector.<uint>(len, true);
var alphaMask: uint = 0;
if (ignoreAlpha) {
// Force alpha of FF so we can see the result.
alphaMask = 0xFF000000;
}
// XOR the part that both vectors have in common.
for (var i: int = 0; i < len; i++) {
// XOR.
result[i] = alphaMask | (vec1[i] ^ vec2[i]);
}
if (vec1.length != vec2.length) {
// Splice the remaining items.
result = result.concat(larger.slice(len));
}
return result;
}
}
}
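The length handling in diffVectors is easy to miss: the XOR covers only the overlapping part, and the leftover pixels of the longer input are appended unchanged (without the alpha mask). Here is a rough JavaScript sketch of that behaviour, assuming both inputs are Uint32Array instances. The name diffPixels is hypothetical.

```javascript
// XOR two pixel arrays; pixels beyond the shorter input are copied
// as-is from the longer one, as diffVectors does with concat/slice.
function diffPixels(pixels1, pixels2, ignoreAlpha) {
    var shorter = pixels1.length <= pixels2.length ? pixels1 : pixels2;
    var longer = shorter === pixels1 ? pixels2 : pixels1;
    var alphaMask = ignoreAlpha ? 0xFF000000 : 0;

    var result = new Uint32Array(longer.length);
    for (var i = 0; i < shorter.length; i++) {
        result[i] = alphaMask | (pixels1[i] ^ pixels2[i]);
    }

    // Copy the tail of the longer input (empty when the lengths match).
    result.set(longer.subarray(shorter.length), shorter.length);
    return result;
}
```

Since diffImages crops both images to the same rectangle first, the tail is normally empty. It only matters when diffVectors is called directly with inputs of different lengths.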
3. BitmapData.getPixel32()
Your current approach of looping over the BitmapData with BitmapData.getPixel32() gives a similar speed of about 1200ms:
for (var y: int = 0; y < h; y++) {
for (var x: int = 0; x < w; x++) {
sourcePixel = bd1.getPixel32(x, y);
// getPixel() returns RGB only (alpha bits are 0), so the XOR keeps the source alpha.
resultPixel = sourcePixel ^ bd2.getPixel(x, y);
result.setPixel32(x, y, resultPixel);
}
}
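Note that the loop reads the first image with getPixel32() (ARGB) and the second with getPixel() (RGB only). The small JavaScript sketch below shows why that matters for the alpha channel. The pixel values are arbitrary examples and toHex is a helper introduced just for display.

```javascript
var argb1 = 0xFF336699;          // opaque pixel from the first image (ARGB)
var argb2 = 0xFF112233;          // opaque pixel from the second image (ARGB)
var rgb2 = argb2 & 0x00FFFFFF;   // the same pixel with the alpha bits cleared

// Format a 32-bit value as 8 upper-case hex digits.
function toHex(value) {
    return (value >>> 0).toString(16).toUpperCase().padStart(8, "0");
}

// ARGB ^ ARGB: the two FF alphas cancel out, giving a fully transparent result.
var xorArgbArgb = toHex(argb1 ^ argb2); // "002244AA"

// ARGB ^ RGB: the alpha of the first pixel survives.
var xorArgbRgb = toHex(argb1 ^ rgb2);   // "FF2244AA"
```

This is the same problem that the alphaMask in the Vector solution addresses: XORing two opaque ARGB pixels directly would zero the alpha and make the result invisible.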
4. BitmapData.getPixels()
My last test was to iterate over two ByteArrays of pixel data (very similar to the Vector solution above). This implementation also took about 1200ms:
/**
* Extract a portion of an image as a ByteArray of pixel data.
*
* @param drawable
* @param rect
* @return
*/
public static function getByteArray(drawable: DisplayObject, rect: Rectangle): ByteArray {
var data: BitmapData = BitDiff.getBitmapData(drawable);
var pixels: ByteArray = data.getPixels(rect);
data.dispose();
return pixels;
}
/**
* Perform a binary diff between two streams of pixel data.
*
* If `ignoreAlpha` is false then will not normalise the
* alpha to make sure the pixels are opaque.
*
* @param ba1
* @param ba2
* @param ignoreAlpha
* @return
*/
public static function diffByteArrays(ba1: ByteArray,
ba2: ByteArray,
ignoreAlpha: Boolean): ByteArray {
// Reset position to start of array.
ba1.position = 0;
ba2.position = 0;
var larger: ByteArray = ba1;
if (ba1.bytesAvailable < ba2.bytesAvailable) {
larger = ba2;
}
var len: Number = Math.min(ba1.length / 4, ba2.length / 4),
result: ByteArray = new ByteArray();
// Assume same length.
var resultPixel:uint;
for (var i: uint = 0; i < len; i++) {
// XOR.
resultPixel = ba1.readUnsignedInt() ^ ba2.readUnsignedInt();
if (ignoreAlpha) {
// Force alpha of FF so we can see the result.
resultPixel |= 0xFF000000;
}
result.writeUnsignedInt(resultPixel);
}
// Seek back to the start.
result.position = 0;
return result;
}
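For comparison, the same byte-stream approach can be sketched in JavaScript with ArrayBuffer and DataView, reading one big-endian 32-bit value per pixel just as readUnsignedInt() does on a default ByteArray. This is a sketch under the assumption that each pixel occupies 4 bytes in ARGB order. The name diffBuffers is hypothetical.

```javascript
// XOR two buffers of 32-bit ARGB pixels, 4 bytes at a time.
function diffBuffers(buf1, buf2, ignoreAlpha) {
    var pixelCount = Math.floor(Math.min(buf1.byteLength, buf2.byteLength) / 4);
    var view1 = new DataView(buf1);
    var view2 = new DataView(buf2);
    var out = new ArrayBuffer(pixelCount * 4);
    var outView = new DataView(out);
    var alphaMask = ignoreAlpha ? 0xFF000000 : 0;

    for (var i = 0; i < pixelCount; i++) {
        var offset = i * 4;
        // DataView reads big-endian by default, so the first byte is alpha.
        var pixel = alphaMask | (view1.getUint32(offset) ^ view2.getUint32(offset));
        // >>> 0 converts the signed 32-bit result back to an unsigned value.
        outView.setUint32(offset, pixel >>> 0);
    }
    return out;
}
```

Unlike the Vector version, this one performs a read call per input and a write call per pixel, which mirrors the extra method-call overhead in the ByteArray loop above.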