In WWDC session 510, Apple engineers introduced support for writing a CIKernel in Metal and claimed that it should run faster.

I've put together a test project that implements a motion blur in both Metal and GLSL (the code is similar to the code from session 510).

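Sometimes the Metal kernel is faster and sometimes the GLSL kernel is faster, but I definitely don't see the Metal kernel performing consistently better across the board. Is this how it is supposed to be, or am I missing something?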

Note: the project won't run on the simulator; you need a device with an A8 or later chip.

1 Answer

It looks like some of this is hardware-dependent. Here are my results on an iPad Pro 10.5-inch:

glsl 1 took 229.572057723999ms
glsl 2 took 49.1310358047485ms
glsl 3 took 46.7269420623779ms
glsl 4 took 53.08997631073ms
glsl 5 took 48.9979982376099ms
glsl 6 took 49.0390062332153ms
glsl 7 took 52.5139570236206ms
glsl 8 took 46.4930534362793ms
glsl 9 took 39.6310091018677ms
glsl 10 took 45.9860563278198ms
metal 1 took 77.7549743652344ms
metal 2 took 44.1800355911255ms
metal 3 took 46.0859537124634ms
metal 4 took 45.3709363937378ms
metal 5 took 43.5279607772827ms
metal 6 took 38.9848947525024ms
metal 7 took 37.1809005737305ms
metal 8 took 37.8340482711792ms
metal 9 took 37.6850366592407ms
metal 10 took 37.5720262527466ms

My iPhone SE results:

glsl 1 took 394.147992134094ms
glsl 2 took 94.601035118103ms
glsl 3 took 81.4379453659058ms
glsl 4 took 76.9931077957153ms
glsl 5 took 77.0320892333984ms
glsl 6 took 75.8579969406128ms
glsl 7 took 76.9950151443481ms
glsl 8 took 77.8199434280396ms
glsl 9 took 79.7009468078613ms
glsl 10 took 79.4800519943237ms
metal 1 took 146.992921829224ms
metal 2 took 88.6669158935547ms
metal 3 took 81.8150043487549ms
metal 4 took 78.1329870223999ms
metal 5 took 79.5910358428955ms
metal 6 took 93.6589241027832ms
metal 7 took 94.8940515518188ms
metal 8 took 89.0530347824097ms
metal 9 took 84.3830108642578ms
metal 10 took 77.949047088623ms
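
To make the two sets of numbers easier to compare, one option is to drop the first iteration of each series and average the rest. This is only a sketch: the timings are copied from the results above and rounded to two decimals, and treating the first run as a one-time warm-up cost (kernel compilation or similar) is my assumption, not something these measurements prove. The names `timings` and `steady_state_mean` are made up for illustration.

```python
from statistics import mean

# Timings in ms, copied from the results above and rounded to two decimals.
timings = {
    "iPad Pro 10.5": {
        "glsl": [229.57, 49.13, 46.73, 53.09, 49.00, 49.04, 52.51, 46.49, 39.63, 45.99],
        "metal": [77.75, 44.18, 46.09, 45.37, 43.53, 38.98, 37.18, 37.83, 37.69, 37.57],
    },
    "iPhone SE": {
        "glsl": [394.15, 94.60, 81.44, 76.99, 77.03, 75.86, 77.00, 77.82, 79.70, 79.48],
        "metal": [146.99, 88.67, 81.82, 78.13, 79.59, 93.66, 94.89, 89.05, 84.38, 77.95],
    },
}


def steady_state_mean(runs, warmup=1):
    """Average the runs after skipping the first `warmup` iterations.

    Assumption: the first iteration includes one-time setup cost and is
    not representative of steady-state performance.
    """
    return mean(runs[warmup:])


# Per-device, per-kernel averages over iterations 2 through 10.
summary = {
    device: {kernel: steady_state_mean(runs) for kernel, runs in kernels.items()}
    for device, kernels in timings.items()
}

for device, means in summary.items():
    ratio = means["metal"] / means["glsl"]
    print(f"{device}: glsl {means['glsl']:.1f} ms, "
          f"metal {means['metal']:.1f} ms, metal/glsl = {ratio:.2f}")
```

With the first run excluded, the Metal kernel averages lower than GLSL on the iPad Pro and slightly higher on the iPhone SE, which is what makes me think the difference depends on the hardware. Ten iterations is a small sample, so treat these averages as rough.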

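One question and one thought:

  • What device produced your results?
  • I'm curious whether a different type of filter, such as a color kernel, would behave differently.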
Answered 2018-02-08T13:17:16.507