28

I've been playing around with typed arrays in JavaScript.

var buffer = new ArrayBuffer(16);
var int32View = new Int32Array(buffer);

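I assume ordinary arrays ([1, 257, true]) perform poorly in JavaScript because their values can be of any type, so computing an offset into memory is not trivial.

I originally thought that JavaScript array subscripts worked the same way as object properties (since the two have a lot in common) and were hash-map based, requiring a hash-based lookup. But I haven't found much credible information to confirm this.

So I assume the reason typed arrays perform so well is that they work like ordinary arrays in C, which are always typed. Given the initial code example above, and wanting to get the value at index 10 of the typed array...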

var value = int32View[10];
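  • The type is Int32, so each value must consist of 32 bits, or 4 bytes.
  • The subscript is 10.
  • So the value's location in memory is <array offset> + (4 * 10), and 4 bytes are then read to get the full value.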
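A minimal sketch of that calculation, assuming a 64-byte buffer so that index 10 actually exists (the names `wideBuffer`, `wideView` and `smallView` are illustrative, not from the original code). It reads the same 4 bytes through a `DataView` at the computed byte offset and compares them with the indexed read:

```javascript
// 64 bytes hold sixteen 4-byte Int32 values, so index 10 is in range.
const wideBuffer = new ArrayBuffer(64);
const wideView = new Int32Array(wideBuffer);
wideView[10] = 123456;

// <array offset> + (4 * 10); byteOffset is 0 because the view starts
// at the beginning of the buffer.
const index = 10;
const byteOffset = wideView.byteOffset + Int32Array.BYTES_PER_ELEMENT * index;

// Typed arrays use the platform's byte order, so detect it before
// reading the same bytes through a DataView.
const littleEndian = new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;
const viaDataView = new DataView(wideBuffer).getInt32(byteOffset, littleEndian);

console.log(byteOffset, wideView[10], viaDataView);

// A 16-byte buffer holds only four Int32 values, so index 10 is out of
// range there and the read yields undefined.
const smallView = new Int32Array(new ArrayBuffer(16));
console.log(smallView.length, smallView[10]);
```

Note that with the 16-byte buffer from the first snippet, `int32View[10]` is out of range, which is why the sketch uses a larger buffer.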

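I basically just want to confirm my assumptions. Is my thinking on this correct? If not, please elaborate.

I looked through the V8 source code to see if I could answer this myself, but my C is rusty and I'm not very familiar with C++.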

4

4 Answers

50

Typed Arrays were designed by the WebGL standards committee, for performance reasons. Typically JavaScript arrays are generic and can hold objects, other arrays and so on - and the elements are not necessarily sequential in memory, like they would be in C. WebGL requires buffers to be sequential in memory, because that's how the underlying C API expects them. If Typed Arrays are not used, passing an ordinary array to a WebGL function requires a lot of work: each element must be inspected, its type checked, and, if it's the right thing (e.g. a float), copied out to a separate sequential C-like buffer, which is then passed to the C API. Ouch - lots of work! For performance-sensitive WebGL applications this could cause a big drop in the framerate.

On the other hand, like you suggest in the question, Typed Arrays use a sequential C-like buffer already in their behind-the-scenes storage. When you write to a typed array, you are indeed assigning to a C-like array behind the scenes. For the purposes of WebGL, this means the buffer can be used directly by the corresponding C API.

Note your memory address calculation isn't quite enough: the browser must also bounds-check the array, to prevent out-of-range accesses. This has to happen with any kind of JavaScript array, but in many cases clever JavaScript engines can omit the check when they can prove the index value is already within bounds (such as when looping from 0 to the length of the array). The engine also has to check that the array index is really a number and not a string or something else! But it is in essence like you describe, using C-like addressing.
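To illustrate the bounds check from the script's point of view, here is a small sketch (variable names are illustrative). An out-of-range write to a typed array is dropped and the length stays fixed, whereas the same write to an ordinary array grows it:

```javascript
// A typed array has a fixed length; out-of-range accesses never touch memory.
const typed = new Int32Array(4);
typed[10] = 7;                    // out of range: the write is ignored
const typedRead = typed[10];      // undefined
const typedLength = typed.length; // still 4

// An ordinary array grows to fit the index instead, leaving holes behind.
const plain = [0, 0, 0, 0];
plain[10] = 7;
const plainLength = plain.length; // 11

console.log(typedRead, typedLength, plainLength);
```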

BUT... that's not all! In some cases clever JavaScript engines can also deduce the type of ordinary JavaScript arrays. In an engine like V8, if you make an ordinary JavaScript array and only store floats in it, V8 may optimistically decide it's an array of floats and optimise the code it generates for that. The performance can then be equivalent to typed arrays. So typed arrays aren't actually necessary to reach maximum performance: just use arrays predictably (with every element the same type) and some engines can optimise for that as well.

So why do typed arrays still need to exist?

  • Optimisations like deducing the type of arrays are really complicated. If V8 deduces an ordinary array has only floats in it, and you then store an object in an element, it has to de-optimise and regenerate code that makes the array generic again. It's quite an achievement that all this works transparently. Typed Arrays are much simpler: they're guaranteed to be one type, and you just can't store other things like objects in them.
  • Optimisations are never guaranteed to happen; you may store only floats in an ordinary array, but the engine may decide for various reasons not to optimise it.
  • The fact they're much simpler means other less-sophisticated JavaScript engines can easily implement them. They don't need all the advanced deoptimisation support.
  • Even with really advanced engines, proving that an optimisation can be used is extremely difficult and sometimes impossible. A typed array significantly lowers the level of proof the engine needs in order to optimise around it. A value read from a typed array is guaranteed to be of a known type, and engines can optimise for the result being that type. A value read from an ordinary array could in theory have any type; the engine may not be able to prove it will always have the same type, and so generates less efficient code. Code around a typed array is therefore more easily optimised.
  • Typed arrays remove the opportunity to make a mistake. You just can't accidentally store an object and suddenly get far worse performance.
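As a rough illustration of the "guaranteed to be one type" point, the sketch below (variable names are illustrative) stores non-numeric values into typed arrays. Every store is converted to the element type, so a read always yields a number, while an ordinary array keeps whatever was stored:

```javascript
// Every store into a Float64Array is converted to a number first.
const floats = new Float64Array(3);
floats[0] = 1.5;
floats[1] = "2.5";     // string is converted to the number 2.5
floats[2] = { x: 1 };  // object converts to NaN; the object itself is not stored

// Integer typed arrays also drop the fractional part.
const ints = new Int32Array(1);
ints[0] = 3.9;         // stored as 3

// An ordinary array keeps the original values, whatever their type.
const generic = [1.5, "2.5", { x: 1 }];

console.log(floats, ints, generic.map((v) => typeof v));
```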

So, in short, ordinary arrays can in theory be as fast as typed arrays. But typed arrays make it much easier to reach peak performance.

answered 2012-11-11T20:06:06.647
7

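Yes, you are mostly correct. With a standard JavaScript array, the JavaScript engine must assume that the data in the array are all objects. It can still store the array as a C-like array/vector, where memory access works the way you describe. The catch is that each element is not the value itself, but something that references the value (an object).

So, performing a[i] = b[i] + 2 requires the engine to:

  1. access the object in b at index i;
  2. check what type the object is;
  3. extract the value out of the object;
  4. add 2 to the value;
  5. create a new object with the newly computed value from step 4;
  6. assign the new object from step 5 into a at index i.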

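With a typed array, the engine can:

  1. access the value in b at index i (including placing it in a CPU register);
  2. increment the value by 2;
  3. assign the new value from step 2 into a at index i.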

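NOTE: These are not the exact steps a JavaScript engine will perform, as that depends on the code being compiled (including the surrounding code) and on the engine in question.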
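A small sketch of the a[i] = b[i] + 2 case (the function name `addTwo` is illustrative). The same source code runs on both kinds of array, but only the ordinary array can hold a non-number, which is why the engine has to check each element's type unless it can prove otherwise:

```javascript
// Adds 2 to every element of b and stores the result in a.
function addTwo(a, b) {
  for (let i = 0; i < b.length; i++) {
    a[i] = b[i] + 2;
  }
  return a;
}

// Ordinary arrays: each b[i] may be of any type.
const plainResult = addTwo([0, 0, 0], [1, 2, 3]);

// Typed arrays: every element is known to be a 64-bit float.
const typedResult = addTwo(new Float64Array(3), new Float64Array([1, 2, 3]));

// With a string element, + means concatenation, so the type matters.
const mixedResult = addTwo([0, 0], [1, "1"]);

console.log(plainResult, typedResult, mixedResult);
```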

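This makes the resulting computation much more efficient. In addition, typed arrays have a memory layout guarantee (an array of n-byte values), so they can be used to interface directly with raw data (audio, video, etc.).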
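A minimal sketch of what that layout guarantee allows, assuming four raw bytes that encode two little-endian 16-bit samples (the byte values and names are made up for illustration):

```javascript
// Four raw bytes, e.g. as they might arrive from a file or the network.
const rawBytes = new Uint8Array([0x01, 0x00, 0xff, 0x7f]);

// A DataView reads fixed-width values at explicit byte offsets;
// `true` selects little-endian byte order.
const reader = new DataView(rawBytes.buffer);
const sample0 = reader.getInt16(0, true); // bytes 01 00 -> 1
const sample1 = reader.getInt16(2, true); // bytes ff 7f -> 32767

// Two views over one buffer share the same memory: 2 floats = 8 bytes.
const floatView = new Float32Array(2);
const byteView = new Uint8Array(floatView.buffer);

console.log(sample0, sample1, byteView.length);
```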

answered 2012-11-11T19:17:52.233
3

When it comes to performance, things can change fast. As AshleysBrain says, it comes down to whether the VM can deduce that a normal array can be implemented as a typed array quickly and accurately. That depends on the particular optimizations of the particular JavaScript VM, and it can change in any new browser version.

This Chrome developer comment provides some guidance that worked as of June 2012:

  1. Normal arrays can be as fast as typed arrays if you do a lot of sequential access. Random access outside the bounds of the array causes the array to grow.
  2. Typed arrays are fast for access, but slow to be allocated. If you create temporary arrays frequently, avoid typed arrays. (Fixing this is possible, but it's low priority.)
  3. Micro-benchmarks such as JSPerf are not reliable for real-world performance.

If I might elaborate on the last point, I've seen this phenomenon with Java for years. When you test the speed of a small piece of code by running it over and over again in isolation, the VM optimizes the heck out of it. It makes optimizations which only make sense for that specific test. Your benchmark can get a hundredfold speed improvement compared to running the same code inside another program, or compared to running it immediately after running several different tests that optimize the same code differently.
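For what it's worth, a naive timing harness looks something like the sketch below (the names `timeIt`, `sumPlain` and `sumTyped` and the element count are illustrative). It is exactly the kind of isolated micro-benchmark described above, so treat the printed timings as a curiosity rather than evidence:

```javascript
// Runs fn once and reports its result and elapsed wall-clock milliseconds.
function timeIt(fn) {
  const start = Date.now();
  const result = fn();
  return { result, ms: Date.now() - start };
}

const COUNT = 100000;

// Fill and sum an ordinary array.
function sumPlain() {
  const arr = [];
  for (let i = 0; i < COUNT; i++) arr.push(i);
  let sum = 0;
  for (let i = 0; i < COUNT; i++) sum += arr[i];
  return sum;
}

// Fill and sum a typed array of the same size.
function sumTyped() {
  const arr = new Float64Array(COUNT);
  for (let i = 0; i < COUNT; i++) arr[i] = i;
  let sum = 0;
  for (let i = 0; i < COUNT; i++) sum += arr[i];
  return sum;
}

const plainRun = timeIt(sumPlain);
const typedRun = timeIt(sumTyped);
console.log("plain:", plainRun.ms, "ms", "typed:", typedRun.ms, "ms");
```

Both functions compute the same sum, so any difference in the timings comes from the array type, the allocation cost, and whatever the engine happened to optimise on that run.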

answered 2013-10-01T14:36:29.660
1

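I'm not really a contributor to any JavaScript engine; I've only read some material on V8, so my answer might not be completely accurate:

The values in an array (only ordinary arrays with no holes/gaps, i.e. not sparse; sparse arrays are treated as objects) are all either pointers or numbers of a fixed length (in V8 they are 32 bits wide: a 31-bit integer is tagged with a 0 bit at the end, otherwise the word is a pointer).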
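The tagging scheme described above can be modelled in a few lines. This is a toy illustration of the idea, assuming 32-bit words with the low bit as the tag; it is not V8 code, and the function names are made up:

```javascript
// Store a small integer in the upper 31 bits; the low bit stays 0.
function tagSmallInt(value) {
  return value << 1;
}

// A word whose low bit is 0 holds a small integer, otherwise a pointer.
function isSmallInt(word) {
  return (word & 1) === 0;
}

// Arithmetic shift right restores the original signed value.
function untagSmallInt(word) {
  return word >> 1;
}

const intWord = tagSmallInt(-3);
const pointerLikeWord = 0x1000 | 1; // stand-in for a tagged pointer

console.log(isSmallInt(intWord), untagSmallInt(intWord), isSmallInt(pointerLikeWord));
```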

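So I think finding the memory location is no different from a typed array, since every element in the array takes the same number of bytes. The difference is that if the element is an object you have to add an unboxing step, which doesn't happen with typed arrays.

And of course, when accessing typed arrays there are none of the type checks that ordinary arrays need (though those checks might be removed in highly optimized code, which is only generated for hot code).

For writes, if the value has the same type it shouldn't be much slower. If it has a different type, the JS engine might generate polymorphic code for it, which is slower.

You can also try running some benchmarks on jsperf.com to confirm.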

answered 2012-11-11T05:57:49.790