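I've seen in a few places, including the source code for CCSpriteBatchNode, that adding/removing children is "expensive". My understanding is that the whole point of using a batch node is to prevent expensive OpenGL calls from happening over and over when many sprites from the same sprite sheet are added to the same container.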

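What I would like to know is 1) how "expensive" is adding/removing children to a sprite batch node, and 2) when is it considered appropriate to make use of one?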

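For example, I have a laser object that creates ten sprites... As it moves across the screen, it shows/hides the current sprite for the given screen position. When it reaches the far right edge of the screen, the laser object is discarded, and so are the ten sprites. So, I was wondering, is this a case where a sprite batch node would be not appropriate to use because it's only 10 sprites, and it happens so fast-- The move animation is 0.2 seconds, so that if the player were to rapidly fire, that would mean adding/removing 10 sprites to a batch node over and over...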

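In other cases, I already have a CCSpriteBatchNode set up for various objects, and occasionally I run into a one-off sprite that needs to be added and happens to be part of the same sprite sheet, so I'm tempted to add it to that batch node because it's there and it's already assigned to that particular sprite sheet... Anyway, I'd love some clarification on this topic.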

3 Answers

1) how "expensive" is adding/removing children to a sprite batch node

The only scenario I am aware of where it can be "expensive" is when the atlas capacity has to grow. Batch nodes have a capacity, and if you add a child that exceeds it, the node has to increase its capacity and recalculate texture coordinates for all sprites.

To avoid this, simply give your batch node a reasonable capacity to begin with: not too small and not too large. It's up to you to pick that number, depending on your needs.
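To make the capacity point concrete, here is a minimal sketch in plain C of a growable quad array that counts how often its storage has to grow. It is a model, not cocos2d's code: `Quad`, `QuadAtlas`, `atlas_add` and `growth_steps` are hypothetical names, and the growth rule of roughly one third per step is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Placeholder for one sprite's quad data (size chosen for illustration). */
typedef struct { unsigned char bytes[96]; } Quad;

typedef struct {
    Quad  *quads;
    size_t count;
    size_t capacity;
    size_t reallocs;   /* how many times the storage had to grow */
} QuadAtlas;

/* Allocates storage for `capacity` quads (capacity must be >= 1). */
int atlas_init(QuadAtlas *a, size_t capacity) {
    a->quads = malloc(capacity * sizeof(Quad));
    if (a->quads == NULL) return -1;
    a->count = 0;
    a->capacity = capacity;
    a->reallocs = 0;
    return 0;
}

/* Appends a quad, growing the storage first if it is full. */
int atlas_add(QuadAtlas *a, Quad q) {
    if (a->count == a->capacity) {
        /* Assumed growth policy: about one third larger each time. */
        size_t new_cap = (a->capacity + 1) * 4 / 3;
        Quad *p = realloc(a->quads, new_cap * sizeof(Quad));
        if (p == NULL) return -1;
        a->quads = p;
        a->capacity = new_cap;
        a->reallocs++;
    }
    a->quads[a->count++] = q;
    return 0;
}

void atlas_free(QuadAtlas *a) {
    free(a->quads);
    a->quads = NULL;
    a->count = a->capacity = 0;
}

/* Adds n quads to an atlas created with the given capacity and
   returns how many growth steps were needed. */
size_t growth_steps(size_t initial_capacity, size_t n) {
    QuadAtlas a;
    Quad q = {{0}};
    if (initial_capacity == 0) initial_capacity = 1;
    if (atlas_init(&a, initial_capacity) != 0) return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        if (atlas_add(&a, q) != 0) break;
    }
    size_t steps = a.reallocs;
    atlas_free(&a);
    return steps;
}
```

Under these assumptions, `growth_steps(4, 100)` goes through 10 growth steps, while `growth_steps(100, 100)` needs none, which is the argument for choosing a sensible starting capacity.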

2) when is it considered appropriate to make use of one?

Whenever you have several sprites that can use the same texture source. For a Mario game, it is clear that you will need several coins on the screen. This would be a good use case for a batch node: have a batch node for the coin image, and then all your coin sprites will use this batch node.

Sometimes you can pack several elements into the same texture. Say, you could fit a coin image, a monster image, and a mushroom image all in the same texture. This way, all your coins, monsters and mushrooms could use the same batch node.

You shouldn't need batch nodes for things like background textures, because you probably only need one background sprite anyway.

So, I was wondering, is this a case where a sprite batch node would be not appropriate to use because it's only 10 sprites, and it happens so fast-- The move animation is 0.2 seconds, so that if the player were to rapidly fire, that would mean adding/removing 10 sprites to a batch node over and over...

This is a valid use case for a batch node. 10 sprites are drawn simultaneously, after all. And if you know that you won't be using a laser object anymore, you can always unload the corresponding batch node. I imagine that you may have several laser objects in your game, so a batch node is a good idea.

Frankly, don't worry much about performance. I use dozens of batch nodes in my game all the time for all sorts of things (characters, weather particles, map objects, collectibles, interface, etc.), and thanks to them I rarely see the frame rate fall below 55fps.

In fact, I find it hard to argue against using batch nodes. They rarely cause any harm.

answered 2014-05-11T23:25:08.813

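As mentioned before, a sprite batch node batches the calls to the GPU for all of its children (since they use the same texture). However, a large number of sprites has to be involved for this to have an impact on performance. With 10 sprites, I don't think it will make a difference...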

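That said, note that if you are using a newer version of Cocos2d (such as 3.0), version 3.1, now in beta, offers automatic batching, so you don't need to spend time playing with CCSpriteBatchNode. Cocos2d batches the data sent to the GPU automatically.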

answered 2014-05-12T13:32:13.640

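The main difference between a CCSpriteBatchNode and a normal CCSprite is that a CCSpriteBatchNode sends all the data for all of its sprites to the GPU at once, instead of doing this for each sprite.

The draw call for a CCSprite works like this: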

// Excerpt: offset is the address of this sprite's quad. In the full source, diff is
// reassigned before each call to the offsetof() of the matching field in ccV3F_C4B_T2F.
glVertexAttribPointer(kCCVertexAttrib_Position, 3, GL_FLOAT, GL_FALSE, kQuadSize, (void*) (offset + diff));
glVertexAttribPointer(kCCVertexAttrib_TexCoords, 2, GL_FLOAT, GL_FALSE, kQuadSize, (void*)(offset + diff));
glVertexAttribPointer(kCCVertexAttrib_Color, 4, GL_UNSIGNED_BYTE, GL_TRUE, kQuadSize, (void*)(offset + diff));
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

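Basically, 3 calls are made to set up the sprite's data, followed by a call to glDrawArrays. If you have 100 sprites, this code is executed 100 times.

Now let's look at CCSpriteBatchNode (I chose the implementation without VAO, which is another possible optimization):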

glVertexAttribPointer(kCCVertexAttrib_Position, 3, GL_FLOAT, GL_FALSE, kQuadSize, (GLvoid*) offsetof( ccV3F_C4B_T2F, vertices));
glVertexAttribPointer(kCCVertexAttrib_Color, 4, GL_UNSIGNED_BYTE, GL_TRUE, kQuadSize, (GLvoid*) offsetof( ccV3F_C4B_T2F, colors));
glVertexAttribPointer(kCCVertexAttrib_TexCoords, 2, GL_FLOAT, GL_FALSE, kQuadSize, (GLvoid*) offsetof( ccV3F_C4B_T2F, texCoords));
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffersVBO_[1]);
glDrawElements(GL_TRIANGLE_STRIP, (GLsizei) n*6, GL_UNSIGNED_SHORT, (GLvoid*) (start*6*sizeof(indices_[0])) );

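Now this code sets up all the data for all the sprites at once, since it is stored in contiguous memory. The call is the same for 1, 10 or 100 sprites, regardless of how many there are.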
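The n*6 in the glDrawElements call comes from the index buffer: every quad has 4 vertices but is drawn with 6 indices. The sketch below builds such an index buffer in plain C. It assumes the generic two-triangles-per-quad layout (the GL_TRIANGLES style), which is not necessarily the exact layout cocos2d builds for the strip variant shown above; `build_quad_indices` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills `indices` (room for n_quads * 6 entries) so that quad i, whose
   vertices sit at positions 4*i .. 4*i+3 in the vertex array, is drawn
   as two triangles: (0, 1, 2) and (3, 2, 1) relative to its base vertex.
   Returns the number of indices written, or 0 if the vertices would not
   fit in 16-bit indices (GL_UNSIGNED_SHORT). */
size_t build_quad_indices(uint16_t *indices, size_t n_quads) {
    if (n_quads > 65536 / 4) return 0;
    for (size_t i = 0; i < n_quads; i++) {
        uint16_t base = (uint16_t)(i * 4);
        indices[i * 6 + 0] = (uint16_t)(base + 0);
        indices[i * 6 + 1] = (uint16_t)(base + 1);
        indices[i * 6 + 2] = (uint16_t)(base + 2);
        indices[i * 6 + 3] = (uint16_t)(base + 3);
        indices[i * 6 + 4] = (uint16_t)(base + 2);
        indices[i * 6 + 5] = (uint16_t)(base + 1);
    }
    return n_quads * 6;
}
```

With a buffer like this uploaded once, a single indexed draw covers any number of quads; only the index count passed to the draw call changes.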

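That is why it is more efficient, but at the same time, because the data is stored contiguously in memory, whenever a child is removed, added or modified, the array has to be changed accordingly and updated on the GPU. This is where the cost of adding and removing comes from (even a hidden CCSprite just skips the rendering phase, whereas a hidden CCSprite in a batch node does not).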
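The cost of removal from contiguous storage can be sketched in a few lines of C: every quad after the removed one has to be shifted down by one slot, and the changed range then has to be sent to the GPU again. This is an illustrative model with hypothetical names (`Quad`, `remove_quad`), not the library's actual implementation.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder for one sprite's quad data (size chosen for illustration). */
typedef struct { unsigned char bytes[96]; } Quad;

/* Removes the quad at `index` from a contiguous array holding *count quads.
   Returns how many quads had to be shifted down to close the gap; those are
   the entries that would need re-uploading afterwards. */
size_t remove_quad(Quad *quads, size_t *count, size_t index) {
    if (index >= *count) return 0;           /* nothing to remove */
    size_t tail = *count - index - 1;        /* quads after the removed one */
    memmove(&quads[index], &quads[index + 1], tail * sizeof(Quad));
    (*count)--;
    return tail;
}
```

Removing the last quad shifts nothing, while removing the first one shifts all the others, so the cost depends on where in the array the child sits.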

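From personal experience, I can tell you that the cost is usually negligible, and that you should use a CCSpriteBatchNode whenever you can (they do have limitations, such as blending being applied to the whole node rather than per sprite, and similar things) and whenever you draw multiple sprites of the same kind.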

It should be easy to benchmark this for yourself, though.

answered 2014-05-11T23:19:15.293