c - Efficient way to handle/define the array of struct pollfd for poll syscall

Question

I am using poll for implementing the server side of a client-server model with multiplexing.

There's a main loop that the server is running.
In each loop I am checking all the fds at the array of struct pollfd until I find n sockets that have content for me (where n is the number that poll returned).
Also, poll is used without timeout (-1 as argument).

So if a new connection is requested at the listening socket I am inserting the socket and some client information into a active connections list I am keeping.

What is the most efficient way to handle the array of poll?

Should I define a small array of size 10 and if there are more clients re allocate memory for an array of size 20 or 50?

Or should I: (a) free the array of struct pollfd type, and (b) reallocate it with size that is equal to the size of the list(+1 for the listening socket) => every time a client closes a connection (and thus I have to remove an element from the array (probably setting the socket to -1, so that poll ignores it) resulting in unused space in array)

What do you recommend?

Thank you for your time

Edit: I think I've found a solution with realloc and shifting with memmove each time a client disconnects to cover his socket at the fds array.

score 5 · Accepted Answer

处理民意调查数组的最有效方法是什么？

实现定义。在您知道必须优化之前不要优化；这被称为过早优化，这是浪费时间。

我应该定义一个大小为 10 的小数组，如果有更多的客户端为大小为 20 或 50 的数组重新分配内存？

如果您的分析器确定您pollfd的 s 是您程序中的一个重要瓶颈，并且您的老板（或教授）说您的程序“不够快”，那就去吧。如果您的列表将获得的最大值是 50，那么只需使用静态数组，将fd成员设置-1为一小部分。

如果您尝试在最坏的情况下处理大量客户端，您可能会担心在处理较小数字时数组尾随未使用的空间......因此会选择一个可调整大小的pollfd数组。

或者我应该：（a）释放 struct pollfd 类型的数组，并且（b）重新分配它的大小等于列表的大小（监听套接字+1）=>每次客户端关闭连接时（因此我必须从数组中删除一个元素（可能将套接字设置为-1，以便 poll 忽略它）导致数组中未使用空间）

我能想到的最简单的算法在调整大小时会加倍空间。realloc与线性调整大小相比，加倍调整大小的最显着好处可能是减少调用次数；与其接受 1024 个连接并调用realloc1024 次，我更愿意接受 1024 个连接并调用realloc10 或 11 次。这对我来说似乎也更简单，因为不需要将数组容量单独存储到已用计数；您可以使用二进制数的属性来发挥自己的优势：(n - 1) & n == 0何时n是 2 的幂。

#define is_power_of_two(n) !(((n) - 1) & (n))

struct pollfd *pollfd_array_resize(struct pollfd *array, size_t size) {
    const size_t max_size = SIZE_MAX / sizeof *array;
    if (size == max_size) { return NULL; }
    if (is_power_of_two(size)) {
        array = realloc(array, (size > 0 ? size * 2 : 1) * sizeof *array);
    }
    return array;
}

编辑：我想我已经找到了一个使用 realloc 的解决方案，并在每次客户端断开连接以覆盖 fds 数组中的套接字时使用 memmove 移动。

这似乎是一个相当不错的解决方案。memmove它增加了缓存局部性，但代价是realloc每次客户端断开连接时都必须调用。

我什至不会考虑对数组进行“碎片整理”，直到我的分析器表明这poll占用了太多的处理器时间。发生这种情况时，我会考虑将fd成员设置为负值（就像您在编辑之前所做的那样）并将要删除的项目的索引放入recently_freed堆栈中。插入时，如果可能，我会从该堆栈中选择项目。如果您必须进行碎片整理，我建议您根据该堆栈的大小进行碎片整理。

size_t pollfd_array_defrag(struct pollfd *array, size_t size) {
    size_t new_size = 0;
    for (size_t x = 0; x + 1 < size; x++) {
        if (array[x].fd < 0) {
            continue;
        }

        array[new_size++] = array[x];
    }
    return new_size;
}

int main(void) {
    size_t size = 0, index;
    struct pollfd *array = NULL;
    struct index_stack *recently_freed = NULL;

    /*----------------------------*
     * ... snip for insertion ... */
    if (recently_freed && recently_freed->size > 0) {
        index = index_stack_pop(&recently_freed);
    }
    else {
        struct pollfd *temp = pollfd_array_resize(array, size);

        if (temp == NULL) {
            /* TODO: Handle memory allocation errors */
        }

        array = temp;
        index = size++;
    }

    array[index] = (struct pollfd) { .fd = your_fd,
                                     .events = your_events,
                                     .revents = your_revents };

    /*--------------------------*
     * ... snip for removal ... */
    index_stack_push(&recently_freed, index);
    array[index] = -1;

    /*----------------------------------*
     * ... snip for defragmentation ... */
    if (is_power_of_two(recently_freed->size) && recently_freed->size * 2 + 1 > size) {
        size = pollfd_array_defrag(array);
        /* array will shrink in future insertions */
        index_stack_destroy(&recently_freed);
        recently_freed = NULL;
    }
}

c - Efficient way to handle/define the array of struct pollfd for poll syscall

1 回答 1

Related

Reference