c - calloc(): Do the individual values matter for performance?

Question

I'm currently writing an embedded application in C where performance is critical.

At the moment I allocate a lot of zeroed memory with calloc(1, num_bytes). I calculate num_bytes earlier in the code as the number of items multiplied by the size of each item, because that code used to call malloc.

calloc seems unique in that it is the only memory allocation function of the -alloc family that takes two arguments for the size. Is there a good reason for this? Are there performance implications to how the total is split between the two arguments? What was the rationale for choosing this argument layout?

score 3 · Accepted Answer

One advantage of the separate arguments is that calloc can check the multiplication for integer overflow itself, and current mainstream implementations do:

// On a 32-bit system, the calloc will almost certainly fail, but the malloc
// size wraps around and the call succeeds with a far smaller block, likely
// leading to crashes and/or security holes (e.g. if the number of items to
// allocate came from an untrusted source)
void *a = calloc(64, 67108865);  // 67108865 = 2^32/64 + 1
void *b = malloc((size_t)64 * 67108865);  // will allocate 64 bytes on 32-bit systems
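If the existing code has to keep calling malloc, the same guard can be written by hand. The following is a minimal sketch, not taken from any library: array_size and malloc_array are hypothetical names, and the check simply tests whether count exceeds SIZE_MAX / size before multiplying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: stores count * size in *out and returns true,
 * or returns false if the product would not fit in a size_t. */
bool array_size(size_t count, size_t size, size_t *out)
{
    assert(out != NULL);
    if (size != 0 && count > SIZE_MAX / size) {
        return false;  /* count * size would wrap around */
    }
    *out = count * size;
    return true;
}

/* Hypothetical wrapper: behaves like malloc(count * size), but returns
 * NULL when the multiplication would overflow instead of allocating a
 * block that is too small. */
void *malloc_array(size_t count, size_t size)
{
    size_t total;
    if (!array_size(count, size, &total)) {
        return NULL;
    }
    return malloc(total);
}
```

With this, malloc_array(64, 67108865) returns NULL on a 32-bit system instead of a 64-byte block, which is the behaviour the calloc call above is expected to have.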

For large allocations, there can also be a performance advantage of doing a calloc instead of a malloc and memset combination, since the calloc implementation can use its internal knowledge of the heap to avoid unnecessary work or have improved cache performance.

For example, if the allocator decides to use an OS function such as mmap(2) or VirtualAlloc to acquire more virtual address space, that memory will come pre-zeroed for security reasons. See this question for a detailed explanation. For small allocations, you're unlikely to notice much of a difference.

Some calloc implementations just call malloc and memset internally, so there's no advantage other than a potential overflow check.
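For illustration, a calloc of that simple kind could look roughly like the sketch below. simple_calloc and simple_calloc_demo are hypothetical names, and this is not the source of any particular C library; it only shows the overflow check followed by malloc and memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of a calloc built on malloc + memset. */
void *simple_calloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;              /* count * size would overflow */
    }
    size_t total = count * size;
    void *p = malloc(total);
    if (p != NULL) {
        memset(p, 0, total);      /* always writes every byte */
    }
    return p;
}

/* Small usage check: the returned block reads as all zeros. */
void simple_calloc_demo(void)
{
    unsigned char *buf = simple_calloc(16, sizeof *buf);
    if (buf == NULL) {
        return;
    }
    for (size_t i = 0; i < 16; i++) {
        assert(buf[i] == 0);
    }
    free(buf);
}
```

An implementation like this gains nothing from pre-zeroed pages, because the memset runs unconditionally; only an allocator that knows where the memory came from can skip it.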

score 1 · Accepted Answer

I suppose that calloc() takes two arguments to allow the allocation of objects larger than a single size_t parameter can express (SIZE_MAX may be as small as 65535).

Whether performance is affected depends mostly on how the arguments are passed to calloc() in your particular environment. Usually, more arguments means more data to transfer between the caller and the callee. For example, an extra argument may have to be pushed onto the stack, which costs a couple of extra instructions. But I believe that this overhead won't be a bottleneck in your program, especially when compared to the execution time of the memory allocator itself.

If you're worried about the performance of calloc(), malloc() might be faster simply because it does not initialize the allocated buffer as calloc() does.

score 0 · Accepted Answer

I'm currently writing an embedded application in C where performance is critical.

I think that optimizing calloc should be a pretty low priority. But see whether it's possible to use malloc instead (avoiding the zero-initialization), to avoid allocation altogether by re-using memory, and possibly to allocate memory padded to a platform-specific boundary.

These are all very minor optimizations, though (except maybe for allocation reuse). I'd focus on the algorithm instead.
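As a rough sketch of the reuse idea, assuming a single caller at a time and a known upper bound on the request size, one statically allocated buffer can stand in for repeated calloc/free pairs. All names here (SCRATCH_CAPACITY, scratch_acquire, scratch_release) are hypothetical, the 1024-byte capacity is an arbitrary assumption, and the code relies on C11 for _Alignas and max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_CAPACITY 1024u  /* assumed upper bound on a request */

/* One buffer, aligned for any object type, reused across calls. */
static _Alignas(max_align_t) unsigned char scratch[SCRATCH_CAPACITY];
static int scratch_in_use = 0;

/* Returns the scratch buffer with its first num_bytes zeroed, or NULL
 * if the request is too large or the buffer is already handed out. */
void *scratch_acquire(size_t num_bytes)
{
    if (scratch_in_use || num_bytes > SCRATCH_CAPACITY) {
        return NULL;
    }
    memset(scratch, 0, num_bytes);  /* clear only what will be used */
    scratch_in_use = 1;
    return scratch;
}

/* Marks the buffer as free again; p must come from scratch_acquire. */
void scratch_release(void *p)
{
    assert(p == (void *)scratch);
    scratch_in_use = 0;
}
```

This removes the allocator from the hot path entirely, at the cost of a fixed memory footprint and no support for overlapping users. Whether that trade is worthwhile depends on the allocation pattern, so it is worth measuring first.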
