linux - PERCPU: allocation failed, size=256 align=256, failed to allocate new chunk

Question

PERCPU: allocation failed, size=256 align=256, failed to allocate new chunk.

Is the amount of space for per-CPU allocations limited?

How much percpu-space can I use in Linux kernel module programming?

I'm trying to create as many workqueue_structs as possible. My kernel is 3.10.

My result: I can create about 100000 workqueue_structs, after which dmesg shows the error from the title.

My code:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/kthread.h>//kthread_create is_err
#include <linux/slab.h>//kfree
#include <linux/sched.h>//schedule
#include <linux/delay.h>
#include <linux/list.h>
#include <linux/workqueue.h>

u64 i = 0;
static LIST_HEAD(myworkqueuehead);
static struct task_struct *task;

struct MyworkqueueType {
    struct list_head entry;
    struct workqueue_struct *wq;
    u64 number;
};

void myfree(void)
{
    struct MyworkqueueType *tempwqtype,*n;
    list_for_each_entry_safe(tempwqtype, n, &myworkqueuehead, entry)
    {
        if(tempwqtype)
        {
            if(tempwqtype->wq){
                //printk("myfree():number=%lld\n",tempwqtype->number);
                //printk("list_del()\n");
                list_del(&(tempwqtype->entry));
                //printk("destroy_workqueue()\n");
                destroy_workqueue(tempwqtype->wq);
                //printk("free tempwqtypetype:kfree(tempwqtype)\n");
                kfree(tempwqtype);
                //printk("after free tempwqtypetype\n");
            }else{  
                printk("tempwqtype->wq is null\n");
            }
        }else{
            printk("tempwqtype is null\n");
        }
    }
    printk("has freed all the workqueue space...\n");
}


static int test(void *data)
{
    printk("kthread  create_wq start to run test()...\n");
    while(1)
    {
        struct MyworkqueueType *myworkqueue;
        if(kthread_should_stop())
        {
            printk("create_wq kthread begin to do myfree()...\n");
            myfree();
            printk("create_wq kthread stop...\n");
            return 0;
        }
        myworkqueue = kzalloc(sizeof(*myworkqueue), GFP_KERNEL);
        if(myworkqueue){
            struct workqueue_struct *wq = alloc_workqueue("myworkqueue",0,0);
            //struct workqueue_struct *wq = create_workqueue("myworkqueue");
            if(!wq)
            {
                kfree(myworkqueue);
                printk("\ncreate workqueue fail...\n");
                /* i is the number of the last workqueue created; reading the
                 * list tail instead would be invalid while the list is empty */
                printk("current workqueue number=%llu.start to sleep...\n",i);
                msleep(5000);
                schedule();
                continue;
            }
            ++i;
            myworkqueue->number = i;
            myworkqueue->wq = wq;
            INIT_LIST_HEAD(&myworkqueue->entry);
            list_add_tail(&myworkqueue->entry,&myworkqueuehead);
            printk("%lld ",i);
        }
        else
        {
            printk("\nalloc struct MyworkqueueType fail...\n");
            printk("current workqueuenum = %lld",i);
            kfree(myworkqueue);
            msleep(5000);
            schedule();
            continue;
        }

    }
}

static int __init maxwqnum_init(void)
{
    printk("-----------maxwqnum-------------\n");
    task=kthread_create(test,NULL,"create_wq");
    if(IS_ERR(task))
    {
        printk("create task_struct create_wq fail...\n");
        /* task holds an encoded error here, not allocated memory,
         * so it must not be passed to kfree() */
        return PTR_ERR(task);
    }
    printk("create task_struct create_wq success...\n");
    wake_up_process(task);
    return 0;
}

static void __exit maxwqnum_cleanup(void)
{
    kthread_stop(task);
    printk("-----------leaving maxwqnum-------------\n");
}

module_init(maxwqnum_init);
module_exit(maxwqnum_cleanup);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("mjq");
MODULE_DESCRIPTION("just a test!");
MODULE_SUPPORTED_DEVICE("WORKQUEUE");

score 2 · Accepted Answer

The largest chunk available to a module from the per-cpu pool depends on how much of the pool is already in use by the kernel and by other loaded modules.

The size of the percpu pool depends upon whether the following configs are defined in the kernel config.

CONFIG_HAVE_SETUP_PER_CPU_AREA
CONFIG_SMP

A typical initial size of the per-cpu pool at boot-up is 32KB per cpu.

It can vary if the architecture-specific setup_per_cpu_areas() function is defined. The exact amount of memory reserved for the percpu pool is logged to the console during boot. For example, Linux kernel 3.2 on my Intel Core 2 Duo P8700 machine logs the following:

PERCPU: Embedded 13 pages/cpu @f77d1000 s31616 r0 d21632 u53248

The percpu pool is 13 pages, i.e. 52KB for each cpu, or 104KB in total for the two cores. The other numbers are the base address of the pool, static_size, reserved_size, dyn_size and unit_size respectively.

UPDATE:

Compiling the kernel module from the code in the question above and loading it with insmod results in the following error:

[867955.300798] create workqueue fail...
[867955.300804] current workqueue number=198634.start to sleep...
[867960.315934] PERCPU: allocation failed, size=92 align=256, failed to allocate new chunk
[867960.315948] Pid: 26103, comm: create_wq Tainted: G           O 3.2.0-51-generic #77-Ubuntu
[867960.315955] Call Trace:
[867960.315973]  [<c1563ac4>] ? printk+0x2d/0x2f
[867960.315986]  [<c110335e>] pcpu_alloc+0x30e/0x340
[867960.315995]  [<c110339f>] __alloc_percpu+0xf/0x20
[867960.316032]  [<c10641b0>] __alloc_workqueue_key+0xd0/0x430
[867960.316047]  [<c1122f75>] ? kmem_cache_alloc_trace+0x105/0x140
[867960.316065]  [<f93e50e6>] test+0x56/0x194 [kmod]
[867960.316078]  [<f93e5090>] ? myfree+0x90/0x90 [kmod]
[867960.316091]  [<c1069ddd>] kthread+0x6d/0x80
[867960.316104]  [<c1069d70>] ? flush_kthread_worker+0x80/0x80
[867960.316118]  [<c158033e>] kernel_thread_helper+0x6/0x10

Essentially, as more per-cpu blocks are requested, the dynamic area grows on demand: once the existing chunks are full, a new chunk is created through pcpu_alloc_chunk(). Internally this obtains the extra memory from the kernel's general-purpose allocators (kmalloc()/vmalloc()), and it keeps succeeding as long as a block of the required size and alignment is available. Eventually it fails, depending on the usage and fragmentation of system memory, and that is when you see the error.

How does `pcpu_alloc()` work?

At initial boot-up, the per-cpu subsystem reserves a small pool of memory from the global memory available to the Linux kernel.

PERCPU: Embedded 13 pages/cpu @f77d1000 s31616 r0 d21632 u53248

This is what the log describes.

Static 31616 + Dynamic 21632 = Total 53248, i.e. 52KB (13 pages of 4KB each).

As more and more per-cpu allocations are made through pcpu_alloc(), the dynamic pool keeps growing. It can be non-contiguous and even sparse in memory, but as long as the requested size and alignment can be satisfied, allocation continues to succeed. This is because the underlying allocations are made using kmalloc()/vmalloc().

Eventually one of these calls fails because no memory hole satisfying the requested size and alignment is available. That is pretty much it. Just as one cannot predict whether a memalign() call will succeed, it is hard to determine accurately when pcpu_alloc() will fail, especially since other modules (and the Linux kernel itself) can be calling pcpu_alloc() at the same time.

For more details, refer to mm/percpu.c in the Linux kernel source.
