
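I am working on the Linux kernel. I modified the scheduler submodule of Linux kernel 3.3 and tried to boot the kernel on a Beagleboard, and I got an "inconsistent lock state" error. Can anyone help me analyze the debug output below? Thanks!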

[    0.163452] =================================
[    0.167999] [ INFO: inconsistent lock state ]
[    0.172576] 3.3.0-rc7-00008-g8bd3d32-dirty #27 Not tainted
[    0.178314] ---------------------------------
[    0.182891] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[    0.189178] swapper/0/0 [HC0[0]:SC0[0]:HE1:SE1] takes:
[    0.194549]  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4
[    0.202117] {IN-HARDIRQ-W} state was registered at:
[    0.207214]   [<c008fa7c>] __lock_acquire+0xc4c/0x1e28
[    0.212615]   [<c00912dc>] lock_acquire+0x98/0x100
[    0.217651]   [<c04761bc>] _raw_spin_lock+0x2c/0x3c
[    0.222778]   [<c0072890>] scheduler_tick+0x34/0x134
[    0.227996]   [<c0052738>] update_process_times+0x58/0x68
[    0.233642]   [<c00881ac>] tick_periodic+0x48/0xc4
[    0.238677]   [<c00882c0>] tick_handle_periodic+0x24/0x98
[    0.244323]   [<c00264a0>] omap2_gp_timer_interrupt+0x24/0x34
[    0.250366]   [<c00a1344>] handle_irq_event_percpu+0x5c/0x22c
[    0.256378]   [<c00a1550>] handle_irq_event+0x3c/0x5c
[    0.261657]   [<c00a39f8>] handle_level_irq+0xac/0x138
[    0.267059]   [<c00a0b58>] generic_handle_irq+0x30/0x48
[    0.272521]   [<c0014d18>] handle_IRQ+0x4c/0xac
[    0.277313]   [<c000872c>] omap3_intc_handle_irq+0x44/0x4c
[    0.283050]   [<c0476a64>] __irq_svc+0x44/0x60
[    0.287719]   [<c06286e4>] start_kernel+0x204/0x354
[    0.292846]   [<80008044>] 0x80008044
[    0.296691] irq event stamp: 2802
[    0.300201] hardirqs last  enabled at (2801): [<c0476348>] _raw_write_unlock_irq+0x24/0x2c
[    0.308807] hardirqs last disabled at (2802): [<c0476294>] _raw_spin_lock_irqsave+0x1c/0x58
[    0.317504] softirqs last  enabled at (2756): [<c004a6dc>] irq_exit+0x94/0x9c
[    0.324951] softirqs last disabled at (2751): [<c004a6dc>] irq_exit+0x94/0x9c
[    0.332397] 
[    0.332397] other info that might help us debug this:
[    0.339294]  Possible unsafe locking scenario:
[    0.339294] 
[    0.345581]        CPU0
[    0.348175]        ----
[    0.350769]   lock(&rq->lock);
[    0.354003]   <Interrupt>
[    0.356781]     lock(&rq->lock);
[    0.360198] 
[    0.360198]  *** DEADLOCK ***
[    0.360229] 
[    0.366577] 2 locks held by swapper/0/0:
[    0.370697]  #0:  (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4
[    0.378875]  #1:  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4
[    0.386871] 
[    0.386871] stack backtrace:
[    0.391571] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c008e6fc>] (print_usage_bug+0x1d8/0x)
[    0.401062] [<c008e6fc>] (print_usage_bug+0x1d8/0x2c0) from [<c008ebac>] (mark_lock+0x3c8/0x64c)
[    0.410217] [<c008ebac>] (mark_lock+0x3c8/0x64c) from [<c0091c24>] (mark_held_locks+0xb0/0x144)
[    0.419250] [<c0091c24>] (mark_held_locks+0xb0/0x144) from [<c0091d60>] (trace_hardirqs_on_calle)
[    0.429565] [<c0091d60>] (trace_hardirqs_on_caller+0xa8/0x19c) from [<c000f284>] (do_vfp+0x8/0x2)
[   12.759429] BUG: spinlock lockup on CPU#0, swapper/0/0
[   12.764801]  lock: c0de0380, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
[   12.772613] [<c001b7a8>] (unwind_backtrace+0x0/0xf0) from [<c0261600>] (do_raw_spin_lock+0xa0/0x)
[   12.782135] [<c0261600>] (do_raw_spin_lock+0xa0/0x134) from [<c0072890>] (scheduler_tick+0x34/0x)
[   12.791625] [<c0072890>] (scheduler_tick+0x34/0x134) from [<c0052738>] (update_process_times+0x5)
[   12.801391] [<c0052738>] (update_process_times+0x58/0x68) from [<c00881ac>] (tick_periodic+0x48/)
[   12.811004] [<c00881ac>] (tick_periodic+0x48/0xc4) from [<c00882c0>] (tick_handle_periodic+0x24/)
[   12.820587] [<c00882c0>] (tick_handle_periodic+0x24/0x98) from [<c00264a0>] (omap2_gp_timer_inte)
[   12.831176] [<c00264a0>] (omap2_gp_timer_interrupt+0x24/0x34) from [<c00a1344>] (handle_irq_even)
[   12.842102] [<c00a1344>] (handle_irq_event_percpu+0x5c/0x22c) from [<c00a1550>] (handle_irq_even)
[   12.852325] [<c00a1550>] (handle_irq_event+0x3c/0x5c) from [<c00a39f8>] (handle_level_irq+0xac/0)
[   12.861907] [<c00a39f8>] (handle_level_irq+0xac/0x138) from [<c00a0b58>] (generic_handle_irq+0x3)
[   12.871704] [<c00a0b58>] (generic_handle_irq+0x30/0x48) from [<c0014d18>] (handle_IRQ+0x4c/0xac)
[   12.880828] [<c0014d18>] (handle_IRQ+0x4c/0xac) from [<c000872c>] (omap3_intc_handle_irq+0x44/0x)
[   12.890258] [<c000872c>] (omap3_intc_handle_irq+0x44/0x4c) from [<c0476a64>] (__irq_svc+0x44/0x6)
[   12.899566] Exception stack(0xc0677d98 to 0xc0677de0)
[   12.904846] 7d80:                                                       edd47a1a 00000000
[   12.913360] 7da0: c0076e30 c0692630 c0693ad8 600001d3 c0676050 00000001 00000a00 c0476acc
[   12.921874] 7dc0: c0676000 00000468 c0704b70 c0677de0 c0476ac4 c000f290 60000153 ffffffff
[   12.930419] [<c0476a64>] (__irq_svc+0x44/0x60) from [<c000f290>] (do_vfp+0x14/0x20)
[   12.938385] [<c000f290>] (do_vfp+0x14/0x20) from [<c0476ac4>] (__und_svc+0x44/0x80)
[   12.946380] [<c0476ac4>] (__und_svc+0x44/0x80) from [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8)
[   12.955505] [<c0076e30>] (enqueue_task_fair+0x1dc/0x5e8) from [<c00702c0>] (enqueue_task+0x64/0x)
[   12.964935] [<c00702c0>] (enqueue_task+0x64/0x74) from [<c00739e0>] (wake_up_new_task+0xdc/0x1d4)
[   12.974182] [<c00739e0>] (wake_up_new_task+0xdc/0x1d4) from [<c0042120>] (do_fork+0xe4/0x328)
[   12.983062] [<c0042120>] (do_fork+0xe4/0x328) from [<c0015068>] (kernel_thread+0x6c/0x7c)
[   12.991577] [<c0015068>] (kernel_thread+0x6c/0x7c) from [<c0464af4>] (rest_init+0x1c/0xd0)
[   13.000183] [<c0464af4>] (rest_init+0x1c/0xd0) from [<c06287c4>] (start_kernel+0x2e4/0x354)
[   13.008880] [<c06287c4>] (start_kernel+0x2e4/0x354) from [<80008044>] (0x80008044)

1 Answer


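First of all, this is a lockdep warning. It indicates that your kernel may have an inconsistent locking problem, which can lead to a deadlock. lockdep groups locks into logical classes and checks a set of rules against each class; if any rule is violated, it prints a warning. lockdep basically checks two kinds of rules: a) single-lock state rules; b) multi-lock dependency rules. From the warning log you provided: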

[    0.370697]  #0:  (&p->pi_lock){+.....}, at: [<c0073920>] wake_up_new_task+0x1c/0x1d4
[    0.378875]  #1:  (&rq->lock){?.....}, at: [<c00739b8>] wake_up_new_task+0xb4/0x1d4
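
The header of the report in the question reads `inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage`, which names a state change on one lock class. As a rough illustration of what a single-lock state rule checks, here is a minimal userland sketch. It is not lockdep's actual code: the struct, the bit names, and `mark_usage()` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical usage bits for one lock class (illustrative names only). */
enum {
    USED_IN_HARDIRQ  = 1u << 0, /* lock was taken inside a hard IRQ handler */
    HELD_HARDIRQS_ON = 1u << 1  /* lock was held while hard IRQs were enabled */
};

struct lock_class_model {
    const char *name;
    unsigned int usage; /* every state observed so far for this class */
};

/*
 * Record one observation of the lock being held. Returns true while the
 * accumulated usage is consistent, false once both bits are set: at that
 * point an IRQ handler could spin on a lock its own CPU already holds.
 */
static bool mark_usage(struct lock_class_model *cls,
                       bool in_hardirq, bool hardirqs_on)
{
    const unsigned int both = USED_IN_HARDIRQ | HELD_HARDIRQS_ON;

    assert(cls != NULL);
    if (in_hardirq)
        cls->usage |= USED_IN_HARDIRQ;
    if (hardirqs_on)
        cls->usage |= HELD_HARDIRQS_ON;
    return (cls->usage & both) != both;
}
```

In this model, an observation from the timer-tick path (`in_hardirq = true`) followed later by an observation with `hardirqs_on = true` on the same class makes `mark_usage()` return false, which corresponds to the `lock(&rq->lock); <Interrupt> lock(&rq->lock);` scenario printed in the log.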

lockdep treats p->pi_lock and rq->lock as logically belonging to the same class. The first lock is taken in try_to_wake_up(), and the second is taken in __task_rq_lock(). So the whole sequence becomes:

    try_to_wake_up() -> p->pi_lock is held
        ttwu_remote()
            __task_rq_lock() -> rq->lock is held

So two locks of the same class are taken on a single path. lockdep issues the warning as part of its multi-lock dependency rules.
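
To make the "two locks of the same class on one path" idea concrete, here is a minimal userland sketch of a nested-acquisition check. It is only a model under stated assumptions: the held-lock stack, the integer class ids, and the `model_acquire()` / `model_release()` helpers are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical per-task stack of the classes of locks currently held. */
struct held_locks_model {
    int class_id[MAX_HELD];
    size_t depth;
};

/*
 * Try to acquire a lock of the given class. Returns false (flagged)
 * if a lock of the same class is already held on this path,
 * otherwise records the acquisition and returns true.
 */
static bool model_acquire(struct held_locks_model *h, int class_id)
{
    assert(h != NULL && h->depth < MAX_HELD);
    for (size_t i = 0; i < h->depth; i++) {
        if (h->class_id[i] == class_id)
            return false; /* same class nested inside itself */
    }
    h->class_id[h->depth++] = class_id;
    return true;
}

/* Release the most recently acquired lock. */
static void model_release(struct held_locks_model *h)
{
    assert(h != NULL && h->depth > 0);
    h->depth--;
}
```

Note that in this sketch a nested acquisition is flagged only when both locks carry the same class id; two locks with different ids pass the check.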

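Note: I don't know what modifications you made, so I can't tell whether the warning is a false positive. All I have done here is explain the issue.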

Answered 2013-05-21T17:50:47.587