
I am using the kernel_fpu_begin and kernel_fpu_end functions from asm/i387.h to protect the FPU register state while doing some simple floating-point arithmetic inside a Linux kernel module.

I'm curious about what happens when kernel_fpu_begin is called twice before kernel_fpu_end is called, and vice versa. For example:

#include <asm/i387.h>

double foo(unsigned num){
    kernel_fpu_begin();

    double x = 3.14;
    x += num;

    kernel_fpu_end();

    return x;
}

...

kernel_fpu_begin();

double y = 1.23;
unsigned z = 42;
y -= foo(z);

kernel_fpu_end();

Inside foo, I call kernel_fpu_begin and kernel_fpu_end; but kernel_fpu_begin has already been called before the call to foo. Does this result in undefined behavior?

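Furthermore, should I even be calling kernel_fpu_end inside foo? I return a double after the kernel_fpu_end call, which means accessing the floating-point registers is unsafe at that point, right?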

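My initial guess is simply not to use the kernel_fpu_begin and kernel_fpu_end calls inside foo; but what if foo returned the double cast to unsigned instead? The programmer would then have no way of knowing that kernel_fpu_begin and kernel_fpu_end must be used outside of foo.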


3 Answers

7

Short answer: no, it is incorrect to nest kernel_fpu_begin() calls, and it will lead to the userspace FPU state getting corrupted.

Medium answer: This won't work because kernel_fpu_begin() uses the current thread's struct task_struct to save off the FPU state (task_struct has an architecture-dependent member thread, and on x86, thread.fpu holds the thread's FPU state), and doing a second kernel_fpu_begin() will overwrite the original saved state. Then doing kernel_fpu_end() will end up restoring the wrong FPU state.
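To make the overwrite concrete, here is a small userspace model of the mechanism described above. It is only a sketch: the names (fpu_state, live_fpu, saved_slot, model_fpu_begin, and so on) are made up for illustration and are not the kernel's, and the real save/restore logic has more cases than this.

```c
#include <assert.h>

/* Userspace model only: none of these names exist in the kernel. */
struct fpu_state { double st0; };   /* stand-in for the FPU register file */

static struct fpu_state live_fpu;   /* what the "hardware" currently holds */
static struct fpu_state saved_slot; /* the single per-thread save area */

/* Model of kernel_fpu_begin(): save the live state into the one slot. */
void model_fpu_begin(void) { saved_slot = live_fpu; }

/* Model of kernel_fpu_end(): restore whatever the one slot holds. */
void model_fpu_end(void) { live_fpu = saved_slot; }

/* Kernel floating-point work clobbers the live registers. */
void model_kernel_fp_work(double v) { live_fpu.st0 = v; }

/* Userspace value seen after a single, non-nested section. */
double run_flat(double user_value)
{
    live_fpu.st0 = user_value;
    model_fpu_begin();              /* slot = user state */
    model_kernel_fp_work(3.14);
    model_fpu_end();                /* user state comes back */
    return live_fpu.st0;
}

/* Userspace value seen after a nested section. */
double run_nested(double user_value)
{
    live_fpu.st0 = user_value;
    model_fpu_begin();              /* slot = user state */
    model_kernel_fp_work(1.23);
    model_fpu_begin();              /* slot = kernel scratch, user state lost */
    model_kernel_fp_work(3.14);
    model_fpu_end();                /* restores the kernel scratch value */
    model_fpu_end();                /* restores the same wrong value again */
    return live_fpu.st0;
}
```

In this model run_flat() hands back the value userspace started with, while run_nested() hands back the kernel's scratch value, because the second save replaced the only copy of the user state.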

Long answer: As you saw looking at the actual implementation in <asm/i387.h>, the details are a bit tricky. In older kernels (like the 3.2 source you looked at), the FPU handling is always "lazy" -- the kernel wants to avoid the overhead of reloading the FPU until it really needs it, because the thread might run and be scheduled out again without ever actually using the FPU or needing its FPU state. So kernel_fpu_end() just sets the TS flag, which causes the next access of the FPU to trap and cause the FPU state to be reloaded. The hope is that we don't actually use the FPU enough of the time for this to be cheaper overall.

However, if you look at newer kernels (3.7 or newer, I believe), you'll see that there is actually a second code path for all of this -- "eager" FPU. This is because newer CPUs have the "optimized" XSAVEOPT instruction, and newer userspace uses the FPU more often (for SSE in memcpy, etc). The cost of XSAVEOPT / XRSTOR is less and the chance of the lazy optimization actually avoiding an FPU reload is less too, so with a new kernel on a new CPU, kernel_fpu_end() just goes ahead and restores the FPU state.
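The trade-off between the two modes can be sketched as a toy userspace model that simply counts state reloads. Everything here (model_cpu, model_end_lazy, model_end_eager, model_reloads) is a hypothetical name for illustration; the model assumes one reload per trap and ignores the actual cost difference between the instructions.

```c
/* Userspace model only; struct and function names are hypothetical. */
struct model_cpu {
    int ts_flag;          /* models CR0.TS: 1 means the next FPU use traps */
    int user_state_live;  /* 1 when the thread's own state is in the registers */
    int reloads;          /* number of times the state was reloaded */
};

/* Entering a kernel FPU section: the thread's state is no longer live. */
void model_begin(struct model_cpu *c)
{
    c->ts_flag = 0;
    c->user_state_live = 0;
}

/* Lazy end: only arm the trap, defer the reload. */
void model_end_lazy(struct model_cpu *c) { c->ts_flag = 1; }

/* Eager end: reload the thread's state right away. */
void model_end_eager(struct model_cpu *c)
{
    c->user_state_live = 1;
    c->reloads++;
}

/* The thread touches the FPU later: with TS set this traps and reloads. */
void model_fpu_access(struct model_cpu *c)
{
    if (c->ts_flag) {
        c->ts_flag = 0;
        c->user_state_live = 1;
        c->reloads++;
    }
}

/* Count reloads for one kernel FPU section followed by optional FPU use. */
int model_reloads(int eager, int thread_uses_fpu_afterwards)
{
    struct model_cpu c = { 0, 1, 0 };

    model_begin(&c);
    if (eager)
        model_end_eager(&c);
    else
        model_end_lazy(&c);
    if (thread_uses_fpu_afterwards)
        model_fpu_access(&c);
    return c.reloads;
}
```

In this model the lazy path saves a reload only when the thread never touches the FPU again before being scheduled out; once most threads do use the FPU, both paths pay for one reload and the eager path avoids the trap.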

However, in both the "lazy" and "eager" FPU modes, there is still only one slot in the task_struct to save the FPU state, so nesting kernel_fpu_begin() will end up corrupting userspace's FPu state.
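One way to avoid nesting is to keep a single begin/end pair in the outermost caller and have helpers assume they already run inside that section. The following is a hedged sketch, not tested kernel code: stub_fpu_begin() and stub_fpu_end() are stand-ins so it builds in user space (a module would call the real kernel_fpu_begin() and kernel_fpu_end()), and foo_in_fpu_section() and compute_milli() are hypothetical names. The result leaves the section as an integer so that no double crosses the boundary.

```c
#include <assert.h>

/* Stand-ins for kernel_fpu_begin()/kernel_fpu_end() so this builds in
 * user space. The depth counter only exists to catch nesting. */
static int fpu_depth;

void stub_fpu_begin(void)
{
    assert(fpu_depth == 0);   /* nesting is a bug */
    fpu_depth++;
}

void stub_fpu_end(void)
{
    assert(fpu_depth == 1);   /* must pair with exactly one begin */
    fpu_depth--;
}

/* Helper: the caller must already be inside an FPU section.
 * The result is passed through a pointer rather than returned. */
void foo_in_fpu_section(unsigned num, double *out)
{
    assert(fpu_depth == 1);
    *out = 3.14 + num;
}

/* Entry point: one begin/end pair around all floating-point work.
 * Returns the result in thousandths, truncated toward zero. */
long compute_milli(unsigned z)
{
    long result;
    double y, t;

    stub_fpu_begin();
    y = 1.23;
    foo_in_fpu_section(z, &t);
    y -= t;
    result = (long)(y * 1000.0);  /* convert while still inside the section */
    stub_fpu_end();

    return result;
}
```

With z = 42 this computes 1.23 - 45.14, so compute_milli(42) comes out at about -43910. The name of the helper documents its precondition, which addresses the concern that a caller cannot tell whether a function expects to be wrapped.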

Answered 2013-04-17T19:09:00.730
0

I am annotating the asm/i387.h Linux source code (version 3.2) with my understanding of what is happening.

static inline void kernel_fpu_begin(void)
{
        /* get thread_info structure for current thread */
        struct thread_info *me = current_thread_info();

        /* preempt_count is incremented by 1
         * (preempt_count > 0 disables preemption,
         *  while preempt_count < 0 signifies a bug) */
        preempt_disable();

        /* check if FPU has been used before by this thread */
        if (me->status & TS_USEDFPU)
                /* save the FPU state to prevent clobbering of
                 * FPU registers, then reset the TS_USEDFPU flag */
                __save_init_fpu(me->task);
        else
                /* clear the CR0.TS bit to prevent
                 * unnecessary FPU task context saving */
                clts();
}

static inline void kernel_fpu_end(void)
{
        /* set CR0.TS bit (signifying the processor switched
         * to a new task) to enable FPU task context saving */
        stts();

        /* attempt to re-enable preemption
         * (preempt_count is decremented by 1);
         * reschedule thread if needed
         * (thread will not be preempted if preempt_count != 0) */
        preempt_enable();
}

The FXSAVE instruction is typically used to save the FPU state. However, I believe the memory destination stays the same every time kernel_fpu_begin is called within the same thread; unfortunately that would mean that FXSAVE will overwrite the previously saved FPU state.

Therefore I suspect that you CANNOT safely nest kernel_fpu_begin calls.

What I still cannot understand, though, is how the FPU state is being restored, since the kernel_fpu_end call does not appear to execute an FXRSTOR instruction. Also, why is the CR0.TS bit set in the kernel_fpu_end call if we are no longer using the FPU?

Answered 2013-04-15T19:02:15.627
-1

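Yes. Since you have defined some double variables and foo also returns a double value, you have to use the kernel_fpu_begin and kernel_fpu_end calls outside of foo as well.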


A similar question also covers this situation, including some cases where you can write the code without using the kernel_fpu_begin and kernel_fpu_end calls.

Answered 2013-04-11T06:47:32.513