c - Writing MIPS machine instructions and executing them from C

Question

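I'm trying to write some self-modifying code in C and MIPS.

Since I want to modify the code later on, I'm trying to write actual machine instructions (as opposed to inline assembly) and execute them. Someone told me that it would be possible to just malloc some memory, write the instructions there, point a C function pointer at it, and then jump to it. (I include the example below.)

I've tried this with my cross compiler (the Sourcery CodeBench toolchain), but it doesn't work (yes, in hindsight I suppose it does seem rather naive). How could I do this properly?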

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>


void inc(){
    int i = 41;
    uint32_t *addone = malloc(sizeof(*addone) * 2); //we malloc space for our asm function
    *(addone) = 0x20820001; // this is addi $v0 $a0 1, which adds one to our arg (gcc calling con)
    *(addone + 1) = 0x23e00000; //this is jr $ra

    int (*f)(int x) = addone; //our function pointer
    i = (*f)(i);
    printf("%d",i);    
}

int main(){
    inc();
    exit(0);
}

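I'm following the gcc calling convention here, where arguments are passed in $a0 and the result of a function is expected in $v0. I don't actually know whether the return address will be put into $ra (but I can't test that yet, since I can't compile). I'm using int for my instructions because I'm compiling for MIPS32, so a 32-bit int should be enough.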

score 2 · Accepted Answer

You are using pointers inappropriately. Or, to be more accurate, you aren't using pointers where you should be.

Try this on for size:

uint32_t *addone = malloc(sizeof(*addone) * 2);
addone[0] = 0x03e00008; // jr $ra (0x23e00000 would decode as addi $zero, $ra, 0)
addone[1] = 0x20820001; // addi $v0, $a0, 1 (runs in the branch delay slot of jr)

int (*f)(int x) = (int (*)(int))addone; // our function pointer
i = (*f)(i);
printf("%d\n",i);
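
As a sanity check on the two instruction words, here is a minimal sketch of an encoder for the two formats involved. The helper names (`encode_i`, `encode_r`, `mips_addi`, `mips_jr`) are made up for illustration; the field layouts are the standard MIPS32 I-type and R-type formats. It shows that `addi $v0, $a0, 1` is indeed 0x20820001, while `jr $ra` comes out as 0x03e00008, and 0x23e00000 is what `addi $zero, $ra, 0` would encode to.

```c
#include <stdint.h>

/* Register numbers used by the MIPS o32 convention. */
enum { REG_ZERO = 0, REG_V0 = 2, REG_A0 = 4, REG_RA = 31 };

/* I-type layout: opcode(6) | rs(5) | rt(5) | immediate(16) */
static uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, uint16_t imm)
{
    return (opcode << 26) | (rs << 21) | (rt << 16) | imm;
}

/* R-type layout with opcode 0: rs(5) | rt(5) | rd(5) | shamt(5) | funct(6) */
static uint32_t encode_r(uint32_t rs, uint32_t rt, uint32_t rd,
                         uint32_t shamt, uint32_t funct)
{
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}

/* addi rt, rs, imm  (opcode 0x08) */
uint32_t mips_addi(uint32_t rt, uint32_t rs, int16_t imm)
{
    return encode_i(0x08, rs, rt, (uint16_t)imm);
}

/* jr rs  (opcode 0, funct 0x08) */
uint32_t mips_jr(uint32_t rs)
{
    return encode_r(rs, 0, 0, 0, 0x08);
}
```

Building the words this way instead of typing hex constants makes it harder to mix up an opcode with a funct field, which is what seems to have happened with the `jr` word.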

You may also need to set the memory as executable after writing to it, but before calling it:

mprotect(addone, sizeof(int) * 2, PROT_READ | PROT_EXEC);

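mprotect requires its address argument to be page-aligned and fails with EINVAL otherwise, so to make this work you will likely also need to allocate a page-aligned block (4k or so), for example with posix_memalign or mmap, rather than using a plain malloc'd pointer.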
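
A minimal sketch of the page-aligned variant, assuming a Linux-like system with mmap and MAP_ANONYMOUS. The function name `make_code_page` is hypothetical. It writes the words while the page is read/write and only then switches it to read/execute; on MIPS you would still need to synchronize the instruction cache before calling into the page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy nwords instruction words into a fresh,
 * page-aligned anonymous mapping and make it read/execute.
 * Returns the mapping (length stored in *maplen), or NULL on failure. */
uint32_t *make_code_page(const uint32_t *words, size_t nwords, size_t *maplen)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0 || nwords == 0)
        return NULL;

    size_t bytes = nwords * sizeof(*words);
    /* round the length up to a whole number of pages */
    size_t len = (bytes + (size_t)pagesize - 1) / (size_t)pagesize * (size_t)pagesize;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memcpy(p, words, bytes);               /* write while still writable */

    if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, len);
        return NULL;
    }
    /* NOTE: no cache synchronization is done here; on MIPS that is a
     * separate step that must happen before the code is called. */
    *maplen = len;
    return p;
}
```

Because mmap always returns page-aligned memory, the alignment requirement of mprotect is satisfied without any pointer arithmetic, and no unrelated heap data ends up on an executable page.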

score 2 · Accepted Answer

You also need to make sure that the memory in question is executable, and that it gets flushed from the dcache after you write it and loaded into the icache before you execute it. How to do that depends on the OS running on your MIPS machine.

On Linux, you would use the mprotect system call to make the memory executable, and the cacheflush system call to do the cache flushing.

edit

Example:

#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/cachectl.h>   // declares cacheflush() and ICACHE/DCACHE on MIPS Linux

// round a pointer down to the start of the page that contains it
#define PALIGN(P)  ((char *)((uintptr_t)(P) & ~(pagesize-1)))
uintptr_t  pagesize;

void inc(){
    int i = 41;
    uint32_t *addone = malloc(sizeof(*addone) * 2); //we malloc space for our asm function
    *(addone) = 0x03e00008;     // jr $ra (0x23e00000 would decode as addi $zero, $ra, 0)
    *(addone + 1) = 0x20820001; // addi $v0, $a0, 1: runs in the delay slot of jr

    pagesize = sysconf(_SC_PAGESIZE);  // only needs to be done once
    mprotect(PALIGN(addone), PALIGN(addone+1)-PALIGN(addone)+pagesize,
             PROT_READ | PROT_WRITE | PROT_EXEC);
    cacheflush(addone, 2*sizeof(*addone), ICACHE|DCACHE);

    int (*f)(int x) = (int (*)(int))addone; //our function pointer
    i = (*f)(i);
    printf("%d",i);
}

Note that we make the entire page(s) containing the code both writable and executable. That's because memory protection works per page, and we want malloc to be able to continue to use the rest of the page(s) for other things. You could instead use valloc or memalign to allocate entire pages, in which case you could make the code read-only executable safely.
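
The page arithmetic behind that mprotect call can be sketched on plain integers. The struct and function names here (`page_range`, `covering_pages`) are made up for illustration, and the page size is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Start address and byte length of the whole pages covering a buffer. */
struct page_range {
    uintptr_t start;
    size_t len;
};

/* Compute the page-aligned range that covers [addr, addr + nbytes).
 * pagesize is assumed to be a power of two (e.g. 4096). */
struct page_range covering_pages(uintptr_t addr, size_t nbytes, uintptr_t pagesize)
{
    uintptr_t mask = pagesize - 1;
    uintptr_t start = addr & ~mask;                  /* round down */
    uintptr_t end = (addr + nbytes + mask) & ~mask;  /* round up   */
    struct page_range r = { start, (size_t)(end - start) };
    return r;
}
```

For example, with 4096-byte pages an 8-byte buffer at 0x1234 is covered by the single page starting at 0x1000, while the same buffer at 0x1ffc straddles a boundary and needs the two pages from 0x1000 to 0x2fff.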

score 2 · Accepted Answer

The OP's code as written compiles without errors with Codesourcery mips-linux-gnu-gcc.

As others have mentioned above, self modifying code on MIPS requires the instruction cache to be synchronized with the data cache after the code is written. The MIPS32R2 version of the MIPS architecture added the SYNCI instruction which is a user mode instruction that does what you need here. All modern MIPS CPUs implement MIPS32R2, including SYNCI.

Memory protection is an option on MIPS, but most MIPS CPUs are not built with this feature selected, so using the mprotect system call is likely not needed on most real MIPS hardware.

Note that if you use any optimization besides -O0 the compiler can and does optimize away the stores to *addone and the function call, which breaks your code. Using the volatile keyword prevents the compiler from doing this.

The following code generates correct MIPS assembly, but I don't have MIPS hardware handy to test it on:

int inc() {
    volatile int i = 41;
    // malloc 8 x sizeof(int) = 32 bytes. malloc does not promise cache-line
    // alignment, but its (at least) 8-byte alignment keeps the two
    // instruction words below inside a single cache line.
    volatile int *addone = malloc(sizeof(*addone) * 8);
    *(addone)     = 0x03e00008; // jr $ra (0x23e00000 would decode as addi $zero, $ra, 0)
    *(addone + 1) = 0x20820001; // addi $v0, $a0, 1: runs in the delay slot of jr
    // use a SYNCI instruction to flush the data written above from
    // the D cache and to flush any stale data from the I cache
    asm volatile("synci 0(%0)": : "r" (addone));
    int (*f)(int x) = (int (*)(int))addone; //our function pointer
    int j = (*f)(i);
    return j;
}

int main(){
    int k = 0;
    k = inc();
    printf("%d",k);    
    exit(0);
}
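
If you would rather not hand-write the synchronization instruction, GCC and Clang offer the `__builtin___clear_cache(begin, end)` builtin, which is documented to emit whatever the target needs (and nothing on targets that need no explicit flush). A minimal sketch, with `publish_code` being a made-up helper name; whether the builtin ends up using SYNCI or a system call on a given MIPS toolchain is something I have not verified.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy instruction words into buf, then ask the
 * compiler to make the range visible to instruction fetch. */
void publish_code(uint32_t *buf, const uint32_t *words, size_t nwords)
{
    size_t bytes = nwords * sizeof(*words);
    memcpy(buf, words, bytes);
    /* GCC/Clang builtin; may expand to nothing on some targets */
    __builtin___clear_cache((char *)buf, (char *)buf + bytes);
}
```

This only handles cache coherence. Whether the buffer is allowed to be executed is still a separate question.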

score 1 · Accepted Answer

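Calling a function is considerably more complicated than just jumping to an instruction.

How are arguments passed? Are they stored in registers, or pushed onto the call stack?
How is a value returned?
Where is the return address for the return jump placed? If you have a recursive function, $ra alone doesn't cut it.
When the called function finishes, is the caller or the callee responsible for popping the stack frame?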

Different calling conventions have different answers to these questions. Though I've never tried anything like what you're doing, I would assume you'd have to write your machine code to match a convention, then tell the compiler that your function pointer uses that convention (different compilers have different ways of doing this - gcc does it with function attributes).
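
For the simple leaf routine in the question, the answers under the MIPS o32 convention are: first argument in $a0, result in $v0, return address in $ra. The toy model below (entirely hypothetical, supporting only `addi`, `jr` and `nop`) walks through such a routine and also models the branch delay slot, i.e. the instruction after `jr` still executes before control leaves. It is a sketch for reasoning about the instruction sequence, not an emulator.

```c
#include <stdint.h>

enum { REG_V0 = 2, REG_A0 = 4, REG_RA = 31 };

/* Toy model: run a leaf routine made only of addi, jr and nop, honoring
 * the branch delay slot. code is treated as loaded at address 0 and the
 * caller's return address is a made-up sentinel. Returns $v0, or -1 on an
 * unsupported instruction, on running past the end of code, or when the
 * step limit is hit. Overflow traps of the real addi are not modeled. */
int32_t run_leaf(const uint32_t *code, uint32_t nwords, int32_t arg)
{
    const uint32_t RETURN_SENTINEL = 0xfffffff0u;
    uint32_t reg[32] = {0};
    uint32_t pc = 0, next_pc = 4;

    reg[REG_A0] = (uint32_t)arg;     /* first argument goes in $a0 */
    reg[REG_RA] = RETURN_SENTINEL;   /* the caller's jal would set $ra */

    for (int steps = 0; steps < 64; steps++) {
        if (pc == RETURN_SENTINEL)
            return (int32_t)reg[REG_V0];   /* result comes back in $v0 */
        if (pc / 4 >= nwords)
            return -1;                     /* ran off the end of code */

        uint32_t insn = code[pc / 4];
        uint32_t op = insn >> 26;
        uint32_t rs = (insn >> 21) & 31, rt = (insn >> 16) & 31;
        uint32_t after = next_pc + 4;      /* default: fall through */

        if (insn == 0) {
            /* nop */
        } else if (op == 0x08) {           /* addi rt, rs, imm */
            reg[rt] = reg[rs] + (uint32_t)(int32_t)(int16_t)(insn & 0xffff);
        } else if (op == 0 && (insn & 0x3f) == 0x08) {   /* jr rs */
            after = reg[rs];               /* taken after the delay slot */
        } else {
            return -1;
        }
        reg[0] = 0;                        /* $zero is hard-wired to 0 */
        pc = next_pc;
        next_pc = after;
    }
    return -1;
}
```

In this model, the sequence `jr $ra; addi $v0, $a0, 1` returns 42 for an argument of 41, because the `addi` sits in the delay slot. The sequence `addi; jr` with nothing after it runs off the end of the buffer, which is why a two-word buffer needs the `jr` first or a third word holding a `nop`.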
