So, for a class assignment we are writing a Y86 (toy processor) disassembler in C++. Easy enough, I have almost everything done, except for disassembling instructions into a .quad directive.

The .quad directive takes a decimal or hexadecimal value and emits an 8-byte "instruction" that represents that value. (It's not really an instruction; .quad is the only thing in the processor that takes 8 bytes, so if you come across an 8-byte line you automatically know you're looking at a quad.) Here's an example, since my explanation may not be great:

https://image.prntscr.com/image/h5xAoE4YRryl7HSJ13o5Yg.png

It's easy enough to see that the first two quads there are bit shifted 2 to the right on disassembly, but then the next two are bit-shifted 2 to the left. What's the pattern I'm missing here? Here are some more examples of disassembled quads:

0x0a0: 0300000000000000     | value:            .quad   3
0x0a8:                      | list:
0x0a8: ffffffffffffffff     |                   .quad   -1
0x0b0: 0300000000000000     |                   .quad   3
0x0b8: 0500000000000000     |                   .quad   5
0x0c0: 0900000000000000     |                   .quad   9
0x0c8: 0300000000000000     |                   .quad   3
0x0d0: 2800000000000000     |                   .quad   40
0x0d8: 3000000000000000     |                   .quad   48
0x0e0: fcffffffffffffff     |                   .quad   -4
0x0e8: 0300000000000000     |                   .quad   3
0x0f0: 0700000000000000     |                   .quad   7
0x0f8: 0200000000000000     |                   .quad   2
0x100: 0300000000000000     |                   .quad   3
0x108: f6ffffffffffffff     |                   .quad   -10
0x110: f8ffffffffffffff     |                   .quad   -8

Essentially, I'm trying to write an algorithm that will take what's on the left in those screenshots (assembled processor code) and return ".quad 0xblahblah," but I can't figure out what it's doing to the hex values in order to get them like that.

My current C++ code is as follows:

            unsigned int x;
            stringstream oss;
            oss << "0x" << std::uppercase << std::left << std::setw(20) << std::hex << hex;
            string result = oss.str();

            std::istringstream converter(result);
            converter >> std::hex >> x;

But when it should be returning the .quads you see in the first screenshot I posted, it's returning this:

0x0d000d000d000000    
0xc000c000c0000000    
0x000b000b000b0000    
0x00a000a000a00000   

Which is the exact value of the assembled machine code, when I need to figure out what it's doing to end up with

0x000d000d000d0000    
0x00c000c000c00000    
0x0b000b000b000000    
0xa000a000a0000000  

As in the example screenshot.

1 Answer

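"It's easy enough to see that the first two quads there are bit shifted 2 to the right on disassembly, but then the next two are bit-shifted 2 to the left."

There is no 2-bit shift. If you don't look closely, it can appear to be a shift of 2 nibbles (8 bits).

"What's the pattern I'm missing here?"

It's not a shift at all; it's reversed byte ordering.

Try experimenting with a counting pattern such as 0123456789AB rather than a repeating pattern such as 000A000A000A.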
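
A counting pattern makes the reversal easy to see. The sketch below is only an illustration: `reverseHexBytes` is a hypothetical helper name, and it assumes the machine code is available as a string of hex digits with two digits per byte.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any Y86 tool): reverse the byte order of a
// hex string. Each byte is two hex digits, so the digit pairs stay intact and
// only the order of the pairs is flipped.
std::string reverseHexBytes(const std::string& hexBytes) {
    if (hexBytes.size() % 2 != 0) {
        throw std::invalid_argument("expected an even number of hex digits");
    }
    std::string out;
    out.reserve(hexBytes.size());
    // Walk from the last byte to the first, copying two digits at a time.
    for (std::size_t i = hexBytes.size(); i >= 2; i -= 2) {
        out.append(hexBytes, i - 2, 2);
    }
    return out;
}
```

With the counting pattern `0123456789abcdef`, this returns `efcdab8967452301`: every byte keeps its two digits, but the bytes come out in the opposite order, which is something a shift could not produce.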

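Also note the most significant word, which is 0x0000 in nearly all of the examples. It appears at the end of the byte sequence but becomes leading zeros in the decoded value (which are not even printed).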
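
One way to put this into practice is to rebuild the 64-bit value from its bytes, least significant byte first, and let the stream drop the leading zeros when printing. This is a minimal sketch, not the assignment's required approach: `decodeQuad` and `formatQuad` are hypothetical names, and the input is assumed to be exactly 16 hex digits taken from the assembled line.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode 16 hex digits, stored least significant byte
// first (little-endian), into a 64-bit value.
std::uint64_t decodeQuad(const std::string& hexBytes) {
    if (hexBytes.size() != 16) {
        throw std::invalid_argument("expected exactly 16 hex digits");
    }
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        // Byte i of the sequence contributes bits 8*i .. 8*i+7 of the value.
        std::uint64_t byte = std::stoull(hexBytes.substr(2 * i, 2), nullptr, 16);
        value |= byte << (8 * i);
    }
    return value;
}

// Hypothetical helper: print the value in hex; leading zeros are not emitted.
std::string formatQuad(std::uint64_t value) {
    std::ostringstream out;
    out << ".quad 0x" << std::hex << value;
    return out.str();
}
```

For the bytes `0d000d000d000000`, `formatQuad(decodeQuad(...))` yields `.quad 0xd000d000d`, which is the same number as 0x000d000d000d with its leading zeros dropped. Casting the decoded value to `std::int64_t` gives the signed form used in the listing, for example -4 for `fcffffffffffffff`.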

Answered 2019-02-05T17:33:30.940