Internal Encoding of x86 Encodings
For cached decodings, this emulator uses the layout:
- 8 bits: operation;
- 8 bits:
  - b7: address size, 16- or 32-bit;
  - b6: set if this instruction has a displacement or offset attached;
  - b5: set if this instruction has an immediate operand attached;
  - [b4, b0]: the source operand's `Source`;
- 16 bits:
  - [b15, b14]: this instruction's data size;
  - [b13, b10]: the length of this instruction in bytes (or 0 to indicate a length extension word is present);
  - [b9, b5]: the top five bits of this instruction's SIB;
  - [b4, b0]: the destination operand's `Source`.
The low three bits of the SIB are stored in the low three bits of its operand's `Source` if necessary; the `Source` enum treats all values from `11000b` upwards as having the equivalent meaning of `Indirect` for this reason.
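As a rough illustration of the base layout, the sketch below extracts each field with shifts and masks. The struct `PackedHeader`, its member names and every helper function are hypothetical stand-ins rather than the emulator's own declarations, and the meaning of each value of b7 and of the data size field is left uninterpreted because it is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the 8 + 8 + 16 bit layout; not the emulator's actual type.
struct PackedHeader {
    std::uint8_t operation;
    std::uint8_t flags_and_source;        // b7, b6, b5 flags; [b4, b0] source operand.
    std::uint16_t sizes_and_destination;  // data size, length, SIB top bits, destination operand.
};

// Second byte.
constexpr bool address_size_bit(const PackedHeader &h) { return (h.flags_and_source & 0x80) != 0; }
constexpr bool has_displacement(const PackedHeader &h) { return (h.flags_and_source & 0x40) != 0; }
constexpr bool has_immediate(const PackedHeader &h) { return (h.flags_and_source & 0x20) != 0; }
constexpr unsigned source_field(const PackedHeader &h) { return h.flags_and_source & 0x1f; }

// Trailing 16 bits.
constexpr unsigned data_size_field(const PackedHeader &h) { return (h.sizes_and_destination >> 14) & 0x3; }
constexpr unsigned length_field(const PackedHeader &h) { return (h.sizes_and_destination >> 10) & 0xf; }
constexpr unsigned sib_top_field(const PackedHeader &h) { return (h.sizes_and_destination >> 5) & 0x1f; }
constexpr unsigned destination_field(const PackedHeader &h) { return h.sizes_and_destination & 0x1f; }

// Operand values from 11000b upwards all mean 'indirect'; their low three
// bits then carry the low three bits of the SIB.
constexpr bool is_indirect(unsigned operand) { return operand >= 0x18; }

// Reassembles the full SIB byte from the header plus the indirect operand.
inline unsigned sib(const PackedHeader &h, unsigned indirect_operand) {
    assert(is_indirect(indirect_operand));
    return (sib_top_field(h) << 3) | (indirect_operand & 0x7);
}
```

For example, a header whose second byte is `0x7A` and whose final 16 bits are `0x4EA2` has both a displacement and an immediate, a length of 3, and an indirect source operand that reassembles to a SIB of `0xAA`.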
Extension words are 16 bits in length for 16-bit decodings and 32 bits in length for 32-bit decodings. Up to three may be present, in the order:
- an immediate operand;
- an offset or displacement; and
- a length extension.
If a length extension is present then it is laid out as:
- [b15/b31, b6]: instruction length in bytes;
- [b5, b4]: repetition attached to this instruction — repe/repne;
- [b3, b1]: segment override attached to this instruction;
- b0: whether the lock prefix was found.
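The length extension can be sketched in the same way. Here `LengthExtension` and `make_length_extension` are hypothetical names, the word type is a template parameter so that one definition covers both 16- and 32-bit decodings, and the repetition and segment fields are returned as raw numbers because their enumerations are not given here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of a length extension word; WordT is std::uint16_t for
// 16-bit decodings and std::uint32_t for 32-bit decodings.
template <typename WordT>
struct LengthExtension {
    WordT value;

    constexpr unsigned length() const { return static_cast<unsigned>(value >> 6); }
    constexpr unsigned repetition() const { return static_cast<unsigned>((value >> 4) & 0x3); }
    constexpr unsigned segment_override() const { return static_cast<unsigned>((value >> 1) & 0x7); }
    constexpr bool lock() const { return (value & 0x1) != 0; }
};

// Packs the four fields back into a single word.
template <typename WordT>
constexpr LengthExtension<WordT> make_length_extension(
    unsigned length, unsigned repetition, unsigned segment_override, bool lock) {
    return LengthExtension<WordT>{static_cast<WordT>(
        (length << 6) |
        ((repetition & 0x3u) << 4) |
        ((segment_override & 0x7u) << 1) |
        (lock ? 1u : 0u))};
}
```

Because the length occupies everything from b6 upwards, a 16-bit extension word leaves ten bits for it and a 32-bit word leaves twenty-six.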
Therefore each decoded instruction is:
- between 4 and 10 bytes for 16-bit decodings; and
- between 4 and 16 bytes for 32-bit decodings.
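Those bounds follow from four base bytes plus up to three extension words. The hypothetical helper below (`packed_size` here is an illustrative free function, not the emulator's member) restates that arithmetic and checks the limits at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: four base bytes plus one extension word for each
// optional item that is present. word_bytes is 2 for 16-bit decodings and
// 4 for 32-bit decodings.
constexpr std::size_t packed_size(
    std::size_t word_bytes,
    bool has_immediate,
    bool has_displacement,
    bool has_length_extension) {
    return 4 + word_bytes * (
        std::size_t(has_immediate) +
        std::size_t(has_displacement) +
        std::size_t(has_length_extension));
}

static_assert(packed_size(2, false, false, false) == 4, "16-bit minimum");
static_assert(packed_size(2, true, true, true) == 10, "16-bit maximum");
static_assert(packed_size(4, false, false, false) == 4, "32-bit minimum");
static_assert(packed_size(4, true, true, true) == 16, "32-bit maximum");
```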
`sizeof(Instruction)` is therefore either `10` or `16`; `Instruction` provides `packing_size()` to give the number of bytes that are actually in use. `Instruction` is plain-old-data with a trivial destructor, so instances can safely be placed into memory such that instruction n+1 begins at the address of instruction n plus its `packing_size()`. Extension words therefore need be paid for only when required.
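A minimal sketch of that packing scheme follows, using a hypothetical four-byte `Header` in place of the real `Instruction` and a byte vector as the backing store. The size calculation is a guess at what `packing_size()` does, inferred from the layout on this page, and the extension words are left zero-filled since only the walking logic is of interest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the first four bytes of a packed instruction.
struct Header {
    std::uint8_t operation;
    std::uint8_t flags_and_source;        // b6: displacement; b5: immediate.
    std::uint16_t sizes_and_destination;  // [b13, b10]: length, 0 if extended.
};
static_assert(sizeof(Header) == 4, "sketch assumes no padding");

// Assumed equivalent of packing_size(): base bytes plus extension words.
template <typename WordT>
std::size_t packed_size(const Header &h) {
    const bool has_immediate = (h.flags_and_source & 0x20) != 0;
    const bool has_displacement = (h.flags_and_source & 0x40) != 0;
    const bool has_length_extension = ((h.sizes_and_destination >> 10) & 0xf) == 0;
    return sizeof(Header) + sizeof(WordT) *
        (std::size_t(has_immediate) + std::size_t(has_displacement) + std::size_t(has_length_extension));
}

// Appends a record, reserving only the bytes it actually uses.
template <typename WordT>
void append(std::vector<std::uint8_t> &stream, const Header &h) {
    const std::size_t offset = stream.size();
    stream.resize(offset + packed_size<WordT>(h));  // New bytes are zero-filled.
    std::memcpy(&stream[offset], &h, sizeof(h));
}

// Walks the stream: record n+1 starts at record n plus its packed size.
template <typename WordT>
std::vector<std::uint8_t> operations_in(const std::vector<std::uint8_t> &stream) {
    std::vector<std::uint8_t> operations;
    std::size_t offset = 0;
    while (offset + sizeof(Header) <= stream.size()) {
        Header h;
        std::memcpy(&h, &stream[offset], sizeof(h));
        operations.push_back(h.operation);
        offset += packed_size<WordT>(h);
    }
    return operations;
}
```

Copying through `std::memcpy` is valid only because the record type is trivially copyable, which is the same property the text relies on when it calls `Instruction` plain-old-data.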