-
-
Notifications
You must be signed in to change notification settings - Fork 430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decouple operand decoding #278
Conversation
0633749
to
1241203
Compare
a3e7003
to
ad99855
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent PR, all good changes. Reviewed everything except for the fuzzing logic, which I want to take a look at when I'm less tired, presumably sometime on the weekend.
Do we intend to keep the API as-is, or do we turn the ZydisDecoderDecodeBuffer
function into ZydisDecoderDecodeBufferEx
later, providing a simpler variant that automatically decodes the operands as well, naming it ZydisDecoderDecodeBuffer
later? I originally thought this would be the way to go, but looking at the formatter interface, we'd also have to support the struct filled by the simple variant in that one, resulting in an explosion of complexity that I'd strongly prefer to avoid.
I'm kind of leaning towards keeping the interface as proposed in this PR as our proper interface and instead providing an even simpler function that fills an even larger struct that also does formatting in one wash, the audience of which would primarily be beginner users.
In any case, I'd be happy to keep as-is and discuss this in a separate issue later.
/** | ||
* Internal EVEX-specific information. | ||
*/ | ||
struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you intentionally not ifdef away these when EVEX support is compiled out? I guess we don't really need to if the struct is purely internal, just wondering if you meant to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was intentional. I thought about this for quite some time. As we rely on the user to allocate our structs, we have to make sure the input struct size matches our internal expectations. In a pure C/C++ environment, we could probably ifdef
-ing these members away without any problem, but that's a different story for bindings. We would always have to check if the struct size is matching (e.g. going the Windows API way and introducing a size
field in the ZydisDecodedInstruction
that needs to be initialized by the user - extra complexity which I don't really like).
I think we should discuss this in detail. One good solution might be to introduce a ZYDIS_SIZE_OPTIMIZED_UNSAFE_FOR_BINDINGS
option. This would indicate that Zydis
is used in a pure C/C++ project and always built on demand (-> we know the feature defines of headers and code files are the same). This would as well allow us to e.g. change the ZydisDecodedOperand
struct to unions
or to introduce bit-fields in the public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should discuss this in detail.
Agreed, but this definitely deserves a dedicated issue for proper discussion -- let's not do that in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not do that in this PR
Will create a separate issue tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice changes :)
Some thoughs about API (@athre0z):
Do we intend to keep the API as-is, or do we turn the ZydisDecoderDecodeBuffer function into ZydisDecoderDecodeBufferEx later, providing a simpler variant that automatically decodes the operands as well, naming it ZydisDecoderDecodeBuffer later? I originally thought this would be the way to go, but looking at the formatter interface, we'd also have to support the struct filled by the simple variant in that one, resulting in an explosion of complexity that I'd strongly prefer to avoid.
I'm kind of leaning towards keeping the interface as proposed in this PR as our proper interface and instead providing an even simpler function that fills an even larger struct that also does formatting in one wash, the audience of which would primarily be beginner users.
What about this:
ZydisDecoderDecodeBuffer
-> function that decodes instruction + operands (simple utility function)ZydisDecoderDecodeInstruction
-> minimal decoding (renamedZydisDecoderDecodeBuffer
from this PR)ZydisDecoderDecodeOperands
-> as is in this PR
In this concept signature for ZydisDecoderDecodeBuffer
would change to accept operands, so it would be another breaking change since v3. This has some advantages though:
- problem you highlighted (with having one big struct) is avoided
- function still does what it did in v3 - decodes instruction and operands, just the output is split into new v4-fashion
- this is easy to explain in porting guide and keeps migration rather staigtforward. For most existing users documentation of new
ZydisDecoderDecodeBuffer
would introduce them to a new concept of decoding instruction and operands separately while immediately informing them it's optional to do so because they can keep usingZydisDecoderDecodeBuffer
after making minimal adjustments to existing codebases.
@@ -716,6 +716,7 @@ ZYAN_STATIC_ASSERT(ZYDIS_RW_ACTION_REQUIRED_BITS <= 8); | |||
# define ZYDIS_INSTRUCTION_DEFINITION_BASE \ | |||
ZyanU16 mnemonic ZYAN_BITFIELD(ZYDIS_MNEMONIC_REQUIRED_BITS); \ | |||
ZyanU8 operand_count ZYAN_BITFIELD( 4); \ | |||
ZyanU8 operand_count_visible ZYAN_BITFIELD( 3); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this change, it allows converting some obsolete checks inside ZydisIsDefinitionCompatible
into assertion.
I like the idea of providing one function that combines instruction- and operand-decoding. However I would probably just name the utility function to something like |
Both sound good to me. However if latter one is preferred I'd go with |
a8fcd64
to
1199d4b
Compare
1199d4b
to
e8f71e6
Compare
else | ||
if (!operand->ignore_seg_override && | ||
instruction->attributes & ZYDIS_ATTRIB_HAS_SEGMENT_ES) | ||
{ | ||
operands[i].mem.segment = ZYDIS_REGISTER_ES; | ||
} | ||
else | ||
if (!operand->ignore_seg_override && | ||
instruction->attributes & ZYDIS_ATTRIB_HAS_SEGMENT_FS) | ||
{ | ||
operands[i].mem.segment = ZYDIS_REGISTER_FS; | ||
} | ||
else | ||
if (!operand->ignore_seg_override && | ||
instruction->attributes & ZYDIS_ATTRIB_HAS_SEGMENT_GS) | ||
{ | ||
operands[i].mem.segment = ZYDIS_REGISTER_GS; | ||
} | ||
else | ||
{ | ||
if (operands[i].mem.segment == ZYDIS_REGISTER_NONE) | ||
{ | ||
if ((operands[i].mem.base == ZYDIS_REGISTER_RSP) || | ||
(operands[i].mem.base == ZYDIS_REGISTER_RBP) || | ||
(operands[i].mem.base == ZYDIS_REGISTER_ESP) || | ||
(operands[i].mem.base == ZYDIS_REGISTER_EBP) || | ||
(operands[i].mem.base == ZYDIS_REGISTER_SP) || | ||
(operands[i].mem.base == ZYDIS_REGISTER_BP)) | ||
{ | ||
operands[i].mem.segment = ZYDIS_REGISTER_SS; | ||
} | ||
else | ||
{ | ||
operands[i].mem.segment = ZYDIS_REGISTER_DS; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of the rightward drift here, perhaps just going with inverted conditions and goto
s here might be preferable? Could also do do { .. } while (ZYAN_FALSE);
and break out of it, but that's essentially just a less honest goto.
f756d6a
to
e8f71e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! :-) One remaining open comment, but it's just style / not a blocker.
Edit: Ah, wait! We should also update the porting guide.
Will update the porting guide before merge. Open points for later:
|
Closes #110