Adding support for 32-bit architectures #10
Comments
Thanks for the interest! That class is used for 32-bit PE support, but as you've noticed, there is no corresponding ELF implementation and no 32-bit ARM support at all. Relatively speaking, adding a new ABI is straightforward:
For 32-bit ARM, there's a little bit more to do because we don't have any support for it yet:
Just a heads up, I'm currently inquiring internally about how to accept an outside contribution to this repository and it will probably require you to sign a CLA. |
Thank you for your response. |
I've managed to implement the x86 32-bit ELF support, and all the test cases pass. The only thing that I'm missing from your list is the part about updating The code is available here |
That code is used to determine if a call instruction has a known target or is indirect. You can use mc-asm's command line interface to print out what LLVM instruction names get used for a given assembly input. Here's an example for x86-64:
$ echo "call direct; call rax; call qword ptr [rax]" | python3 -m mcasm --syntax=intel --target=x86_64-pc-linux --filter=emit_instruction -
⚡️ emit_instruction
├── state (ParserState)
│ └── loc (SourceLocation)
│ ├── lineno = 1
│ └── offset = 1
├── inst (Instruction)
│ ├── desc (InstructionDesc)
│ │ ├── implicit_uses (list)
│ │ │ ├── [0] (Register)
│ │ │ │ ├── id = 58
│ │ │ │ ├── is_physical_register = True
│ │ │ │ └── name = 'RSP'
│ │ │ └── [1] (Register)
│ │ │ ├── id = 66
│ │ │ ├── is_physical_register = True
│ │ │ └── name = 'SSP'
│ │ └── is_call = True
│ ├── name = 'CALL64pcrel32'
│ ├── opcode = 661
│ └── operands (list)
│ └── [0] (SymbolRefExpr)
│ ├── location (SourceLocation)
│ │ ├── lineno = 1
│ │ └── offset = 6
│ ├── symbol (Symbol)
│ │ └── name = 'direct'
│ └── variant_kind = SymbolRefExpr.VariantKind.None_
├── data = b'\xe8\x00\x00\x00\x00'
└── fixups (list)
└── [0] (Fixup)
├── kind_info (FixupKindInfo)
│ ├── bit_size = 32
│ ├── is_pc_rel = 1
│ └── name = 'reloc_branch_4byte_pcrel'
├── offset = 1
└── value (BinaryExpr)
├── lhs (SymbolRefExpr)
│ ├── location (SourceLocation)
│ │ ├── lineno = 1
│ │ └── offset = 6
│ ├── symbol (Symbol)
│ │ └── name = 'direct'
│ └── variant_kind = SymbolRefExpr.VariantKind.None_
├── opcode = BinaryExpr.Opcode.Add
└── rhs (ConstantExpr)
└── value = -4
⚡️ emit_instruction
├── state (ParserState)
│ └── loc (SourceLocation)
│ ├── lineno = 1
│ └── offset = 14
├── inst (Instruction)
│ ├── desc (InstructionDesc)
│ │ ├── implicit_uses (list)
│ │ │ ├── [0] (Register)
│ │ │ │ ├── id = 58
│ │ │ │ ├── is_physical_register = True
│ │ │ │ └── name = 'RSP'
│ │ │ └── [1] (Register)
│ │ │ ├── id = 66
│ │ │ ├── is_physical_register = True
│ │ │ └── name = 'SSP'
│ │ └── is_call = True
│ ├── name = 'CALL64r'
│ ├── opcode = 662
│ └── operands (list)
│ └── [0] (Register)
│ ├── id = 49
│ ├── is_physical_register = True
│ └── name = 'RAX'
├── data = b'\xff\xd0'
└── fixups = []
⚡️ emit_instruction
├── state (ParserState)
│ └── loc (SourceLocation)
│ ├── lineno = 1
│ └── offset = 24
├── inst (Instruction)
│ ├── desc (InstructionDesc)
│ │ ├── implicit_uses (list)
│ │ │ ├── [0] (Register)
│ │ │ │ ├── id = 58
│ │ │ │ ├── is_physical_register = True
│ │ │ │ └── name = 'RSP'
│ │ │ └── [1] (Register)
│ │ │ ├── id = 66
│ │ │ ├── is_physical_register = True
│ │ │ └── name = 'SSP'
│ │ ├── is_call = True
│ │ └── may_load = True
│ ├── name = 'CALL64m'
│ ├── opcode = 659
│ └── operands (list)
│ ├── [0] (Register)
│ │ ├── id = 49
│ │ ├── is_physical_register = True
│ │ └── name = 'RAX'
│ ├── [1] = 1
│ ├── [2] (Register)
│ ├── [3] = 0
│ └── [4] (Register)
├── data = b'\xff\x10'
└── fixups = []
You can see how there are different LLVM instruction names even though the assembly mnemonic is the same. My hope is that 32-bit ARM also has different LLVM instruction names for direct calls versus indirect calls, but I'm only really familiar with 64-bit ARM. |
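To make the direct-versus-indirect distinction concrete, here is a minimal sketch of a name-based classifier. It is not the code from the repository: the function and set names are hypothetical, and the only instruction names it knows are the three that appear in the x86-64 output above. A 32-bit x86 or ARM port would need the names reported by mc-asm for those targets.

```python
# Hypothetical helper, not part of gtirb-rewriting or mc-asm.
# The names below are taken only from the x86-64 output shown above.
DIRECT_CALL_NAMES = {"CALL64pcrel32"}          # call direct
INDIRECT_CALL_NAMES = {"CALL64r", "CALL64m"}   # call rax / call qword ptr [rax]


def classify_call(inst_name: str) -> str:
    """Classify an LLVM instruction name as a direct or indirect call."""
    if inst_name in DIRECT_CALL_NAMES:
        return "direct"
    if inst_name in INDIRECT_CALL_NAMES:
        return "indirect"
    # Anything not seen in the sample output is left unclassified.
    return "unknown"


if __name__ == "__main__":
    for name in ("CALL64pcrel32", "CALL64r", "CALL64m"):
        print(name, "->", classify_call(name))
```

Names that were not observed fall through to "unknown" rather than being guessed, which keeps the sketch honest about what the sample output actually shows.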
Another thing I've noticed is that mc-asm will probably need a change to expose the ISA-specific MCExprs used in fixups. For example, there's important data missing when parsing this assembly:
... but I'm not familiar enough with 32-bit ARM to know if these relocations are commonly used or not. |
Hello! Was looking at using GTIRB-rewriting on some 32-bit binaries and stumbled across this thread. Reading through this thread, it seems that ARM32 is not 100% implemented. But it seems like x86 is? If so, could we please merge in the x86 support? That would be grand! |
@jkrshnmenon, is there any update on this? I can dig up a CLA for you to sign to get at least the 32-bit x86 support merged if you think that's ready. |
@jranieri-grammatech Apologies for the lack of communication here. But I think the 32-bit x86 support is ready to get merged. |
@jkrshnmenon The CLA has been added to the repository and contains instructions about where to submit it. |
@jranieri-grammatech Thank you for letting me know. I will run it by my employer just to make sure everything is in order before signing it. |
Hi, I noticed there's been some progress on 32-bit ARM support in the project, which I've reviewed with interest. Could you provide an update on the current status? If I’d like to contribute further improvements, are there specific areas or issues I should focus on? |
Hi,
I was looking into using gtirb-rewriting along with ddisasm on some 32-bit applications (x86 and ARM), and I saw that ddisasm supports both of these architectures, but gtirb-rewriting does not.
I see that an ABI class exists that is intended for the x86_32 architecture, but it doesn't seem to be used anywhere.
I wanted to ask how much effort you expect it would take to implement support for 32-bit x86 and ARM applications?
If it is a reasonable amount, I'd like to give it a shot if I can get some guidance on what needs to be done.
Looking forward to hearing from you.