neko-booted binaries cannot be processed by install_name_tool on Mac #130

Open
andyli opened this issue May 9, 2016 · 12 comments

@andyli
Member

andyli commented May 9, 2016

install_name_tool: the __LINKEDIT segment does not cover the end of the file (can't be processed)

This affects the homebrew package: Homebrew/homebrew-core#982

@kulick: Do you think that's something that can be solved by properly handling the ELF header (or its equivalent) on Mac?

@andyli
Member Author

andyli commented May 9, 2016

Looks like we have to fix the Mach-O headers in some way.
The Mach-O reference page is full of info that I don't really want to understand... sigh. And that page doesn't render correctly even on Safari...

@UniqMartin

Homebrew maintainer here. I know almost nothing about Neko (though I'm curious and interested to learn), but I know a good deal of Mach-O internals. Feel free to ping me if I can be of any help in resolving this issue and making Neko produce valid Mach-O binaries.

@andyli
Member Author

andyli commented May 9, 2016

@UniqMartin Your help would be fantastic! The nekotools boot source code is pretty short. What it does is basically write these in sequence:

  1. neko(.exe), which is the Neko VM binary
  2. bytecode.n, a program in Neko bytecode
  3. the string "NEKO"
  4. size of neko(.exe)

After all these are written, on Linux, it updates the ELF header. Now, on Mac, we have to update the Mach-O header in some way, too.
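
For illustration, the layout above could be sketched in C roughly like this. It is not the actual nekotools code: the function names `build_booted_image` and `find_bytecode` are hypothetical, and storing the VM size as 4 little-endian bytes is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed trailer: "NEKO" followed by the VM size as 4 little-endian bytes. */
#define TRAILER_SIZE 8

/* Concatenate vm + bytecode + "NEKO" + size into out.
   Returns the number of bytes written, or 0 if out is too small. */
size_t build_booted_image(const uint8_t *vm, uint32_t vm_size,
                          const uint8_t *code, size_t code_size,
                          uint8_t *out, size_t out_cap)
{
    assert(out != NULL);
    size_t total = (size_t)vm_size + code_size + TRAILER_SIZE;
    if (out_cap < total) return 0;
    memcpy(out, vm, vm_size);
    memcpy(out + vm_size, code, code_size);
    uint8_t *t = out + vm_size + code_size;
    memcpy(t, "NEKO", 4);
    for (int i = 0; i < 4; i++) t[4 + i] = (uint8_t)(vm_size >> (8 * i));
    return total;
}

/* What the VM would do at startup: read the trailer from the end of its own
   file and locate the bytecode. Returns NULL if there is no valid trailer. */
const uint8_t *find_bytecode(const uint8_t *img, size_t img_size,
                             size_t *code_size)
{
    if (img_size < TRAILER_SIZE) return NULL;
    const uint8_t *t = img + img_size - TRAILER_SIZE;
    if (memcmp(t, "NEKO", 4) != 0) return NULL;
    uint32_t vm_size = 0;
    for (int i = 0; i < 4; i++) vm_size |= (uint32_t)t[4 + i] << (8 * i);
    if ((size_t)vm_size > img_size - TRAILER_SIZE) return NULL;
    *code_size = img_size - TRAILER_SIZE - vm_size;
    return img + vm_size;
}
```

Everything after the original VM binary is invisible to the executable's own headers, which is why tools that validate those headers reject the result.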

@UniqMartin

UniqMartin commented May 9, 2016

I'm afraid this simple approach won't work for Mach-O binaries, at least if your goal is to produce valid Mach-O files that can be safely manipulated by utilities like install_name_tool. The problem is twofold:

  1. There is a strict requirement that the __LINKEDIT segment extend to the end of the file. (Among other things, this is needed for code signing to work properly, though that's not yet relevant for Neko.) This means that the byte code needs to be placed somewhere else.

  2. There is the option of putting the byte code into a dedicated section in the __TEXT segment (in Mach-O parlance, a section is a part of a segment). Locating that section from within the VM code would be trivial thanks to the getsectdata helper function.

    But the structure of a fully-linked Mach-O binary is such that it is hard to impossible to insert a section of variable size (or extend an existing section). Doing this would require moving subsequent sections which would invalidate a lot of cross-references (often absolute file offsets).
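
To make the first requirement concrete, here is a rough C sketch of the check that fails for a neko-booted binary. It is only an approximation of what `install_name_tool` verifies. The structs mirror `mach_header_64` and `segment_command_64` from Apple's `<mach-o/loader.h>` but are redeclared locally under hypothetical names so the code compiles anywhere; it assumes a thin (non-fat) 64-bit Mach-O and a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MH_MAGIC_64_   0xfeedfacfu  /* same value as MH_MAGIC_64 */
#define LC_SEGMENT_64_ 0x19u        /* same value as LC_SEGMENT_64 */

struct mh64 {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds, sizeofcmds, flags, reserved;
};

struct seg64 {
    uint32_t cmd, cmdsize;
    char     segname[16];
    uint64_t vmaddr, vmsize, fileoff, filesize;
    uint32_t maxprot, initprot, nsects, flags;
};

/* Returns 1 if __LINKEDIT ends exactly at the end of the file, 0 if it does
   not, and -1 if the buffer is not a 64-bit Mach-O, is malformed, or has no
   __LINKEDIT segment. */
int linkedit_covers_eof(const uint8_t *file, size_t size)
{
    assert(file != NULL);
    struct mh64 h;
    if (size < sizeof h) return -1;
    memcpy(&h, file, sizeof h);
    if (h.magic != MH_MAGIC_64_) return -1;

    size_t off = sizeof h;
    for (uint32_t i = 0; i < h.ncmds; i++) {
        uint32_t cmd, cmdsize;
        if (size - off < 8) return -1;
        memcpy(&cmd, file + off, 4);
        memcpy(&cmdsize, file + off + 4, 4);
        if (cmdsize < 8 || cmdsize > size - off) return -1;
        if (cmd == LC_SEGMENT_64_ && cmdsize >= sizeof(struct seg64)) {
            struct seg64 s;
            memcpy(&s, file + off, sizeof s);
            if (strncmp(s.segname, "__LINKEDIT", 16) == 0)
                return s.fileoff + s.filesize == (uint64_t)size;
        }
        off += cmdsize;  /* advance to the next load command */
    }
    return -1;
}
```

Appending the byte code and the "NEKO" trailer grows the file without touching `fileoff` or `filesize`, so this check returns 0 for the booted binary.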

Given the above constraints, here's how I think a nekotools boot implementation could work on OS X:

  1. When building the VM:
    1. Generate an archive that contains all the code that would otherwise end up in the neko binary.
    2. Remember the linker options that would be needed for creating a neko binary from that archive.
    3. Install this archive (e.g. neko.a) together with everything else.
  2. When creating a binary with nekotools boot:
    1. Do something along the lines of: cc -Wl,-sectcreate,__TEXT,__nekobytecode,bytecode.n neko.a <linker-options> -o <output-binary>.
    2. Apply install_name_tool, lipo, codesign etc. to that binary if it ever becomes necessary.

Does this make sense? Does this approach meet all the requirements you might have?

> After all these are written, on Linux, it updates the ELF header.

How do you ensure that the .nekobytecode section that you create as a placeholder ends up being the last section in the ELF file?

@andyli
Member Author

andyli commented May 9, 2016

Thanks for the details!

@ncannasse I think we need your input here.

@ncannasse
Member

The ELF patching was provided by @kulick, and yes handling other binary formats looks like it will require a specific implementation.

@andyli
Member Author

andyli commented May 10, 2016

@UniqMartin Is there any C library that can help us achieve the linking part such that we don't have to depend on a C compiler?

@UniqMartin

@andyli If there is, I'm unfortunately not aware of any that could be easily leveraged.

We basically only need the linker part (I just used cc for convenience in my example), thus either Apple's ld64 or LLVM's lld (both of which are Open Source) should be up to the task. To my knowledge, neither of them has a library interface that could be used to perform the linking step directly from a nekotools boot process.

If spawning an external process is acceptable, I think relying on the user's already installed linker or falling back to a ld64 dependency on OS X (there's a Homebrew formula for it) is feasible. But please let me know what your requirements and constraints are. Maybe there's an alternative I haven't thought of yet.

@andyli
Member Author

andyli commented May 11, 2016

In a perfect world, nekotools boot could produce a valid binary without the help of external tools like a linker or a C compiler. But it seems it requires some serious effort. So, for now, I will use another approach to build nekoc, nekoml, and nekotools.

In a8c71ad, I've added a new feature, which is nekotools boot -c file.n. I reused the same command with an additional -c switch to do something completely different. It outputs a C source file, which contains the input "file.n" encoded as an unsigned char[] and uses the neko API to run it. We compile the C file with a C compiler, so the output will be a valid binary file. The CMakeLists.txt is updated to use this new feature to build nekoc, nekoml, and nekotools.
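
As a rough idea of the array-encoding half of that, a generator could look something like the sketch below. This is not the code from the commit: `emit_c_array` is a hypothetical helper, the output formatting is my own choice, and the part of the generated file that calls into the neko API is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write `data` to `out` as a C array definition named `name`, 12 bytes per
   line. Returns 0 on success, -1 on bad input or a write error. */
int emit_c_array(FILE *out, const char *name,
                 const unsigned char *data, size_t len)
{
    assert(out != NULL);
    /* An empty initializer list is not valid C, so reject empty input. */
    if (name == NULL || strlen(name) == 0 || len == 0) return -1;

    if (fprintf(out, "static const unsigned char %s[%lu] = {",
                name, (unsigned long)len) < 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        const char *prefix = (i % 12 == 0) ? "\n    " : " ";
        const char *suffix = (i + 1 < len) ? "," : "";
        if (fprintf(out, "%s0x%02x%s", prefix, data[i], suffix) < 0)
            return -1;
    }
    return fprintf(out, "\n};\n") < 0 ? -1 : 0;
}
```

Because the byte code becomes ordinary initialized data, the compiler and linker place it in a regular section, and the resulting binary has consistent headers on every platform.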

I'm leaving this issue open for a future contributor to fix the original nekotools boot file.n issue.

@UniqMartin

> In a8c71ad, I've added a new feature, which is nekotools boot -c file.n.

That's quite a big and invasive change. 😉 The end result should be pretty similar to putting the byte code into its own section, teaching the VM how to look up the address of that section, and running it.
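
For comparison, the section lookup could be sketched like this, assuming the byte code was linked into a `__TEXT,__nekobytecode` section as in the `-sectcreate` example earlier in the thread. `find_embedded_bytecode` is a hypothetical name. The sketch uses `getsectiondata` rather than `getsectdata` because it takes the image header and, as far as I know, returns an address that already accounts for where the image was loaded. Outside of OS X the function just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __APPLE__
#include <mach-o/getsect.h>
#include <mach-o/ldsyms.h>  /* declares _mh_execute_header */
#endif

/* Return a pointer to the embedded byte code and store its length in *size,
   or return NULL if the running executable has no such section. */
const uint8_t *find_embedded_bytecode(unsigned long *size)
{
    assert(size != NULL);
    *size = 0;
#ifdef __APPLE__
    return getsectiondata(&_mh_execute_header,
                          "__TEXT", "__nekobytecode", size);
#else
    return NULL;  /* Mach-O sections only exist on Apple platforms */
#endif
}
```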

@andyli
Member Author

andyli commented May 11, 2016

Yes, but at least it is file format independent and less magical (to me). : )

@kulick
Contributor

kulick commented May 11, 2016

Sorry for the slow response. Hi, Andy! :)

Yeah, these kinda problems sorta fall out of the current design for making free standing neko bytecode-based executables by just "tacking the bytecode on at the end". That isn't really a safe thing to do with modern OSes and their executable formats/runtime execution loaders.

The solution that I built for Linux ELF was complicated and annoying, but I guess better than things not working.

I like your solution, Andy, in that it is much more robust and less sensitive to these ABI issues. That said, it is more difficult to use and requires anyone that wants to make a free standing executable from neko bytecode to have a toolchain and a basic understanding of how to use your new output and build it into code. Using it to create the neko build freestanding binaries is a great solution, IMO.

Your solution is obviously not mutually exclusive to the strategy that I employed and if someone with more knowledge of the Mac OS X ABI wanted to try to make that neko interpreter smarter, they could. I looked at implementing the Linux ELF solution with libbfd, but I punted on that since that library is significantly larger than the neko interpreter itself. It seemed too painful to drop in a library that would make the interpreter sooo much bigger, so I just hacked up a "light" version of the functionality. :|
