Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple image loading from stb_image; tidy up YARGImage and FixedArray + its derivatives #207

Open
wants to merge 4 commits into
base: native
Choose a base branch
from

Conversation

TheNathannator
Copy link
Collaborator

This ensures YARG.Core has no direct dependencies on any native libraries. Currently, if the JIT generates YARGImage.Load(FixedArray<byte>) in any of the auxiliary projects for YARG.Core, it will throw an exception due to STB2CSharp binaries not being included in the YARG.Core repo; only main YARG includes it at the current time.

While this issue will pretty much never be reached in practice, the potential plans for adding a YARG.Core.Native project become significantly more complicated when attempting to migrate stb_image to there, due to conflicting needs between different projects and the technical difficulties in trying to account for those differences:

  • The auxiliary projects would need only the native binary for the current platform in order to load correctly. Having all platforms' binaries in the same place would probably confuse the runtime on which to pick, and putting each in a folder would likely make it not be able to be found.
  • Main YARG needs binaries for all platforms so they can each function correctly.

This is nigh-impossible to provide proper build automation for with the hybrid CMake + .NET SDK build chain I was trying out, as MSBuild has no way of defining arbitrary properties outside of the command-line. It's much easier to keep YARG.Core functionally separate from YARG.Core.Native, so that they can be built and handled separately.

Copy link
Collaborator

@sonicfind sonicfind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just can't accept these changes.

Comment on lines +18 to 23
// Note: not DisposeUnmanaged, as object references are not guaranteed to be valid during finalization
protected override void DisposeManaged()
{
_accessor.SafeMemoryMappedViewHandle.ReleasePointer();
_accessor.Dispose();
_file.Dispose();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now if you forget to call Dispose(), you've potentially introduced a memory leak. Any usage of AcquirePointer must always be paired with a call to ReleasePointer. That's no longer the case.

And you can't just put it into the DisposeUnmanaged function either because DisposeManaged is called first, meaning the disposals would be out of order. The only solution is to put into DisposeUnmanaged... which makes DisposeManaged completely useless... which brings us right back to what it was before.

I am not accepting this change whatsoever.

Copy link
Collaborator Author

@TheNathannator TheNathannator Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot be in DisposeUnmanaged because managed object references are not guaranteed to be in a valid state in finalizers. As it was implemented previously, depending on runtime implementation details it very well could fail regardless because there are no guarantees made for anything.

The IDisposable best-practices post on StackOverflow I like to point people to when discussing how to implement it points to Eric Lippert's When Everything You Know is Wrong blog post as extra reading, which explains the pitfalls of finalizers. The second part covers why we have to use DisposeManaged:

image

Just because this did work previously doesn't mean it always will. Finalizers are a last-ditch effort, not a reliable mechanism.

Copy link
Collaborator

@sonicfind sonicfind Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we should explicitly suppress the finalization of the those member variables on construction instead. Force GC to handle it in the order we expect. We just cannot forgo releasing the pointer in any situation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just revert these Array changes entirely. They were already set up exactly how they're supposed to. Just because you can do the whole "unmanaged" vs. "managed" dispose thing doesn't mean you're supposed to - and this is explicitly NOT one of those instances.

Copy link
Collaborator

@sonicfind sonicfind Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to mention: placing the static functions at the top of each file was an intentional choice to clearly communicate to readers of the class how to construct any of them. So moving them to the bottom only makes the most important information harder to find.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DisposeManaged and DisposeUnmanaged have different semantics due to the finalizer pitfalls I mentioned above. If MemoryMappedArray wasn't a thing, there would be no DisposeManaged, but because that uses managed object references, it must exist.

Copy link
Collaborator Author

@TheNathannator TheNathannator Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't see the second comment at first, gotta love when you miss things becaue GitHub decides to use a cached version of the page lol

Me shuffing things around was an organizational thing. The general practice from what I've seen is always fields first (static ones first), then properties, then constructors, then methods (with situational breakage of these guidelines in places that make more sense). While this doesn't need to be adhered to, I hardly even stop myself since it's kind of second nature to me lol (the OCD doesn't help either).

If you want something to be clearly communicated to users, write documentation comments. This codebase severely lacks documentation, and the only way to properly know how to use half of the types is to read the implementation code and existing usages of the code.

Copy link
Collaborator

@sonicfind sonicfind Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, absolutely not. None of this helps. YARGImage was designed to interact within Core and with native in as simple in a manner as possible. Now that there's a middleman involved, you've made it significantly more prone to error.

  1. It relies on users of the type to even remember to design a class that uses this interface, increasing the difficulty of implementing Core.
  2. It relies on the designer of said class to know the internals of whatever binary they use to know what conversions may need to be done (which is why ImageFormat had those prior values).
  3. It relies on the decoder itself to maintain the same level of compatibility with filetypes that were used beforehand. Otherwise, you could get wild inconsistencies between programs.
  4. Without the allocation and freeing of image data being embedded in this type, you're now at the mercy of the designer of the decoder as to whether those are even handled properly. Could very well have made a bug.
  5. The decoder value being a publicly accessible static variable is VERY VERY dangerous on all fronts. It'd be incredibly simple to insert some other functionality into that slot at any time - including during an image load or before an image is disposed. No security; no thread safety; just overall a bad move.

What I DO like are the name changes for the static functions that create YARGImage and the data.Length < 32 check for DXT. But honestly, that's it.

Copy link
Collaborator

@sonicfind sonicfind Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you were going to do anything with the native calls, then you should've moved the functions over to the new native plugin instead.

Copy link
Collaborator Author

@TheNathannator TheNathannator Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other option is there for me to take? Without decoupling YARGImage from stb_image, there's no way to move it to YARG.Core.Native without introducing major build toolchain issues, which I already detailed in the main post. It's significantly easier to make YARG.Core have no direct native code dependencies than try to shuffle around binaries for every platform automatically within the build chain. The interface is somewhat of a hack, but I'm just not sure what else could be done within the MSBuild ecosystem.

  1. This can be addressed by making FixedArray<byte> abstract; the only reason it isn't currently as far as I can tell is because of FixedArray.Alias, which can be easily addressed with an AliasedArray derivative.

  2. I don't particularly understand this point, when interoperating with any code you're gonna have to understand at least some details of their API/code. This is especially true for native libraries, and while the IImageDecoder interface is directly based on stb_image's API, there is no assumption that that is the only possible library used in the first place due to the removal of the specific ImageFormat values, and usage of ImageFormat in place of int components.

  3. / 4. This is the reason why the decoder returns a FixedArray<byte> derivative instead of a straight pointer: deallocation is now embedded into the array instance itself, and not into the decoder. It could just as easily be built into the decoder by storing the instance for that instead, but that complicates LoadDXT significantly.

  4. This is also why Decoder is loaded into a local variable before being used/checked in LoadFile(FixedArray<byte>), as that guarantees the same decoder is used throughout the function. While it certainly could be better (e.g. comments added, private static volatile backing field used, maybe even [ThreadStatic] (but that brings its own can of worms)), I consider it to be adequate for our current uses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants