Feat multipart file upload support #447

Open
wants to merge 18 commits into
base: master

Inksprout

@Inksprout Inksprout commented Mar 23, 2023

This PR aims to add partial support for multipart/form-data requests to the Pact file.
It introduces a WithMultipartSingleFileUpload() method that lets the user specify a multipart/form-data request in which a single part, a file, is uploaded.

This functionality is a common use case and uses the underlying Rust FFI method designed for this express purpose.
Note that this update does not give Pact-Net full support for multipart/form-data requests, because requests with multiple parts are not supported.

see issue: #410

TODO

  • Add documentation for the new method

@Inksprout Inksprout marked this pull request as ready for review March 23, 2023 22:50
Member

@mefellows mefellows left a comment

Looks good so far. I only have bike shed comments to offer at this stage :)

src/PactNet/Drivers/IHttpInteractionDriver.cs (Outdated)
src/PactNet/Drivers/InteropActionExtensions.cs (Outdated)
src/PactNet/Interop/StringResult.cs (Outdated)
src/PactNet/RequestBuilder.cs
@adamrodger
Contributor

Nice, I'll give this a review later on 👍

Contributor

@adamrodger adamrodger left a comment

In addition to the comments, I'd like to see an additional test for the verifier side so that we can confirm the entire round-trip works.

src/PactNet.Abstractions/IRequestBuilder.cs (Outdated)
src/PactNet/Interop/NativeInterop.cs (Outdated)
src/PactNet/Interop/StringResult.cs (Outdated)
src/PactNet/RequestBuilder.cs (Outdated)
src/PactNet/RequestBuilder.cs (Outdated)
}
],
"request": {
"body": "LS1MQ1NqcTNxdWtmYmc2WjJTDQpDb250ZW50LURpc3Bvc2l0aW9uOiBmb3JtLWRhdGE7IG5hbWU9ImZpbGUiOyBmaWxlbmFtZT0idGVzdF9maWxlLmpwZWciDQpDb250ZW50LVR5cGU6IGltYWdlL2pwZWcNCg0K/9j/4AAQSkZJRgABAQAAAQABAAD/4gHYSUNDX1BST0ZJTEUAAQEAAAHIAAAAAAQwAABtbnRyUkdCIFhZWiAAAAAAAAAAAAAAAABhY3NwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAA9tYAAQAAAADTLQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlkZXNjAAAA8AAAACRyWFlaAAABFAAAABRnWFlaAAABKAAAABRiWFlaAAABPAAAABR3dHB0AAABUAAAABRyVFJDAAABZAAAAChnVFJDAAABZAAAAChiVFJDAAABZAAAAChjcHJ0AAABjAAAADxtbHVjAAAAAAAAAAEAAAAMZW5VUwAAAAgAAAAcAHMAUgBHAEJYWVogAAAAAAAAb6IAADj1AAADkFhZWiAAAAAAAABimQAAt4UAABjaWFlaIAAAAAAAACSgAAAPhAAAts9YWVogAAAAAAAA9tYAAQAAAADTLXBhcmEAAAAAAAQAAAACZmYAAPKnAAANWQAAE9AAAApbAAAAAAAAAABtbHVjAAAAAAAAAAEAAAAMZW5VUwAAACAAAAAcAEcAbwBvAGcAbABlACAASQBuAGMALgAgADIAMAAxADb/2wBDAAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/2wBDAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQH/wAARCAABAAEDAREAAhEBAxEB/8QAFAABAAAAAAAAAAAAAAAAAAAABv/EABQQAQAAAAAAAAAAAAAAAAAAAAD/xAAVAQEBAAAAAAAAAAAAAAAAAAAJCv/EABQRAQAAAAAAAAAAAAAAAAAAAAD/2gAMAwEAAhEDEQA/AC6lAjD/2Q0KLS1MQ1NqcTNxdWtmYmc2WjJTLS0NCg==",
Contributor

Oh, this isn't what I expected to see... How can we validate that the body is actually a multi-part one in the correct format if the entire body is base64 encoded? That's made the interaction pretty opaque, which will make it hard to debug if the verifier fails.

Contributor

I've decoded this body and I'm confused by something. The body is:

--LCSjq3qukfbg6Z2S
Content-Disposition: form-data; name="file"; filename="test_file.jpeg"
Content-Type: image/jpeg

<binary data>
--LCSjq3qukfbg6Z2S--

but where does the Content-Type: image/jpeg come from? The public API doesn't allow you to specify the content type of a part as far as I can see. Is that being auto-detected or something? Either way, you should probably be able to set it yourself.
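
The decoding step described above can be repeated with a few lines of Python. This is only a minimal sketch: the function name describe_multipart is hypothetical, it assumes the base64 string is the "body" value from the pact file, and it splits naively on the boundary line, so it would misbehave if the binary content happened to contain the boundary itself.

```python
import base64


def describe_multipart(body_b64: str) -> list[dict]:
    """Decode a base64 multipart body and summarise each part's headers and size."""
    raw = base64.b64decode(body_b64)
    # The first line of a multipart body is "--<boundary>".
    delimiter, _, _ = raw.partition(b"\r\n")
    if not delimiter.startswith(b"--"):
        raise ValueError("body does not start with a multipart boundary line")

    parts = []
    # Element 0 is the (empty) preamble; the last element is the closing "--".
    for section in raw.split(delimiter)[1:]:
        if section.startswith(b"--"):
            break  # reached the closing delimiter
        head, _, content = section.partition(b"\r\n\r\n")
        headers = {}
        for line in head.strip().split(b"\r\n"):
            name, _, value = line.decode("latin-1").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Drop the CRLF that separates the content from the next delimiter.
        if content.endswith(b"\r\n"):
            content = content[:-2]
        parts.append({"headers": headers, "size": len(content)})
    return parts
```

Calling it on the encoded body should list one part whose headers include the part-level Content-Type, which makes it easier to see what ended up in the pact file without reading raw base64.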

Author

This is similar to the "file" issue you mentioned. I have updated the code so that the content type is a parameter the user sets. There is a limitation here that I am planning to document, however: the content type matching in the Rust library uses the Shared MIME-info Database (https://specifications.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html) to do the binary comparison. That library is GNU licensed, so we can't embed it in the framework; it has to be installed on the system. It is not supported on Windows, so while content matching will work as expected in a Unix-type build pipeline, the content type will only come through as 'application/octet-stream' when using Windows. You can see that the test I added reflects this: I have to check and set the content type appropriately so that the test can pass in each of the test environments the repo has.
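
To illustrate why the same bytes can be reported with different content types, here is a rough Python sketch of signature-based detection with a fallback. It is not how the Rust library is implemented: the SIGNATURES table is a tiny hypothetical stand-in for a real MIME database, and passing None models a system where no such database is installed.

```python
from typing import Optional

# Hypothetical stand-in for a MIME signature database (magic bytes -> MIME type).
SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
}

FALLBACK = "application/octet-stream"


def detect_content_type(data: bytes, database: Optional[dict]) -> str:
    """Guess a content type from leading bytes, or fall back when nothing matches."""
    if database is None:
        # No database available on this system: every file looks like generic binary.
        return FALLBACK
    for magic, mime in database.items():
        if data.startswith(magic):
            return mime
    return FALLBACK
```

With the table available, a JPEG header is reported as image/jpeg; with None, the same bytes come back as application/octet-stream, which is the mismatch described above.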

Contributor

Oh wow, that's quite the gotcha I wasn't aware of. So running the same test on Windows and Linux will produce different pact files? That doesn't sound good.

What if you run it on Linux with and without the mime detection library installed? Does it error or give you different results also?

Contributor

I've just realised the auto detect probably breaks my use case as well, because we support multiple types of upload (e.g. you could upload a CSV or XLSX file) and we need control over the mime type so we can handle the file properly on the receiver side. There's no way to guarantee that the auto detect generates the mime we need, especially if it's a vendor specific mime.

We've had issues raised by other users when we hard coded the content type for JSON bodies as well, because sometimes people want to add extensions such as application/json+patch or they have to integrate with other systems which don't behave themselves.

Also, the round trip integration test that I've asked for will fail in CI because we run it on all 3 major OSs and verify the file contents afterwards. A test library that produces different results on different OSs isn't really desirable. We shouldn't be building OS specific switches into the tests either, that's an obvious code smell.

I think that's big enough to be a blocker to me if the tests aren't reproducible and the reason for that is unclear to the user. That's a foot gun that's too easy to trigger, and then all the issues will start coming in.

Contributor

I'm not really willing to skip the tests on Windows for a .Net library.

Windows is the main development OS for .Net by far, but obviously they also need to work in other OSs as people increasingly run in dev containers or run their CI on different OS.

A pretty typical workflow, and the one everyone uses at my work, is for people to develop on Windows but run CI and production in Linux containers. We need the same tests to produce the same results in that situation.

I honestly think this change is blocked until we get an FFI behaviour change. Fortunately the issue is linked now so we can track it.

Member

A pretty typical workflow, and the one everyone uses at my work, is for people to develop on Windows but run CI and production in Linux containers. We need the same tests to produce the same results in that situation.

If I understand correctly, the issue here will be that the CI will publish a contract with the correct MIME type e.g. image/jpeg and any verification process on a Windows development machine will fail, because the type will be detected as application/octet-stream.

If I further understand the implementation detail, even if you attempted the manual workaround you suggested, Adam (setting the content-type header and length, and explicitly laying out the multi-part boundaries), the FFI will still do the MIME byte check and fail with the same issue.

Am I right in assuming that, @rholshausen?

Yes, that is correct

Contributor

Worth noting that I don't think the current behaviour conforms to the spec, which says

If there is a Content-Type header, the value of the header determines the content type.

That means I'd expect to always be able to control the content type by passing the appropriate header. In the case of multi-part, the content type header obviously applies to each part separately.

It also doesn't conform to this part:

default to either application/json (as V1) or text/plain.

Because the default is application/octet-stream instead of text/plain, and the spec notes that the magic byte test (although the spec incorrectly calls it 'magic number test') is optional.

I think the default should be that the user has to specify a content type, with an opt-in for auto-detection. That way we can document that it may work differently on different OSs and the user can consciously make that choice. If it doesn't work, e.g. because they use a different OS or the OS fails auto-detection, then they've always got the option to specify it manually.
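
As a rough illustration of that default, here is a minimal Python sketch. The function resolve_part_content_type and its parameters are hypothetical and are not part of Pact-Net or the FFI; the detector is whatever auto-detection routine the caller chooses to supply.

```python
from typing import Callable, Optional


def resolve_part_content_type(
    data: bytes,
    content_type: Optional[str] = None,
    auto_detect: bool = False,
    detector: Optional[Callable[[bytes], str]] = None,
) -> str:
    """Return the content type for a part: explicit value first, detection only on opt-in."""
    if content_type is not None:
        # An explicit content type always wins, so results do not depend on the OS.
        return content_type
    if auto_detect:
        if detector is None:
            raise ValueError("auto_detect requires a detector")
        # Opt-in path: the result may differ between environments.
        return detector(data)
    raise ValueError("content type is required unless auto_detect is enabled")
```

The point of the design is that the reproducible path is the default, and the environment-dependent path has to be chosen deliberately.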

Author

That section of the spec refers to the Content-Type of the body of a request; the issue here, however, is with the Content-Type of the file being uploaded. Requests that include a file carry both content types. When using the pactffi_with_multipart_file method from the FFI, the detection of the body Content-Type should work according to the spec. The Content-Type of the file being uploaded is what the pactffi_with_multipart_file method helps match, and this is the content type that cannot be detected correctly on Windows. Hope that helps clear up the issue.

@Inksprout
Author

Inksprout commented Mar 28, 2023

In addition to the comments, I'd like to see an additional test for the verifier side so that we can confirm the entire round-trip works.

@adamrodger I wasn't 100% sure where to add these, but I have added tests to the Samples/EventAPI consumer and provider to generate and verify a pact file with this type of body. They include the API endpoint used. Let me know what you think or if you had something else in mind :)

@Inksprout Inksprout requested review from mefellows and adamrodger and removed request for mefellows and adamrodger March 28, 2023 05:09
@annalauraa

Any estimate on when this will be completed?

@mefellows
Member

No, the issue is upstream here: pact-foundation/pact-reference#171.

If you are willing to help we'd appreciate it.

@YOU54F
Member

YOU54F commented May 28, 2024

Will need to split out these changes but just wanted to say I revisited this, with some updates to pact-ref which are proposed here for consistent cross platform payload matching - pact-foundation/pact-reference#429

Continuation of this PR: YOU54F#1

It also adds linux arm64 and musl arm64/x64 targets and slimmed binaries, and tests them with Docker (with an exception for arm64, which is tested locally as dotnet doesn't support running under qemu), so I need to pull those out. I just wanted to show that we are getting to a better place and are hopefully closer to getting this merged.

Thanks @Inksprout for starting it off! Hope you are doing well.

Thanks for the review so far and concerns raised @adamrodger

@adamrodger
Contributor

I think this PR should be closed and an RFC raised with a proposed new API so we can discuss. Nobody likes pushing up a PR only to be told the API won't work or something, so let's get agreement on that first.

I think I started outlining a potential API in the feature request issue for this, so we can carry on the discussion there. The important thing is that there needs to be some kind of fluent builder to allow you to add multiple parts with independent content types and that needs to fit into the existing typestate pattern to prevent you from then being able to set any other body type.
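
To make the idea of a fluent builder with independent content types per part more concrete, here is a small Python sketch. It is purely illustrative: MultipartBuilder, add_part and build are hypothetical names rather than a proposed Pact-Net API, and a dynamically typed sketch cannot show the typestate constraint, which would have to be expressed in the C# type system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MultipartBuilder:
    """Collects parts fluently and renders a multipart/form-data body."""

    boundary: str
    _parts: list = field(default_factory=list)

    def add_part(self, name: str, data: bytes, content_type: str,
                 filename: Optional[str] = None) -> "MultipartBuilder":
        # Each part carries its own content type, set explicitly by the caller.
        self._parts.append((name, filename, content_type, data))
        return self

    def build(self) -> bytes:
        out = bytearray()
        for name, filename, content_type, data in self._parts:
            disposition = f'form-data; name="{name}"'
            if filename is not None:
                disposition += f'; filename="{filename}"'
            out += f"--{self.boundary}\r\n".encode("ascii")
            out += f"Content-Disposition: {disposition}\r\n".encode("utf-8")
            out += f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
            out += data + b"\r\n"
        # The closing delimiter ends with an extra "--".
        out += f"--{self.boundary}--\r\n".encode("ascii")
        return bytes(out)
```

A call chain such as MultipartBuilder("b").add_part("file", jpeg_bytes, "image/jpeg", filename="test_file.jpeg").add_part("notes", csv_bytes, "text/csv").build() would yield a body in the same layout as the decoded one earlier in this thread, but with two parts.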

@YOU54F
Member

YOU54F commented May 29, 2024

I think this PR should be closed and an RFC raised with a proposed new API so we can discuss.

Yeah that is fine with me, I'm not attached to this API in particular, I just used it as a test bed to test out the content type matching in order to get consistent behaviour across platforms and archs

It would be nice to thank the original contributor for their attempt however, when it is closed.

Nobody likes pushing up a PR only to be told the API won't work or something, so let's get agreement on that first.

I don't think that is a universal statement, I don't mind at all. I tend to experiment on my own and propose something via code, and take from that the learning of getting up to that point, rather than heavily investing in proposing something relatively concrete

6 participants