Add a MultipartEntity definition for vibe.inet.webform. #2
base: master
One thing that is suboptimal about defining the whole multipart entity up-front is that it inevitably requires a number of dynamic memory allocations. In high-throughput scenarios, this has repeatedly turned out to be a serious bottleneck or to result in pseudo memory leaks with the GC implementation. A way to mitigate this would be to change the entity/part types from
source/vibe/inet/webform.d
* See_Also: https://datatracker.ietf.org/doc/html/rfc2046#section-5.1
* See_Also: https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html
* See_Also: https://datatracker.ietf.org/doc/html/rfc2388
*/
Small code style issue: the multi-line comment style used in the code base omits the " * " prefix on the intermediate lines and instead uses a single tab of indentation. Also, the first paragraph should be just a single short summary sentence, because it will be listed in the "description" column of the overview tables of the API documentation.
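For illustration, a comment following that convention might look like the sketch below. The summary wording and the empty `MultipartEntity` struct are placeholders made up for this example, not the PR's actual declaration; only the `See_Also` URL is taken from the diff above.

```d
/**
	Represents a multipart entity as described in RFC 2046.

	Longer explanations go into separate paragraphs after the one-sentence
	summary, indented with a single tab and without a leading " * ".

	See_Also: https://datatracker.ietf.org/doc/html/rfc2046#section-5.1
*/
struct MultipartEntity {
	// Placeholder body; the real fields are what this PR is discussing.
}
```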
I wasn't fully aware of that. Originally I was hoping the use of classes would help make it easier to pass the object by reference and avoid copies, but wasn't aware that dynamic allocations are so much more expensive. What is it about the GC that makes them so costly? Also, why use templates for the attribute types?
Don't concrete types make it easier to utilize the values when writing them to a connection, such as in vibe-http, client.d? Maybe I'm confused about the actual status of types like The
There are two escalation stages with the current GC. The first one happens because the collection stage uses a stop-the-world approach and allocations require taking a global GC lock. In situations with high GC pressure or usage frequency, this can nullify any multi-threading performance gains. The second stage happens when for some reason (probably memory pool fragmentation, but maybe also due to the way the GC selects a pool for a new allocation; I haven't looked closely enough into the implementation) the GC cannot consistently reuse memory for new allocations. This then leads to unbounded growth of allocated pools, which is a de-facto memory leak. I've observed this being triggered both in vibe.d web application contexts and in our photo application as soon as rapid allocations of more than a few bytes happen from multiple threads at once. This happens rather rarely, but when it does it's hard to track down.
The idea is that it allows the user to avoid any heap/GC allocations whenever possible. So for example a static array or an
I forgot to add the proper template constraints in my example. For Without an example this does become a bit more complicated to understand from the user's point of view, unfortunately, but having the high-level overloads for the common cases should mitigate that hopefully.
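As a rough illustration of the idea of templating the part types with a constraint so that callers can avoid GC allocations, here is a minimal sketch. The names `Part`, `part` and `example` are hypothetical and are not the types proposed in this PR; the constraint (an input range of `ubyte`) is likewise only an assumption chosen to keep the example small.

```d
import std.range.primitives : ElementType, isInputRange;

/// Hypothetical part type: the value can be any input range of bytes,
/// so the caller decides where the data lives (stack, GC heap, stream...).
struct Part(Value)
	if (isInputRange!Value && is(ElementType!Value : ubyte))
{
	string name;
	Value value;
}

/// Hypothetical convenience constructor that infers the value type.
auto part(Value)(string name, Value value)
{
	return Part!Value(name, value);
}

/// Builds a part from a stack buffer; @nogc proves that no GC allocation
/// is involved on this path.
@nogc nothrow size_t example()
{
	ubyte[4] buffer = [1, 2, 3, 4];
	auto p = part("data", buffer[]); // slice of the static array, no copy
	return p.value.length;
}
```

The trade-off is that every distinct value type produces a distinct `Part!Value` type, which is what makes storing differently typed parts in one container harder.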
In this particular case, I believe we won't be able to treat the parts as a range interface, because they lack a common element type. For example, a form might consist of 3 fields, a text input, a checkbox, and a multi-file upload. Thus a statement like According to the spec, the default It's a bit of a shame that there's no MIME RFC 2046 library that already exists that could be used for this purpose. (Hmm, perhaps I should begin there?)
… InputStream types.
Yeah, I'd say it makes sense to go from there. For the internal representation we should have that anyway, as I'd say that with changing to
I ran into a problem with this approach. Because the Header type is templated, and the stream type that a file can be in is also templated (inside The only way I can think of to make this work is to provide the However, because the How important is it to offer arbitrary range types in the interface? If it's very important, then maybe
@s-ludwig So what do you think about the problem of using SumType? In a nutshell, if I have a list/range of Multiparts (which can be the body of a multipart entity), then I need a type for that variable. However, if I provide a type, then it has to have a concrete header and body type. But this means it cannot support ranges, which themselves have no concrete type. E.g. you can't have a concrete range of different types of ranges. You would either have to lock down the supported range types, like InterfaceProxy!InputStream, or use Variant. Which is better?
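To make the "lock down the supported types" option concrete, here is a minimal sketch using `std.sumtype`. The body types `TextBody` and `FileBody` and the function `describe` are hypothetical stand-ins invented for this example; a real implementation would presumably wrap strings, `NativePath`, or stream types instead.

```d
import std.sumtype : SumType, match;

// Hypothetical, deliberately tiny body types.
struct TextBody { string text; }
struct FileBody { string path; }

// A closed set of permitted body types gives the list a concrete element type.
alias Body = SumType!(TextBody, FileBody);

/// Every consumer has to handle each alternative explicitly; the same
/// dispatch would be needed when writing a body to a connection.
string describe(Body b)
{
	return b.match!(
		(TextBody t) => "text:" ~ t.text,
		(FileBody f) => "file:" ~ f.path
	);
}
```

With this, `Body[] parts = [Body(TextBody("peter")), Body(FileBody("foo.dat"))];` is a valid declaration, at the cost of not accepting arbitrary range types.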
Sorry for replying late, I actually started to write a reply two weeks ago but then got sidetracked. But now actually, after re-reading everything, I think that what I meant in my first comment was actually to have the So it would look like this: Parts can then be given as multiple individually typed arguments. Each argument could be a range of parts, or a single part. The user could choose the range element type to be whatever is needed. Although it may not actually be necessary at all, we could also support using
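One possible reading of "multiple individually typed arguments" is a variadic template that stores the parts in a tuple. The sketch below is an assumption about the shape of such an API, not the code from the elided example; `FormPart`, `formPart`, `MultipartEntity`, `multiPartEntity` and `partNames` are simplified, hypothetical definitions.

```d
import std.typecons : Tuple, tuple;

/// Hypothetical part type, templated on its value type.
struct FormPart(Value)
{
	string name;
	Value value;
}

auto formPart(Value)(string name, Value value)
{
	return FormPart!Value(name, value);
}

/// Hypothetical entity: the part types become part of the entity type.
struct MultipartEntity(Header, Parts...)
{
	Header header;
	Tuple!Parts parts;
}

auto multiPartEntity(Header, Parts...)(Header header, Parts parts)
{
	return MultipartEntity!(Header, Parts)(header, tuple(parts));
}

/// The loop is unrolled at compile time, so each part keeps its own type.
string[] partNames(Entity)(Entity entity)
{
	string[] names;
	foreach (part; entity.parts.expand)
		names ~= part.name;
	return names;
}
```

A call such as `multiPartEntity("hdr", formPart("user", "peter"), formPart("count", 3))` then yields an entity whose type encodes both part types, which is convenient with `auto` but awkward to spell out in a declaration.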
I originally had tried something like this, and the code is still commented out, but it would look something like this:
In this case, the compiler expands the variadic parameter
Shortening the header types as H and body type as B, the error is:
In essence, the same problem remains. A concrete type that can fit all the diverse body types is still needed. And those body types might wrap a file or other arbitrary range. Thus, the body variable, if it is not to use Variant, would still need a SumType that encompasses all the possible body content types that are permitted: string, ranges, input streams, etc. Even Variant doesn't solve the problem, because when sending the actual form on an HTTP connection, you need a way to write the contents of the multipart parts, and without a type, that's not really an option. The only ideas I have right now would be to support
… discussion around parameterization.
I added a new commit that hopefully makes this easier to reason about. I got rid of Starting from something that's working, maybe it would be easier to specify, one at a time, what template parameters should exist.
This should be something like this:
The named argument initialization probably cannot be used, yet, due to backwards compatibility and I'm not sure how well it plays with tuple typed fields, but in any case, the edit: The
Wouldn't that make it impossible to declare an object in practice? Imagine trying to assign to I don't think this level of desired flexibility can be accomplished without Variant. One either needs to sharply limit the types that a MultiPartEntity body can have, or use Variant. Variant is also only a half-solution, because one still needs to know how to write the contents of the body to a Stream. So it does matter whether it's a File, Stream, etc.
The nice thing is that the user can prepare the parts however it fits best:

// functional style:
auto entity = multiPartEntity(header,
	formPart("user", "peter"),
	formPart("file", NativePath("foo.dat")));

auto entity = multiPartEntity(header,
	formPart("user", "peter"),
	paths.map!(p => formPart("file", p)));

// imperative style:
typeof(formPart("", NativePath.init))[] files;
foreach (path; paths)
	files ~= formPart("file", path);
auto entity = multiPartEntity(header,
	formPart("user", "peter"),
	files);

// differently typed parts in a single array, via SumType:
alias MyPartTypes = SumType!(typeof(formPart("", "")), typeof(formPart("", NativePath.init)));
MyPartTypes[] parts;
parts ~= MyPartTypes(formPart("foo", "bar"));
parts ~= MyPartTypes(formPart("foo", NativePath("bar.dat")));
auto entity = multiPartEntity(header, parts);

The necessity for
This is an initial attempt to implement a standards-compliant multipart/form-data request.
See vibe-d/vibe-http#32
The initial commit here is preliminary and will require iteration, because the author is not yet familiar with Vibe's internal conventions and patterns.