Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow requests via std.http.Client when using TLS #15029

Closed
hqnna opened this issue Mar 21, 2023 · 10 comments · Fixed by #15179
Closed

Slow requests via std.http.Client when using TLS #15029

hqnna opened this issue Mar 21, 2023 · 10 comments · Fixed by #15179
Labels
bug Observed behavior contradicts documented or intended behavior standard library This issue involves writing Zig code for the standard library.
Milestone

Comments

@hqnna
Copy link
Contributor

hqnna commented Mar 21, 2023

Zig Version

0.11.0-dev.2160+49d37e2d1

Steps to Reproduce and Observed Behavior

When attempting to make a GET request to any web page using TLS, the request seems to take 5x longer to read, this can be observed by running zig run -lc example.zig, where example.zig contains the minimal reproduction code below:

const std = @import("std");

pub fn main() !void {
    const allocator = std.heap.c_allocator;
    var client = std.http.Client{ .allocator = allocator };

    var headers = std.http.Client.Request.Headers{};
    const url = "https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz";
    var req = try client.request(try std.Uri.parse(url), headers, .{});

    const buf = try req.reader().readAllAlloc(allocator, std.math.maxInt(usize));
    allocator.free(buf);
}

However, it is noticed that when doing the same with a site not using TLS, it does not appear to happen. This also appears to affect the package manager that was implemented in #14265. This can once again be reproduced by using the following minimal reproduction:

.{
    .name = "mocha",
    .version = "0.1.0",
    .dependencies = .{
        .ptk = .{
            .url = "https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz",
            .hash = "122035a324a6a46bc367ecd79dfb0bf666bda97cb64ab4f5d20e6d7e88972f608d84",
        },
    },
}

I have eliminated any chances of this being an antivirus or firewall issue, as my system uses Windows' built-in stuff, and I have made no changes to any settings. Around a week ago, however, I noticed this slowdown suddenly start happening, not only on my own devices, but on CI's from multiple websites such as sr.ht, and codeberg.

Expected Behavior

According to hyperfine, downloading the tarball in the reproduction code above takes normally ~300-500ms

/tmp ❯ hyperfine "curl -sL https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz"
Benchmark 1: curl -sL https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz
  Time (mean ± σ):     362.2 ms ± 118.7 ms    [User: 20.2 ms, System: 2.7 ms]
  Range (min … max):   277.7 ms … 680.3 ms    10 runs

After compiling the reproduction code to eliminate the chances of it being build-time related, we can see it being nearly 350x slower.

/tmp ❯ hyperfine -r 2 "curl -sL https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz" "./hello"
Benchmark 1: curl -sL https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz
  Time (mean ± σ):     352.8 ms ±  99.1 ms    [User: 21.5 ms, System: 0.0 ms]
  Range (min … max):   282.7 ms … 422.9 ms    2 runs

Benchmark 2: ./hello
  Time (mean ± σ):     120.566 s ±  0.311 s    [User: 0.081 s, System: 0.007 s]
  Range (min … max):   120.346 s … 120.787 s    2 runs

Summary
  'curl -sL https://github.com/MasterQ32/parser-toolkit/archive/master.tar.gz' ran
  341.77 ± 96.02 times faster than './hello'
@hqnna hqnna added the bug Observed behavior contradicts documented or intended behavior label Mar 21, 2023
@hqnna hqnna changed the title Slow requests via `std.http.Client Slow requests via std.http.Client Mar 21, 2023
@hqnna hqnna changed the title Slow requests via std.http.Client Slow requests via std.http.Client when using TLS Mar 21, 2023
@jedisct1
Copy link
Contributor

What CPU are you testing this on?

@hqnna
Copy link
Contributor Author

hqnna commented Mar 21, 2023

My personal rig, a Ryzen 5900x, as well as whatever specifications the Sourcehut CI has.

@jedisct1
Copy link
Contributor

May the lack of Content-Length in the response is be the issue here?

I tried a couple websites and this is what those exhibiting the same behavior seem to have in common.

@hqnna
Copy link
Contributor Author

hqnna commented Mar 21, 2023

May the lack of Content-Length in the response is be the issue here?

I tried a couple websites and this is what those exhibiting the same behavior seem to have in common.

It's possible, would have to investigate.

@jedisct1
Copy link
Contributor

CPU remains very low when it happens, so I suspect an issue in HTTP or TLS framing.

@hqnna
Copy link
Contributor Author

hqnna commented Mar 22, 2023

CPU remains very low when it happens, so I suspect an issue in HTTP or TLS framing.

Could it be related to the TLS bug andrew mentioned in discord a while back? this one in specific.

@jedisct1
Copy link
Contributor

Very plausible, yes.

@kassane
Copy link
Contributor

kassane commented Mar 29, 2023

Downloading with build.zig.zon file may return that error just as slow.

$> zig build
error: BadReaderState

I have been doing some tests with pkg-manager (MVP) and GH actions can download the packages listed in the zon, more than on my own home network.

@truemedian
Copy link
Contributor

I've now done a bit more poking around at the actual cause of this issue. It seems like the tls client is trying to eagerly read data from the stream, even though it may be able to give the user data. This causes the last part of the message to never be received until the peer closes the connection (which would explain the consistent 2 minute pause for everyone).

@hqnna
Copy link
Contributor Author

hqnna commented Apr 3, 2023

I've now done a bit more poking around at the actual cause of this issue. It seems like the tls client is trying to eagerly read data from the stream, even though it may be able to give the user data. This causes the last part of the message to never be received until the peer closes the connection (which would explain the consistent 2 minute pause for everyone).

Interesting. Definitely sounds like it is somewhat related to the TLS bug andrew mentioned.

@andrewrk andrewrk added the standard library This issue involves writing Zig code for the standard library. label Apr 7, 2023
@andrewrk andrewrk added this to the 0.11.0 milestone Apr 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants