Build with PGO #5588
Comments
I'm not sure PGO will be a good fit for Caddy. A general purpose webserver that's user-configured can be used in infinite ways, so there's no one profile that would be the best fit. I think it's unlikely that Matt or I will spend time on this, but contributions are welcome. |
I looked into that and thought about it. The optimization depends on the profiles of the production load, analyzing the executed paths, and optimizing the machine code based on known historic workload. Wouldn't this be different for every user? For instance, my own deployment doesn't use any of the FastCGI features, so any optimization based on profiles of my production deployment will not optimize FastCGI aspects. Different users utilize different parts of Caddy, so their preferred optimizations will be different. What do you think? |
It's unlikely that we can cover all use cases, but that's also not the point of PGO. Detecting and optimizing the shared hot code paths would be good enough. If I understood it correctly, we can also merge profiles. So we could generate 2-3 based on frequently used workloads:
|
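For reference, merging can be sketched with `go tool pprof -proto`, which is the approach the Go PGO guide describes. This is only a minimal sketch: the `merge_profiles` helper and the file names in the usage line are hypothetical, not something Caddy or xcaddy ships.

```shell
#!/bin/sh
# Hypothetical helper: merge several CPU profiles into one file that can
# later be passed to the Go toolchain via -pgo.
merge_profiles() {
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_profiles OUT PROFILE..." >&2
        return 1
    fi
    out="$1"
    shift
    # With -proto, pprof writes the combined profile in protobuf format to stdout.
    go tool pprof -proto "$@" > "$out"
}

# Example (file names are placeholders):
#   merge_profiles default.pgo fileserver.pprof reverseproxy.pprof
```

The merged file is weighted by the samples in each input, so a profile collected over a longer or busier run contributes more to the result.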
Is PGO limited to our code, or is it also applied to the dependencies we are using? The benefit would be a lot greater if it also applied to the code of e.g. quic-go. Edit: it's pointed out in the FAQ:
|
The little bit I've read about PGO (as of this morning 😅) is that it shouldn't slow down a program, but can offer nominal performance improvements in hot paths with a slightly larger binary size and slightly longer compile times. I agree with @bt90, maybe we generate profiles that utilize primarily:
Of course, because we don't have telemetry (:cry:) we have no idea what the popular configurations are, so we can only guess. (Thank you, unnecessary community backlash of 2018, for leaving us in the dark.) I'd definitely be open to trying this after releasing 2.7. |
Perhaps we can have an option to turn on profiling with xcaddy, so the user can run their workloads for a bit and then run xcaddy again with the profile as input? At least, this is how I do it with GCC PGO builds. |
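Until xcaddy has such an option, the second half of that loop can be approximated by passing the profile through `GOFLAGS`, as the build script later in this thread does. A rough sketch follows; the `build_with_profile` helper name is hypothetical, and the use of an absolute path rests on the assumption that xcaddy builds in a temporary directory, where a relative path would not resolve.

```shell
#!/bin/sh
# Hypothetical helper: rebuild Caddy with xcaddy using a previously
# collected CPU profile. Extra arguments are passed to `xcaddy build`.
build_with_profile() {
    if [ "$#" -lt 1 ]; then
        echo "usage: build_with_profile PROFILE [xcaddy build args...]" >&2
        return 1
    fi
    profile="$1"
    shift
    if [ ! -f "$profile" ]; then
        echo "profile not found: $profile" >&2
        return 1
    fi
    # Resolve to an absolute path (assumption: the build runs elsewhere).
    abs="$(cd "$(dirname "$profile")" && pwd)/$(basename "$profile")"
    # Note: this overrides any GOFLAGS already set in the environment.
    GOFLAGS="-pgo=$abs" xcaddy build "$@"
}

# Example (module path is a placeholder):
#   build_with_profile ./default.pgo --with example.com/some/plugin
```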
Profiles can be obtained from any Caddy instance, and have been for years now -- just go to
I actually collected a profile this week from our Caddy website and deployed a PGO-optimized instance of Caddy, and noticed barely any speedup (maybe 2-4%, depending on the run of the load test). Maybe that's significant enough to warrant it, and maybe our profile didn't have enough data (I ran it for an hour, but it's not a very busy site compared to big enterprise services). |
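As a sketch of what collecting such a profile over HTTP might look like: the snippet below assumes the instance serves the standard Go `net/http/pprof` handlers under `/debug/pprof/` and that the admin endpoint listens on `localhost:2019`. Both are assumptions to adjust for your setup, and the `collect_profile` helper and its defaults are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: download a CPU profile from a running instance.
# Arguments (all optional): ADDR SECONDS OUTFILE
collect_profile() {
    addr="${1:-localhost:2019}"   # assumed admin address
    seconds="${2:-30}"            # how long the server samples the CPU
    out="${3:-cpu.pprof}"
    # The request blocks for roughly $seconds while the profile is recorded.
    curl -fsS -o "$out" "http://$addr/debug/pprof/profile?seconds=$seconds"
}

# Example: sample for five minutes while the server handles real traffic.
#   collect_profile localhost:2019 300 default.pgo
```

A profile is only as representative as the traffic during the sampling window, so it is worth collecting while the server is under realistic load.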
I had a go at a simple test by benchmarking using
Result without PGO:
Result with PGO:
This is a 22% increase in handled requests per second. Not bad IMHO. Profile was collected with
The build script I use is:
#!/bin/sh
export XCADDY_SETCAP=1
export GOARCH="amd64"
export GOAMD64="v3"
export CGO_ENABLED=1
export GOFLAGS="-pgo=/usr/src/caddy/default.pgo"
/root/go/bin/xcaddy build --with github.com/caddyserver/caddy/v2=/usr/src/caddy/git/caddy --with github.com/ueffel/caddy-brotli --with github.com/caddyserver/transform-encoder --with github.com/caddyserver/cache-handler --with github.com/kirsch33/realip --with github.com/git001/caddyv2-upload
strip -s -v caddy
setcap cap_net_bind_service=+ep ./caddy |
I agree. PGO can be highly dependent on the use case and the host hardware configuration. Might it be better to include PGO support as an option in xcaddy? Quoting from the Go PGO page below:
|
I know @WeidiDeng has merged some Caddy profiles successfully for PGO. Maybe I should ask the community to submit their profiles, and we'll try merging them and see if that helps. Seeing your results above is encouraging, so maybe we just need a variety of profiles. |
I'm thinking that an xcaddy option to build Caddy with a profile as input is a good first step. What do you think of opening an issue at https://github.com/caddyserver/xcaddy? |
@Forza-tng That sounds like a plan. See caddyserver/xcaddy#163 |
Go 1.21 will ship with PGO support enabled by default. Maybe we can squeeze a little more performance out of Caddy with this.
https://go.dev/doc/pgo