-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Binary size is too big! #169
Comments
I definitely feel the pain as well. Without delving into the backstory of things. We've been working on https://github.com/storj/picobuf in the background to minimize the binary sizes. There are still some features missing that would allow to replace gogo/protobuf (and protobuf) in general. But, the progress got a stuck due to shifting priorities. Why a full replacement? Because one of the major problems with protobuf is that it relies on reflection and Method.Call, which forces the compiler into conservative mode, which doesn't allow a lot of code to be elided. Splitting up ended up feeling like a temporary bandaid for the protobuf situation, so we didn't go that route. The main reason the bloat exists is due to the |
Change pb: remove descriptors and type registration mentions this issue. |
@egonelbre thanks for taking a look at this - much appreciated.
I did not realise this and that probably explains some of the other bloat in the rclone binary from the HDFS backend which also uses protobufs. |
Descriptors are used to dynamically read protobuf information that has not been encoded into the types. Similarly, the type registration ends up forcing every protobuf type to be reachable. This ends up significantly bloating the binary sizes. The type registrations are used to deserialize any messages, which Storj does not use. Updates storj/uplink#169 Change-Id: I705e3a481b2623d164e9f5c237bf88ef855a386c
So, removing the descriptors only ended up reducing things by 300kb. I also experimented with using picobuf, which reduced things by 2.5MB compared to the baseline.
But, I'm still slightly confused why it ends up that large - even with picobuf. I kind of understand why it ends up larger than other backends (mostly because many other use net/http and hence can share the backend), but not to that extent. Some of our monitoring could be the culprit, due to inlining. |
As far as I can tell, we might be hitting a inliner bug, which causes some code to be inlined, which probably shouldn't be. I'll need to dig into that more. |
We use
storj.io/uplink
andstorj.io/uplink/edge
in the Storj backend for rclone.I've just discovered in rclone/rclone#7998 that the storj backend is adding over 6MiB to the rclone binary! That is nearly 10% of the binary size. Most backends weigh in at about 200k - it is only the ones with big external SDKs that exceed 1MB and the storj backend is by far the largest.
I suspect that this package is pulling in more things than it needs to which are making its way into the rclone binary.
I used https://github.com/Zxilly/go-size-analyzer to examine the rclone binary and it seems
storj.io/common/pb
is the biggest culprit. Maybe this could be split up into sub packages per task?I note that AWS split their SDK up into sub modules to help with this problem. The code all still lives in the same single github repo, but subpackages have their own go.mod and with the new workspaces feature that makes it manageable to work on.
Any thoughts on how to improve this?
Thanks
The text was updated successfully, but these errors were encountered: