Avoid allocating intermediary strings when readpartial is passed an outbuf #61
Conversation
Nice work @segiddins. I tested it out and it still seems to work fine for extracting gems. I agreed that
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained: 40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained: 40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement in the total number of allocations. Additionally, this will pay huge dividends once ruby/zlib#61 is merged, as it will allow us to completely avoid allocations in the repeated calls to readpartial, which currently account for most of the memory usage shown above.
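For reference, a minimal sketch of how allocation numbers like these can be collected with memory_profiler; the gem file name and output path are placeholders, not the exact script used for the figures above:

```ruby
require "memory_profiler"
require "rubygems/package"
require "tmpdir"

# Profile allocations while extracting a gem's contents; the .gem path here is
# just a placeholder for whichever gem you want to measure.
report = MemoryProfiler.report do
  Dir.mktmpdir do |dir|
    Gem::Package.new("ruby-progressbar-1.13.0.gem").extract_files(dir)
  end
end

# Writes a summary in the same style as the memprof.*.txt output quoted above.
report.pretty_print(to_file: "memprof.txt")
```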
New benchmark:

Script:

```ruby
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'rubygems'
require 'bundler/inline'

gemfile do
  source "https://rubygems.org/"
  gem 'memory_profiler'
  gem 'zlib' #, path: "/Users/segiddins/Development/github.com/ruby/zlib"
  gem 'benchmark-ipsa'
end

len = 16_384

Benchmark.ipsa do |x|
  x.report("read #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len until f.eof
  ensure
    f&.close
  end

  x.report("readpartial #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len until f.eof
  ensure
    f&.close
  end

  x.report("readpartial #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len, buf until f.eof
  ensure
    f&.close
  end

  read_takes_buf = begin
    Zlib::GzipReader.open("/tmp/versions.gz") { |f| f.read nil, nil }
  rescue ArgumentError
    false
  else
    true
  end

  x.report("read #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len, buf until f.eof
  ensure
    f&.close
  end if read_takes_buf

  x.compare!
end
```

Before
After
In summary:
@hsbt this will have a significant benefit for RubyGems/Bundler/RubyGems.org; I'd greatly appreciate your help in getting this reviewed 🙇🏻♂️

ps. ran two more benchmarks, one for

pps. This drops allocations for
ext/zlib/zlib.c (Outdated)

```c
// fill stream until the buffer can be filled. while this isn't strictly
// necessary, it does help cut down on the number of calls to readpartial
// in the typical case.
while (!ZSTREAM_IS_FINISHED(&gz->z) && ZSTREAM_BUF_FILLED(&gz->z) < len) {
```
This is probably the bug I was seeing. It wasn't reading enough to fill the len requested, so it was returning less than requested even when the amount was available.
Isn't that what readpartial means?
My understanding is that readpartial reads as much as it can, up to the limit, unless it would block.
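A minimal illustration of that contract using a plain pipe (core Ruby only, not GzipReader): readpartial returns at most the requested number of bytes, and only blocks when nothing is available yet.

```ruby
r, w = IO.pipe
w.write("hello")

# Only 5 bytes are buffered, so readpartial returns them immediately
# rather than blocking until 16_384 bytes arrive.
p r.readpartial(16_384)  # => "hello"

w.close
# Once the writer is closed and the buffer is empty, readpartial raises EOFError.
begin
  r.readpartial(16_384)
rescue EOFError
  puts "EOF reached"
end
```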
Yes, I fixed it now.
I was also mistaken about the exact behavior of readpartial. Sorry for any confusion. I double checked this branch myself locally and this definitely solves #56.
@hsbt would appreciate a review on this one when you get a chance, thanks!
I'm okay with merging this, but I would like it to be reviewed by @nobu or another Ruby core member.
Force-pushed from 9c41288 to 462bbf1.
@nobu I think I've addressed your concerns?
How would you like me to include the test fixture if not in a fixtures directory?
I think the ruby-core backport script rejects any changes that pollute the global

Maybe if you move the
I wonder about the necessity of fixtures in this case, at least.
I'm happy to replace the fixture, but I don't know how to manually generate a gzip file that demonstrates the same issue with readpartial -- it doesn't surface on all gzips. cc @martinemde
All I can offer right now is that I previously tried unzipping and rezipping this data.tar. The freshly gzipped file no longer causes the issue. Including this gem in the test is a real-world proof. We could instead hunt around for a smaller example gem, if this is a size issue, or try to figure out what it is about this one file and recreate it (I don't have that expertise). We could also just generate random gzips until we find one that fails. 😁
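For anyone who wants to try that brute-force idea, a rough sketch of the search; it assumes the #56 symptom shows up as readpartial returning fewer bytes than requested before EOF, and (as noted above) freshly generated gzips may well never trigger it:

```ruby
require "zlib"
require "stringio"
require "securerandom"

CHUNK = 16_384

# Compress random payloads and flag any where readpartial comes up short
# before EOF -- purely a sketch of the "generate random gzips" idea above.
100.times do |i|
  data = SecureRandom.random_bytes(200_000)
  gzipped = Zlib.gzip(data)
  Zlib::GzipReader.wrap(StringIO.new(gzipped)) do |gz|
    until gz.eof?
      chunk = gz.readpartial(CHUNK)
      if chunk.bytesize < CHUNK && !gz.eof?
        puts "candidate #{i}: short read of #{chunk.bytesize} bytes before EOF"
      end
    end
  end
end
```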
diff --git c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz i/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz
deleted file mode 100644
index 6e5581c..0000000
Binary files c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz and /dev/null differ
diff --git c/test/zlib/test_zlib.rb i/test/zlib/test_zlib.rb
index 9d18058..5811f8b 100644
--- c/test/zlib/test_zlib.rb
+++ i/test/zlib/test_zlib.rb
@@ -1005,20 +1005,20 @@ if defined? Zlib
assert_equal "".b, s
assert_predicate f, :eof?
end
+ }
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
- Zlib::GzipReader.wrap(f) do |gzio|
- count = 0
- until gzio.eof?
- s = gzio.read(16_384)
- count += s.size
- assert_predicate gzio, :eof? if s.empty?
- end
- assert_equal 59904, count
+ compressed_data_file do |f, size|
+ Zlib::GzipReader.wrap(f) do |gzio|
+ count = 0
+ until gzio.eof?
+ s = gzio.read(16_384)
+ count += s.size
+ assert_predicate gzio, :eof? if s.empty?
end
- assert_predicate f, :closed?
+ assert_equal size, count
end
- }
+ assert_predicate f, :closed?
+ end
end
def test_readpartial
@@ -1043,7 +1043,7 @@ if defined? Zlib
end
}
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+ compressed_data_file do |f, size|
Zlib::GzipReader.wrap(f) do |gzio|
count = 0
until gzio.eof?
@@ -1051,12 +1051,12 @@ if defined? Zlib
count += s.size
assert_predicate gzio, :eof? if s.empty?
end
- assert_equal 59904, count
+ assert_equal size, count
end
assert_predicate f, :closed?
end
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+ compressed_data_file do |f, size|
Zlib::GzipReader.wrap(f) do |gzio|
count = 0
s = String.new(capacity: 16_384, encoding: Encoding::BINARY)
@@ -1071,7 +1071,7 @@ if defined? Zlib
break
end
end
- assert_equal 59904, count
+ assert_equal size, count
end
end
end
@@ -1271,6 +1271,17 @@ if defined? Zlib
}
end
+ private
+ def compressed_data_file
+ Tempfile.create(caller_locations(1, 1).first.base_label) {|t|
+ size = Zlib::GzipWriter.open(t.path) {|gz|
+ data = File.read(__FILE__)
+ gz.print(data)
+ data.size
+ }
+ yield t, size
+ }
+ end
end
class TestZlibGzipWriter < Test::Unit::TestCase
@nobu I just tried reverting some of my changes, and the test in your patch still passes when I undo the change at the end of
@nobu It sounds like you don't want this PR to include the bugfix. It seems like including an effective test for this bug makes this PR unmergeable. How would you suggest we proceed?
Yes, exactly.
I was under the impression that the whole point of this PR was to fix #56, and the performance improvements were found as part of the fix. Should we remove the bug fix from this PR, so readpartial continues to be broken but is more memory-efficient, or should we remove the memory-efficiency improvements and just fix the bug?
Note: rubygems does have 3 test fixture .gem files in test/rubygems/packages.
The title and description only cover the latter, and say nothing about #56. I'm OK with both, but you'll need to edit the title in the latter case.
Force-pushed from dac893a to 8ad02d4.
Force-pushed from 8ad02d4 to 70a688a.
@nobu, have you given up on this PR?
@nobu Could you give us some feedback here about what Sam or I could do to get this PR ready? We think it would enable better performance in rubygems now that we will soon be able to use readpartial again. Any ideas?
Maybe we can try opening another ruby-core ticket about this performance improvement. That worked great for getting the bug fix merged.
Avoid allocating intermediary strings when readpartial is passed an outbuf

This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present
Force-pushed from 70a688a to 07f44b7.
Yep, that seems like a good way to get attention on this. Also note you should add such tickets to the dev-meeting ticket to make sure it gets reviewed in the monthly dev meeting.
This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present
This is my first time in a while writing C code for Ruby, so please excuse any style errors.
cc @martinemde
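For callers, the intended usage pattern is the same one the benchmark exercises: reuse a single binary buffer across readpartial calls so no per-call string is allocated. A minimal sketch (the file name is a placeholder):

```ruby
require "zlib"

LEN = 16_384
# One reusable binary buffer; with this change, readpartial copies directly
# into it instead of allocating a fresh intermediary string on every call.
buf = String.new(capacity: LEN, encoding: Encoding::BINARY)

Zlib::GzipReader.open("/tmp/versions.gz") do |gz|
  until gz.eof?
    gz.readpartial(LEN, buf)
    # use buf here; the next call overwrites its contents
  end
end
```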
Verified improvement via this script:
script
Before
After