Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid allocating intermediary strings when readpartial is passed an outbuf #61

Merged
merged 2 commits into from
May 14, 2024

Conversation

segiddins
Copy link
Contributor

This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present

This is my first time in a while writing C code for Ruby, so please excuse any style errors.

cc @martinemde

Verified improvement via this script:

script

#!/usr/bin/env ruby

require 'rubygems'
require 'bundler/inline'

gemfile do
  source "https://rubygems.org/"
  gem 'memory_profiler'
  gem 'zlib', path: "/Users/segiddins/Development/github.com/ruby/zlib"
end

f = File.open("/Users/segiddins/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions")

gz = Zlib.gzip(f.read)

File.write("/tmp/versions.gz", gz)

f = Zlib::GzipReader.open("/tmp/versions.gz")

buf = String.new(capacity: 16_384, encoding: Encoding::BINARY)

MemoryProfiler.report do
  f.readpartial 16_384, buf until f.eof?
end.pretty_print(scale_bytes: true)

Before

Total allocated: 24.71 MB (3378 objects)
Total retained:  0 B (0 objects)

allocated memory by gem
-----------------------------------
  24.71 MB  other

allocated memory by file
-----------------------------------
  24.71 MB  Untitled 2.rb

allocated memory by location
-----------------------------------
  24.71 MB  Untitled 2.rb:25

allocated memory by class
-----------------------------------
  24.71 MB  String

allocated objects by gem
-----------------------------------
      3378  other

allocated objects by file
-----------------------------------
      3378  Untitled 2.rb

allocated objects by location
-----------------------------------
      3378  Untitled 2.rb:25

allocated objects by class
-----------------------------------
      3378  String

retained memory by gem
-----------------------------------
NO DATA

retained memory by file
-----------------------------------
NO DATA

retained memory by location
-----------------------------------
NO DATA

retained memory by class
-----------------------------------
NO DATA

retained objects by gem
-----------------------------------
NO DATA

retained objects by file
-----------------------------------
NO DATA

retained objects by location
-----------------------------------
NO DATA

retained objects by class
-----------------------------------
NO DATA


Allocated String Report
-----------------------------------
         1  "\nBibliografia.alu0100816167 0.2.0 c86a6472b1ee133100ee922bcffd3a39\nBibliografia_alu0100502107 0.1.0 bfaec8bf21658d9664741c8d923e4eca\nBidManagerLoggerGem 0.0.0 baa3c08c034fb911c0e6fd90d8219b1e\nBigCat 0"
         1  Untitled 2.rb:25

         1  ",1.0.1,1.0.2 a926a2955fc394053c68a3d50b6ef48e\nRailsRemoteControl 1.0.0 850c11e07777754b00ad66f506af8ae3\nRailsRunSignUp 0.0.1,0.0.3,0.0.4,0.0.5,0.0.7 f2323ed86e442b2b33ab73aa5cd95587\nRailsTop 0.0.1,0.0"
         1  Untitled 2.rb:25

         1  ",2022.2.19,2022.2.20,2022.2.21,2022.2.22,2022.2.24,2022.3.9,2023.1.29,2023.1.31,2023.5.25,2023.7.14,2023.7.20 0b781346d3306df5e023bbf0a5d1f988\nEZAccounts 0.6.1 3f289d4a6b3d1b11d7d2c2a6dd80fa75\nEZTOP10"
         1  Untitled 2.rb:25

         1  "-graticule 0.2.7.6,0.2.7.5,0.2.7.4,0.2.7.3,0.2.7.2 8f2de7b75ea069d8e310eddef4388588\nGUI-mini_magick 1.2.3 61cc135041fcec97105c69df07095fc4\nGUnit 0.1.2,0.2.0,0.2.1,0.2.2,0.3.0,0.3.1,0.3.2,0.3.3,0.3.4,0"
         1  Untitled 2.rb:25

         1  ".3.0,1.4.0,1.4.1,1.4.2 53814ff52d4ec9f2588cad7a79e7c272\nExit_Zero 0.1.0,1.0.0,1.0.1,1.1.0,1.1.1,1.1.2,1.1.3,1.2.0,1.3.0 5ed64f091d3410b02d803e4c141bd503\nExpressQuery 0.0.1,1.0.0,1.1.1 343c7b16462239c8"
         1  Untitled 2.rb:25

         1  "03ec4a0\nModelMaker 0.1.0,0.1.1,0.1.2 fb981323c3e30f29e5c055414b0e8357\nModule2Class 0.0.1 3fdb19bffdf40d75add588a571f9c44d\nMoneyConversions 0.1.0 272811aa64258719b3adb885315ab4c9\nMoney_DaWanda 0.0.0 76"
         1  Untitled 2.rb:25

         1  "04434d414302db7a904190e363cb6cdf\nP4Ruby-mingwx86 2014.1 4b85a4fb36bc3bc3be322135637b48f9\nP5_T_34 0.1.1,0.1.2 9284c3d71f9b135a1098cfe62b5101db\nP6-alu0101111254 0.1.0 5163068706fe08a4113b0334ab392b5d\nPA"
         1  Untitled 2.rb:25

         1  "05cb89f789bfa32a1975d6f1b44\nBBB 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5,0.0.6,0.0.7,0.0.8,0.0.9,0.0.10,0.1.0,0.1.1,0.1.2,0.1.3,0.2.0,0.3.0 c191f999c3105549170c4e7bd7377f67\nBBRedCloth 0.8.0,0.8.1,0.8.2,0.8.3,0.8"
         1  Untitled 2.rb:25

         1  "0aca8d1\nLocksmith 0.0.0,0.1.1 7c310256129f799338aa71f208713348\nLogSimple 0.1 3b00fd807f15ed6447b6c6871be22e82\nLogic_test 0.0.1 7c46270d6975e6b7711b95a448f7f216\nLoomioScraper 0.1.0,0.1.1 489a4b884bf093"
         1  Untitled 2.rb:25

         1  "0bd67e707921fc\ntencentcloud-sdk-iottid 3.0.659 ced9f766d69fd167092e454ff688a6ea\ntencentcloud-sdk-mongodb 3.0.659 a080375825a02372c86c82cf8a60367f\ntencentcloud-sdk-bizlive 3.0.659 8f597c6e4c9d470a3e7e5"
         1  Untitled 2.rb:25

         1  "0feca9b939dcb97ef06aa91cb\nSpreadsheet-HTML 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5,0.0.6,0.0.7,0.0.8,0.0.9,0.0.10,0.0.11,0.0.12,1.0.0,1.0.1,1.0.2,1.0.3 25bcde480ed325cdb679dcf357532488\nSpreadsheetML 0.1 c997b7b"
         1  Untitled 2.rb:25

         1  "1.0 cd725301f04f5e09a1bf4a80b9ccdb55\nJosephPecoraro-rr 1.0.4,1.0.3 fd13aa36aeeb1b9f077b21015c33a5fe\nJosephPecoraro-whenever 0.3.2,0.3.1 5b089cca8a96705ee79bf547053fbd98\nJuice10-bogus_sass_checker 0.1."
         1  Untitled 2.rb:25

         1  "10,2.0.11 0654aae51dabfd9bc66933ef8d1afcdf\nabiquo-etk 0.4.13,0.4.14,0.4.15,0.4.16,0.4.17,0.4.18,0.4.19,0.4.20,0.4.22,0.4.23,0.4.24,0.4.25,0.4.29,0.4.32,0.4.33,0.4.42,0.5.3,0.5.8,0.5.9,0.6.0,0.6.1,0.6."
         1  Untitled 2.rb:25

         1  "15,0.1.16,0.1.17,0.1.18,0.1.19,0.1.20,0.1.21,0.1.22,0.1.23,0.1.24,0.1.25,0.1.26,0.1.27,0.1.28,0.1.29,0.1.30,0.1.31,0.1.32,0.1.33,0.1.34,0.1.35,0.1.36,0.1.37,0.1.38,0.1.39,0.1.40,0.1.41,0.1.42,0.1.43 b"
         1  Untitled 2.rb:25

         1  "198da1acadb4ca2c5364\nSFAscrapi 1.2.2 4d2def6a28e3be807146c138305f9cc2\nSFAtidy 1.1.3 79d8a59b992cfbaa960c442a71a9368b\nSFDO-API 0.1.0,0.1.1,0.1.2,0.1.3,0.1.4,0.1.5,0.1.6,0.1.7,0.1.8,0.1.9,1.0.0,1.0.1,1."
         1  Untitled 2.rb:25

         1  "1e225cb3e\nThermostat_Joppe_Declercq 0.3.3 c23c9fcd243fc450ff0709c97c12bfa0\nThiagoLelis-backgroundjob 1.0.6,1.0.5,1.0.4,1.0.3,1.0.2 68105c2e0318b2c9f5eabeec6ee89e17\nThiagoLelis-bj-utc 1.0.2,1.0.1 9eb3b"
         1  Untitled 2.rb:25

         1  "1ea1f71b64f0bc0c7d69e48e0\nSbirsp 0.0.1 c33fc055771420a79e2c73a04050bd74\nScanSSL 0.0.1,0.0.2 66fe25a024331a0a44e7b615e0c20c20\nSchedulerWindows 0.0.0 7a1ac3e86d5c07ebfedbae4b65ef664d\nSchmock 0.9 ea463b2"
         1  Untitled 2.rb:25

         1  "2.0,0.3.0,0.4.0 ee02b0a530ec1d6cea1cfff08a110298\nTxChaos 0.0.2 87d29cbac99db2e0b09e6a85e9212a26\nTypoglycaemic 0.1.1 440a6a2861bdd3760ef6a467d8c8f76f\nUCSAPI 0.0.6,0.0.7,0.0.8,0.0.9 3edfa21b8d9f62e93361"
         1  Untitled 2.rb:25

         1  "281839b65c58db947bb5b70c9\nCarRegistrationIndia 0.0.2 bf8e77bd63089f92666cbc3618aaa3ae\nCarRegistrationIreland 0.0.2 fb75737a4dea91e4484e262fd11eb2ef\nCarRegistrationItaly 0.0.2 8842075a3975ae9e21b862e8c"
         1  Untitled 2.rb:25

         1  "36f8\nIPGlider-ImmutableAttributes 1.0.3.1 93cab95102ec337edd23c0fac12fb25f\nIPGlider-annotate 2.2.7,2.2.6 3ff6d312c8646836529684e0d5595f61\nIPGlider-metric_fu 1.1.5.1,1.1.5.1.1 1b18a62f2cdf418198d219eca"
         1  Untitled 2.rb:25

         1  "389f6ed2f6660b5f967a4c0a7fbb57ae\nJOT 1.0.2,1.0.3,1.0.4,1.0.5,1.0.8 080d1e648c4bf9c1a4528e959f2f8c88\nJRank 0.1.0 9a3a7e55a67abc57dc72ecc16d7f37d9\nJRuby-OpenSSL 0.1 4c413ca5761334b4008ba22410e5dbe0\nJSON"
         1  Untitled 2.rb:25

         1  "3a\nAndrey-hello_world 0.1.0,0.1.1,0.1.2,0.1.3 a6b7b86ac146e0be5549933faa2c587b\nAndyFirstGem 0.1.0 2ca49468af6351fe5b0ed7a2c7eb3176\nAnilistrb 0.1.1,0.1.2 34762f1030d18ee68bedf6942e3f06a4\nAnimeDL 0.1.1,"
         1  Untitled 2.rb:25

         1  "53aba615e0ed5b07678\nabnf-parsing 0.2.0,0.2.1,0.2.2 01b99d03fb2fcd0c5d11f17fed9af514\nabnftt 0.1.1,0.2.1,0.2.2,0.2.3,0.2.4 769ceeedab9574e3556535947c27c6fc\nabnormal 0.0.0 3565d01266f5d7d1a11a022a2aa73a4"
         1  Untitled 2.rb:25

         1  "5be08a87430\naaronh-chronic 0.3.9 5e92631e9e460e49a48ed21e8923b684\naaronp-frex 1.0.1,1.0.0 4c51934a49cd360790d34e8fb1fdf3ad\naaronp-meow 1.1.0,1.0.0 696b645f09e8b577a0ba12803e776e2b\naaronp-zomg 1.0.2.20"
         1  Untitled 2.rb:25

         1  "621561a661\nPr0d1r2-paperclip-time-stamped 1.0.2,1.0.1,1.0.0 609c1dd80afe644bfb48d94f273367fd\nPr0d1r2-resource_controller 0.6.6 03a09281a4d9fe554b186e84715026b5\nPractica9 0.0.1 aa353d3cf70d946e1c8bf2a5"
         1  Untitled 2.rb:25

         1  "670ae1295038623dff512873250695\nDankest_View_Tool 0.1.0 3bb5d765faffdca9f7bf929a15c45e2a\nDanskeHelligdage 1.0.2 49963828c9bb510cd7bf43fe5483637e\nDarkFolio 0.1.0 770b9f4dfd501aeb6428851dc6df9673\nDashaMa"
         1  Untitled 2.rb:25

         1  "7,0.5.98,0.5.99,0.5.95,0.6.1,0.6.2,0.6.7,0.6.8,0.6.10,0.6.11,0.6.12,0.6.16,0.6.21,0.6.22,0.6.24,0.6.25,0.6.26,0.6.27,0.6.28,0.6.29,0.6.31,0.6.32,0.6.33,0.6.34,0.6.35,0.6.36,0.6.38,0.6.39,0.6.40,0.6.41"
         1  Untitled 2.rb:25

         1  "754349c6d\nClickClack 0.0.1,0.0.2,0.0.3 b7861c5c4d277c78365a80a426a151a4\nClickSpotter 0.1.1 6f59a77ea207488b629a54785faa0ae1\nClockOFF 0.0.1 4536a463ee6cdd81ccdd37db582195bb\nClosestSum 1.0.0-universal-l"
         1  Untitled 2.rb:25

         1  "7d\nVladTheEnterprising 0.1.6,0.1.7,0.1.8,0.2,0.1.4,0.1.5 b1e9969dec1927f7472ed610d894e4d5\nVoiceIt2 1.0.0,1.0.1,1.0.2,1.0.3,1.0.4,1.0.5,1.0.6,2.0.0,3.0.0,3.0.1,3.0.2,3.1.0,3.2.0,3.3.0,3.4.0,3.5.0,3.6.0"
         1  Untitled 2.rb:25

         1  "8d1bd01cb62e3e633732\nWinkerAI 0.0.01,0.0.02,0.0.03,0.0.04,0.0.05 51a44b0d85415c91c048d85ff44008ec\nWireAPI 0.3,0.4,0.5 ae643fea15a4ef81e8a1e30b2ec8de5f\nWishky 2.0.1 9d48d2ed9c14c714df2944d808427a29\nWon"
         1  Untitled 2.rb:25

         1  "9363b592abb629bc706d56c05cc65\nRiFF-RAFF 0.0.1 9715c4dd1fd3a3d4d006c233f764b6e9\nRidiculous 0.3,0.5 fe1500d870f555a6931d44e2d345378a\nRightScaleAPIHelper 0.4.0,0.4.1,0.4.2,0.4.3,0.4.4,0.4.5,0.4.6,0.5.7,0"
         1  Untitled 2.rb:25

         1  "9632c67031472\nSwfUtil 0.01 8dc23bb47a31c63e69a3825c64995f56\nSyd-sinatra 0.9.0.4,0.9.0.2,0.3.2 452d71d3bc8b9e9b264f409f7d87bc39\nSykGenData 0.1.2,0.1.3,0.1.4,0.1.5,0.1.6,0.1.7,0.1.8,0.1.9 1c668d3ba14e52"
         1  Untitled 2.rb:25

         1  "9f96c69acf7606a072d2a5df\nActionMailer-Base-to-use-an-absolute-path-template 1.0.0 63f40c52ceaf75627747dd4e5a32d375\nActionPool 0.0.1,0.1.0,0.2.0,0.2.1,0.2.2,0.2.3 9b9a08e67027c1e35cae5a25434ac681\nActio"
         1  Untitled 2.rb:25

         1  "FA 0.2.0 e8b428c46a768a65844551a3ab122024\nLIS-OUT 0.0.0 b957020ea9a2fb5548fb83caeecf978d\nLJ_RPS 0.1.0,0.1.1 69de688a86bdd96c832fc6c980daa434\nLMG_modbus 1.0.2,1.0.3,1.0.4 2ca98d4939106231269d2f9e845396"
         1  Untitled 2.rb:25

         1  "Lab_Plaid 0.0.0,0.0.1 b145c6cf314454f260031a7a9a523ca9\nDirTagger 2.0.0,2.0.1,2.0.3,2.0.4,2.0.5,2.0.6 f8d1ce386d30bb8c37c6ec2a95774443\nDirectorio_de_trabajo_del_equipo 0.1.0 8a64ffd5183e066763632670ea1"
         1  Untitled 2.rb:25

         1  "MediumToMarkdown 1.0.0,1.0.1,1.0.2,1.0.3,1.1.0,1.2.0,1.2.1,1.2.2,1.3.0,1.3.1,1.3.2,1.3.3,1.3.4,1.3.5,1.3.6,1.3.7,1.3.8,1.3.9,1.4.0,1.4.1,1.4.2,1.4.7,1.4.8,1.5.0,1.6.0,1.6.1,1.6.2,1.6.3,1.7.0,1.7.1,1.7"
         1  Untitled 2.rb:25

         1  "Ruby 0.0.1 6fceeb05f6d4d9a23fe96e03fc7a6dc8\nHelloRubyGem 0.0.0 429f816abe827687fb51b7554764966d\nHelloWorldAA 0.0.1 5b82d9aaf57d7c2d94246ec97c2116e6\nHelloWorldAlephCampos 0.1.0 420a0b341620a28bea7b3066"
         1  Untitled 2.rb:25

         1  "aDA 1.0.0,1.0.1,1.1.0 34a580b2a4b44913cb7348775de2fed4\nGradleSearch 1.0.0 6d43f990fee5dbaa04f5143627683d73\nGradleSearchResTools 1.0.3,1.0.4 769acf384902684100395999479c754a\nGraphUtils 0.0.0,0.0.1 1825"
         1  Untitled 2.rb:25

         1  "c3fe7fe590f78dae951c\nQuintero_view_tool 0.1.0 234f75130040fab82a8a0f7fef9afaf7\nQuipu 0.1.0 89a05f2eb212e7ea88b5a507d3f61eb6\nR3T 0.0.1 e6fcbebbb855226536dc400837ce88e1\nR3c 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5"
         1  Untitled 2.rb:25

         1  "c830044d0ce1\nCASProjectServer 0.0.2 2db2c12bd35838013bad0bf0234fa3a6\nCA_breweries 0.1.0,0.1.1,0.1.2,0.1.3 4d32740df4a936df41c4d3bd45689b06\nCCHIT-xds-facade 0.1.1 dc01fed8498d33484576a2753eb3ce7c\nCFPro"
         1  Untitled 2.rb:25

         1  "cac3d5f25c3b48a31\nCpSoButtons 0.0.1,0.0.2,0.0.3,0.0.4,0.0.8 b2e14092aec2ecb6e80fd45dc9109322\nCreamCheese 1.0.2,1.0.3 8a0c8536d02e0c855ef276c272f8ef34\nCreizer-Meli 3.0.0,3.0.1 9279a45029583a02c4572f66b"
         1  Untitled 2.rb:25

         1  "d1cc61cc8d8fb001fd6d06b2e9b\nNCIRLFileUploader 0.0.1,0.0.12,0.0.13 a21b2bff3a253b98c2501161d3b8786a\nNDKMeans 0.0.1,0.0.2 e570c4bcc4c75c703576f5cbaa86dac9\nND_Studio_Game_Coursework 1.0.0 643e814e3d3118e"
         1  Untitled 2.rb:25

         1  "ef89bbf3202dfab9e48fb4efa0c0ede\nabbyy 0.2.0,0.2.1 18a30aca9c56f950480e7abf9afa8952\nabbyy-cloud 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5,0.0.6,0.0.7,0.0.8,0.0.9,0.0.10 39dda5f39733b8e478e0639057b964cd\nabbyy-ruby "
         1  Untitled 2.rb:25

         1  "f5e13d5172e8baa5956f4364ef\nManfred-Turippu 0.1.0 b6da62eb14876b6502735f319ac3275f\nMangUpdate 0.0.1,0.0.2 2be47f50f357ac759260e26a6007ffe9\nMange-field_helpers 1.0.1 5e1007621f38e4cd52c1a0765ca4ec27\nMan"
         1  Untitled 2.rb:25

         1  "fa39e5e0d\nPerson 1.0.0 8e150bf5fa62743ada26e6a6db730156\nPetSearch 0.0.1,0.0.2,0.0.3,0.0.5,0.0.6,0.0.7.1,0.0.7.7,0.0.7.8,0.0.7.9,0.0.8.2.3 b4036b50d75ca17cd302f17ed360e232\nPeterCoulton-mygemify 0.0.3 1"
         1  Untitled 2.rb:25

         1  "fefb3b8a939c0abc0bdda44620bb4f2c\nNoDevent 0.0.2 1e08f8e0a1ae74cae3fb2a0bcc0e6552\nNoNo 0.0.1,0.0.2 834f380817d072110f76f5bc92130380\nNordes-logstash-input-azureblob 0.9.5.pre.1 f6ea8c9b99802b56875158b5b"
         1  Untitled 2.rb:25

         1  "lightXML2RESTDriver 0.1.0,0.1.1,0.1.2,0.1.3,0.1.4,0.1.5,0.1.6 a111aa5db4c4f72d3dd13849b5d76ade\nFlipkartSeller 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5,0.0.6,0.0.7,0.0.8 3538a0f17a8f350cc3344d25dcbe8a2f\nFlipper 1"
         1  Untitled 2.rb:25

         1  "s 1.0.0 3fd17aec49e8efd8ab1e3f9bb04adea7\na_little_less 0.2.1,0.2.2,0.2.3,0.2.4 9e4eb8da7cf6f46b0f31475f9f586d87\na_marmita 0.0.1,0.5.8,0.5.9,0.5.10,0.5.11,0.5.12 d5a060341305461347e7f873142b2b0c\na_maze"
         1  Untitled 2.rb:25

         1  "t_int 0.0.3 6002cccdc2ddb5e28f944912e40e9e45\nabstats 0.0.1 4ef3552178957a15f35b736ed2d7084f\nabstract 1.0.0 bfa6d0ef9e861f83938206e083721eb6\nabstract-data-types 0.0.1 cfafc4f9be16f2fdbbb4bbf634de9796\na"
         1  Untitled 2.rb:25

         1  "test 0.0.1,0.0.2,0.1.0 6ba7b7c35b2610530cc01506d2727daf\na1447ll_test 0.0.1 9ff322f022cb696f92d5906463282287\na14z6ch_elapsed_days 0.0.1,0.0.2,0.0.3,0.0.4,0.0.5 060a3df67e6ec087fec63e1d5986217a\na14z8fy_"
         1  Untitled 2.rb:25

After

Total allocated: 0 B (0 objects)
Total retained:  0 B (0 objects)

allocated memory by gem
-----------------------------------
NO DATA

allocated memory by file
-----------------------------------
NO DATA

allocated memory by location
-----------------------------------
NO DATA

allocated memory by class
-----------------------------------
NO DATA

allocated objects by gem
-----------------------------------
NO DATA

allocated objects by file
-----------------------------------
NO DATA

allocated objects by location
-----------------------------------
NO DATA

allocated objects by class
-----------------------------------
NO DATA

retained memory by gem
-----------------------------------
NO DATA

retained memory by file
-----------------------------------
NO DATA

retained memory by location
-----------------------------------
NO DATA

retained memory by class
-----------------------------------
NO DATA

retained objects by gem
-----------------------------------
NO DATA

retained objects by file
-----------------------------------
NO DATA

retained objects by location
-----------------------------------
NO DATA

retained objects by class
-----------------------------------
NO DATA

ext/zlib/zlib.c Outdated Show resolved Hide resolved
@martinemde
Copy link
Contributor

Nice work @segiddins. I tested it out and it seems to work fine still for extracting gems. I agreed that read needs to accept an outbuf.

segiddins added a commit to rubygems/rubygems that referenced this pull request Sep 19, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
@segiddins
Copy link
Contributor Author

segiddins commented Sep 19, 2023

New benchmark:

Script
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'rubygems'
require 'bundler/inline'

gemfile do
  source "https://rubygems.org/"
  gem 'memory_profiler'
  gem 'zlib'#, path: "/Users/segiddins/Development/github.com/ruby/zlib"
  gem 'benchmark-ipsa'
end


len = 16_384


Benchmark.ipsa do |x|
  x.report("read #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len until f.eof
  ensure
    f&.close
  end
  
  x.report("readpartial #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len until f.eof
  ensure
    f&.close
  end
  
  x.report("readpartial #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len, buf until f.eof
  ensure
    f&.close
  end
  
  read_takes_buf = begin
  Zlib::GzipReader.open("/tmp/versions.gz") { |f| f.read nil, nil }
  rescue ArgumentError
    false
  else
    true
  end
  
  x.report("read #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len, buf until f.eof
  ensure
    f&.close
  end if read_takes_buf
  

  x.compare!
end
Before
Allocations -------------------------------------
read 16384 until eof    4488/0  alloc/ret       50/0  strings/ret
                    26.13 MB/0 B  alloc/ret
readpartial 16384 until eof
                        6728/0  alloc/ret       50/0  strings/ret
                    31.78 MB/0 B  alloc/ret
readpartial 16384, buf until eof
                        3411/0  alloc/ret       50/0  strings/ret
                    24.85 MB/0 B  alloc/ret
Warming up --------------------------------------
read 16384 until eof     1.000  i/100ms
readpartial 16384 until eof
                         1.000  i/100ms
readpartial 16384, buf until eof
                         1.000  i/100ms
Calculating -------------------------------------
read 16384 until eof     13.390  (± 7.5%) i/s -     67.000 
readpartial 16384 until eof
                         12.511  (± 8.0%) i/s -     63.000 
readpartial 16384, buf until eof
                         12.463  (± 8.0%) i/s -     63.000 

Comparison:
read 16384 until eof:       13.4 i/s
readpartial 16384 until eof:       12.5 i/s - same-ish: difference falls within error
readpartial 16384, buf until eof:       12.5 i/s - same-ish: difference falls within error
After
Allocations -------------------------------------
read 16384 until eof    4488/0  alloc/ret       50/0  strings/ret
                    26.13 MB/0 B  alloc/ret
readpartial 16384 until eof
                        4488/0  alloc/ret       50/0  strings/ret
                    26.13 MB/0 B  alloc/ret
readpartial 16384, buf until eof
                           9/0  alloc/ret        3/0  strings/ret
                    5.03 kB/0 B  alloc/ret
read 16384, buf until eof
                           9/0  alloc/ret        3/0  strings/ret
                    5.03 kB/0 B  alloc/ret
Warming up --------------------------------------
read 16384 until eof     1.000  i/100ms
readpartial 16384 until eof
                         1.000  i/100ms
readpartial 16384, buf until eof
                         1.000  i/100ms
read 16384, buf until eof
                         1.000  i/100ms
Calculating -------------------------------------
read 16384 until eof     12.839  (± 7.8%) i/s -     64.000 
readpartial 16384 until eof
                         13.298  (± 7.5%) i/s -     67.000 
readpartial 16384, buf until eof
                         13.883  (± 7.2%) i/s -     70.000 
read 16384, buf until eof
                         13.274  (±15.1%) i/s -     64.000 

Comparison:
readpartial 16384, buf until eof:       13.9 i/s
readpartial 16384 until eof:       13.3 i/s - same-ish: difference falls within error
read 16384, buf until eof:       13.3 i/s - same-ish: difference falls within error
read 16384 until eof:       12.8 i/s - same-ish: difference falls within error

In summary:

  • no change to read 16384 until eof
  • allocations drop by a third on readpartial 16384 until eof
  • allocations go to a constant number on read/readpartial 16384, buf until eof
  • no degradation of performance on any of those cases

@hsbt this will have a significant benefit for RubyGems/Bundler/RubyGems.org, I'd greatly appreciate your help in getting this reviewed 🙇🏻‍♂️

ps.

ran two more benchmarks, one for read nil and one for read nil, buf.
no change in read nil, and read nil, buf drops allocations significantly, down to 4 strings (from 50) and 26.23MB, from 33.15, so I think there is still room to optimize the case where read with no length and no buffer is passed in.

pps.

This drops allocations for IO.copy_stream out to nearly 0, and in my testing speeds it up by maybe 10%

ext/zlib/zlib.c Outdated
// fill stream until the buffer can be filled. while this isn't strictly
// necessary, it does help cut down on the number of calls to readpartial
// in the typical case.
while (!ZSTREAM_IS_FINISHED(&gz->z) && ZSTREAM_BUF_FILLED(&gz->z) < len) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the bug I was seeing. It wasn't reading enough to fill the len requested, so it was returning less than requested even when the amount was available.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that what readpartial means?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that readpartial reads as much as it can, up to the limit, unless it would block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i fixed now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also mistaken about the exact behavior of readpartial. Sorry for any confusion. I double checked this branch myself locally and this definitely solves #56.

@segiddins
Copy link
Contributor Author

@hsbt would appreciate a review on this one when you get a chance, thanks!

@hsbt hsbt self-assigned this Oct 3, 2023
Copy link
Member

@hsbt hsbt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay to merge this. But I would like to review this by @nobu or other ruby core member.

ext/zlib/zlib.c Outdated Show resolved Hide resolved
ext/zlib/zlib.c Outdated
// fill stream until the buffer can be filled. while this isn't strictly
// necessary, it does help cut down on the number of calls to readpartial
// in the typical case.
while (!ZSTREAM_IS_FINISHED(&gz->z) && ZSTREAM_BUF_FILLED(&gz->z) < len) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that what readpartial means?

martinemde pushed a commit to rubygems/rubygems that referenced this pull request Oct 12, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
@segiddins
Copy link
Contributor Author

@nobu i think I've addressed your concerns?

@nobu
Copy link
Member

nobu commented Oct 14, 2023

test/fixtures directory will be rejected to merge to ruby/ruby.

@segiddins
Copy link
Contributor Author

How would you like me to include the text fixture if not in a fixtures directory?

@deivid-rodriguez
Copy link

I think ruby-core backport script rejects any changes that pollute the global test folder in ruby-core with a fixtures or lib folders: https://github.com/ruby/ruby/blob/f6e2f32a962c2c19a0418dfd2020c8c9ab32c8b0/tool/sync_default_gems.rb#L427-L435.

Maybe if you move the fixtures folder to test/zlib it would be acceptable for ruby-core?

@nobu
Copy link
Member

nobu commented Oct 16, 2023

I wonder the necessity of fixtures in this case at least.
Is it necessary to be a pre-existing file?

@segiddins
Copy link
Contributor Author

I'm happy to replace the fixture but I don't know how to manually generate a gzip file that demonstrates the same issue with readpartial -- it doesn't surface on all gzips. cc @martinemde

@martinemde
Copy link
Contributor

All I can offer right now is that I previously tried unzipping and rezipping this data.tar. The freshly gzipped file no longer causes the issue. Including this gem in the test is a real world proof.

We could instead hunt around for a smaller example gem, if this is a size issue, or try to figure out what it is about this one file and recreate it (I don't have that expertise). We could also just generate random gzips until we find one that fails. 😁

@nobu
Copy link
Member

nobu commented Oct 24, 2023

diff --git c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz i/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz
deleted file mode 100644
index 6e5581c..0000000
Binary files c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz and /dev/null differ
diff --git c/test/zlib/test_zlib.rb i/test/zlib/test_zlib.rb
index 9d18058..5811f8b 100644
--- c/test/zlib/test_zlib.rb
+++ i/test/zlib/test_zlib.rb
@@ -1005,20 +1005,20 @@ if defined? Zlib
           assert_equal "".b, s
           assert_predicate f, :eof?
         end
+      }
 
-        File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
-          Zlib::GzipReader.wrap(f) do |gzio|
-            count = 0
-            until gzio.eof?
-              s = gzio.read(16_384)
-              count += s.size
-              assert_predicate gzio, :eof? if s.empty?
-            end
-            assert_equal 59904, count
+      compressed_data_file do |f, size|
+        Zlib::GzipReader.wrap(f) do |gzio|
+          count = 0
+          until gzio.eof?
+            s = gzio.read(16_384)
+            count += s.size
+            assert_predicate gzio, :eof? if s.empty?
           end
-          assert_predicate f, :closed?
+          assert_equal size, count
         end
-      }
+        assert_predicate f, :closed?
+      end
     end
 
     def test_readpartial
@@ -1043,7 +1043,7 @@ if defined? Zlib
         end
       }
 
-      File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+      compressed_data_file do |f, size|
         Zlib::GzipReader.wrap(f) do |gzio|
           count = 0
           until gzio.eof?
@@ -1051,12 +1051,12 @@ if defined? Zlib
             count += s.size
             assert_predicate gzio, :eof? if s.empty?
           end
-          assert_equal 59904, count
+          assert_equal size, count
         end
         assert_predicate f, :closed?
       end
 
-      File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+      compressed_data_file do |f, size|
         Zlib::GzipReader.wrap(f) do |gzio|
           count = 0
           s = String.new(capacity: 16_384, encoding: Encoding::BINARY)
@@ -1071,7 +1071,7 @@ if defined? Zlib
               break
             end
           end
-          assert_equal 59904, count
+          assert_equal size, count
         end
       end
     end
@@ -1271,6 +1271,17 @@ if defined? Zlib
       }
     end
 
+    private
+    def compressed_data_file
+      Tempfile.create(caller_locations(1, 1).first.base_label) {|t|
+        size = Zlib::GzipWriter.open(t.path) {|gz|
+          data = File.read(__FILE__)
+          gz.print(data)
+          data.size
+        }
+        yield t, size
+      }
+    end
   end
 
   class TestZlibGzipWriter < Test::Unit::TestCase

@segiddins
Copy link
Contributor Author

@nobu I just tried reverting some of my changes, and the test in your patch still passes when I undo the change at the end of gzfile_readpartial

segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 8, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 11, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
segiddins added a commit to rubygems/rubygems that referenced this pull request Dec 11, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.
matzbot pushed a commit to ruby/ruby that referenced this pull request Dec 11, 2023
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained:  40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained:  40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement
in total number of allocations.

Additionally, this will pay huge dividends once
ruby/zlib#61 is merged, as it will allow us to
completely avoid allocations in the repeated calls to readpartial,
which currently accounts for most of the memory usage shown above.

rubygems/rubygems@f78d45d927
@nobu
Copy link
Member

nobu commented Dec 16, 2023

@nobu I just tried reverting some of my changes, and the test in your patch still passes when I undo the change at the end of gzfile_readpartial

Isn't this PR an improvement of readpartial and unrelated to #56?
Why mixing unrelated fix?

@martinemde
Copy link
Contributor

@nobu It sounds like you don't want this PR to include the bugfix. It seems like including an effective test for this bug makes this PR unmergable.

How would you suggest we proceed?

@nobu
Copy link
Member

nobu commented Dec 17, 2023

@nobu It sounds like you don't want this PR to include the bugfix.

Yes, exactly.
It's a different story.

@martinemde
Copy link
Contributor

martinemde commented Dec 17, 2023

I was under the impression that the whole point of this PR is to fix #56, and the performance improvements were found as part of the fix.

Should we remove the bug fix from this PR so readpartial continues to be broken but is more memory efficient, or should we remove the memory efficiency improvements and just fix the bug?

@martinemde
Copy link
Contributor

Note: rubygems does have 3 test fixture .gem files in test/rubygems/packages.

@nobu
Copy link
Member

nobu commented Dec 18, 2023

I was under the impression that the whole point of this PR is to fix #56, and the performance improvements were found as part of the fix.

The title and description look only for the latter, but nothing about #56.

Should we remove the bug fix from this PR so readpartial continues to be broken but is more memory efficient, or should we remove the memory efficiency improvements and just fix the bug?

I'm ok both, but you'll need to edit the title in the latter case.

@segiddins
Copy link
Contributor Author

@nobu I removed the fix for #56

@segiddins
Copy link
Contributor Author

Came up with a new fix for #56 in #73, would appreciate another review of this PR and that one when you get a chance!

@segiddins
Copy link
Contributor Author

Rebased with truffleruby CI fix, I believe this now addresses @nobu's concerns and should pass CI, cc @hsbt

@martinemde
Copy link
Contributor

@nobu, Have you given up on this PR?

@segiddins
Copy link
Contributor Author

@nobu @nurse I would greatly appreciate a look at this change 🙇🏻

@martinemde
Copy link
Contributor

martinemde commented Mar 6, 2024

@nobu Could you give us some feedback here about what Sam or I could do to get this PR ready? We think it would enable better performance in rubygems now that we will soon be able to use readpartial again.

Any ideas?

@deivid-rodriguez
Copy link

Maybe we can try opening another ruby-core ticket about this performance improvement. That worked great for getting the bug fix merged.

…d an outbuf

This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present
@segiddins segiddins force-pushed the segiddins/readpartial-buffer branch from 70a688a to 07f44b7 Compare April 11, 2024 19:28
@eregon
Copy link
Member

eregon commented Apr 12, 2024

Maybe we can try opening another ruby-core ticket about this performance improvement. That worked great for getting the bug fix merged.

Yep, that seems a good way to get attention on this. Also note you should add such tickets to the dev meeting ticket to make sure it gets reviewed in the monthly dev meeting.

@segiddins
Copy link
Contributor Author

https://bugs.ruby-lang.org/issues/20424

@hsbt hsbt assigned nobu and unassigned hsbt Apr 19, 2024
@hsbt hsbt merged commit 950dad7 into ruby:master May 14, 2024
29 checks passed
@segiddins segiddins deleted the segiddins/readpartial-buffer branch May 14, 2024 21:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants