Avoid allocating intermediary strings when readpartial is passed an outbuf #61
Conversation
Nice work @segiddins. I tested it out and it still seems to work fine for extracting gems. I agreed that
For now, on a small rails app I have hanging around:

```
==> memprof.after.txt <==
Total allocated: 872.51 MB (465330 objects)
Total retained: 40.48 kB (326 objects)

==> memprof.before.txt <==
Total allocated: 890.79 MB (1494026 objects)
Total retained: 40.40 kB (328 objects)
```

Not a huge difference in memory usage, but it's a drastic improvement in the total number of allocations. Additionally, this will pay huge dividends once ruby/zlib#61 is merged, as it will allow us to completely avoid allocations in the repeated calls to readpartial, which currently account for most of the memory usage shown above.
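For reference, a minimal sketch of how allocation numbers like these can be collected with memory_profiler; the gem file name and output path are placeholders, not the exact script used for the figures above:

```ruby
require "memory_profiler"
require "rubygems/package"
require "tmpdir"

# Profile allocations while extracting a gem's contents; the .gem path here is
# just a placeholder for whichever gem you want to measure.
report = MemoryProfiler.report do
  Dir.mktmpdir do |dir|
    Gem::Package.new("ruby-progressbar-1.13.0.gem").extract_files(dir)
  end
end

# Writes a summary in the same style as the memprof.*.txt output quoted above.
report.pretty_print(to_file: "memprof.txt")
```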
New benchmark:

Script:

```ruby
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'rubygems'
require 'bundler/inline'

gemfile do
  source "https://rubygems.org/"
  gem 'memory_profiler'
  gem 'zlib' #, path: "/Users/segiddins/Development/github.com/ruby/zlib"
  gem 'benchmark-ipsa'
end

len = 16_384

Benchmark.ipsa do |x|
  x.report("read #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len until f.eof
  ensure
    f&.close
  end

  x.report("readpartial #{len} until eof") do
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len until f.eof
  ensure
    f&.close
  end

  x.report("readpartial #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.readpartial len, buf until f.eof
  ensure
    f&.close
  end

  read_takes_buf = begin
    Zlib::GzipReader.open("/tmp/versions.gz") { |f| f.read nil, nil }
  rescue ArgumentError
    false
  else
    true
  end

  x.report("read #{len}, buf until eof") do
    buf = String.new(capacity: len, encoding: Encoding::BINARY)
    f = Zlib::GzipReader.open("/tmp/versions.gz")
    f.read len, buf until f.eof
  ensure
    f&.close
  end if read_takes_buf

  x.compare!
end
```

Before
After
In summary:
@hsbt this will have a significant benefit for RubyGems/Bundler/RubyGems.org; I'd greatly appreciate your help in getting this reviewed 🙇🏻♂️

ps. ran two more benchmarks, one for

pps. This drops allocations for
ext/zlib/zlib.c (Outdated)

```c
// fill stream until the buffer can be filled. while this isn't strictly
// necessary, it does help cut down on the number of calls to readpartial
// in the typical case.
while (!ZSTREAM_IS_FINISHED(&gz->z) && ZSTREAM_BUF_FILLED(&gz->z) < len) {
```
This is probably the bug I was seeing. It wasn't reading enough to fill the len requested, so it was returning less than requested even when the amount was available.
Isn't that what readpartial means?
My understanding is that readpartial reads as much as it can, up to the limit, unless it would block.
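A minimal illustration of that contract using a plain pipe (core Ruby only, not GzipReader): readpartial returns at most the requested number of bytes, and only blocks when nothing is available yet.

```ruby
r, w = IO.pipe
w.write("hello")

# Only 5 bytes are buffered, so readpartial returns them immediately
# rather than blocking until 16_384 bytes arrive.
p r.readpartial(16_384)  # => "hello"

w.close
# Once the writer is closed and the buffer is empty, readpartial raises EOFError.
begin
  r.readpartial(16_384)
rescue EOFError
  puts "EOF reached"
end
```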
Yes, I fixed it now.
I was also mistaken about the exact behavior of readpartial. Sorry for any confusion. I double checked this branch myself locally and this definitely solves #56.
@hsbt would appreciate a review on this one when you get a chance, thanks!
I'm okay with merging this, but I would like it to be reviewed by @nobu or another Ruby core member.
Force-pushed from 9c41288 to 462bbf1.
@nobu I think I've addressed your concerns?
How would you like me to include the test fixture if not in a fixtures directory?
I think the ruby-core backport script rejects any changes that pollute the global

Maybe if you move the
I wonder about the necessity of fixtures in this case, at least.
I'm happy to replace the fixture, but I don't know how to manually generate a gzip file that demonstrates the same issue with readpartial -- it doesn't surface on all gzips. cc @martinemde
All I can offer right now is that I previously tried unzipping and rezipping this data.tar. The freshly gzipped file no longer causes the issue. Including this gem in the test is a real-world proof. We could instead hunt around for a smaller example gem, if this is a size issue, or try to figure out what it is about this one file and recreate it (I don't have that expertise). We could also just generate random gzips until we find one that fails. 😁
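For anyone who wants to try that brute-force idea, a rough sketch of the search; it assumes the #56 symptom shows up as readpartial returning fewer bytes than requested before EOF, and (as noted above) freshly generated gzips may well never trigger it:

```ruby
require "zlib"
require "stringio"
require "securerandom"

CHUNK = 16_384

# Compress random payloads and flag any where readpartial comes up short
# before EOF -- purely a sketch of the "generate random gzips" idea above.
100.times do |i|
  data = SecureRandom.random_bytes(200_000)
  gzipped = Zlib.gzip(data)
  Zlib::GzipReader.wrap(StringIO.new(gzipped)) do |gz|
    until gz.eof?
      chunk = gz.readpartial(CHUNK)
      if chunk.bytesize < CHUNK && !gz.eof?
        puts "candidate #{i}: short read of #{chunk.bytesize} bytes before EOF"
      end
    end
  end
end
```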
diff --git c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz i/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz
deleted file mode 100644
index 6e5581c..0000000
Binary files c/test/zlib/fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz and /dev/null differ
diff --git c/test/zlib/test_zlib.rb i/test/zlib/test_zlib.rb
index 9d18058..5811f8b 100644
--- c/test/zlib/test_zlib.rb
+++ i/test/zlib/test_zlib.rb
@@ -1005,20 +1005,20 @@ if defined? Zlib
assert_equal "".b, s
assert_predicate f, :eof?
end
+ }
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
- Zlib::GzipReader.wrap(f) do |gzio|
- count = 0
- until gzio.eof?
- s = gzio.read(16_384)
- count += s.size
- assert_predicate gzio, :eof? if s.empty?
- end
- assert_equal 59904, count
+ compressed_data_file do |f, size|
+ Zlib::GzipReader.wrap(f) do |gzio|
+ count = 0
+ until gzio.eof?
+ s = gzio.read(16_384)
+ count += s.size
+ assert_predicate gzio, :eof? if s.empty?
end
- assert_predicate f, :closed?
+ assert_equal size, count
end
- }
+ assert_predicate f, :closed?
+ end
end
def test_readpartial
@@ -1043,7 +1043,7 @@ if defined? Zlib
end
}
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+ compressed_data_file do |f, size|
Zlib::GzipReader.wrap(f) do |gzio|
count = 0
until gzio.eof?
@@ -1051,12 +1051,12 @@ if defined? Zlib
count += s.size
assert_predicate gzio, :eof? if s.empty?
end
- assert_equal 59904, count
+ assert_equal size, count
end
assert_predicate f, :closed?
end
- File.open(File.expand_path("fixtures/ruby-progressbar-1.13.0.gem.data.tar.gz", __dir__), "rb") do |f|
+ compressed_data_file do |f, size|
Zlib::GzipReader.wrap(f) do |gzio|
count = 0
s = String.new(capacity: 16_384, encoding: Encoding::BINARY)
@@ -1071,7 +1071,7 @@ if defined? Zlib
break
end
end
- assert_equal 59904, count
+ assert_equal size, count
end
end
end
@@ -1271,6 +1271,17 @@ if defined? Zlib
}
end
+ private
+ def compressed_data_file
+ Tempfile.create(caller_locations(1, 1).first.base_label) {|t|
+ size = Zlib::GzipWriter.open(t.path) {|gz|
+ data = File.read(__FILE__)
+ gz.print(data)
+ data.size
+ }
+ yield t, size
+ }
+ end
end
class TestZlibGzipWriter < Test::Unit::TestCase
@nobu I just tried reverting some of my changes, and the test in your patch still passes when I undo the change at the end of
@nobu It sounds like you don't want this PR to include the bugfix. It seems like including an effective test for this bug makes this PR unmergeable. How would you suggest we proceed?
Yes, exactly.
I was under the impression that the whole point of this PR was to fix #56, and the performance improvements were found as part of the fix. Should we remove the bug fix from this PR, so readpartial continues to be broken but is more memory-efficient, or should we remove the memory-efficiency improvements and just fix the bug?
Note: rubygems does have 3 test fixture .gem files in test/rubygems/packages.
The title and description only cover the latter, and say nothing about #56. I'm OK with both, but you'll need to edit the title in the latter case.
Force-pushed from dac893a to 8ad02d4.
Force-pushed from 8ad02d4 to 70a688a.
@nobu, have you given up on this PR?
@nobu Could you give us some feedback here about what Sam or I could do to get this PR ready? We think it would enable better performance in rubygems now that we will soon be able to use readpartial again. Any ideas?
Maybe we can try opening another ruby-core ticket about this performance improvement. That worked great for getting the bug fix merged.
Avoid allocating intermediary strings when readpartial is passed an outbuf

This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present
Force-pushed from 70a688a to 07f44b7.
Yep, that seems like a good way to get attention on this. Also note you should add such tickets to the dev-meeting ticket to make sure it gets reviewed in the monthly dev meeting.
This accounts for a significant number of string allocations when reading rubygems, but we can avoid that in many places by only copying into the outbuf when present
This is my first time in a while writing C code for Ruby, so please excuse any style errors.
cc @martinemde
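For callers, the intended usage pattern is the same one the benchmark exercises: reuse a single binary buffer across readpartial calls so no per-call string is allocated. A minimal sketch (the file name is a placeholder):

```ruby
require "zlib"

LEN = 16_384
# One reusable binary buffer; with this change, readpartial copies directly
# into it instead of allocating a fresh intermediary string on every call.
buf = String.new(capacity: LEN, encoding: Encoding::BINARY)

Zlib::GzipReader.open("/tmp/versions.gz") do |gz|
  until gz.eof?
    gz.readpartial(LEN, buf)
    # use buf here; the next call overwrites its contents
  end
end
```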
Verified improvement via this script:
script
Before
After