Add new ZipWithoutBlock cop #462
Some of the gain here comes from simply not allocating the block. But I think most of it comes from the lower level conversion of the array into an array of arrays.
Dear @koic and @Earlopain, Thank you so much for your consideration!
This largely makes sense to me. Never really thought about using `zip` without arguments, that's a nice idea.
I think this pattern is relatively common in Rails where they need to handle composite primary keys. I'll edit my findings in later if I decide to check. Edit: Nope, I was thinking about wrapping a single value in an array like `[id]`.
Tip: use `benchmark-ips` for future benchmarks. The output is much more legible.
I really appreciate the thoughtful comments, @Earlopain. Fully revised and ready for review. I had run the original version on 50,000 files. I added 2 more tests and diversified the input slightly to get confident in the new
Thanks for the
Almost done from my side, just a few remaining things.
Can you also add a test for `map` and `map {}`? These are common places for errors in the past. I think it works fine in your case right now.
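For context, a minimal plain-Ruby sketch (not taken from the PR's spec file) of why these two forms deserve their own tests: neither produces the same result as `.zip`, so presumably the cop should leave them alone. The helper method names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper names, used only for this illustration.
def map_without_block(ids)
  ids.map # no block: returns an Enumerator, not an array of arrays
end

def map_with_empty_block(ids)
  ids.map {} # empty block: every element becomes nil
end

ids = [1, 2, 3]

p map_without_block(ids).class # Enumerator
p map_with_empty_block(ids)    # [nil, nil, nil]
p ids.zip                      # [[1], [2], [3]]
```

Only a block that wraps its single argument in an array literal is interchangeable with `.zip`.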
First off, this will be a nice improvement for our code base, as we often use and recommend the I ran some of my own benchmarks based on @corsonknowles's in the body against small (4), medium (1000) and big (50k) arrays using benchmark-ips, and also added benchmark-memory for good measure. High level, we have:
I'm kinda surprised, but memory usage ends up the same, which I'm not quite sure how to explain yet 😅 Here is the benchmark:

```ruby
require 'benchmark/ips'
require 'benchmark/memory'

arrays = {
  small: (1..4).to_a,
  medium: (1..1000).to_a,
  big: (1..50000).to_a,
}

arrays.each do |size, array|
  puts "=== #{size} ==="

  Benchmark.ips do |x|
    x.report(".map { |id| [id] }:") do
      array.map { |id| [id] }
    end

    x.report(".zip:") do
      array.zip
    end

    x.compare! order: :baseline
  end

  Benchmark.memory do |x|
    x.report(".map { |id| [id] }:") do
      array.map { |id| [id] }
    end

    x.report(".zip:") do
      array.zip
    end
  end
end
```

And the results:
Thanks, @Earlopain!
For the very curious,
I ran Is this what you are seeing? Failed examples:

```
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:123]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects str =~ /\A\^/
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:33]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects str.match? /\A\^/
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:124]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects /\A\^/ =~ str
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:213]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects str.match /\A\^/
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:34]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects /\A\^/.match? str
rspec './spec/rubocop/cop/performance/start_with_spec.rb[1:3:214]' # RuboCop::Cop::Performance::StartWith when `SafeMultiline: false` registers an offense and corrects /\A\^/.match str
```

I was able to reproduce this on the primary branch upstream as well. I was able to demonstrate that these specs will pass again when Rubocop is locked to
@corsonknowles ahh it's from a change I made upstream in rubocop/rubocop@f450f62. I'll fix it, thank you.
CI is fixed now, but "check the oldest supported rubocop version" is failing. This is because in RuboCop 1.48.1, the default target version for tests is 2.6, and numblocks were added in 2.7. Can you please tag your numblock specs with
Great catch! Added these tags. I originally added numblock handling because of the InternalAffairs rule. Ruby 2.7 went end of life in Spring '23, so if RuboCop intends to continue to test for and support even earlier versions much longer, we might want to push some guidance up into the InternalAffairs rule to include this knowledge.
@koic does one of these names work for you, or do you have a different suggestion?
Typically, cop names don't include
I think you may have just named it perfectly in the revised Description & Message. If we focus on what the fix is, like we would suggest in code review, the performant choice is to use "Zip Without a Block" -- I renamed it, let me know what you think.
Title changed from "Add new ZipForArrayWrapping cop" to "Add new ZipWithoutBlock cop".
Add new Performance Cop to check for patterns like `.map { |id| [id] }` or `.map { [_1] }` and replace them with `.zip` without a block.

This is a Performance Cop for the more efficient way to generate an Array of Arrays.

* Performs 40-90% faster than `.map` to iteratively wrap array contents.
* Performs 5-55% faster on ranges, depending on size.
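As a quick illustration of the patterns named in the commit message, the sketch below shows the three forms producing the same array of arrays. The method names are hypothetical, and the numbered-parameter form assumes Ruby 2.7 or newer.

```ruby
# Hypothetical method names, used only to label the three forms.
def wrap_with_map(collection)
  collection.map { |id| [id] } # explicit block parameter
end

def wrap_with_numblock(collection)
  collection.map { [_1] } # numbered block parameter (Ruby 2.7+)
end

def wrap_with_zip(collection)
  collection.zip # no arguments, no block
end

ids = [1, 2, 3]
p wrap_with_map(ids)      # [[1], [2], [3]]
p wrap_with_numblock(ids) # [[1], [2], [3]]
p wrap_with_zip(ids)      # [[1], [2], [3]]

# Ranges respond to zip as well, through Enumerable.
p wrap_with_zip(1..3)     # [[1], [2], [3]]
```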
Performance Cop for the more efficient way to generate an Array of Arrays.

- Performs 40-90% faster than `.map` to iteratively wrap array contents.
- Performs 5-55% faster on ranges than `.map`, depending on size.
- `.map!` `each_with_object`
This optimization is particularly helpful in performance-sensitive paths or in large-scale applications that need to generate arrays of arrays at scale. For example, leveraging the bulk enqueuing feature in Sidekiq requires an array of arrays, and is by definition used at scale.
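A minimal sketch of the Sidekiq case, assuming a hypothetical worker named `HypotheticalWorker` that takes a single id argument. Sidekiq itself is not loaded here, so the enqueue call appears only in a comment. The block just builds the payload shape that bulk enqueuing expects, one inner array of arguments per job.

```ruby
# Hypothetical helper: builds one argument list per job.
def bulk_args(ids)
  ids.zip # [[101], [102], [103]] for the ids below
end

user_ids = [101, 102, 103]

# 'HypotheticalWorker' is an assumed name, not something from this PR.
payload = { 'class' => 'HypotheticalWorker', 'args' => bulk_args(user_ids) }
p payload['args'] # [[101], [102], [103]]

# With Sidekiq loaded and the worker defined, the payload could be enqueued with:
#   Sidekiq::Client.push_bulk(payload)
```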
A performance gain is always present for all sizes of arrays and ranges. The gain is smallest with small ranges, but still significant for small arrays.
This is not a style cop, but it is poetic that the more performant approach also has simpler and shorter syntax.
`.zip` has been intentionally optimized in Ruby. This has been discussed publicly since at least 2012:

Official `.zip` documentation:

Source code for `.zip`:

This performance cop isn't just an announcement to use `.zip` in the spirit of appreciating Ruby's great features, it is also a useful and necessary tool to leverage RuboCop to clean up and add rigor in large Ruby code bases. RuboCop is a much better approach than pattern matching for clean up at scale here, and it comes with the added benefit of proactive user feedback as additions are made to the code base going forward.

For Arrays with 1000 entries, a common size for bulk operations, this performs 70% faster. Here it is at 70% faster, using `benchmark-ips`:
Before submitting the PR make sure the following are checked:

- Commit message starts with `[Fix #issue-number]` (if the related issue exists).
- Feature branch is up-to-date with `master` (if not - rebase it).
- Ran `bundle exec rake default`. It executes all tests and runs RuboCop on its own code.
- Added an entry (file) to the changelog folder named `{change_type}_{change_description}.md` if the new code introduces user-observable changes. See changelog entry format for details.