[BugFix] Fix from_any tests #1110

Merged
merged 1 commit into from
Nov 25, 2024

@vmoens (Contributor) commented Nov 25, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 25, 2024
ghstack-source-id: 8c3b3d825555c727c7c18c7e8a87311f718a94b6
Pull Request resolved: #1110
@facebook-github-bot added the CLA Signed label Nov 25, 2024
@vmoens merged commit 9785ea6 into gh/vmoens/37/base Nov 25, 2024
10 of 24 checks passed
vmoens added a commit that referenced this pull request Nov 25, 2024
ghstack-source-id: 8c3b3d825555c727c7c18c7e8a87311f718a94b6
Pull Request resolved: #1110
@vmoens deleted the gh/vmoens/37/head branch November 25, 2024 11:32

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}25$. Worsened: $\large\color{#d91a1a}20$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 36.4210μs 11.0589μs 90.4249 KOps/s 98.3810 KOps/s $\textbf{\color{#d91a1a}-8.09\%}$
test_plain_set_stack_nested 43.1000μs 11.1044μs 90.0545 KOps/s 97.5404 KOps/s $\textbf{\color{#d91a1a}-7.67\%}$
test_plain_set_nested_inplace 44.6010μs 11.9431μs 83.7301 KOps/s 89.9632 KOps/s $\textbf{\color{#d91a1a}-6.93\%}$
test_plain_set_stack_nested_inplace 34.9610μs 11.9830μs 83.4515 KOps/s 90.1053 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_items 34.7400μs 2.9159μs 342.9496 KOps/s 339.1431 KOps/s $\color{#35bf28}+1.12\%$
test_items_nested 0.3585ms 0.3202ms 3.1227 KOps/s 3.1556 KOps/s $\color{#d91a1a}-1.04\%$
test_items_nested_locked 0.4733ms 0.3215ms 3.1102 KOps/s 3.1345 KOps/s $\color{#d91a1a}-0.77\%$
test_items_nested_leaf 0.2271ms 58.3570μs 17.1359 KOps/s 17.5140 KOps/s $\color{#d91a1a}-2.16\%$
test_items_stack_nested 0.4917ms 0.3212ms 3.1135 KOps/s 3.1353 KOps/s $\color{#d91a1a}-0.69\%$
test_items_stack_nested_leaf 96.9520μs 59.7616μs 16.7332 KOps/s 16.8703 KOps/s $\color{#d91a1a}-0.81\%$
test_items_stack_nested_locked 0.4841ms 0.3223ms 3.1024 KOps/s 3.1166 KOps/s $\color{#d91a1a}-0.46\%$
test_keys 25.6610μs 3.5024μs 285.5212 KOps/s 289.2653 KOps/s $\color{#d91a1a}-1.29\%$
test_keys_nested 0.1131ms 70.2717μs 14.2305 KOps/s 14.2471 KOps/s $\color{#d91a1a}-0.12\%$
test_keys_nested_locked 0.7620ms 74.8078μs 13.3676 KOps/s 13.1183 KOps/s $\color{#35bf28}+1.90\%$
test_keys_nested_leaf 0.1076ms 60.9163μs 16.4160 KOps/s 16.3291 KOps/s $\color{#35bf28}+0.53\%$
test_keys_stack_nested 0.1368ms 70.5068μs 14.1830 KOps/s 14.0096 KOps/s $\color{#35bf28}+1.24\%$
test_keys_stack_nested_leaf 0.1046ms 61.7519μs 16.1938 KOps/s 15.9442 KOps/s $\color{#35bf28}+1.57\%$
test_keys_stack_nested_locked 0.1574ms 76.1050μs 13.1397 KOps/s 13.0438 KOps/s $\color{#35bf28}+0.74\%$
test_values 4.6335μs 0.8471μs 1.1805 MOps/s 1.1889 MOps/s $\color{#d91a1a}-0.71\%$
test_values_nested 60.4310μs 31.3397μs 31.9084 KOps/s 32.1252 KOps/s $\color{#d91a1a}-0.67\%$
test_values_nested_locked 76.7420μs 32.9211μs 30.3757 KOps/s 30.6925 KOps/s $\color{#d91a1a}-1.03\%$
test_values_nested_leaf 75.5110μs 33.9237μs 29.4779 KOps/s 29.7910 KOps/s $\color{#d91a1a}-1.05\%$
test_values_stack_nested 89.0020μs 31.9637μs 31.2855 KOps/s 31.3745 KOps/s $\color{#d91a1a}-0.28\%$
test_values_stack_nested_leaf 75.5910μs 34.6821μs 28.8333 KOps/s 29.3732 KOps/s $\color{#d91a1a}-1.84\%$
test_values_stack_nested_locked 77.6320μs 33.3294μs 30.0036 KOps/s 30.1019 KOps/s $\color{#d91a1a}-0.33\%$
test_membership 2.2215μs 0.5082μs 1.9677 MOps/s 1.9431 MOps/s $\color{#35bf28}+1.27\%$
test_membership_nested 16.2100μs 1.8677μs 535.4208 KOps/s 522.3608 KOps/s $\color{#35bf28}+2.50\%$
test_membership_nested_leaf 16.4100μs 1.8972μs 527.0916 KOps/s 521.9056 KOps/s $\color{#35bf28}+0.99\%$
test_membership_stacked_nested 31.7010μs 1.9344μs 516.9688 KOps/s 493.2719 KOps/s $\color{#35bf28}+4.80\%$
test_membership_stacked_nested_leaf 27.3110μs 1.9422μs 514.8888 KOps/s 494.8936 KOps/s $\color{#35bf28}+4.04\%$
test_membership_nested_last 27.8910μs 2.7902μs 358.3910 KOps/s 351.4160 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested_leaf_last 40.8510μs 2.8254μs 353.9296 KOps/s 353.3293 KOps/s $\color{#35bf28}+0.17\%$
test_membership_stacked_nested_last 38.7610μs 3.4571μs 289.2580 KOps/s 303.1986 KOps/s $\color{#d91a1a}-4.60\%$
test_membership_stacked_nested_leaf_last 27.9500μs 3.4665μs 288.4767 KOps/s 307.8611 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_nested_getleaf 29.5610μs 5.9745μs 167.3792 KOps/s 165.7868 KOps/s $\color{#35bf28}+0.96\%$
test_nested_get 33.9710μs 5.7014μs 175.3970 KOps/s 175.2176 KOps/s $\color{#35bf28}+0.10\%$
test_stacked_getleaf 29.4000μs 5.9627μs 167.7079 KOps/s 166.6742 KOps/s $\color{#35bf28}+0.62\%$
test_stacked_get 35.3500μs 5.6936μs 175.6350 KOps/s 174.7125 KOps/s $\color{#35bf28}+0.53\%$
test_nested_getitemleaf 28.6010μs 6.1148μs 163.5382 KOps/s 163.1931 KOps/s $\color{#35bf28}+0.21\%$
test_nested_getitem 38.0510μs 5.7614μs 173.5684 KOps/s 171.1683 KOps/s $\color{#35bf28}+1.40\%$
test_stacked_getitemleaf 1.6968ms 6.0781μs 164.5238 KOps/s 163.5156 KOps/s $\color{#35bf28}+0.62\%$
test_stacked_getitem 35.8810μs 5.7182μs 174.8790 KOps/s 172.1120 KOps/s $\color{#35bf28}+1.61\%$
test_lock_nested 9.7926ms 0.3750ms 2.6670 KOps/s 2.6560 KOps/s $\color{#35bf28}+0.41\%$
test_lock_stack_nested 0.3801ms 0.3308ms 3.0231 KOps/s 2.9384 KOps/s $\color{#35bf28}+2.88\%$
test_unlock_nested 0.7476ms 0.3052ms 3.2769 KOps/s 3.2063 KOps/s $\color{#35bf28}+2.20\%$
test_unlock_stack_nested 0.3371ms 0.2713ms 3.6859 KOps/s 3.5751 KOps/s $\color{#35bf28}+3.10\%$
test_flatten_speed 0.1131ms 72.4910μs 13.7948 KOps/s 13.7521 KOps/s $\color{#35bf28}+0.31\%$
test_unflatten_speed 0.3846ms 0.2905ms 3.4425 KOps/s 3.4358 KOps/s $\color{#35bf28}+0.20\%$
test_common_ops 1.7193ms 0.5961ms 1.6777 KOps/s 1.6914 KOps/s $\color{#d91a1a}-0.81\%$
test_creation 0.1116ms 1.4866μs 672.6575 KOps/s 672.5991 KOps/s $+0.01\%$
test_creation_empty 39.3010μs 8.4921μs 117.7566 KOps/s 150.3390 KOps/s $\textbf{\color{#d91a1a}-21.67\%}$
test_creation_nested_1 37.0210μs 9.8116μs 101.9205 KOps/s 121.9128 KOps/s $\textbf{\color{#d91a1a}-16.40\%}$
test_creation_nested_2 37.3310μs 12.3423μs 81.0219 KOps/s 93.8752 KOps/s $\textbf{\color{#d91a1a}-13.69\%}$
test_clone 86.0620μs 10.1340μs 98.6778 KOps/s 87.3291 KOps/s $\textbf{\color{#35bf28}+13.00\%}$
test_getitem[int] 1.9169ms 10.6172μs 94.1869 KOps/s 90.0438 KOps/s $\color{#35bf28}+4.60\%$
test_getitem[slice_int] 0.1104ms 20.1461μs 49.6375 KOps/s 45.7568 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_getitem[range] 0.1504ms 37.6416μs 26.5663 KOps/s 24.6275 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_getitem[tuple] 0.1159ms 17.9667μs 55.6587 KOps/s 51.9004 KOps/s $\textbf{\color{#35bf28}+7.24\%}$
test_getitem[list] 0.2492ms 33.2617μs 30.0646 KOps/s 28.2602 KOps/s $\textbf{\color{#35bf28}+6.38\%}$
test_setitem_dim[int] 39.8310μs 18.4554μs 54.1846 KOps/s 49.7568 KOps/s $\textbf{\color{#35bf28}+8.90\%}$
test_setitem_dim[slice_int] 66.7910μs 36.7351μs 27.2219 KOps/s 25.4563 KOps/s $\textbf{\color{#35bf28}+6.94\%}$
test_setitem_dim[range] 0.1007ms 55.1475μs 18.1332 KOps/s 17.8795 KOps/s $\color{#35bf28}+1.42\%$
test_setitem_dim[tuple] 69.6920μs 32.4681μs 30.7994 KOps/s 29.6007 KOps/s $\color{#35bf28}+4.05\%$
test_setitem 80.0420μs 14.9572μs 66.8573 KOps/s 63.5501 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_set 91.9620μs 14.4787μs 69.0672 KOps/s 66.9717 KOps/s $\color{#35bf28}+3.13\%$
test_set_shared 1.5441ms 0.1456ms 6.8668 KOps/s 6.7302 KOps/s $\color{#35bf28}+2.03\%$
test_update 0.4921ms 17.5237μs 57.0657 KOps/s 58.3408 KOps/s $\color{#d91a1a}-2.19\%$
test_update_nested 83.0910μs 21.6607μs 46.1665 KOps/s 44.5324 KOps/s $\color{#35bf28}+3.67\%$
test_update__nested 59.8110μs 24.5024μs 40.8123 KOps/s 37.9188 KOps/s $\textbf{\color{#35bf28}+7.63\%}$
test_set_nested 0.1306ms 15.8241μs 63.1946 KOps/s 61.0271 KOps/s $\color{#35bf28}+3.55\%$
test_set_nested_new 87.1720μs 17.8985μs 55.8706 KOps/s 53.1669 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_select 0.1021ms 29.3704μs 34.0479 KOps/s 32.3695 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_select_nested 70.3510μs 41.2333μs 24.2522 KOps/s 23.9299 KOps/s $\color{#35bf28}+1.35\%$
test_exclude_nested 87.6020μs 58.7014μs 17.0354 KOps/s 16.5927 KOps/s $\color{#35bf28}+2.67\%$
test_empty[True] 0.3188ms 0.2518ms 3.9708 KOps/s 3.8830 KOps/s $\color{#35bf28}+2.26\%$
test_empty[False] 3.7021μs 0.7473μs 1.3381 MOps/s 1.3340 MOps/s $\color{#35bf28}+0.30\%$
test_to 85.5220μs 54.6026μs 18.3142 KOps/s 18.0330 KOps/s $\color{#35bf28}+1.56\%$
test_to_nonblocking 1.0213ms 48.6027μs 20.5750 KOps/s 21.2671 KOps/s $\color{#d91a1a}-3.25\%$
test_unbind_speed 0.2933ms 0.2264ms 4.4166 KOps/s 4.1882 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_unbind_speed_stack0 0.2697ms 0.2301ms 4.3467 KOps/s 4.1713 KOps/s $\color{#35bf28}+4.20\%$
test_unbind_speed_stack1 93.4141ms 0.6439ms 1.5531 KOps/s 1.4943 KOps/s $\color{#35bf28}+3.94\%$
test_split 97.5681ms 1.5687ms 637.4561 Ops/s 607.3620 Ops/s $\color{#35bf28}+4.95\%$
test_chunk 96.8282ms 1.5684ms 637.5737 Ops/s 673.0411 Ops/s $\textbf{\color{#d91a1a}-5.27\%}$
test_consolidate[False-None] 99.3597ms 2.8593ms 349.7372 Ops/s 376.6804 Ops/s $\textbf{\color{#d91a1a}-7.15\%}$
test_consolidate[default-None] 1.8233ms 1.6792ms 595.5158 Ops/s 569.9585 Ops/s $\color{#35bf28}+4.48\%$
test_consolidate[reduce-overhead-None] 1.7910ms 1.7167ms 582.5289 Ops/s 561.3328 Ops/s $\color{#35bf28}+3.78\%$
test_consolidate_njt[False-None] 6.9558ms 6.6244ms 150.9561 Ops/s 151.6716 Ops/s $\color{#d91a1a}-0.47\%$
test_to[False-False-None] 1.7730ms 1.6953ms 589.8774 Ops/s 584.2844 Ops/s $\color{#35bf28}+0.96\%$
test_to[True-False-None] 1.4949ms 1.2993ms 769.6217 Ops/s 751.2779 Ops/s $\color{#35bf28}+2.44\%$
test_to[within-False-None] 4.1517ms 4.0110ms 249.3156 Ops/s 246.0468 Ops/s $\color{#35bf28}+1.33\%$
test_to[True-default-None] 5.2775ms 5.1375ms 194.6474 Ops/s 188.4691 Ops/s $\color{#35bf28}+3.28\%$
test_to_njt[False-False-None] 7.3840ms 7.0986ms 140.8719 Ops/s 142.6861 Ops/s $\color{#d91a1a}-1.27\%$
test_to_njt[True-False-None] 5.7256ms 5.5812ms 179.1733 Ops/s 181.9509 Ops/s $\color{#d91a1a}-1.53\%$
test_to_njt[within-False-None] 12.4739ms 12.2348ms 81.7339 Ops/s 80.8266 Ops/s $\color{#35bf28}+1.12\%$
test_creation[device0] 0.4647ms 79.6305μs 12.5580 KOps/s 11.9917 KOps/s $\color{#35bf28}+4.72\%$
test_creation_from_tensor 0.5288ms 84.1751μs 11.8800 KOps/s 11.7495 KOps/s $\color{#35bf28}+1.11\%$
test_add_one[memmap_tensor0] 0.4352ms 6.8660μs 145.6445 KOps/s 137.2096 KOps/s $\textbf{\color{#35bf28}+6.15\%}$
test_contiguous[memmap_tensor0] 2.0870μs 0.4136μs 2.4177 MOps/s 2.4425 MOps/s $\color{#d91a1a}-1.01\%$
test_stack[memmap_tensor0] 34.1010μs 4.5713μs 218.7565 KOps/s 209.9683 KOps/s $\color{#35bf28}+4.19\%$
test_memmaptd_index 1.9013ms 0.2494ms 4.0091 KOps/s 3.7557 KOps/s $\textbf{\color{#35bf28}+6.75\%}$
test_memmaptd_index_astensor 0.9480ms 0.3066ms 3.2618 KOps/s 3.1087 KOps/s $\color{#35bf28}+4.92\%$
test_memmaptd_index_op 1.0132ms 0.5985ms 1.6709 KOps/s 1.6615 KOps/s $\color{#35bf28}+0.56\%$
test_serialize_model 0.1310s 0.1303s 7.6771 Ops/s 7.6491 Ops/s $\color{#35bf28}+0.37\%$
test_serialize_model_pickle 1.3482s 1.2136s 0.8240 Ops/s 0.8429 Ops/s $\color{#d91a1a}-2.24\%$
test_serialize_weights 0.1313s 0.1295s 7.7196 Ops/s 7.7079 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_weights_returnearly 0.3459s 64.1190ms 15.5960 Ops/s 12.9368 Ops/s $\textbf{\color{#35bf28}+20.55\%}$
test_serialize_weights_pickle 1.3567s 1.2172s 0.8215 Ops/s 0.8231 Ops/s $\color{#d91a1a}-0.19\%$
test_reshape_pytree 79.6110μs 22.4971μs 44.4502 KOps/s 43.8858 KOps/s $\color{#35bf28}+1.29\%$
test_reshape_td 0.1025ms 27.3716μs 36.5342 KOps/s 36.7828 KOps/s $\color{#d91a1a}-0.68\%$
test_view_pytree 58.4310μs 22.4666μs 44.5105 KOps/s 44.2711 KOps/s $\color{#35bf28}+0.54\%$
test_view_td 0.1392ms 29.0438μs 34.4308 KOps/s 32.8095 KOps/s $\color{#35bf28}+4.94\%$
test_unbind_pytree 68.9210μs 28.1372μs 35.5401 KOps/s 34.7360 KOps/s $\color{#35bf28}+2.31\%$
test_unbind_td 0.5493ms 35.2337μs 28.3819 KOps/s 27.5304 KOps/s $\color{#35bf28}+3.09\%$
test_split_pytree 70.3410μs 30.1128μs 33.2084 KOps/s 31.9192 KOps/s $\color{#35bf28}+4.04\%$
test_split_td 0.6463ms 37.8024μs 26.4533 KOps/s 24.8305 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_add_pytree 67.2720μs 34.0816μs 29.3414 KOps/s 27.7269 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_add_td 0.1016ms 48.7327μs 20.5201 KOps/s 20.7870 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_add_one_nested[tensordict-compile] 0.2738ms 0.1208ms 8.2774 KOps/s 8.0169 KOps/s $\color{#35bf28}+3.25\%$
test_compile_add_one_nested[tensordict-eager] 0.2709ms 0.1243ms 8.0433 KOps/s 7.8818 KOps/s $\color{#35bf28}+2.05\%$
test_compile_add_one_nested[pytree-compile] 0.1513ms 97.7086μs 10.2345 KOps/s 10.0072 KOps/s $\color{#35bf28}+2.27\%$
test_compile_add_one_nested[pytree-eager] 1.2369ms 0.1488ms 6.7217 KOps/s 6.4610 KOps/s $\color{#35bf28}+4.04\%$
test_compile_copy_nested[tensordict-compile] 93.3010μs 22.1555μs 45.1356 KOps/s 39.0146 KOps/s $\textbf{\color{#35bf28}+15.69\%}$
test_compile_copy_nested[tensordict-eager] 51.3510μs 26.7178μs 37.4282 KOps/s 36.4540 KOps/s $\color{#35bf28}+2.67\%$
test_compile_copy_nested[pytree-compile] 0.3108ms 63.9088μs 15.6473 KOps/s 15.2506 KOps/s $\color{#35bf28}+2.60\%$
test_compile_copy_nested[pytree-eager] 73.9410μs 49.1243μs 20.3565 KOps/s 20.0761 KOps/s $\color{#35bf28}+1.40\%$
test_compile_add_one_flat[tensordict-compile] 0.1826ms 0.1436ms 6.9623 KOps/s 6.9732 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_one_flat[tensordict-eager] 0.3146ms 0.2095ms 4.7739 KOps/s 4.8206 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_add_one_flat[tensorclass-compile] 0.1438ms 98.6887μs 10.1329 KOps/s 10.1224 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[tensorclass-eager] 0.2247ms 53.8606μs 18.5664 KOps/s 19.1094 KOps/s $\color{#d91a1a}-2.84\%$
test_compile_add_one_flat[pytree-compile] 0.1830ms 0.1369ms 7.3044 KOps/s 7.2385 KOps/s $\color{#35bf28}+0.91\%$
test_compile_add_one_flat[pytree-eager] 0.6727ms 0.4805ms 2.0814 KOps/s 1.9957 KOps/s $\color{#35bf28}+4.29\%$
test_compile_add_self_flat[tensordict-eager] 0.3641ms 0.2500ms 3.9994 KOps/s 4.0376 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_add_self_flat[tensordict-compile] 0.1828ms 0.1442ms 6.9343 KOps/s 6.9191 KOps/s $\color{#35bf28}+0.22\%$
test_compile_add_self_flat[tensorclass-eager] 0.1916ms 62.1008μs 16.1028 KOps/s 16.0494 KOps/s $\color{#35bf28}+0.33\%$
test_compile_add_self_flat[tensorclass-compile] 0.1558ms 98.2280μs 10.1804 KOps/s 10.0558 KOps/s $\color{#35bf28}+1.24\%$
test_compile_add_self_flat[pytree-eager] 0.5673ms 0.4059ms 2.4636 KOps/s 2.4114 KOps/s $\color{#35bf28}+2.16\%$
test_compile_add_self_flat[pytree-compile] 0.3021ms 0.1393ms 7.1801 KOps/s 7.3423 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_copy_flat[tensordict-compile] 0.1814ms 19.8789μs 50.3046 KOps/s 55.1486 KOps/s $\textbf{\color{#d91a1a}-8.78\%}$
test_compile_copy_flat[tensordict-eager] 55.8310μs 27.1339μs 36.8543 KOps/s 35.4021 KOps/s $\color{#35bf28}+4.10\%$
test_compile_copy_flat[pytree-compile] 0.1002ms 69.4624μs 14.3963 KOps/s 14.2889 KOps/s $\color{#35bf28}+0.75\%$
test_compile_copy_flat[pytree-eager] 0.1916ms 51.3413μs 19.4775 KOps/s 19.3794 KOps/s $\color{#35bf28}+0.51\%$
test_compile_assign_and_add[tensordict-compile] 1.6493ms 0.3948ms 2.5330 KOps/s 2.1776 KOps/s $\textbf{\color{#35bf28}+16.32\%}$
test_compile_assign_and_add[tensordict-eager] 2.8120ms 2.5893ms 386.2046 Ops/s 379.6340 Ops/s $\color{#35bf28}+1.73\%$
test_compile_assign_and_add[pytree-compile] 1.6366ms 0.4409ms 2.2682 KOps/s 2.0595 KOps/s $\textbf{\color{#35bf28}+10.13\%}$
test_compile_assign_and_add[pytree-eager] 2.8726ms 2.6145ms 382.4792 Ops/s 357.6060 Ops/s $\textbf{\color{#35bf28}+6.96\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1743ms 0.1199ms 8.3370 KOps/s 8.1085 KOps/s $\color{#35bf28}+2.82\%$
test_compile_indexing[tensor-tensordict-eager] 0.5583ms 83.8810μs 11.9216 KOps/s 11.8268 KOps/s $\color{#35bf28}+0.80\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1754ms 0.1132ms 8.8342 KOps/s 8.8053 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2363ms 69.3100μs 14.4279 KOps/s 13.4850 KOps/s $\textbf{\color{#35bf28}+6.99\%}$
test_compile_indexing[tensor-pytree-compile] 0.2777ms 0.1155ms 8.6550 KOps/s 8.7012 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_indexing[tensor-pytree-eager] 0.1615ms 72.1139μs 13.8669 KOps/s 13.4524 KOps/s $\color{#35bf28}+3.08\%$
test_compile_indexing[slice-tensordict-compile] 0.2644ms 0.1065ms 9.3874 KOps/s 9.7000 KOps/s $\color{#d91a1a}-3.22\%$
test_compile_indexing[slice-tensordict-eager] 0.1422ms 16.7414μs 59.7323 KOps/s 54.8537 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2345ms 96.9672μs 10.3128 KOps/s 10.1915 KOps/s $\color{#35bf28}+1.19\%$
test_compile_indexing[slice-tensorclass-eager] 59.6010μs 16.3387μs 61.2045 KOps/s 60.5862 KOps/s $\color{#35bf28}+1.02\%$
test_compile_indexing[slice-pytree-compile] 0.1537ms 0.1026ms 9.7501 KOps/s 9.5964 KOps/s $\color{#35bf28}+1.60\%$
test_compile_indexing[slice-pytree-eager] 82.4920μs 16.1718μs 61.8359 KOps/s 60.7995 KOps/s $\color{#35bf28}+1.70\%$
test_compile_indexing[int-tensordict-compile] 0.2557ms 0.1069ms 9.3562 KOps/s 9.3351 KOps/s $\color{#35bf28}+0.23\%$
test_compile_indexing[int-tensordict-eager] 0.6471ms 17.1059μs 58.4594 KOps/s 55.9498 KOps/s $\color{#35bf28}+4.49\%$
test_compile_indexing[int-tensorclass-compile] 0.2893ms 0.1025ms 9.7603 KOps/s 10.0836 KOps/s $\color{#d91a1a}-3.21\%$
test_compile_indexing[int-tensorclass-eager] 74.9620μs 16.5074μs 60.5790 KOps/s 61.3933 KOps/s $\color{#d91a1a}-1.33\%$
test_compile_indexing[int-pytree-compile] 0.1512ms 0.1027ms 9.7341 KOps/s 10.0616 KOps/s $\color{#d91a1a}-3.25\%$
test_compile_indexing[int-pytree-eager] 54.0410μs 16.5317μs 60.4900 KOps/s 61.1437 KOps/s $\color{#d91a1a}-1.07\%$
test_mod_add[eager] 0.1676ms 37.2529μs 26.8435 KOps/s 28.7772 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_mod_add[compile] 0.1883ms 81.4709μs 12.2743 KOps/s 12.1909 KOps/s $\color{#35bf28}+0.68\%$
test_mod_add[compile-overhead] 0.3295ms 0.1690ms 5.9184 KOps/s 5.4640 KOps/s $\textbf{\color{#35bf28}+8.32\%}$
test_mod_wrap[eager] 0.4350ms 0.2626ms 3.8083 KOps/s 3.8572 KOps/s $\color{#d91a1a}-1.27\%$
test_mod_wrap[compile] 0.4364ms 0.2864ms 3.4920 KOps/s 3.4199 KOps/s $\color{#35bf28}+2.11\%$
test_mod_wrap[compile-overhead] 7.0983ms 3.7214ms 268.7182 Ops/s 266.7329 Ops/s $\color{#35bf28}+0.74\%$
test_mod_wrap_and_backward[eager] 1.5871ms 1.4548ms 687.3783 Ops/s 672.3184 Ops/s $\color{#35bf28}+2.24\%$
test_mod_wrap_and_backward[compile] 1.5410ms 1.3700ms 729.9263 Ops/s 715.4091 Ops/s $\color{#35bf28}+2.03\%$
test_mod_wrap_and_backward[compile-overhead] 1.5418ms 1.0506ms 951.8131 Ops/s 966.0471 Ops/s $\color{#d91a1a}-1.47\%$
test_seq_add[eager] 0.1713ms 0.1018ms 9.8221 KOps/s 9.5276 KOps/s $\color{#35bf28}+3.09\%$
test_seq_add[compile] 0.2333ms 87.8537μs 11.3826 KOps/s 11.0842 KOps/s $\color{#35bf28}+2.69\%$
test_seq_add[compile-overhead] 0.1722ms 0.1296ms 7.7169 KOps/s 7.5978 KOps/s $\color{#35bf28}+1.57\%$
test_seq_wrap[eager] 0.5115ms 0.3908ms 2.5589 KOps/s 2.4451 KOps/s $\color{#35bf28}+4.66\%$
test_seq_wrap[compile] 0.4249ms 0.3018ms 3.3130 KOps/s 3.2080 KOps/s $\color{#35bf28}+3.27\%$
test_seq_wrap[compile-overhead] 0.3068ms 0.2257ms 4.4306 KOps/s 4.3524 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_runtime[False-eager] 0.8859ms 0.7488ms 1.3355 KOps/s 1.3140 KOps/s $\color{#35bf28}+1.64\%$
test_func_call_runtime[False-compile] 0.8969ms 0.7513ms 1.3310 KOps/s 1.2947 KOps/s $\color{#35bf28}+2.80\%$
test_func_call_runtime[False-compile-overhead] 0.4230ms 0.3667ms 2.7271 KOps/s 2.7116 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_runtime[True-eager] 0.9848ms 0.9046ms 1.1055 KOps/s 1.0734 KOps/s $\color{#35bf28}+2.99\%$
test_func_call_runtime[True-compile] 0.9278ms 0.7695ms 1.2995 KOps/s 1.2660 KOps/s $\color{#35bf28}+2.64\%$
test_func_call_runtime[True-compile-overhead] 0.5294ms 0.3875ms 2.5804 KOps/s 2.5707 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_cm_runtime[False-eager] 0.8742ms 0.7473ms 1.3381 KOps/s 1.2995 KOps/s $\color{#35bf28}+2.97\%$
test_func_call_cm_runtime[False-compile] 1.1149ms 0.7505ms 1.3325 KOps/s 1.2824 KOps/s $\color{#35bf28}+3.91\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4158ms 0.3677ms 2.7199 KOps/s 2.6823 KOps/s $\color{#35bf28}+1.40\%$
test_func_call_cm_runtime[True-eager] 1.1199ms 1.0030ms 996.9964 Ops/s 968.1539 Ops/s $\color{#35bf28}+2.98\%$
test_func_call_cm_runtime[True-compile] 0.9452ms 0.7996ms 1.2506 KOps/s 1.2218 KOps/s $\color{#35bf28}+2.36\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4844ms 0.4123ms 2.4255 KOps/s 2.3878 KOps/s $\color{#35bf28}+1.58\%$
test_vmap_func_call_cm_runtime[eager] 2.5586ms 2.0916ms 478.1025 Ops/s 468.8242 Ops/s $\color{#35bf28}+1.98\%$
test_vmap_func_call_cm_runtime[compile] 0.9545ms 0.8163ms 1.2250 KOps/s 1.2014 KOps/s $\color{#35bf28}+1.97\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5822ms 0.4151ms 2.4090 KOps/s 2.3748 KOps/s $\color{#35bf28}+1.44\%$
test_distributed 2.5025ms 0.1866ms 5.3578 KOps/s 8.7204 KOps/s $\textbf{\color{#d91a1a}-38.56\%}$
test_tdmodule 0.3633ms 15.7744μs 63.3937 KOps/s 64.6439 KOps/s $\color{#d91a1a}-1.93\%$
test_tdmodule_dispatch 66.3610μs 30.5599μs 32.7226 KOps/s 34.4983 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_tdseq 27.2500μs 15.7975μs 63.3011 KOps/s 68.3123 KOps/s $\textbf{\color{#d91a1a}-7.34\%}$
test_tdseq_dispatch 56.3210μs 34.8137μs 28.7243 KOps/s 32.4583 KOps/s $\textbf{\color{#d91a1a}-11.50\%}$
test_instantiation_functorch 1.6765ms 1.5471ms 646.3533 Ops/s 641.7799 Ops/s $\color{#35bf28}+0.71\%$
test_exec_functorch 0.1874ms 0.1444ms 6.9248 KOps/s 6.6440 KOps/s $\color{#35bf28}+4.23\%$
test_exec_functional_call 0.2106ms 0.1394ms 7.1742 KOps/s 6.8998 KOps/s $\color{#35bf28}+3.98\%$
test_exec_td_decorator 0.3631ms 0.1838ms 5.4412 KOps/s 5.2854 KOps/s $\color{#35bf28}+2.95\%$
test_vmap_mlp_speed_decorator[True-True] 0.8272ms 0.6812ms 1.4680 KOps/s 1.4529 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[True-False] 0.8224ms 0.6783ms 1.4742 KOps/s 1.4571 KOps/s $\color{#35bf28}+1.17\%$
test_vmap_mlp_speed_decorator[False-True] 0.7798ms 0.5975ms 1.6737 KOps/s 1.6628 KOps/s $\color{#35bf28}+0.66\%$
test_vmap_mlp_speed_decorator[False-False] 0.6978ms 0.5974ms 1.6739 KOps/s 1.6617 KOps/s $\color{#35bf28}+0.73\%$
test_vmap_transformer_speed_decorator[True-True] 19.5650ms 19.3168ms 51.7683 Ops/s 51.3479 Ops/s $\color{#35bf28}+0.82\%$
test_vmap_transformer_speed_decorator[True-False] 20.1133ms 19.3609ms 51.6504 Ops/s 51.2744 Ops/s $\color{#35bf28}+0.73\%$
test_vmap_transformer_speed_decorator[False-True] 19.3882ms 19.2039ms 52.0728 Ops/s 51.9399 Ops/s $\color{#35bf28}+0.26\%$
test_vmap_transformer_speed_decorator[False-False] 19.3247ms 19.1792ms 52.1398 Ops/s 51.6646 Ops/s $\color{#35bf28}+0.92\%$
test_to_module_speed[True] 1.0359ms 0.9215ms 1.0852 KOps/s 1.0573 KOps/s $\color{#35bf28}+2.64\%$
test_to_module_speed[False] 1.3416ms 0.9112ms 1.0974 KOps/s 1.0778 KOps/s $\color{#35bf28}+1.82\%$
test_tc_init 78.3420μs 36.0966μs 27.7035 KOps/s 28.4511 KOps/s $\color{#d91a1a}-2.63\%$
test_tc_init_nested 0.1622ms 71.7660μs 13.9342 KOps/s 13.6755 KOps/s $\color{#35bf28}+1.89\%$
test_tc_first_layer_tensor 5.4201μs 0.6976μs 1.4336 MOps/s 1.4256 MOps/s $\color{#35bf28}+0.56\%$
test_tc_first_layer_nontensor 27.9110μs 2.3091μs 433.0750 KOps/s 427.9320 KOps/s $\color{#35bf28}+1.20\%$
test_tc_second_layer_tensor 16.0603μs 1.4203μs 704.0885 KOps/s 684.0785 KOps/s $\color{#35bf28}+2.93\%$
test_tc_second_layer_nontensor 37.8210μs 3.0375μs 329.2197 KOps/s 323.6031 KOps/s $\color{#35bf28}+1.74\%$
test_unbind 0.2225s 10.0217ms 99.7838 Ops/s 151.1347 Ops/s $\textbf{\color{#d91a1a}-33.98\%}$
test_full_like 11.3823ms 9.7256ms 102.8211 Ops/s 100.8932 Ops/s $\color{#35bf28}+1.91\%$
test_zeros_like 5.9617ms 4.4482ms 224.8106 Ops/s 137.5098 Ops/s $\textbf{\color{#35bf28}+63.49\%}$
test_ones_like 9.4556ms 7.2741ms 137.4748 Ops/s 226.5151 Ops/s $\textbf{\color{#d91a1a}-39.31\%}$
test_clone 12.5988ms 9.6070ms 104.0910 Ops/s 145.7545 Ops/s $\textbf{\color{#d91a1a}-28.58\%}$
test_squeeze 55.5810μs 9.5865μs 104.3136 KOps/s 105.5929 KOps/s $\color{#d91a1a}-1.21\%$
test_unsqueeze 0.1433ms 72.2751μs 13.8360 KOps/s 14.0371 KOps/s $\color{#d91a1a}-1.43\%$
test_split 0.3731ms 0.1610ms 6.2118 KOps/s 6.2919 KOps/s $\color{#d91a1a}-1.27\%$
test_permute 0.2537ms 0.1893ms 5.2814 KOps/s 5.5390 KOps/s $\color{#d91a1a}-4.65\%$
test_stack 53.4801ms 52.0921ms 19.1968 Ops/s 19.2657 Ops/s $\color{#d91a1a}-0.36\%$
test_cat 53.0172ms 51.9525ms 19.2484 Ops/s 22.9318 Ops/s $\textbf{\color{#d91a1a}-16.06\%}$
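
For readers checking the table, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for one row. The function names are hypothetical, and the 5% cutoff for bold entries is an assumption inferred from which rows are bold, not something the benchmark bot states. Both throughput values in a row must share the same unit (for example, both KOps/s).

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_highlighted(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: rows are bold when |change| is at least the threshold."""
    return abs(change_pct) >= threshold_pct


# Values taken from the test_plain_set_nested row (both in KOps/s).
change = relative_change(90.4249, 98.3810)
print(f"{change:+.2f}%  highlighted={is_highlighted(change)}")
```

Under this reading, a negative value means the PR run was slower than HEAD for that benchmark.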
