[Refactor] Better compile checks #1139

Merged: 1 commit, Dec 16, 2024

vmoens (Contributor) commented Dec 15, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: c6a8d4587df45e374f0d6cb59fe1c982c7818276
Pull Request resolved: #1139
@facebook-github-bot added the CLA Signed label Dec 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}69$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1601ms 21.2751μs 47.0032 KOps/s 60.4768 KOps/s $\textbf{\color{#d91a1a}-22.28\%}$
test_plain_set_stack_nested 61.6640μs 21.4202μs 46.6849 KOps/s 59.2392 KOps/s $\textbf{\color{#d91a1a}-21.19\%}$
test_plain_set_nested_inplace 62.8570μs 23.1178μs 43.2567 KOps/s 53.3940 KOps/s $\textbf{\color{#d91a1a}-18.99\%}$
test_plain_set_stack_nested_inplace 67.1550μs 23.0113μs 43.4569 KOps/s 53.4481 KOps/s $\textbf{\color{#d91a1a}-18.69\%}$
test_items 45.3040μs 4.1854μs 238.9266 KOps/s 241.6223 KOps/s $\color{#d91a1a}-1.12\%$
test_items_nested 0.6948ms 0.4389ms 2.2786 KOps/s 2.5095 KOps/s $\textbf{\color{#d91a1a}-9.20\%}$
test_items_nested_locked 0.9044ms 0.4378ms 2.2842 KOps/s 2.5117 KOps/s $\textbf{\color{#d91a1a}-9.06\%}$
test_items_nested_leaf 0.1503ms 77.2468μs 12.9455 KOps/s 13.8968 KOps/s $\textbf{\color{#d91a1a}-6.85\%}$
test_items_stack_nested 0.9527ms 0.4455ms 2.2447 KOps/s 2.4887 KOps/s $\textbf{\color{#d91a1a}-9.80\%}$
test_items_stack_nested_leaf 0.1709ms 81.0893μs 12.3321 KOps/s 13.4219 KOps/s $\textbf{\color{#d91a1a}-8.12\%}$
test_items_stack_nested_locked 0.6031ms 0.4411ms 2.2670 KOps/s 2.5049 KOps/s $\textbf{\color{#d91a1a}-9.50\%}$
test_keys 21.7400μs 3.5013μs 285.6063 KOps/s 266.7866 KOps/s $\textbf{\color{#35bf28}+7.05\%}$
test_keys_nested 0.3059ms 0.1682ms 5.9451 KOps/s 7.0685 KOps/s $\textbf{\color{#d91a1a}-15.89\%}$
test_keys_nested_locked 0.5574ms 0.1745ms 5.7322 KOps/s 7.0806 KOps/s $\textbf{\color{#d91a1a}-19.04\%}$
test_keys_nested_leaf 1.5829ms 0.1475ms 6.7798 KOps/s 8.6060 KOps/s $\textbf{\color{#d91a1a}-21.22\%}$
test_keys_stack_nested 0.2888ms 0.1656ms 6.0403 KOps/s 7.3496 KOps/s $\textbf{\color{#d91a1a}-17.81\%}$
test_keys_stack_nested_leaf 0.2687ms 0.1447ms 6.9091 KOps/s 8.6128 KOps/s $\textbf{\color{#d91a1a}-19.78\%}$
test_keys_stack_nested_locked 0.2845ms 0.1710ms 5.8496 KOps/s 7.1155 KOps/s $\textbf{\color{#d91a1a}-17.79\%}$
test_values 6.9468μs 1.0378μs 963.5858 KOps/s 953.4912 KOps/s $\color{#35bf28}+1.06\%$
test_values_nested 0.1033ms 63.0724μs 15.8548 KOps/s 18.0447 KOps/s $\textbf{\color{#d91a1a}-12.14\%}$
test_values_nested_locked 0.1035ms 62.6917μs 15.9511 KOps/s 18.1903 KOps/s $\textbf{\color{#d91a1a}-12.31\%}$
test_values_nested_leaf 0.1285ms 72.8137μs 13.7337 KOps/s 16.4306 KOps/s $\textbf{\color{#d91a1a}-16.41\%}$
test_values_stack_nested 0.1122ms 62.8224μs 15.9179 KOps/s 17.5159 KOps/s $\textbf{\color{#d91a1a}-9.12\%}$
test_values_stack_nested_leaf 0.1244ms 72.0903μs 13.8715 KOps/s 16.3166 KOps/s $\textbf{\color{#d91a1a}-14.99\%}$
test_values_stack_nested_locked 0.1453ms 63.4492μs 15.7606 KOps/s 17.5097 KOps/s $\textbf{\color{#d91a1a}-9.99\%}$
test_membership 20.7190μs 0.8835μs 1.1319 MOps/s 1.0505 MOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_membership_nested 38.5820μs 3.0259μs 330.4787 KOps/s 335.6853 KOps/s $\color{#d91a1a}-1.55\%$
test_membership_nested_leaf 27.5110μs 3.0531μs 327.5333 KOps/s 332.3051 KOps/s $\color{#d91a1a}-1.44\%$
test_membership_stacked_nested 33.6930μs 2.9964μs 333.7381 KOps/s 334.2799 KOps/s $\color{#d91a1a}-0.16\%$
test_membership_stacked_nested_leaf 38.6120μs 2.9870μs 334.7847 KOps/s 332.9141 KOps/s $\color{#35bf28}+0.56\%$
test_membership_nested_last 30.3870μs 4.5473μs 219.9127 KOps/s 221.9799 KOps/s $\color{#d91a1a}-0.93\%$
test_membership_nested_leaf_last 44.7940μs 4.5053μs 221.9622 KOps/s 215.9217 KOps/s $\color{#35bf28}+2.80\%$
test_membership_stacked_nested_last 65.6000μs 7.7436μs 129.1396 KOps/s 149.0092 KOps/s $\textbf{\color{#d91a1a}-13.33\%}$
test_membership_stacked_nested_leaf_last 45.7750μs 7.7249μs 129.4509 KOps/s 150.9018 KOps/s $\textbf{\color{#d91a1a}-14.22\%}$
test_nested_getleaf 32.7810μs 10.9051μs 91.6998 KOps/s 91.3735 KOps/s $\color{#35bf28}+0.36\%$
test_nested_get 47.8890μs 10.4375μs 95.8087 KOps/s 95.2783 KOps/s $\color{#35bf28}+0.56\%$
test_stacked_getleaf 52.4280μs 10.8358μs 92.2863 KOps/s 91.1768 KOps/s $\color{#35bf28}+1.22\%$
test_stacked_get 30.0460μs 10.4899μs 95.3298 KOps/s 95.8264 KOps/s $\color{#d91a1a}-0.52\%$
test_nested_getitemleaf 53.7600μs 11.4565μs 87.2867 KOps/s 87.0140 KOps/s $\color{#35bf28}+0.31\%$
test_nested_getitem 53.3390μs 10.5825μs 94.4955 KOps/s 92.2768 KOps/s $\color{#35bf28}+2.40\%$
test_stacked_getitemleaf 29.8450μs 11.2160μs 89.1583 KOps/s 87.1392 KOps/s $\color{#35bf28}+2.32\%$
test_stacked_getitem 50.7040μs 10.4045μs 96.1124 KOps/s 93.5930 KOps/s $\color{#35bf28}+2.69\%$
test_lock_nested 1.9882ms 0.4688ms 2.1330 KOps/s 2.2343 KOps/s $\color{#d91a1a}-4.54\%$
test_lock_stack_nested 0.6549ms 0.4296ms 2.3277 KOps/s 2.4022 KOps/s $\color{#d91a1a}-3.10\%$
test_unlock_nested 0.7754ms 0.3824ms 2.6148 KOps/s 2.7116 KOps/s $\color{#d91a1a}-3.57\%$
test_unlock_stack_nested 0.6658ms 0.3507ms 2.8511 KOps/s 2.9722 KOps/s $\color{#d91a1a}-4.08\%$
test_flatten_speed 0.1881ms 0.1012ms 9.8859 KOps/s 10.5085 KOps/s $\textbf{\color{#d91a1a}-5.92\%}$
test_unflatten_speed 0.7024ms 0.5371ms 1.8618 KOps/s 2.0306 KOps/s $\textbf{\color{#d91a1a}-8.31\%}$
test_common_ops 5.1693ms 0.8583ms 1.1651 KOps/s 1.3433 KOps/s $\textbf{\color{#d91a1a}-13.27\%}$
test_creation 17.4330μs 2.5474μs 392.5602 KOps/s 475.4615 KOps/s $\textbf{\color{#d91a1a}-17.44\%}$
test_creation_empty 40.8360μs 12.6333μs 79.1556 KOps/s 112.7130 KOps/s $\textbf{\color{#d91a1a}-29.77\%}$
test_creation_nested_1 1.3860ms 15.6810μs 63.7716 KOps/s 85.8118 KOps/s $\textbf{\color{#d91a1a}-25.68\%}$
test_creation_nested_2 44.0220μs 20.5870μs 48.5743 KOps/s 62.1526 KOps/s $\textbf{\color{#d91a1a}-21.85\%}$
test_clone 0.2292ms 13.4537μs 74.3291 KOps/s 73.6873 KOps/s $\color{#35bf28}+0.87\%$
test_getitem[int] 0.9355ms 12.9994μs 76.9264 KOps/s 78.1467 KOps/s $\color{#d91a1a}-1.56\%$
test_getitem[slice_int] 0.1404ms 25.7628μs 38.8157 KOps/s 39.8992 KOps/s $\color{#d91a1a}-2.72\%$
test_getitem[range] 0.1827ms 49.7482μs 20.1012 KOps/s 19.8761 KOps/s $\color{#35bf28}+1.13\%$
test_getitem[tuple] 0.1636ms 20.9669μs 47.6942 KOps/s 47.6216 KOps/s $\color{#35bf28}+0.15\%$
test_getitem[list] 0.1719ms 44.6124μs 22.4153 KOps/s 21.9638 KOps/s $\color{#35bf28}+2.06\%$
test_setitem_dim[int] 48.6210μs 25.6067μs 39.0523 KOps/s 38.5655 KOps/s $\color{#35bf28}+1.26\%$
test_setitem_dim[slice_int] 91.1000μs 52.8905μs 18.9070 KOps/s 18.9716 KOps/s $\color{#d91a1a}-0.34\%$
test_setitem_dim[range] 0.1321ms 73.2438μs 13.6530 KOps/s 13.1858 KOps/s $\color{#35bf28}+3.54\%$
test_setitem_dim[tuple] 77.9750μs 40.8121μs 24.5025 KOps/s 23.7973 KOps/s $\color{#35bf28}+2.96\%$
test_setitem 62.8770μs 20.7951μs 48.0883 KOps/s 52.1707 KOps/s $\textbf{\color{#d91a1a}-7.83\%}$
test_set 0.2402ms 20.3257μs 49.1988 KOps/s 53.3529 KOps/s $\textbf{\color{#d91a1a}-7.79\%}$
test_set_shared 3.5833ms 0.1738ms 5.7536 KOps/s 5.7818 KOps/s $\color{#d91a1a}-0.49\%$
test_update 0.4297ms 23.9401μs 41.7709 KOps/s 50.0148 KOps/s $\textbf{\color{#d91a1a}-16.48\%}$
test_update_nested 0.3857ms 34.5011μs 28.9846 KOps/s 32.0803 KOps/s $\textbf{\color{#d91a1a}-9.65\%}$
test_update__nested 0.5373ms 34.5379μs 28.9537 KOps/s 29.8909 KOps/s $\color{#d91a1a}-3.14\%$
test_set_nested 0.3418ms 22.5232μs 44.3987 KOps/s 48.1743 KOps/s $\textbf{\color{#d91a1a}-7.84\%}$
test_set_nested_new 0.1208ms 27.5077μs 36.3535 KOps/s 39.9675 KOps/s $\textbf{\color{#d91a1a}-9.04\%}$
test_select 0.4270ms 45.8990μs 21.7870 KOps/s 23.0409 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_select_nested 0.1244ms 64.7644μs 15.4406 KOps/s 15.8805 KOps/s $\color{#d91a1a}-2.77\%$
test_exclude_nested 0.1744ms 83.0230μs 12.0449 KOps/s 12.2873 KOps/s $\color{#d91a1a}-1.97\%$
test_empty[True] 0.5302ms 0.4334ms 2.3072 KOps/s 2.5984 KOps/s $\textbf{\color{#d91a1a}-11.21\%}$
test_empty[False] 19.1605μs 1.4391μs 694.8950 KOps/s 765.2208 KOps/s $\textbf{\color{#d91a1a}-9.19\%}$
test_unbind_speed 0.4775ms 0.2782ms 3.5940 KOps/s 3.7772 KOps/s $\color{#d91a1a}-4.85\%$
test_unbind_speed_stack0 0.4453ms 0.2744ms 3.6447 KOps/s 3.8419 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_unbind_speed_stack1 0.1128s 0.8223ms 1.2161 KOps/s 1.4123 KOps/s $\textbf{\color{#d91a1a}-13.89\%}$
test_split 1.8109ms 1.6003ms 624.8825 Ops/s 551.0864 Ops/s $\textbf{\color{#35bf28}+13.39\%}$
test_chunk 0.1087s 1.7743ms 563.6167 Ops/s 552.1593 Ops/s $\color{#35bf28}+2.08\%$
test_consolidate_njt[False-None] 0.1178s 9.1142ms 109.7185 Ops/s 121.4908 Ops/s $\textbf{\color{#d91a1a}-9.69\%}$
test_creation[device0] 0.2371ms 90.7709μs 11.0168 KOps/s 10.8480 KOps/s $\color{#35bf28}+1.56\%$
test_creation_from_tensor 0.2550ms 93.2986μs 10.7183 KOps/s 9.5541 KOps/s $\textbf{\color{#35bf28}+12.18\%}$
test_add_one[memmap_tensor0] 0.7275ms 4.8110μs 207.8577 KOps/s 207.3946 KOps/s $\color{#35bf28}+0.22\%$
test_contiguous[memmap_tensor0] 11.7320μs 0.5123μs 1.9519 MOps/s 1.9247 MOps/s $\color{#35bf28}+1.42\%$
test_stack[memmap_tensor0] 62.7460μs 3.4663μs 288.4929 KOps/s 302.4127 KOps/s $\color{#d91a1a}-4.60\%$
test_memmaptd_index 1.0133ms 0.2383ms 4.1964 KOps/s 4.1565 KOps/s $\color{#35bf28}+0.96\%$
test_memmaptd_index_astensor 0.6021ms 0.3258ms 3.0692 KOps/s 3.1085 KOps/s $\color{#d91a1a}-1.26\%$
test_memmaptd_index_op 1.0777ms 0.6174ms 1.6196 KOps/s 1.7681 KOps/s $\textbf{\color{#d91a1a}-8.40\%}$
test_serialize_model 0.1253s 0.1186s 8.4343 Ops/s 8.6634 Ops/s $\color{#d91a1a}-2.64\%$
test_serialize_model_pickle 0.4585s 0.3896s 2.5665 Ops/s 2.4748 Ops/s $\color{#35bf28}+3.70\%$
test_serialize_weights 0.1256s 0.1136s 8.8033 Ops/s 7.3756 Ops/s $\textbf{\color{#35bf28}+19.36\%}$
test_serialize_weights_returnearly 0.1719s 0.1606s 6.2271 Ops/s 6.2076 Ops/s $\color{#35bf28}+0.31\%$
test_serialize_weights_pickle 0.5524s 0.4468s 2.2381 Ops/s 2.3731 Ops/s $\textbf{\color{#d91a1a}-5.69\%}$
test_serialize_weights_filesystem 0.2641s 0.1579s 6.3325 Ops/s 6.8523 Ops/s $\textbf{\color{#d91a1a}-7.59\%}$
test_serialize_model_filesystem 0.1612s 0.1464s 6.8298 Ops/s 6.7044 Ops/s $\color{#35bf28}+1.87\%$
test_reshape_pytree 62.1660μs 27.4916μs 36.3747 KOps/s 37.3885 KOps/s $\color{#d91a1a}-2.71\%$
test_reshape_td 85.4690μs 33.9414μs 29.4625 KOps/s 29.7091 KOps/s $\color{#d91a1a}-0.83\%$
test_view_pytree 0.1056ms 27.3163μs 36.6082 KOps/s 37.3977 KOps/s $\color{#d91a1a}-2.11\%$
test_view_td 80.1290μs 39.4284μs 25.3625 KOps/s 25.4465 KOps/s $\color{#d91a1a}-0.33\%$
test_unbind_pytree 67.6860μs 29.7352μs 33.6301 KOps/s 33.3355 KOps/s $\color{#35bf28}+0.88\%$
test_unbind_td 0.3612ms 40.5888μs 24.6373 KOps/s 25.6660 KOps/s $\color{#d91a1a}-4.01\%$
test_split_pytree 70.2210μs 29.5767μs 33.8103 KOps/s 34.0714 KOps/s $\color{#d91a1a}-0.77\%$
test_split_td 0.2807ms 47.1211μs 21.2219 KOps/s 17.4594 KOps/s $\textbf{\color{#35bf28}+21.55\%}$
test_add_pytree 80.7100μs 36.3403μs 27.5177 KOps/s 28.0306 KOps/s $\color{#d91a1a}-1.83\%$
test_add_td 0.1281ms 59.8981μs 16.6950 KOps/s 19.0266 KOps/s $\textbf{\color{#d91a1a}-12.25\%}$
test_compile_add_one_nested[tensordict-compile] 0.1279ms 62.0627μs 16.1127 KOps/s 15.8706 KOps/s $\color{#35bf28}+1.53\%$
test_compile_add_one_nested[tensordict-eager] 0.6648ms 0.1722ms 5.8066 KOps/s 6.2047 KOps/s $\textbf{\color{#d91a1a}-6.42\%}$
test_compile_add_one_nested[pytree-compile] 0.1432ms 46.1449μs 21.6709 KOps/s 21.8041 KOps/s $\color{#d91a1a}-0.61\%$
test_compile_add_one_nested[pytree-eager] 0.2789ms 0.1204ms 8.3071 KOps/s 8.4219 KOps/s $\color{#d91a1a}-1.36\%$
test_compile_copy_nested[tensordict-compile] 0.1187ms 26.3840μs 37.9018 KOps/s 37.9086 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_copy_nested[tensordict-eager] 0.1260ms 58.8310μs 16.9978 KOps/s 18.3582 KOps/s $\textbf{\color{#d91a1a}-7.41\%}$
test_compile_copy_nested[pytree-compile] 0.1609ms 79.9105μs 12.5140 KOps/s 12.6980 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_copy_nested[pytree-eager] 0.1400ms 68.3961μs 14.6207 KOps/s 14.7726 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_add_one_flat[tensordict-compile] 0.2381ms 0.1039ms 9.6218 KOps/s 9.4112 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_flat[tensordict-eager] 0.4321ms 0.2187ms 4.5715 KOps/s 4.9844 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_compile_add_one_flat[tensorclass-compile] 0.1409ms 44.7953μs 22.3237 KOps/s 21.6021 KOps/s $\color{#35bf28}+3.34\%$
test_compile_add_one_flat[tensorclass-eager] 0.4611ms 66.5312μs 15.0305 KOps/s 16.0107 KOps/s $\textbf{\color{#d91a1a}-6.12\%}$
test_compile_add_one_flat[pytree-compile] 0.2355ms 0.1032ms 9.6920 KOps/s 9.6133 KOps/s $\color{#35bf28}+0.82\%$
test_compile_add_one_flat[pytree-eager] 0.3936ms 0.1990ms 5.0263 KOps/s 4.9665 KOps/s $\color{#35bf28}+1.20\%$
test_compile_add_self_flat[tensordict-eager] 0.3456ms 0.2343ms 4.2689 KOps/s 4.7001 KOps/s $\textbf{\color{#d91a1a}-9.17\%}$
test_compile_add_self_flat[tensordict-compile] 0.1892ms 0.1049ms 9.5371 KOps/s 9.4686 KOps/s $\color{#35bf28}+0.72\%$
test_compile_add_self_flat[tensorclass-eager] 0.2870ms 60.1341μs 16.6295 KOps/s 17.9689 KOps/s $\textbf{\color{#d91a1a}-7.45\%}$
test_compile_add_self_flat[tensorclass-compile] 0.1902ms 47.2474μs 21.1652 KOps/s 21.9768 KOps/s $\color{#d91a1a}-3.69\%$
test_compile_add_self_flat[pytree-eager] 0.6279ms 0.1576ms 6.3467 KOps/s 6.2774 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_self_flat[pytree-compile] 0.1761ms 0.1034ms 9.6728 KOps/s 9.7048 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_copy_flat[tensordict-compile] 61.5840μs 21.2245μs 47.1154 KOps/s 48.2441 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_copy_flat[tensordict-eager] 0.1713ms 65.5768μs 15.2493 KOps/s 17.1214 KOps/s $\textbf{\color{#d91a1a}-10.93\%}$
test_compile_copy_flat[pytree-compile] 0.1953ms 79.7425μs 12.5404 KOps/s 12.1120 KOps/s $\color{#35bf28}+3.54\%$
test_compile_copy_flat[pytree-eager] 0.1413ms 67.9685μs 14.7127 KOps/s 14.2827 KOps/s $\color{#35bf28}+3.01\%$
test_compile_assign_and_add[tensordict-compile] 0.3101ms 0.2083ms 4.8018 KOps/s 4.8843 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_assign_and_add[tensordict-eager] 1.5105ms 1.3369ms 748.0259 Ops/s 757.1184 Ops/s $\color{#d91a1a}-1.20\%$
test_compile_assign_and_add[pytree-compile] 0.4140ms 0.2045ms 4.8912 KOps/s 4.8911 KOps/s $+0.00\%$
test_compile_assign_and_add[pytree-eager] 1.3201ms 0.7694ms 1.2997 KOps/s 1.2944 KOps/s $\color{#35bf28}+0.41\%$
test_compile_assign_and_add_stack[compile] 0.5626ms 0.4603ms 2.1725 KOps/s 2.1969 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_assign_and_add_stack[eager] 3.6665ms 2.6798ms 373.1579 Ops/s 391.1088 Ops/s $\color{#d91a1a}-4.59\%$
test_compile_indexing[tensor-tensordict-compile] 90.0670μs 36.1716μs 27.6460 KOps/s 27.1683 KOps/s $\color{#35bf28}+1.76\%$
test_compile_indexing[tensor-tensordict-eager] 0.6204ms 34.1692μs 29.2662 KOps/s 29.6785 KOps/s $\color{#d91a1a}-1.39\%$
test_compile_indexing[tensor-tensorclass-compile] 82.5040μs 29.7559μs 33.6068 KOps/s 32.7780 KOps/s $\color{#35bf28}+2.53\%$
test_compile_indexing[tensor-tensorclass-eager] 75.3100μs 23.5739μs 42.4198 KOps/s 44.0759 KOps/s $\color{#d91a1a}-3.76\%$
test_compile_indexing[tensor-pytree-compile] 0.1049ms 30.0236μs 33.3072 KOps/s 31.9779 KOps/s $\color{#35bf28}+4.16\%$
test_compile_indexing[tensor-pytree-eager] 79.6480μs 23.3686μs 42.7925 KOps/s 43.9666 KOps/s $\color{#d91a1a}-2.67\%$
test_compile_indexing[slice-tensordict-compile] 0.1116ms 51.5383μs 19.4031 KOps/s 19.3779 KOps/s $\color{#35bf28}+0.13\%$
test_compile_indexing[slice-tensordict-eager] 0.3897ms 20.5688μs 48.6174 KOps/s 48.6333 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[slice-tensorclass-compile] 92.9630μs 43.8041μs 22.8289 KOps/s 22.1708 KOps/s $\color{#35bf28}+2.97\%$
test_compile_indexing[slice-tensorclass-eager] 64.5200μs 19.2995μs 51.8148 KOps/s 52.1782 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_indexing[slice-pytree-compile] 0.1047ms 44.9943μs 22.2250 KOps/s 21.9326 KOps/s $\color{#35bf28}+1.33\%$
test_compile_indexing[slice-pytree-eager] 65.8320μs 18.9428μs 52.7904 KOps/s 52.1073 KOps/s $\color{#35bf28}+1.31\%$
test_compile_indexing[int-tensordict-compile] 0.1031ms 52.8170μs 18.9333 KOps/s 19.0949 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_indexing[int-tensordict-eager] 0.9448ms 20.6968μs 48.3167 KOps/s 49.5505 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_indexing[int-tensorclass-compile] 0.1043ms 45.3989μs 22.0270 KOps/s 21.8768 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[int-tensorclass-eager] 73.3360μs 19.2296μs 52.0033 KOps/s 52.9081 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[int-pytree-compile] 0.1071ms 45.1918μs 22.1279 KOps/s 21.9717 KOps/s $\color{#35bf28}+0.71\%$
test_compile_indexing[int-pytree-eager] 66.3130μs 19.0273μs 52.5562 KOps/s 52.4245 KOps/s $\color{#35bf28}+0.25\%$
test_mod_add[eager] 99.0540μs 36.6647μs 27.2742 KOps/s 30.2976 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_mod_add[compile] 0.1326ms 48.1001μs 20.7900 KOps/s 20.6964 KOps/s $\color{#35bf28}+0.45\%$
test_mod_add[compile-overhead] 0.1523ms 46.9181μs 21.3137 KOps/s 20.5383 KOps/s $\color{#35bf28}+3.78\%$
test_mod_wrap[eager] 0.3454ms 0.2285ms 4.3768 KOps/s 4.4511 KOps/s $\color{#d91a1a}-1.67\%$
test_mod_wrap[compile] 0.4262ms 0.2094ms 4.7757 KOps/s 4.7861 KOps/s $\color{#d91a1a}-0.22\%$
test_mod_wrap[compile-overhead] 0.4145ms 0.2072ms 4.8267 KOps/s 4.7462 KOps/s $\color{#35bf28}+1.70\%$
test_mod_wrap_and_backward[eager] 16.7722ms 12.7920ms 78.1738 Ops/s 82.6006 Ops/s $\textbf{\color{#d91a1a}-5.36\%}$
test_mod_wrap_and_backward[compile] 15.5285ms 12.7445ms 78.4652 Ops/s 77.6706 Ops/s $\color{#35bf28}+1.02\%$
test_mod_wrap_and_backward[compile-overhead] 19.6406ms 13.4669ms 74.2560 Ops/s 79.6049 Ops/s $\textbf{\color{#d91a1a}-6.72\%}$
test_seq_add[eager] 0.2116ms 0.1172ms 8.5333 KOps/s 9.0615 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_seq_add[compile] 0.1166ms 63.2254μs 15.8164 KOps/s 15.8974 KOps/s $\color{#d91a1a}-0.51\%$
test_seq_add[compile-overhead] 0.1389ms 59.7839μs 16.7269 KOps/s 16.4546 KOps/s $\color{#35bf28}+1.65\%$
test_seq_wrap[eager] 0.6014ms 0.4541ms 2.2020 KOps/s 2.2488 KOps/s $\color{#d91a1a}-2.08\%$
test_seq_wrap[compile] 0.4058ms 0.2292ms 4.3634 KOps/s 4.3078 KOps/s $\color{#35bf28}+1.29\%$
test_seq_wrap[compile-overhead] 0.3720ms 0.2304ms 4.3412 KOps/s 4.3516 KOps/s $\color{#d91a1a}-0.24\%$
test_func_call_runtime[False-eager] 1.0315ms 0.5666ms 1.7649 KOps/s 1.7908 KOps/s $\color{#d91a1a}-1.45\%$
test_func_call_runtime[False-compile] 0.7759ms 0.4292ms 2.3301 KOps/s 2.3789 KOps/s $\color{#d91a1a}-2.05\%$
test_func_call_runtime[False-compile-overhead] 0.5465ms 0.4282ms 2.3354 KOps/s 2.3525 KOps/s $\color{#d91a1a}-0.73\%$
test_func_call_runtime[True-eager] 1.0524ms 0.7809ms 1.2806 KOps/s 1.2912 KOps/s $\color{#d91a1a}-0.82\%$
test_func_call_runtime[True-compile] 0.6242ms 0.4685ms 2.1346 KOps/s 2.1515 KOps/s $\color{#d91a1a}-0.79\%$
test_func_call_runtime[True-compile-overhead] 0.6173ms 0.4649ms 2.1509 KOps/s 2.1603 KOps/s $\color{#d91a1a}-0.44\%$
test_func_call_cm_runtime[False-eager] 0.9908ms 0.5603ms 1.7849 KOps/s 1.7971 KOps/s $\color{#d91a1a}-0.68\%$
test_func_call_cm_runtime[False-compile] 0.5704ms 0.4247ms 2.3546 KOps/s 2.3590 KOps/s $\color{#d91a1a}-0.19\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5372ms 0.4244ms 2.3561 KOps/s 2.3849 KOps/s $\color{#d91a1a}-1.21\%$
test_func_call_cm_runtime[True-eager] 1.3047ms 0.9283ms 1.0773 KOps/s 1.1073 KOps/s $\color{#d91a1a}-2.71\%$
test_func_call_cm_runtime[True-compile] 0.6065ms 0.4916ms 2.0343 KOps/s 2.0617 KOps/s $\color{#d91a1a}-1.33\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5947ms 0.4896ms 2.0423 KOps/s 2.0498 KOps/s $\color{#d91a1a}-0.37\%$
test_vmap_func_call_cm_runtime[eager] 2.6581ms 1.9147ms 522.2850 Ops/s 528.9813 Ops/s $\color{#d91a1a}-1.27\%$
test_vmap_func_call_cm_runtime[compile] 0.7006ms 0.5097ms 1.9621 KOps/s 1.9132 KOps/s $\color{#35bf28}+2.56\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7902ms 0.5142ms 1.9447 KOps/s 1.9288 KOps/s $\color{#35bf28}+0.83\%$
test_distributed 0.6070ms 0.1258ms 7.9495 KOps/s 7.7761 KOps/s $\color{#35bf28}+2.23\%$
test_tdmodule 49.6730μs 27.2910μs 36.6421 KOps/s 39.2321 KOps/s $\textbf{\color{#d91a1a}-6.60\%}$
test_tdmodule_dispatch 88.1940μs 50.3850μs 19.8472 KOps/s 21.8207 KOps/s $\textbf{\color{#d91a1a}-9.04\%}$
test_tdseq 48.1200μs 27.4967μs 36.3680 KOps/s 39.4469 KOps/s $\textbf{\color{#d91a1a}-7.81\%}$
test_tdseq_dispatch 85.3890μs 53.5413μs 18.6772 KOps/s 20.9034 KOps/s $\textbf{\color{#d91a1a}-10.65\%}$
test_instantiation_functorch 2.3046ms 1.5514ms 644.5874 Ops/s 635.7030 Ops/s $\color{#35bf28}+1.40\%$
test_exec_functorch 0.3331ms 0.1942ms 5.1495 KOps/s 5.5337 KOps/s $\textbf{\color{#d91a1a}-6.94\%}$
test_exec_functional_call 0.3324ms 0.1787ms 5.5945 KOps/s 5.8460 KOps/s $\color{#d91a1a}-4.30\%$
test_exec_td_decorator 0.5659ms 0.2385ms 4.1932 KOps/s 4.3835 KOps/s $\color{#d91a1a}-4.34\%$
test_vmap_mlp_speed_decorator[True-True] 0.9844ms 0.6677ms 1.4977 KOps/s 1.5405 KOps/s $\color{#d91a1a}-2.77\%$
test_vmap_mlp_speed_decorator[True-False] 1.1899ms 0.6748ms 1.4818 KOps/s 1.5169 KOps/s $\color{#d91a1a}-2.31\%$
test_vmap_mlp_speed_decorator[False-True] 2.4612ms 0.5430ms 1.8417 KOps/s 1.8790 KOps/s $\color{#d91a1a}-1.98\%$
test_vmap_mlp_speed_decorator[False-False] 0.7794ms 0.5416ms 1.8464 KOps/s 1.8885 KOps/s $\color{#d91a1a}-2.23\%$
test_to_module_speed[True] 1.5806ms 1.3624ms 734.0031 Ops/s 774.3877 Ops/s $\textbf{\color{#d91a1a}-5.22\%}$
test_to_module_speed[False] 1.5512ms 1.3159ms 759.9560 Ops/s 792.0041 Ops/s $\color{#d91a1a}-4.05\%$
test_tc_init 89.2860μs 50.2034μs 19.9190 KOps/s 22.5920 KOps/s $\textbf{\color{#d91a1a}-11.83\%}$
test_tc_init_nested 0.1734ms 99.9034μs 10.0097 KOps/s 11.2331 KOps/s $\textbf{\color{#d91a1a}-10.89\%}$
test_tc_first_layer_tensor 28.2030μs 1.5523μs 644.2165 KOps/s 656.6603 KOps/s $\color{#d91a1a}-1.90\%$
test_tc_first_layer_nontensor 36.3370μs 4.6991μs 212.8084 KOps/s 201.7902 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_tc_second_layer_tensor 42.5390μs 2.8942μs 345.5236 KOps/s 353.3794 KOps/s $\color{#d91a1a}-2.22\%$
test_tc_second_layer_nontensor 26.8790μs 6.1357μs 162.9806 KOps/s 158.1763 KOps/s $\color{#35bf28}+3.04\%$
test_unbind 0.2498s 16.1309ms 61.9927 Ops/s 76.7332 Ops/s $\textbf{\color{#d91a1a}-19.21\%}$
test_full_like 8.7453ms 7.5734ms 132.0408 Ops/s 77.9756 Ops/s $\textbf{\color{#35bf28}+69.34\%}$
test_zeros_like 4.1722ms 3.0279ms 330.2652 Ops/s 133.0278 Ops/s $\textbf{\color{#35bf28}+148.27\%}$
test_ones_like 4.5101ms 3.4497ms 289.8807 Ops/s 129.4414 Ops/s $\textbf{\color{#35bf28}+123.95\%}$
test_clone 16.2287ms 8.1020ms 123.4269 Ops/s 103.2108 Ops/s $\textbf{\color{#35bf28}+19.59\%}$
test_squeeze 87.4430μs 12.4358μs 80.4133 KOps/s 85.8042 KOps/s $\textbf{\color{#d91a1a}-6.28\%}$
test_unsqueeze 0.1670ms 92.1966μs 10.8464 KOps/s 11.1848 KOps/s $\color{#d91a1a}-3.03\%$
test_split 0.3678ms 0.1991ms 5.0224 KOps/s 5.1092 KOps/s $\color{#d91a1a}-1.70\%$
test_permute 0.3678ms 0.2151ms 4.6488 KOps/s 4.9864 KOps/s $\textbf{\color{#d91a1a}-6.77\%}$
test_stack 32.1051ms 25.8601ms 38.6696 Ops/s 37.0406 Ops/s $\color{#35bf28}+4.40\%$
test_cat 31.6346ms 25.7211ms 38.8786 Ops/s 36.8797 Ops/s $\textbf{\color{#35bf28}+5.42\%}$
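
The Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code: the function names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bold and from the "Improved: 12" count matching the twelve rows above +5%.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD.

    Both arguments must use the same unit (e.g. both in KOps/s).
    """
    return (ops / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not a documented value."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": (47.0032, 60.4768),
    "test_keys": (285.6063, 266.7866),
    "test_items": (238.9266, 241.6223),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows whose change stays inside the threshold (such as test_items at about -1%) are likely run-to-run noise rather than real regressions, which is presumably why they are not counted in the summary line.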


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}32$. Worsened: $\large\color{#d91a1a}41$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.4100μs 12.7702μs 78.3074 KOps/s 94.9802 KOps/s $\textbf{\color{#d91a1a}-17.55\%}$
test_plain_set_stack_nested 42.2100μs 12.7880μs 78.1981 KOps/s 97.2236 KOps/s $\textbf{\color{#d91a1a}-19.57\%}$
test_plain_set_nested_inplace 40.1110μs 13.8443μs 72.2321 KOps/s 88.1753 KOps/s $\textbf{\color{#d91a1a}-18.08\%}$
test_plain_set_stack_nested_inplace 36.8810μs 13.7468μs 72.7445 KOps/s 89.2497 KOps/s $\textbf{\color{#d91a1a}-18.49\%}$
test_items 32.2200μs 2.9351μs 340.7058 KOps/s 342.3528 KOps/s $\color{#d91a1a}-0.48\%$
test_items_nested 0.4220ms 0.3647ms 2.7418 KOps/s 2.8282 KOps/s $\color{#d91a1a}-3.05\%$
test_items_nested_locked 0.5129ms 0.3610ms 2.7699 KOps/s 2.8101 KOps/s $\color{#d91a1a}-1.43\%$
test_items_nested_leaf 90.5420μs 62.0766μs 16.1091 KOps/s 16.7041 KOps/s $\color{#d91a1a}-3.56\%$
test_items_stack_nested 0.4212ms 0.3656ms 2.7351 KOps/s 2.7627 KOps/s $\color{#d91a1a}-1.00\%$
test_items_stack_nested_leaf 96.7910μs 62.8445μs 15.9123 KOps/s 16.5794 KOps/s $\color{#d91a1a}-4.02\%$
test_items_stack_nested_locked 0.4259ms 0.3675ms 2.7212 KOps/s 2.8112 KOps/s $\color{#d91a1a}-3.20\%$
test_keys 25.3700μs 3.4753μs 287.7452 KOps/s 287.9181 KOps/s $\color{#d91a1a}-0.06\%$
test_keys_nested 0.1222ms 82.9988μs 12.0484 KOps/s 13.8706 KOps/s $\textbf{\color{#d91a1a}-13.14\%}$
test_keys_nested_locked 0.7744ms 88.8801μs 11.2511 KOps/s 12.6655 KOps/s $\textbf{\color{#d91a1a}-11.17\%}$
test_keys_nested_leaf 0.1125ms 73.4958μs 13.6062 KOps/s 15.6792 KOps/s $\textbf{\color{#d91a1a}-13.22\%}$
test_keys_stack_nested 0.1165ms 84.2001μs 11.8765 KOps/s 13.9999 KOps/s $\textbf{\color{#d91a1a}-15.17\%}$
test_keys_stack_nested_leaf 0.1770ms 74.7232μs 13.3827 KOps/s 15.5928 KOps/s $\textbf{\color{#d91a1a}-14.17\%}$
test_keys_stack_nested_locked 0.1253ms 89.1143μs 11.2215 KOps/s 12.6503 KOps/s $\textbf{\color{#d91a1a}-11.29\%}$
test_values 9.6418μs 0.8676μs 1.1526 MOps/s 1.1585 MOps/s $\color{#d91a1a}-0.51\%$
test_values_nested 64.3410μs 34.7031μs 28.8159 KOps/s 32.1852 KOps/s $\textbf{\color{#d91a1a}-10.47\%}$
test_values_nested_locked 85.3810μs 35.7563μs 27.9671 KOps/s 30.4057 KOps/s $\textbf{\color{#d91a1a}-8.02\%}$
test_values_nested_leaf 69.9410μs 39.6482μs 25.2218 KOps/s 29.2303 KOps/s $\textbf{\color{#d91a1a}-13.71\%}$
test_values_stack_nested 95.1320μs 35.3759μs 28.2678 KOps/s 30.8504 KOps/s $\textbf{\color{#d91a1a}-8.37\%}$
test_values_stack_nested_leaf 66.5010μs 40.0030μs 24.9982 KOps/s 28.9424 KOps/s $\textbf{\color{#d91a1a}-13.63\%}$
test_values_stack_nested_locked 67.0110μs 36.6113μs 27.3140 KOps/s 29.5472 KOps/s $\textbf{\color{#d91a1a}-7.56\%}$
test_membership 2.7740μs 0.5121μs 1.9527 MOps/s 1.8616 MOps/s $\color{#35bf28}+4.90\%$
test_membership_nested 52.3210μs 2.0670μs 483.7943 KOps/s 474.2127 KOps/s $\color{#35bf28}+2.02\%$
test_membership_nested_leaf 14.7905μs 2.0208μs 494.8570 KOps/s 484.5995 KOps/s $\color{#35bf28}+2.12\%$
test_membership_stacked_nested 20.9600μs 2.1093μs 474.0985 KOps/s 461.1095 KOps/s $\color{#35bf28}+2.82\%$
test_membership_stacked_nested_leaf 33.8600μs 2.0906μs 478.3253 KOps/s 465.5680 KOps/s $\color{#35bf28}+2.74\%$
test_membership_nested_last 45.6010μs 3.0739μs 325.3243 KOps/s 328.6244 KOps/s $\color{#d91a1a}-1.00\%$
test_membership_nested_leaf_last 28.3410μs 3.1278μs 319.7113 KOps/s 328.7420 KOps/s $\color{#d91a1a}-2.75\%$
test_membership_stacked_nested_last 39.1710μs 3.0644μs 326.3239 KOps/s 330.0151 KOps/s $\color{#d91a1a}-1.12\%$
test_membership_stacked_nested_leaf_last 24.8200μs 3.0655μs 326.2060 KOps/s 331.5572 KOps/s $\color{#d91a1a}-1.61\%$
test_nested_getleaf 47.4610μs 6.1438μs 162.7652 KOps/s 157.2476 KOps/s $\color{#35bf28}+3.51\%$
test_nested_get 38.4010μs 5.8355μs 171.3660 KOps/s 165.1974 KOps/s $\color{#35bf28}+3.73\%$
test_stacked_getleaf 29.5000μs 6.1367μs 162.9530 KOps/s 160.3164 KOps/s $\color{#35bf28}+1.64\%$
test_stacked_get 28.5510μs 5.8423μs 171.1652 KOps/s 166.5217 KOps/s $\color{#35bf28}+2.79\%$
test_nested_getitemleaf 31.4410μs 6.2839μs 159.1361 KOps/s 154.7002 KOps/s $\color{#35bf28}+2.87\%$
test_nested_getitem 40.2400μs 5.9254μs 168.7640 KOps/s 166.7390 KOps/s $\color{#35bf28}+1.21\%$
test_stacked_getitemleaf 40.4310μs 6.2714μs 159.4532 KOps/s 157.1397 KOps/s $\color{#35bf28}+1.47\%$
test_stacked_getitem 31.3600μs 5.9309μs 168.6088 KOps/s 163.7026 KOps/s $\color{#35bf28}+3.00\%$
test_lock_nested 1.1208ms 0.3741ms 2.6728 KOps/s 2.5712 KOps/s $\color{#35bf28}+3.95\%$
test_lock_stack_nested 0.4240ms 0.3490ms 2.8655 KOps/s 2.8474 KOps/s $\color{#35bf28}+0.64\%$
test_unlock_nested 0.6795ms 0.3168ms 3.1562 KOps/s 3.0682 KOps/s $\color{#35bf28}+2.87\%$
test_unlock_stack_nested 0.3636ms 0.2875ms 3.4781 KOps/s 3.4278 KOps/s $\color{#35bf28}+1.47\%$
test_flatten_speed 0.1298ms 78.1367μs 12.7981 KOps/s 13.3723 KOps/s $\color{#d91a1a}-4.29\%$
test_unflatten_speed 0.4002ms 0.3301ms 3.0293 KOps/s 3.2554 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_common_ops 1.5884ms 0.6294ms 1.5888 KOps/s 1.6487 KOps/s $\color{#d91a1a}-3.63\%$
test_creation 0.1053ms 1.7682μs 565.5322 KOps/s 661.3696 KOps/s $\textbf{\color{#d91a1a}-14.49\%}$
test_creation_empty 28.8810μs 8.9199μs 112.1086 KOps/s 146.7376 KOps/s $\textbf{\color{#d91a1a}-23.60\%}$
test_creation_nested_1 37.8910μs 10.7207μs 93.2774 KOps/s 120.1179 KOps/s $\textbf{\color{#d91a1a}-22.35\%}$
test_creation_nested_2 44.8300μs 13.4249μs 74.4885 KOps/s 90.8115 KOps/s $\textbf{\color{#d91a1a}-17.97\%}$
test_clone 0.1220ms 10.9457μs 91.3599 KOps/s 86.4203 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_getitem[int] 1.7520ms 10.6828μs 93.6081 KOps/s 87.1197 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_getitem[slice_int] 0.1090ms 20.9910μs 47.6394 KOps/s 44.0688 KOps/s $\textbf{\color{#35bf28}+8.10\%}$
test_getitem[range] 0.1368ms 39.2152μs 25.5003 KOps/s 24.8989 KOps/s $\color{#35bf28}+2.42\%$
test_getitem[tuple] 0.1271ms 18.3903μs 54.3764 KOps/s 50.2955 KOps/s $\textbf{\color{#35bf28}+8.11\%}$
test_getitem[list] 0.1455ms 33.5030μs 29.8481 KOps/s 28.5334 KOps/s $\color{#35bf28}+4.61\%$
test_setitem_dim[int] 38.3910μs 18.7706μs 53.2748 KOps/s 49.6523 KOps/s $\textbf{\color{#35bf28}+7.30\%}$
test_setitem_dim[slice_int] 58.7310μs 38.5802μs 25.9200 KOps/s 24.4292 KOps/s $\textbf{\color{#35bf28}+6.10\%}$
test_setitem_dim[range] 95.3720μs 56.9753μs 17.5515 KOps/s 18.1715 KOps/s $\color{#d91a1a}-3.41\%$
test_setitem_dim[tuple] 71.7110μs 33.7851μs 29.5989 KOps/s 29.8639 KOps/s $\color{#d91a1a}-0.89\%$
test_setitem 0.1312ms 16.1165μs 62.0480 KOps/s 64.2920 KOps/s $\color{#d91a1a}-3.49\%$
test_set 0.1331ms 15.1291μs 66.0977 KOps/s 64.7493 KOps/s $\color{#35bf28}+2.08\%$
test_set_shared 1.5744ms 0.1516ms 6.5972 KOps/s 6.5463 KOps/s $\color{#35bf28}+0.78\%$
test_update 0.2485ms 18.4473μs 54.2083 KOps/s 57.0281 KOps/s $\color{#d91a1a}-4.94\%$
test_update_nested 0.1419ms 23.8098μs 41.9995 KOps/s 44.3927 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_update__nested 1.2569ms 25.5365μs 39.1597 KOps/s 38.6332 KOps/s $\color{#35bf28}+1.36\%$
test_set_nested 0.1403ms 16.9202μs 59.1010 KOps/s 60.8005 KOps/s $\color{#d91a1a}-2.80\%$
test_set_nested_new 0.1348ms 19.3798μs 51.6000 KOps/s 53.2783 KOps/s $\color{#d91a1a}-3.15\%$
test_select 0.1441ms 32.1895μs 31.0660 KOps/s 32.6681 KOps/s $\color{#d91a1a}-4.90\%$
test_select_nested 80.2310μs 43.6008μs 22.9354 KOps/s 23.3428 KOps/s $\color{#d91a1a}-1.75\%$
test_exclude_nested 94.4910μs 63.1670μs 15.8311 KOps/s 15.8718 KOps/s $\color{#d91a1a}-0.26\%$
test_empty[True] 0.3608ms 0.2954ms 3.3856 KOps/s 3.6003 KOps/s $\textbf{\color{#d91a1a}-5.96\%}$
test_empty[False] 4.7661μs 0.8328μs 1.2007 MOps/s 1.3148 MOps/s $\textbf{\color{#d91a1a}-8.67\%}$
test_to 87.7410μs 57.0937μs 17.5151 KOps/s 17.8096 KOps/s $\color{#d91a1a}-1.65\%$
test_to_nonblocking 99.6310μs 49.3781μs 20.2519 KOps/s 20.5775 KOps/s $\color{#d91a1a}-1.58\%$
test_unbind_speed 0.8189ms 0.2349ms 4.2575 KOps/s 4.0703 KOps/s $\color{#35bf28}+4.60\%$
test_unbind_speed_stack0 0.3421ms 0.2442ms 4.0942 KOps/s 4.0615 KOps/s $\color{#35bf28}+0.80\%$
test_unbind_speed_stack1 95.2797ms 0.6798ms 1.4710 KOps/s 1.4664 KOps/s $\color{#35bf28}+0.31\%$
test_split 95.0024ms 1.6256ms 615.1755 Ops/s 591.9639 Ops/s $\color{#35bf28}+3.92\%$
test_chunk 95.9126ms 1.6182ms 617.9616 Ops/s 590.3613 Ops/s $\color{#35bf28}+4.68\%$
test_consolidate[False-None] 97.4550ms 3.0399ms 328.9544 Ops/s 336.7065 Ops/s $\color{#d91a1a}-2.30\%$
test_consolidate[default-None] 1.8520ms 1.7404ms 574.5898 Ops/s 551.2569 Ops/s $\color{#35bf28}+4.23\%$
test_consolidate[reduce-overhead-None] 1.8892ms 1.7743ms 563.5982 Ops/s 539.2642 Ops/s $\color{#35bf28}+4.51\%$
test_consolidate_njt[False-None] 6.9065ms 6.6419ms 150.5601 Ops/s 148.6335 Ops/s $\color{#35bf28}+1.30\%$
test_to[False-False-None] 1.9138ms 1.7851ms 560.1840 Ops/s 572.3375 Ops/s $\color{#d91a1a}-2.12\%$
test_to[True-False-None] 1.4907ms 1.3617ms 734.3678 Ops/s 711.7779 Ops/s $\color{#35bf28}+3.17\%$
test_to[within-False-None] 4.4107ms 4.2404ms 235.8257 Ops/s 238.5648 Ops/s $\color{#d91a1a}-1.15\%$
test_to[True-default-None] 5.7922ms 5.4148ms 184.6777 Ops/s 184.7076 Ops/s $\color{#d91a1a}-0.02\%$
test_to_njt[False-False-None] 7.3211ms 7.0863ms 141.1174 Ops/s 142.2231 Ops/s $\color{#d91a1a}-0.78\%$
test_to_njt[True-False-None] 5.8400ms 5.5989ms 178.6054 Ops/s 180.4009 Ops/s $\color{#d91a1a}-1.00\%$
test_to_njt[within-False-None] 12.7418ms 12.4513ms 80.3127 Ops/s 80.0480 Ops/s $\color{#35bf28}+0.33\%$
test_creation[device0] 0.4561ms 80.4050μs 12.4370 KOps/s 12.2681 KOps/s $\color{#35bf28}+1.38\%$
test_creation_from_tensor 0.5593ms 85.0230μs 11.7615 KOps/s 11.5543 KOps/s $\color{#35bf28}+1.79\%$
test_add_one[memmap_tensor0] 0.4429ms 7.0705μs 141.4329 KOps/s 126.0567 KOps/s $\textbf{\color{#35bf28}+12.20\%}$
test_contiguous[memmap_tensor0] 3.1321μs 0.4244μs 2.3561 MOps/s 2.4227 MOps/s $\color{#d91a1a}-2.75\%$
test_stack[memmap_tensor0] 38.5310μs 4.4551μs 224.4595 KOps/s 199.6745 KOps/s $\textbf{\color{#35bf28}+12.41\%}$
test_memmaptd_index 1.6503ms 0.2586ms 3.8666 KOps/s 3.6256 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_memmaptd_index_astensor 0.6116ms 0.3207ms 3.1185 KOps/s 3.0446 KOps/s $\color{#35bf28}+2.43\%$
test_memmaptd_index_op 1.0535ms 0.6193ms 1.6148 KOps/s 1.6425 KOps/s $\color{#d91a1a}-1.68\%$
test_serialize_model 0.1325s 0.1312s 7.6208 Ops/s 7.5680 Ops/s $\color{#35bf28}+0.70\%$
test_serialize_model_pickle 1.3503s 1.2173s 0.8215 Ops/s 0.8393 Ops/s $\color{#d91a1a}-2.12\%$
test_serialize_weights 0.1327s 0.1306s 7.6573 Ops/s 7.6863 Ops/s $\color{#d91a1a}-0.38\%$
test_serialize_weights_returnearly 0.4226s 68.3957ms 14.6208 Ops/s 14.3411 Ops/s $\color{#35bf28}+1.95\%$
test_serialize_weights_pickle 1.3792s 1.1923s 0.8387 Ops/s 0.8379 Ops/s $\color{#35bf28}+0.10\%$
test_reshape_pytree 54.4510μs 22.7528μs 43.9507 KOps/s 42.8722 KOps/s $\color{#35bf28}+2.52\%$
test_reshape_td 54.8210μs 27.0139μs 37.0180 KOps/s 33.8452 KOps/s $\textbf{\color{#35bf28}+9.37\%}$
test_view_pytree 54.6310μs 23.0423μs 43.3985 KOps/s 43.3809 KOps/s $\color{#35bf28}+0.04\%$
test_view_td 94.6610μs 31.4306μs 31.8161 KOps/s 31.1678 KOps/s $\color{#35bf28}+2.08\%$
test_unbind_pytree 92.7610μs 28.8517μs 34.6600 KOps/s 33.2513 KOps/s $\color{#35bf28}+4.24\%$
test_unbind_td 0.7683ms 36.0847μs 27.7125 KOps/s 27.2373 KOps/s $\color{#35bf28}+1.74\%$
test_split_pytree 62.4810μs 30.8040μs 32.4633 KOps/s 32.0720 KOps/s $\color{#35bf28}+1.22\%$
test_split_td 0.9507ms 39.0221μs 25.6265 KOps/s 24.1281 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_add_pytree 88.8110μs 34.7900μs 28.7439 KOps/s 27.1096 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_add_td 77.3010μs 48.6865μs 20.5396 KOps/s 20.6672 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_add_one_nested[tensordict-compile] 0.1760ms 0.1232ms 8.1158 KOps/s 7.9707 KOps/s $\color{#35bf28}+1.82\%$
test_compile_add_one_nested[tensordict-eager] 0.2373ms 0.1306ms 7.6583 KOps/s 7.8328 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_one_nested[pytree-compile] 0.2956ms 97.4549μs 10.2612 KOps/s 10.1563 KOps/s $\color{#35bf28}+1.03\%$
test_compile_add_one_nested[pytree-eager] 0.2172ms 0.1499ms 6.6692 KOps/s 6.3092 KOps/s $\textbf{\color{#35bf28}+5.71\%}$
test_compile_copy_nested[tensordict-compile] 71.5310μs 23.7362μs 42.1297 KOps/s 31.5450 KOps/s $\textbf{\color{#35bf28}+33.55\%}$
test_compile_copy_nested[tensordict-eager] 70.6210μs 29.5202μs 33.8751 KOps/s 35.8583 KOps/s $\textbf{\color{#d91a1a}-5.53\%}$
test_compile_copy_nested[pytree-compile] 0.3350ms 65.9795μs 15.1562 KOps/s 15.4409 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_copy_nested[pytree-eager] 81.4810μs 49.5706μs 20.1733 KOps/s 19.6945 KOps/s $\color{#35bf28}+2.43\%$
test_compile_add_one_flat[tensordict-compile] 0.1889ms 0.1460ms 6.8487 KOps/s 6.8371 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensordict-eager] 0.3583ms 0.2191ms 4.5648 KOps/s 4.7101 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_add_one_flat[tensorclass-compile] 0.2337ms 0.1074ms 9.3130 KOps/s 9.5634 KOps/s $\color{#d91a1a}-2.62\%$
test_compile_add_one_flat[tensorclass-eager] 0.1428ms 56.8089μs 17.6029 KOps/s 18.7853 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_compile_add_one_flat[pytree-compile] 0.1792ms 0.1388ms 7.2026 KOps/s 7.2229 KOps/s $\color{#d91a1a}-0.28\%$
test_compile_add_one_flat[pytree-eager] 0.5540ms 0.4880ms 2.0490 KOps/s 1.9300 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_compile_add_self_flat[tensordict-eager] 0.3999ms 0.2656ms 3.7646 KOps/s 3.9564 KOps/s $\color{#d91a1a}-4.85\%$
test_compile_add_self_flat[tensordict-compile] 0.2283ms 0.1578ms 6.3356 KOps/s 6.9679 KOps/s $\textbf{\color{#d91a1a}-9.07\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1984ms 69.6334μs 14.3609 KOps/s 15.4247 KOps/s $\textbf{\color{#d91a1a}-6.90\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2054ms 0.1061ms 9.4217 KOps/s 10.0548 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_compile_add_self_flat[pytree-eager] 0.5830ms 0.4277ms 2.3382 KOps/s 2.3103 KOps/s $\color{#35bf28}+1.21\%$
test_compile_add_self_flat[pytree-compile] 0.2278ms 0.1416ms 7.0630 KOps/s 7.2791 KOps/s $\color{#d91a1a}-2.97\%$
test_compile_copy_flat[tensordict-compile] 0.1112ms 21.1035μs 47.3854 KOps/s 52.6410 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_compile_copy_flat[tensordict-eager] 0.1171ms 31.1034μs 32.1509 KOps/s 36.7809 KOps/s $\textbf{\color{#d91a1a}-12.59\%}$
test_compile_copy_flat[pytree-compile] 0.1665ms 71.5292μs 13.9803 KOps/s 14.0271 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_copy_flat[pytree-eager] 89.7120μs 52.1881μs 19.1615 KOps/s 18.7684 KOps/s $\color{#35bf28}+2.09\%$
test_compile_assign_and_add[tensordict-compile] 1.7831ms 0.4217ms 2.3715 KOps/s 2.1698 KOps/s $\textbf{\color{#35bf28}+9.29\%}$
test_compile_assign_and_add[tensordict-eager] 3.0859ms 2.7932ms 358.0118 Ops/s 369.1028 Ops/s $\color{#d91a1a}-3.00\%$
test_compile_assign_and_add[pytree-compile] 1.6854ms 0.4603ms 2.1724 KOps/s 2.1417 KOps/s $\color{#35bf28}+1.43\%$
test_compile_assign_and_add[pytree-eager] 2.7841ms 2.6714ms 374.3385 Ops/s 344.4674 Ops/s $\textbf{\color{#35bf28}+8.67\%}$
test_compile_indexing[tensor-tensordict-compile] 0.6743ms 0.1163ms 8.5969 KOps/s 8.1395 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5552ms 80.4096μs 12.4363 KOps/s 11.6037 KOps/s $\textbf{\color{#35bf28}+7.18\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.5047ms 0.1117ms 8.9490 KOps/s 8.7854 KOps/s $\color{#35bf28}+1.86\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1444ms 68.3671μs 14.6269 KOps/s 13.9402 KOps/s $\color{#35bf28}+4.93\%$
test_compile_indexing[tensor-pytree-compile] 0.2005ms 0.1097ms 9.1133 KOps/s 9.1394 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[tensor-pytree-eager] 0.1201ms 68.1339μs 14.6770 KOps/s 14.2721 KOps/s $\color{#35bf28}+2.84\%$
test_compile_indexing[slice-tensordict-compile] 0.1548ms 0.1028ms 9.7283 KOps/s 9.5121 KOps/s $\color{#35bf28}+2.27\%$
test_compile_indexing[slice-tensordict-eager] 0.1449ms 17.9003μs 55.8649 KOps/s 50.6696 KOps/s $\textbf{\color{#35bf28}+10.25\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1564ms 98.6346μs 10.1384 KOps/s 10.0737 KOps/s $\color{#35bf28}+0.64\%$
test_compile_indexing[slice-tensorclass-eager] 47.0410μs 15.9966μs 62.5133 KOps/s 50.8888 KOps/s $\textbf{\color{#35bf28}+22.84\%}$
test_compile_indexing[slice-pytree-compile] 0.1544ms 98.8590μs 10.1154 KOps/s 9.4806 KOps/s $\textbf{\color{#35bf28}+6.70\%}$
test_compile_indexing[slice-pytree-eager] 50.2710μs 15.9641μs 62.6407 KOps/s 59.2574 KOps/s $\textbf{\color{#35bf28}+5.71\%}$
test_compile_indexing[int-tensordict-compile] 0.1457ms 0.1039ms 9.6221 KOps/s 9.5414 KOps/s $\color{#35bf28}+0.85\%$
test_compile_indexing[int-tensordict-eager] 0.6377ms 17.4311μs 57.3688 KOps/s 53.4373 KOps/s $\textbf{\color{#35bf28}+7.36\%}$
test_compile_indexing[int-tensorclass-compile] 0.1556ms 99.3926μs 10.0611 KOps/s 9.9750 KOps/s $\color{#35bf28}+0.86\%$
test_compile_indexing[int-tensorclass-eager] 56.0710μs 16.0093μs 62.4638 KOps/s 57.6447 KOps/s $\textbf{\color{#35bf28}+8.36\%}$
test_compile_indexing[int-pytree-compile] 0.2034ms 99.7593μs 10.0241 KOps/s 10.0147 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[int-pytree-eager] 0.3879ms 15.9823μs 62.5691 KOps/s 60.7889 KOps/s $\color{#35bf28}+2.93\%$
test_mod_add[eager] 82.7810μs 39.0159μs 25.6305 KOps/s 26.1643 KOps/s $\color{#d91a1a}-2.04\%$
test_mod_add[compile] 0.3558ms 83.7414μs 11.9415 KOps/s 11.9919 KOps/s $\color{#d91a1a}-0.42\%$
test_mod_add[compile-overhead] 0.3317ms 0.1701ms 5.8801 KOps/s 5.5860 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_mod_wrap[eager] 0.3312ms 0.2551ms 3.9197 KOps/s 3.8794 KOps/s $\color{#35bf28}+1.04\%$
test_mod_wrap[compile] 0.8106ms 0.3006ms 3.3262 KOps/s 3.3955 KOps/s $\color{#d91a1a}-2.04\%$
test_mod_wrap[compile-overhead] 7.0813ms 3.7553ms 266.2900 Ops/s 269.3785 Ops/s $\color{#d91a1a}-1.15\%$
test_mod_wrap_and_backward[eager] 1.5237ms 1.3838ms 722.6359 Ops/s 704.2817 Ops/s $\color{#35bf28}+2.61\%$
test_mod_wrap_and_backward[compile] 1.3926ms 1.2793ms 781.6905 Ops/s 755.0722 Ops/s $\color{#35bf28}+3.53\%$
test_mod_wrap_and_backward[compile-overhead] 1.3723ms 0.9403ms 1.0635 KOps/s 1.0478 KOps/s $\color{#35bf28}+1.50\%$
test_seq_add[eager] 0.2097ms 0.1168ms 8.5602 KOps/s 8.6771 KOps/s $\color{#d91a1a}-1.35\%$
test_seq_add[compile] 0.1599ms 94.1626μs 10.6199 KOps/s 10.8749 KOps/s $\color{#d91a1a}-2.34\%$
test_seq_add[compile-overhead] 0.2343ms 0.1339ms 7.4689 KOps/s 7.5709 KOps/s $\color{#d91a1a}-1.35\%$
test_seq_wrap[eager] 0.5728ms 0.4281ms 2.3361 KOps/s 2.3429 KOps/s $\color{#d91a1a}-0.29\%$
test_seq_wrap[compile] 0.4283ms 0.3102ms 3.2236 KOps/s 3.2303 KOps/s $\color{#d91a1a}-0.21\%$
test_seq_wrap[compile-overhead] 0.3750ms 0.2308ms 4.3332 KOps/s 4.3001 KOps/s $\color{#35bf28}+0.77\%$
test_func_call_runtime[False-eager] 0.8162ms 0.7399ms 1.3515 KOps/s 1.2427 KOps/s $\textbf{\color{#35bf28}+8.76\%}$
test_func_call_runtime[False-compile] 0.8587ms 0.7555ms 1.3236 KOps/s 1.2489 KOps/s $\textbf{\color{#35bf28}+5.98\%}$
test_func_call_runtime[False-compile-overhead] 0.4657ms 0.3796ms 2.6346 KOps/s 2.6977 KOps/s $\color{#d91a1a}-2.34\%$
test_func_call_runtime[True-eager] 1.0743ms 0.9694ms 1.0316 KOps/s 1.0602 KOps/s $\color{#d91a1a}-2.70\%$
test_func_call_runtime[True-compile] 0.9143ms 0.7774ms 1.2863 KOps/s 1.2377 KOps/s $\color{#35bf28}+3.93\%$
test_func_call_runtime[True-compile-overhead] 0.4702ms 0.3930ms 2.5445 KOps/s 2.5619 KOps/s $\color{#d91a1a}-0.68\%$
test_func_call_cm_runtime[False-eager] 0.9544ms 0.7857ms 1.2728 KOps/s 1.2835 KOps/s $\color{#d91a1a}-0.83\%$
test_func_call_cm_runtime[False-compile] 1.3580ms 0.7659ms 1.3056 KOps/s 1.2940 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4501ms 0.3785ms 2.6423 KOps/s 2.6653 KOps/s $\color{#d91a1a}-0.86\%$
test_func_call_cm_runtime[True-eager] 1.2370ms 1.0735ms 931.5708 Ops/s 961.9666 Ops/s $\color{#d91a1a}-3.16\%$
test_func_call_cm_runtime[True-compile] 0.8721ms 0.8019ms 1.2470 KOps/s 1.2147 KOps/s $\color{#35bf28}+2.66\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4783ms 0.4156ms 2.4060 KOps/s 2.4021 KOps/s $\color{#35bf28}+0.17\%$
test_vmap_func_call_cm_runtime[eager] 2.6671ms 2.1381ms 467.6956 Ops/s 473.0842 Ops/s $\color{#d91a1a}-1.14\%$
test_vmap_func_call_cm_runtime[compile] 0.8936ms 0.8220ms 1.2165 KOps/s 1.1951 KOps/s $\color{#35bf28}+1.79\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5037ms 0.4203ms 2.3794 KOps/s 2.3939 KOps/s $\color{#d91a1a}-0.61\%$
test_distributed 3.0150ms 0.1826ms 5.4759 KOps/s 8.2448 KOps/s $\textbf{\color{#d91a1a}-33.58\%}$
test_tdmodule 41.7700μs 19.8335μs 50.4197 KOps/s 54.6574 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_tdmodule_dispatch 75.6210μs 35.8824μs 27.8688 KOps/s 30.3058 KOps/s $\textbf{\color{#d91a1a}-8.04\%}$
test_tdseq 33.6300μs 19.9755μs 50.0614 KOps/s 55.9266 KOps/s $\textbf{\color{#d91a1a}-10.49\%}$
test_tdseq_dispatch 60.7710μs 37.9953μs 26.3190 KOps/s 28.8713 KOps/s $\textbf{\color{#d91a1a}-8.84\%}$
test_instantiation_functorch 1.7152ms 1.5797ms 633.0336 Ops/s 629.4091 Ops/s $\color{#35bf28}+0.58\%$
test_exec_functorch 0.2538ms 0.1466ms 6.8190 KOps/s 6.5866 KOps/s $\color{#35bf28}+3.53\%$
test_exec_functional_call 0.1974ms 0.1386ms 7.2142 KOps/s 6.8697 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_exec_td_decorator 0.4017ms 0.1935ms 5.1682 KOps/s 5.2296 KOps/s $\color{#d91a1a}-1.17\%$
test_vmap_mlp_speed_decorator[True-True] 0.8176ms 0.7056ms 1.4173 KOps/s 1.4412 KOps/s $\color{#d91a1a}-1.65\%$
test_vmap_mlp_speed_decorator[True-False] 0.8472ms 0.7065ms 1.4155 KOps/s 1.4341 KOps/s $\color{#d91a1a}-1.30\%$
test_vmap_mlp_speed_decorator[False-True] 0.7380ms 0.6082ms 1.6442 KOps/s 1.6476 KOps/s $\color{#d91a1a}-0.21\%$
test_vmap_mlp_speed_decorator[False-False] 0.7366ms 0.6126ms 1.6324 KOps/s 1.6535 KOps/s $\color{#d91a1a}-1.28\%$
test_vmap_transformer_speed_decorator[True-True] 20.9042ms 19.7724ms 50.5755 Ops/s 51.1565 Ops/s $\color{#d91a1a}-1.14\%$
test_vmap_transformer_speed_decorator[True-False] 20.2292ms 19.7298ms 50.6848 Ops/s 51.1834 Ops/s $\color{#d91a1a}-0.97\%$
test_vmap_transformer_speed_decorator[False-True] 19.8349ms 19.4554ms 51.3996 Ops/s 51.3035 Ops/s $\color{#35bf28}+0.19\%$
test_vmap_transformer_speed_decorator[False-False] 19.8995ms 19.6240ms 50.9580 Ops/s 51.1823 Ops/s $\color{#d91a1a}-0.44\%$
test_to_module_speed[True] 1.1094ms 0.9885ms 1.0117 KOps/s 1.0553 KOps/s $\color{#d91a1a}-4.13\%$
test_to_module_speed[False] 1.0819ms 0.9642ms 1.0371 KOps/s 1.0864 KOps/s $\color{#d91a1a}-4.53\%$
test_tc_init 76.5610μs 39.1614μs 25.5353 KOps/s 28.0883 KOps/s $\textbf{\color{#d91a1a}-9.09\%}$
test_tc_init_nested 0.1203ms 79.5853μs 12.5651 KOps/s 14.1039 KOps/s $\textbf{\color{#d91a1a}-10.91\%}$
test_tc_first_layer_tensor 6.2844μs 0.7018μs 1.4249 MOps/s 1.4162 MOps/s $\color{#35bf28}+0.62\%$
test_tc_first_layer_nontensor 32.2210μs 2.3223μs 430.6046 KOps/s 434.8643 KOps/s $\color{#d91a1a}-0.98\%$
test_tc_second_layer_tensor 21.6110μs 1.4952μs 668.7876 KOps/s 709.7880 KOps/s $\textbf{\color{#d91a1a}-5.78\%}$
test_tc_second_layer_nontensor 51.7510μs 2.9882μs 334.6454 KOps/s 333.1105 KOps/s $\color{#35bf28}+0.46\%$
test_unbind 0.2223s 12.4299ms 80.4510 Ops/s 151.3654 Ops/s $\textbf{\color{#d91a1a}-46.85\%}$
test_full_like 9.3327ms 9.1299ms 109.5307 Ops/s 108.7212 Ops/s $\color{#35bf28}+0.74\%$
test_zeros_like 4.9736ms 4.3323ms 230.8232 Ops/s 235.8628 Ops/s $\color{#d91a1a}-2.14\%$
test_ones_like 4.4819ms 4.3293ms 230.9830 Ops/s 231.1349 Ops/s $\color{#d91a1a}-0.07\%$
test_clone 6.5080ms 6.3876ms 156.5533 Ops/s 109.4972 Ops/s $\textbf{\color{#35bf28}+42.97\%}$
test_squeeze 61.4010μs 10.8420μs 92.2341 KOps/s 103.7391 KOps/s $\textbf{\color{#d91a1a}-11.09\%}$
test_unsqueeze 0.1531ms 72.9781μs 13.7027 KOps/s 12.9282 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_split 0.3676ms 0.1602ms 6.2435 KOps/s 5.8298 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_permute 0.2435ms 0.1833ms 5.4563 KOps/s 5.4875 KOps/s $\color{#d91a1a}-0.57\%$
test_stack 51.5269ms 50.9897ms 19.6118 Ops/s 19.7146 Ops/s $\color{#d91a1a}-0.52\%$
test_cat 51.2405ms 50.6564ms 19.7408 Ops/s 19.8026 Ops/s $\color{#d91a1a}-0.31\%$
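
For readers checking the table, the Change column appears to be the relative difference between the Ops column and Ops on Repo HEAD. The sketch below is an assumption inferred from the rows above, not the benchmark bot's actual code; `relative_change` is a hypothetical helper name. Bold entries look like they mark changes of roughly 5% or more, which is also only inferred from the rows.

```python
# Minimal sketch (assumed formula): percent change of this PR's throughput
# relative to the throughput measured on repo HEAD.
def relative_change(ops: float, ops_head: float) -> float:
    """Return the throughput change in percent, relative to repo HEAD."""
    return (ops / ops_head - 1.0) * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from rows of the table above.
# Both values in a pair share the same unit, so the ratio is unit-free.
rows = {
    "test_unbind": (80.4510, 151.3654),
    "test_clone": (156.5533, 109.4972),
    "test_distributed": (5.4759, 8.2448),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

Under this assumption the three rows come out at the values shown in the table (-46.85%, +42.97%, -33.58%).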

@vmoens vmoens merged commit 90010ab into gh/vmoens/35/base Dec 16, 2024
34 of 55 checks passed
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: c6a8d4587df45e374f0d6cb59fe1c982c7818276
Pull Request resolved: #1139
@vmoens vmoens deleted the gh/vmoens/35/head branch December 16, 2024 04:28