[Refactor] Better compile checks #1139
Merged
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: c6a8d4587df45e374f0d6cb59fe1c982c7818276
Pull Request resolved: #1139
facebook-github-bot added the CLA Signed label Dec 15, 2024

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1601ms | 21.2751μs | 47.0032 KOps/s | 60.4768 KOps/s | |
test_plain_set_stack_nested | 61.6640μs | 21.4202μs | 46.6849 KOps/s | 59.2392 KOps/s | |
test_plain_set_nested_inplace | 62.8570μs | 23.1178μs | 43.2567 KOps/s | 53.3940 KOps/s | |
test_plain_set_stack_nested_inplace | 67.1550μs | 23.0113μs | 43.4569 KOps/s | 53.4481 KOps/s | |
test_items | 45.3040μs | 4.1854μs | 238.9266 KOps/s | 241.6223 KOps/s | |
test_items_nested | 0.6948ms | 0.4389ms | 2.2786 KOps/s | 2.5095 KOps/s | |
test_items_nested_locked | 0.9044ms | 0.4378ms | 2.2842 KOps/s | 2.5117 KOps/s | |
test_items_nested_leaf | 0.1503ms | 77.2468μs | 12.9455 KOps/s | 13.8968 KOps/s | |
test_items_stack_nested | 0.9527ms | 0.4455ms | 2.2447 KOps/s | 2.4887 KOps/s | |
test_items_stack_nested_leaf | 0.1709ms | 81.0893μs | 12.3321 KOps/s | 13.4219 KOps/s | |
test_items_stack_nested_locked | 0.6031ms | 0.4411ms | 2.2670 KOps/s | 2.5049 KOps/s | |
test_keys | 21.7400μs | 3.5013μs | 285.6063 KOps/s | 266.7866 KOps/s | |
test_keys_nested | 0.3059ms | 0.1682ms | 5.9451 KOps/s | 7.0685 KOps/s | |
test_keys_nested_locked | 0.5574ms | 0.1745ms | 5.7322 KOps/s | 7.0806 KOps/s | |
test_keys_nested_leaf | 1.5829ms | 0.1475ms | 6.7798 KOps/s | 8.6060 KOps/s | |
test_keys_stack_nested | 0.2888ms | 0.1656ms | 6.0403 KOps/s | 7.3496 KOps/s | |
test_keys_stack_nested_leaf | 0.2687ms | 0.1447ms | 6.9091 KOps/s | 8.6128 KOps/s | |
test_keys_stack_nested_locked | 0.2845ms | 0.1710ms | 5.8496 KOps/s | 7.1155 KOps/s | |
test_values | 6.9468μs | 1.0378μs | 963.5858 KOps/s | 953.4912 KOps/s | |
test_values_nested | 0.1033ms | 63.0724μs | 15.8548 KOps/s | 18.0447 KOps/s | |
test_values_nested_locked | 0.1035ms | 62.6917μs | 15.9511 KOps/s | 18.1903 KOps/s | |
test_values_nested_leaf | 0.1285ms | 72.8137μs | 13.7337 KOps/s | 16.4306 KOps/s | |
test_values_stack_nested | 0.1122ms | 62.8224μs | 15.9179 KOps/s | 17.5159 KOps/s | |
test_values_stack_nested_leaf | 0.1244ms | 72.0903μs | 13.8715 KOps/s | 16.3166 KOps/s | |
test_values_stack_nested_locked | 0.1453ms | 63.4492μs | 15.7606 KOps/s | 17.5097 KOps/s | |
test_membership | 20.7190μs | 0.8835μs | 1.1319 MOps/s | 1.0505 MOps/s | |
test_membership_nested | 38.5820μs | 3.0259μs | 330.4787 KOps/s | 335.6853 KOps/s | |
test_membership_nested_leaf | 27.5110μs | 3.0531μs | 327.5333 KOps/s | 332.3051 KOps/s | |
test_membership_stacked_nested | 33.6930μs | 2.9964μs | 333.7381 KOps/s | 334.2799 KOps/s | |
test_membership_stacked_nested_leaf | 38.6120μs | 2.9870μs | 334.7847 KOps/s | 332.9141 KOps/s | |
test_membership_nested_last | 30.3870μs | 4.5473μs | 219.9127 KOps/s | 221.9799 KOps/s | |
test_membership_nested_leaf_last | 44.7940μs | 4.5053μs | 221.9622 KOps/s | 215.9217 KOps/s | |
test_membership_stacked_nested_last | 65.6000μs | 7.7436μs | 129.1396 KOps/s | 149.0092 KOps/s | |
test_membership_stacked_nested_leaf_last | 45.7750μs | 7.7249μs | 129.4509 KOps/s | 150.9018 KOps/s | |
test_nested_getleaf | 32.7810μs | 10.9051μs | 91.6998 KOps/s | 91.3735 KOps/s | |
test_nested_get | 47.8890μs | 10.4375μs | 95.8087 KOps/s | 95.2783 KOps/s | |
test_stacked_getleaf | 52.4280μs | 10.8358μs | 92.2863 KOps/s | 91.1768 KOps/s | |
test_stacked_get | 30.0460μs | 10.4899μs | 95.3298 KOps/s | 95.8264 KOps/s | |
test_nested_getitemleaf | 53.7600μs | 11.4565μs | 87.2867 KOps/s | 87.0140 KOps/s | |
test_nested_getitem | 53.3390μs | 10.5825μs | 94.4955 KOps/s | 92.2768 KOps/s | |
test_stacked_getitemleaf | 29.8450μs | 11.2160μs | 89.1583 KOps/s | 87.1392 KOps/s | |
test_stacked_getitem | 50.7040μs | 10.4045μs | 96.1124 KOps/s | 93.5930 KOps/s | |
test_lock_nested | 1.9882ms | 0.4688ms | 2.1330 KOps/s | 2.2343 KOps/s | |
test_lock_stack_nested | 0.6549ms | 0.4296ms | 2.3277 KOps/s | 2.4022 KOps/s | |
test_unlock_nested | 0.7754ms | 0.3824ms | 2.6148 KOps/s | 2.7116 KOps/s | |
test_unlock_stack_nested | 0.6658ms | 0.3507ms | 2.8511 KOps/s | 2.9722 KOps/s | |
test_flatten_speed | 0.1881ms | 0.1012ms | 9.8859 KOps/s | 10.5085 KOps/s | |
test_unflatten_speed | 0.7024ms | 0.5371ms | 1.8618 KOps/s | 2.0306 KOps/s | |
test_common_ops | 5.1693ms | 0.8583ms | 1.1651 KOps/s | 1.3433 KOps/s | |
test_creation | 17.4330μs | 2.5474μs | 392.5602 KOps/s | 475.4615 KOps/s | |
test_creation_empty | 40.8360μs | 12.6333μs | 79.1556 KOps/s | 112.7130 KOps/s | |
test_creation_nested_1 | 1.3860ms | 15.6810μs | 63.7716 KOps/s | 85.8118 KOps/s | |
test_creation_nested_2 | 44.0220μs | 20.5870μs | 48.5743 KOps/s | 62.1526 KOps/s | |
test_clone | 0.2292ms | 13.4537μs | 74.3291 KOps/s | 73.6873 KOps/s | |
test_getitem[int] | 0.9355ms | 12.9994μs | 76.9264 KOps/s | 78.1467 KOps/s | |
test_getitem[slice_int] | 0.1404ms | 25.7628μs | 38.8157 KOps/s | 39.8992 KOps/s | |
test_getitem[range] | 0.1827ms | 49.7482μs | 20.1012 KOps/s | 19.8761 KOps/s | |
test_getitem[tuple] | 0.1636ms | 20.9669μs | 47.6942 KOps/s | 47.6216 KOps/s | |
test_getitem[list] | 0.1719ms | 44.6124μs | 22.4153 KOps/s | 21.9638 KOps/s | |
test_setitem_dim[int] | 48.6210μs | 25.6067μs | 39.0523 KOps/s | 38.5655 KOps/s | |
test_setitem_dim[slice_int] | 91.1000μs | 52.8905μs | 18.9070 KOps/s | 18.9716 KOps/s | |
test_setitem_dim[range] | 0.1321ms | 73.2438μs | 13.6530 KOps/s | 13.1858 KOps/s | |
test_setitem_dim[tuple] | 77.9750μs | 40.8121μs | 24.5025 KOps/s | 23.7973 KOps/s | |
test_setitem | 62.8770μs | 20.7951μs | 48.0883 KOps/s | 52.1707 KOps/s | |
test_set | 0.2402ms | 20.3257μs | 49.1988 KOps/s | 53.3529 KOps/s | |
test_set_shared | 3.5833ms | 0.1738ms | 5.7536 KOps/s | 5.7818 KOps/s | |
test_update | 0.4297ms | 23.9401μs | 41.7709 KOps/s | 50.0148 KOps/s | |
test_update_nested | 0.3857ms | 34.5011μs | 28.9846 KOps/s | 32.0803 KOps/s | |
test_update__nested | 0.5373ms | 34.5379μs | 28.9537 KOps/s | 29.8909 KOps/s | |
test_set_nested | 0.3418ms | 22.5232μs | 44.3987 KOps/s | 48.1743 KOps/s | |
test_set_nested_new | 0.1208ms | 27.5077μs | 36.3535 KOps/s | 39.9675 KOps/s | |
test_select | 0.4270ms | 45.8990μs | 21.7870 KOps/s | 23.0409 KOps/s | |
test_select_nested | 0.1244ms | 64.7644μs | 15.4406 KOps/s | 15.8805 KOps/s | |
test_exclude_nested | 0.1744ms | 83.0230μs | 12.0449 KOps/s | 12.2873 KOps/s | |
test_empty[True] | 0.5302ms | 0.4334ms | 2.3072 KOps/s | 2.5984 KOps/s | |
test_empty[False] | 19.1605μs | 1.4391μs | 694.8950 KOps/s | 765.2208 KOps/s | |
test_unbind_speed | 0.4775ms | 0.2782ms | 3.5940 KOps/s | 3.7772 KOps/s | |
test_unbind_speed_stack0 | 0.4453ms | 0.2744ms | 3.6447 KOps/s | 3.8419 KOps/s | |
test_unbind_speed_stack1 | 0.1128s | 0.8223ms | 1.2161 KOps/s | 1.4123 KOps/s | |
test_split | 1.8109ms | 1.6003ms | 624.8825 Ops/s | 551.0864 Ops/s | |
test_chunk | 0.1087s | 1.7743ms | 563.6167 Ops/s | 552.1593 Ops/s | |
test_consolidate_njt[False-None] | 0.1178s | 9.1142ms | 109.7185 Ops/s | 121.4908 Ops/s | |
test_creation[device0] | 0.2371ms | 90.7709μs | 11.0168 KOps/s | 10.8480 KOps/s | |
test_creation_from_tensor | 0.2550ms | 93.2986μs | 10.7183 KOps/s | 9.5541 KOps/s | |
test_add_one[memmap_tensor0] | 0.7275ms | 4.8110μs | 207.8577 KOps/s | 207.3946 KOps/s | |
test_contiguous[memmap_tensor0] | 11.7320μs | 0.5123μs | 1.9519 MOps/s | 1.9247 MOps/s | |
test_stack[memmap_tensor0] | 62.7460μs | 3.4663μs | 288.4929 KOps/s | 302.4127 KOps/s | |
test_memmaptd_index | 1.0133ms | 0.2383ms | 4.1964 KOps/s | 4.1565 KOps/s | |
test_memmaptd_index_astensor | 0.6021ms | 0.3258ms | 3.0692 KOps/s | 3.1085 KOps/s | |
test_memmaptd_index_op | 1.0777ms | 0.6174ms | 1.6196 KOps/s | 1.7681 KOps/s | |
test_serialize_model | 0.1253s | 0.1186s | 8.4343 Ops/s | 8.6634 Ops/s | |
test_serialize_model_pickle | 0.4585s | 0.3896s | 2.5665 Ops/s | 2.4748 Ops/s | |
test_serialize_weights | 0.1256s | 0.1136s | 8.8033 Ops/s | 7.3756 Ops/s | |
test_serialize_weights_returnearly | 0.1719s | 0.1606s | 6.2271 Ops/s | 6.2076 Ops/s | |
test_serialize_weights_pickle | 0.5524s | 0.4468s | 2.2381 Ops/s | 2.3731 Ops/s | |
test_serialize_weights_filesystem | 0.2641s | 0.1579s | 6.3325 Ops/s | 6.8523 Ops/s | |
test_serialize_model_filesystem | 0.1612s | 0.1464s | 6.8298 Ops/s | 6.7044 Ops/s | |
test_reshape_pytree | 62.1660μs | 27.4916μs | 36.3747 KOps/s | 37.3885 KOps/s | |
test_reshape_td | 85.4690μs | 33.9414μs | 29.4625 KOps/s | 29.7091 KOps/s | |
test_view_pytree | 0.1056ms | 27.3163μs | 36.6082 KOps/s | 37.3977 KOps/s | |
test_view_td | 80.1290μs | 39.4284μs | 25.3625 KOps/s | 25.4465 KOps/s | |
test_unbind_pytree | 67.6860μs | 29.7352μs | 33.6301 KOps/s | 33.3355 KOps/s | |
test_unbind_td | 0.3612ms | 40.5888μs | 24.6373 KOps/s | 25.6660 KOps/s | |
test_split_pytree | 70.2210μs | 29.5767μs | 33.8103 KOps/s | 34.0714 KOps/s | |
test_split_td | 0.2807ms | 47.1211μs | 21.2219 KOps/s | 17.4594 KOps/s | |
test_add_pytree | 80.7100μs | 36.3403μs | 27.5177 KOps/s | 28.0306 KOps/s | |
test_add_td | 0.1281ms | 59.8981μs | 16.6950 KOps/s | 19.0266 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1279ms | 62.0627μs | 16.1127 KOps/s | 15.8706 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.6648ms | 0.1722ms | 5.8066 KOps/s | 6.2047 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1432ms | 46.1449μs | 21.6709 KOps/s | 21.8041 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2789ms | 0.1204ms | 8.3071 KOps/s | 8.4219 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1187ms | 26.3840μs | 37.9018 KOps/s | 37.9086 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1260ms | 58.8310μs | 16.9978 KOps/s | 18.3582 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1609ms | 79.9105μs | 12.5140 KOps/s | 12.6980 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1400ms | 68.3961μs | 14.6207 KOps/s | 14.7726 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2381ms | 0.1039ms | 9.6218 KOps/s | 9.4112 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4321ms | 0.2187ms | 4.5715 KOps/s | 4.9844 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1409ms | 44.7953μs | 22.3237 KOps/s | 21.6021 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4611ms | 66.5312μs | 15.0305 KOps/s | 16.0107 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2355ms | 0.1032ms | 9.6920 KOps/s | 9.6133 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3936ms | 0.1990ms | 5.0263 KOps/s | 4.9665 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3456ms | 0.2343ms | 4.2689 KOps/s | 4.7001 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1892ms | 0.1049ms | 9.5371 KOps/s | 9.4686 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2870ms | 60.1341μs | 16.6295 KOps/s | 17.9689 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1902ms | 47.2474μs | 21.1652 KOps/s | 21.9768 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6279ms | 0.1576ms | 6.3467 KOps/s | 6.2774 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1761ms | 0.1034ms | 9.6728 KOps/s | 9.7048 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 61.5840μs | 21.2245μs | 47.1154 KOps/s | 48.2441 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1713ms | 65.5768μs | 15.2493 KOps/s | 17.1214 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1953ms | 79.7425μs | 12.5404 KOps/s | 12.1120 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1413ms | 67.9685μs | 14.7127 KOps/s | 14.2827 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3101ms | 0.2083ms | 4.8018 KOps/s | 4.8843 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5105ms | 1.3369ms | 748.0259 Ops/s | 757.1184 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4140ms | 0.2045ms | 4.8912 KOps/s | 4.8911 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3201ms | 0.7694ms | 1.2997 KOps/s | 1.2944 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5626ms | 0.4603ms | 2.1725 KOps/s | 2.1969 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.6665ms | 2.6798ms | 373.1579 Ops/s | 391.1088 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 90.0670μs | 36.1716μs | 27.6460 KOps/s | 27.1683 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6204ms | 34.1692μs | 29.2662 KOps/s | 29.6785 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 82.5040μs | 29.7559μs | 33.6068 KOps/s | 32.7780 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 75.3100μs | 23.5739μs | 42.4198 KOps/s | 44.0759 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1049ms | 30.0236μs | 33.3072 KOps/s | 31.9779 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 79.6480μs | 23.3686μs | 42.7925 KOps/s | 43.9666 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1116ms | 51.5383μs | 19.4031 KOps/s | 19.3779 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3897ms | 20.5688μs | 48.6174 KOps/s | 48.6333 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 92.9630μs | 43.8041μs | 22.8289 KOps/s | 22.1708 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 64.5200μs | 19.2995μs | 51.8148 KOps/s | 52.1782 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1047ms | 44.9943μs | 22.2250 KOps/s | 21.9326 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 65.8320μs | 18.9428μs | 52.7904 KOps/s | 52.1073 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1031ms | 52.8170μs | 18.9333 KOps/s | 19.0949 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9448ms | 20.6968μs | 48.3167 KOps/s | 49.5505 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1043ms | 45.3989μs | 22.0270 KOps/s | 21.8768 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 73.3360μs | 19.2296μs | 52.0033 KOps/s | 52.9081 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1071ms | 45.1918μs | 22.1279 KOps/s | 21.9717 KOps/s | |
test_compile_indexing[int-pytree-eager] | 66.3130μs | 19.0273μs | 52.5562 KOps/s | 52.4245 KOps/s | |
test_mod_add[eager] | 99.0540μs | 36.6647μs | 27.2742 KOps/s | 30.2976 KOps/s | |
test_mod_add[compile] | 0.1326ms | 48.1001μs | 20.7900 KOps/s | 20.6964 KOps/s | |
test_mod_add[compile-overhead] | 0.1523ms | 46.9181μs | 21.3137 KOps/s | 20.5383 KOps/s | |
test_mod_wrap[eager] | 0.3454ms | 0.2285ms | 4.3768 KOps/s | 4.4511 KOps/s | |
test_mod_wrap[compile] | 0.4262ms | 0.2094ms | 4.7757 KOps/s | 4.7861 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4145ms | 0.2072ms | 4.8267 KOps/s | 4.7462 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.7722ms | 12.7920ms | 78.1738 Ops/s | 82.6006 Ops/s | |
test_mod_wrap_and_backward[compile] | 15.5285ms | 12.7445ms | 78.4652 Ops/s | 77.6706 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 19.6406ms | 13.4669ms | 74.2560 Ops/s | 79.6049 Ops/s | |
test_seq_add[eager] | 0.2116ms | 0.1172ms | 8.5333 KOps/s | 9.0615 KOps/s | |
test_seq_add[compile] | 0.1166ms | 63.2254μs | 15.8164 KOps/s | 15.8974 KOps/s | |
test_seq_add[compile-overhead] | 0.1389ms | 59.7839μs | 16.7269 KOps/s | 16.4546 KOps/s | |
test_seq_wrap[eager] | 0.6014ms | 0.4541ms | 2.2020 KOps/s | 2.2488 KOps/s | |
test_seq_wrap[compile] | 0.4058ms | 0.2292ms | 4.3634 KOps/s | 4.3078 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3720ms | 0.2304ms | 4.3412 KOps/s | 4.3516 KOps/s | |
test_func_call_runtime[False-eager] | 1.0315ms | 0.5666ms | 1.7649 KOps/s | 1.7908 KOps/s | |
test_func_call_runtime[False-compile] | 0.7759ms | 0.4292ms | 2.3301 KOps/s | 2.3789 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5465ms | 0.4282ms | 2.3354 KOps/s | 2.3525 KOps/s | |
test_func_call_runtime[True-eager] | 1.0524ms | 0.7809ms | 1.2806 KOps/s | 1.2912 KOps/s | |
test_func_call_runtime[True-compile] | 0.6242ms | 0.4685ms | 2.1346 KOps/s | 2.1515 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6173ms | 0.4649ms | 2.1509 KOps/s | 2.1603 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9908ms | 0.5603ms | 1.7849 KOps/s | 1.7971 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5704ms | 0.4247ms | 2.3546 KOps/s | 2.3590 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5372ms | 0.4244ms | 2.3561 KOps/s | 2.3849 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3047ms | 0.9283ms | 1.0773 KOps/s | 1.1073 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6065ms | 0.4916ms | 2.0343 KOps/s | 2.0617 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5947ms | 0.4896ms | 2.0423 KOps/s | 2.0498 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6581ms | 1.9147ms | 522.2850 Ops/s | 528.9813 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7006ms | 0.5097ms | 1.9621 KOps/s | 1.9132 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7902ms | 0.5142ms | 1.9447 KOps/s | 1.9288 KOps/s | |
test_distributed | 0.6070ms | 0.1258ms | 7.9495 KOps/s | 7.7761 KOps/s | |
test_tdmodule | 49.6730μs | 27.2910μs | 36.6421 KOps/s | 39.2321 KOps/s | |
test_tdmodule_dispatch | 88.1940μs | 50.3850μs | 19.8472 KOps/s | 21.8207 KOps/s | |
test_tdseq | 48.1200μs | 27.4967μs | 36.3680 KOps/s | 39.4469 KOps/s | |
test_tdseq_dispatch | 85.3890μs | 53.5413μs | 18.6772 KOps/s | 20.9034 KOps/s | |
test_instantiation_functorch | 2.3046ms | 1.5514ms | 644.5874 Ops/s | 635.7030 Ops/s | |
test_exec_functorch | 0.3331ms | 0.1942ms | 5.1495 KOps/s | 5.5337 KOps/s | |
test_exec_functional_call | 0.3324ms | 0.1787ms | 5.5945 KOps/s | 5.8460 KOps/s | |
test_exec_td_decorator | 0.5659ms | 0.2385ms | 4.1932 KOps/s | 4.3835 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9844ms | 0.6677ms | 1.4977 KOps/s | 1.5405 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1899ms | 0.6748ms | 1.4818 KOps/s | 1.5169 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.4612ms | 0.5430ms | 1.8417 KOps/s | 1.8790 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7794ms | 0.5416ms | 1.8464 KOps/s | 1.8885 KOps/s | |
test_to_module_speed[True] | 1.5806ms | 1.3624ms | 734.0031 Ops/s | 774.3877 Ops/s | |
test_to_module_speed[False] | 1.5512ms | 1.3159ms | 759.9560 Ops/s | 792.0041 Ops/s | |
test_tc_init | 89.2860μs | 50.2034μs | 19.9190 KOps/s | 22.5920 KOps/s | |
test_tc_init_nested | 0.1734ms | 99.9034μs | 10.0097 KOps/s | 11.2331 KOps/s | |
test_tc_first_layer_tensor | 28.2030μs | 1.5523μs | 644.2165 KOps/s | 656.6603 KOps/s | |
test_tc_first_layer_nontensor | 36.3370μs | 4.6991μs | 212.8084 KOps/s | 201.7902 KOps/s | |
test_tc_second_layer_tensor | 42.5390μs | 2.8942μs | 345.5236 KOps/s | 353.3794 KOps/s | |
test_tc_second_layer_nontensor | 26.8790μs | 6.1357μs | 162.9806 KOps/s | 158.1763 KOps/s | |
test_unbind | 0.2498s | 16.1309ms | 61.9927 Ops/s | 76.7332 Ops/s | |
test_full_like | 8.7453ms | 7.5734ms | 132.0408 Ops/s | 77.9756 Ops/s | |
test_zeros_like | 4.1722ms | 3.0279ms | 330.2652 Ops/s | 133.0278 Ops/s | |
test_ones_like | 4.5101ms | 3.4497ms | 289.8807 Ops/s | 129.4414 Ops/s | |
test_clone | 16.2287ms | 8.1020ms | 123.4269 Ops/s | 103.2108 Ops/s | |
test_squeeze | 87.4430μs | 12.4358μs | 80.4133 KOps/s | 85.8042 KOps/s | |
test_unsqueeze | 0.1670ms | 92.1966μs | 10.8464 KOps/s | 11.1848 KOps/s | |
test_split | 0.3678ms | 0.1991ms | 5.0224 KOps/s | 5.1092 KOps/s | |
test_permute | 0.3678ms | 0.2151ms | 4.6488 KOps/s | 4.9864 KOps/s | |
test_stack | 32.1051ms | 25.8601ms | 38.6696 Ops/s | 37.0406 Ops/s | |
test_cat | 31.6346ms | 25.7211ms | 38.8786 Ops/s | 36.8797 Ops/s |
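
The Change column is empty in this capture, so the regression or speed-up for each benchmark has to be read off the two throughput columns by hand. The following is a minimal sketch, assuming "Change" is meant as the relative difference between the PR's Ops and the Ops on repo HEAD; that formula is an assumption, since the bot's own calculation is not shown here. The helper names (`parse_time`, `parse_ops`, `parse_row`, `relative_change`) are hypothetical and only illustrate how a row of the table could be parsed.

```python
import re

# Unit multipliers for the strings used in the table.
# Both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted.
_TIME_UNITS = {"s": 1.0, "ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6}
_OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_time(text):
    """Convert a cell such as '0.1601ms' into seconds."""
    match = re.fullmatch(r"([0-9.]+)\s*(\S+)", text.strip())
    value, unit = match.groups()
    return float(value) * _TIME_UNITS[unit]


def parse_ops(text):
    """Convert a cell such as '47.0032 KOps/s' into operations per second."""
    value, unit = text.split()
    return float(value) * _OPS_UNITS[unit]


def parse_row(line):
    """Split one pipe-separated table row into typed fields."""
    cells = [cell.strip() for cell in line.split("|")]
    name, max_t, mean_t, ops, ops_head = cells[:5]
    return {
        "name": name,
        "max": parse_time(max_t),
        "mean": parse_time(mean_t),
        "ops": parse_ops(ops),
        "ops_head": parse_ops(ops_head),
    }


def relative_change(ops_pr, ops_head):
    """Assumed definition of 'Change': fractional difference versus HEAD."""
    return (ops_pr - ops_head) / ops_head


# First row of the table above, copied verbatim.
row = parse_row(
    "test_plain_set_nested | 0.1601ms | 21.2751μs | 47.0032 KOps/s | 60.4768 KOps/s | |"
)
change = relative_change(row["ops"], row["ops_head"])
print(f"{row['name']}: {change:+.2%}")
```

Under this assumed definition a negative value means the PR run was slower than HEAD for that benchmark. The Ops column is also consistent with the reciprocal of Mean (1 / 21.2751 μs is roughly 47.0 KOps/s), which gives a quick sanity check on a parsed row.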

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 33.4100μs | 12.7702μs | 78.3074 KOps/s | 94.9802 KOps/s | |
test_plain_set_stack_nested | 42.2100μs | 12.7880μs | 78.1981 KOps/s | 97.2236 KOps/s | |
test_plain_set_nested_inplace | 40.1110μs | 13.8443μs | 72.2321 KOps/s | 88.1753 KOps/s | |
test_plain_set_stack_nested_inplace | 36.8810μs | 13.7468μs | 72.7445 KOps/s | 89.2497 KOps/s | |
test_items | 32.2200μs | 2.9351μs | 340.7058 KOps/s | 342.3528 KOps/s | |
test_items_nested | 0.4220ms | 0.3647ms | 2.7418 KOps/s | 2.8282 KOps/s | |
test_items_nested_locked | 0.5129ms | 0.3610ms | 2.7699 KOps/s | 2.8101 KOps/s | |
test_items_nested_leaf | 90.5420μs | 62.0766μs | 16.1091 KOps/s | 16.7041 KOps/s | |
test_items_stack_nested | 0.4212ms | 0.3656ms | 2.7351 KOps/s | 2.7627 KOps/s | |
test_items_stack_nested_leaf | 96.7910μs | 62.8445μs | 15.9123 KOps/s | 16.5794 KOps/s | |
test_items_stack_nested_locked | 0.4259ms | 0.3675ms | 2.7212 KOps/s | 2.8112 KOps/s | |
test_keys | 25.3700μs | 3.4753μs | 287.7452 KOps/s | 287.9181 KOps/s | |
test_keys_nested | 0.1222ms | 82.9988μs | 12.0484 KOps/s | 13.8706 KOps/s | |
test_keys_nested_locked | 0.7744ms | 88.8801μs | 11.2511 KOps/s | 12.6655 KOps/s | |
test_keys_nested_leaf | 0.1125ms | 73.4958μs | 13.6062 KOps/s | 15.6792 KOps/s | |
test_keys_stack_nested | 0.1165ms | 84.2001μs | 11.8765 KOps/s | 13.9999 KOps/s | |
test_keys_stack_nested_leaf | 0.1770ms | 74.7232μs | 13.3827 KOps/s | 15.5928 KOps/s | |
test_keys_stack_nested_locked | 0.1253ms | 89.1143μs | 11.2215 KOps/s | 12.6503 KOps/s | |
test_values | 9.6418μs | 0.8676μs | 1.1526 MOps/s | 1.1585 MOps/s | |
test_values_nested | 64.3410μs | 34.7031μs | 28.8159 KOps/s | 32.1852 KOps/s | |
test_values_nested_locked | 85.3810μs | 35.7563μs | 27.9671 KOps/s | 30.4057 KOps/s | |
test_values_nested_leaf | 69.9410μs | 39.6482μs | 25.2218 KOps/s | 29.2303 KOps/s | |
test_values_stack_nested | 95.1320μs | 35.3759μs | 28.2678 KOps/s | 30.8504 KOps/s | |
test_values_stack_nested_leaf | 66.5010μs | 40.0030μs | 24.9982 KOps/s | 28.9424 KOps/s | |
test_values_stack_nested_locked | 67.0110μs | 36.6113μs | 27.3140 KOps/s | 29.5472 KOps/s | |
test_membership | 2.7740μs | 0.5121μs | 1.9527 MOps/s | 1.8616 MOps/s | |
test_membership_nested | 52.3210μs | 2.0670μs | 483.7943 KOps/s | 474.2127 KOps/s | |
test_membership_nested_leaf | 14.7905μs | 2.0208μs | 494.8570 KOps/s | 484.5995 KOps/s | |
test_membership_stacked_nested | 20.9600μs | 2.1093μs | 474.0985 KOps/s | 461.1095 KOps/s | |
test_membership_stacked_nested_leaf | 33.8600μs | 2.0906μs | 478.3253 KOps/s | 465.5680 KOps/s | |
test_membership_nested_last | 45.6010μs | 3.0739μs | 325.3243 KOps/s | 328.6244 KOps/s | |
test_membership_nested_leaf_last | 28.3410μs | 3.1278μs | 319.7113 KOps/s | 328.7420 KOps/s | |
test_membership_stacked_nested_last | 39.1710μs | 3.0644μs | 326.3239 KOps/s | 330.0151 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.8200μs | 3.0655μs | 326.2060 KOps/s | 331.5572 KOps/s | |
test_nested_getleaf | 47.4610μs | 6.1438μs | 162.7652 KOps/s | 157.2476 KOps/s | |
test_nested_get | 38.4010μs | 5.8355μs | 171.3660 KOps/s | 165.1974 KOps/s | |
test_stacked_getleaf | 29.5000μs | 6.1367μs | 162.9530 KOps/s | 160.3164 KOps/s | |
test_stacked_get | 28.5510μs | 5.8423μs | 171.1652 KOps/s | 166.5217 KOps/s | |
test_nested_getitemleaf | 31.4410μs | 6.2839μs | 159.1361 KOps/s | 154.7002 KOps/s | |
test_nested_getitem | 40.2400μs | 5.9254μs | 168.7640 KOps/s | 166.7390 KOps/s | |
test_stacked_getitemleaf | 40.4310μs | 6.2714μs | 159.4532 KOps/s | 157.1397 KOps/s | |
test_stacked_getitem | 31.3600μs | 5.9309μs | 168.6088 KOps/s | 163.7026 KOps/s | |
test_lock_nested | 1.1208ms | 0.3741ms | 2.6728 KOps/s | 2.5712 KOps/s | |
test_lock_stack_nested | 0.4240ms | 0.3490ms | 2.8655 KOps/s | 2.8474 KOps/s | |
test_unlock_nested | 0.6795ms | 0.3168ms | 3.1562 KOps/s | 3.0682 KOps/s | |
test_unlock_stack_nested | 0.3636ms | 0.2875ms | 3.4781 KOps/s | 3.4278 KOps/s | |
test_flatten_speed | 0.1298ms | 78.1367μs | 12.7981 KOps/s | 13.3723 KOps/s | |
test_unflatten_speed | 0.4002ms | 0.3301ms | 3.0293 KOps/s | 3.2554 KOps/s | |
test_common_ops | 1.5884ms | 0.6294ms | 1.5888 KOps/s | 1.6487 KOps/s | |
test_creation | 0.1053ms | 1.7682μs | 565.5322 KOps/s | 661.3696 KOps/s | |
test_creation_empty | 28.8810μs | 8.9199μs | 112.1086 KOps/s | 146.7376 KOps/s | |
test_creation_nested_1 | 37.8910μs | 10.7207μs | 93.2774 KOps/s | 120.1179 KOps/s | |
test_creation_nested_2 | 44.8300μs | 13.4249μs | 74.4885 KOps/s | 90.8115 KOps/s | |
test_clone | 0.1220ms | 10.9457μs | 91.3599 KOps/s | 86.4203 KOps/s | |
test_getitem[int] | 1.7520ms | 10.6828μs | 93.6081 KOps/s | 87.1197 KOps/s | |
test_getitem[slice_int] | 0.1090ms | 20.9910μs | 47.6394 KOps/s | 44.0688 KOps/s | |
test_getitem[range] | 0.1368ms | 39.2152μs | 25.5003 KOps/s | 24.8989 KOps/s | |
test_getitem[tuple] | 0.1271ms | 18.3903μs | 54.3764 KOps/s | 50.2955 KOps/s | |
test_getitem[list] | 0.1455ms | 33.5030μs | 29.8481 KOps/s | 28.5334 KOps/s | |
test_setitem_dim[int] | 38.3910μs | 18.7706μs | 53.2748 KOps/s | 49.6523 KOps/s | |
test_setitem_dim[slice_int] | 58.7310μs | 38.5802μs | 25.9200 KOps/s | 24.4292 KOps/s | |
test_setitem_dim[range] | 95.3720μs | 56.9753μs | 17.5515 KOps/s | 18.1715 KOps/s | |
test_setitem_dim[tuple] | 71.7110μs | 33.7851μs | 29.5989 KOps/s | 29.8639 KOps/s | |
test_setitem | 0.1312ms | 16.1165μs | 62.0480 KOps/s | 64.2920 KOps/s | |
test_set | 0.1331ms | 15.1291μs | 66.0977 KOps/s | 64.7493 KOps/s | |
test_set_shared | 1.5744ms | 0.1516ms | 6.5972 KOps/s | 6.5463 KOps/s | |
test_update | 0.2485ms | 18.4473μs | 54.2083 KOps/s | 57.0281 KOps/s | |
test_update_nested | 0.1419ms | 23.8098μs | 41.9995 KOps/s | 44.3927 KOps/s | |
test_update__nested | 1.2569ms | 25.5365μs | 39.1597 KOps/s | 38.6332 KOps/s | |
test_set_nested | 0.1403ms | 16.9202μs | 59.1010 KOps/s | 60.8005 KOps/s | |
test_set_nested_new | 0.1348ms | 19.3798μs | 51.6000 KOps/s | 53.2783 KOps/s | |
test_select | 0.1441ms | 32.1895μs | 31.0660 KOps/s | 32.6681 KOps/s | |
test_select_nested | 80.2310μs | 43.6008μs | 22.9354 KOps/s | 23.3428 KOps/s | |
test_exclude_nested | 94.4910μs | 63.1670μs | 15.8311 KOps/s | 15.8718 KOps/s | |
test_empty[True] | 0.3608ms | 0.2954ms | 3.3856 KOps/s | 3.6003 KOps/s | |
test_empty[False] | 4.7661μs | 0.8328μs | 1.2007 MOps/s | 1.3148 MOps/s | |
test_to | 87.7410μs | 57.0937μs | 17.5151 KOps/s | 17.8096 KOps/s | |
test_to_nonblocking | 99.6310μs | 49.3781μs | 20.2519 KOps/s | 20.5775 KOps/s | |
test_unbind_speed | 0.8189ms | 0.2349ms | 4.2575 KOps/s | 4.0703 KOps/s | |
test_unbind_speed_stack0 | 0.3421ms | 0.2442ms | 4.0942 KOps/s | 4.0615 KOps/s | |
test_unbind_speed_stack1 | 95.2797ms | 0.6798ms | 1.4710 KOps/s | 1.4664 KOps/s | |
test_split | 95.0024ms | 1.6256ms | 615.1755 Ops/s | 591.9639 Ops/s | |
test_chunk | 95.9126ms | 1.6182ms | 617.9616 Ops/s | 590.3613 Ops/s | |
test_consolidate[False-None] | 97.4550ms | 3.0399ms | 328.9544 Ops/s | 336.7065 Ops/s | |
test_consolidate[default-None] | 1.8520ms | 1.7404ms | 574.5898 Ops/s | 551.2569 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8892ms | 1.7743ms | 563.5982 Ops/s | 539.2642 Ops/s | |
test_consolidate_njt[False-None] | 6.9065ms | 6.6419ms | 150.5601 Ops/s | 148.6335 Ops/s | |
test_to[False-False-None] | 1.9138ms | 1.7851ms | 560.1840 Ops/s | 572.3375 Ops/s | |
test_to[True-False-None] | 1.4907ms | 1.3617ms | 734.3678 Ops/s | 711.7779 Ops/s | |
test_to[within-False-None] | 4.4107ms | 4.2404ms | 235.8257 Ops/s | 238.5648 Ops/s | |
test_to[True-default-None] | 5.7922ms | 5.4148ms | 184.6777 Ops/s | 184.7076 Ops/s | |
test_to_njt[False-False-None] | 7.3211ms | 7.0863ms | 141.1174 Ops/s | 142.2231 Ops/s | |
test_to_njt[True-False-None] | 5.8400ms | 5.5989ms | 178.6054 Ops/s | 180.4009 Ops/s | |
test_to_njt[within-False-None] | 12.7418ms | 12.4513ms | 80.3127 Ops/s | 80.0480 Ops/s | |
test_creation[device0] | 0.4561ms | 80.4050μs | 12.4370 KOps/s | 12.2681 KOps/s | |
test_creation_from_tensor | 0.5593ms | 85.0230μs | 11.7615 KOps/s | 11.5543 KOps/s | |
test_add_one[memmap_tensor0] | 0.4429ms | 7.0705μs | 141.4329 KOps/s | 126.0567 KOps/s | |
test_contiguous[memmap_tensor0] | 3.1321μs | 0.4244μs | 2.3561 MOps/s | 2.4227 MOps/s | |
test_stack[memmap_tensor0] | 38.5310μs | 4.4551μs | 224.4595 KOps/s | 199.6745 KOps/s | |
test_memmaptd_index | 1.6503ms | 0.2586ms | 3.8666 KOps/s | 3.6256 KOps/s | |
test_memmaptd_index_astensor | 0.6116ms | 0.3207ms | 3.1185 KOps/s | 3.0446 KOps/s | |
test_memmaptd_index_op | 1.0535ms | 0.6193ms | 1.6148 KOps/s | 1.6425 KOps/s | |
test_serialize_model | 0.1325s | 0.1312s | 7.6208 Ops/s | 7.5680 Ops/s | |
test_serialize_model_pickle | 1.3503s | 1.2173s | 0.8215 Ops/s | 0.8393 Ops/s | |
test_serialize_weights | 0.1327s | 0.1306s | 7.6573 Ops/s | 7.6863 Ops/s | |
test_serialize_weights_returnearly | 0.4226s | 68.3957ms | 14.6208 Ops/s | 14.3411 Ops/s | |
test_serialize_weights_pickle | 1.3792s | 1.1923s | 0.8387 Ops/s | 0.8379 Ops/s | |
test_reshape_pytree | 54.4510μs | 22.7528μs | 43.9507 KOps/s | 42.8722 KOps/s | |
test_reshape_td | 54.8210μs | 27.0139μs | 37.0180 KOps/s | 33.8452 KOps/s | |
test_view_pytree | 54.6310μs | 23.0423μs | 43.3985 KOps/s | 43.3809 KOps/s | |
test_view_td | 94.6610μs | 31.4306μs | 31.8161 KOps/s | 31.1678 KOps/s | |
test_unbind_pytree | 92.7610μs | 28.8517μs | 34.6600 KOps/s | 33.2513 KOps/s | |
test_unbind_td | 0.7683ms | 36.0847μs | 27.7125 KOps/s | 27.2373 KOps/s | |
test_split_pytree | 62.4810μs | 30.8040μs | 32.4633 KOps/s | 32.0720 KOps/s | |
test_split_td | 0.9507ms | 39.0221μs | 25.6265 KOps/s | 24.1281 KOps/s | |
test_add_pytree | 88.8110μs | 34.7900μs | 28.7439 KOps/s | 27.1096 KOps/s | |
test_add_td | 77.3010μs | 48.6865μs | 20.5396 KOps/s | 20.6672 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1760ms | 0.1232ms | 8.1158 KOps/s | 7.9707 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2373ms | 0.1306ms | 7.6583 KOps/s | 7.8328 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2956ms | 97.4549μs | 10.2612 KOps/s | 10.1563 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2172ms | 0.1499ms | 6.6692 KOps/s | 6.3092 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 71.5310μs | 23.7362μs | 42.1297 KOps/s | 31.5450 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 70.6210μs | 29.5202μs | 33.8751 KOps/s | 35.8583 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3350ms | 65.9795μs | 15.1562 KOps/s | 15.4409 KOps/s | |
test_compile_copy_nested[pytree-eager] | 81.4810μs | 49.5706μs | 20.1733 KOps/s | 19.6945 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1889ms | 0.1460ms | 6.8487 KOps/s | 6.8371 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3583ms | 0.2191ms | 4.5648 KOps/s | 4.7101 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2337ms | 0.1074ms | 9.3130 KOps/s | 9.5634 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1428ms | 56.8089μs | 17.6029 KOps/s | 18.7853 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1792ms | 0.1388ms | 7.2026 KOps/s | 7.2229 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5540ms | 0.4880ms | 2.0490 KOps/s | 1.9300 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3999ms | 0.2656ms | 3.7646 KOps/s | 3.9564 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2283ms | 0.1578ms | 6.3356 KOps/s | 6.9679 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1984ms | 69.6334μs | 14.3609 KOps/s | 15.4247 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2054ms | 0.1061ms | 9.4217 KOps/s | 10.0548 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5830ms | 0.4277ms | 2.3382 KOps/s | 2.3103 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2278ms | 0.1416ms | 7.0630 KOps/s | 7.2791 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1112ms | 21.1035μs | 47.3854 KOps/s | 52.6410 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1171ms | 31.1034μs | 32.1509 KOps/s | 36.7809 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1665ms | 71.5292μs | 13.9803 KOps/s | 14.0271 KOps/s | |
test_compile_copy_flat[pytree-eager] | 89.7120μs | 52.1881μs | 19.1615 KOps/s | 18.7684 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.7831ms | 0.4217ms | 2.3715 KOps/s | 2.1698 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.0859ms | 2.7932ms | 358.0118 Ops/s | 369.1028 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.6854ms | 0.4603ms | 2.1724 KOps/s | 2.1417 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7841ms | 2.6714ms | 374.3385 Ops/s | 344.4674 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6743ms | 0.1163ms | 8.5969 KOps/s | 8.1395 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5552ms | 80.4096μs | 12.4363 KOps/s | 11.6037 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.5047ms | 0.1117ms | 8.9490 KOps/s | 8.7854 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1444ms | 68.3671μs | 14.6269 KOps/s | 13.9402 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2005ms | 0.1097ms | 9.1133 KOps/s | 9.1394 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1201ms | 68.1339μs | 14.6770 KOps/s | 14.2721 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1548ms | 0.1028ms | 9.7283 KOps/s | 9.5121 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1449ms | 17.9003μs | 55.8649 KOps/s | 50.6696 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1564ms | 98.6346μs | 10.1384 KOps/s | 10.0737 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 47.0410μs | 15.9966μs | 62.5133 KOps/s | 50.8888 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1544ms | 98.8590μs | 10.1154 KOps/s | 9.4806 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 50.2710μs | 15.9641μs | 62.6407 KOps/s | 59.2574 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1457ms | 0.1039ms | 9.6221 KOps/s | 9.5414 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6377ms | 17.4311μs | 57.3688 KOps/s | 53.4373 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1556ms | 99.3926μs | 10.0611 KOps/s | 9.9750 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 56.0710μs | 16.0093μs | 62.4638 KOps/s | 57.6447 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2034ms | 99.7593μs | 10.0241 KOps/s | 10.0147 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.3879ms | 15.9823μs | 62.5691 KOps/s | 60.7889 KOps/s | |
test_mod_add[eager] | 82.7810μs | 39.0159μs | 25.6305 KOps/s | 26.1643 KOps/s | |
test_mod_add[compile] | 0.3558ms | 83.7414μs | 11.9415 KOps/s | 11.9919 KOps/s | |
test_mod_add[compile-overhead] | 0.3317ms | 0.1701ms | 5.8801 KOps/s | 5.5860 KOps/s | |
test_mod_wrap[eager] | 0.3312ms | 0.2551ms | 3.9197 KOps/s | 3.8794 KOps/s | |
test_mod_wrap[compile] | 0.8106ms | 0.3006ms | 3.3262 KOps/s | 3.3955 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0813ms | 3.7553ms | 266.2900 Ops/s | 269.3785 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5237ms | 1.3838ms | 722.6359 Ops/s | 704.2817 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3926ms | 1.2793ms | 781.6905 Ops/s | 755.0722 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3723ms | 0.9403ms | 1.0635 KOps/s | 1.0478 KOps/s | |
test_seq_add[eager] | 0.2097ms | 0.1168ms | 8.5602 KOps/s | 8.6771 KOps/s | |
test_seq_add[compile] | 0.1599ms | 94.1626μs | 10.6199 KOps/s | 10.8749 KOps/s | |
test_seq_add[compile-overhead] | 0.2343ms | 0.1339ms | 7.4689 KOps/s | 7.5709 KOps/s | |
test_seq_wrap[eager] | 0.5728ms | 0.4281ms | 2.3361 KOps/s | 2.3429 KOps/s | |
test_seq_wrap[compile] | 0.4283ms | 0.3102ms | 3.2236 KOps/s | 3.2303 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3750ms | 0.2308ms | 4.3332 KOps/s | 4.3001 KOps/s | |
test_func_call_runtime[False-eager] | 0.8162ms | 0.7399ms | 1.3515 KOps/s | 1.2427 KOps/s | |
test_func_call_runtime[False-compile] | 0.8587ms | 0.7555ms | 1.3236 KOps/s | 1.2489 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4657ms | 0.3796ms | 2.6346 KOps/s | 2.6977 KOps/s | |
test_func_call_runtime[True-eager] | 1.0743ms | 0.9694ms | 1.0316 KOps/s | 1.0602 KOps/s | |
test_func_call_runtime[True-compile] | 0.9143ms | 0.7774ms | 1.2863 KOps/s | 1.2377 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4702ms | 0.3930ms | 2.5445 KOps/s | 2.5619 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9544ms | 0.7857ms | 1.2728 KOps/s | 1.2835 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.3580ms | 0.7659ms | 1.3056 KOps/s | 1.2940 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4501ms | 0.3785ms | 2.6423 KOps/s | 2.6653 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2370ms | 1.0735ms | 931.5708 Ops/s | 961.9666 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8721ms | 0.8019ms | 1.2470 KOps/s | 1.2147 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4783ms | 0.4156ms | 2.4060 KOps/s | 2.4021 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6671ms | 2.1381ms | 467.6956 Ops/s | 473.0842 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8936ms | 0.8220ms | 1.2165 KOps/s | 1.1951 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5037ms | 0.4203ms | 2.3794 KOps/s | 2.3939 KOps/s | |
test_distributed | 3.0150ms | 0.1826ms | 5.4759 KOps/s | 8.2448 KOps/s | |
test_tdmodule | 41.7700μs | 19.8335μs | 50.4197 KOps/s | 54.6574 KOps/s | |
test_tdmodule_dispatch | 75.6210μs | 35.8824μs | 27.8688 KOps/s | 30.3058 KOps/s | |
test_tdseq | 33.6300μs | 19.9755μs | 50.0614 KOps/s | 55.9266 KOps/s | |
test_tdseq_dispatch | 60.7710μs | 37.9953μs | 26.3190 KOps/s | 28.8713 KOps/s | |
test_instantiation_functorch | 1.7152ms | 1.5797ms | 633.0336 Ops/s | 629.4091 Ops/s | |
test_exec_functorch | 0.2538ms | 0.1466ms | 6.8190 KOps/s | 6.5866 KOps/s | |
test_exec_functional_call | 0.1974ms | 0.1386ms | 7.2142 KOps/s | 6.8697 KOps/s | |
test_exec_td_decorator | 0.4017ms | 0.1935ms | 5.1682 KOps/s | 5.2296 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8176ms | 0.7056ms | 1.4173 KOps/s | 1.4412 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8472ms | 0.7065ms | 1.4155 KOps/s | 1.4341 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7380ms | 0.6082ms | 1.6442 KOps/s | 1.6476 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7366ms | 0.6126ms | 1.6324 KOps/s | 1.6535 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.9042ms | 19.7724ms | 50.5755 Ops/s | 51.1565 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2292ms | 19.7298ms | 50.6848 Ops/s | 51.1834 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.8349ms | 19.4554ms | 51.3996 Ops/s | 51.3035 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.8995ms | 19.6240ms | 50.9580 Ops/s | 51.1823 Ops/s | |
test_to_module_speed[True] | 1.1094ms | 0.9885ms | 1.0117 KOps/s | 1.0553 KOps/s | |
test_to_module_speed[False] | 1.0819ms | 0.9642ms | 1.0371 KOps/s | 1.0864 KOps/s | |
test_tc_init | 76.5610μs | 39.1614μs | 25.5353 KOps/s | 28.0883 KOps/s | |
test_tc_init_nested | 0.1203ms | 79.5853μs | 12.5651 KOps/s | 14.1039 KOps/s | |
test_tc_first_layer_tensor | 6.2844μs | 0.7018μs | 1.4249 MOps/s | 1.4162 MOps/s | |
test_tc_first_layer_nontensor | 32.2210μs | 2.3223μs | 430.6046 KOps/s | 434.8643 KOps/s | |
test_tc_second_layer_tensor | 21.6110μs | 1.4952μs | 668.7876 KOps/s | 709.7880 KOps/s | |
test_tc_second_layer_nontensor | 51.7510μs | 2.9882μs | 334.6454 KOps/s | 333.1105 KOps/s | |
test_unbind | 0.2223s | 12.4299ms | 80.4510 Ops/s | 151.3654 Ops/s | |
test_full_like | 9.3327ms | 9.1299ms | 109.5307 Ops/s | 108.7212 Ops/s | |
test_zeros_like | 4.9736ms | 4.3323ms | 230.8232 Ops/s | 235.8628 Ops/s | |
test_ones_like | 4.4819ms | 4.3293ms | 230.9830 Ops/s | 231.1349 Ops/s | |
test_clone | 6.5080ms | 6.3876ms | 156.5533 Ops/s | 109.4972 Ops/s | |
test_squeeze | 61.4010μs | 10.8420μs | 92.2341 KOps/s | 103.7391 KOps/s | |
test_unsqueeze | 0.1531ms | 72.9781μs | 13.7027 KOps/s | 12.9282 KOps/s | |
test_split | 0.3676ms | 0.1602ms | 6.2435 KOps/s | 5.8298 KOps/s | |
test_permute | 0.2435ms | 0.1833ms | 5.4563 KOps/s | 5.4875 KOps/s | |
test_stack | 51.5269ms | 50.9897ms | 19.6118 Ops/s | 19.7146 Ops/s | |
test_cat | 51.2405ms | 50.6564ms | 19.7408 Ops/s | 19.8026 Ops/s |
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: c6a8d4587df45e374f0d6cb59fe1c982c7818276 Pull Request resolved: #1139
Stack from ghstack (oldest at bottom):