[Feature] Expose WrapModule #1118
Merged
This was referenced Nov 29, 2024
Merged
vmoens added a commit that referenced this pull request on Nov 29, 2024:
ghstack-source-id: 3d40ae8fcd9394d77dfbb9f9a36988b9eb7877ed
Pull Request resolved: #1118
facebook-github-bot added the CLA Signed label on Nov 29, 2024. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 33.2820μs | 17.5673μs | 56.9241 KOps/s | 59.0513 KOps/s | |
test_plain_set_stack_nested | 53.9210μs | 17.9519μs | 55.7046 KOps/s | 57.6049 KOps/s | |
test_plain_set_nested_inplace | 47.6600μs | 19.3355μs | 51.7182 KOps/s | 53.1901 KOps/s | |
test_plain_set_stack_nested_inplace | 47.4980μs | 19.4093μs | 51.5217 KOps/s | 52.1501 KOps/s | |
test_items | 26.1390μs | 4.2594μs | 234.7726 KOps/s | 236.7745 KOps/s | |
test_items_nested | 0.6761ms | 0.4058ms | 2.4643 KOps/s | 2.4818 KOps/s | |
test_items_nested_locked | 0.5937ms | 0.4058ms | 2.4640 KOps/s | 2.4189 KOps/s | |
test_items_nested_leaf | 0.1765ms | 72.2290μs | 13.8449 KOps/s | 13.9092 KOps/s | |
test_items_stack_nested | 0.6414ms | 0.4102ms | 2.4379 KOps/s | 2.4507 KOps/s | |
test_items_stack_nested_leaf | 0.1249ms | 74.9491μs | 13.3424 KOps/s | 13.3466 KOps/s | |
test_items_stack_nested_locked | 0.5214ms | 0.4083ms | 2.4492 KOps/s | 2.4559 KOps/s | |
test_keys | 31.1080μs | 3.5411μs | 282.3993 KOps/s | 246.7501 KOps/s | |
test_keys_nested | 0.2219ms | 0.1375ms | 7.2721 KOps/s | 7.2766 KOps/s | |
test_keys_nested_locked | 1.6880ms | 0.1438ms | 6.9562 KOps/s | 6.9019 KOps/s | |
test_keys_nested_leaf | 0.1983ms | 0.1161ms | 8.6159 KOps/s | 8.2555 KOps/s | |
test_keys_stack_nested | 0.2340ms | 0.1359ms | 7.3577 KOps/s | 7.1717 KOps/s | |
test_keys_stack_nested_leaf | 0.7961ms | 0.1236ms | 8.0888 KOps/s | 8.2226 KOps/s | |
test_keys_stack_nested_locked | 0.2640ms | 0.1409ms | 7.0980 KOps/s | 6.9122 KOps/s | |
test_values | 5.8390μs | 1.0370μs | 964.3465 KOps/s | 945.3625 KOps/s | |
test_values_nested | 0.1089ms | 55.3965μs | 18.0517 KOps/s | 17.8031 KOps/s | |
test_values_nested_locked | 0.1069ms | 55.2511μs | 18.0992 KOps/s | 17.9020 KOps/s | |
test_values_nested_leaf | 0.1122ms | 60.0540μs | 16.6517 KOps/s | 16.3475 KOps/s | |
test_values_stack_nested | 0.1108ms | 57.2152μs | 17.4779 KOps/s | 17.2526 KOps/s | |
test_values_stack_nested_leaf | 0.4421ms | 62.9550μs | 15.8844 KOps/s | 16.2551 KOps/s | |
test_values_stack_nested_locked | 0.1930ms | 57.8647μs | 17.2817 KOps/s | 17.3432 KOps/s | |
test_membership | 7.1881μs | 0.7495μs | 1.3342 MOps/s | 1.0521 MOps/s | |
test_membership_nested | 19.6470μs | 2.9613μs | 337.6897 KOps/s | 326.8228 KOps/s | |
test_membership_nested_leaf | 40.9970μs | 2.9579μs | 338.0784 KOps/s | 335.1323 KOps/s | |
test_membership_stacked_nested | 19.7470μs | 2.9532μs | 338.6117 KOps/s | 339.3087 KOps/s | |
test_membership_stacked_nested_leaf | 23.4640μs | 2.9330μs | 340.9529 KOps/s | 333.2816 KOps/s | |
test_membership_nested_last | 24.3360μs | 4.1673μs | 239.9653 KOps/s | 236.6469 KOps/s | |
test_membership_nested_leaf_last | 24.7360μs | 4.1754μs | 239.4983 KOps/s | 228.2722 KOps/s | |
test_membership_stacked_nested_last | 22.1920μs | 4.1723μs | 239.6778 KOps/s | 181.5077 KOps/s | |
test_membership_stacked_nested_leaf_last | 18.9360μs | 4.1770μs | 239.4077 KOps/s | 182.2488 KOps/s | |
test_nested_getleaf | 37.5400μs | 10.4974μs | 95.2614 KOps/s | 91.7633 KOps/s | |
test_nested_get | 38.9430μs | 9.9790μs | 100.2104 KOps/s | 95.6148 KOps/s | |
test_stacked_getleaf | 35.6370μs | 10.5183μs | 95.0723 KOps/s | 92.3099 KOps/s | |
test_stacked_get | 38.1220μs | 9.9261μs | 100.7447 KOps/s | 95.6268 KOps/s | |
test_nested_getitemleaf | 41.0170μs | 10.9255μs | 91.5287 KOps/s | 88.0834 KOps/s | |
test_nested_getitem | 37.2800μs | 10.1421μs | 98.5987 KOps/s | 93.8168 KOps/s | |
test_stacked_getitemleaf | 33.2020μs | 10.8986μs | 91.7548 KOps/s | 87.6285 KOps/s | |
test_stacked_getitem | 33.3020μs | 10.2670μs | 97.3994 KOps/s | 93.9322 KOps/s | |
test_lock_nested | 3.2900ms | 0.4517ms | 2.2139 KOps/s | 2.2219 KOps/s | |
test_lock_stack_nested | 0.7964ms | 0.4122ms | 2.4259 KOps/s | 2.4547 KOps/s | |
test_unlock_nested | 1.0361ms | 0.3598ms | 2.7790 KOps/s | 2.7535 KOps/s | |
test_unlock_stack_nested | 0.7214ms | 0.3300ms | 3.0306 KOps/s | 3.0914 KOps/s | |
test_flatten_speed | 0.1915ms | 94.6084μs | 10.5699 KOps/s | 10.5566 KOps/s | |
test_unflatten_speed | 0.9801ms | 0.4906ms | 2.0385 KOps/s | 2.0035 KOps/s | |
test_common_ops | 6.8170ms | 0.7768ms | 1.2874 KOps/s | 1.3465 KOps/s | |
test_creation | 41.6380μs | 2.0731μs | 482.3608 KOps/s | 466.1474 KOps/s | |
test_creation_empty | 42.8910μs | 10.7104μs | 93.3674 KOps/s | 110.7458 KOps/s | |
test_creation_nested_1 | 39.8650μs | 14.0890μs | 70.9771 KOps/s | 83.1877 KOps/s | |
test_creation_nested_2 | 1.4018ms | 18.1343μs | 55.1440 KOps/s | 62.6853 KOps/s | |
test_clone | 70.1210μs | 12.9808μs | 77.0367 KOps/s | 75.6326 KOps/s | |
test_getitem[int] | 0.8538ms | 12.5823μs | 79.4767 KOps/s | 78.1959 KOps/s | |
test_getitem[slice_int] | 0.1406ms | 24.8527μs | 40.2371 KOps/s | 40.6779 KOps/s | |
test_getitem[range] | 0.1734ms | 48.6461μs | 20.5566 KOps/s | 20.5759 KOps/s | |
test_getitem[tuple] | 0.1533ms | 20.1108μs | 49.7244 KOps/s | 48.3454 KOps/s | |
test_getitem[list] | 0.2631ms | 43.9203μs | 22.7685 KOps/s | 22.7502 KOps/s | |
test_setitem_dim[int] | 51.2870μs | 24.7330μs | 40.4318 KOps/s | 40.1950 KOps/s | |
test_setitem_dim[slice_int] | 95.0090μs | 50.5961μs | 19.7644 KOps/s | 19.3771 KOps/s | |
test_setitem_dim[range] | 0.1122ms | 73.7231μs | 13.5643 KOps/s | 13.6739 KOps/s | |
test_setitem_dim[tuple] | 77.2850μs | 39.9485μs | 25.0322 KOps/s | 24.6696 KOps/s | |
test_setitem | 0.1250ms | 19.7713μs | 50.5785 KOps/s | 51.0585 KOps/s | |
test_set | 90.6200μs | 19.5537μs | 51.1411 KOps/s | 52.7968 KOps/s | |
test_set_shared | 1.1701ms | 0.1702ms | 5.8754 KOps/s | 5.8391 KOps/s | |
test_update | 0.1690ms | 21.8613μs | 45.7430 KOps/s | 48.7064 KOps/s | |
test_update_nested | 0.1158ms | 32.0870μs | 31.1652 KOps/s | 32.6266 KOps/s | |
test_update__nested | 0.5891ms | 32.1070μs | 31.1458 KOps/s | 29.2169 KOps/s | |
test_set_nested | 89.8790μs | 21.0834μs | 47.4306 KOps/s | 48.1025 KOps/s | |
test_set_nested_new | 77.1040μs | 25.5947μs | 39.0706 KOps/s | 38.0520 KOps/s | |
test_select | 96.1610μs | 40.8172μs | 24.4995 KOps/s | 23.9059 KOps/s | |
test_select_nested | 0.1205ms | 59.2728μs | 16.8712 KOps/s | 16.3615 KOps/s | |
test_exclude_nested | 0.1457ms | 77.3120μs | 12.9346 KOps/s | 12.6130 KOps/s | |
test_empty[True] | 0.6965ms | 0.3832ms | 2.6097 KOps/s | 2.5433 KOps/s | |
test_empty[False] | 10.7050μs | 1.2571μs | 795.4841 KOps/s | 813.8327 KOps/s | |
test_unbind_speed | 0.3738ms | 0.2589ms | 3.8632 KOps/s | 3.7260 KOps/s | |
test_unbind_speed_stack0 | 0.5661ms | 0.2576ms | 3.8816 KOps/s | 3.9151 KOps/s | |
test_unbind_speed_stack1 | 0.1091s | 0.7730ms | 1.2936 KOps/s | 1.4439 KOps/s | |
test_split | 1.7540ms | 1.5664ms | 638.4210 Ops/s | 577.7624 Ops/s | |
test_chunk | 0.1372s | 2.0013ms | 499.6771 Ops/s | 580.8788 Ops/s | |
test_consolidate_njt[False-None] | 10.9117ms | 8.3357ms | 119.9661 Ops/s | 118.3285 Ops/s | |
test_creation[device0] | 4.7255ms | 94.6477μs | 10.5655 KOps/s | 10.8092 KOps/s | |
test_creation_from_tensor | 0.2747ms | 95.8137μs | 10.4369 KOps/s | 10.3733 KOps/s | |
test_add_one[memmap_tensor0] | 0.1772ms | 4.7428μs | 210.8453 KOps/s | 206.7204 KOps/s | |
test_contiguous[memmap_tensor0] | 16.7810μs | 0.5231μs | 1.9115 MOps/s | 1.8787 MOps/s | |
test_stack[memmap_tensor0] | 45.0940μs | 3.4809μs | 287.2855 KOps/s | 285.9437 KOps/s | |
test_memmaptd_index | 0.4648ms | 0.2357ms | 4.2427 KOps/s | 4.2643 KOps/s | |
test_memmaptd_index_astensor | 0.7214ms | 0.3154ms | 3.1701 KOps/s | 3.2045 KOps/s | |
test_memmaptd_index_op | 1.0967ms | 0.5678ms | 1.7612 KOps/s | 1.8749 KOps/s | |
test_serialize_model | 0.1242s | 0.1161s | 8.6101 Ops/s | 7.6101 Ops/s | |
test_serialize_model_pickle | 0.4612s | 0.3899s | 2.5650 Ops/s | 2.5478 Ops/s | |
test_serialize_weights | 0.1165s | 0.1114s | 8.9799 Ops/s | 8.8102 Ops/s | |
test_serialize_weights_returnearly | 0.1686s | 0.1582s | 6.3217 Ops/s | 6.3263 Ops/s | |
test_serialize_weights_pickle | 0.4811s | 0.4158s | 2.4050 Ops/s | 2.4565 Ops/s | |
test_serialize_weights_filesystem | 0.1450s | 0.1396s | 7.1611 Ops/s | 6.3750 Ops/s | |
test_serialize_model_filesystem | 0.1590s | 0.1495s | 6.6872 Ops/s | 6.7395 Ops/s | |
test_reshape_pytree | 65.1120μs | 27.2427μs | 36.7070 KOps/s | 35.2676 KOps/s | |
test_reshape_td | 79.9100μs | 32.7476μs | 30.5365 KOps/s | 28.7128 KOps/s | |
test_view_pytree | 63.4590μs | 26.8320μs | 37.2689 KOps/s | 35.4408 KOps/s | |
test_view_td | 84.5280μs | 38.5172μs | 25.9624 KOps/s | 25.0621 KOps/s | |
test_unbind_pytree | 70.0020μs | 29.7853μs | 33.5736 KOps/s | 32.5127 KOps/s | |
test_unbind_td | 0.3603ms | 37.8814μs | 26.3982 KOps/s | 24.8315 KOps/s | |
test_split_pytree | 84.8090μs | 29.6777μs | 33.6954 KOps/s | 32.9880 KOps/s | |
test_split_td | 0.5340ms | 43.7754μs | 22.8439 KOps/s | 21.9723 KOps/s | |
test_add_pytree | 80.7610μs | 35.4619μs | 28.1993 KOps/s | 27.5441 KOps/s | |
test_add_td | 0.1239ms | 54.2282μs | 18.4406 KOps/s | 18.6896 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1123ms | 63.0661μs | 15.8564 KOps/s | 15.9386 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3980ms | 0.1624ms | 6.1582 KOps/s | 6.1148 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1077ms | 46.0642μs | 21.7088 KOps/s | 21.6378 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2219ms | 0.1177ms | 8.4965 KOps/s | 8.3394 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 60.8040μs | 27.0506μs | 36.9677 KOps/s | 38.1991 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1164ms | 54.4116μs | 18.3784 KOps/s | 18.2916 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1430ms | 78.7136μs | 12.7043 KOps/s | 12.2951 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1401ms | 67.7101μs | 14.7688 KOps/s | 14.3879 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2120ms | 0.1061ms | 9.4224 KOps/s | 9.5894 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3936ms | 0.1964ms | 5.0909 KOps/s | 4.7830 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1092ms | 45.1494μs | 22.1487 KOps/s | 20.8586 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5038ms | 60.7590μs | 16.4585 KOps/s | 15.9867 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2252ms | 0.1054ms | 9.4850 KOps/s | 9.7329 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4178ms | 0.2011ms | 4.9720 KOps/s | 4.9100 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4213ms | 0.2112ms | 4.7344 KOps/s | 4.5292 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5118ms | 0.1055ms | 9.4809 KOps/s | 9.4558 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1484ms | 56.3516μs | 17.7457 KOps/s | 18.0966 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1078ms | 46.4425μs | 21.5320 KOps/s | 20.2482 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6246ms | 0.1580ms | 6.3286 KOps/s | 6.1990 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1784ms | 0.1076ms | 9.2967 KOps/s | 9.7115 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 58.5690μs | 21.0134μs | 47.5887 KOps/s | 48.0024 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1304ms | 58.8361μs | 16.9964 KOps/s | 16.9372 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1535ms | 80.5863μs | 12.4091 KOps/s | 11.9847 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1550ms | 68.4502μs | 14.6092 KOps/s | 14.2999 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2973ms | 0.2063ms | 4.8476 KOps/s | 4.8455 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.3187ms | 1.2868ms | 777.1200 Ops/s | 780.1837 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2889ms | 0.2047ms | 4.8841 KOps/s | 4.9173 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9736ms | 0.7666ms | 1.3044 KOps/s | 1.2838 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.8404ms | 0.4588ms | 2.1798 KOps/s | 2.1905 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.4529ms | 2.5380ms | 394.0150 Ops/s | 402.9492 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 92.3530μs | 37.6551μs | 26.5568 KOps/s | 27.5640 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4641ms | 31.9286μs | 31.3198 KOps/s | 29.2526 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 88.3130μs | 29.8931μs | 33.4526 KOps/s | 33.5629 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1003ms | 23.8088μs | 42.0013 KOps/s | 41.2118 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 79.7490μs | 31.3104μs | 31.9383 KOps/s | 33.4021 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 90.8440μs | 22.9705μs | 43.5340 KOps/s | 42.2034 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1413ms | 52.5882μs | 19.0157 KOps/s | 18.9894 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5064ms | 18.9290μs | 52.8289 KOps/s | 47.8903 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1060ms | 44.8911μs | 22.2761 KOps/s | 22.0622 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 52.8890μs | 18.8106μs | 53.1616 KOps/s | 50.9766 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1007ms | 45.2132μs | 22.1174 KOps/s | 21.6304 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 88.7290μs | 18.8591μs | 53.0249 KOps/s | 51.0477 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1261ms | 54.1619μs | 18.4632 KOps/s | 18.4483 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0628ms | 19.5419μs | 51.1721 KOps/s | 48.0766 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 95.7830μs | 45.4368μs | 22.0086 KOps/s | 21.6267 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 53.0900μs | 18.9474μs | 52.7778 KOps/s | 51.3739 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1026ms | 45.3792μs | 22.0365 KOps/s | 21.6936 KOps/s | |
test_compile_indexing[int-pytree-eager] | 54.5320μs | 18.5241μs | 53.9837 KOps/s | 50.9326 KOps/s | |
test_mod_add[eager] | 85.1990μs | 35.0325μs | 28.5449 KOps/s | 30.0564 KOps/s | |
test_mod_add[compile] | 0.1851ms | 48.6247μs | 20.5657 KOps/s | 20.4849 KOps/s | |
test_mod_add[compile-overhead] | 96.2000μs | 48.6416μs | 20.5585 KOps/s | 20.6692 KOps/s | |
test_mod_wrap[eager] | 0.4267ms | 0.2236ms | 4.4719 KOps/s | 4.5486 KOps/s | |
test_mod_wrap[compile] | 0.4195ms | 0.2075ms | 4.8192 KOps/s | 4.7638 KOps/s | |
test_mod_wrap[compile-overhead] | 0.6244ms | 0.2160ms | 4.6304 KOps/s | 4.8402 KOps/s | |
test_mod_wrap_and_backward[eager] | 16.2305ms | 12.3105ms | 81.2315 Ops/s | 90.6588 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.4814ms | 13.1081ms | 76.2888 Ops/s | 92.5357 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.7886ms | 12.6445ms | 79.0855 Ops/s | 91.9302 Ops/s | |
test_seq_add[eager] | 0.1824ms | 0.1106ms | 9.0437 KOps/s | 9.0032 KOps/s | |
test_seq_add[compile] | 0.1420ms | 63.3038μs | 15.7968 KOps/s | 15.4752 KOps/s | |
test_seq_add[compile-overhead] | 0.1347ms | 61.3361μs | 16.3036 KOps/s | 16.5818 KOps/s | |
test_seq_wrap[eager] | 0.8200ms | 0.4446ms | 2.2491 KOps/s | 2.2989 KOps/s | |
test_seq_wrap[compile] | 0.3378ms | 0.2296ms | 4.3552 KOps/s | 4.2937 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3589ms | 0.2279ms | 4.3884 KOps/s | 4.3505 KOps/s | |
test_func_call_runtime[False-eager] | 0.9526ms | 0.5334ms | 1.8747 KOps/s | 1.8263 KOps/s | |
test_func_call_runtime[False-compile] | 0.5935ms | 0.4280ms | 2.3367 KOps/s | 2.3328 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5481ms | 0.4258ms | 2.3485 KOps/s | 2.3056 KOps/s | |
test_func_call_runtime[True-eager] | 1.4821ms | 0.7500ms | 1.3333 KOps/s | 1.3029 KOps/s | |
test_func_call_runtime[True-compile] | 0.8678ms | 0.4671ms | 2.1409 KOps/s | 2.1210 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5629ms | 0.4663ms | 2.1443 KOps/s | 2.1281 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.2099ms | 0.5329ms | 1.8765 KOps/s | 1.8313 KOps/s | |
test_func_call_cm_runtime[False-compile] | 2.3404ms | 0.4425ms | 2.2599 KOps/s | 2.3181 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7042ms | 0.4270ms | 2.3421 KOps/s | 2.3354 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0199ms | 0.8841ms | 1.1312 KOps/s | 1.1082 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1926ms | 0.4946ms | 2.0220 KOps/s | 2.0145 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6515ms | 0.4906ms | 2.0385 KOps/s | 2.0153 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5610ms | 1.8827ms | 531.1460 Ops/s | 525.5197 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0086ms | 0.5240ms | 1.9085 KOps/s | 1.9011 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.8819ms | 0.5277ms | 1.8950 KOps/s | 1.8704 KOps/s | |
test_distributed | 0.2873ms | 0.1276ms | 7.8395 KOps/s | 7.7147 KOps/s | |
test_tdmodule | 44.9740μs | 26.5220μs | 37.7045 KOps/s | 39.1729 KOps/s | |
test_tdmodule_dispatch | 78.6780μs | 47.3055μs | 21.1392 KOps/s | 20.9383 KOps/s | |
test_tdseq | 53.5710μs | 26.4006μs | 37.8779 KOps/s | 38.5894 KOps/s | |
test_tdseq_dispatch | 81.0520μs | 50.3826μs | 19.8481 KOps/s | 20.8071 KOps/s | |
test_instantiation_functorch | 2.0105ms | 1.5238ms | 656.2519 Ops/s | 624.1480 Ops/s | |
test_exec_functorch | 0.3177ms | 0.1807ms | 5.5328 KOps/s | 5.4460 KOps/s | |
test_exec_functional_call | 0.4739ms | 0.1780ms | 5.6186 KOps/s | 5.7303 KOps/s | |
test_exec_td_decorator | 0.5122ms | 0.2308ms | 4.3320 KOps/s | 4.3311 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0307ms | 0.6659ms | 1.5017 KOps/s | 1.5365 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1518ms | 0.6511ms | 1.5358 KOps/s | 1.5014 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7328ms | 0.5214ms | 1.9177 KOps/s | 1.8736 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9628ms | 0.5198ms | 1.9237 KOps/s | 1.8754 KOps/s | |
test_to_module_speed[True] | 2.1132ms | 1.2853ms | 778.0082 Ops/s | 758.2307 Ops/s | |
test_to_module_speed[False] | 1.6646ms | 1.2564ms | 795.9469 Ops/s | 772.1191 Ops/s | |
test_tc_init | 84.3780μs | 45.9568μs | 21.7596 KOps/s | 22.1186 KOps/s | |
test_tc_init_nested | 0.2087ms | 91.9134μs | 10.8798 KOps/s | 11.3385 KOps/s | |
test_tc_first_layer_tensor | 38.7480μs | 1.5485μs | 645.7697 KOps/s | 648.6115 KOps/s | |
test_tc_first_layer_nontensor | 22.9120μs | 4.7785μs | 209.2697 KOps/s | 211.7119 KOps/s | |
test_tc_second_layer_tensor | 20.4980μs | 2.8649μs | 349.0475 KOps/s | 353.2499 KOps/s | |
test_tc_second_layer_nontensor | 59.1680μs | 6.0248μs | 165.9806 KOps/s | 163.8880 KOps/s | |
test_unbind | 0.2225s | 12.9455ms | 77.2469 Ops/s | 79.8032 Ops/s | |
test_full_like | 16.5745ms | 11.8527ms | 84.3692 Ops/s | 80.0977 Ops/s | |
test_zeros_like | 13.0459ms | 7.8517ms | 127.3611 Ops/s | 134.7831 Ops/s | |
test_ones_like | 12.4532ms | 7.8739ms | 127.0018 Ops/s | 128.3639 Ops/s | |
test_clone | 16.7010ms | 9.5218ms | 105.0222 Ops/s | 103.5484 Ops/s | |
test_squeeze | 64.2210μs | 12.5219μs | 79.8598 KOps/s | 81.6634 KOps/s | |
test_unsqueeze | 0.1476ms | 89.2255μs | 11.2076 KOps/s | 11.0986 KOps/s | |
test_split | 0.4811ms | 0.1955ms | 5.1147 KOps/s | 5.0754 KOps/s | |
test_permute | 0.3244ms | 0.2236ms | 4.4714 KOps/s | 4.5257 KOps/s | |
test_stack | 29.3148ms | 25.6389ms | 39.0032 Ops/s | 39.8665 Ops/s | |
test_cat | 30.5586ms | 25.5284ms | 39.1720 Ops/s | 40.1367 Ops/s |
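
The Change column in the table above is empty, so the difference between the PR and repo HEAD has to be read off the two Ops columns by hand. The following is a minimal sketch of one way to do that, assuming the change is the relative throughput difference against HEAD; the report does not state how its Change column is computed, and the helper names `parse_ops` and `relative_change` are hypothetical, not part of any benchmark tooling.

```python
# Multipliers for the unit prefixes that appear in the Ops columns.
_OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '56.9241 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _OPS_UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput difference of the PR against HEAD (assumed formula).

    Positive means the PR run was faster than repo HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head


# A few rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_keys": ("282.3993 KOps/s", "246.7501 KOps/s"),
    "test_creation_empty": ("93.3674 KOps/s", "110.7458 KOps/s"),
    "test_membership": ("1.3342 MOps/s", "1.0521 MOps/s"),
}

for name, (pr, head) in rows.items():
    print(f"{name}: {relative_change(pr, head):+.1%}")
```

Single-run differences of a few percent in these tables are likely within measurement noise, so a comparison like this is most useful for spotting the larger gaps.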
vmoens added a commit that referenced this pull request on Nov 29, 2024:
ghstack-source-id: 55caa5d7c39e0f98c1e0558af2a076fee15f7984
Pull Request resolved: #1118
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 26.0500μs | 11.0223μs | 90.7255 KOps/s | 92.9009 KOps/s | |
test_plain_set_stack_nested | 33.7110μs | 10.9487μs | 91.3351 KOps/s | 92.0098 KOps/s | |
test_plain_set_nested_inplace | 45.4010μs | 11.8148μs | 84.6398 KOps/s | 84.8179 KOps/s | |
test_plain_set_stack_nested_inplace | 46.5610μs | 11.7985μs | 84.7569 KOps/s | 86.2045 KOps/s | |
test_items | 33.8800μs | 2.8963μs | 345.2735 KOps/s | 342.8511 KOps/s | |
test_items_nested | 0.3873ms | 0.3529ms | 2.8339 KOps/s | 2.8653 KOps/s | |
test_items_nested_locked | 0.3854ms | 0.3515ms | 2.8449 KOps/s | 2.8662 KOps/s | |
test_items_nested_leaf | 87.4610μs | 58.2732μs | 17.1606 KOps/s | 17.3396 KOps/s | |
test_items_stack_nested | 0.3911ms | 0.3509ms | 2.8502 KOps/s | 2.8499 KOps/s | |
test_items_stack_nested_leaf | 91.8110μs | 58.8856μs | 16.9821 KOps/s | 16.7199 KOps/s | |
test_items_stack_nested_locked | 0.3848ms | 0.3500ms | 2.8569 KOps/s | 2.8652 KOps/s | |
test_keys | 28.7500μs | 3.4309μs | 291.4698 KOps/s | 290.7440 KOps/s | |
test_keys_nested | 0.1010ms | 69.9521μs | 14.2955 KOps/s | 14.3021 KOps/s | |
test_keys_nested_locked | 0.8127ms | 75.3316μs | 13.2746 KOps/s | 13.2600 KOps/s | |
test_keys_nested_leaf | 94.3010μs | 61.0617μs | 16.3769 KOps/s | 16.2076 KOps/s | |
test_keys_stack_nested | 0.1019ms | 70.8065μs | 14.1230 KOps/s | 14.2093 KOps/s | |
test_keys_stack_nested_leaf | 96.9810μs | 61.2943μs | 16.3147 KOps/s | 16.0995 KOps/s | |
test_keys_stack_nested_locked | 0.1172ms | 75.8249μs | 13.1883 KOps/s | 13.0900 KOps/s | |
test_values | 4.6033μs | 0.8489μs | 1.1780 MOps/s | 1.1480 MOps/s | |
test_values_nested | 71.1410μs | 31.2360μs | 32.0143 KOps/s | 32.2005 KOps/s | |
test_values_nested_locked | 71.4510μs | 32.6532μs | 30.6248 KOps/s | 30.6686 KOps/s | |
test_values_nested_leaf | 59.1010μs | 33.4272μs | 29.9158 KOps/s | 29.8841 KOps/s | |
test_values_stack_nested | 56.4410μs | 31.5443μs | 31.7015 KOps/s | 31.4382 KOps/s | |
test_values_stack_nested_leaf | 69.7710μs | 33.8385μs | 29.5522 KOps/s | 29.1959 KOps/s | |
test_values_stack_nested_locked | 74.2510μs | 33.0397μs | 30.2666 KOps/s | 30.1143 KOps/s | |
test_membership | 2.0235μs | 0.5095μs | 1.9625 MOps/s | 1.9910 MOps/s | |
test_membership_nested | 19.5900μs | 1.9791μs | 505.2928 KOps/s | 531.4262 KOps/s | |
test_membership_nested_leaf | 18.6055μs | 1.9714μs | 507.2458 KOps/s | 517.5170 KOps/s | |
test_membership_stacked_nested | 34.1800μs | 2.0581μs | 485.8855 KOps/s | 496.9786 KOps/s | |
test_membership_stacked_nested_leaf | 48.3500μs | 2.0922μs | 477.9698 KOps/s | 501.8514 KOps/s | |
test_membership_nested_last | 38.7810μs | 2.9583μs | 338.0322 KOps/s | 348.0033 KOps/s | |
test_membership_nested_leaf_last | 26.0900μs | 2.9570μs | 338.1861 KOps/s | 347.2838 KOps/s | |
test_membership_stacked_nested_last | 25.5100μs | 2.9408μs | 340.0404 KOps/s | 297.2915 KOps/s | |
test_membership_stacked_nested_leaf_last | 55.3110μs | 2.9284μs | 341.4820 KOps/s | 301.4147 KOps/s | |
test_nested_getleaf | 30.8100μs | 6.0825μs | 164.4068 KOps/s | 164.5754 KOps/s | |
test_nested_get | 40.0910μs | 5.7630μs | 173.5201 KOps/s | 173.0323 KOps/s | |
test_stacked_getleaf | 40.1010μs | 6.0971μs | 164.0112 KOps/s | 165.1822 KOps/s | |
test_stacked_get | 31.1810μs | 5.7704μs | 173.2990 KOps/s | 173.1350 KOps/s | |
test_nested_getitemleaf | 0.7511ms | 6.1052μs | 163.7952 KOps/s | 162.5913 KOps/s | |
test_nested_getitem | 37.3310μs | 5.8377μs | 171.3004 KOps/s | 171.2466 KOps/s | |
test_stacked_getitemleaf | 0.3885ms | 6.1634μs | 162.2486 KOps/s | 162.2402 KOps/s | |
test_stacked_getitem | 27.9500μs | 5.8428μs | 171.1509 KOps/s | 171.3648 KOps/s | |
test_lock_nested | 9.3991ms | 0.3705ms | 2.6992 KOps/s | 2.7220 KOps/s | |
test_lock_stack_nested | 0.7214ms | 0.3312ms | 3.0191 KOps/s | 3.0443 KOps/s | |
test_unlock_nested | 0.7194ms | 0.3009ms | 3.3229 KOps/s | 3.3338 KOps/s | |
test_unlock_stack_nested | 0.6643ms | 0.2709ms | 3.6920 KOps/s | 3.7282 KOps/s | |
test_flatten_speed | 0.1150ms | 74.4722μs | 13.4278 KOps/s | 13.3838 KOps/s | |
test_unflatten_speed | 0.6876ms | 0.3030ms | 3.3003 KOps/s | 3.3534 KOps/s | |
test_common_ops | 1.6540ms | 0.5950ms | 1.6808 KOps/s | 1.6943 KOps/s | |
test_creation | 0.1757ms | 1.4162μs | 706.1339 KOps/s | 697.7189 KOps/s | |
test_creation_empty | 37.4000μs | 7.9120μs | 126.3903 KOps/s | 130.5294 KOps/s | |
test_creation_nested_1 | 34.1210μs | 9.5462μs | 104.7534 KOps/s | 108.7915 KOps/s | |
test_creation_nested_2 | 36.7800μs | 12.0786μs | 82.7910 KOps/s | 85.4482 KOps/s | |
test_clone | 57.5010μs | 10.2455μs | 97.6042 KOps/s | 95.2751 KOps/s | |
test_getitem[int] | 92.9929ms | 15.5477μs | 64.3182 KOps/s | 95.7063 KOps/s | |
test_getitem[slice_int] | 0.1049ms | 20.0667μs | 49.8338 KOps/s | 50.2216 KOps/s | |
test_getitem[range] | 0.1285ms | 35.7686μs | 27.9575 KOps/s | 26.3236 KOps/s | |
test_getitem[tuple] | 0.1046ms | 17.1682μs | 58.2473 KOps/s | 56.7023 KOps/s | |
test_getitem[list] | 0.2189ms | 31.5775μs | 31.6681 KOps/s | 30.2904 KOps/s | |
test_setitem_dim[int] | 37.0500μs | 17.8681μs | 55.9658 KOps/s | 54.0842 KOps/s | |
test_setitem_dim[slice_int] | 61.0010μs | 36.7094μs | 27.2410 KOps/s | 26.9746 KOps/s | |
test_setitem_dim[range] | 76.0710μs | 50.8721μs | 19.6571 KOps/s | 19.1068 KOps/s | |
test_setitem_dim[tuple] | 58.4910μs | 30.3960μs | 32.8990 KOps/s | 31.8455 KOps/s | |
test_setitem | 84.3420μs | 14.6196μs | 68.4013 KOps/s | 66.5654 KOps/s | |
test_set | 96.1910μs | 14.3789μs | 69.5465 KOps/s | 69.3309 KOps/s | |
test_set_shared | 1.4763ms | 0.1444ms | 6.9272 KOps/s | 6.8823 KOps/s | |
test_update | 0.6485ms | 17.0839μs | 58.5348 KOps/s | 58.5338 KOps/s | |
test_update_nested | 89.1320μs | 21.7210μs | 46.0385 KOps/s | 45.0854 KOps/s | |
test_update__nested | 0.7610ms | 23.6804μs | 42.2290 KOps/s | 41.2018 KOps/s | |
test_set_nested | 85.0720μs | 15.3814μs | 65.0135 KOps/s | 64.2435 KOps/s | |
test_set_nested_new | 0.1015ms | 17.4464μs | 57.3183 KOps/s | 54.9287 KOps/s | |
test_select | 90.6120μs | 29.5601μs | 33.8293 KOps/s | 33.2074 KOps/s | |
test_select_nested | 91.0310μs | 41.2689μs | 24.2313 KOps/s | 24.0493 KOps/s | |
test_exclude_nested | 96.9420μs | 59.2190μs | 16.8865 KOps/s | 16.9276 KOps/s | |
test_empty[True] | 0.3090ms | 0.2685ms | 3.7242 KOps/s | 3.7025 KOps/s | |
test_empty[False] | 5.3461μs | 0.7389μs | 1.3533 MOps/s | 1.3503 MOps/s | |
test_to | 85.4810μs | 52.9633μs | 18.8810 KOps/s | 18.5573 KOps/s | |
test_to_nonblocking | 76.9510μs | 45.1119μs | 22.1671 KOps/s | 20.8234 KOps/s | |
test_unbind_speed | 0.2587ms | 0.2268ms | 4.4089 KOps/s | 4.4400 KOps/s | |
test_unbind_speed_stack0 | 0.3304ms | 0.2254ms | 4.4371 KOps/s | 4.3853 KOps/s | |
test_unbind_speed_stack1 | 92.5130ms | 0.6434ms | 1.5543 KOps/s | 1.5607 KOps/s | |
test_split | 93.8196ms | 1.6762ms | 596.5977 Ops/s | 584.8149 Ops/s | |
test_chunk | 96.4195ms | 1.5600ms | 641.0322 Ops/s | 693.1673 Ops/s | |
test_consolidate[False-None] | 2.6493ms | 2.5695ms | 389.1876 Ops/s | 350.6690 Ops/s | |
test_consolidate[default-None] | 1.8096ms | 1.6510ms | 605.6769 Ops/s | 594.3022 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7554ms | 1.6914ms | 591.2312 Ops/s | 576.3324 Ops/s | |
test_consolidate_njt[False-None] | 6.6730ms | 6.3885ms | 156.5322 Ops/s | 156.8570 Ops/s | |
test_to[False-False-None] | 1.7701ms | 1.6834ms | 594.0210 Ops/s | 577.0158 Ops/s | |
test_to[True-False-None] | 1.5043ms | 1.2637ms | 791.3137 Ops/s | 774.4353 Ops/s | |
test_to[within-False-None] | 0.2916s | 5.1795ms | 193.0690 Ops/s | 252.9487 Ops/s | |
test_to[True-default-None] | 5.3816ms | 5.0941ms | 196.3055 Ops/s | 191.8540 Ops/s | |
test_to_njt[False-False-None] | 6.9438ms | 6.8324ms | 146.3625 Ops/s | 145.3686 Ops/s | |
test_to_njt[True-False-None] | 6.0032ms | 5.4224ms | 184.4185 Ops/s | 183.3660 Ops/s | |
test_to_njt[within-False-None] | 11.9939ms | 11.7700ms | 84.9617 Ops/s | 84.8540 Ops/s | |
test_creation[device0] | 0.5800ms | 78.5332μs | 12.7335 KOps/s | 12.4178 KOps/s | |
test_creation_from_tensor | 0.4699ms | 80.6563μs | 12.3983 KOps/s | 12.1472 KOps/s | |
test_add_one[memmap_tensor0] | 0.2901ms | 6.6587μs | 150.1803 KOps/s | 148.2092 KOps/s | |
test_contiguous[memmap_tensor0] | 1.7695μs | 0.3932μs | 2.5434 MOps/s | 2.5738 MOps/s | |
test_stack[memmap_tensor0] | 41.4110μs | 4.1203μs | 242.6994 KOps/s | 229.3789 KOps/s | |
test_memmaptd_index | 1.9910ms | 0.2453ms | 4.0772 KOps/s | 4.0022 KOps/s | |
test_memmaptd_index_astensor | 0.9649ms | 0.2960ms | 3.3784 KOps/s | 3.2627 KOps/s | |
test_memmaptd_index_op | 1.1045ms | 0.5639ms | 1.7732 KOps/s | 1.7190 KOps/s | |
test_serialize_model | 0.1315s | 0.1300s | 7.6931 Ops/s | 7.7262 Ops/s | |
test_serialize_model_pickle | 1.3465s | 1.1868s | 0.8426 Ops/s | 0.8252 Ops/s | |
test_serialize_weights | 0.1305s | 0.1298s | 7.7058 Ops/s | 7.7287 Ops/s | |
test_serialize_weights_returnearly | 48.9859ms | 39.5732ms | 25.2696 Ops/s | 23.3636 Ops/s | |
test_serialize_weights_pickle | 1.3500s | 1.2128s | 0.8245 Ops/s | 0.8210 Ops/s | |
test_reshape_pytree | 62.3710μs | 21.4635μs | 46.5908 KOps/s | 46.0688 KOps/s | |
test_reshape_td | 91.0510μs | 25.9026μs | 38.6061 KOps/s | 38.5314 KOps/s | |
test_view_pytree | 0.1589ms | 21.4921μs | 46.5287 KOps/s | 46.6435 KOps/s | |
test_view_td | 60.9310μs | 29.4466μs | 33.9598 KOps/s | 32.7994 KOps/s | |
test_unbind_pytree | 51.9810μs | 27.3246μs | 36.5971 KOps/s | 36.0964 KOps/s | |
test_unbind_td | 0.5978ms | 35.4813μs | 28.1838 KOps/s | 28.1339 KOps/s | |
test_split_pytree | 56.1910μs | 28.9911μs | 34.4933 KOps/s | 34.5595 KOps/s | |
test_split_td | 0.1843ms | 36.7004μs | 27.2476 KOps/s | 26.4372 KOps/s | |
test_add_pytree | 0.1194ms | 33.1364μs | 30.1783 KOps/s | 29.5141 KOps/s | |
test_add_td | 88.8310μs | 44.7957μs | 22.3236 KOps/s | 23.3904 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1707ms | 0.1180ms | 8.4730 KOps/s | 8.0876 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2130ms | 0.1215ms | 8.2295 KOps/s | 8.0471 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1326ms | 94.5347μs | 10.5781 KOps/s | 10.2434 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.0952ms | 0.1479ms | 6.7615 KOps/s | 6.6079 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 55.8410μs | 22.7213μs | 44.0115 KOps/s | 43.8482 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 85.1220μs | 25.8927μs | 38.6209 KOps/s | 37.7770 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4357ms | 63.3612μs | 15.7825 KOps/s | 15.5856 KOps/s | |
test_compile_copy_nested[pytree-eager] | 87.2710μs | 48.9006μs | 20.4496 KOps/s | 20.1436 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1876ms | 0.1422ms | 7.0339 KOps/s | 6.7613 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3169ms | 0.2088ms | 4.7898 KOps/s | 4.8140 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1431ms | 99.8145μs | 10.0186 KOps/s | 10.2668 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1382ms | 50.0036μs | 19.9985 KOps/s | 19.7863 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1903ms | 0.1360ms | 7.3546 KOps/s | 7.2883 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5817ms | 0.4787ms | 2.0891 KOps/s | 2.0520 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3682ms | 0.2452ms | 4.0787 KOps/s | 4.0493 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2476ms | 0.1470ms | 6.8010 KOps/s | 6.9421 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1757ms | 61.4333μs | 16.2778 KOps/s | 16.4469 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1493ms | 98.2023μs | 10.1831 KOps/s | 10.1389 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4874ms | 0.4054ms | 2.4667 KOps/s | 2.4174 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1725ms | 0.1370ms | 7.3008 KOps/s | 7.4126 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 53.9710μs | 18.2816μs | 54.6998 KOps/s | 42.5452 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 75.2110μs | 26.7448μs | 37.3904 KOps/s | 37.8889 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1020ms | 69.2380μs | 14.4429 KOps/s | 14.6107 KOps/s | |
test_compile_copy_flat[pytree-eager] | 96.1120μs | 50.7765μs | 19.6942 KOps/s | 19.5742 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6007ms | 0.3882ms | 2.5761 KOps/s | 2.2411 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6994ms | 2.6022ms | 384.2870 Ops/s | 380.3197 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5597ms | 0.3744ms | 2.6712 KOps/s | 2.2721 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7505ms | 2.6580ms | 376.2240 Ops/s | 372.5106 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.6549ms | 0.1132ms | 8.8309 KOps/s | 8.7207 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5513ms | 78.6618μs | 12.7126 KOps/s | 12.6444 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.3251ms | 0.1064ms | 9.3970 KOps/s | 8.9838 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1054ms | 67.2123μs | 14.8782 KOps/s | 14.2384 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1668ms | 0.1067ms | 9.3745 KOps/s | 8.9425 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1061ms | 67.6097μs | 14.7908 KOps/s | 14.1630 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1394ms | 99.1323μs | 10.0875 KOps/s | 9.9367 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1420ms | 16.5953μs | 60.2580 KOps/s | 59.0751 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1379ms | 94.2783μs | 10.6069 KOps/s | 10.4424 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 50.4310μs | 15.4939μs | 64.5417 KOps/s | 64.1663 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2237ms | 95.7753μs | 10.4411 KOps/s | 10.4505 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 51.2110μs | 15.3505μs | 65.1443 KOps/s | 62.1145 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1674ms | 99.7366μs | 10.0264 KOps/s | 9.8966 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5985ms | 16.4235μs | 60.8885 KOps/s | 58.9186 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1543ms | 95.1127μs | 10.5138 KOps/s | 10.2986 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 47.3210μs | 15.3174μs | 65.2852 KOps/s | 64.7845 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1367ms | 95.5803μs | 10.4624 KOps/s | 10.4983 KOps/s | |
test_compile_indexing[int-pytree-eager] | 51.8610μs | 15.2492μs | 65.5774 KOps/s | 64.7428 KOps/s | |
test_mod_add[eager] | 78.5210μs | 36.4018μs | 27.4712 KOps/s | 27.1484 KOps/s | |
test_mod_add[compile] | 0.1611ms | 84.7502μs | 11.7994 KOps/s | 12.5228 KOps/s | |
test_mod_add[compile-overhead] | 0.3309ms | 0.1667ms | 6.0006 KOps/s | 5.7554 KOps/s | |
test_mod_wrap[eager] | 0.3863ms | 0.2466ms | 4.0559 KOps/s | 3.9688 KOps/s | |
test_mod_wrap[compile] | 0.3456ms | 0.2798ms | 3.5734 KOps/s | 3.4930 KOps/s | |
test_mod_wrap[compile-overhead] | 7.0751ms | 3.7633ms | 265.7278 Ops/s | 265.1473 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4610ms | 1.3574ms | 736.6808 Ops/s | 677.5424 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3851ms | 1.2544ms | 797.1666 Ops/s | 721.8783 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3576ms | 0.9250ms | 1.0811 KOps/s | 981.8195 Ops/s | |
test_seq_add[eager] | 0.1638ms | 0.1144ms | 8.7401 KOps/s | 8.7650 KOps/s | |
test_seq_add[compile] | 0.1684ms | 90.3926μs | 11.0628 KOps/s | 11.1204 KOps/s | |
test_seq_add[compile-overhead] | 0.1692ms | 0.1274ms | 7.8500 KOps/s | 7.9276 KOps/s | |
test_seq_wrap[eager] | 0.4855ms | 0.4091ms | 2.4441 KOps/s | 2.3315 KOps/s | |
test_seq_wrap[compile] | 0.4015ms | 0.2968ms | 3.3687 KOps/s | 3.3143 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2749ms | 0.2225ms | 4.4941 KOps/s | 4.4408 KOps/s | |
test_func_call_runtime[False-eager] | 0.8028ms | 0.7386ms | 1.3539 KOps/s | 1.3103 KOps/s | |
test_func_call_runtime[False-compile] | 1.0303ms | 0.7318ms | 1.3665 KOps/s | 1.3304 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4199ms | 0.3558ms | 2.8105 KOps/s | 2.8011 KOps/s | |
test_func_call_runtime[True-eager] | 0.9855ms | 0.8990ms | 1.1123 KOps/s | 1.0939 KOps/s | |
test_func_call_runtime[True-compile] | 0.8184ms | 0.7467ms | 1.3393 KOps/s | 1.3004 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4509ms | 0.3758ms | 2.6607 KOps/s | 2.6618 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8217ms | 0.7662ms | 1.3051 KOps/s | 1.3504 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.7950ms | 0.7318ms | 1.3664 KOps/s | 1.3302 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4034ms | 0.3574ms | 2.7983 KOps/s | 2.7942 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0974ms | 0.9972ms | 1.0028 KOps/s | 988.8271 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.9241ms | 0.7798ms | 1.2824 KOps/s | 1.2512 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4545ms | 0.4015ms | 2.4905 KOps/s | 2.4800 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5265ms | 2.0816ms | 480.3954 Ops/s | 477.3523 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8672ms | 0.7946ms | 1.2585 KOps/s | 1.2291 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4743ms | 0.4037ms | 2.4772 KOps/s | 2.4675 KOps/s | |
test_distributed | 2.8188ms | 0.2105ms | 4.7499 KOps/s | 8.8704 KOps/s | |
test_tdmodule | 54.7310μs | 19.3627μs | 51.6457 KOps/s | 51.6601 KOps/s | |
test_tdmodule_dispatch | 88.7610μs | 34.9961μs | 28.5746 KOps/s | 29.3002 KOps/s | |
test_tdseq | 43.2710μs | 19.2796μs | 51.8683 KOps/s | 52.0132 KOps/s | |
test_tdseq_dispatch | 60.6910μs | 36.6242μs | 27.3044 KOps/s | 27.2952 KOps/s | |
test_instantiation_functorch | 1.7101ms | 1.4932ms | 669.7001 Ops/s | 648.6043 Ops/s | |
test_exec_functorch | 0.1910ms | 0.1375ms | 7.2711 KOps/s | 6.9379 KOps/s | |
test_exec_functional_call | 0.1752ms | 0.1308ms | 7.6447 KOps/s | 7.2072 KOps/s | |
test_exec_td_decorator | 0.3614ms | 0.1781ms | 5.6138 KOps/s | 5.4537 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7795ms | 0.7008ms | 1.4269 KOps/s | 1.4623 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8781ms | 0.6818ms | 1.4666 KOps/s | 1.4651 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7975ms | 0.6019ms | 1.6615 KOps/s | 1.6761 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7282ms | 0.6115ms | 1.6354 KOps/s | 1.6761 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1032ms | 19.2628ms | 51.9135 Ops/s | 51.8116 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2065ms | 19.7039ms | 50.7513 Ops/s | 51.8385 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.0857ms | 19.3097ms | 51.7874 Ops/s | 52.1480 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.1365ms | 19.2304ms | 52.0011 Ops/s | 52.1280 Ops/s | |
test_to_module_speed[True] | 1.0406ms | 0.9265ms | 1.0793 KOps/s | 1.0662 KOps/s | |
test_to_module_speed[False] | 1.2143ms | 0.9072ms | 1.1023 KOps/s | 1.0866 KOps/s | |
test_tc_init | 67.1310μs | 33.5903μs | 29.7705 KOps/s | 28.1818 KOps/s | |
test_tc_init_nested | 0.1531ms | 68.3522μs | 14.6301 KOps/s | 13.4406 KOps/s | |
test_tc_first_layer_tensor | 3.8716μs | 0.6881μs | 1.4533 MOps/s | 1.4434 MOps/s | |
test_tc_first_layer_nontensor | 22.7010μs | 2.2866μs | 437.3248 KOps/s | 435.1903 KOps/s | |
test_tc_second_layer_tensor | 6.6100μs | 1.3902μs | 719.3031 KOps/s | 708.7440 KOps/s | |
test_tc_second_layer_nontensor | 20.9700μs | 3.0327μs | 329.7435 KOps/s | 328.0304 KOps/s | |
test_unbind | 0.2297s | 10.0905ms | 99.1031 Ops/s | 151.8510 Ops/s | |
test_full_like | 9.3884ms | 9.1746ms | 108.9968 Ops/s | 108.3733 Ops/s | |
test_zeros_like | 4.8671ms | 4.3224ms | 231.3554 Ops/s | 113.9843 Ops/s | |
test_ones_like | 5.4575ms | 4.3274ms | 231.0874 Ops/s | 235.6219 Ops/s | |
test_clone | 6.6878ms | 6.4400ms | 155.2798 Ops/s | 155.2020 Ops/s | |
test_squeeze | 56.7010μs | 9.3882μs | 106.5163 KOps/s | 111.4314 KOps/s | |
test_unsqueeze | 0.2137ms | 70.3517μs | 14.2143 KOps/s | 14.5195 KOps/s | |
test_split | 0.2594ms | 0.1524ms | 6.5638 KOps/s | 6.4092 KOps/s | |
test_permute | 0.2203ms | 0.1721ms | 5.8117 KOps/s | 5.7378 KOps/s | |
test_stack | 50.7144ms | 50.4273ms | 19.8305 Ops/s | 19.8731 Ops/s | |
test_cat | 53.5734ms | 50.8799ms | 19.6541 Ops/s | 19.9377 Ops/s |
vmoens added a commit that referenced this pull request on Dec 2, 2024
ghstack-source-id: 55caa5d7c39e0f98c1e0558af2a076fee15f7984
Pull Request resolved: #1118
Stack from ghstack (oldest at bottom):