[Feature] TD3-bc compatibility with compile #2657

Merged on Dec 16, 2024 (14 commits)

@vmoens (Contributor) commented Dec 16, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2657

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 7 Unrelated Failures

As of commit 8206ec0 with merge base 87a59fb:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label ("This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.") Dec 16, 2024
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 530b321b67ed62812f76c3ac37cf1e9b82961531
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 0db0897be82b4c663ff596db895e1a63fc0c5b5d
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 75c666ded2eae7457f181f4a257e1e20d3209308
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: f51ecaf8052d073ff566d8ccd71cb6075f2dd3b0
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 2114a0cc0074c53b5a5754f80662c3126fe7215c
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 62789d7d412acdc46dd8419ea0cbb7f332968409
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 7a30f7c8591f4f029379fe415ca2b93976691862
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 4d87db760f280a4bddf4f7b54a7bc1819f5fa2b9
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: d45bbc707def700e72d2bfc7515c631f8a337451
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 945a6f3d039f0653d09a23efca175144bfc6fa0a
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: ea9b13d63c2e63f6465dfe828d202d7eb3ece193
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: a210a36df2e3da3426f1c06766f6185817b0ed29
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: ea5ddee5919663234d8fb84c57267fc234964c79
Pull Request resolved: #2657
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 8a33e39829f620c1e1a579a0255162ba93eaca91
Pull Request resolved: #2657
@vmoens added the enhancement label ("New feature or request") Dec 16, 2024
@vmoens merged commit 8206ec0 into gh/vmoens/58/base Dec 16, 2024
61 of 66 checks passed
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 8a33e39829f620c1e1a579a0255162ba93eaca91
Pull Request resolved: #2657
@vmoens deleted the gh/vmoens/58/head branch December 16, 2024 04:14

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 16. Worsened: 4.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4305s 0.4286s 2.3331 Ops/s 2.1945 Ops/s $\textbf{\color{#35bf28}+6.32\%}$
test_transformed 0.6036s 0.6016s 1.6621 Ops/s 1.5951 Ops/s $\color{#35bf28}+4.20\%$
test_serial 1.3500s 1.3450s 0.7435 Ops/s 0.7283 Ops/s $\color{#35bf28}+2.09\%$
test_parallel 1.3817s 1.3030s 0.7675 Ops/s 0.7520 Ops/s $\color{#35bf28}+2.06\%$
test_step_mdp_speed[True-True-True-True-True] 0.3609ms 29.0596μs 34.4121 KOps/s 33.5514 KOps/s $\color{#35bf28}+2.57\%$
test_step_mdp_speed[True-True-True-True-False] 75.7020μs 17.1801μs 58.2067 KOps/s 57.3838 KOps/s $\color{#35bf28}+1.43\%$
test_step_mdp_speed[True-True-True-False-True] 47.7600μs 16.3518μs 61.1553 KOps/s 59.4005 KOps/s $\color{#35bf28}+2.95\%$
test_step_mdp_speed[True-True-True-False-False] 31.3690μs 9.6605μs 103.5148 KOps/s 100.4622 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[True-True-False-True-True] 72.3860μs 31.4912μs 31.7549 KOps/s 31.3210 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[True-True-False-True-False] 52.6580μs 19.4871μs 51.3159 KOps/s 51.4454 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-True-False-False-True] 41.5080μs 18.3691μs 54.4392 KOps/s 52.5359 KOps/s $\color{#35bf28}+3.62\%$
test_step_mdp_speed[True-True-False-False-False] 35.5260μs 11.6095μs 86.1362 KOps/s 86.1433 KOps/s $-0.01\%$
test_step_mdp_speed[True-False-True-True-True] 72.8970μs 33.3750μs 29.9625 KOps/s 29.8749 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[True-False-True-True-False] 46.4370μs 21.2202μs 47.1250 KOps/s 47.1471 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[True-False-True-False-True] 60.2030μs 18.6558μs 53.6027 KOps/s 53.8316 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-True-False-False] 37.2400μs 11.5821μs 86.3400 KOps/s 85.5559 KOps/s $\color{#35bf28}+0.92\%$
test_step_mdp_speed[True-False-False-True-True] 0.1098ms 34.9070μs 28.6476 KOps/s 28.2657 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[True-False-False-True-False] 44.8330μs 22.7503μs 43.9555 KOps/s 43.5068 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[True-False-False-False-True] 60.9240μs 20.0087μs 49.9781 KOps/s 49.3019 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-False-False-False-False] 52.2780μs 13.2525μs 75.4575 KOps/s 75.3342 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[False-True-True-True-True] 0.2457ms 35.1531μs 28.4470 KOps/s 29.5111 KOps/s $\color{#d91a1a}-3.61\%$
test_step_mdp_speed[False-True-True-True-False] 52.6580μs 21.0032μs 47.6118 KOps/s 43.8865 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_step_mdp_speed[False-True-True-False-True] 58.0290μs 21.3056μs 46.9360 KOps/s 47.6584 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[False-True-True-False-False] 33.9730μs 12.7911μs 78.1792 KOps/s 78.2526 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-True-False-True-True] 75.6410μs 34.8476μs 28.6964 KOps/s 28.2760 KOps/s $\color{#35bf28}+1.49\%$
test_step_mdp_speed[False-True-False-True-False] 50.6150μs 22.7874μs 43.8840 KOps/s 43.7818 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-True-False-False-True] 2.8466ms 23.3422μs 42.8408 KOps/s 44.1560 KOps/s $\color{#d91a1a}-2.98\%$
test_step_mdp_speed[False-True-False-False-False] 41.1170μs 14.5402μs 68.7749 KOps/s 68.5907 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-False-True-True-True] 83.7460μs 36.9046μs 27.0969 KOps/s 26.9326 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[False-False-True-True-False] 0.1595ms 24.6748μs 40.5271 KOps/s 40.4902 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-False-True-False-True] 0.1090ms 22.6233μs 44.2022 KOps/s 43.5397 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[False-False-True-False-False] 0.6398ms 14.6577μs 68.2237 KOps/s 68.8289 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[False-False-False-True-True] 74.8110μs 38.0640μs 26.2715 KOps/s 25.9902 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[False-False-False-True-False] 55.8440μs 26.1890μs 38.1840 KOps/s 38.3208 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[False-False-False-False-True] 62.9470μs 24.3725μs 41.0298 KOps/s 41.5361 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-False-False-False-False] 52.4480μs 16.1878μs 61.7748 KOps/s 61.7577 KOps/s $\color{#35bf28}+0.03\%$
test_values[generalized_advantage_estimate-True-True] 12.4954ms 9.6852ms 103.2498 Ops/s 105.7452 Ops/s $\color{#d91a1a}-2.36\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.1762ms 33.8169ms 29.5710 Ops/s 28.1794 Ops/s $\color{#35bf28}+4.94\%$
test_values[td0_return_estimate-False-False] 0.3159ms 0.1979ms 5.0536 KOps/s 5.4486 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_values[td1_return_estimate-False-False] 27.2168ms 24.0923ms 41.5071 Ops/s 42.0853 Ops/s $\color{#d91a1a}-1.37\%$
test_values[vec_td1_return_estimate-False-False] 35.9076ms 33.4947ms 29.8555 Ops/s 28.1275 Ops/s $\textbf{\color{#35bf28}+6.14\%}$
test_values[td_lambda_return_estimate-True-False] 38.5497ms 34.7542ms 28.7735 Ops/s 29.2790 Ops/s $\color{#d91a1a}-1.73\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.2503ms 33.5387ms 29.8163 Ops/s 28.1042 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.5926ms 8.3767ms 119.3791 Ops/s 120.9762 Ops/s $\color{#d91a1a}-1.32\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3938ms 1.9583ms 510.6555 Ops/s 539.3720 Ops/s $\textbf{\color{#d91a1a}-5.32\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5936ms 0.3586ms 2.7887 KOps/s 2.7964 KOps/s $\color{#d91a1a}-0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.0402ms 48.2306ms 20.7337 Ops/s 20.0678 Ops/s $\color{#35bf28}+3.32\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.3723ms 3.0300ms 330.0334 Ops/s 313.3887 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_dqn_speed[False-None] 5.8980ms 1.3711ms 729.3592 Ops/s 717.0549 Ops/s $\color{#35bf28}+1.72\%$
test_dqn_speed[False-backward] 2.4940ms 1.8828ms 531.1270 Ops/s 535.1949 Ops/s $\color{#d91a1a}-0.76\%$
test_dqn_speed[True-None] 0.7261ms 0.4654ms 2.1488 KOps/s 2.1158 KOps/s $\color{#35bf28}+1.56\%$
test_dqn_speed[True-backward] 0.9737ms 0.8811ms 1.1349 KOps/s 828.3730 Ops/s $\textbf{\color{#35bf28}+37.00\%}$
test_dqn_speed[reduce-overhead-None] 0.8566ms 0.4684ms 2.1351 KOps/s 2.1375 KOps/s $\color{#d91a1a}-0.11\%$
test_dqn_speed[reduce-overhead-backward] 0.9351ms 0.8781ms 1.1388 KOps/s 1.1105 KOps/s $\color{#35bf28}+2.54\%$
test_ddpg_speed[False-None] 3.6227ms 2.8963ms 345.2710 Ops/s 348.5998 Ops/s $\color{#d91a1a}-0.95\%$
test_ddpg_speed[False-backward] 4.2916ms 3.9841ms 250.9989 Ops/s 250.2042 Ops/s $\color{#35bf28}+0.32\%$
test_ddpg_speed[True-None] 1.6850ms 1.0160ms 984.2341 Ops/s 990.7216 Ops/s $\color{#d91a1a}-0.65\%$
test_ddpg_speed[True-backward] 1.9856ms 1.9235ms 519.8980 Ops/s 454.2110 Ops/s $\textbf{\color{#35bf28}+14.46\%}$
test_ddpg_speed[reduce-overhead-None] 1.3838ms 1.0060ms 993.9892 Ops/s 990.5173 Ops/s $\color{#35bf28}+0.35\%$
test_ddpg_speed[reduce-overhead-backward] 1.9308ms 1.8956ms 527.5455 Ops/s 520.4276 Ops/s $\color{#35bf28}+1.37\%$
test_sac_speed[False-None] 9.1979ms 8.0008ms 124.9872 Ops/s 122.4288 Ops/s $\color{#35bf28}+2.09\%$
test_sac_speed[False-backward] 11.0565ms 10.7326ms 93.1742 Ops/s 91.2236 Ops/s $\color{#35bf28}+2.14\%$
test_sac_speed[True-None] 2.0440ms 1.8182ms 549.9804 Ops/s 545.4180 Ops/s $\color{#35bf28}+0.84\%$
test_sac_speed[True-backward] 3.7119ms 3.5374ms 282.6904 Ops/s 280.0774 Ops/s $\color{#35bf28}+0.93\%$
test_sac_speed[reduce-overhead-None] 2.5026ms 1.8370ms 544.3683 Ops/s 542.6021 Ops/s $\color{#35bf28}+0.33\%$
test_sac_speed[reduce-overhead-backward] 3.6759ms 3.5684ms 280.2382 Ops/s 280.0872 Ops/s $\color{#35bf28}+0.05\%$
test_redq_speed[False-None] 19.2684ms 13.4043ms 74.6031 Ops/s 70.5154 Ops/s $\textbf{\color{#35bf28}+5.80\%}$
test_redq_speed[False-backward] 25.9526ms 22.3611ms 44.7206 Ops/s 42.5934 Ops/s $\color{#35bf28}+4.99\%$
test_redq_speed[True-None] 5.5311ms 4.5811ms 218.2862 Ops/s 180.3496 Ops/s $\textbf{\color{#35bf28}+21.04\%}$
test_redq_speed[True-backward] 12.5026ms 12.2131ms 81.8793 Ops/s 80.6788 Ops/s $\color{#35bf28}+1.49\%$
test_redq_speed[reduce-overhead-None] 5.2739ms 4.5693ms 218.8516 Ops/s 203.4472 Ops/s $\textbf{\color{#35bf28}+7.57\%}$
test_redq_speed[reduce-overhead-backward] 12.9847ms 12.3106ms 81.2311 Ops/s 79.9422 Ops/s $\color{#35bf28}+1.61\%$
test_redq_deprec_speed[False-None] 14.5953ms 12.8043ms 78.0988 Ops/s 75.6448 Ops/s $\color{#35bf28}+3.24\%$
test_redq_deprec_speed[False-backward] 20.1145ms 18.5968ms 53.7727 Ops/s 52.0846 Ops/s $\color{#35bf28}+3.24\%$
test_redq_deprec_speed[True-None] 4.3256ms 3.6184ms 276.3634 Ops/s 271.8395 Ops/s $\color{#35bf28}+1.66\%$
test_redq_deprec_speed[True-backward] 8.2336ms 7.9827ms 125.2709 Ops/s 119.7558 Ops/s $\color{#35bf28}+4.61\%$
test_redq_deprec_speed[reduce-overhead-None] 4.9429ms 3.6071ms 277.2305 Ops/s 262.3498 Ops/s $\textbf{\color{#35bf28}+5.67\%}$
test_redq_deprec_speed[reduce-overhead-backward] 11.0754ms 8.1405ms 122.8419 Ops/s 118.2368 Ops/s $\color{#35bf28}+3.89\%$
test_td3_speed[False-None] 9.1165ms 7.9908ms 125.1445 Ops/s 121.5181 Ops/s $\color{#35bf28}+2.98\%$
test_td3_speed[False-backward] 12.6563ms 10.5411ms 94.8669 Ops/s 93.3079 Ops/s $\color{#35bf28}+1.67\%$
test_td3_speed[True-None] 1.7717ms 1.6999ms 588.2741 Ops/s 564.8527 Ops/s $\color{#35bf28}+4.15\%$
test_td3_speed[True-backward] 3.3711ms 3.3104ms 302.0758 Ops/s 291.9134 Ops/s $\color{#35bf28}+3.48\%$
test_td3_speed[reduce-overhead-None] 1.9007ms 1.6974ms 589.1382 Ops/s 564.4677 Ops/s $\color{#35bf28}+4.37\%$
test_td3_speed[reduce-overhead-backward] 3.3763ms 3.3093ms 302.1765 Ops/s 291.7873 Ops/s $\color{#35bf28}+3.56\%$
test_cql_speed[False-None] 38.4833ms 36.3346ms 27.5220 Ops/s 26.9654 Ops/s $\color{#35bf28}+2.06\%$
test_cql_speed[False-backward] 50.8317ms 47.0684ms 21.2457 Ops/s 21.1855 Ops/s $\color{#35bf28}+0.28\%$
test_cql_speed[True-None] 18.8506ms 15.7670ms 63.4238 Ops/s 62.2687 Ops/s $\color{#35bf28}+1.85\%$
test_cql_speed[True-backward] 23.6991ms 22.3757ms 44.6914 Ops/s 43.7720 Ops/s $\color{#35bf28}+2.10\%$
test_cql_speed[reduce-overhead-None] 16.6021ms 15.5884ms 64.1502 Ops/s 61.4382 Ops/s $\color{#35bf28}+4.41\%$
test_cql_speed[reduce-overhead-backward] 23.3725ms 22.5731ms 44.3006 Ops/s 43.8159 Ops/s $\color{#35bf28}+1.11\%$
test_a2c_speed[False-None] 9.0144ms 7.4674ms 133.9156 Ops/s 136.3530 Ops/s $\color{#d91a1a}-1.79\%$
test_a2c_speed[False-backward] 21.9047ms 15.2099ms 65.7466 Ops/s 69.0443 Ops/s $\color{#d91a1a}-4.78\%$
test_a2c_speed[True-None] 4.8077ms 4.1973ms 238.2481 Ops/s 227.1679 Ops/s $\color{#35bf28}+4.88\%$
test_a2c_speed[True-backward] 11.2181ms 10.7832ms 92.7367 Ops/s 91.4278 Ops/s $\color{#35bf28}+1.43\%$
test_a2c_speed[reduce-overhead-None] 5.0550ms 4.1836ms 239.0275 Ops/s 229.0280 Ops/s $\color{#35bf28}+4.37\%$
test_a2c_speed[reduce-overhead-backward] 12.4170ms 10.8499ms 92.1670 Ops/s 92.2554 Ops/s $\color{#d91a1a}-0.10\%$
test_ppo_speed[False-None] 9.3561ms 7.5257ms 132.8783 Ops/s 132.5202 Ops/s $\color{#35bf28}+0.27\%$
test_ppo_speed[False-backward] 16.5353ms 14.9784ms 66.7629 Ops/s 65.9470 Ops/s $\color{#35bf28}+1.24\%$
test_ppo_speed[True-None] 4.1183ms 3.6757ms 272.0534 Ops/s 263.4782 Ops/s $\color{#35bf28}+3.25\%$
test_ppo_speed[True-backward] 10.0970ms 9.6661ms 103.4543 Ops/s 103.6539 Ops/s $\color{#d91a1a}-0.19\%$
test_ppo_speed[reduce-overhead-None] 4.3980ms 3.6643ms 272.9048 Ops/s 269.2827 Ops/s $\color{#35bf28}+1.35\%$
test_ppo_speed[reduce-overhead-backward] 10.1623ms 9.6203ms 103.9469 Ops/s 101.9845 Ops/s $\color{#35bf28}+1.92\%$
test_reinforce_speed[False-None] 7.6789ms 6.5897ms 151.7509 Ops/s 150.4512 Ops/s $\color{#35bf28}+0.86\%$
test_reinforce_speed[False-backward] 10.2141ms 9.8381ms 101.6461 Ops/s 99.1491 Ops/s $\color{#35bf28}+2.52\%$
test_reinforce_speed[True-None] 3.1331ms 2.6465ms 377.8556 Ops/s 372.5589 Ops/s $\color{#35bf28}+1.42\%$
test_reinforce_speed[True-backward] 9.1920ms 8.5740ms 116.6315 Ops/s 113.8240 Ops/s $\color{#35bf28}+2.47\%$
test_reinforce_speed[reduce-overhead-None] 3.1583ms 2.6835ms 372.6526 Ops/s 366.9712 Ops/s $\color{#35bf28}+1.55\%$
test_reinforce_speed[reduce-overhead-backward] 8.9603ms 8.5374ms 117.1316 Ops/s 114.9586 Ops/s $\color{#35bf28}+1.89\%$
test_iql_speed[False-None] 32.8880ms 32.0400ms 31.2110 Ops/s 30.8255 Ops/s $\color{#35bf28}+1.25\%$
test_iql_speed[False-backward] 46.3654ms 45.3626ms 22.0446 Ops/s 22.0196 Ops/s $\color{#35bf28}+0.11\%$
test_iql_speed[True-None] 11.2207ms 10.5677ms 94.6280 Ops/s 91.2373 Ops/s $\color{#35bf28}+3.72\%$
test_iql_speed[True-backward] 23.0912ms 22.1542ms 45.1381 Ops/s 45.2856 Ops/s $\color{#d91a1a}-0.33\%$
test_iql_speed[reduce-overhead-None] 11.5552ms 10.4902ms 95.3272 Ops/s 89.4150 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_iql_speed[reduce-overhead-backward] 22.3382ms 21.5115ms 46.4867 Ops/s 45.4917 Ops/s $\color{#35bf28}+2.19\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.3766ms 4.8100ms 207.8999 Ops/s 199.2504 Ops/s $\color{#35bf28}+4.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2940ms 0.4993ms 2.0026 KOps/s 1.9639 KOps/s $\color{#35bf28}+1.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7859ms 0.4786ms 2.0893 KOps/s 2.0559 KOps/s $\color{#35bf28}+1.63\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1827ms 4.5712ms 218.7592 Ops/s 206.8212 Ops/s $\textbf{\color{#35bf28}+5.77\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8512ms 0.4833ms 2.0693 KOps/s 2.0176 KOps/s $\color{#35bf28}+2.56\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6853ms 0.4594ms 2.1766 KOps/s 2.1348 KOps/s $\color{#35bf28}+1.96\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.5107ms 1.6217ms 616.6490 Ops/s 612.4337 Ops/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.5183ms 1.5693ms 637.2388 Ops/s 625.3895 Ops/s $\color{#35bf28}+1.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0405ms 4.7807ms 209.1762 Ops/s 203.6409 Ops/s $\color{#35bf28}+2.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0167ms 0.6489ms 1.5411 KOps/s 1.5472 KOps/s $\color{#d91a1a}-0.40\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9724ms 0.6126ms 1.6323 KOps/s 1.6193 KOps/s $\color{#35bf28}+0.80\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.2298ms 4.6406ms 215.4894 Ops/s 203.2534 Ops/s $\textbf{\color{#35bf28}+6.02\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9670ms 0.5023ms 1.9909 KOps/s 1.9461 KOps/s $\color{#35bf28}+2.30\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7102ms 0.4822ms 2.0738 KOps/s 2.0308 KOps/s $\color{#35bf28}+2.12\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.7670ms 4.8940ms 204.3324 Ops/s 211.4927 Ops/s $\color{#d91a1a}-3.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7330ms 0.4819ms 2.0752 KOps/s 1.9981 KOps/s $\color{#35bf28}+3.86\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.0737ms 0.4740ms 2.1097 KOps/s 2.1335 KOps/s $\color{#d91a1a}-1.11\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.5130ms 4.7954ms 208.5318 Ops/s 202.0762 Ops/s $\color{#35bf28}+3.19\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.6860ms 0.6308ms 1.5854 KOps/s 1.5655 KOps/s $\color{#35bf28}+1.27\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0548ms 0.6055ms 1.6516 KOps/s 1.5999 KOps/s $\color{#35bf28}+3.23\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.8093ms 4.2034ms 237.9021 Ops/s 252.3878 Ops/s $\textbf{\color{#d91a1a}-5.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.8983ms 2.2577ms 442.9299 Ops/s 41.0678 Ops/s $\textbf{\color{#35bf28}+978.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.7389ms 1.3619ms 734.2597 Ops/s 740.4563 Ops/s $\color{#d91a1a}-0.84\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4199s 12.6307ms 79.1720 Ops/s 223.0388 Ops/s $\textbf{\color{#d91a1a}-64.50\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.1467ms 2.2911ms 436.4774 Ops/s 421.6378 Ops/s $\color{#35bf28}+3.52\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.8886ms 1.2129ms 824.4390 Ops/s 762.5197 Ops/s $\textbf{\color{#35bf28}+8.12\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.7654ms 4.3921ms 227.6824 Ops/s 227.1385 Ops/s $\color{#35bf28}+0.24\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.3625ms 2.4883ms 401.8748 Ops/s 406.2264 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.9404ms 1.5171ms 659.1432 Ops/s 645.7455 Ops/s $\color{#35bf28}+2.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.3986ms 11.7236ms 85.2979 Ops/s 81.3776 Ops/s $\color{#35bf28}+4.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.6340ms 15.1194ms 66.1401 Ops/s 65.7891 Ops/s $\color{#35bf28}+0.53\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.6922ms 20.3538ms 49.1310 Ops/s 48.2780 Ops/s $\color{#35bf28}+1.77\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.3896ms 15.2896ms 65.4038 Ops/s 64.4396 Ops/s $\color{#35bf28}+1.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 22.5063ms 20.3047ms 49.2497 Ops/s 48.6477 Ops/s $\color{#35bf28}+1.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.1843ms 16.6126ms 60.1954 Ops/s 60.3407 Ops/s $\color{#d91a1a}-0.24\%$
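
The "Change" column above appears consistent with the relative difference between "Ops" and "Ops on Repo HEAD", once both values are converted to the same unit (some rows mix `KOps/s` and `Ops/s`). The sketch below is a minimal illustration of that arithmetic under this assumption. It is not the benchmark bot's actual code, and the helper names `to_ops_per_second` and `relative_change_percent` are hypothetical.

```python
# Illustrative helpers (not part of the benchmark tooling): recompute the
# "Change" column from the "Ops" and "Ops on Repo HEAD" columns.

# Only the two throughput units that appear in the table are handled.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_second(cell: str) -> float:
    """Convert a cell such as '1.1349 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change_percent(ops: str, ops_on_head: str) -> float:
    """Relative throughput change of this PR versus repo HEAD, in percent."""
    new = to_ops_per_second(ops)
    base = to_ops_per_second(ops_on_head)
    return (new / base - 1.0) * 100.0


# test_simple: same unit on both sides
print(f"{relative_change_percent('2.3331 Ops/s', '2.1945 Ops/s'):+.2f}%")    # +6.32%
# test_dqn_speed[True-backward]: KOps/s versus Ops/s
print(f"{relative_change_percent('1.1349 KOps/s', '828.3730 Ops/s'):+.2f}%")  # +37.00%
# test_rb_populate[...ListStorage-SamplerWithoutReplacement-400]
print(f"{relative_change_percent('79.1720 Ops/s', '223.0388 Ops/s'):+.2f}%")  # -64.50%
```

Normalizing units before dividing matters here: comparing the raw numbers `1.1349` and `828.3730` directly would suggest a large regression rather than the reported improvement.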


⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 26. Worsened: 11.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7204s 0.7110s 1.4064 Ops/s 1.3664 Ops/s $\color{#35bf28}+2.93\%$
test_transformed 0.9665s 0.9627s 1.0387 Ops/s 1.0436 Ops/s $\color{#d91a1a}-0.47\%$
test_serial 2.1798s 2.1194s 0.4718 Ops/s 0.4739 Ops/s $\color{#d91a1a}-0.43\%$
test_parallel 2.0251s 2.0062s 0.4984 Ops/s 0.5236 Ops/s $\color{#d91a1a}-4.80\%$
test_step_mdp_speed[True-True-True-True-True] 0.1808ms 40.4186μs 24.7411 KOps/s 25.5207 KOps/s $\color{#d91a1a}-3.05\%$
test_step_mdp_speed[True-True-True-True-False] 45.4910μs 22.8374μs 43.7879 KOps/s 44.4731 KOps/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[True-True-True-False-True] 51.5310μs 22.3139μs 44.8151 KOps/s 46.4071 KOps/s $\color{#d91a1a}-3.43\%$
test_step_mdp_speed[True-True-True-False-False] 39.2810μs 12.9705μs 77.0982 KOps/s 79.9770 KOps/s $\color{#d91a1a}-3.60\%$
test_step_mdp_speed[True-True-False-True-True] 84.5120μs 42.8681μs 23.3274 KOps/s 24.0101 KOps/s $\color{#d91a1a}-2.84\%$
test_step_mdp_speed[True-True-False-True-False] 60.9810μs 25.2834μs 39.5516 KOps/s 39.9503 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[True-True-False-False-True] 52.9910μs 24.4397μs 40.9170 KOps/s 40.6355 KOps/s $\color{#35bf28}+0.69\%$
test_step_mdp_speed[True-True-False-False-False] 52.0910μs 15.2668μs 65.5016 KOps/s 65.6339 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[True-False-True-True-True] 93.2520μs 44.6690μs 22.3869 KOps/s 22.5632 KOps/s $\color{#d91a1a}-0.78\%$
test_step_mdp_speed[True-False-True-True-False] 62.5310μs 27.4102μs 36.4828 KOps/s 36.5915 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-False-True-False-True] 64.3610μs 24.6958μs 40.4927 KOps/s 40.6400 KOps/s $\color{#d91a1a}-0.36\%$
test_step_mdp_speed[True-False-True-False-False] 55.4610μs 15.1060μs 66.1987 KOps/s 68.9854 KOps/s $\color{#d91a1a}-4.04\%$
test_step_mdp_speed[True-False-False-True-True] 93.8820μs 46.6102μs 21.4545 KOps/s 21.6761 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[True-False-False-True-False] 65.1310μs 29.6459μs 33.7314 KOps/s 34.2411 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[True-False-False-False-True] 58.9210μs 26.5776μs 37.6257 KOps/s 37.6754 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-False-False-False-False] 49.5810μs 17.2043μs 58.1250 KOps/s 58.2845 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[False-True-True-True-True] 0.1114ms 44.5276μs 22.4580 KOps/s 22.4995 KOps/s $\color{#d91a1a}-0.18\%$
test_step_mdp_speed[False-True-True-True-False] 54.0510μs 27.6065μs 36.2234 KOps/s 36.1337 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-True-True-False-True] 89.5720μs 28.2842μs 35.3554 KOps/s 35.9912 KOps/s $\color{#d91a1a}-1.77\%$
test_step_mdp_speed[False-True-True-False-False] 46.0610μs 16.7220μs 59.8014 KOps/s 59.5348 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[False-True-False-True-True] 83.9120μs 47.2083μs 21.1827 KOps/s 21.2878 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-True-False-True-False] 56.3010μs 29.7495μs 33.6140 KOps/s 33.5269 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[False-True-False-False-True] 3.1240ms 30.4339μs 32.8581 KOps/s 32.6794 KOps/s $\color{#35bf28}+0.55\%$
test_step_mdp_speed[False-True-False-False-False] 44.4810μs 18.8005μs 53.1900 KOps/s 52.5214 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[False-False-True-True-True] 96.7120μs 49.3444μs 20.2657 KOps/s 20.3155 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[False-False-True-True-False] 59.3110μs 32.1135μs 31.1395 KOps/s 31.3188 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[False-False-True-False-True] 63.0920μs 30.0455μs 33.2828 KOps/s 33.0987 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[False-False-True-False-False] 49.6610μs 18.7875μs 53.2270 KOps/s 53.8502 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-False-False-True-True] 83.1420μs 50.9415μs 19.6304 KOps/s 19.9452 KOps/s $\color{#d91a1a}-1.58\%$
test_step_mdp_speed[False-False-False-True-False] 76.1410μs 34.0944μs 29.3303 KOps/s 30.2188 KOps/s $\color{#d91a1a}-2.94\%$
test_step_mdp_speed[False-False-False-False-True] 59.9910μs 32.0812μs 31.1709 KOps/s 31.6903 KOps/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[False-False-False-False-False] 47.8600μs 20.9887μs 47.6448 KOps/s 48.4027 KOps/s $\color{#d91a1a}-1.57\%$
test_values[generalized_advantage_estimate-True-True] 24.8167ms 24.2215ms 41.2857 Ops/s 41.5306 Ops/s $\color{#d91a1a}-0.59\%$
test_values[vec_generalized_advantage_estimate-True-True] 91.3947ms 2.7206ms 367.5699 Ops/s 348.7528 Ops/s $\textbf{\color{#35bf28}+5.40\%}$
test_values[td0_return_estimate-False-False] 0.1043ms 79.1626μs 12.6322 KOps/s 12.7479 KOps/s $\color{#d91a1a}-0.91\%$
test_values[td1_return_estimate-False-False] 55.9117ms 53.7230ms 18.6140 Ops/s 18.6473 Ops/s $\color{#d91a1a}-0.18\%$
test_values[vec_td1_return_estimate-False-False] 1.3800ms 1.0717ms 933.0710 Ops/s 918.0121 Ops/s $\color{#35bf28}+1.64\%$
test_values[td_lambda_return_estimate-True-False] 86.9215ms 85.4489ms 11.7029 Ops/s 11.5035 Ops/s $\color{#35bf28}+1.73\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3812ms 1.0707ms 933.9811 Ops/s 924.7778 Ops/s $\color{#35bf28}+1.00\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.5895ms 23.9466ms 41.7596 Ops/s 42.3744 Ops/s $\color{#d91a1a}-1.45\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0343ms 0.7436ms 1.3448 KOps/s 1.3432 KOps/s $\color{#35bf28}+0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7607ms 0.6606ms 1.5139 KOps/s 1.5229 KOps/s $\color{#d91a1a}-0.59\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5083ms 1.4731ms 678.8222 Ops/s 677.8535 Ops/s $\color{#35bf28}+0.14\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.0645ms 0.6753ms 1.4808 KOps/s 1.4885 KOps/s $\color{#d91a1a}-0.52\%$
test_dqn_speed[False-None] 6.8688ms 1.5072ms 663.4730 Ops/s 678.4979 Ops/s $\color{#d91a1a}-2.21\%$
test_dqn_speed[False-backward] 2.1401ms 2.0993ms 476.3460 Ops/s 484.1788 Ops/s $\color{#d91a1a}-1.62\%$
test_dqn_speed[True-None] 0.6288ms 0.5301ms 1.8865 KOps/s 1.8411 KOps/s $\color{#35bf28}+2.47\%$
test_dqn_speed[True-backward] 1.1272ms 1.0822ms 924.0588 Ops/s 830.3707 Ops/s $\textbf{\color{#35bf28}+11.28\%}$
test_dqn_speed[reduce-overhead-None] 0.6271ms 0.5661ms 1.7664 KOps/s 1.7993 KOps/s $\color{#d91a1a}-1.83\%$
test_dqn_speed[reduce-overhead-backward] 1.0981ms 1.0611ms 942.4299 Ops/s 1.0390 KOps/s $\textbf{\color{#d91a1a}-9.29\%}$
test_ddpg_speed[False-None] 3.1165ms 2.8356ms 352.6561 Ops/s 354.5408 Ops/s $\color{#d91a1a}-0.53\%$
test_ddpg_speed[False-backward] 4.3254ms 4.2104ms 237.5048 Ops/s 250.0910 Ops/s $\textbf{\color{#d91a1a}-5.03\%}$
test_ddpg_speed[True-None] 1.1924ms 1.1188ms 893.8157 Ops/s 929.6504 Ops/s $\color{#d91a1a}-3.85\%$
test_ddpg_speed[True-backward] 2.3302ms 2.2664ms 441.2195 Ops/s 464.6588 Ops/s $\textbf{\color{#d91a1a}-5.04\%}$
test_ddpg_speed[reduce-overhead-None] 1.1923ms 1.0839ms 922.6198 Ops/s 921.0796 Ops/s $\color{#35bf28}+0.17\%$
test_ddpg_speed[reduce-overhead-backward] 1.8116ms 1.7493ms 571.6535 Ops/s 607.8118 Ops/s $\textbf{\color{#d91a1a}-5.95\%}$
test_sac_speed[False-None] 8.3328ms 7.9421ms 125.9114 Ops/s 126.0588 Ops/s $\color{#d91a1a}-0.12\%$
test_sac_speed[False-backward] 11.3731ms 11.0130ms 90.8018 Ops/s 92.8343 Ops/s $\color{#d91a1a}-2.19\%$
test_sac_speed[True-None] 1.6244ms 1.5273ms 654.7428 Ops/s 645.4869 Ops/s $\color{#35bf28}+1.43\%$
test_sac_speed[True-backward] 3.2757ms 3.2021ms 312.2935 Ops/s 292.1552 Ops/s $\textbf{\color{#35bf28}+6.89\%}$
test_sac_speed[reduce-overhead-None] 23.4808ms 12.8587ms 77.7681 Ops/s 79.5134 Ops/s $\color{#d91a1a}-2.19\%$
test_sac_speed[reduce-overhead-backward] 1.3727ms 1.3175ms 759.0009 Ops/s 758.8400 Ops/s $\color{#35bf28}+0.02\%$
test_redq_speed[False-None] 8.1134ms 7.3825ms 135.4554 Ops/s 133.6538 Ops/s $\color{#35bf28}+1.35\%$
test_redq_speed[False-backward] 11.8011ms 11.0525ms 90.4770 Ops/s 89.5640 Ops/s $\color{#35bf28}+1.02\%$
test_redq_speed[True-None] 2.0869ms 1.9883ms 502.9354 Ops/s 500.2749 Ops/s $\color{#35bf28}+0.53\%$
test_redq_speed[True-backward] 3.8024ms 3.6334ms 275.2247 Ops/s 254.5750 Ops/s $\textbf{\color{#35bf28}+8.11\%}$
test_redq_speed[reduce-overhead-None] 2.0689ms 1.9807ms 504.8781 Ops/s 502.0924 Ops/s $\color{#35bf28}+0.55\%$
test_redq_speed[reduce-overhead-backward] 3.7762ms 3.6109ms 276.9419 Ops/s 256.2789 Ops/s $\textbf{\color{#35bf28}+8.06\%}$
test_redq_deprec_speed[False-None] 9.7351ms 8.9810ms 111.3460 Ops/s 106.5511 Ops/s $\color{#35bf28}+4.50\%$
test_redq_deprec_speed[False-backward] 12.3341ms 11.7811ms 84.8818 Ops/s 80.7055 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_redq_deprec_speed[True-None] 2.5565ms 2.4401ms 409.8212 Ops/s 431.1752 Ops/s $\color{#d91a1a}-4.95\%$
test_redq_deprec_speed[True-backward] 4.2438ms 3.9671ms 252.0752 Ops/s 250.2065 Ops/s $\color{#35bf28}+0.75\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4271ms 2.3138ms 432.1936 Ops/s 406.5840 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_redq_deprec_speed[reduce-overhead-backward] 4.0687ms 3.9649ms 252.2132 Ops/s 234.8941 Ops/s $\textbf{\color{#35bf28}+7.37\%}$
test_td3_speed[False-None] 8.1336ms 7.8401ms 127.5490 Ops/s 127.7500 Ops/s $\color{#d91a1a}-0.16\%$
test_td3_speed[False-backward] 10.7086ms 10.1283ms 98.7329 Ops/s 99.9850 Ops/s $\color{#d91a1a}-1.25\%$
test_td3_speed[True-None] 1.5837ms 1.5630ms 639.8040 Ops/s 606.4456 Ops/s $\textbf{\color{#35bf28}+5.50\%}$
test_td3_speed[True-backward] 3.1680ms 3.0865ms 323.9891 Ops/s 319.0060 Ops/s $\color{#35bf28}+1.56\%$
test_td3_speed[reduce-overhead-None] 82.4244ms 26.3977ms 37.8821 Ops/s 36.4777 Ops/s $\color{#35bf28}+3.85\%$
test_td3_speed[reduce-overhead-backward] 1.3474ms 1.2785ms 782.1806 Ops/s 775.9409 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[False-None] 17.0889ms 16.4782ms 60.6864 Ops/s 60.2625 Ops/s $\color{#35bf28}+0.70\%$
test_cql_speed[False-backward] 22.2944ms 21.4397ms 46.6425 Ops/s 46.6508 Ops/s $\color{#d91a1a}-0.02\%$
test_cql_speed[True-None] 3.0223ms 2.9227ms 342.1539 Ops/s 336.8818 Ops/s $\color{#35bf28}+1.56\%$
test_cql_speed[True-backward] 5.6416ms 5.2266ms 191.3291 Ops/s 188.2914 Ops/s $\color{#35bf28}+1.61\%$
test_cql_speed[reduce-overhead-None] 21.8657ms 13.3252ms 75.0458 Ops/s 75.4539 Ops/s $\color{#d91a1a}-0.54\%$
test_cql_speed[reduce-overhead-backward] 1.5415ms 1.4950ms 668.8965 Ops/s 661.8450 Ops/s $\color{#35bf28}+1.07\%$
test_a2c_speed[False-None] 3.4064ms 3.1641ms 316.0496 Ops/s 315.2601 Ops/s $\color{#35bf28}+0.25\%$
test_a2c_speed[False-backward] 6.5458ms 5.9817ms 167.1755 Ops/s 167.7215 Ops/s $\color{#d91a1a}-0.33\%$
test_a2c_speed[True-None] 1.1044ms 0.9986ms 1.0014 KOps/s 946.8683 Ops/s $\textbf{\color{#35bf28}+5.76\%}$
test_a2c_speed[True-backward] 2.6041ms 2.5593ms 390.7293 Ops/s 376.3210 Ops/s $\color{#35bf28}+3.83\%$
test_a2c_speed[reduce-overhead-None] 22.1547ms 11.6929ms 85.5220 Ops/s 85.5793 Ops/s $\color{#d91a1a}-0.07\%$
test_a2c_speed[reduce-overhead-backward] 1.0003ms 0.9600ms 1.0416 KOps/s 881.9347 Ops/s $\textbf{\color{#35bf28}+18.11\%}$
test_ppo_speed[False-None] 3.7784ms 3.6567ms 273.4680 Ops/s 269.3170 Ops/s $\color{#35bf28}+1.54\%$
test_ppo_speed[False-backward] 7.0700ms 6.6637ms 150.0675 Ops/s 145.1074 Ops/s $\color{#35bf28}+3.42\%$
test_ppo_speed[True-None] 1.0274ms 0.9495ms 1.0531 KOps/s 1.0572 KOps/s $\color{#d91a1a}-0.38\%$
test_ppo_speed[True-backward] 2.5966ms 2.5101ms 398.3887 Ops/s 370.1479 Ops/s $\textbf{\color{#35bf28}+7.63\%}$
test_ppo_speed[reduce-overhead-None] 0.5724ms 0.5114ms 1.9554 KOps/s 1.9170 KOps/s $\color{#35bf28}+2.00\%$
test_ppo_speed[reduce-overhead-backward] 1.0637ms 0.9637ms 1.0376 KOps/s 885.5948 Ops/s $\textbf{\color{#35bf28}+17.17\%}$
test_reinforce_speed[False-None] 2.4005ms 2.2614ms 442.1977 Ops/s 446.0759 Ops/s $\color{#d91a1a}-0.87\%$
test_reinforce_speed[False-backward] 3.6294ms 3.2393ms 308.7080 Ops/s 300.2320 Ops/s $\color{#35bf28}+2.82\%$
test_reinforce_speed[True-None] 0.9304ms 0.8453ms 1.1831 KOps/s 1.1980 KOps/s $\color{#d91a1a}-1.25\%$
test_reinforce_speed[True-backward] 2.6350ms 2.3904ms 418.3470 Ops/s 386.5232 Ops/s $\textbf{\color{#35bf28}+8.23\%}$
test_reinforce_speed[reduce-overhead-None] 22.0486ms 11.7297ms 85.2534 Ops/s 87.1871 Ops/s $\color{#d91a1a}-2.22\%$
test_reinforce_speed[reduce-overhead-backward] 1.0945ms 1.0272ms 973.5519 Ops/s 837.2415 Ops/s $\textbf{\color{#35bf28}+16.28\%}$
test_iql_speed[False-None] 9.6478ms 9.1075ms 109.7998 Ops/s 110.5229 Ops/s $\color{#d91a1a}-0.65\%$
test_iql_speed[False-backward] 13.3183ms 12.7619ms 78.3579 Ops/s 77.3630 Ops/s $\color{#35bf28}+1.29\%$
test_iql_speed[True-None] 1.9102ms 1.7723ms 564.2307 Ops/s 571.6212 Ops/s $\color{#d91a1a}-1.29\%$
test_iql_speed[True-backward] 4.4341ms 4.2623ms 234.6161 Ops/s 226.0413 Ops/s $\color{#35bf28}+3.79\%$
test_iql_speed[reduce-overhead-None] 20.9226ms 11.7658ms 84.9921 Ops/s 90.0286 Ops/s $\textbf{\color{#d91a1a}-5.59\%}$
test_iql_speed[reduce-overhead-backward] 1.4731ms 1.4101ms 709.1899 Ops/s 629.9712 Ops/s $\textbf{\color{#35bf28}+12.57\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9686ms 6.4474ms 155.1002 Ops/s 153.7026 Ops/s $\color{#35bf28}+0.91\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4929ms 0.2750ms 3.6367 KOps/s 2.9038 KOps/s $\textbf{\color{#35bf28}+25.24\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4493ms 0.2546ms 3.9276 KOps/s 3.0175 KOps/s $\textbf{\color{#35bf28}+30.16\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5613ms 6.2123ms 160.9706 Ops/s 160.0223 Ops/s $\color{#35bf28}+0.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1772ms 0.3365ms 2.9719 KOps/s 3.2507 KOps/s $\textbf{\color{#d91a1a}-8.58\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4511ms 0.2439ms 4.1001 KOps/s 4.1340 KOps/s $\color{#d91a1a}-0.82\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4587ms 1.2468ms 802.0845 Ops/s 720.4488 Ops/s $\textbf{\color{#35bf28}+11.33\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4343ms 1.2052ms 829.7152 Ops/s 842.4680 Ops/s $\color{#d91a1a}-1.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4349ms 6.3447ms 157.6110 Ops/s 155.2142 Ops/s $\color{#35bf28}+1.54\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7534ms 0.4526ms 2.2093 KOps/s 2.3999 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6364ms 0.4406ms 2.2696 KOps/s 2.3947 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4454ms 6.2478ms 160.0559 Ops/s 158.7084 Ops/s $\color{#35bf28}+0.85\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0003ms 0.3348ms 2.9868 KOps/s 2.9191 KOps/s $\color{#35bf28}+2.32\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5303ms 0.3210ms 3.1154 KOps/s 3.4546 KOps/s $\textbf{\color{#d91a1a}-9.82\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4271ms 6.1541ms 162.4944 Ops/s 161.3134 Ops/s $\color{#35bf28}+0.73\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5363ms 0.2926ms 3.4180 KOps/s 2.7488 KOps/s $\textbf{\color{#35bf28}+24.35\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5429ms 0.2740ms 3.6498 KOps/s 2.8704 KOps/s $\textbf{\color{#35bf28}+27.15\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5012ms 6.3120ms 158.4286 Ops/s 157.3078 Ops/s $\color{#35bf28}+0.71\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4201ms 0.4506ms 2.2194 KOps/s 2.0068 KOps/s $\textbf{\color{#35bf28}+10.60\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7009ms 0.4577ms 2.1847 KOps/s 2.3119 KOps/s $\textbf{\color{#d91a1a}-5.50\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9183ms 5.2829ms 189.2888 Ops/s 187.9388 Ops/s $\color{#35bf28}+0.72\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.4827ms 2.0411ms 489.9250 Ops/s 422.3320 Ops/s $\textbf{\color{#35bf28}+16.00\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.3837ms 1.2386ms 807.3614 Ops/s 795.0714 Ops/s $\color{#35bf28}+1.55\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.9657ms 5.3871ms 185.6288 Ops/s 190.9287 Ops/s $\color{#d91a1a}-2.78\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.8541ms 2.0893ms 478.6360 Ops/s 448.1945 Ops/s $\textbf{\color{#35bf28}+6.79\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.0728ms 1.1924ms 838.6444 Ops/s 774.4284 Ops/s $\textbf{\color{#35bf28}+8.29\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5101s 15.6143ms 64.0437 Ops/s 33.1661 Ops/s $\textbf{\color{#35bf28}+93.10\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.2250ms 2.2574ms 442.9823 Ops/s 462.8328 Ops/s $\color{#d91a1a}-4.29\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 9.5650ms 1.4662ms 682.0168 Ops/s 857.9132 Ops/s $\textbf{\color{#d91a1a}-20.50\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.4521ms 13.6165ms 73.4402 Ops/s 74.9529 Ops/s $\color{#d91a1a}-2.02\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.5288ms 17.3896ms 57.5056 Ops/s 58.1141 Ops/s $\color{#d91a1a}-1.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3277ms 17.7915ms 56.2067 Ops/s 54.1264 Ops/s $\color{#35bf28}+3.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.6408ms 17.5308ms 57.0426 Ops/s 56.8963 Ops/s $\color{#35bf28}+0.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.3322ms 17.5699ms 56.9154 Ops/s 55.1834 Ops/s $\color{#35bf28}+3.14\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3637ms 19.2678ms 51.9001 Ops/s 53.0984 Ops/s $\color{#d91a1a}-2.26\%$
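
The last column of each row appears to be the relative difference between the two throughput columns. The sketch below recomputes it for three rows copied from the table, assuming the first Ops/s value is this PR and the second is the baseline (the header is not shown in this part of the table). The helpers `to_ops` and `relative_change` are hypothetical names used for illustration and are not part of the benchmark tooling.

```python
def to_ops(value: str) -> float:
    """Convert a throughput cell such as '1.0014 KOps/s' to plain Ops/s."""
    number, unit = value.split()
    scale = 1000.0 if unit.startswith("KOps") else 1.0
    return float(number) * scale


def relative_change(ops_pr: float, ops_base: float) -> float:
    """Percent change of the first throughput relative to the second."""
    return (ops_pr - ops_base) / ops_base * 100.0


# (first Ops/s column, second Ops/s column), copied from the rows above.
rows = {
    "test_ddpg_speed[False-None]": ("352.6561 Ops/s", "354.5408 Ops/s"),
    "test_a2c_speed[True-None]": ("1.0014 KOps/s", "946.8683 Ops/s"),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (
        "64.0437 Ops/s",
        "33.1661 Ops/s",
    ),
}

for name, (pr, base) in rows.items():
    print(f"{name}: {relative_change(to_ops(pr), to_ops(base)):+.2f}%")
```

Under that assumption the recomputed values match the reported column (-0.53%, +5.76%, +93.10%). In the rows shown, the bold entries are exactly those whose magnitude exceeds 5%.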

Labels
CLA Signed, enhancement