[Feature] TensorSpec.enumerate() #2354

Merged
merged 11 commits into from
Nov 8, 2024

Conversation

@vmoens (Contributor) commented Aug 4, 2024

[ghstack-poisoned]

pytorch-bot bot commented Aug 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2354

Note: Links to docs will display an error until the docs builds have been completed.

❌ 18 New Failures, 4 Unrelated Failures

As of commit 323f4d7 with merge base 8a8b4c3:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Aug 4, 2024
vmoens added a commit that referenced this pull request Aug 4, 2024
ghstack-source-id: 47c3c22ce6b9feb83852a4ed7c823a834a7ace21
Pull Request resolved: #2354
vmoens added a commit that referenced this pull request Aug 7, 2024
ghstack-source-id: 47c3c22ce6b9feb83852a4ed7c823a834a7ace21
Pull Request resolved: #2354
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens added the enhancement label Oct 26, 2024
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]

github-actions bot commented Nov 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}50$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.4169s | 0.4154s | 2.4075 Ops/s | 2.2063 Ops/s | $\textbf{\color{#35bf28}+9.12\%}$ |
| test_transformed | 0.5860s | 0.5820s | 1.7181 Ops/s | 1.6443 Ops/s | $\color{#35bf28}+4.49\%$ |
| test_serial | 1.3214s | 1.3200s | 0.7576 Ops/s | 0.7246 Ops/s | $\color{#35bf28}+4.55\%$ |
| test_parallel | 1.4026s | 1.3080s | 0.7645 Ops/s | 0.7390 Ops/s | $\color{#35bf28}+3.46\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.2101ms | 27.3158μs | 36.6089 KOps/s | 36.0859 KOps/s | $\color{#35bf28}+1.45\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 51.3050μs | 15.7950μs | 63.3112 KOps/s | 62.9287 KOps/s | $\color{#35bf28}+0.61\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 77.5270μs | 15.4243μs | 64.8326 KOps/s | 63.0011 KOps/s | $\color{#35bf28}+2.91\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 50.5040μs | 9.0757μs | 110.1846 KOps/s | 108.8654 KOps/s | $\color{#35bf28}+1.21\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 70.5720μs | 29.0924μs | 34.3733 KOps/s | 34.0383 KOps/s | $\color{#35bf28}+0.98\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 53.0890μs | 17.7851μs | 56.2268 KOps/s | 55.6421 KOps/s | $\color{#35bf28}+1.05\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 61.2440μs | 17.0519μs | 58.6443 KOps/s | 57.5853 KOps/s | $\color{#35bf28}+1.84\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 48.1410μs | 10.6790μs | 93.6421 KOps/s | 92.5432 KOps/s | $\color{#35bf28}+1.19\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 72.1050μs | 30.8876μs | 32.3754 KOps/s | 31.8783 KOps/s | $\color{#35bf28}+1.56\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 62.8870μs | 19.7976μs | 50.5111 KOps/s | 51.1387 KOps/s | $\color{#d91a1a}-1.23\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 75.9200μs | 16.9519μs | 58.9904 KOps/s | 57.1063 KOps/s | $\color{#35bf28}+3.30\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 38.2110μs | 10.8745μs | 91.9587 KOps/s | 91.8685 KOps/s | $\color{#35bf28}+0.10\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 0.1096ms | 32.3567μs | 30.9055 KOps/s | 30.4812 KOps/s | $\color{#35bf28}+1.39\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 74.2870μs | 21.1725μs | 47.2311 KOps/s | 47.2736 KOps/s | $\color{#d91a1a}-0.09\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 54.6220μs | 18.5525μs | 53.9012 KOps/s | 52.2485 KOps/s | $\color{#35bf28}+3.16\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 71.1030μs | 12.2644μs | 81.5366 KOps/s | 80.5023 KOps/s | $\color{#35bf28}+1.28\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 82.5260μs | 30.6572μs | 32.6188 KOps/s | 32.5714 KOps/s | $\color{#35bf28}+0.15\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 72.6040μs | 19.2262μs | 52.0122 KOps/s | 48.1695 KOps/s | $\textbf{\color{#35bf28}+7.98\%}$ |
| test_step_mdp_speed[False-True-True-False-True] | 58.1190μs | 19.6634μs | 50.8558 KOps/s | 50.1081 KOps/s | $\color{#35bf28}+1.49\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 93.3520μs | 12.0005μs | 83.3300 KOps/s | 81.4005 KOps/s | $\color{#35bf28}+2.37\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 68.0370μs | 32.2378μs | 31.0195 KOps/s | 30.4773 KOps/s | $\color{#35bf28}+1.78\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 64.0390μs | 21.0773μs | 47.4445 KOps/s | 47.6479 KOps/s | $\color{#d91a1a}-0.43\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.1321ms | 21.0683μs | 47.4646 KOps/s | 45.3267 KOps/s | $\color{#35bf28}+4.72\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 46.0360μs | 13.5194μs | 73.9676 KOps/s | 72.5941 KOps/s | $\color{#35bf28}+1.89\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 96.2390μs | 33.4699μs | 29.8776 KOps/s | 29.3031 KOps/s | $\color{#35bf28}+1.96\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 51.1350μs | 22.4407μs | 44.5618 KOps/s | 44.2759 KOps/s | $\color{#35bf28}+0.65\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 72.1240μs | 21.0214μs | 47.5705 KOps/s | 45.9157 KOps/s | $\color{#35bf28}+3.60\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 47.2280μs | 13.6503μs | 73.2586 KOps/s | 73.0787 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 72.7060μs | 34.7264μs | 28.7965 KOps/s | 27.9885 KOps/s | $\color{#35bf28}+2.89\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 91.8520μs | 24.2370μs | 41.2593 KOps/s | 41.6598 KOps/s | $\color{#d91a1a}-0.96\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 70.1310μs | 22.4611μs | 44.5213 KOps/s | 43.2235 KOps/s | $\color{#35bf28}+3.00\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 71.0620μs | 15.2213μs | 65.6973 KOps/s | 64.8264 KOps/s | $\color{#35bf28}+1.34\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.0668ms | 9.7948ms | 102.0952 Ops/s | 101.7708 Ops/s | $\color{#35bf28}+0.32\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 39.1046ms | 35.8173ms | 27.9195 Ops/s | 29.5092 Ops/s | $\textbf{\color{#d91a1a}-5.39\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2260ms | 0.1877ms | 5.3271 KOps/s | 5.5431 KOps/s | $\color{#d91a1a}-3.90\%$ |
| test_values[td1_return_estimate-False-False] | 28.7915ms | 24.4857ms | 40.8402 Ops/s | 40.4814 Ops/s | $\color{#35bf28}+0.89\%$ |
| test_values[vec_td1_return_estimate-False-False] | 38.4056ms | 35.8606ms | 27.8858 Ops/s | 29.4339 Ops/s | $\textbf{\color{#d91a1a}-5.26\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 36.5660ms | 35.1553ms | 28.4452 Ops/s | 28.2190 Ops/s | $\color{#35bf28}+0.80\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 38.5259ms | 35.8148ms | 27.9214 Ops/s | 29.3729 Ops/s | $\color{#d91a1a}-4.94\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.6285ms | 8.3926ms | 119.1532 Ops/s | 117.1559 Ops/s | $\color{#35bf28}+1.70\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3152ms | 1.9371ms | 516.2290 Ops/s | 494.0285 Ops/s | $\color{#35bf28}+4.49\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6234ms | 0.3603ms | 2.7757 KOps/s | 2.7535 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 49.5520ms | 47.0157ms | 21.2695 Ops/s | 24.4520 Ops/s | $\textbf{\color{#d91a1a}-13.02\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 4.0804ms | 3.0632ms | 326.4590 Ops/s | 321.0920 Ops/s | $\color{#35bf28}+1.67\%$ |
| test_dqn_speed[False-None] | 2.1794ms | 1.3379ms | 747.4353 Ops/s | 732.9291 Ops/s | $\color{#35bf28}+1.98\%$ |
| test_dqn_speed[False-backward] | 2.2126ms | 1.8810ms | 531.6356 Ops/s | 537.7398 Ops/s | $\color{#d91a1a}-1.14\%$ |
| test_dqn_speed[True-None] | 0.6300ms | 0.4691ms | 2.1317 KOps/s | 2.1276 KOps/s | $\color{#35bf28}+0.19\%$ |
| test_dqn_speed[True-backward] | 0.9563ms | 0.8961ms | 1.1160 KOps/s | 1.1209 KOps/s | $\color{#d91a1a}-0.44\%$ |
| test_dqn_speed[reduce-overhead-None] | 0.7466ms | 0.4750ms | 2.1052 KOps/s | 2.1308 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_dqn_speed[reduce-overhead-backward] | 0.9509ms | 0.8928ms | 1.1201 KOps/s | 1.1120 KOps/s | $\color{#35bf28}+0.72\%$ |
| test_ddpg_speed[False-None] | 4.3819ms | 2.8070ms | 356.2509 Ops/s | 355.6953 Ops/s | $\color{#35bf28}+0.16\%$ |
| test_ddpg_speed[False-backward] | 4.0729ms | 3.9332ms | 254.2468 Ops/s | 249.0771 Ops/s | $\color{#35bf28}+2.08\%$ |
| test_ddpg_speed[True-None] | 1.4599ms | 1.0149ms | 985.3172 Ops/s | 985.7107 Ops/s | $\color{#d91a1a}-0.04\%$ |
| test_ddpg_speed[True-backward] | 1.9821ms | 1.9328ms | 517.3923 Ops/s | 517.3153 Ops/s | $\color{#35bf28}+0.01\%$ |
| test_ddpg_speed[reduce-overhead-None] | 1.6968ms | 1.0226ms | 977.8798 Ops/s | 988.2767 Ops/s | $\color{#d91a1a}-1.05\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.0850ms | 1.9636ms | 509.2765 Ops/s | 520.0501 Ops/s | $\color{#d91a1a}-2.07\%$ |
| test_sac_speed[False-None] | 9.9026ms | 8.0671ms | 123.9603 Ops/s | 126.6896 Ops/s | $\color{#d91a1a}-2.15\%$ |
| test_sac_speed[False-backward] | 11.5237ms | 10.8954ms | 91.7821 Ops/s | 93.7309 Ops/s | $\color{#d91a1a}-2.08\%$ |
| test_sac_speed[True-None] | 2.4963ms | 1.9047ms | 525.0061 Ops/s | 536.8118 Ops/s | $\color{#d91a1a}-2.20\%$ |
| test_sac_speed[True-backward] | 4.1048ms | 3.8548ms | 259.4198 Ops/s | 283.3001 Ops/s | $\textbf{\color{#d91a1a}-8.43\%}$ |
| test_sac_speed[reduce-overhead-None] | 2.4780ms | 1.9012ms | 525.9705 Ops/s | 541.0848 Ops/s | $\color{#d91a1a}-2.79\%$ |
| test_sac_speed[reduce-overhead-backward] | 4.5273ms | 3.8692ms | 258.4509 Ops/s | 279.9851 Ops/s | $\textbf{\color{#d91a1a}-7.69\%}$ |
| test_redq_speed[False-None] | 15.2682ms | 13.7942ms | 72.4943 Ops/s | 77.1681 Ops/s | $\textbf{\color{#d91a1a}-6.06\%}$ |
| test_redq_speed[False-backward] | 25.0105ms | 23.2020ms | 43.0998 Ops/s | 45.2377 Ops/s | $\color{#d91a1a}-4.73\%$ |
| test_redq_speed[True-None] | 6.2395ms | 5.6725ms | 176.2878 Ops/s | 219.9944 Ops/s | $\textbf{\color{#d91a1a}-19.87\%}$ |
| test_redq_speed[True-backward] | 13.4172ms | 13.1587ms | 75.9952 Ops/s | 79.3021 Ops/s | $\color{#d91a1a}-4.17\%$ |
| test_redq_speed[reduce-overhead-None] | 6.5735ms | 5.6344ms | 177.4813 Ops/s | 215.9054 Ops/s | $\textbf{\color{#d91a1a}-17.80\%}$ |
| test_redq_speed[reduce-overhead-backward] | 14.2809ms | 13.4379ms | 74.4163 Ops/s | 82.6857 Ops/s | $\textbf{\color{#d91a1a}-10.00\%}$ |
| test_redq_deprec_speed[False-None] | 16.2764ms | 14.4718ms | 69.1001 Ops/s | 78.8784 Ops/s | $\textbf{\color{#d91a1a}-12.40\%}$ |
| test_redq_deprec_speed[False-backward] | 21.8384ms | 20.2491ms | 49.3850 Ops/s | 54.0614 Ops/s | $\textbf{\color{#d91a1a}-8.65\%}$ |
| test_redq_deprec_speed[True-None] | 4.7668ms | 4.2700ms | 234.1928 Ops/s | 278.8286 Ops/s | $\textbf{\color{#d91a1a}-16.01\%}$ |
| test_redq_deprec_speed[True-backward] | 10.8665ms | 9.2620ms | 107.9680 Ops/s | 112.1597 Ops/s | $\color{#d91a1a}-3.74\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 4.9752ms | 4.2054ms | 237.7872 Ops/s | 278.3189 Ops/s | $\textbf{\color{#d91a1a}-14.56\%}$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 9.3342ms | 8.7849ms | 113.8313 Ops/s | 117.5720 Ops/s | $\color{#d91a1a}-3.18\%$ |
| test_td3_speed[False-None] | 8.5458ms | 8.0283ms | 124.5600 Ops/s | 125.5908 Ops/s | $\color{#d91a1a}-0.82\%$ |
| test_td3_speed[False-backward] | 12.3850ms | 11.3067ms | 88.4433 Ops/s | 95.5568 Ops/s | $\textbf{\color{#d91a1a}-7.44\%}$ |
| test_td3_speed[True-None] | 1.9908ms | 1.7692ms | 565.2124 Ops/s | 577.3972 Ops/s | $\color{#d91a1a}-2.11\%$ |
| test_td3_speed[True-backward] | 3.5084ms | 3.3829ms | 295.6062 Ops/s | 298.6915 Ops/s | $\color{#d91a1a}-1.03\%$ |
| test_td3_speed[reduce-overhead-None] | 1.8953ms | 1.7674ms | 565.7961 Ops/s | 574.6178 Ops/s | $\color{#d91a1a}-1.54\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.7423ms | 3.5518ms | 281.5441 Ops/s | 254.7118 Ops/s | $\textbf{\color{#35bf28}+10.53\%}$ |
| test_cql_speed[False-None] | 38.2028ms | 36.3483ms | 27.5116 Ops/s | 27.6449 Ops/s | $\color{#d91a1a}-0.48\%$ |
| test_cql_speed[False-backward] | 53.0614ms | 47.5856ms | 21.0147 Ops/s | 21.7571 Ops/s | $\color{#d91a1a}-3.41\%$ |
| test_cql_speed[True-None] | 17.7792ms | 16.1816ms | 61.7987 Ops/s | 63.8066 Ops/s | $\color{#d91a1a}-3.15\%$ |
| test_cql_speed[True-backward] | 25.1589ms | 23.4431ms | 42.6565 Ops/s | 44.7453 Ops/s | $\color{#d91a1a}-4.67\%$ |
| test_cql_speed[reduce-overhead-None] | 16.6899ms | 16.1584ms | 61.8873 Ops/s | 63.6062 Ops/s | $\color{#d91a1a}-2.70\%$ |
| test_cql_speed[reduce-overhead-backward] | 24.6185ms | 23.6186ms | 42.3395 Ops/s | 43.9251 Ops/s | $\color{#d91a1a}-3.61\%$ |
| test_a2c_speed[False-None] | 10.9669ms | 7.8876ms | 126.7807 Ops/s | 139.1949 Ops/s | $\textbf{\color{#d91a1a}-8.92\%}$ |
| test_a2c_speed[False-backward] | 16.2154ms | 15.5780ms | 64.1933 Ops/s | 70.4640 Ops/s | $\textbf{\color{#d91a1a}-8.90\%}$ |
| test_a2c_speed[True-None] | 4.0724ms | 3.6246ms | 275.8935 Ops/s | 298.2485 Ops/s | $\textbf{\color{#d91a1a}-7.50\%}$ |
| test_a2c_speed[True-backward] | 11.3045ms | 10.6501ms | 93.8962 Ops/s | 102.6342 Ops/s | $\textbf{\color{#d91a1a}-8.51\%}$ |
| test_a2c_speed[reduce-overhead-None] | 3.8425ms | 3.5864ms | 278.8324 Ops/s | 295.8810 Ops/s | $\textbf{\color{#d91a1a}-5.76\%}$ |
| test_a2c_speed[reduce-overhead-backward] | 11.8980ms | 10.7991ms | 92.5999 Ops/s | 102.4980 Ops/s | $\textbf{\color{#d91a1a}-9.66\%}$ |
| test_ppo_speed[False-None] | 9.2229ms | 8.3486ms | 119.7803 Ops/s | 134.2738 Ops/s | $\textbf{\color{#d91a1a}-10.79\%}$ |
| test_ppo_speed[False-backward] | 17.1331ms | 16.0721ms | 62.2197 Ops/s | 68.0407 Ops/s | $\textbf{\color{#d91a1a}-8.56\%}$ |
| test_ppo_speed[True-None] | 4.6729ms | 4.1772ms | 239.3943 Ops/s | 264.1094 Ops/s | $\textbf{\color{#d91a1a}-9.36\%}$ |
| test_ppo_speed[True-backward] | 11.1705ms | 10.5573ms | 94.7212 Ops/s | 104.0378 Ops/s | $\textbf{\color{#d91a1a}-8.96\%}$ |
| test_ppo_speed[reduce-overhead-None] | 4.7955ms | 4.1948ms | 238.3908 Ops/s | 264.3320 Ops/s | $\textbf{\color{#d91a1a}-9.81\%}$ |
| test_ppo_speed[reduce-overhead-backward] | 11.2309ms | 10.5907ms | 94.4223 Ops/s | 103.8760 Ops/s | $\textbf{\color{#d91a1a}-9.10\%}$ |
| test_reinforce_speed[False-None] | 10.1457ms | 7.0457ms | 141.9305 Ops/s | 152.4725 Ops/s | $\textbf{\color{#d91a1a}-6.91\%}$ |
| test_reinforce_speed[False-backward] | 12.1507ms | 10.5699ms | 94.6079 Ops/s | 102.7161 Ops/s | $\textbf{\color{#d91a1a}-7.89\%}$ |
| test_reinforce_speed[True-None] | 3.6869ms | 3.0547ms | 327.3621 Ops/s | 370.7282 Ops/s | $\textbf{\color{#d91a1a}-11.70\%}$ |
| test_reinforce_speed[True-backward] | 11.1387ms | 9.6113ms | 104.0445 Ops/s | 115.3639 Ops/s | $\textbf{\color{#d91a1a}-9.81\%}$ |
| test_reinforce_speed[reduce-overhead-None] | 3.8915ms | 3.2457ms | 308.1020 Ops/s | 369.3277 Ops/s | $\textbf{\color{#d91a1a}-16.58\%}$ |
| test_reinforce_speed[reduce-overhead-backward] | 13.0145ms | 9.7636ms | 102.4213 Ops/s | 116.8406 Ops/s | $\textbf{\color{#d91a1a}-12.34\%}$ |
| test_iql_speed[False-None] | 34.9736ms | 33.8609ms | 29.5326 Ops/s | 30.9180 Ops/s | $\color{#d91a1a}-4.48\%$ |
| test_iql_speed[False-backward] | 49.7499ms | 46.9972ms | 21.2779 Ops/s | 21.8697 Ops/s | $\color{#d91a1a}-2.71\%$ |
| test_iql_speed[True-None] | 12.2262ms | 11.2186ms | 89.1375 Ops/s | 92.0427 Ops/s | $\color{#d91a1a}-3.16\%$ |
| test_iql_speed[True-backward] | 24.6561ms | 22.9296ms | 43.6117 Ops/s | 46.3723 Ops/s | $\textbf{\color{#d91a1a}-5.95\%}$ |
| test_iql_speed[reduce-overhead-None] | 12.1753ms | 11.1442ms | 89.7330 Ops/s | 91.4956 Ops/s | $\color{#d91a1a}-1.93\%$ |
| test_iql_speed[reduce-overhead-backward] | 27.5082ms | 23.2073ms | 43.0898 Ops/s | 45.5474 Ops/s | $\textbf{\color{#d91a1a}-5.40\%}$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.6870ms | 5.1627ms | 193.6984 Ops/s | 207.6772 Ops/s | $\textbf{\color{#d91a1a}-6.73\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9561ms | 0.6871ms | 1.4555 KOps/s | 1.5610 KOps/s | $\textbf{\color{#d91a1a}-6.76\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9252ms | 0.6577ms | 1.5204 KOps/s | 1.5620 KOps/s | $\color{#d91a1a}-2.67\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.6630ms | 4.9320ms | 202.7559 Ops/s | 220.8288 Ops/s | $\textbf{\color{#d91a1a}-8.18\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2794ms | 0.6698ms | 1.4931 KOps/s | 1.6614 KOps/s | $\textbf{\color{#d91a1a}-10.13\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9320ms | 0.6282ms | 1.5918 KOps/s | 1.6012 KOps/s | $\color{#d91a1a}-0.59\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 2.1853ms | 1.9343ms | 516.9876 Ops/s | 510.0108 Ops/s | $\color{#35bf28}+1.37\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 2.4298ms | 1.8746ms | 533.4469 Ops/s | 540.6704 Ops/s | $\color{#d91a1a}-1.34\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.9490ms | 5.2215ms | 191.5152 Ops/s | 205.9250 Ops/s | $\textbf{\color{#d91a1a}-7.00\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.6050ms | 0.8152ms | 1.2267 KOps/s | 1.2769 KOps/s | $\color{#d91a1a}-3.93\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0201ms | 0.7872ms | 1.2703 KOps/s | 1.2942 KOps/s | $\color{#d91a1a}-1.84\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.3072ms | 4.9442ms | 202.2565 Ops/s | 213.0258 Ops/s | $\textbf{\color{#d91a1a}-5.06\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.4162ms | 0.6832ms | 1.4637 KOps/s | 1.8678 KOps/s | $\textbf{\color{#d91a1a}-21.64\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.8119ms | 0.6552ms | 1.5263 KOps/s | 1.6484 KOps/s | $\textbf{\color{#d91a1a}-7.41\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 8.0238ms | 4.9859ms | 200.5676 Ops/s | 212.9663 Ops/s | $\textbf{\color{#d91a1a}-5.82\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.3156ms | 0.6675ms | 1.4982 KOps/s | 1.9631 KOps/s | $\textbf{\color{#d91a1a}-23.68\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8124ms | 0.6381ms | 1.5671 KOps/s | 1.7712 KOps/s | $\textbf{\color{#d91a1a}-11.53\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 7.2833ms | 5.1673ms | 193.5238 Ops/s | 196.5995 Ops/s | $\color{#d91a1a}-1.56\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.7561ms | 0.8187ms | 1.2215 KOps/s | 1.3979 KOps/s | $\textbf{\color{#d91a1a}-12.62\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0209ms | 0.7910ms | 1.2643 KOps/s | 1.3344 KOps/s | $\textbf{\color{#d91a1a}-5.26\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.4741s | 13.8322ms | 72.2950 Ops/s | 228.1513 Ops/s | $\textbf{\color{#d91a1a}-68.31\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 5.6652ms | 2.3781ms | 420.5076 Ops/s | 466.1520 Ops/s | $\textbf{\color{#d91a1a}-9.79\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.6840ms | 1.4101ms | 709.1634 Ops/s | 736.1770 Ops/s | $\color{#d91a1a}-3.67\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.3721ms | 4.3821ms | 228.2024 Ops/s | 216.8800 Ops/s | $\textbf{\color{#35bf28}+5.22\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 6.9860ms | 2.2922ms | 436.2625 Ops/s | 429.5929 Ops/s | $\color{#35bf28}+1.55\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.0293ms | 1.3018ms | 768.1690 Ops/s | 746.9473 Ops/s | $\color{#35bf28}+2.84\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.4686s | 13.9476ms | 71.6967 Ops/s | 218.7049 Ops/s | $\textbf{\color{#d91a1a}-67.22\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 8.7311ms | 2.5189ms | 397.0021 Ops/s | 404.3499 Ops/s | $\color{#d91a1a}-1.82\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 5.8598ms | 1.5029ms | 665.3594 Ops/s | 686.5002 Ops/s | $\color{#d91a1a}-3.08\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 11.2992ms | 10.9666ms | 91.1856 Ops/s | 85.3141 Ops/s | $\textbf{\color{#35bf28}+6.88\%}$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 14.9590ms | 14.4253ms | 69.3227 Ops/s | 68.2241 Ops/s | $\color{#35bf28}+1.61\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 21.3977ms | 19.9629ms | 50.0930 Ops/s | 49.1374 Ops/s | $\color{#35bf28}+1.94\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 14.7755ms | 14.5457ms | 68.7486 Ops/s | 67.6851 Ops/s | $\color{#35bf28}+1.57\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 21.2303ms | 19.8997ms | 50.2521 Ops/s | 49.5417 Ops/s | $\color{#35bf28}+1.43\%$ |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 16.8807ms | 15.8411ms | 63.1270 Ops/s | 62.5075 Ops/s | $\color{#35bf28}+0.99\%$ |
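
For readers checking the numbers: the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, and the bolded entries are consistent with a ±5% cut-off (the CPU table has 5 bold green and 50 bold red rows, matching the Improved and Worsened totals). Below is a minimal Python sketch of that arithmetic. The helper names and the 5% threshold are assumptions inferred from the table, not taken from the benchmark bot's source.

```python
# Hypothetical helpers (not part of the benchmark bot): they reproduce the
# "Change" column from the two throughput columns of the table above.

# Throughput units seen in the table, expressed in operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops(value: float, unit: str) -> float:
    """Convert a throughput reading such as (1.0273, "KOps/s") to Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# test_simple row of the CPU table: 2.4075 Ops/s vs 2.2063 Ops/s on HEAD.
simple = relative_change(to_ops(2.4075, "Ops/s"), to_ops(2.2063, "Ops/s"))
print(f"{simple:+.2f}%", classify(simple))  # +9.12% improved
```

Rows that mix units (for example 1.0273 KOps/s against 970.6116 Ops/s) need the conversion step before the ratio is taken, which is why `to_ops` is kept separate.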


github-actions bot commented Nov 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}12$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_simple | 0.7217s | 0.7211s | 1.3867 Ops/s | 1.3889 Ops/s | $\color{#d91a1a}-0.16\%$ |
| test_transformed | 1.0493s | 0.9711s | 1.0298 Ops/s | 1.0555 Ops/s | $\color{#d91a1a}-2.44\%$ |
| test_serial | 2.1509s | 2.0734s | 0.4823 Ops/s | 0.4860 Ops/s | $\color{#d91a1a}-0.77\%$ |
| test_parallel | 1.9789s | 1.9344s | 0.5170 Ops/s | 0.5202 Ops/s | $\color{#d91a1a}-0.63\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1293ms | 33.4250μs | 29.9177 KOps/s | 28.0323 KOps/s | $\textbf{\color{#35bf28}+6.73\%}$ |
| test_step_mdp_speed[True-True-True-True-False] | 0.1593ms | 19.6340μs | 50.9322 KOps/s | 49.6067 KOps/s | $\color{#35bf28}+2.67\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 0.1182ms | 19.1819μs | 52.1324 KOps/s | 50.3426 KOps/s | $\color{#35bf28}+3.56\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 51.9510μs | 11.1669μs | 89.5501 KOps/s | 87.3444 KOps/s | $\color{#35bf28}+2.53\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 78.8920μs | 37.2281μs | 26.8615 KOps/s | 26.2475 KOps/s | $\color{#35bf28}+2.34\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 61.9210μs | 21.6017μs | 46.2926 KOps/s | 46.3635 KOps/s | $\color{#d91a1a}-0.15\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 59.4510μs | 21.4611μs | 46.5960 KOps/s | 45.5864 KOps/s | $\color{#35bf28}+2.21\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 0.1081ms | 13.0761μs | 76.4754 KOps/s | 76.7573 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 75.6710μs | 38.9317μs | 25.6860 KOps/s | 25.2740 KOps/s | $\color{#35bf28}+1.63\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 75.8510μs | 23.4914μs | 42.5687 KOps/s | 43.1545 KOps/s | $\color{#d91a1a}-1.36\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 59.5410μs | 21.4563μs | 46.6063 KOps/s | 46.8176 KOps/s | $\color{#d91a1a}-0.45\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 92.2220μs | 13.0225μs | 76.7901 KOps/s | 75.1112 KOps/s | $\color{#35bf28}+2.24\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 84.4810μs | 40.2027μs | 24.8739 KOps/s | 24.6233 KOps/s | $\color{#35bf28}+1.02\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 0.1133ms | 25.5347μs | 39.1625 KOps/s | 38.7655 KOps/s | $\color{#35bf28}+1.02\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 60.3610μs | 23.2283μs | 43.0509 KOps/s | 42.1939 KOps/s | $\color{#35bf28}+2.03\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 51.7610μs | 15.0559μs | 66.4193 KOps/s | 65.8974 KOps/s | $\color{#35bf28}+0.79\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 76.7810μs | 38.6686μs | 25.8608 KOps/s | 25.1299 KOps/s | $\color{#35bf28}+2.91\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 59.9900μs | 23.7099μs | 42.1765 KOps/s | 41.9976 KOps/s | $\color{#35bf28}+0.43\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 0.1338ms | 24.6767μs | 40.5240 KOps/s | 38.7189 KOps/s | $\color{#35bf28}+4.66\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 85.8510μs | 14.7106μs | 67.9784 KOps/s | 66.2809 KOps/s | $\color{#35bf28}+2.56\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 81.1010μs | 40.2091μs | 24.8700 KOps/s | 24.0001 KOps/s | $\color{#35bf28}+3.62\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 0.1328ms | 25.2054μs | 39.6740 KOps/s | 38.7480 KOps/s | $\color{#35bf28}+2.39\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 3.5056ms | 26.6008μs | 37.5928 KOps/s | 35.6948 KOps/s | $\textbf{\color{#35bf28}+5.32\%}$ |
| test_step_mdp_speed[False-True-False-False-False] | 48.0100μs | 16.5115μs | 60.5637 KOps/s | 58.9896 KOps/s | $\color{#35bf28}+2.67\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 0.2001ms | 42.0586μs | 23.7763 KOps/s | 23.1580 KOps/s | $\color{#35bf28}+2.67\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 67.2120μs | 27.6077μs | 36.2218 KOps/s | 35.9843 KOps/s | $\color{#35bf28}+0.66\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 65.7410μs | 26.6750μs | 37.4883 KOps/s | 37.1957 KOps/s | $\color{#35bf28}+0.79\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 0.1903ms | 16.3441μs | 61.1841 KOps/s | 58.6659 KOps/s | $\color{#35bf28}+4.29\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 0.2421ms | 43.1924μs | 23.1522 KOps/s | 22.2148 KOps/s | $\color{#35bf28}+4.22\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 64.8910μs | 29.3675μs | 34.0513 KOps/s | 33.4519 KOps/s | $\color{#35bf28}+1.79\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 0.2246ms | 27.6503μs | 36.1660 KOps/s | 34.4472 KOps/s | $\color{#35bf28}+4.99\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 0.2083ms | 17.9125μs | 55.8268 KOps/s | 54.0154 KOps/s | $\color{#35bf28}+3.35\%$ |
| test_values[generalized_advantage_estimate-True-True] | 23.8732ms | 23.5204ms | 42.5163 Ops/s | 40.3540 Ops/s | $\textbf{\color{#35bf28}+5.36\%}$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 96.9773ms | 2.8126ms | 355.5381 Ops/s | 341.3444 Ops/s | $\color{#35bf28}+4.16\%$ |
| test_values[td0_return_estimate-False-False] | 85.1910μs | 63.5887μs | 15.7261 KOps/s | 15.6655 KOps/s | $\color{#35bf28}+0.39\%$ |
| test_values[td1_return_estimate-False-False] | 52.8096ms | 52.3936ms | 19.0863 Ops/s | 18.1791 Ops/s | $\color{#35bf28}+4.99\%$ |
| test_values[vec_td1_return_estimate-False-False] | 1.3286ms | 1.0498ms | 952.6064 Ops/s | 945.3778 Ops/s | $\color{#35bf28}+0.76\%$ |
| test_values[td_lambda_return_estimate-True-False] | 84.3097ms | 83.5653ms | 11.9667 Ops/s | 11.4143 Ops/s | $\color{#35bf28}+4.84\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3591ms | 1.0481ms | 954.0640 Ops/s | 925.1975 Ops/s | $\color{#35bf28}+3.12\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.3283ms | 23.1486ms | 43.1992 Ops/s | 41.0946 Ops/s | $\textbf{\color{#35bf28}+5.12\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0721ms | 0.7154ms | 1.3979 KOps/s | 1.4099 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7813ms | 0.6338ms | 1.5778 KOps/s | 1.5289 KOps/s | $\color{#35bf28}+3.20\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6058ms | 1.4451ms | 692.0167 Ops/s | 686.9249 Ops/s | $\color{#35bf28}+0.74\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7971ms | 0.6476ms | 1.5442 KOps/s | 1.4940 KOps/s | $\color{#35bf28}+3.36\%$ |
| test_dqn_speed[False-None] | 6.9719ms | 1.2776ms | 782.6916 Ops/s | 775.2076 Ops/s | $\color{#35bf28}+0.97\%$ |
| test_dqn_speed[False-backward] | 1.9480ms | 1.7814ms | 561.3605 Ops/s | 557.4321 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_dqn_speed[True-None] | 0.6965ms | 0.5375ms | 1.8604 KOps/s | 1.6166 KOps/s | $\textbf{\color{#35bf28}+15.08\%}$ |
| test_dqn_speed[True-backward] | 1.0211ms | 0.9734ms | 1.0273 KOps/s | 970.6116 Ops/s | $\textbf{\color{#35bf28}+5.84\%}$ |
| test_dqn_speed[reduce-overhead-None] | 0.7221ms | 0.5389ms | 1.8558 KOps/s | 1.7836 KOps/s | $\color{#35bf28}+4.05\%$ |
| test_dqn_speed[reduce-overhead-backward] | 1.0229ms | 0.9837ms | 1.0165 KOps/s | 944.0760 Ops/s | $\textbf{\color{#35bf28}+7.68\%}$ |
| test_ddpg_speed[False-None] | 3.0569ms | 2.6198ms | 381.7142 Ops/s | 383.7012 Ops/s | $\color{#d91a1a}-0.52\%$ |
| test_ddpg_speed[False-backward] | 3.9107ms | 3.7881ms | 263.9874 Ops/s | 264.8201 Ops/s | $\color{#d91a1a}-0.31\%$ |
| test_ddpg_speed[True-None] | 1.3717ms | 1.2053ms | 829.6856 Ops/s | 810.0888 Ops/s | $\color{#35bf28}+2.42\%$ |
| test_ddpg_speed[True-backward] | 2.2966ms | 2.1595ms | 463.0596 Ops/s | 411.2355 Ops/s | $\textbf{\color{#35bf28}+12.60\%}$ |
| test_ddpg_speed[reduce-overhead-None] | 1.4112ms | 1.2143ms | 823.5202 Ops/s | 791.3015 Ops/s | $\color{#35bf28}+4.07\%$ |
| test_ddpg_speed[reduce-overhead-backward] | 2.3060ms | 2.1761ms | 459.5376 Ops/s | 457.1217 Ops/s | $\color{#35bf28}+0.53\%$ |
| test_sac_speed[False-None] | 8.4379ms | 7.3105ms | 136.7897 Ops/s | 137.1370 Ops/s | $\color{#d91a1a}-0.25\%$ |
| test_sac_speed[False-backward] | 10.8544ms | 10.4255ms | 95.9183 Ops/s | 95.2619 Ops/s | $\color{#35bf28}+0.69\%$ |
| test_sac_speed[True-None] | 2.1336ms | 1.9621ms | 509.6473 Ops/s | 502.2619 Ops/s | $\color{#35bf28}+1.47\%$ |
| test_sac_speed[True-backward] | 3.9613ms | 3.8517ms | 259.6237 Ops/s | 248.4818 Ops/s | $\color{#35bf28}+4.48\%$ |
| test_sac_speed[reduce-overhead-None] | 2.3757ms | 1.9702ms | 507.5699 Ops/s | 497.0100 Ops/s | $\color{#35bf28}+2.12\%$ |
| test_sac_speed[reduce-overhead-backward] | 4.0516ms | 3.8561ms | 259.3311 Ops/s | 255.2905 Ops/s | $\color{#35bf28}+1.58\%$ |
| test_redq_speed[False-None] | 10.6184ms | 9.7762ms | 102.2894 Ops/s | 92.8246 Ops/s | $\textbf{\color{#35bf28}+10.20\%}$ |
| test_redq_speed[False-backward] | 17.4725ms | 16.8540ms | 59.3332 Ops/s | 57.1362 Ops/s | $\color{#35bf28}+3.85\%$ |
| test_redq_speed[True-None] | 4.0332ms | 3.5022ms | 285.5357 Ops/s | 284.2895 Ops/s | $\color{#35bf28}+0.44\%$ |
| test_redq_speed[True-backward] | 8.9267ms | 8.5419ms | 117.0704 Ops/s | 109.5927 Ops/s | $\textbf{\color{#35bf28}+6.82\%}$ |
| test_redq_speed[reduce-overhead-None] | 3.7460ms | 3.5087ms | 285.0023 Ops/s | 291.7728 Ops/s | $\color{#d91a1a}-2.32\%$ |
| test_redq_speed[reduce-overhead-backward] | 8.8676ms | 8.5169ms | 117.4134 Ops/s | 116.1825 Ops/s | $\color{#35bf28}+1.06\%$ |
| test_redq_deprec_speed[False-None] | 10.7427ms | 10.2544ms | 97.5193 Ops/s | 96.9066 Ops/s | $\color{#35bf28}+0.63\%$ |
| test_redq_deprec_speed[False-backward] | 15.3597ms | 14.9356ms | 66.9543 Ops/s | 66.3031 Ops/s | $\color{#35bf28}+0.98\%$ |
| test_redq_deprec_speed[True-None] | 3.4106ms | 3.1499ms | 317.4739 Ops/s | 314.4157 Ops/s | $\color{#35bf28}+0.97\%$ |
| test_redq_deprec_speed[True-backward] | 7.2361ms | 7.0051ms | 142.7526 Ops/s | 141.5845 Ops/s | $\color{#35bf28}+0.83\%$ |
| test_redq_deprec_speed[reduce-overhead-None] | 3.3408ms | 3.1193ms | 320.5876 Ops/s | 301.8677 Ops/s | $\textbf{\color{#35bf28}+6.20\%}$ |
| test_redq_deprec_speed[reduce-overhead-backward] | 7.2588ms | 6.9698ms | 143.4757 Ops/s | 141.7571 Ops/s | $\color{#35bf28}+1.21\%$ |
| test_td3_speed[False-None] | 7.4396ms | 7.2636ms | 137.6723 Ops/s | 135.7833 Ops/s | $\color{#35bf28}+1.39\%$ |
| test_td3_speed[False-backward] | 10.4534ms | 10.1055ms | 98.9561 Ops/s | 97.2636 Ops/s | $\color{#35bf28}+1.74\%$ |
| test_td3_speed[True-None] | 1.9562ms | 1.8465ms | 541.5653 Ops/s | 527.9254 Ops/s | $\color{#35bf28}+2.58\%$ |
| test_td3_speed[True-backward] | 3.7770ms | 3.5901ms | 278.5455 Ops/s | 260.2934 Ops/s | $\textbf{\color{#35bf28}+7.01\%}$ |
| test_td3_speed[reduce-overhead-None] | 1.8877ms | 1.8402ms | 543.4174 Ops/s | 523.3806 Ops/s | $\color{#35bf28}+3.83\%$ |
| test_td3_speed[reduce-overhead-backward] | 3.7930ms | 3.6299ms | 275.4916 Ops/s | 274.4926 Ops/s | $\color{#35bf28}+0.36\%$ |
| test_cql_speed[False-None] | 27.4106ms | 24.6067ms | 40.6393 Ops/s | 41.0419 Ops/s | $\color{#d91a1a}-0.98\%$ |
| test_cql_speed[False-backward] | 37.0736ms | 33.7997ms | 29.5860 Ops/s | 29.9587 Ops/s | $\color{#d91a1a}-1.24\%$ |
| test_cql_speed[True-None] | 11.2034ms | 10.8197ms | 92.4237 Ops/s | 92.5138 Ops/s | $\color{#d91a1a}-0.10\%$ |
| test_cql_speed[True-backward] | 16.8627ms | 16.4566ms | 60.7660 Ops/s | 59.6179 Ops/s | $\color{#35bf28}+1.93\%$ |
| test_cql_speed[reduce-overhead-None] | 11.0710ms | 10.7758ms | 92.8008 Ops/s | 93.9564 Ops/s | $\color{#d91a1a}-1.23\%$ |
| test_cql_speed[reduce-overhead-backward] | 16.9315ms | 16.5304ms | 60.4946 Ops/s | 60.3325 Ops/s | $\color{#35bf28}+0.27\%$ |
| test_a2c_speed[False-None] | 0.3603s | 7.0458ms | 141.9281 Ops/s | 189.3150 Ops/s | $\textbf{\color{#d91a1a}-25.03\%}$ |
| test_a2c_speed[False-backward] | 11.8165ms | 11.4280ms | 87.5043 Ops/s | 85.6708 Ops/s | $\color{#35bf28}+2.14\%$ |
| test_a2c_speed[True-None] | 3.2536ms | 2.9947ms | 333.9223 Ops/s | 330.1688 Ops/s | $\color{#35bf28}+1.14\%$ |
| test_a2c_speed[True-backward] | 8.6268ms | 8.3534ms | 119.7113 Ops/s | 120.3556 Ops/s | $\color{#d91a1a}-0.54\%$ |
| test_a2c_speed[reduce-overhead-None] | 3.1690ms | 2.9522ms | 338.7328 Ops/s | 332.0911 Ops/s | $\color{#35bf28}+2.00\%$ |
| test_a2c_speed[reduce-overhead-backward] | 8.4592ms | 8.2734ms | 120.8699 Ops/s | 120.7710 Ops/s | $\color{#35bf28}+0.08\%$ |
| test_ppo_speed[False-None] | 5.7350ms | 5.5521ms | 180.1131 Ops/s | 176.7782 Ops/s | $\color{#35bf28}+1.89\%$ |
| test_ppo_speed[False-backward] | 12.2726ms | 11.9267ms | 83.8453 Ops/s | 83.2403 Ops/s | $\color{#35bf28}+0.73\%$ |
| test_ppo_speed[True-None] | 3.6712ms | 3.3798ms | 295.8795 Ops/s | 292.8098 Ops/s | $\color{#35bf28}+1.05\%$ |
| test_ppo_speed[True-backward] | 8.4543ms | 8.0567ms | 124.1202 Ops/s | 122.8659 Ops/s | $\color{#35bf28}+1.02\%$ |
| test_ppo_speed[reduce-overhead-None] | 3.6132ms | 3.3643ms | 297.2343 Ops/s | 297.6362 Ops/s | $\color{#d91a1a}-0.14\%$ |
| test_ppo_speed[reduce-overhead-backward] | 8.3409ms | 8.0924ms | 123.5734 Ops/s | 123.1159 Ops/s | $\color{#35bf28}+0.37\%$ |
| test_reinforce_speed[False-None] | 4.6401ms | 4.3644ms | 229.1249 Ops/s | 224.4873 Ops/s | $\color{#35bf28}+2.07\%$ |
| test_reinforce_speed[False-backward] | 8.8319ms | 7.1636ms | 139.5938 Ops/s | 138.6272 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_reinforce_speed[True-None] | 2.3469ms | 2.1637ms | 462.1802 Ops/s | 451.4373 Ops/s | $\color{#35bf28}+2.38\%$ |
| test_reinforce_speed[True-backward] | 7.2798ms | 6.9780ms | 143.3067 Ops/s | 142.1416 Ops/s | $\color{#35bf28}+0.82\%$ |
| test_reinforce_speed[reduce-overhead-None] | 2.4338ms | 2.1739ms | 460.0064 Ops/s | 455.3035 Ops/s | $\color{#35bf28}+1.03\%$ |
| test_reinforce_speed[reduce-overhead-backward] | 7.2113ms | 7.0105ms | 142.6428 Ops/s | 141.0207 Ops/s | $\color{#35bf28}+1.15\%$ |
| test_iql_speed[False-None] | 19.5440ms | 18.7463ms | 53.3437 Ops/s | 51.2771 Ops/s | $\color{#35bf28}+4.03\%$ |
| test_iql_speed[False-backward] | 30.1240ms | 29.2918ms | 34.1393 Ops/s | 33.5615 Ops/s | $\color{#35bf28}+1.72\%$ |
| test_iql_speed[True-None] | 7.0460ms | 6.6264ms | 150.9111 Ops/s | 147.1600 Ops/s | $\color{#35bf28}+2.55\%$ |
| test_iql_speed[True-backward] | 17.3864ms | 15.2733ms | 65.4738 Ops/s | 64.5407 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_iql_speed[reduce-overhead-None] | 6.9982ms | 6.6135ms | 151.2068 Ops/s | 149.0677 Ops/s | $\color{#35bf28}+1.44\%$ |
test_iql_speed[reduce-overhead-backward] 15.5780ms 15.1344ms 66.0748 Ops/s 63.6418 Ops/s $\color{#35bf28}+3.82\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2184ms 6.0131ms 166.3048 Ops/s 164.5016 Ops/s $\color{#35bf28}+1.10\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.4255ms 0.2705ms 3.6972 KOps/s 3.7314 KOps/s $\color{#d91a1a}-0.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4576ms 0.2450ms 4.0809 KOps/s 4.0947 KOps/s $\color{#d91a1a}-0.34\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0117ms 5.7129ms 175.0415 Ops/s 172.6246 Ops/s $\color{#35bf28}+1.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1182ms 0.3227ms 3.0991 KOps/s 3.9443 KOps/s $\textbf{\color{#d91a1a}-21.43\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5637ms 0.2940ms 3.4009 KOps/s 4.3197 KOps/s $\textbf{\color{#d91a1a}-21.27\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4488ms 1.2595ms 793.9516 Ops/s 835.5233 Ops/s $\color{#d91a1a}-4.98\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3239ms 1.1429ms 874.9551 Ops/s 873.9847 Ops/s $\color{#35bf28}+0.11\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2266ms 5.9717ms 167.4569 Ops/s 167.7685 Ops/s $\color{#d91a1a}-0.19\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8406ms 0.4926ms 2.0300 KOps/s 2.4959 KOps/s $\textbf{\color{#d91a1a}-18.67\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7176ms 0.4763ms 2.0994 KOps/s 2.6236 KOps/s $\textbf{\color{#d91a1a}-19.98\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0211ms 5.8453ms 171.0778 Ops/s 170.4358 Ops/s $\color{#35bf28}+0.38\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1538ms 0.3620ms 2.7621 KOps/s 3.7278 KOps/s $\textbf{\color{#d91a1a}-25.91\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5888ms 0.3436ms 2.9103 KOps/s 4.1180 KOps/s $\textbf{\color{#d91a1a}-29.33\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1349ms 5.8144ms 171.9882 Ops/s 171.8955 Ops/s $\color{#35bf28}+0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8437ms 0.3466ms 2.8855 KOps/s 3.7642 KOps/s $\textbf{\color{#d91a1a}-23.34\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5470ms 0.3313ms 3.0186 KOps/s 4.2704 KOps/s $\textbf{\color{#d91a1a}-29.31\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2152ms 6.0127ms 166.3144 Ops/s 167.2341 Ops/s $\color{#d91a1a}-0.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0200ms 0.4545ms 2.2000 KOps/s 2.2225 KOps/s $\color{#d91a1a}-1.01\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7302ms 0.4280ms 2.3362 KOps/s 2.2263 KOps/s $\color{#35bf28}+4.94\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4486s 14.0387ms 71.2317 Ops/s 192.3524 Ops/s $\textbf{\color{#d91a1a}-62.97\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.0397ms 2.0641ms 484.4751 Ops/s 453.8736 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.0126ms 1.0455ms 956.5097 Ops/s 890.8496 Ops/s $\textbf{\color{#35bf28}+7.37\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.5973ms 5.1709ms 193.3897 Ops/s 33.8822 Ops/s $\textbf{\color{#35bf28}+470.77\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.4142ms 1.9610ms 509.9458 Ops/s 480.1600 Ops/s $\textbf{\color{#35bf28}+6.20\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.5820ms 1.2663ms 789.7230 Ops/s 901.0137 Ops/s $\textbf{\color{#d91a1a}-12.35\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3744s 12.7651ms 78.3389 Ops/s 176.5759 Ops/s $\textbf{\color{#d91a1a}-55.63\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.9969ms 1.9337ms 517.1405 Ops/s 466.4676 Ops/s $\textbf{\color{#35bf28}+10.86\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.4412ms 1.3512ms 740.1058 Ops/s 703.6172 Ops/s $\textbf{\color{#35bf28}+5.19\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.6632ms 12.3171ms 81.1881 Ops/s 79.0719 Ops/s $\color{#35bf28}+2.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 21.1371ms 16.5028ms 60.5958 Ops/s 61.3912 Ops/s $\color{#d91a1a}-1.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3968ms 17.0293ms 58.7225 Ops/s 58.0103 Ops/s $\color{#35bf28}+1.23\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.0051ms 16.3793ms 61.0526 Ops/s 58.4563 Ops/s $\color{#35bf28}+4.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.8675ms 16.9666ms 58.9392 Ops/s 58.0853 Ops/s $\color{#35bf28}+1.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.7301ms 17.7492ms 56.3404 Ops/s 56.3694 Ops/s $\color{#d91a1a}-0.05\%$

[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens vmoens merged commit 323f4d7 into gh/vmoens/5/base Nov 8, 2024
1 check passed
@vmoens vmoens deleted the gh/vmoens/5/head branch November 8, 2024 14:41