[BugFix] Fix slice sampler #1762

Merged
merged 1 commit into from
Dec 25, 2023

vmoens (Contributor) commented Dec 25, 2023

pytorch-bot bot commented Dec 25, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1762

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit ec839f7 with merge base 64434df:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

  • Build M1 Wheels / pytorch/rl / wheel-py3_8-cpu (gh)
    ImportError: cannot import name 'MemoryMappedTensor' from 'tensordict' (/Users/ec2-user/runner/_work/_temp/conda_environment_7322749480/lib/python3.8/site-packages/tensordict/__init__.py)

BROKEN TRUNK - The following jobs failed but were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 25, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}8$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1384s | 70.9348ms | 14.0975 Ops/s | 15.3774 Ops/s | $\textbf{\color{#d91a1a}-8.32\%}$ |
| test_sync | 51.7547ms | 35.7671ms | 27.9586 Ops/s | 28.5357 Ops/s | $\color{#d91a1a}-2.02\%$ |
| test_async | 0.1117s | 34.4684ms | 29.0121 Ops/s | 28.4960 Ops/s | $\color{#35bf28}+1.81\%$ |
| test_simple | 0.5127s | 0.4548s | 2.1989 Ops/s | 2.2000 Ops/s | $\color{#d91a1a}-0.05\%$ |
| test_transformed | 0.6916s | 0.6367s | 1.5706 Ops/s | 1.6324 Ops/s | $\color{#d91a1a}-3.79\%$ |
| test_serial | 1.4829s | 1.4241s | 0.7022 Ops/s | 0.7194 Ops/s | $\color{#d91a1a}-2.38\%$ |
| test_parallel | 1.4566s | 1.3969s | 0.7159 Ops/s | 0.7272 Ops/s | $\color{#d91a1a}-1.56\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1411ms | 21.4947μs | 46.5231 KOps/s | 45.6774 KOps/s | $\color{#35bf28}+1.85\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 38.5020μs | 13.2139μs | 75.6778 KOps/s | 74.3773 KOps/s | $\color{#35bf28}+1.75\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 38.5210μs | 12.8447μs | 77.8531 KOps/s | 76.7247 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 32.6480μs | 7.7167μs | 129.5891 KOps/s | 127.9712 KOps/s | $\color{#35bf28}+1.26\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 85.5290μs | 23.0061μs | 43.4667 KOps/s | 42.6660 KOps/s | $\color{#35bf28}+1.88\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 49.3380μs | 14.4274μs | 69.3126 KOps/s | 68.0595 KOps/s | $\color{#35bf28}+1.84\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 55.1220μs | 13.9517μs | 71.6756 KOps/s | 69.6302 KOps/s | $\color{#35bf28}+2.94\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 48.5800μs | 9.0478μs | 110.5238 KOps/s | 110.7582 KOps/s | $\color{#d91a1a}-0.21\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 50.8140μs | 24.3299μs | 41.1017 KOps/s | 40.5130 KOps/s | $\color{#35bf28}+1.45\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 80.3590μs | 15.8695μs | 63.0141 KOps/s | 63.1475 KOps/s | $\color{#d91a1a}-0.21\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 75.8910μs | 14.1818μs | 70.5130 KOps/s | 70.0712 KOps/s | $\color{#35bf28}+0.63\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 28.3830μs | 9.1045μs | 109.8355 KOps/s | 111.1527 KOps/s | $\color{#d91a1a}-1.19\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 72.6860μs | 25.8350μs | 38.7072 KOps/s | 38.5967 KOps/s | $\color{#35bf28}+0.29\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 42.9200μs | 17.2202μs | 58.0713 KOps/s | 58.7873 KOps/s | $\color{#d91a1a}-1.22\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 68.4470μs | 15.2358μs | 65.6350 KOps/s | 64.9632 KOps/s | $\color{#35bf28}+1.03\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 33.9440μs | 10.2816μs | 97.2611 KOps/s | 98.6669 KOps/s | $\color{#d91a1a}-1.42\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 93.6740μs | 24.4819μs | 40.8466 KOps/s | 40.8380 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 35.0850μs | 15.9135μs | 62.8395 KOps/s | 63.2771 KOps/s | $\color{#d91a1a}-0.69\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 67.0740μs | 16.2758μs | 61.4409 KOps/s | 59.9747 KOps/s | $\color{#35bf28}+2.44\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 44.3120μs | 10.3194μs | 96.9049 KOps/s | 97.6013 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 76.4320μs | 25.1977μs | 39.6861 KOps/s | 38.6742 KOps/s | $\color{#35bf28}+2.62\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 40.7550μs | 17.2183μs | 58.0777 KOps/s | 58.8675 KOps/s | $\color{#d91a1a}-1.34\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 0.1917ms | 18.6567μs | 53.6001 KOps/s | 56.4774 KOps/s | $\textbf{\color{#d91a1a}-5.09\%}$ |
| test_step_mdp_speed[False-True-False-False-False] | 57.6670μs | 11.5574μs | 86.5249 KOps/s | 87.6122 KOps/s | $\color{#d91a1a}-1.24\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 73.0350μs | 26.6118μs | 37.5773 KOps/s | 37.1898 KOps/s | $\color{#35bf28}+1.04\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 49.0710μs | 18.5115μs | 54.0205 KOps/s | 54.2686 KOps/s | $\color{#d91a1a}-0.46\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 57.7570μs | 17.4190μs | 57.4087 KOps/s | 56.2266 KOps/s | $\color{#35bf28}+2.10\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 67.0640μs | 11.3732μs | 87.9262 KOps/s | 86.9294 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 68.2770μs | 27.6146μs | 36.2127 KOps/s | 35.8536 KOps/s | $\color{#35bf28}+1.00\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 62.7470μs | 19.4192μs | 51.4954 KOps/s | 51.6452 KOps/s | $\color{#d91a1a}-0.29\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 41.5570μs | 18.4602μs | 54.1705 KOps/s | 53.3034 KOps/s | $\color{#35bf28}+1.63\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 0.1370ms | 12.5497μs | 79.6832 KOps/s | 78.7424 KOps/s | $\color{#35bf28}+1.19\%$ |
| test_values[generalized_advantage_estimate-True-True] | 12.3998ms | 12.0246ms | 83.1626 Ops/s | 78.8936 Ops/s | $\textbf{\color{#35bf28}+5.41\%}$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.5646ms | 28.2486ms | 35.4000 Ops/s | 35.1530 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_values[td0_return_estimate-False-False] | 0.2601ms | 0.1787ms | 5.5964 KOps/s | 5.4491 KOps/s | $\color{#35bf28}+2.70\%$ |
| test_values[td1_return_estimate-False-False] | 26.5235ms | 25.4996ms | 39.2163 Ops/s | 38.5402 Ops/s | $\color{#35bf28}+1.75\%$ |
| test_values[vec_td1_return_estimate-False-False] | 36.6956ms | 28.2623ms | 35.3828 Ops/s | 35.4702 Ops/s | $\color{#d91a1a}-0.25\%$ |
| test_values[td_lambda_return_estimate-True-False] | 37.2453ms | 35.6347ms | 28.0626 Ops/s | 27.5968 Ops/s | $\color{#35bf28}+1.69\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 37.7858ms | 28.2509ms | 35.3971 Ops/s | 35.0528 Ops/s | $\color{#35bf28}+0.98\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.1577ms | 7.9951ms | 125.0767 Ops/s | 121.9850 Ops/s | $\color{#35bf28}+2.53\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.2345ms | 1.9167ms | 521.7338 Ops/s | 549.7207 Ops/s | $\textbf{\color{#d91a1a}-5.09\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 11.3469ms | 0.4402ms | 2.2719 KOps/s | 2.3050 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 49.5667ms | 39.9228ms | 25.0483 Ops/s | 24.8526 Ops/s | $\color{#35bf28}+0.79\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 13.3397ms | 2.6766ms | 373.6105 Ops/s | 370.7701 Ops/s | $\color{#35bf28}+0.77\%$ |
| test_dqn_speed | 79.8950ms | 8.4395ms | 118.4911 Ops/s | 121.8983 Ops/s | $\color{#d91a1a}-2.80\%$ |
| test_ddpg_speed | 24.8419ms | 14.9422ms | 66.9244 Ops/s | 68.9244 Ops/s | $\color{#d91a1a}-2.90\%$ |
| test_sac_speed | 38.6310ms | 30.3051ms | 32.9977 Ops/s | 34.0472 Ops/s | $\color{#d91a1a}-3.08\%$ |
| test_redq_speed | 40.8831ms | 36.4721ms | 27.4182 Ops/s | 28.0942 Ops/s | $\color{#d91a1a}-2.41\%$ |
| test_redq_deprec_speed | 36.8600ms | 26.0802ms | 38.3432 Ops/s | 38.9957 Ops/s | $\color{#d91a1a}-1.67\%$ |
| test_td3_speed | 28.8515ms | 20.5960ms | 48.5532 Ops/s | 49.4794 Ops/s | $\color{#d91a1a}-1.87\%$ |
| test_cql_speed | 94.9416ms | 89.0101ms | 11.2347 Ops/s | 11.2224 Ops/s | $\color{#35bf28}+0.11\%$ |
| test_a2c_speed | 38.7061ms | 27.3168ms | 36.6075 Ops/s | 36.7989 Ops/s | $\color{#d91a1a}-0.52\%$ |
| test_ppo_speed | 33.8842ms | 27.4657ms | 36.4090 Ops/s | 36.6591 Ops/s | $\color{#d91a1a}-0.68\%$ |
| test_reinforce_speed | 37.6295ms | 26.5393ms | 37.6799 Ops/s | 38.0481 Ops/s | $\color{#d91a1a}-0.97\%$ |
| test_iql_speed | 64.6264ms | 63.6739ms | 15.7050 Ops/s | 15.6025 Ops/s | $\color{#35bf28}+0.66\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.3940ms | 1.9288ms | 518.4648 Ops/s | 524.9591 Ops/s | $\color{#d91a1a}-1.24\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 8.8630ms | 0.5372ms | 1.8615 KOps/s | 1.9143 KOps/s | $\color{#d91a1a}-2.76\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 9.0978ms | 0.5223ms | 1.9147 KOps/s | 1.9411 KOps/s | $\color{#d91a1a}-1.36\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.4882ms | 1.8855ms | 530.3664 Ops/s | 536.6169 Ops/s | $\color{#d91a1a}-1.16\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 4.0892ms | 0.5312ms | 1.8825 KOps/s | 1.9029 KOps/s | $\color{#d91a1a}-1.08\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 12.1533ms | 0.5163ms | 1.9369 KOps/s | 2.0002 KOps/s | $\color{#d91a1a}-3.16\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.8879ms | 2.1378ms | 467.7672 Ops/s | 474.1634 Ops/s | $\color{#d91a1a}-1.35\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 8.8807ms | 0.6892ms | 1.4510 KOps/s | 1.4970 KOps/s | $\color{#d91a1a}-3.08\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 9.3329ms | 0.6780ms | 1.4749 KOps/s | 1.5180 KOps/s | $\color{#d91a1a}-2.84\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 0.1215s | 2.2205ms | 450.3466 Ops/s | 521.5765 Ops/s | $\textbf{\color{#d91a1a}-13.66\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6573ms | 0.5285ms | 1.8920 KOps/s | 1.8704 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 17.3801ms | 0.5440ms | 1.8384 KOps/s | 1.9664 KOps/s | $\textbf{\color{#d91a1a}-6.51\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.1074ms | 1.8654ms | 536.0730 Ops/s | 544.6950 Ops/s | $\color{#d91a1a}-1.58\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 9.5392ms | 0.5426ms | 1.8429 KOps/s | 1.9390 KOps/s | $\color{#d91a1a}-4.95\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7428ms | 0.5108ms | 1.9578 KOps/s | 1.9348 KOps/s | $\color{#35bf28}+1.19\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.5955ms | 2.1110ms | 473.7038 Ops/s | 472.8505 Ops/s | $\color{#35bf28}+0.18\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 8.9857ms | 0.6987ms | 1.4313 KOps/s | 1.4962 KOps/s | $\color{#d91a1a}-4.34\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7982ms | 0.6599ms | 1.5154 KOps/s | 1.4946 KOps/s | $\color{#35bf28}+1.39\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1301s | 17.0735ms | 58.5702 Ops/s | 67.6530 Ops/s | $\textbf{\color{#d91a1a}-13.43\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.3200ms | 12.0693ms | 82.8551 Ops/s | 82.5162 Ops/s | $\color{#35bf28}+0.41\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.0856ms | 1.6090ms | 621.5139 Ops/s | 635.4432 Ops/s | $\color{#d91a1a}-2.19\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1118s | 14.6582ms | 68.2214 Ops/s | 58.5330 Ops/s | $\textbf{\color{#35bf28}+16.55\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1140s | 14.2888ms | 69.9850 Ops/s | 81.5279 Ops/s | $\textbf{\color{#d91a1a}-14.16\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.3139ms | 1.6182ms | 617.9621 Ops/s | 645.6143 Ops/s | $\color{#d91a1a}-4.28\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1143s | 14.8424ms | 67.3747 Ops/s | 58.4884 Ops/s | $\textbf{\color{#35bf28}+15.19\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1060s | 14.3240ms | 69.8128 Ops/s | 80.6734 Ops/s | $\textbf{\color{#d91a1a}-13.46\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.5118ms | 1.7097ms | 584.9026 Ops/s | 564.1050 Ops/s | $\color{#35bf28}+3.69\%$ |
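
For readers checking the numbers: the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, with positive values meaning this PR is faster. The sketch below is only an illustration of that arithmetic, not the benchmark bot's actual code; the helper names `to_ops_per_s` and `relative_change` are hypothetical, and the formula is an assumption inferred from the rows themselves.

```python
# Hypothetical helpers: reproduce the "Change" column from the two Ops columns.
# Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD, in percent.

# Only the two units that occur in the tables are handled.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_s(text: str) -> float:
    """Parse a cell such as '1.0893 KOps/s' into operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to the repo HEAD throughput."""
    pr = to_ops_per_s(ops_pr)
    head = to_ops_per_s(ops_head)
    return 100.0 * (pr - head) / head


if __name__ == "__main__":
    # CPU row test_single: 14.0975 Ops/s vs 15.3774 Ops/s, reported as -8.32%.
    print(f"{relative_change('14.0975 Ops/s', '15.3774 Ops/s'):+.2f}%")
```

Because the displayed Ops values are already rounded, a recomputed percentage can differ from the reported one in the last digit, especially for rows that mix `KOps/s` and `Ops/s`.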

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}4$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1211s | 0.1193s | 8.3823 Ops/s | 8.4017 Ops/s | $\color{#d91a1a}-0.23\%$ |
| test_sync | 0.1765s | 0.1091s | 9.1656 Ops/s | 9.1638 Ops/s | $\color{#35bf28}+0.02\%$ |
| test_async | 0.2624s | 99.2983ms | 10.0707 Ops/s | 9.8056 Ops/s | $\color{#35bf28}+2.70\%$ |
| test_single_pixels | 0.1337s | 0.1330s | 7.5187 Ops/s | 7.0198 Ops/s | $\textbf{\color{#35bf28}+7.11\%}$ |
| test_sync_pixels | 96.1839ms | 95.2496ms | 10.4987 Ops/s | 10.6007 Ops/s | $\color{#d91a1a}-0.96\%$ |
| test_async_pixels | 0.1719s | 89.2793ms | 11.2008 Ops/s | 11.0983 Ops/s | $\color{#35bf28}+0.92\%$ |
| test_simple | 0.9350s | 0.8793s | 1.1372 Ops/s | 1.1527 Ops/s | $\color{#d91a1a}-1.35\%$ |
| test_transformed | 1.1802s | 1.1200s | 0.8928 Ops/s | 0.8895 Ops/s | $\color{#35bf28}+0.38\%$ |
| test_serial | 2.5623s | 2.5421s | 0.3934 Ops/s | 0.4069 Ops/s | $\color{#d91a1a}-3.32\%$ |
| test_parallel | 2.5745s | 2.5054s | 0.3991 Ops/s | 0.4049 Ops/s | $\color{#d91a1a}-1.43\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 97.7820μs | 32.2233μs | 31.0335 KOps/s | 29.8308 KOps/s | $\color{#35bf28}+4.03\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 42.2610μs | 19.4549μs | 51.4010 KOps/s | 49.7562 KOps/s | $\color{#35bf28}+3.31\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 43.2210μs | 18.5669μs | 53.8592 KOps/s | 50.6487 KOps/s | $\textbf{\color{#35bf28}+6.34\%}$ |
| test_step_mdp_speed[True-True-True-False-False] | 36.0110μs | 11.1123μs | 89.9906 KOps/s | 87.1208 KOps/s | $\color{#35bf28}+3.29\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 67.2910μs | 34.4696μs | 29.0111 KOps/s | 27.9084 KOps/s | $\color{#35bf28}+3.95\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 42.5910μs | 21.1526μs | 47.2756 KOps/s | 45.8162 KOps/s | $\color{#35bf28}+3.19\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 42.9310μs | 20.5987μs | 48.5467 KOps/s | 47.2790 KOps/s | $\color{#35bf28}+2.68\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 32.3500μs | 12.9839μs | 77.0184 KOps/s | 74.8307 KOps/s | $\color{#35bf28}+2.92\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 61.4010μs | 36.9107μs | 27.0924 KOps/s | 26.8270 KOps/s | $\color{#35bf28}+0.99\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 49.6400μs | 23.1180μs | 43.2564 KOps/s | 42.2678 KOps/s | $\color{#35bf28}+2.34\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 44.3810μs | 20.2649μs | 49.3464 KOps/s | 47.4037 KOps/s | $\color{#35bf28}+4.10\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 31.1410μs | 13.1509μs | 76.0404 KOps/s | 76.5480 KOps/s | $\color{#d91a1a}-0.66\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 59.5510μs | 37.2257μs | 26.8632 KOps/s | 25.6784 KOps/s | $\color{#35bf28}+4.61\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 58.2500μs | 24.7624μs | 40.3838 KOps/s | 39.5469 KOps/s | $\color{#35bf28}+2.12\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 43.6210μs | 22.2680μs | 44.9074 KOps/s | 43.4870 KOps/s | $\color{#35bf28}+3.27\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 35.0020μs | 14.6901μs | 68.0729 KOps/s | 66.1144 KOps/s | $\color{#35bf28}+2.96\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 58.1010μs | 36.0668μs | 27.7263 KOps/s | 27.2417 KOps/s | $\color{#35bf28}+1.78\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 43.2400μs | 23.0675μs | 43.3511 KOps/s | 41.5966 KOps/s | $\color{#35bf28}+4.22\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 46.8400μs | 24.3510μs | 41.0661 KOps/s | 38.9202 KOps/s | $\textbf{\color{#35bf28}+5.51\%}$ |
| test_step_mdp_speed[False-True-True-False-False] | 34.3010μs | 14.8826μs | 67.1926 KOps/s | 65.8366 KOps/s | $\color{#35bf28}+2.06\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 62.6310μs | 37.9087μs | 26.3791 KOps/s | 25.1158 KOps/s | $\textbf{\color{#35bf28}+5.03\%}$ |
| test_step_mdp_speed[False-True-False-True-False] | 52.3110μs | 25.0446μs | 39.9287 KOps/s | 38.9983 KOps/s | $\color{#35bf28}+2.39\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 45.6810μs | 25.6095μs | 39.0480 KOps/s | 37.5498 KOps/s | $\color{#35bf28}+3.99\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 36.9410μs | 16.6775μs | 59.9611 KOps/s | 58.4030 KOps/s | $\color{#35bf28}+2.67\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 66.2320μs | 39.3119μs | 25.4376 KOps/s | 24.9440 KOps/s | $\color{#35bf28}+1.98\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 46.5700μs | 26.7996μs | 37.3140 KOps/s | 36.1755 KOps/s | $\color{#35bf28}+3.15\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 73.6510μs | 26.6043μs | 37.5879 KOps/s | 36.5832 KOps/s | $\color{#35bf28}+2.75\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 41.5510μs | 16.5606μs | 60.3843 KOps/s | 58.5779 KOps/s | $\color{#35bf28}+3.08\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 80.1020μs | 41.6917μs | 23.9856 KOps/s | 23.1875 KOps/s | $\color{#35bf28}+3.44\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 51.8410μs | 28.7986μs | 34.7239 KOps/s | 34.4257 KOps/s | $\color{#35bf28}+0.87\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 45.6510μs | 27.8098μs | 35.9585 KOps/s | 35.0519 KOps/s | $\color{#35bf28}+2.59\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 40.0990μs | 18.4574μs | 54.1787 KOps/s | 54.3709 KOps/s | $\color{#d91a1a}-0.35\%$ |
| test_values[generalized_advantage_estimate-True-True] | 24.4031ms | 24.0164ms | 41.6382 Ops/s | 43.4595 Ops/s | $\color{#d91a1a}-4.19\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 83.4626ms | 3.2107ms | 311.4593 Ops/s | 304.1674 Ops/s | $\color{#35bf28}+2.40\%$ |
| test_values[td0_return_estimate-False-False] | 89.8310μs | 58.2578μs | 17.1651 KOps/s | 17.0212 KOps/s | $\color{#35bf28}+0.85\%$ |
| test_values[td1_return_estimate-False-False] | 51.3712ms | 49.7381ms | 20.1053 Ops/s | 20.3232 Ops/s | $\color{#d91a1a}-1.07\%$ |
| test_values[vec_td1_return_estimate-False-False] | 2.1681ms | 1.7341ms | 576.6730 Ops/s | 578.1377 Ops/s | $\color{#d91a1a}-0.25\%$ |
| test_values[td_lambda_return_estimate-True-False] | 82.1070ms | 81.2454ms | 12.3084 Ops/s | 12.6151 Ops/s | $\color{#d91a1a}-2.43\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 2.0323ms | 1.7230ms | 580.3994 Ops/s | 579.5203 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.9135ms | 22.1369ms | 45.1734 Ops/s | 46.5458 Ops/s | $\color{#d91a1a}-2.95\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8171ms | 0.6623ms | 1.5100 KOps/s | 1.5009 KOps/s | $\color{#35bf28}+0.60\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7010ms | 0.6361ms | 1.5720 KOps/s | 1.6008 KOps/s | $\color{#d91a1a}-1.80\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5310ms | 1.4257ms | 701.4103 Ops/s | 699.4044 Ops/s | $\color{#35bf28}+0.29\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8994ms | 0.6384ms | 1.5665 KOps/s | 1.5476 KOps/s | $\color{#35bf28}+1.22\%$ |
| test_dqn_speed | 13.8181ms | 7.4809ms | 133.6731 Ops/s | 133.7346 Ops/s | $\color{#d91a1a}-0.05\%$ |
| test_ddpg_speed | 15.3952ms | 14.6114ms | 68.4399 Ops/s | 68.8309 Ops/s | $\color{#d91a1a}-0.57\%$ |
| test_sac_speed | 30.2191ms | 29.4377ms | 33.9701 Ops/s | 34.0259 Ops/s | $\color{#d91a1a}-0.16\%$ |
| test_redq_speed | 0.1232s | 38.7423ms | 25.8116 Ops/s | 28.4318 Ops/s | $\textbf{\color{#d91a1a}-9.22\%}$ |
| test_redq_deprec_speed | 25.0873ms | 24.1233ms | 41.4537 Ops/s | 42.0521 Ops/s | $\color{#d91a1a}-1.42\%$ |
| test_td3_speed | 20.0178ms | 19.6455ms | 50.9023 Ops/s | 50.7099 Ops/s | $\color{#35bf28}+0.38\%$ |
| test_cql_speed | 84.5193ms | 83.7233ms | 11.9441 Ops/s | 11.9539 Ops/s | $\color{#d91a1a}-0.08\%$ |
| test_a2c_speed | 0.1242s | 29.4328ms | 33.9757 Ops/s | 37.0702 Ops/s | $\textbf{\color{#d91a1a}-8.35\%}$ |
| test_ppo_speed | 28.0404ms | 27.0568ms | 36.9593 Ops/s | 36.6324 Ops/s | $\color{#35bf28}+0.89\%$ |
| test_reinforce_speed | 27.2629ms | 26.0430ms | 38.3980 Ops/s | 38.1087 Ops/s | $\color{#35bf28}+0.76\%$ |
| test_iql_speed | 58.0843ms | 57.3788ms | 17.4280 Ops/s | 17.3070 Ops/s | $\color{#35bf28}+0.70\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.3551ms | 2.5821ms | 387.2761 Ops/s | 386.5344 Ops/s | $\color{#35bf28}+0.19\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9262ms | 0.7904ms | 1.2652 KOps/s | 1.2619 KOps/s | $\color{#35bf28}+0.26\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.8952ms | 0.7864ms | 1.2717 KOps/s | 1.2825 KOps/s | $\color{#d91a1a}-0.84\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1281ms | 2.5644ms | 389.9473 Ops/s | 388.4315 Ops/s | $\color{#35bf28}+0.39\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9914ms | 0.7827ms | 1.2776 KOps/s | 1.2806 KOps/s | $\color{#d91a1a}-0.23\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.8039ms | 0.7745ms | 1.2912 KOps/s | 1.2976 KOps/s | $\color{#d91a1a}-0.50\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.8299ms | 2.8794ms | 347.2930 Ops/s | 346.7066 Ops/s | $\color{#35bf28}+0.17\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0277ms | 0.9180ms | 1.0893 KOps/s | 918.2186 Ops/s | $\textbf{\color{#35bf28}+18.64\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 6.2758ms | 0.9155ms | 1.0923 KOps/s | 1.1036 KOps/s | $\color{#d91a1a}-1.03\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.9221ms | 2.5652ms | 389.8356 Ops/s | 385.6605 Ops/s | $\color{#35bf28}+1.08\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.2103ms | 0.7896ms | 1.2665 KOps/s | 1.2586 KOps/s | $\color{#35bf28}+0.63\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9529ms | 0.7783ms | 1.2848 KOps/s | 1.2716 KOps/s | $\color{#35bf28}+1.03\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1504ms | 2.5177ms | 397.1953 Ops/s | 391.2519 Ops/s | $\color{#35bf28}+1.52\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0010ms | 0.7817ms | 1.2793 KOps/s | 1.2751 KOps/s | $\color{#35bf28}+0.33\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9292ms | 0.7705ms | 1.2979 KOps/s | 1.2950 KOps/s | $\color{#35bf28}+0.22\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 0.1355s | 3.2595ms | 306.7928 Ops/s | 350.2290 Ops/s | $\textbf{\color{#d91a1a}-12.40\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.5246ms | 0.9222ms | 1.0844 KOps/s | 1.0851 KOps/s | $\color{#d91a1a}-0.07\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0174ms | 0.9115ms | 1.0971 KOps/s | 1.0972 KOps/s | $\color{#d91a1a}-0.02\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1702s | 16.0083ms | 62.4678 Ops/s | 55.9815 Ops/s | $\textbf{\color{#35bf28}+11.59\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 15.1326ms | 12.2002ms | 81.9660 Ops/s | 82.3379 Ops/s | $\color{#d91a1a}-0.45\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.6429ms | 1.9088ms | 523.9002 Ops/s | 520.6501 Ops/s | $\color{#35bf28}+0.62\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1218s | 14.9884ms | 66.7181 Ops/s | 58.0321 Ops/s | $\textbf{\color{#35bf28}+14.97\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 15.5999ms | 12.2645ms | 81.5361 Ops/s | 81.6986 Ops/s | $\color{#d91a1a}-0.20\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.9098ms | 1.9080ms | 524.0974 Ops/s | 520.2980 Ops/s | $\color{#35bf28}+0.73\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1220s | 17.3972ms | 57.4806 Ops/s | 65.8407 Ops/s | $\textbf{\color{#d91a1a}-12.70\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.2773ms | 12.2586ms | 81.5755 Ops/s | 81.0567 Ops/s | $\color{#35bf28}+0.64\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 5.9070ms | 2.0800ms | 480.7789 Ops/s | 487.8402 Ops/s | $\color{#d91a1a}-1.45\%$ |

@vmoens marked this pull request as ready for review December 25, 2023 15:55
@vmoens added the bug label Dec 25, 2023
@vmoens merged commit 80c63ad into main Dec 25, 2023
60 of 63 checks passed
@vmoens deleted the fix-slice-sampler branch December 25, 2023 15:56
nicklashansen added a commit to nicklashansen/tdmpc2 that referenced this pull request Dec 25, 2023

nicklashansen commented Jan 7, 2024

@vmoens It appears that the last time step of an episode never gets sampled. It is unclear whether this is unique to SliceBuffer or also present in other samplers. Discovered by @dasGringuen using the TD-MPC2 repo and the new SliceBuffer, and reproduced by me.

How to reproduce:

Appending a single time step filled with NaNs to the end of an episode does not break the code in any way (the NaNs never get sampled during model updates). Appending two such time steps raises a CUDA error.

```python
self._tds.append(self.to_td())  # Add NaNs to end of episode
self._tds.append(self.to_td())  # Add NaNs to end of episode <- this line breaks the code
self._ep_idx = self.buffer.add(torch.cat(self._tds))
```

where `self.to_td()` constructs a TensorDict that obeys the shape of transitions but is filled with NaNs.

I have added my "fix" to the tdmpc2 repo here: nicklashansen/tdmpc2@31249a8
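
For anyone who wants to check step coverage without relying on NaN sentinels, here is a minimal sketch of the idea in pure Python. It does not use TorchRL and is not the library's slice sampler: `sample_slice`, `step_coverage`, `never_sampled` and the `drop_last` flag are hypothetical names, and `drop_last=True` only emulates one possible off-by-one in the range of start indices. Whether the actual cause is of this kind is an open question.

```python
import random
from collections import Counter


def sample_slice(episode_len, slice_len, rng, drop_last=False):
    """Return the step indices of one contiguous slice inside a single episode."""
    # Number of valid start positions for a window of slice_len steps.
    num_starts = episode_len - slice_len + 1
    if drop_last:
        # Hypothetical off-by-one: the window ending on the last step is unreachable.
        num_starts -= 1
    if num_starts <= 0:
        raise ValueError("episode too short for the requested slice length")
    start = rng.randrange(num_starts)
    return list(range(start, start + slice_len))


def step_coverage(episode_len, slice_len, num_samples, drop_last=False, seed=0):
    """Count how often each step index shows up across many sampled slices."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_samples):
        counts.update(sample_slice(episode_len, slice_len, rng, drop_last))
    return counts


def never_sampled(counts, episode_len):
    """List the step indices that were not drawn even once."""
    return [t for t in range(episode_len) if counts[t] == 0]


if __name__ == "__main__":
    correct = step_coverage(episode_len=10, slice_len=3, num_samples=2000)
    buggy = step_coverage(episode_len=10, slice_len=3, num_samples=2000, drop_last=True)
    print("correct sampler misses:", never_sampled(correct, 10))
    print("off-by-one sampler misses:", never_sampled(buggy, 10))
```

The same kind of histogram could be built against a real buffer by storing a per-step index in each transition and counting which indices appear in sampled batches.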
