Value error after running 50000 timesteps #1251

Open
arun-dezerv opened this issue Jun 27, 2024 · 1 comment


arun-dezerv commented Jun 27, 2024

Stock Dimension: 30, State Space: 2371
<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>
{'batch_size': 64, 'buffer_size': 100000, 'learning_rate': 0.001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device
Logging to /content/drive/MyDrive/trained_model_bpm/sac_2010-01-01_2017-01-01_0.001_100000_1

| time/ | |
| episodes | 4 |
| fps | 20 |
| time_elapsed | 338 |
| total_timesteps | 7048 |
| train/ | |
| actor_loss | 2.31e+05 |
| critic_loss | 3.53e+06 |
| ent_coef | 84.8 |
| ent_coef_loss | -1.75e+03 |
| learning_rate | 0.001 |
| n_updates | 6947 |
| reward | 4.956748 |


| time/ | |
| episodes | 8 |
| fps | 20 |
| time_elapsed | 687 |
| total_timesteps | 14096 |
| train/ | |
| actor_loss | 2.58e+08 |
| critic_loss | 2.72e+13 |
| ent_coef | 9.74e+04 |
| ent_coef_loss | -4.53e+03 |
| learning_rate | 0.001 |
| n_updates | 13995 |
| reward | 4.956748 |

day: 1761, episode: 10
begin_total_asset: 1000000.00
end_total_asset: 3142988.68
total_reward: 2142988.68
total_cost: 0.00
total_trades: 29925
Sharpe: 1.135


| time/ | |
| episodes | 12 |
| fps | 20 |
| time_elapsed | 1035 |
| total_timesteps | 21144 |
| train/ | |
| actor_loss | 2.78e+11 |
| critic_loss | 2.76e+20 |
| ent_coef | 1.12e+08 |
| ent_coef_loss | -7.31e+03 |
| learning_rate | 0.001 |
| n_updates | 21043 |
| reward | 4.956748 |


| time/ | |
| episodes | 16 |
| fps | 20 |
| time_elapsed | 1388 |
| total_timesteps | 28192 |
| train/ | |
| actor_loss | 5.54e+13 |
| critic_loss | 2.92e+27 |
| ent_coef | 1.28e+11 |
| ent_coef_loss | -1.01e+04 |
| learning_rate | 0.001 |
| n_updates | 28091 |
| reward | 4.956748 |

day: 1761, episode: 20
begin_total_asset: 1000000.00
end_total_asset: 3142988.68
total_reward: 2142988.68
total_cost: 0.00
total_trades: 29925
Sharpe: 1.135


| time/ | |
| episodes | 20 |
| fps | 20 |
| time_elapsed | 1744 |
| total_timesteps | 35240 |
| train/ | |
| actor_loss | 6.23e+16 |
| critic_loss | 3.83e+33 |
| ent_coef | 1.47e+14 |
| ent_coef_loss | -1.28e+04 |
| learning_rate | 0.001 |
| n_updates | 35139 |
| reward | 4.956748 |


| time/ | |
| episodes | 24 |
| fps | 20 |
| time_elapsed | 2109 |
| total_timesteps | 42288 |
| train/ | |
| actor_loss | 7.17e+19 |
| critic_loss | inf |
| ent_coef | 1.69e+17 |
| ent_coef_loss | -1.57e+04 |
| learning_rate | 0.001 |
| n_updates | 42187 |
| reward | 4.956748 |


| time/ | |
| episodes | 28 |
| fps | 19 |
| time_elapsed | 2473 |
| total_timesteps | 49336 |
| train/ | |
| actor_loss | 8.21e+22 |
| critic_loss | inf |
| ent_coef | 1.93e+20 |
| ent_coef_loss | -1.84e+04 |
| learning_rate | 0.001 |
| n_updates | 49235 |
| reward | 4.956748 |

day: 1761, episode: 30
begin_total_asset: 1000000.00
end_total_asset: 3142988.68
total_reward: 2142988.68
total_cost: 0.00
total_trades: 29925
Sharpe: 1.135


ValueError Traceback (most recent call last)
in <cell line: 78>()
74 model_sac.set_logger(new_logger_sac)
75
---> 76 trained_sac = agent.train_model(model=model_sac,
77 tb_log_name='sac',
78 total_timesteps=timesteps) if if_using_sac else None

15 frames
/usr/local/lib/python3.10/dist-packages/torch/distributions/distribution.py in __init__(self, batch_shape, event_shape, validate_args)
66 valid = constraint.check(value)
67 if not valid.all():
---> 68 raise ValueError(
69 f"Expected parameter {param} "
70 f"({type(value).__name__} of shape {tuple(value.shape)}) "

ValueError: Expected parameter loc (Tensor of shape (1, 30)) of distribution Normal(loc: torch.Size([1, 30]), scale: torch.Size([1, 30])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan]])
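
The traceback reports NaN in the policy's `loc` output, and the log above shows `critic_loss` reaching `inf` two logging intervals earlier. The sketch below is a rough illustration of how that sequence can arise, not a confirmed diagnosis. It assumes the losses are stored as float32 (PyTorch's default dtype), and it extrapolates the next logged value with a simple geometric-mean growth factor; both the helper `to_float32` and the extrapolation are illustrative assumptions, not part of FinRL or Stable-Baselines3.

```python
import math
import struct

# critic_loss values copied from the log above, one per logging interval.
critic_loss = [3.53e6, 2.72e13, 2.76e20, 2.92e27, 3.83e33]

# Largest finite float32 value (assumption: the losses are float32 tensors).
FLOAT32_MAX = 3.4028234663852886e38


def to_float32(x: float) -> float:
    """Round x to float32 precision; return +/-inf if it does not fit."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


# Geometric-mean growth factor between consecutive log entries.
ratios = [b / a for a, b in zip(critic_loss, critic_loss[1:])]
growth = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Rough extrapolation of the next logged value.
projected = critic_loss[-1] * growth

overflowed = to_float32(projected)
print(f"growth per interval ~ {growth:.2e}")
print(f"projected next value ~ {projected:.2e}")
print("as float32:", overflowed)

# Once inf appears, ordinary arithmetic can turn it into NaN.
print("inf - inf =", math.inf - math.inf)
```

Under these assumptions the projected value exceeds the float32 range, which is consistent with the `inf` entries in the log. Any later operation that combines infinities (for example a difference of two infinite values) yields NaN, and a NaN in the network weights would produce the all-NaN `loc` tensor shown in the error.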

I am getting the above ValueError after the model has run for about 50,000 timesteps. As you can see, actor_loss and critic_loss reach very high values (critic_loss is logged as inf from 42,288 timesteps onward). Could this be causing the error? Also, total_trades is stuck at 29925 and does not change across episodes. Any idea why this could be happening?
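
One pattern in the log that may bear on this question is the growth of `ent_coef`. The sketch below only does arithmetic on the `n_updates` and `ent_coef` values copied from the log, and compares the per-update change in `ln(ent_coef)` with the configured `learning_rate`. The variable names are illustrative.

```python
import math

# (n_updates, ent_coef) pairs copied from the log above.
log_rows = [
    (6947, 84.8),
    (13995, 9.74e4),
    (21043, 1.12e8),
    (28091, 1.28e11),
    (35139, 1.47e14),
    (42187, 1.69e17),
    (49235, 1.93e20),
]

learning_rate = 0.001  # from the hyperparameters printed at the top of the log

# Average increase in ln(ent_coef) per gradient update between log entries.
steps = [
    (math.log(c1) - math.log(c0)) / (n1 - n0)
    for (n0, c0), (n1, c1) in zip(log_rows, log_rows[1:])
]

for (n, _), s in zip(log_rows[1:], steps):
    print(f"up to n_updates={n}: d ln(ent_coef) per update = {s:.6f}")
```

In every interval the increase in `ln(ent_coef)` per update is very close to `learning_rate`. That is the pattern one would expect if the entropy-coefficient optimizer were taking a full-size step in the same direction on every update, so the coefficient never settles. This is an interpretation of the numbers, not something the log confirms, but an entropy term scaled by a coefficient of this size could plausibly dominate both `actor_loss` and `critic_loss`.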
@BruceYanghy
Member

Thank you for bringing up the issue. Currently, the FinRL library is extremely poorly maintained. Rest assured, I will reorganize a team to ensure its proper maintenance.

Best regards,

Bruce Yang
