Effect of eval_freq #155
Hi, yes that makes sense. The message is printed only during evaluation at the end of each epoch. An epoch runs for about 5000 steps. Each step takes 0.1 seconds, so each epoch by default should take about 5000 * 0.1 = 500 seconds, or a bit more than 8 minutes. Add some training time and that means each epoch will run for about 10 minutes. 30 minutes is quite a long time, and you should check whether your ROS simulation can run in real time. Other than that, it seems like everything is performing normally.
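The timing estimate above can be sketched as a quick back-of-the-envelope calculation. This is illustrative only; the step count and step duration are the assumed defaults mentioned in the comment, not values read from the repo's code:

```python
# Rough epoch-time estimate, assuming the defaults mentioned above:
# ~5000 environment steps per epoch and ~0.1 s of wall-clock time per step.
STEPS_PER_EPOCH = 5000    # assumed default (matches eval_freq = 5e3)
SECONDS_PER_STEP = 0.1    # assumed step duration when running in real time

epoch_seconds = STEPS_PER_EPOCH * SECONDS_PER_STEP
print(f"Expected epoch duration: {epoch_seconds:.0f} s (~{epoch_seconds / 60:.1f} min)")

# If an observed epoch takes 30 minutes instead, the simulation is running
# at roughly this fraction of real time:
observed_seconds = 30 * 60
real_time_factor = epoch_seconds / observed_seconds
print(f"Estimated real-time factor: {real_time_factor:.2f}")
```

A real-time factor well below 1.0 would point at the Gazebo simulation being the bottleneck rather than the training code.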
Thank you very much for your reply. I have two questions at present.
Thank you very much for your patient guidance. I have a few questions that I would like to ask you again. First, I read your paper and code and found that the paper's training terminates after 800 epochs, while the code uses max_timesteps = 5e6. Which condition is actually used? Second, Figures 1, 2, and 3 show my current TensorBoard visualizations. Why does the loss keep rising? Is this normal?
Hello, I have successfully tested your code 100 times, and the success rate of reaching the target point is about 87%. I have two questions to ask you.
Likely your distance to the goal is outside the trained range. Meaning, when you train the model, the maximum distance to the goal you will see is around 10 meters. Here the distance is something like 12.5, which is a value the model has never seen and was never trained on.
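One common workaround for out-of-distribution inputs like this is to clamp the goal distance to the range seen during training before feeding it to the policy. This is a minimal sketch, not code from the repo; the 10-meter limit is the approximate trained range mentioned above:

```python
import numpy as np

# Assumed maximum goal distance observed during training (see the comment above).
TRAINED_MAX_DISTANCE = 10.0

def clip_goal_distance(distance: float) -> float:
    """Clamp the goal distance so the policy never sees values
    outside the range it was trained on."""
    return float(np.clip(distance, 0.0, TRAINED_MAX_DISTANCE))

# A 12.5 m goal gets clamped back into the trained range:
print(clip_goal_distance(12.5))  # 10.0
```

Note that clamping only hides the symptom: the policy will treat any goal beyond 10 m as if it were exactly 10 m away, so retraining with a wider distance range is the more principled fix.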
Hi. The start and goal positions are random for each episode by design; this is the intended behavior. However, it should not change the input size in any way. Your issue stems from having a different state representation than the default one in this repo. That means your input vector is actually 36 values instead of the expected 24, so there must be some change in the code. Please provide your code changes so we can see what the issue is. Also note that for new issues, it is better to open a new issue and fill in the issue template. Without the information asked for there, it is really difficult to help and answer questions.
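A fail-fast check on the state dimension can catch this kind of mismatch before it reaches the network. This is a hedged sketch, assuming the repo's default layout of 20 binned laser readings plus 4 robot values (distance to goal, angle to goal, and the two previous actions); the function and constant names are illustrative:

```python
# Assumed default state layout in this repo:
# 20 binned laser readings + distance + angle + 2 previous actions = 24 values.
ENVIRONMENT_DIM = 20  # assumed number of laser bins
ROBOT_DIM = 4         # distance, angle, last linear action, last angular action

EXPECTED_STATE_DIM = ENVIRONMENT_DIM + ROBOT_DIM

def check_state(state) -> None:
    """Raise early if the state vector does not match the network's input size."""
    if len(state) != EXPECTED_STATE_DIM:
        raise ValueError(
            f"State has {len(state)} values, expected {EXPECTED_STATE_DIM}; "
            "check for changes to the laser binning or the appended robot state."
        )

check_state([0.0] * 24)  # passes silently with the default representation
```

Calling `check_state` on a 36-value vector, as in the reported issue, would raise immediately and point at the modified state representation instead of failing deep inside the network.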
Dear Reinis Cimurs,
I recently read your paper titled "Goal-Driven Autonomous Exploration Through Deep Reinforcement Learning". I think it is fantastic, and having watched your videos on YouTube, I can't wait to implement it.
I have a problem. After I run python3 test_velodyne_td3.py, the agent in Gazebo runs normally, but the terminal does not print the message "Average Reward over %i Evaluation Episodes, Epoch %i: %f, %f", as shown in Figure 1. When I change the parameter eval_freq = 5e3 to 500, the message is printed normally, as shown in Figure 2. Can you give me some suggestions? Thank you again.
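The behavior described above is consistent with how eval_freq typically gates the evaluation printout in TD3-style training loops: the message only appears once the timestep counter since the last evaluation reaches eval_freq, so a larger value simply means a longer wait before the first print. This is an illustrative sketch of that scheduling pattern, not the repo's exact code:

```python
# Minimal sketch of eval_freq-gated evaluation in a training loop.
# Names and the shortened horizon are illustrative, not from the repo.
eval_freq = 5e3      # evaluate (and print average reward) every 5000 timesteps
horizon = 20_000     # shortened stand-in for max_timesteps = 5e6

timestep = 0
timesteps_since_eval = 0
evaluations = 0

while timestep < horizon:
    timestep += 1
    timesteps_since_eval += 1
    if timesteps_since_eval >= eval_freq:
        timesteps_since_eval = 0
        evaluations += 1
        # The evaluate() call and the "Average Reward ..." print would go here.
        print(f"Evaluation at timestep {timestep}")
```

With eval_freq = 5e3 and 0.1 s per step, the first printout only appears after roughly 500 seconds of simulation, which is why lowering eval_freq to 500 makes the message show up ten times sooner.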
Figure 1
Figure 2