Effect of eval_freq #155
Hi, yes that makes sense. The message is printed only during evaluation at the end of each epoch. An epoch runs for about 5000 steps. Each step takes 0.1 seconds, so each epoch by default should take about 5000 * 0.1 = 500 seconds, or a bit more than 8 minutes. Add some training time and that means each epoch will run for about 10 minutes. 30 minutes is quite a long time, and you should check whether your ROS simulation can run in real time. Other than that, it seems like everything is performing normally.
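The timing estimate above can be sketched as a quick back-of-the-envelope calculation. This is illustrative only; the step count and step duration are the assumed defaults mentioned in the comment, not values read from the repo's code:

```python
# Rough epoch-time estimate, assuming the defaults mentioned above:
# ~5000 environment steps per epoch and ~0.1 s of wall-clock time per step.
STEPS_PER_EPOCH = 5000    # assumed default (matches eval_freq = 5e3)
SECONDS_PER_STEP = 0.1    # assumed step duration when running in real time

epoch_seconds = STEPS_PER_EPOCH * SECONDS_PER_STEP
print(f"Expected epoch duration: {epoch_seconds:.0f} s (~{epoch_seconds / 60:.1f} min)")

# If an observed epoch takes 30 minutes instead, the simulation is running
# at roughly this fraction of real time:
observed_seconds = 30 * 60
real_time_factor = epoch_seconds / observed_seconds
print(f"Estimated real-time factor: {real_time_factor:.2f}")
```

A real-time factor well below 1.0 would point at the Gazebo simulation being the bottleneck rather than the training code.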
Thank you very much for your reply. I have two questions at present.
Thank you very much for your patient guidance. I have a few questions that I would like to ask you again. First, I read your paper and code and found that the paper's training terminates after 800 epochs, while the code uses max_timesteps = 5e6. Which condition is actually used? Second, Figures 1, 2, and 3 show my current TensorBoard visualizations. Why does the loss keep rising? Is this normal?
Hello, I have successfully tested your code 100 times, and the success rate of reaching the target point is about 87%. I have two questions to ask you.
Likely your distance to the goal is outside the trained range. Meaning, when you train the model, the maximum distance to the goal you will see is around 10 meters. Here the distance is something like 12.5, which is a value the model has never seen and was never trained on.
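One common workaround for out-of-distribution inputs like this is to clamp the goal distance to the range seen during training before feeding it to the policy. This is a minimal sketch, not code from the repo; the 10-meter limit is the approximate trained range mentioned above:

```python
import numpy as np

# Assumed maximum goal distance observed during training (see the comment above).
TRAINED_MAX_DISTANCE = 10.0

def clip_goal_distance(distance: float) -> float:
    """Clamp the goal distance so the policy never sees values
    outside the range it was trained on."""
    return float(np.clip(distance, 0.0, TRAINED_MAX_DISTANCE))

# A 12.5 m goal gets clamped back into the trained range:
print(clip_goal_distance(12.5))  # 10.0
```

Note that clamping only hides the symptom: the policy will treat any goal beyond 10 m as if it were exactly 10 m away, so retraining with a wider distance range is the more principled fix.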
Hi. The start and goal positions are random for each episode by design; this is the intended behavior. However, it should not change the input size in any way. Your issue stems from having a different state representation than the default one in this repo. That means your input vector is actually 36 values instead of the expected 24, so there must be some change in the code. Please provide your code changes so we can see what the issue is. Also note that for new issues, it is better to open a new issue and fill in the issue template. Without the information asked for there, it is really difficult to help and answer questions.
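A fail-fast check on the state dimension can catch this kind of mismatch before it reaches the network. This is a hedged sketch, assuming the repo's default layout of 20 binned laser readings plus 4 robot values (distance to goal, angle to goal, and the two previous actions); the function and constant names are illustrative:

```python
# Assumed default state layout in this repo:
# 20 binned laser readings + distance + angle + 2 previous actions = 24 values.
ENVIRONMENT_DIM = 20  # assumed number of laser bins
ROBOT_DIM = 4         # distance, angle, last linear action, last angular action

EXPECTED_STATE_DIM = ENVIRONMENT_DIM + ROBOT_DIM

def check_state(state) -> None:
    """Raise early if the state vector does not match the network's input size."""
    if len(state) != EXPECTED_STATE_DIM:
        raise ValueError(
            f"State has {len(state)} values, expected {EXPECTED_STATE_DIM}; "
            "check for changes to the laser binning or the appended robot state."
        )

check_state([0.0] * 24)  # passes silently with the default representation
```

Calling `check_state` on a 36-value vector, as in the reported issue, would raise immediately and point at the modified state representation instead of failing deep inside the network.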
Dear Reinis Cimurs,
I recently read your paper titled "Goal-Driven Autonomous Exploration Through Deep Reinforcement Learning". I think it is fantastic, and having watched your videos on YouTube, I can't wait to implement it.
I have a problem. After I run python3 test_velodyne_td3.py, the agent in Gazebo runs normally, but the terminal does not print the message "Average Reward over %i Evaluation Episodes, Epoch %i: %f, %f", as shown in Figure 1. When I change the parameter eval_freq = 5e3 to 500, the message is printed normally, as shown in Figure 2. Can you give me some suggestions? Thank you again.
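The behavior described above is consistent with how eval_freq typically gates the evaluation printout in TD3-style training loops: the message only appears once the timestep counter since the last evaluation reaches eval_freq, so a larger value simply means a longer wait before the first print. This is an illustrative sketch of that scheduling pattern, not the repo's exact code:

```python
# Minimal sketch of eval_freq-gated evaluation in a training loop.
# Names and the shortened horizon are illustrative, not from the repo.
eval_freq = 5e3      # evaluate (and print average reward) every 5000 timesteps
horizon = 20_000     # shortened stand-in for max_timesteps = 5e6

timestep = 0
timesteps_since_eval = 0
evaluations = 0

while timestep < horizon:
    timestep += 1
    timesteps_since_eval += 1
    if timesteps_since_eval >= eval_freq:
        timesteps_since_eval = 0
        evaluations += 1
        # The evaluate() call and the "Average Reward ..." print would go here.
        print(f"Evaluation at timestep {timestep}")
```

With eval_freq = 5e3 and 0.1 s per step, the first printout only appears after roughly 500 seconds of simulation, which is why lowering eval_freq to 500 makes the message show up ten times sooner.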
Figure 1
Figure 2