Pretraining the network #10

Open
isheetajha opened this issue Dec 17, 2018 · 15 comments

@isheetajha

Hi @Yuliang-Zou, thanks for sharing the code for the reimplementation. I want to replicate the results reported in the paper. Is it possible for you to share the code changes required for the 2-frame setup?
Can you also please share the results after pretraining the depth and flow networks with the 2-frame setup?

@roboticsbala

@Yuliang-Zou The kitti_5frame dataset contains train.txt and val.txt files, but I don't see val.txt being used anywhere in the code. Is that file actually used?

@Yuliang-Zou
Member

@isheetajha Due to a system update, I cannot access the old code base at the moment. I will share the results soon.

@Yuliang-Zou
Member

@roboticsbala You can use val.txt to evaluate depth estimation performance for model selection: simply replace test.txt with val.txt.
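For example, once a prediction .npy file has been generated for each checkpoint, a rough model-selection loop could look like the following sketch (the checkpoint steps and file names are only placeholders):

import subprocess

# Evaluate a series of checkpoints on the val split (with val.txt swapped in for
# test.txt as described above), then keep the checkpoint with the lowest abs_rel/rms.
for step in range(10000, 100001, 10000):
    pred_file = './prediction/model-%d.npy' % step
    subprocess.call(['python', 'kitti_eval/eval_depth.py',
                     '--pred_file=%s' % pred_file, '--split=test'])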

@isheetajha
Author

isheetajha commented Dec 21, 2018

@Yuliang-Zou Thanks a lot.
Will the losses for depth prediction remain the same for both pretraining and joint training?

@Yuliang-Zou
Member

@isheetajha I actually used a simple L2 photometric loss for the pre-training, since I found it made the network easier to train.
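Roughly, something like the sketch below (illustrative TensorFlow code with made-up names, not the exact code in this repo); target is the current frame, warped is the neighboring frame warped into the target view with the predicted depth and pose, and mask marks the in-bounds pixels:

import tensorflow as tf

def l2_photometric_loss(target, warped, mask):
    # target, warped: [B, H, W, 3] images; mask: [B, H, W, 1] validity mask.
    # Per-pixel squared error, kept only where the warped pixel is valid.
    diff = tf.square(target - warped) * mask
    # Average over valid pixel-channels (the small epsilon avoids division by zero).
    return tf.reduce_sum(diff) / (3.0 * tf.reduce_sum(mask) + 1e-8)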

@isheetajha
Author

@Yuliang-Zou I tried your suggestion of replacing the ternary loss with an L2 photometric loss, but pretraining is still not working properly. Even after many iterations the error does not change much. I also increased the learning rate to 0.0002. When testing the depth network I get a very high error. Any suggestions would be helpful.
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.4607, 5.0703, 12.4489, 0.5938, 0.0000, 0.2855, 0.5361, 0.7518

Earlier I was getting the following error:
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.4429, 4.7578, 12.0834, 0.5876, 0.0000, 0.3033, 0.560

@Yuliang-Zou
Member

@isheetajha Did you use the val set to pick the best model? And did you monitor the training progress with the TensorBoard visualizations? The training loss will not decrease noticeably in TensorBoard, so you need to monitor the visualizations instead.

@isheetajha
Author

@Yuliang-Zou I did not use the val set to pick the best model; I just tested all the checkpoints using split=test. I also monitored the training progress in TensorBoard. Here are some snapshots:

[TensorBoard screenshots: pixel loss, total loss, smoothness loss]

@Yuliang-Zou
Member

@isheetajha The pixel loss curve looks good, but the smoothness loss curve clearly indicates that training has failed: it drops to zero quickly, and if you look at the predicted depth visualization you should find that it is all white.
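To see why: a typical first-order smoothness term only penalizes the spatial gradients of the predicted disparity, so a constant all-white prediction drives it to exactly zero. A rough sketch (illustrative only, not the exact implementation here):

import tensorflow as tf

def first_order_smoothness(disp):
    # disp: [B, H, W, 1] predicted disparity (inverse depth).
    dx = disp[:, :, 1:, :] - disp[:, :, :-1, :]  # horizontal gradients
    dy = disp[:, 1:, :, :] - disp[:, :-1, :, :]  # vertical gradients
    # A flat prediction has zero gradients everywhere, so this term collapses to 0
    # even though the predicted depth is useless.
    return tf.reduce_mean(tf.abs(dx)) + tf.reduce_mean(tf.abs(dy))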

You should monitor the visualization to decide whether to stop early or keep training; sometimes training fails because of the randomness of CUDA operations or inappropriate hyperparameters. Let me see if I can find my hyperparameter settings (I am out of town, so it might take some time).

@isheetajha
Author

@Yuliang-Zou Thanks for your reply. It would be great if you could share the hyperparameters.

The smoothness loss very quickly drops to 0, but the pixel loss shows erratic behavior, which worries me more. Is this expected?

@isheetajha
Author

@Yuliang-Zou Is it possible for you to share the hyperparameters for pretraining? I have been experimenting with different smoothness weights (0.5, 1, 1.5). It trains for a while, but then the smoothness loss plummets to zero.

@Yuliang-Zou
Member

Yes, definitely. I just got back to campus and will take a look over the weekend.

@shujonnaha

@Yuliang-Zou @isheetajha I am having the same issue: during depth pretraining, the smoothness loss falls to zero and stays there. Any idea how to fix that?

@ReekiLee

@Yuliang-Zou @isheetajha @shujonnaha
Hi, sorry to bother you. I have the same problem: after training, the predicted depth map is all white, and I get a test error like:

root@mygpu:~/DF-Net# python kitti_eval/eval_depth.py --pred_file=./prediction/model-70000.npy --split='test'
abs_rel, sq_rel, rms, log_rms, d1_all, a1, a2, a3
0.4499, 4.8663, 12.4314, 0.5961, 0.0000, 0.2985, 0.5540, 0.7693

Even when I change the evaluated checkpoint from 10000 to 99999, I get the same test error.

If you have any idea how to solve this problem, please let me know. I urgently need to reimplement this model. Thanks!

@ReekiLee

I've found where the problem was. When training, I had deleted some lines from ./dataset/train.txt, which gave the results above. After re-training the model with the unmodified train.txt, it succeeded.
