Failed to reproduce the results #6
Hi, could you give a bit more detail on which dataset you are training on? Are you training on the full RealEstate10K dataset from YouTube? For the actual training, can you also describe when you turn the LPIPS and depth loss coefficients on? In general, we found that training for a long time without those losses (maybe 3-5 days) and then tuning the LPIPS and depth losses can lead to improvements in performance.
Hi -- yes, apologies about that -- I think some of the additional training details in the appendix may not be fully accurate. You might get better performance by training without the L1 loss for around 100k iterations and then training with the regularization loss later. In terms of the numbers in the paper, they came from a checkpoint before CVPR. I ended up refactoring the entire codebase for the code release, and the released pretrained weights are based on a model I trained with the refactored codebase. I believe that if you train the model for longer, you would likely be able to improve over the numbers of the provided pretrained model. You may also have to tune the LPIPS loss coefficient a bit, as well as the patch size in which it is applied (I did a lot of ad-hoc hacks around CVPR time to try to improve the model's performance).
I would primarily worry about reproducing the PSNR / MSE results in the paper. You can improve the SSIM by decreasing the coefficient of the depth loss, and you can improve the LPIPS by increasing the coefficient on the LPIPS loss.
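The staged schedule described above (a long reconstruction-only phase, then enabling the LPIPS and depth regularizers with tunable coefficients) could be sketched as follows. The function name, phase boundary, and coefficient values are hypothetical placeholders, not the repository's actual configuration:

```python
# Sketch of the staged loss weighting discussed above. The coefficient
# names and default values here are illustrative assumptions, not the
# values used in the released code.

def loss_coefficients(iteration, warmup_iters=100_000):
    """Return per-loss weights: reconstruction only during warmup,
    then enable the LPIPS and depth regularizers for fine-tuning."""
    if iteration < warmup_iters:
        # Long initial phase: optimize PSNR/MSE with the pixel loss alone.
        return {"recon": 1.0, "lpips": 0.0, "depth": 0.0}
    # Fine-tuning phase: small regularizer weights, tuned per run
    # (e.g. raise "lpips" to improve LPIPS, lower "depth" to improve SSIM).
    return {"recon": 1.0, "lpips": 0.1, "depth": 0.05}
```

The total loss at each step would then be the weighted sum `sum(coef[k] * loss[k] for k in coef)`, so switching phases only changes the weights, not the training loop.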
No problem and thank you very much for the clarification and guidance! Ok, I will train the model for longer and see the results. |
Dear @yilundu, despite following your advice, I have yet to successfully replicate the results presented in your paper. I extended the training to 150,000 iterations, which led to a PSNR of 21.5, slightly surpassing the number reported in your paper. However, fine-tuning with the LPIPS and depth losses slightly improved LPIPS but deteriorated PSNR and SSIM. I put a table of each metric for the trained models below. Regarding the coefficients: I decreased the depth loss coefficient from 0.05 to 0.01 and increased the LPIPS coefficient from 0.1 to 0.2 or 0.5; both LPIPS coefficients produced similar scores. Could there be any specifics in the implementation that might be impacting the results? Any guidance or suggestions you could provide would be greatly appreciated.
[table of PSNR / SSIM / LPIPS for each trained model]
Hi, sorry about the difficulty in reproducing the results -- the model I used at CVPR had a combination of hacks (I kept tuning different hyperparameters between different parts of training to improve the visual quality). A couple of things might be helpful: since a lot of this isn't very reproducible, if you would like to compare with the paper, I think it would be fine to use your current numbers, as the PSNR roughly seems to match (which was the main metric I optimized for anyway, besides visual quality).
Hi, Thank you for the reply!
Do you have any suggestions regarding the specific patch window sizes to test? As I understand it, the current size is 32x32. Would it be advisable to experiment with sizes such as 48x48 or 64x64?
Hi, sorry about the late reply -- I've gotten a bit busy with the ICLR deadline. Yes, it could be interesting to try 48x48 and 64x64, or potentially 16x16. Best,
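One way to make the patch window easy to vary (32, 48, 64, or 16) is to sample a random crop and evaluate the perceptual loss only on that crop. The helper below is an illustrative sketch, not the repository's actual implementation; the LPIPS usage in the comment assumes the standard `lpips` package:

```python
import random

# Sketch: pick a random square window so the LPIPS patch size is a single
# tunable parameter. Hypothetical helper, not the repo's actual code.

def random_patch_slices(height, width, patch_size, rng=random):
    """Return (row_slice, col_slice) for a random patch_size x patch_size
    window that lies fully inside an image of the given size."""
    if patch_size > height or patch_size > width:
        raise ValueError("patch larger than image")
    top = rng.randrange(height - patch_size + 1)
    left = rng.randrange(width - patch_size + 1)
    return slice(top, top + patch_size), slice(left, left + patch_size)

# Assumed usage with [B, C, H, W] tensors and an lpips.LPIPS module:
#   ys, xs = random_patch_slices(H, W, patch_size=48)
#   loss = lpips_fn(pred[..., ys, xs], target[..., ys, xs])
```

Cropping before the loss keeps memory roughly constant while sweeping patch sizes, so the coefficient and the window can be tuned independently.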
Dear authors,
I recently tried to replicate the results presented in the paper by rerunning the repository code myself. However, the results I obtained did not match the numbers in the table presented in your documentation.
Furthermore, I evaluated the released pre-trained model and noticed a decline in performance, particularly in the LPIPS and SSIM metrics, compared to the reported values. I have noted the numerical results I obtained in the table below.
I would greatly appreciate any insights or suggestions you might have that could help me identify potential reasons for this discrepancy.
Regarding the settings and the code:
I used the same versions of PyTorch and Torchvision and ran the code on 4 V100 GPUs, which is the same GPU configuration noted in the paper.
I did not make any changes to the original code, except for the data-processing section, due to an error that halted progress. More specifically, I observed that certain images did not conform to the expected (360, 640) resolution and were not reshaped to the appropriate shape because they did not pass this line. To address this, I made the following modifications to ensure proper image reshaping and to adjust the intrinsic parameters accordingly:
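A fix of the kind described above needs two coupled steps: resize every image to (360, 640), and scale the camera intrinsics by the same factors so projection stays consistent. The sketch below is a minimal illustration of that coupling, assuming pinhole intrinsics given as (fx, fy, cx, cy); it is not the exact patch applied to the repository:

```python
# Illustrative sketch, not the actual repository patch: when an image is
# resized from (orig_h, orig_w) to (new_h, new_w), the focal lengths and
# principal point must be scaled by the same per-axis factors.

def rescale_intrinsics(fx, fy, cx, cy, orig_h, orig_w, new_h=360, new_w=640):
    """Scale pinhole intrinsics to match a resized image."""
    sx = new_w / orig_w  # horizontal scale affects fx and cx
    sy = new_h / orig_h  # vertical scale affects fy and cy
    return fx * sx, fy * sy, cx * sx, cy * sy

# The image itself would be resized alongside, e.g. with PIL:
#   img = img.resize((new_w, new_h), Image.BILINEAR)
```

Skipping the intrinsics update while resizing is a common source of silently degraded reconstruction metrics, since every ray direction becomes slightly wrong.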