pos loss cannot be reduced #24
Hello author, I used your original voco_head file for pre-training, but found that the pos loss does not decrease normally. What could be the problem?

Hi, many thanks for your kind attention to our work! (See Line 203 in f70606b.)
We also provide our training log here for comparison: https://www.dropbox.com/scl/fi/rmqy9n2gio5tptbhlt239/20240115_232208.txt?rlkey=0jmnpz3n77bb1b9r9wt9aqkrv&dl=0

Weird, it seems the loss did not converge as in our provided log. I cannot find the problem right now.

Thanks for your reply. The batch_size is 4 and the sw_batch_size is 1, because I used voco_head_old, which does not include the student-teacher part. I set sw_batch_size to 1 because I saw an earlier modification that changed the random cropping to 1.

Oh, I see. You are using the old version. That modification was made by another researcher on their own, possibly due to GPU memory limitations. For the old version, I recommend trying sw_batch_size=4 if you have enough GPU resources.
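For readers following this thread, below is a minimal sketch of the configuration change being discussed. The flag names (--batch_size, --sw_batch_size), defaults, and argparse wiring are assumptions for illustration only, not the repository's actual training interface; check the project's launch script for the real argument names.

```python
# Hypothetical configuration sketch only; flag names and defaults are assumptions,
# not the VoCo repository's actual command-line interface.
import argparse


def build_args():
    parser = argparse.ArgumentParser(
        description="VoCo pre-training config sketch (old voco_head)"
    )
    parser.add_argument(
        "--batch_size", type=int, default=4,
        help="number of volumes per training step (4 in this thread)"
    )
    parser.add_argument(
        "--sw_batch_size", type=int, default=4,
        help="sub-volume crops per step; the author suggests 4 instead of 1 "
             "for voco_head_old when GPU memory allows"
    )
    return parser.parse_args()


if __name__ == "__main__":
    args = build_args()
    print(f"batch_size={args.batch_size}, sw_batch_size={args.sw_batch_size}")
```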