I tried to build the BERT-LSTM model from your paper on the Movie Reviews data, but I couldn't reproduce the paper's results.
My results are as follows:
Training: train 0.925, validation 0.833, test 0.849
Prediction:

| Model | Performance | AUPRC | Comprehensiveness | Sufficiency |
|---|---|---|---|---|
| BERT-LSTM + Attention | 0.829 | 0.463 | 0.223 | 0.141 |
| BERT-LSTM + Simple Gradient | 0.829 | 0.469 | 0.222 | 0.141 |
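For anyone cross-checking these numbers: AUPRC here is presumably the area under the precision-recall curve for the soft rationale scores against the human token annotations. A minimal, stdlib-only average-precision computation (the token labels and scores below are made up for illustration, not the actual model outputs):

```python
def average_precision(y_true, y_score):
    """Average precision (area under the precision-recall curve):
    the mean of the precision values at each true-positive rank."""
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    hits, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i]:
            hits += 1
            ap += hits / rank
    return ap / sum(y_true)

# Toy example: 1 = token is in the human rationale, score = model's soft weight.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(round(average_precision(labels, scores), 3))  # 0.833
```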
The performance reported in Table 4 of the paper is 0.974, while my result is 0.829, which is very different.
The only parameter I changed from those listed in the README is the predict batch size, from 4 to 2, due to lack of memory.
My environment is as follows:
Memory: 65 GB; GPU: NVIDIA Tesla (32 GB)
Could you tell me if there are any parameter differences or other differences from the paper's experiments?
Hi, I tried to build bert_encoder_generator using the Movie Reviews data, but I ran into some issues.
The training process looks normal, but the results on the validation data are always the same:
fscore_NEG: 0.000
fscore_POS: 0.667.
I tried different BERT learning rates: 5e-1, 5e-2, 5e-3, 5e-4, and 5e-5. However, the results on the validation dataset are the same every time.
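For what it's worth, fscore_NEG: 0.000 together with fscore_POS: 0.667 is exactly the signature of a degenerate model that predicts POS for every example on a class-balanced validation set (the 100/100 split below is an assumption about the data, not a known fact about Movie Reviews):

```python
# Hypothetical balanced validation set: half POS, half NEG labels.
y_true = ["POS"] * 100 + ["NEG"] * 100
y_pred = ["POS"] * 200  # degenerate model: always predicts POS

def f1(label):
    """Per-class F1 from scratch: harmonic mean of precision and recall."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    n_pred = sum(p == label for p in y_pred)
    n_true = sum(t == label for t in y_true)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(f"fscore_POS: {f1('POS'):.3f}")  # fscore_POS: 0.667
print(f"fscore_NEG: {f1('NEG'):.3f}")  # fscore_NEG: 0.000
```

So the constant scores across learning rates suggest the generator collapses to a single class rather than the learning rate having no effect.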
Could you show me how to set the parameters?
Hi, the BERT encoder generator model is extremely unstable, so it is not surprising that you are getting bad results. Could you try the word_emb_encoder_generator model? Also try setting reinforce_loss_weight to 0 here.
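To illustrate the second suggestion: if the training parameters live in a JSON-style config (the file layout and key path below are hypothetical, not the repo's actual schema), the change amounts to zeroing one weight:

```python
import json

# Hypothetical config -- every key here except reinforce_loss_weight is invented.
config_text = '{"model": {"type": "word_emb_encoder_generator", "reinforce_loss_weight": 1.0}}'
config = json.loads(config_text)

# Zero out the REINFORCE term, as the maintainer suggests.
config["model"]["reinforce_loss_weight"] = 0.0
print(json.dumps(config, indent=2))
```

Presumably this removes the high-variance sampled-rationale term from the loss, which is the usual suspect for the instability mentioned above.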