
experimental result #3

Open
wsa-dot opened this issue Jun 7, 2022 · 8 comments

@wsa-dot

wsa-dot commented Jun 7, 2022

The result I got was only 65. I don't know what was wrong.

@Linda230

Hello,
I also got the same result on the STS task, and I also don't know the reason.

@Linda230

The result I got was only 65. I don't know what was wrong.

Hi,
Did you find the reason for this result?

@wsa-dot
Author

wsa-dot commented Jun 10, 2022 via email

@Linda230

Maybe he is only good at theoretical analysis, and his hybrid method may not be really effective in practice. We need to come up with new ways to create genuinely useful hard negatives.


Yes, the theoretical analysis is very valuable, but then I'm curious how the results in the paper were obtained: I basically followed the README to reproduce them, yet my numbers have a large gap from the paper's results.
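
For context, here is my understanding of what the --lambdas flag (shown in the training command below) controls. As I read the paper, each mixed hard negative is a convex combination of a positive feature and a random negative feature, re-normalized and detached. A rough sketch of that idea (my reading of the method, not this repo's exact code):

import torch
import torch.nn.functional as F

def mix_hard_negative(h_pos, h_neg, lam=0.2):
    # Convex combination of a positive and a random negative feature
    # (lam corresponds to the --lambdas training flag).
    mixed = lam * h_pos + (1.0 - lam) * h_neg
    # Re-normalize onto the unit sphere, since features are compared by cosine.
    mixed = F.normalize(mixed, dim=-1)
    # Stop-gradient: the mixed vector is used only as a negative target.
    return mixed.detach()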

@afalf

afalf commented Jun 12, 2022

The result I got was only 65. I don't know what was wrong.

Sorry, I have just seen it. Could you please show your hyperparameters for training?

@Linda230

The result I got was only 65. I don't know what was wrong.

Sorry, I have just seen it. Could you please show your hyperparameters for training?

Hi, thank you for your reply. Here is my hyperparameter setting:

python train.py \
    --model_name_or_path bert-base-uncased \
    --train_file data/wiki1m_for_simcse.txt \
    --eval_path data/sts-dev.tsv \
    --output_dir $MODEL_PATH \
    --num_train_epochs 1 \
    --per_device_train_batch_size 64 \
    --learning_rate 3e-5 \
    --max_seq_length 32 \
    --evaluation_strategy steps \
    --metric_for_best_model stsb_spearman \
    --load_best_model_at_end \
    --eval_steps 125 \
    --pooler_type cls \
    --overwrite_output_dir \
    --temp 0.05 \
    --do_train \
    --do_eval \
    --seed 42 \
    --lambdas 0.6
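
For reference, the stsb_spearman metric used above for model selection is the Spearman rank correlation between the cosine similarities of the embedded sentence pairs and the gold STS scores. A small illustrative sketch of that computation (not this repo's exact code):

import numpy as np
from scipy.stats import spearmanr

def stsb_spearman(emb_a, emb_b, gold_scores):
    # Cosine similarity for each embedded sentence pair.
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cos = (a * b).sum(axis=1)
    # Spearman rank correlation against the gold similarity scores.
    return spearmanr(cos, gold_scores).correlation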

@Linda230

The result I got was only 65. I don't know what was wrong.

Sorry, I have just seen it. Could you please show your hyperparameters for training?

Hello, thanks for your reply. I set lambda = 0.2 as in the paper, and got an average STS = 77.20 using "cls" pooling and a higher result, STS = 77.90, using "cls_before_pooler". But I think I should follow your README and adopt "cls" pooling, right?

@zyznull
Collaborator

zyznull commented Jun 14, 2022

The result I got was only 65. I don't know what was wrong.

Sorry, I have just seen it. Could you please show your hyperparameters for training?

Hello, thanks for your reply. I set lambda = 0.2 as in the paper, and got an average STS = 77.20 using "cls" pooling and a higher result, STS = 77.90, using "cls_before_pooler". But I think I should follow your README and adopt "cls" pooling, right?

Yeah, I find "cls" pooling is more robust, and the script has now been updated. Thank you for the reminder.
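
For anyone unsure about the difference between the two options: "cls_before_pooler" takes the raw [CLS] hidden state, while "cls" additionally passes it through a small MLP head trained with the model. A minimal sketch in the style of the SimCSE pooler this repo builds on (illustrative names, not the exact classes):

import torch.nn as nn

class MLPLayer(nn.Module):
    # Linear + tanh head applied on top of [CLS] when pooler_type == "cls".
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, x):
        return self.activation(self.dense(x))

def pool(last_hidden_state, pooler_type, mlp):
    cls_vec = last_hidden_state[:, 0]  # raw [CLS] token embedding
    if pooler_type == "cls_before_pooler":
        return cls_vec                 # skip the trained MLP head
    elif pooler_type == "cls":
        return mlp(cls_vec)            # pass [CLS] through the MLP head
    raise ValueError(pooler_type)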
