How to use ComplEx pretrain MetaQA_half? #109

Closed
lihuiliullh opened this issue Nov 20, 2021 · 8 comments

@lihuiliullh

May I know how you pre-trained the embeddings on the MetaQA data?
Did you use the code in the "train_embeddings" directory to learn the embeddings? If so, could you share the command you used to run main.py?
If not, how did you generate bn0.npy, bn1.npy, bn2.npy, E.npy, and R.npy?

@lihuiliullh lihuiliullh changed the title How to generate the pretrain model for MetaQA? How to use ComplEx pretrain MetaQA_half? Nov 25, 2021
@lihuiliullh
Author

lihuiliullh commented Nov 25, 2021

I read issue #41.

Did you use the following command to train the model?

command = 'python3 main.py --dataset MetaQA_half --num_iterations 500 --batch_size 256 '
'--lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.2 '
'--hidden_dropout1 0.3 --hidden_dropout2 0.3 --label_smoothing 0.1 '
'--valid_steps 10 --model ComplEx '
'--loss_type BCE --do_batch_norm 1 --l3_reg 0.001 '
'--outfile /scratch/embeddings'

Did you use only train.txt in MetaQA_half to train ComplEx?

@lihuiliullh
Author

Here is the accuracy I got using main.py in train_embeddings on MetaQA_half.

CUDA_VISIBLE_DEVICES=3 python main.py --dataset MetaQA --num_iterations 500 --batch_size 256 \
                                       --lr 0.0005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.2 \
                                       --hidden_dropout1 0.3 --hidden_dropout2 0.3 --label_smoothing 0.1 \
                                       --valid_steps 10 --model ComplEx \
                                       --loss_type BCE --do_batch_norm 0 --l3_reg 0.001

Hits @10: 0.173125
Hits @3: 0.0995
Hits @1: 0.04325
Mean rank: 9449.934
Mean reciprocal rank: 0.08621291153867376
Best valid: [0.08832985424199054, 9482.2985, 0.178375, 0.09875, 0.046625]
Best Test: [0.08618774827374057, 9425.521875, 0.1725, 0.100125, 0.043375]

The accuracy is very low. Is there a problem?
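
For readers comparing numbers like the ones above, here is a minimal sketch of how Hits@k, mean rank, and mean reciprocal rank are conventionally computed from a list of ranks. It is not the code from main.py: the function name `ranking_metrics` and the example ranks are made up for illustration, and it assumes each rank is the 1-based position of the correct entity among all candidates.

```python
from statistics import mean


def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute standard link-prediction metrics.

    `ranks` holds, for each test triple, the 1-based rank of the correct
    entity (hypothetical input; the repo's evaluation loop may differ).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    # Hits@k: fraction of test triples whose correct entity is ranked <= k.
    metrics = {f"hits@{k}": mean(1.0 if r <= k else 0.0 for r in ranks) for k in ks}
    # Mean rank: arithmetic mean of the ranks (lower is better).
    metrics["mean_rank"] = mean(ranks)
    # Mean reciprocal rank: mean of 1/rank (higher is better).
    metrics["mrr"] = mean(1.0 / r for r in ranks)
    return metrics


if __name__ == "__main__":
    # Illustrative ranks only, not taken from any real run.
    print(ranking_metrics([1, 2, 4, 20]))
```

Read this way, the "Best valid" and "Best Test" lists above appear to be ordered as [MRR, mean rank, Hits@10, Hits@3, Hits@1], since their values line up with the individually printed metrics.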

@apoorvumang
Collaborator

Yes, this is very low. Let me try it and get back to you.

@apoorvumang
Collaborator

Did you try with batch norm?

@lihuiliullh
Author

According to the code, passing "--do_batch_norm 0" sets do_batch_norm = False, so I guess I didn't use batch norm.
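
A minimal sketch of how an integer flag like this typically ends up as a boolean. It assumes the flag is parsed as an int and then converted; the actual parsing in main.py may be written differently, so treat the parser below as an illustration only.

```python
import argparse

# Hypothetical parser: only the flag name comes from the thread,
# the type, default, and conversion are assumptions.
parser = argparse.ArgumentParser()
parser.add_argument("--do_batch_norm", type=int, default=1)

# Simulate passing "--do_batch_norm 0" on the command line.
args = parser.parse_args(["--do_batch_norm", "0"])

# 0 maps to False (batch norm off); any non-zero value maps to True.
do_batch_norm = bool(args.do_batch_norm)

if __name__ == "__main__":
    print(do_batch_norm)
```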

@apoorvumang
Collaborator

What I meant was: did you try running with "--do_batch_norm 1"?

@lihuiliullh
Author

Yes, I also ran the code with "--do_batch_norm 1". Hits@1 is about 0.07.

@apoorvumang
Collaborator

I ran this

CUDA_VISIBLE_DEVICES=2 python main.py --dataset MetaQA_half --num_iterations 1000 --batch_size 256 \
                                       --lr 0.005 --dr 1.0 --edim 200 --rdim 200 --input_dropout 0.2 \
                                       --hidden_dropout1 0.2 --hidden_dropout2 0.3 --label_smoothing 0.1 \
                                       --valid_steps 10 --model ComplEx \
                                       --loss_type BCE --do_batch_norm 1 --l3_reg 0.0

and got:

[image: results of this run, not reproduced in this text]

These embeddings should be OK-ish, I think, for downstream applications. You will have to uncomment the following line to save the trained files (please check the code to see where the files are written).

# self.write_embedding_files(model)
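
Once that line is uncommented and training has run, a quick check along these lines could confirm the files were written. This is only a sketch: the helper `missing_embedding_files` is hypothetical, and it assumes the saved files are the five named at the top of this issue (E.npy, R.npy, bn0.npy, bn1.npy, bn2.npy). The directory path is a placeholder, so substitute whatever location the code actually writes to.

```python
from pathlib import Path

# File names taken from the question at the top of this issue; whether
# write_embedding_files produces exactly this set is an assumption.
EXPECTED_FILES = ("E.npy", "R.npy", "bn0.npy", "bn1.npy", "bn2.npy")


def missing_embedding_files(directory):
    """Return the expected file names that are not present in `directory`."""
    directory = Path(directory)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    # Placeholder path; replace with the real output location.
    missing = missing_embedding_files("path/to/output_dir")
    print("missing files:", missing if missing else "none")
```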
