
performance comparison #6

Open
YanLiang1102 opened this issue Feb 8, 2021 · 6 comments


YanLiang1102 commented Feb 8, 2021

Test performance using the topic as an auxiliary multi-task objective. This setup only enhances the encoder; the topic is not used as an input to event extraction.

epoch 1
lose Mention: cfcec8e30722564d5bf43fcb5f739cd8
lose Mention: 04cdc0c45303f7024417e4d94f9a7c13
lose Mention: 66b0e5014943dde6c74c048086b7a0a3
Micro_F1: 67.7765
Micro_Precision: 66.6903
Micro_Recall: 68.8986
Macro_F1: 50.9553
Macro_Precision: 55.1909
Macro_Recall: 50.8128

epoch 2
lose Mention: 395e263d21484d8998d853b0b1b6ec5a
lose Mention: 063e9a5b7265bc3ae0ea731c5f06545b
lose Mention: 362cbeafe8c757195425faa40a88ff61
Micro_F1: 67.0628
Micro_Precision: 67.7221
Micro_Recall: 66.4163
Macro_F1: 58.7145
Macro_Precision: 63.4789
Macro_Recall: 58.7049

epoch 3
lose Mention: f0ae798aa5b9014e3c8e47aefb281330
lose Mention: d3f3a2e2f335d8c50f7612c29aa0cb2a
lose Mention: 1418f6cb3c97797a7542e7ec6ac04427
lose Mention: 913c2d91f695ccb637386f211cd8d94d
lose Mention: f697f918553fc11d73d709bf3029603d
Micro_F1: 68.2131
Micro_Precision: 65.8331
Micro_Recall: 70.7717
Macro_F1: 60.2372
Macro_Precision: 61.1455
Macro_Recall: 62.2520

epoch 10
Micro_F1: 64.9500
Micro_Precision: 63.1251
Micro_Recall: 66.8834
Macro_F1: 59.7630
Macro_Precision: 58.9983
Macro_Recall: 61.6783
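The multi-task setup described above can be sketched as a shared encoder feeding two classification heads, with the topic loss added to the event loss. This is a minimal NumPy illustration, not the repo's actual model; all dimensions, weights, gold labels, and the auxiliary weight `lam` are made up for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, label):
    return -np.log(softmax(logits)[label])

# Hypothetical sizes: 8-dim shared encoder output, 5 event types, 3 topics
rng = np.random.default_rng(0)
hidden = rng.normal(size=8)          # shared encoder representation
W_event = rng.normal(size=(8, 5))    # event-extraction head
W_topic = rng.normal(size=(8, 3))    # auxiliary topic-classification head

event_loss = cross_entropy(hidden @ W_event, 2)   # gold event type (made up)
topic_loss = cross_entropy(hidden @ W_topic, 1)   # gold topic (made up)
lam = 0.5                                          # auxiliary-task weight, a tunable choice
total_loss = event_loss + lam * topic_loss
```

Only the gradients flowing through the shared `hidden` let the topic task influence event extraction, which matches "only enhance the encoder".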


YanLiang1102 commented Feb 8, 2021

Test performance of the BERT-CRF baseline:

epoch 1
Micro_F1: 67.4621
Micro_Precision: 65.9792
Micro_Recall: 69.0131
Macro_F1: 54.7342
Macro_Precision: 58.0439
Macro_Recall: 55.7937

epoch 3
lose Mention: df0a542b1da02933d7ec99db1d242f88
lose Mention: 35cc76d3b1b331cf496dc13451da70db
lose Mention: aca274ee083d94192aa9d58832f1a258
lose Mention: 49d6dc789d08db61efd0fbc46a55570a
lose Mention: b30bf4393b0cc55add7ae3e7380aaac3
lose Mention: 24345d37f8e66bd7fcb3362bdc861942
Micro_F1: 68.2197
Micro_Precision: 65.9248
Micro_Recall: 70.6801
Macro_F1: 60.4415
Macro_Precision: 61.6801
Macro_Recall: 61.8213
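For reference, the CRF layer on top of BERT decodes the best tag sequence with Viterbi over emission and transition scores. A minimal sketch of that decode, with a toy 2-tag example whose numbers are invented for illustration:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best-scoring tag path under a linear-chain CRF (emission + transition scores)."""
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # total[i, j] = score of ending at tag i then moving to tag j at step t
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    tags = [int(score.argmax())]
    for t in range(n - 1, 0, -1):      # follow backpointers to recover the path
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]

# Toy example: 2 tags, 3 tokens; transitions strongly favor staying on the same tag
emissions = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
transitions = np.array([[2.0, -2.0], [-2.0, 2.0]])
best = viterbi(emissions, transitions)
```

Here the sticky transitions override the middle token's emission preference, so the best path stays on tag 0 throughout.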

@YanLiang1102

The two results above show that using only topic_id for multi-task learning does not help performance. The next thing to try is making the event extraction conditioned on the topic while also keeping the multi-task objective.

@YanLiang1102

Conditioned on the topic by adding the topic embedding directly to the sentence embedding.
epoch 3
Micro_F1: 67.4919
Micro_Precision: 65.5613
Micro_Recall: 69.5397
Macro_F1: 60.3428
Macro_Precision: 60.7523
Macro_Recall: 62.3284
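The conditioning step here is a broadcast addition of one topic vector onto every token embedding. A minimal sketch, with all sizes and values made up:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 4, 8                          # hypothetical sequence length and hidden size
sent_emb = rng.normal(size=(seq_len, dim))   # per-token encoder embeddings
topic_emb = rng.normal(size=dim)             # learned embedding of the sentence's topic

conditioned = sent_emb + topic_emb           # broadcast-add the topic to every token
```

Every token gets the same shift, so this conditions the representation globally rather than per token.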

@YanLiang1102

The topic information might not help because most of the topics are tail topics:

token:military conflict, test_count:258, train_count:981
token:rail accident, test_count:18, train_count:36
token:limited overs final, test_count:2, train_count:6
token:concert tour, test_count:49, train_count:167
token:event, test_count:16, train_count:36
token:news event, test_count:21, train_count:61
token:flood, test_count:6, train_count:30
token:aircraft occurrence, test_count:24, train_count:76
token:military operation, test_count:1, train_count:1
token:nuclear weapons test, test_count:4, train_count:12
token:civil conflict, test_count:31, train_count:106
token:civilian attack, test_count:49, train_count:196
token:concert, test_count:6, train_count:13
token:historical event, test_count:12, train_count:59
token:terrorist attack, test_count:18, train_count:50
token:olympic event, test_count:4, train_count:10
token:operational plan, test_count:3, train_count:7
token:hurricane, test_count:97, train_count:314
token:recurring event, test_count:34, train_count:94
token:football match, test_count:8, train_count:67
token:music festival, test_count:41, train_count:101
token:wildfire, test_count:12, train_count:10
token:wrestling event, test_count:23, train_count:55
token:airliner accident, test_count:10, train_count:33
token:international football competition, test_count:9, train_count:43
token:cricket tournament, test_count:11, train_count:56
token:aircraft accident, test_count:7, train_count:27
token:athleticrace, test_count:8, train_count:12
token:cycling championship, test_count:1, train_count:1
token:rugby match, test_count:2, train_count:2
token:games, test_count:6, train_count:23
token:earthquake, test_count:17, train_count:31
token:legislative session, test_count:2, train_count:2
token:winter storm, test_count:7, train_count:19
token:canadian football game, test_count:2, train_count:0
token:horse race, test_count:2, train_count:16
token:badminton event, test_count:1, train_count:0
token:military attack, test_count:1, train_count:3
token:summit, test_count:1, train_count:3
token:international ice hockey competition, test_count:7, train_count:16
token:summit meeting, test_count:1, train_count:10
token:mma event, test_count:3, train_count:7
token:athletics competition, test_count:1, train_count:4
token:international handball competition, test_count:1, train_count:4
token:cycling championships, test_count:1, train_count:0
token:individual golf tournament, test_count:3, train_count:16
token:pro bowl, test_count:1, train_count:7
token:u.s. federal election campaign, test_count:1, train_count:2
token:commonwealth games event, test_count:1, train_count:0
token:swimming event, test_count:1, train_count:1
token:athletics race, test_count:1, train_count:6
token:university boat race, test_count:4, train_count:2
token:hurling championship, test_count:1, train_count:1
token:field hockey, test_count:1, train_count:4
token:australian rules football grand final, test_count:2, train_count:2
token:international baseball tournament, test_count:1, train_count:1
token:tennis event, test_count:1, train_count:4
token:rugby tournament, test_count:1, train_count:8
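The tail-heaviness claim can be quantified by counting topics below a training-count cutoff. The counts below are copied from the distribution above (a representative subset); the cutoff of 20 is an arbitrary choice for illustration.

```python
# Train counts copied from the topic distribution above (representative subset)
train_counts = {
    "military conflict": 981, "rail accident": 36, "limited overs final": 6,
    "concert tour": 167, "event": 36, "military operation": 1,
    "hurricane": 314, "wildfire": 10, "badminton event": 0,
    "canadian football game": 0, "cycling championship": 1, "rugby match": 2,
}
threshold = 20  # "tail" cutoff, an arbitrary choice for illustration
tail = sorted(t for t, c in train_counts.items() if c < threshold)
head = sorted(t for t, c in train_counts.items() if c >= threshold)
```

Even in this subset, more than half the topics fall under the cutoff, and a few ("badminton event", "canadian football game") have zero training examples despite appearing in test.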

@YanLiang1102

Add the topic2event type distribution as a prior (similar to adding a per-topic vocabulary to the extraction model; whether this will work is unclear).
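One simple way to inject such a prior is to add a weighted log-prior over event types to the classifier logits. This is only a sketch of that idea; the topics, probabilities, and the `alpha` weight are all made up, not estimated from the real data.

```python
import numpy as np

# Hypothetical topic -> P(event_type | topic) table, as if estimated from training counts
prior = {
    "hurricane": np.array([0.7, 0.2, 0.1]),       # made-up probabilities
    "concert tour": np.array([0.1, 0.1, 0.8]),
}

def apply_prior(logits, topic, alpha=1.0):
    """Shift classifier logits by a weighted log-prior for the sentence's topic."""
    return logits + alpha * np.log(prior[topic] + 1e-8)

uninformative = np.zeros(3)                       # model has no preference on its own
biased = apply_prior(uninformative, "hurricane")
```

With uninformative logits the prediction follows the prior, so `alpha` controls how much the topic prior can override the model's own scores.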

@YanLiang1102

The topic can help extraction under two assumptions: 1. Given a topic, certain event types occur more often than others. 2. Similar topic embeddings imply similar event types in the sentences. First of all, are these assumptions actually true?
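Assumption 1 is directly checkable from the annotations: tally event types per topic and see whether each topic's distribution is skewed. A sketch with invented (topic, event_type) pairs standing in for the real data:

```python
from collections import Counter, defaultdict

# Hypothetical (topic, event_type) pairs standing in for the real annotations
annotations = [
    ("hurricane", "Disaster"), ("hurricane", "Disaster"), ("hurricane", "Rescue"),
    ("concert tour", "Performance"), ("concert tour", "Performance"),
    ("military conflict", "Attack"), ("military conflict", "Attack"),
    ("military conflict", "Casualty"),
]

per_topic = defaultdict(Counter)
for topic, event_type in annotations:
    per_topic[topic][event_type] += 1

# Assumption 1 is plausible for a topic if its event-type counts are skewed
dominant = {t: c.most_common(1)[0][0] for t, c in per_topic.items()}
```

Running the same tally on the real training set would answer the first question empirically before investing in the modeling change.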
