
performance comparison #6

Open
YanLiang1102 opened this issue Feb 8, 2021 · 6 comments


YanLiang1102 commented Feb 8, 2021

Test performance using the topic as an auxiliary multi-task objective. This setup only enhances the encoder; the topic is not used as an input to event extraction.

epoch 1
lose Mention: cfcec8e30722564d5bf43fcb5f739cd8
lose Mention: 04cdc0c45303f7024417e4d94f9a7c13
lose Mention: 66b0e5014943dde6c74c048086b7a0a3
Micro_F1: 67.7765
Micro_Precision: 66.6903
Micro_Recall: 68.8986
Macro_F1: 50.9553
Macro_Precision: 55.1909
Macro_Recall: 50.8128

epoch 2
lose Mention: 395e263d21484d8998d853b0b1b6ec5a
lose Mention: 063e9a5b7265bc3ae0ea731c5f06545b
lose Mention: 362cbeafe8c757195425faa40a88ff61
Micro_F1: 67.0628
Micro_Precision: 67.7221
Micro_Recall: 66.4163
Macro_F1: 58.7145
Macro_Precision: 63.4789
Macro_Recall: 58.7049

epoch 3
lose Mention: f0ae798aa5b9014e3c8e47aefb281330
lose Mention: d3f3a2e2f335d8c50f7612c29aa0cb2a
lose Mention: 1418f6cb3c97797a7542e7ec6ac04427
lose Mention: 913c2d91f695ccb637386f211cd8d94d
lose Mention: f697f918553fc11d73d709bf3029603d
Micro_F1: 68.2131
Micro_Precision: 65.8331
Micro_Recall: 70.7717
Macro_F1: 60.2372
Macro_Precision: 61.1455
Macro_Recall: 62.2520

epoch 10
Micro_F1: 64.9500
Micro_Precision: 63.1251
Micro_Recall: 66.8834
Macro_F1: 59.7630
Macro_Precision: 58.9983
Macro_Recall: 61.6783
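The multi-task setup described above can be sketched as a shared encoder feeding two classification heads, with the topic loss added to the event loss. This is a minimal NumPy illustration, not the repo's actual model; all dimensions, weights, gold labels, and the auxiliary weight `lam` are made up for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, label):
    return -np.log(softmax(logits)[label])

# Hypothetical sizes: 8-dim shared encoder output, 5 event types, 3 topics
rng = np.random.default_rng(0)
hidden = rng.normal(size=8)          # shared encoder representation
W_event = rng.normal(size=(8, 5))    # event-extraction head
W_topic = rng.normal(size=(8, 3))    # auxiliary topic-classification head

event_loss = cross_entropy(hidden @ W_event, 2)   # gold event type (made up)
topic_loss = cross_entropy(hidden @ W_topic, 1)   # gold topic (made up)
lam = 0.5                                          # auxiliary-task weight, a tunable choice
total_loss = event_loss + lam * topic_loss
```

Only the gradients flowing through the shared `hidden` let the topic task influence event extraction, which matches "only enhance the encoder".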


YanLiang1102 commented Feb 8, 2021

Test performance of the BERT-CRF baseline:

epoch 1
Micro_F1: 67.4621
Micro_Precision: 65.9792
Micro_Recall: 69.0131
Macro_F1: 54.7342
Macro_Precision: 58.0439
Macro_Recall: 55.7937

epoch 3
lose Mention: df0a542b1da02933d7ec99db1d242f88
lose Mention: 35cc76d3b1b331cf496dc13451da70db
lose Mention: aca274ee083d94192aa9d58832f1a258
lose Mention: 49d6dc789d08db61efd0fbc46a55570a
lose Mention: b30bf4393b0cc55add7ae3e7380aaac3
lose Mention: 24345d37f8e66bd7fcb3362bdc861942
Micro_F1: 68.2197
Micro_Precision: 65.9248
Micro_Recall: 70.6801
Macro_F1: 60.4415
Macro_Precision: 61.6801
Macro_Recall: 61.8213
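For reference, the CRF layer on top of BERT decodes the best tag sequence with Viterbi over emission and transition scores. A minimal sketch of that decode, with a toy 2-tag example whose numbers are invented for illustration:

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best-scoring tag path under a linear-chain CRF (emission + transition scores)."""
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # total[i, j] = score of ending at tag i then moving to tag j at step t
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    tags = [int(score.argmax())]
    for t in range(n - 1, 0, -1):      # follow backpointers to recover the path
        tags.append(int(back[t, tags[-1]]))
    return tags[::-1]

# Toy example: 2 tags, 3 tokens; transitions strongly favor staying on the same tag
emissions = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
transitions = np.array([[2.0, -2.0], [-2.0, 2.0]])
best = viterbi(emissions, transitions)
```

Here the sticky transitions override the middle token's emission preference, so the best path stays on tag 0 throughout.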

@YanLiang1102

The two results above show that using only topic_id for multi-task learning does not help performance. The next thing to try is making the event extraction conditioned on the topic while also keeping the multi-task objective.

@YanLiang1102

Conditioned on the topic by adding the topic embedding directly to the sentence embedding.
epoch 3
Micro_F1: 67.4919
Micro_Precision: 65.5613
Micro_Recall: 69.5397
Macro_F1: 60.3428
Macro_Precision: 60.7523
Macro_Recall: 62.3284
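The conditioning step here is a broadcast addition of one topic vector onto every token embedding. A minimal sketch, with all sizes and values made up:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 4, 8                          # hypothetical sequence length and hidden size
sent_emb = rng.normal(size=(seq_len, dim))   # per-token encoder embeddings
topic_emb = rng.normal(size=dim)             # learned embedding of the sentence's topic

conditioned = sent_emb + topic_emb           # broadcast-add the topic to every token
```

Every token gets the same shift, so this conditions the representation globally rather than per token.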

@YanLiang1102

The topic information might not help because most of the topics are tail topics:

token:military conflict, test_count:258, train_count:981
token:rail accident, test_count:18, train_count:36
token:limited overs final, test_count:2, train_count:6
token:concert tour, test_count:49, train_count:167
token:event, test_count:16, train_count:36
token:news event, test_count:21, train_count:61
token:flood, test_count:6, train_count:30
token:aircraft occurrence, test_count:24, train_count:76
token:military operation, test_count:1, train_count:1
token:nuclear weapons test, test_count:4, train_count:12
token:civil conflict, test_count:31, train_count:106
token:civilian attack, test_count:49, train_count:196
token:concert, test_count:6, train_count:13
token:historical event, test_count:12, train_count:59
token:terrorist attack, test_count:18, train_count:50
token:olympic event, test_count:4, train_count:10
token:operational plan, test_count:3, train_count:7
token:hurricane, test_count:97, train_count:314
token:recurring event, test_count:34, train_count:94
token:football match, test_count:8, train_count:67
token:music festival, test_count:41, train_count:101
token:wildfire, test_count:12, train_count:10
token:wrestling event, test_count:23, train_count:55
token:airliner accident, test_count:10, train_count:33
token:international football competition, test_count:9, train_count:43
token:cricket tournament, test_count:11, train_count:56
token:aircraft accident, test_count:7, train_count:27
token:athleticrace, test_count:8, train_count:12
token:cycling championship, test_count:1, train_count:1
token:rugby match, test_count:2, train_count:2
token:games, test_count:6, train_count:23
token:earthquake, test_count:17, train_count:31
token:legislative session, test_count:2, train_count:2
token:winter storm, test_count:7, train_count:19
token:canadian football game, test_count:2, train_count:0
token:horse race, test_count:2, train_count:16
token:badminton event, test_count:1, train_count:0
token:military attack, test_count:1, train_count:3
token:summit, test_count:1, train_count:3
token:international ice hockey competition, test_count:7, train_count:16
token:summit meeting, test_count:1, train_count:10
token:mma event, test_count:3, train_count:7
token:athletics competition, test_count:1, train_count:4
token:international handball competition, test_count:1, train_count:4
token:cycling championships, test_count:1, train_count:0
token:individual golf tournament, test_count:3, train_count:16
token:pro bowl, test_count:1, train_count:7
token:u.s. federal election campaign, test_count:1, train_count:2
token:commonwealth games event, test_count:1, train_count:0
token:swimming event, test_count:1, train_count:1
token:athletics race, test_count:1, train_count:6
token:university boat race, test_count:4, train_count:2
token:hurling championship, test_count:1, train_count:1
token:field hockey, test_count:1, train_count:4
token:australian rules football grand final, test_count:2, train_count:2
token:international baseball tournament, test_count:1, train_count:1
token:tennis event, test_count:1, train_count:4
token:rugby tournament, test_count:1, train_count:8
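The tail-heaviness claim can be quantified by counting topics below a training-count cutoff. The counts below are copied from the distribution above (a representative subset); the cutoff of 20 is an arbitrary choice for illustration.

```python
# Train counts copied from the topic distribution above (representative subset)
train_counts = {
    "military conflict": 981, "rail accident": 36, "limited overs final": 6,
    "concert tour": 167, "event": 36, "military operation": 1,
    "hurricane": 314, "wildfire": 10, "badminton event": 0,
    "canadian football game": 0, "cycling championship": 1, "rugby match": 2,
}
threshold = 20  # "tail" cutoff, an arbitrary choice for illustration
tail = sorted(t for t, c in train_counts.items() if c < threshold)
head = sorted(t for t, c in train_counts.items() if c >= threshold)
```

Even in this subset, more than half the topics fall under the cutoff, and a few ("badminton event", "canadian football game") have zero training examples despite appearing in test.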

@YanLiang1102

Add the topic2event type distribution as a prior (similar to adding a per-topic vocabulary to the extraction model; whether this will work is unclear).
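One simple way to inject such a prior is to add a weighted log-prior over event types to the classifier logits. This is only a sketch of that idea; the topics, probabilities, and the `alpha` weight are all made up, not estimated from the real data.

```python
import numpy as np

# Hypothetical topic -> P(event_type | topic) table, as if estimated from training counts
prior = {
    "hurricane": np.array([0.7, 0.2, 0.1]),       # made-up probabilities
    "concert tour": np.array([0.1, 0.1, 0.8]),
}

def apply_prior(logits, topic, alpha=1.0):
    """Shift classifier logits by a weighted log-prior for the sentence's topic."""
    return logits + alpha * np.log(prior[topic] + 1e-8)

uninformative = np.zeros(3)                       # model has no preference on its own
biased = apply_prior(uninformative, "hurricane")
```

With uninformative logits the prediction follows the prior, so `alpha` controls how much the topic prior can override the model's own scores.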

@YanLiang1102

The topic can help extraction under two assumptions: 1. Given a topic, certain event types occur more often than others. 2. Similar topic embeddings imply similar event types in the sentences. First of all, are these assumptions actually true?
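Assumption 1 is directly checkable from the annotations: tally event types per topic and see whether each topic's distribution is skewed. A sketch with invented (topic, event_type) pairs standing in for the real data:

```python
from collections import Counter, defaultdict

# Hypothetical (topic, event_type) pairs standing in for the real annotations
annotations = [
    ("hurricane", "Disaster"), ("hurricane", "Disaster"), ("hurricane", "Rescue"),
    ("concert tour", "Performance"), ("concert tour", "Performance"),
    ("military conflict", "Attack"), ("military conflict", "Attack"),
    ("military conflict", "Casualty"),
]

per_topic = defaultdict(Counter)
for topic, event_type in annotations:
    per_topic[topic][event_type] += 1

# Assumption 1 is plausible for a topic if its event-type counts are skewed
dominant = {t: c.most_common(1)[0][0] for t, c in per_topic.items()}
```

Running the same tally on the real training set would answer the first question empirically before investing in the modeling change.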
