Can this be used on non CONLL-2003 data format? #25

hetryn · 2020-10-30T17:57:13Z

As the title says, can TENER's preprocessing be applied to a dataset that does not follow the CoNLL-2003 format? My dataset does not use BIO-scheme tagging, so a sentence and its tags look like this:

sentence = ['Hi', 'I', 'study', 'in', 'China', 'and', 'work', 'in', 'ABC']
tag = ['O', 'O', 'O', 'O', 'Country', 'O', 'O', 'O', 'Company']

yhcc · 2020-11-22T14:23:57Z

Sorry for the late reply. You can reuse the TENER encoder, but the pre-processing and decoding may not be suitable for your input as it is. You can try converting your tags to the BIOES scheme first.
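The conversion suggested above could be sketched roughly as follows. This is a minimal illustration, not part of the TENER code base: the function name `flat_to_bioes` is hypothetical, and it assumes that adjacent tokens carrying the same non-`O` label belong to a single entity, which a flat tagging scheme cannot actually guarantee.

```python
from typing import List


def flat_to_bioes(tags: List[str], outside: str = "O") -> List[str]:
    """Convert one-label-per-token tags to BIOES.

    Assumption: a run of adjacent identical non-outside labels is one entity.
    """
    out: List[str] = []
    i, n = 0, len(tags)
    while i < n:
        label = tags[i]
        if label == outside:
            out.append(outside)
            i += 1
            continue
        # Find the last index j of the run of identical labels starting at i.
        j = i
        while j + 1 < n and tags[j + 1] == label:
            j += 1
        if i == j:
            # Single-token entity.
            out.append(f"S-{label}")
        else:
            # Multi-token entity: Begin, zero or more Inside, End.
            out.append(f"B-{label}")
            out.extend(f"I-{label}" for _ in range(j - i - 1))
            out.append(f"E-{label}")
        i = j + 1
    return out


tag = ['O', 'O', 'O', 'O', 'Country', 'O', 'O', 'O', 'Company']
bioes_tag = flat_to_bioes(tag)
print(bioes_tag)
```

For the example in the question, every entity is a single token, so `Country` and `Company` become `S-Country` and `S-Company`. If two distinct entities of the same type can appear next to each other in your data, the merging assumption breaks and the boundaries would need to come from the original annotation instead.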
