This repository has been archived by the owner on Dec 16, 2022. It is now read-only.
When using the biaffine dependency parser, the em-dash often (but not always) comes back with pos == ''. This is a problem because I am using the parser to generate labels to turn my text into CoNLL-2003 NER format, and I end up with blank spots in my columns, which breaks the ner_crf_tagger's ability to read the files in properly.
I would prefer the em-dash's POS tag to come back as ":" or "." or even "UNKNOWN", so that I don't get a blank column. For now I've added a pos == '' check before saving to file, so there is a workaround and it's not urgent for me. But I suspect it's causing difficulties for the dependency parser too, given the wide range of values the em-dash is getting assigned.
System info:
Ubuntu 14.04 LTS
allennlp 0.6.1
Python 3.6.5
Example code:
from allennlp.predictors.predictor import Predictor
predictor = Predictor.from_path(
    "https://s3-us-west-2.amazonaws.com/allennlp/models/biaffine-dependency-parser-ptb-2018.08.23.tar.gz")
chunk_text = "16 People Print to Share Documents — or Do They?"
ps = predictor.predict(chunk_text)
em_pos = ps["pos"][6]
==> em_pos == ''
Hi! We actually use spaCy to predict the POS tags here. I think you might either 1) not have a spaCy model installed or 2) have an old model installed. Can you look at this issue and see if it helps you?
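For anyone hitting the same blank-column problem, the workaround mentioned in the report (testing for pos == '' before saving) could look roughly like the sketch below. This is not part of AllenNLP: `fill_empty_pos`, `to_conll_lines` and the "UNKNOWN" placeholder are hypothetical names chosen for illustration, and the three-column layout (word, POS, NER label) is a simplified stand-in for the real CoNLL-2003 columns.

```python
from typing import List

PLACEHOLDER = "UNKNOWN"  # hypothetical stand-in tag; ":" or "." would also work


def fill_empty_pos(pos_tags: List[str], placeholder: str = PLACEHOLDER) -> List[str]:
    """Return a copy of pos_tags with empty or whitespace-only tags replaced."""
    return [tag if tag.strip() else placeholder for tag in pos_tags]


def to_conll_lines(words: List[str], pos_tags: List[str], ner_tags: List[str]) -> List[str]:
    """Build one space-separated 'word POS NER' line per token (simplified layout)."""
    if not (len(words) == len(pos_tags) == len(ner_tags)):
        raise ValueError("words, pos_tags and ner_tags must have the same length")
    safe_pos = fill_empty_pos(pos_tags)
    return [f"{w} {p} {n}" for w, p, n in zip(words, safe_pos, ner_tags)]


# Hand-written input that mimics the reported behaviour (not real parser output).
words = ["Documents", "\u2014", "or"]  # "\u2014" is the em-dash
pos = ["NNS", "", "CC"]                # empty tag, as described in the report
ner = ["O", "O", "O"]
lines = to_conll_lines(words, pos, ner)
```

With real predictions, the `pos` list would presumably come from `ps["pos"]` in the example code. Filling the tag only keeps the columns aligned for the reader; it does not change what the parser itself saw.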