get the position index of concepts and negated concepts #12
Comments
I did something like this a while back (older version of pyConTextNLP - v0.6.2.0, so your mileage may vary).
Note that when you run "markup.cleanText()", it applies the regex rule "REG_CLEAN2 = re.compile(r"""\s+""", re.UNICODE)" from ConTextMarkup.py (or r2 in pyConTextGraph.py in the previous version). That rule replaces any run of whitespace/newlines with a single space, which shortens the text and causes the above method to be off by a few characters. I got around this by removing the '+', so that each whitespace character is replaced by exactly one space: the indexes stay correct, though the cleaned text will have odd spacing. I'm in the habit of writing my rules with \s+ (i.e. Regex: 'foo\s+bar'), so the extra spaces are usually fine for matching, and I prefer to have correct indexes. There may be a better way to do this, and I haven't explored the newest pyConText version yet, so it may no longer be an issue. My solution for getting the exact spans of targets and modifiers in text is included in my pyConTextNLP pipeline tool, PyConTextPipeline, if you're interested.
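To illustrate the offset drift described above, here is a minimal sketch that uses only Python's standard "re" module and does not call pyConTextNLP itself. The names REG_COLLAPSE, REG_PRESERVE, find_spans and the sample sentence are made up for this example; only the \s+ pattern is taken from the comment above.

```python
import re

# Same pattern as the REG_CLEAN2 rule quoted above: collapses whitespace runs.
REG_COLLAPSE = re.compile(r"\s+", re.UNICODE)
# Hypothetical variant with the '+' removed: one space per whitespace character.
REG_PRESERVE = re.compile(r"\s", re.UNICODE)


def find_spans(pattern, text):
    """Return (start, end) character offsets for every match of pattern in text."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]


# Made-up note text with double spaces and blank lines.
raw = "No  evidence of\n\npulmonary   embolism."

collapsed = REG_COLLAPSE.sub(" ", raw)   # shorter than raw, offsets shift
preserved = REG_PRESERVE.sub(" ", raw)   # same length as raw, offsets kept

# Rule written with \s+ so it still matches when several spaces remain.
rule = r"pulmonary\s+embolism"

raw_spans = find_spans(rule, raw)
collapsed_spans = find_spans(rule, collapsed)
preserved_spans = find_spans(rule, preserved)

print("raw:      ", raw_spans)
print("collapsed:", collapsed_spans)
print("preserved:", preserved_spans)
```

Spans found in the collapsed text no longer line up with the original note, while spans found in the length-preserving version can be used to slice the original text directly. This only works if the rules tolerate repeated spaces, which is why writing them with \s+ matters.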
Not really an issue, more of a question. Is there any function to get the index position of a keyword and of its negation, both at the sentence level and at the note level?
Thanks!