Migrating from 0.1 to 1.0

Emil Stenström edited this page Aug 7, 2018 · 4 revisions

I don't like breaking backwards compatibility, but to be able to add new features I felt I had to. This means that updating from 0.1 to 1.0 might require code changes.

Slimmer public API

Previously, __init__.py included a lot of public methods. Some of these have NOT moved:

from conllu import parse
from conllu import parse_tree

These two work just like they did before, except that each sentence in the result is now a TokenList (for parse) or a TokenTree (for parse_tree) instead of a raw list or TreeNode. See the next heading for how to handle this.

-from conllu.parser import parse
-from conllu.parser import parse_tree
+from conllu import parse
+from conllu import parse_tree

Importing parse and parse_tree from conllu.parser is no longer supported. Remove ".parser" and the imports will work again.

-from conllu.parser import parse_with_comments

parse_with_comments has been removed. When you use parse, comments are automatically included. You can access them through the new metadata property on the returned TokenList.

-from conllu.parser import serialize_tree
-from conllu.tree_helpers import print_tree 

These two functions have been moved to the TokenTree objects returned by parse_tree. serialize_tree is now tree.serialize(), and print_tree is now tree.print_tree().

Returning TokenLists and TokenTrees instead of lists

The return values from both parse and parse_tree have changed.

sentences = parse(raw_conllu_str)
sentence = sentences[0]
for token in sentence:
    print(token)

This code will keep working since TokenList defines __getitem__, which makes it work like a list for indexing and iteration. If you relied on some other part of the return value behaving like a list, you might have to change that.
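To illustrate which list behaviours survive, here is a minimal sketch using a hypothetical stand-in class, GetItemOnly, that defines nothing but __getitem__. It is not the real TokenList, which may support more than this. It only shows what __getitem__ alone gives you and how copying into a real list restores the rest.

```python
class GetItemOnly:
    """Hypothetical stand-in that supports indexing and nothing else."""

    def __init__(self, items):
        self._items = items

    def __getitem__(self, index):
        return self._items[index]


sentence = GetItemOnly(["The", "fox"])

# Indexing and for-loops work through __getitem__ alone.
forms = [token for token in sentence]

# Type checks against list no longer pass.
is_real_list = isinstance(sentence, list)

# If you need real list behaviour (concatenation, list methods), copy first.
as_list = list(sentence)
combined = as_list + ["."]

print(forms, is_real_list, combined)
```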

sentences = parse_tree(raw_conllu_str)
root = sentences[0]
-print(root.data, root.children)
+print(root.token, root.children)

When switching from TreeNode to TokenTree, I also renamed data to token. You have to change every place where you access .data to access .token instead.
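If some of your code has to run against both versions for a while, one option is a small accessor that hides the rename. This is only a sketch: node_payload is a hypothetical helper, not part of conllu, and the SimpleNamespace objects are stand-ins for an old TreeNode and a new TokenTree.

```python
from types import SimpleNamespace


def node_payload(node):
    """Return the token of a tree node, whichever attribute name it uses."""
    if hasattr(node, "token"):
        return node.token  # 1.0 style
    return node.data  # 0.1 style


# Stand-ins for the two node types (assumed shapes, for illustration only).
old_style = SimpleNamespace(data={"form": "fox"}, children=[])
new_style = SimpleNamespace(token={"form": "fox"}, children=[])

print(node_payload(old_style))
print(node_payload(new_style))
```

Once everything runs on 1.0, the helper can be dropped in favour of plain .token access.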

Parsing of IDs now includes ranges and decimals

Previously, only IDs in the form of positive integers were recognized. Now conllu supports ranges ("1-3") and decimals ("3.1") too. If your code relied on those IDs being returned as None, you need to change that check to isinstance(value, int) instead.

"1" -> 1
-"1-3" -> None
+"1-3" -> (1, "-", 3)
-"3.1" -> None
+"3.1" -> (3, ".", 1)
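The mapping above can be handled with a check along these lines. This is a sketch, assuming the ID values have exactly the shapes listed (an int, or a 3-tuple with "-" or "." in the middle). classify_id is a hypothetical helper name, not something conllu provides.

```python
def classify_id(value):
    """Describe a parsed ID value based on the shapes listed above."""
    if isinstance(value, int):
        return "word"
    if isinstance(value, tuple) and len(value) == 3:
        _start, separator, _end = value
        if separator == "-":
            return "range"
        if separator == ".":
            return "decimal"
    return "unknown"


print(classify_id(1))
print(classify_id((1, "-", 3)))
print(classify_id((3, ".", 1)))

# Keeping only plain integer IDs, which is what a None check used to do:
ids = [1, (1, "-", 3), 2, (3, ".", 1), 3]
plain_ids = [value for value in ids if isinstance(value, int)]
print(plain_ids)
```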