Releases: undertheseanlp/underthesea
Underthesea 1.1.7
Major Features and Improvements
- API CHANGE: Rename the word_sent function to word_tokenize
Bug fixes
- Fix dependency hell (#174)
Documentation and examples
- Add Vietnamese README page README.vi.rst
- Update style in README.rst page
Release Notes
The main focus of this release is fixing the dependency hell error reported by @dthphuong and @YannDubs. The fix speeds up the installation of underthesea and removes all unnecessary dependencies from the default install.
Another important update is an API change: the word_sent function is renamed to word_tokenize, which is a better name for the word segmentation task.
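Code that has to run against releases from both before and after the rename can look the function up by name. The following is a minimal sketch, not part of underthesea: resolve_tokenizer is a hypothetical helper, and the two stand-in objects only mimic an old and a new release.

```python
from types import SimpleNamespace


def resolve_tokenizer(module):
    """Return the word segmentation function exposed by `module`.

    Prefers the new name, word_tokenize, and falls back to the old
    name, word_sent, for releases published before the rename.
    """
    for name in ("word_tokenize", "word_sent"):
        func = getattr(module, name, None)
        if callable(func):
            return func
    raise AttributeError("module exposes neither word_tokenize nor word_sent")


# Hypothetical stand-ins for an old and a new release. Splitting on
# whitespace is NOT how underthesea segments words; it only keeps the
# sketch self-contained.
old_release = SimpleNamespace(word_sent=lambda text: text.split())
new_release = SimpleNamespace(word_tokenize=lambda text: text.split())

tokenize_old = resolve_tokenizer(old_release)
tokenize_new = resolve_tokenizer(new_release)
```

With the real package installed, the same helper would be called as resolve_tokenizer(underthesea) after import underthesea.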
Underthesea 1.1.6
Major Features and Improvements
- NEW: Implement Vietnamese aspect sentiment analysis on banking social data.
- NEW: Improve the languageflow project with new models (KimCNNCLassifier, XGBoostClassifier) and develop LanguageBoard to visualize and inspect features and trained models.
Bug fixes
- Fix a bug when tokenizing strings containing "=" (#159)
Documentation and examples
- Create a live demo of aspect sentiment analysis http://magizbox.com:9386/#!/sentiment
Release Notes
The main feature in this release is aspect sentiment analysis. We conducted a bunch of experiments on social media posts from the banking domain. Traditional classifiers such as SVM, Naive Bayes, and Gradient Boosting Tree with count and tf-idf features still yield better results (59.5% F1 score) than deep learning models such as fasttext and CNN. You can view a live demo of Vietnamese aspect sentiment analysis in the underthesea service.
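For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall. The sketch below shows the arithmetic only; the counts are made up for illustration, and these notes do not say how the reported score was averaged across aspects.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) computed from raw prediction counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, not taken from the experiments in this release.
precision, recall, f1 = precision_recall_f1(
    true_positives=60, false_positives=40, false_negatives=20
)
```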
We renamed the underthesea-flow project to languageflow and integrated new models (KimCNNCLassifier, XGBoostClassifier). See the languageflow documentation for more detail.
Underthesea 1.1.5
Major Features and Improvements
- NEW: Implement Vietnamese named entity recognition using CRF (#90)
- NEW: Create new projects: underthesea-flow for NLP experiments and underthesea.amrbank for building a Vietnamese AMR bank.
- One-line install is back; models and data are now downloaded only on demand.
Bug fixes
- Refactor underthesea.word_sent, underthesea.pos_tag, underthesea.chunking projects
Documentation and examples
- Create a live demo of named entity recognition http://magizbox.com:9386/#/ner
Release Notes
The main feature in this release is named entity recognition. Our experiments focus on conditional random field models, which yield reasonable results and are fast (~20 mins per experiment). For more information about the NER experiments, go to its own repository.
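A CRF tagger scores label sequences from per-token features. The sketch below illustrates what such features commonly look like; the feature set is an assumption made for illustration, since these notes do not describe the features used in the underthesea experiments, and token_features and sentence_features are hypothetical names.

```python
def token_features(tokens, i):
    """Build a feature dict for the token at position i (illustrative set)."""
    token = tokens[i]
    features = {
        "bias": 1.0,
        "lower": token.lower(),
        "is_title": token.istitle(),   # capitalized words often start names
        "is_upper": token.isupper(),
        "is_digit": token.isdigit(),
    }
    # Context features: neighbouring tokens, or sentence boundary markers.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(tokens):
    """Return one feature dict per token, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


example = sentence_features(["Ha", "Noi", "2018"])
```

Lists of dicts in this shape are what CRF toolkits typically take as input, paired with one label per token.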
We did a lot of work this month to improve our pipeline, and created a new project, underthesea-flow, for this purpose.
We also created a new project, underthesea.amr, in response to the rise of AMR. Our first goal is to create the first 3000 annotated Vietnamese sentences in our AMR bank.
Contributors
Thanks to @rain1024, @jackNhat, @vunb for the contributions!
Underthesea 1.1.4
Major Features and Improvements
- NEW: Implement Vietnamese text classification using fasttext (#118)
Bug fixes
- Fix issue in Text wrapper function
Documentation and examples
- Create a live demo of text classification http://magizbox.com:9386/#/classification
Release Notes
The main feature in this release is text classification. We experimented with some standard classifiers (Naive Bayes, SVM family, xgboost) and a trendy classifier, fasttext, on a large Vietnamese news dataset (30k sentences). The winner is fasttext because it is very fast and yields the best accuracy and F1 score. For more information about the classification experiments, see its own repository.
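fastText's supervised mode reads one example per line, with each label marked by a prefix (`__label__` by default). The sketch below formats examples in that layout; to_fasttext_line is a hypothetical helper, and the category name and sentence are made up rather than taken from the dataset used here.

```python
def to_fasttext_line(label, text, prefix="__label__"):
    """Format one (label, text) pair as a fastText supervised training line."""
    # Labels must be a single token, so inner whitespace becomes underscores.
    clean_label = "_".join(label.split())
    # One example per line: collapse newlines and repeated spaces in the text.
    clean_text = " ".join(text.split())
    return f"{prefix}{clean_label} {clean_text}"


# Hypothetical example, written without diacritics to keep the sketch ASCII.
line = to_fasttext_line("the thao", "Doi tuyen\nthang  tran")
```

Writing one such line per document to a text file gives a training file in the format fastText expects.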
We're afraid that we can no longer support one-line install because of the many dependencies that come with v1.1.4 (fasttext, sklearn). Another reason is that we want to separate models from code. So after installing underthesea, you must take one more small step and download the models. Check out how to make underthesea work with four lines in the Installation section.
See you next release!
Underthesea 1.1.3
Major Features and Improvements
- NEW: Live demo at underthesea.herokuapp.com
- NEW: Support python 3
Underthesea 1.1.0
Word Segmentation, POS Tagging, Chunking
Support python 2 only