Releases: undertheseanlp/underthesea

Underthesea 1.1.7

11 Apr 19:15
a6915e1

✨ Major Features and Improvements

  • API CHANGE: Change word_sent function to word_tokenize

🔴 Bug fixes

  • Fix dependency hell (#174)

🗎 Documentation and examples

  • Add Vietnamese README page README.vi.rst
  • Update style in README.rst page

🔊 Release Notes

The main focus of this release is fixing the dependency hell error reported by @dthphuong and @YannDubs. The fix speeds up the installation of underthesea and removes all unnecessary dependencies from the default install.

Another important update is an API change: we renamed the word_sent function to word_tokenize, which is a better name for the word segmentation task.
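
For callers that still use the old name, the rename can be bridged with a small alias. The following is only a minimal sketch, not part of underthesea: the word_sent wrapper defined here is a hypothetical shim in caller code, and the whitespace fallback exists only so the snippet runs without the library installed.

```python
import warnings

try:
    # Name introduced by the rename described in these release notes.
    from underthesea import word_tokenize
except ImportError:
    # Placeholder so this sketch runs without underthesea installed.
    # It only splits on whitespace and is NOT the library's segmenter.
    def word_tokenize(sentence):
        return sentence.split()


def word_sent(sentence):
    """Hypothetical caller-side alias that keeps the old name working."""
    warnings.warn(
        "word_sent was renamed to word_tokenize in underthesea 1.1.7",
        DeprecationWarning,
        stacklevel=2,
    )
    # Forward to the new function so both names return the same result.
    return word_tokenize(sentence)
```

New code should call word_tokenize directly; an alias like this is only useful while older call sites are being migrated.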

Contributors

Thanks to @rain1024, @jackNhat for the contributions!

Underthesea 1.1.6

30 Dec 03:50
0a40418

✨ Major Features and Improvements

  • NEW: Implement Vietnamese aspect sentiment analysis for social data in the banking domain.
  • NEW: Improve languageflow project with new models (KimCNNCLassifier, XGBoostClassifier), develop LanguageBoard to visualize and inspect features and trained models.

🔴 Bug fixes

  • Fix bug when tokenizing strings containing "=" (#159)

🗎 Documentation and examples

🔊 Release Notes

The main feature in this release is aspect sentiment analysis. We conducted a bunch of experiments on social media posts in the banking domain. Traditional classifiers such as SVM, Naive Bayes, and Gradient Boosting Tree with count and tf-idf features still yield a better result (59.5% F1 score) than deep learning models such as fasttext and CNN. You can view a live demo of Vietnamese aspect sentiment analysis in the underthesea service.

We renamed the underthesea-flow project to languageflow and integrated new models (KimCNNCLassifier, XGBoostClassifier). See the languageflow documentation for more detail.

Contributors

Thanks to @rain1024, @jackNhat for the contributions!

Underthesea 1.1.5

06 Oct 07:04

✨ Major Features and Improvements

  • NEW: Implement a Vietnamese named entity recognition using CRF #90
  • NEW: Create new projects: underthesea-flow for NLP experiments and underthesea.amrbank for building a Vietnamese AMR Bank.
  • One-line install is back; models and data are downloaded only on demand.

🔴 Bug fixes

  • Refactor underthesea.word_sent, underthesea.pos_tag, underthesea.chunking projects

🗎 Documentation and examples

🔊 Release Notes

The main feature in this release is named entity recognition. Our experiments focus on conditional random field models, which yield reasonable results and are fast (about 20 minutes per experiment). For more information about the NER experiments, go to its own repository.
A lot of work went into improving our pipeline this month, and a new project, underthesea-flow, was created for this purpose.
We also created a new project, underthesea.amr, in response to the rise of AMR. Our first goal is to create the first 3,000 annotated Vietnamese sentences in our AMR bank.

👥 Contributors

Thanks to @rain1024, @jackNhat, @vunb for the contributions!

Underthesea 1.1.4

12 Sep 12:10

✨ Major Features and Improvements

  • NEW: Implement a Vietnamese text classification using fasttext #118

🔴 Bug fixes

  • Fix issue in Text wrapper function

🗎 Documentation and examples

🔊 Release Notes

The main feature in this release is text classification. We experimented with some standard classifiers (Naive Bayes, SVM family, xgboost) and a trendy classifier, fasttext, on a very large Vietnamese news data set (30k sentences). The winner is fasttext because it is very fast and yields the best accuracy and F1 score. For more information about the classification experiments, follow the link to its own repository.

We're afraid that we can't support one-line install because of the many dependencies that come with v1.1.4 (fasttext, sklearn). Another reason is that we want to separate models from code. So after installing underthesea, you must take one small extra step: downloading the models. Check out how to make underthesea work in four lines in the Installation section.

See you next release!

Underthesea 1.1.3

23 Aug 01:57

✨ Major Features and Improvements

Underthesea 1.1.0

30 May 04:02

Word Segmentation, POS Tagging, Chunking
Supports Python 2 only