sentirueval

http://www.dialog-21.ru/media/3410/loukachevitchnvrubtsovayv.pdf

SentiRuEval-2016 is a Russian sentiment analysis evaluation devoted to reputation monitoring of banks and telecom companies on Twitter. We describe the task, the data, the data preparation procedure, and the participants’ results. At the previous evaluation, SentiRuEval-2015, it was noticed that the presented machine-learning approaches depended significantly on the training collection, which was not sufficient for high-quality classification of the test collection because of data sparsity and the time gap between the collections. The results of the participants at SentiRuEval-2016 show that they have taken successful steps to overcome these problems by combining machine-learning approaches with additional manually and automatically generated lexical resources.

Repository description

The repository contains Russian-language datasets for two domains: banks and telecom companies.

  • bank_train_2016.xml

  • banks_test_2016.xml

  • banks_test_etalon.xml

  • tkk_test_2016.xml

  • tkk_test_etalon.xml

  • tkk_train_2016.xml

You can find the official results of the SentiRuEval-2016 sentiment analysis evaluation here: Results_SentiRueval_2016.xlsx

The evaluation script is in the Eval folder. Refer to it for the formulas and the idea behind the F-measure calculation.
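For orientation only, here is a minimal Python sketch of a macro-averaged F-measure. It is not the script from the Eval folder. It assumes that the score is the average of the F1 values of the positive and negative classes, and that labels are encoded as `1`, `-1` and `0`. The function names `class_f1` and `macro_f1` are made up for this example. Check the Eval script for the official formulas.

```python
# Illustrative sketch, NOT the official evaluation script from the Eval folder.
# Assumptions: labels are 1 (positive), -1 (negative), 0 (neutral), and the
# score is the mean F1 over the positive and negative classes only.

def class_f1(gold, pred, label):
    """Return the F1 score of a single class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=(1, -1)):
    """Average the per-class F1 over the given labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(class_f1(gold, pred, label) for label in labels) / len(labels)


if __name__ == "__main__":
    gold = [1, -1, 0, 1, -1]
    pred = [1, 0, 0, -1, -1]
    print(round(macro_f1(gold, pred), 4))
```

Neutral items are not averaged as a class of their own here, but they still affect the score: a neutral tweet predicted as positive counts as a false positive for the positive class.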

Please cite:

Loukachevitch, N. V., & Rubtsova, Y. V. (2016). SentiRuEval-2016: overcoming time gap and data sparsity in tweet sentiment analysis. In Computational Linguistics and Intellectual Technologies (pp. 416-426). http://www.dialog-21.ru/media/3410/loukachevitchnvrubtsovayv.pdf