Sentiment Analysis

UIT-ViSFD: A Vietnamese Smartphone Feedback Dataset for Aspect-Based Sentiment Analysis

UIT-ViSFD consists of 11,122 human-annotated comments from mobile e-commerce and is freely available for research purposes:

📜 SA2SL: From Aspect-Based Sentiment Analysis to Social Listening System for Business Intelligence
🔗 Vietnamese Smartphone Feedback Dataset

UIT ABSA Datasets

📜 Two New Large Corpora for Vietnamese Aspect-based Sentiment Analysis at Sentence Level

Hotel Dataset: 7180 reviews (train), 795 reviews (development), 2030 reviews (test)

AIVIVN 2019: Sentiment Analysis Challenge

🔗 AIVIVN 2019: Sentiment Analysis Challenge 2019 website

The data contains user reviews in two categories: "positive" and "negative".

27068 sentences

Train: 16087 sentences, Test: 10981 sentences (public: 5454 sentences, private: 5527 sentences)
Labels: 0 (positive), 1 (negative)

Leaderboard

Score: F1 score of negative labels

Author	Model	Score (Public Test)	Score (Private Test)	Paper/Source	Code
HoangNhat2	Weighted Ensemble (TextCNN, VDCNN, HARNN, SARNN)	0.90087	0.90012	Write up	Official
iota	SVM	0.8914	0.89688	Write up	Official
Nal_AI	SVM (TF-IDF)	0.89545	0.89574	Write up	Official
nlpers	Ensemble (LinearSVC, SGD, RandomForest)	0.88921	0.89559	Write up	Official
ngxbac	LightGBM (TFIDF)	0.867		Write up	Official
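
The leaderboard score, F1 on the negative label, can be reproduced on your own predictions with a few lines of plain Python. The following is a minimal sketch, not the challenge's official scorer: the function name `f1_for_label` and the toy labels are invented for illustration, and only the label encoding (0 = positive, 1 = negative) comes from the dataset description.

```python
def f1_for_label(y_true, y_pred, label=1):
    """F1 score computed for a single class (`label`), ignoring the others."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No correct prediction for this class: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 0 = positive, 1 = negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

score = f1_for_label(y_true, y_pred, label=1)
print(f"F1 (negative label): {score:.4f}")
```

Scoring only the negative class means a model that predicts "positive" for everything gets 0, regardless of its accuracy.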

Vietnamese Students’ Feedback Corpus (UIT-VSFC)

📜 UIT-VSFC: Vietnamese Students’ Feedback Corpus for Sentiment Analysis
📁 VSFC data

Students’ feedback is a vital resource for interdisciplinary research that combines sentiment analysis and education. The Vietnamese Students’ Feedback Corpus (UIT-VSFC) consists of over 16,000 sentences, human-annotated for two tasks: sentiment-based and topic-based classification. To assess the quality of our corpus, we measure annotator agreement and evaluate classification performance on UIT-VSFC.
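
The description above mentions measuring annotator agreement but does not say which measure was used. As an illustration only, the sketch below computes Cohen's kappa, one common choice for two annotators; the function name `cohen_kappa` and the toy sentiment labels are assumptions, not taken from the corpus or its paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy annotations (hypothetical), one label per sentence.
annotator_1 = ["positive", "negative", "neutral", "positive", "negative", "positive"]
annotator_2 = ["positive", "negative", "positive", "positive", "neutral", "positive"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one label dominates the corpus.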

Leaderboard

Model	Topic (F1)	Sentiment (F1)	Paper/Source	Code
Bi-LSTM - Word2Vec	0.896	0.92	Nguyen et al. NICS'18
Maximum Entropy classifier	0.88	0.84	Nguyen et al. KSE'18

VLSP 2018 Shared Task: Aspect Based Sentiment Analysis

📜 VLSP 2018 Shared Task: Aspect Based Sentiment Analysis Paper

Leaderboard

Restaurant Dataset: 2961 reviews (train), 1290 reviews (development), 500 reviews (test)

Model	Aspect (F1)	Aspect-Polarity (F1)	Paper/Source
CNNs	0.80		Dang et al. NICS'18
SVM	0.77	0.61	Dang et al. VLSP'18
SVM	0.54	0.48	Nguyen et al. VLSP'18

Hotel Dataset: 3000 reviews (training), 2000 reviews (development), 600 reviews (test)

Model	Aspect (F1)	Aspect-Polarity (F1)	Paper/Source
SVM	0.70	0.61	Dang et al. VLSP'18
CNNs	0.69		Dang et al. NICS'18
SVM	0.56	0.53	Nguyen et al. VLSP'18
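
The two columns in the tables above score different things: Aspect (F1) checks only which aspects were detected, while Aspect-Polarity (F1) also requires the correct sentiment for each aspect. The sketch below shows the distinction under the assumption that both scores are micro-averaged over all reviews; the aspect names, the toy predictions, and the helper names `micro_f1` and `aspects_only` are invented for illustration and are not the shared task's official evaluation script.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-review sets of labels."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


def aspects_only(reviews):
    """Drop the polarity, keeping just the set of aspects per review."""
    return [{aspect for aspect, _polarity in review} for review in reviews]


# Toy data (hypothetical): each review is a set of (aspect, polarity) pairs.
gold = [
    {("FOOD#QUALITY", "positive"), ("SERVICE#GENERAL", "negative")},
    {("ROOMS#CLEANLINESS", "positive")},
]
pred = [
    {("FOOD#QUALITY", "positive"), ("SERVICE#GENERAL", "positive")},
    {("ROOMS#CLEANLINESS", "positive"), ("HOTEL#PRICES", "negative")},
]

aspect_f1 = micro_f1(aspects_only(gold), aspects_only(pred))
aspect_polarity_f1 = micro_f1(gold, pred)
print(f"Aspect F1: {aspect_f1:.3f}, Aspect-Polarity F1: {aspect_polarity_f1:.3f}")
```

Because a polarity error turns a correct aspect into both a false positive and a false negative, the Aspect-Polarity score can never exceed the Aspect score for the same predictions.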

VLSP 2016 Shared Task: Sentiment Analysis

📜 VLSP 2016 Shared Task: Sentiment Analysis Paper

The data contains user reviews of technological devices in three categories: "negative", "positive" and "neutral".

A review can be very complex with different sentiments on various objects. Therefore, we set some constraints on the dataset as follows:

The dataset only contains reviews having personal opinions.
The data are usually short comments, containing opinions on one object. There is no limitation on the number of the object's aspects mentioned in the comment.
Label (positive/negative/neutral) is the overall sentiment of the whole review.
The dataset contains only real data collected from social media, not artificially created by humans.

5100 sentences for training, 1050 sentences for testing

Train: 1700 positive, 1700 neutral, 1700 negative
Test: 350 positive, 350 neutral, 350 negative

Leaderboard

Model	F1	Paper/Source	Code
Perceptron/SVM/Maxent	80.05	Pham et al. VLSP'16
SVM/MLNN/LSTM	71.44	Nguyen et al. VLSP'16
Ensemble: Random forest, SVM, Naive Bayes	71.22	Pham et al. VLSP'16
Ensemble: SVM, LR, LSTM, CNN	69.71	Nguyen et al. NICS'18
SVM	67.54	Ngo et al. VLSP'16
SVM/MLNN	67.23	Tran et al. VLSP'16
Multi-channel LSTM-CNN	59.61	Vo et al. KSE'17	Official

Miscellaneous

📜 Papers

Huynh et al. NICS'18. Integrating Grammatical Features into CNN Model for Emotion Classification
Pham et al. 2016, Ngo et al. SoICT'16, Pham et al. KSE'16, Tran et al. 2016
Kieu et al. KSE'10

📁 Open sources

VnEmoLex (2017) data
polyglot (2014-2017) c++, java, python
pyurgent (2016) python, data
VietSentiWordNet (2014) data
