
guillembp/nlp_personality_classifier


Improving the performance of a personality trait classifier trained on ambiguous labels

Abstract

In this project I apply Natural Language Processing techniques and compare several machine learning models on the task of classifying interviewees from the EmotiW 2017 dataset into six personality types based on interview transcripts. In the ground truth (i.e., the labels), most subjects have ambiguous personality type labels because of labeler disagreement, as measured by Fleiss' kappa coefficient. To increase the contrast among personality traits, I compare several approaches that dichotomize the labels to maximize subsequent classification performance: first by finding ambiguity thresholds and truncating the data, and then by comparing two different weighting functions. Finally, I explore NLP preprocessing choices, dimensionality reduction with PCA, Random Forest and Multinomial Naive Bayes models, and model tuning to further improve the classifier's performance.
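
As a rough illustration of the agreement measure mentioned above, the sketch below computes Fleiss' kappa and the per-subject agreement term from a table of rating counts, then keeps only subjects whose agreement meets a cutoff. It is not the project's actual code: the function names, the toy table, and the 0.5 threshold are all hypothetical, and the abstract does not say how many labelers rated each subject or how the ambiguity thresholds were chosen.

```python
from typing import List, Sequence


def per_item_agreement(counts: Sequence[int]) -> float:
    """Fraction of rater pairs that agree on one subject (P_i in Fleiss' kappa)."""
    n = sum(counts)
    if n < 2:
        raise ValueError("each subject needs at least two ratings")
    return (sum(c * c for c in counts) - n) / (n * (n - 1))


def fleiss_kappa(table: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where table[i][j] counts raters
    who assigned subject i to category j."""
    n_items = len(table)
    n_raters = sum(table[0])
    if any(sum(row) != n_raters for row in table):
        raise ValueError("every subject must have the same number of ratings")

    # Mean observed agreement across subjects.
    p_bar = sum(per_item_agreement(row) for row in table) / n_items

    # Agreement expected by chance, from the overall category proportions.
    n_categories = len(table[0])
    totals = [sum(row[j] for row in table) for j in range(n_categories)]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")

    return (p_bar - p_e) / (1.0 - p_e)


def keep_unambiguous(table: Sequence[Sequence[int]], threshold: float) -> List[int]:
    """Indices of subjects whose per-item agreement is at least `threshold`
    (hypothetical truncation rule, assumed for illustration)."""
    return [i for i, row in enumerate(table) if per_item_agreement(row) >= threshold]


# Toy data (made up): 4 subjects, 5 raters each, 3 categories.
toy_table = [
    [5, 0, 0],
    [3, 2, 0],
    [1, 2, 2],
    [0, 0, 5],
]
kept = keep_unambiguous(toy_table, threshold=0.5)
print("kappa:", round(fleiss_kappa(toy_table), 3))
print("subjects kept:", kept)
```

Truncating on a per-subject score like this trades training set size for cleaner labels. A weighting function would instead keep every subject and down-weight the ambiguous ones.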
