A method for utilizing a large language model (LLM) to generate intermediate, interpretable features for the downstream task of fake-versus-real news classification. The LLM performs fine-grained emotion sentiment analysis on both the title and each sentence of an article, and features are extracted from these predictions. These features are then passed to a simpler classifier that determines the authenticity of the news article. The final classifier reaches a test accuracy of roughly 89%, suggesting the approach is practical for authenticating news articles.
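The end-to-end pipeline can be illustrated with a minimal sketch. The code below assumes a GoEmotions-style emotion classifier loaded through Hugging Face `transformers` and a scikit-learn logistic regression standing in for the "simpler classifier"; the model path, feature aggregation (mean over sentences), and hyperparameters are illustrative placeholders, not the repository's exact implementation.

```python
# Sketch: emotion features from title + body sentences -> simple classifier.
# All names and paths below are illustrative assumptions.
import numpy as np
from transformers import pipeline
from sklearn.linear_model import LogisticRegression

# Hypothetical path to a GoEmotions-style checkpoint; the repo trains its own DistilBERT.
emotion_clf = pipeline(
    "text-classification",
    model="path/to/goemotions-checkpoint",
    top_k=None,          # return scores for every emotion label
    truncation=True,
)

def emotion_vector(text: str) -> np.ndarray:
    """Score one piece of text and return its per-emotion probabilities in a fixed order."""
    scores = emotion_clf(text)[0]  # list of {"label": ..., "score": ...}
    return np.array([s["score"] for s in sorted(scores, key=lambda s: s["label"])])

def article_features(title: str, sentences: list[str]) -> np.ndarray:
    """Concatenate title emotions with the mean emotions over body sentences."""
    body = np.mean([emotion_vector(s) for s in sentences], axis=0)
    return np.concatenate([emotion_vector(title), body])

# X: stacked feature vectors, y: 0 = fake, 1 = real (labels from the dataset).
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# print(clf.score(X_test, y_test))
```

Because the downstream model only sees per-emotion scores, its decisions can be inspected directly, for example by examining the logistic-regression weights assigned to each emotion.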
Machine learning advancements have enabled large performance improvements across problems spanning both scientific research and business. However, one of the pitfalls of architectures complex enough to perform such tasks is that they are rarely interpretable, even less so in natural language processing. We would like insight into how these models perform their tasks so that their outputs need not be taken at face value. This interpretability is instrumental to the mainstream adoption of artificial intelligence and helps ensure that intelligent systems are both fair and ethical.
In the Age of Information, retractions in the scientific community have been shown to be detrimental to society's understanding of a topic, since retractions rarely garner the same interest as the original misinformation. It is not a large leap to say that this phenomenon extends to news as well, allowing disinformation to spread rapidly through society. If we seek to stop the spread of misinformation, we need to intervene at the moment the information is either produced or consumed. This work tackles the problem from the standpoint of consumption, since this requires no integration with publishers and can be applied more generically.
In this vein, we sought to assess the trustworthiness of a news article using interpretable features. This has a two-fold impact: it lets us present the consumer with a grade of the information's trustworthiness, possibly preventing its spread, while also making the model's reasoning more interpretable, allowing the consumer to decide whether to trust the algorithm's output.
To run the Jupyter notebook efficiently, the outputs of long computations have been serialized and are simply loaded on subsequent executions. To avoid having to train the DistilBERT model, a pretrained model is provided for your convenience. This model can be downloaded here. The model must be placed in the model_outputs_distill directory to be used in the notebook.
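As a rough sketch of how the downloaded checkpoint might be loaded, assuming it was saved with `transformers`' `save_pretrained` (the exact contents of model_outputs_distill may differ in the actual notebook):

```python
# Load the provided DistilBERT checkpoint instead of retraining.
# Assumption: model_outputs_distill contains a save_pretrained() checkpoint;
# if no tokenizer files are included, fall back to the base DistilBERT tokenizer.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_dir = "model_outputs_distill"
try:
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
except OSError:
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()  # inference only; no further training needed
```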
https://github.com/google-research/google-research/tree/master/goemotions
Apache License 2.0
@inproceedings{demszky2020goemotions,
  author    = {Demszky, Dorottya and Movshovitz-Attias, Dana and Ko, Jeongwoo and Cowen, Alan and Nemade, Gaurav and Ravi, Sujith},
  booktitle = {58th Annual Meeting of the Association for Computational Linguistics (ACL)},
  title     = {{GoEmotions: A Dataset of Fine-Grained Emotions}},
  year      = {2020}
}
https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset
Unknown
Ahmed H, Traore I, Saad S. (2018) "Detecting Opinion Spams and Fake News Using Text Classification". Journal of Security and Privacy, Volume 1, Issue 1, Wiley, January/February 2018.

Ahmed H, Traore I, Saad S. (2017) "Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques". In: Traore I., Woungang I., Awad A. (eds) Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. ISDDC 2017. Lecture Notes in Computer Science, vol 10618. Springer, Cham, pp. 127-138.