This project is for the Data Science Engineering Methods and Tools course at Northeastern University.
The project explores linguistic patterns in Adam Smith's Wealth of Nations using natural language processing techniques, built on numpy, pandas, pymc3, scipy, networkx, and sklearn. For Bayesian modeling, we fit a negative binomial model to analyze the relationship between positive words and economic indicators such as wages and profits. For network modeling, we used the PageRank algorithm to identify key topic sentences. For semantic modeling, we used the K-Medoids clustering algorithm to divide the book into ten sections and identify the topic sentence of each section. Through this project, we gained insight into the language used in the Wealth of Nations.
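
To make the network modeling step more concrete, here is a minimal sketch of how PageRank can pick a topic sentence from a sentence-similarity graph. It is an illustration under stated assumptions, not the project's actual code: it uses only the Python standard library instead of networkx and sklearn, it assumes edges are weighted by bag-of-words cosine similarity, and the function names (tokenize, cosine_similarity, pagerank, topic_sentence) and the example sentences are hypothetical. The sentences are made up for the demo and are not quotations from the book.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Power-iteration PageRank on a weighted adjacency matrix (list of lists)."""
    n = len(weights)
    out_weight = [sum(row) for row in weights]
    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        # Nodes without edges spread their rank uniformly over all nodes.
        dangling = sum(ranks[j] for j in range(n) if out_weight[j] == 0)
        new_ranks = []
        for i in range(n):
            incoming = sum(
                ranks[j] * weights[j][i] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_ranks.append(
                (1 - damping) / n + damping * (incoming + dangling / n)
            )
        converged = sum(abs(x - y) for x, y in zip(new_ranks, ranks)) < tol
        ranks = new_ranks
        if converged:
            break
    return ranks


def topic_sentence(sentences):
    """Return the highest-ranked sentence and the rank of every sentence."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    # Assumption: edge weight = cosine similarity, with no self-loops.
    weights = [
        [cosine_similarity(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    ranks = pagerank(weights)
    best = max(range(n), key=ranks.__getitem__)
    return sentences[best], ranks


# Made-up example sentences (not quotations from the book).
sentences = [
    "Wages of labour rise when the demand for labour rises.",
    "Profits of stock fall when wages of labour rise.",
    "The demand for labour depends on the stock of the country.",
    "A ship sailed from the harbour yesterday.",
]
best, ranks = topic_sentence(sentences)
print(best)
for sentence, rank in zip(sentences, ranks):
    print(f"{rank:.3f}  {sentence}")
```

Sentences that share vocabulary with many others accumulate rank, so an off-topic sentence ends up near the bottom. A fuller pipeline would likely remove stop words and use TF-IDF weights before building the graph, and the same scoring could be applied within each cluster to pick one topic sentence per section.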