This is the combined workshop paper repository for the First Bangla Language Processing Workshop (BLP 2023), co-located with EMNLP 2023 in Singapore.
Workshop information: https://blp-workshop.github.io/
Bangla music is a treasure trove of cultural heritage that has been prevalent and thriving for more than a century. This paper presents a new Bangla music dataset with unique features that reflect the thematic, phonemic and stylometric evolution of Bangla music from the 20th to the 21st century. The dataset is accompanied by a thorough exploratory analysis to unfold the ever-evolving elements of Bangla music from a temporal and lyrical perspective. Additionally, we show that our dataset is a good fit for various classification tasks using deep neural classifiers. We have strategically fine-tuned the BanglaBERT model to achieve an average accuracy of 60% for various phonemic classification and artist identification from the lyrics.
Our paper was not accepted. We have decided to make the minor changes suggested by the reviewers and upload it to the Archive. Overall, it was good hands-on practice for me and gave me exposure to NLP with the help of my mentor, Ishrak Hayet, who guided me throughout the work. My main takeaway is that research is an iterative process. The clustering algorithm I used needed better data, and I could have experimented more with feature engineering and feature selection. Our data collection process was hectic, which is why the dataset has only around 300 observations, although that is still a large number of tokens because the song lyrics are long. The field of AI is extremely vast, and there is no way to get better at it other than trying, trying, and trying again. The experience has been fruitful: I learnt how to nurture patience and to keep digging into problems.
We were able to generate phonemic Bangla datasets, all of which are cited in our paper. They are open source and free to use under the MIT License. Hopefully, they will help someone in the future.
Collaborators: Ishraq Hayet, RokunuzJahan Rudro