Develop AI Models for Response Categorization and Sentiment Analysis #294
Hello @jvJUCA, I am Arya Sarkar, an NLP developer based in Kolkata, India. I am keen to start contributing to this project; I am relatively new to open source and hope to take part in GSoC 2024 working on it. I will spend some time understanding the code behind the project; in the meantime, I have a question about this particular issue.
I have worked with a similar tech stack for the last couple of years and feel I can help by leveraging that experience. Happy to be a part of the community :D
@jvJUCA This got me thinking further about possible solutions, and there are two ways to proceed.
What are your thoughts?
Hello, my name is Harsh Kasat and I am from India. I am interested in contributing to the project; I am new to open source and hoping to be part of GSoC 2024. I would like to work on this project.
@harshkasat The SageMaker approach with minimal computing power like a t3.small makes response times very high. If we actually want to run an LLM and pricing is a potential issue, it would make sense to use models with a smaller footprint than Llama-7B. GPT4All, although nowhere near as powerful as Llama-7B, can serve a similar purpose in the event we implement a RAG / Q&A pipeline in the future. In my experience, fine-tuning a RoBERTa-base model is the best option for SENTIMENT ANALYSIS tasks and requires no special computing power. Similarly, for topic modelling, BERT fine-tuned on topic-modelling tasks is the best performer (I spent quite some time on both these tasks last year). @jvJUCA I would love to see a data sample of user responses to get a high-level idea of the possible techniques to use, and would love to discuss this in more detail :D
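To make the RoBERTa suggestion concrete, here is a minimal sketch of sentiment scoring with a pretrained RoBERTa checkpoint via the Hugging Face `transformers` pipeline. The model name (`cardiffnlp/twitter-roberta-base-sentiment-latest`) is an assumed public checkpoint, not one chosen in this thread, and the label set is illustrative.

```python
# Sketch: CPU-friendly sentiment analysis with a pretrained RoBERTa model.
# Assumes the `transformers` library is installed; the checkpoint name is
# an illustrative public model, not one agreed on in this issue.
from typing import List, Tuple

LABELS = ["negative", "neutral", "positive"]

def top_label(scores: List[float]) -> str:
    """Map raw per-class scores to the highest-scoring sentiment label."""
    return LABELS[max(range(len(scores)), key=scores.__getitem__)]

def analyze(texts: List[str]) -> List[Tuple[str, str]]:
    """Classify a batch of user responses with a pretrained checkpoint."""
    from transformers import pipeline  # deferred import: heavy dependency
    clf = pipeline(
        "sentiment-analysis",
        model="cardiffnlp/twitter-roberta-base-sentiment-latest",
    )
    return [(t, r["label"]) for t, r in zip(texts, clf(texts))]

if __name__ == "__main__":
    print(analyze(["I love this feature!", "This form is confusing."]))
```

Running the pipeline once on CPU is enough for batch categorization of survey responses; fine-tuning on project-specific data would follow the same model choice.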
@kindler-king, I totally agree that if we don't want RAG/Q&A, a model with a large memory footprint makes no sense; better to go for a smaller one. We can opt for BERT-family models such as BERT, RoBERTa, and DistilBERT. For quantization we can use I-BERT, or we can use LoRA for parameter-efficient fine-tuning.
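As a sketch of the LoRA idea, the snippet below attaches low-rank adapters to a DistilBERT classifier with the `peft` library (assumed installed). The base model, rank, and target module names are illustrative assumptions, not choices made in this thread.

```python
# Sketch: parameter-efficient fine-tuning of a small BERT-family model
# with LoRA adapters via the `peft` library. All hyperparameters here
# are illustrative assumptions.
def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA freezes the original d_in x d_out weight and trains two
    low-rank factors A (d_in x r) and B (r x d_out) instead, so only
    r * (d_in + d_out) parameters are trained per adapted layer."""
    return r * (d_in + d_out)

def build_lora_model():
    from transformers import AutoModelForSequenceClassification
    from peft import LoraConfig, get_peft_model
    base = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=3)
    cfg = LoraConfig(
        r=8,                               # adapter rank
        lora_alpha=16,
        target_modules=["q_lin", "v_lin"],  # DistilBERT attention layers
        lora_dropout=0.1,
        task_type="SEQ_CLS",
    )
    return get_peft_model(base, cfg)

if __name__ == "__main__":
    model = build_lora_model()
    model.print_trainable_parameters()  # a small fraction of the base model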
@harshkasat and @kindler-king I have to tell you that this issue is closely related to a project that we have not shared publicly, as it has been sent to a journal for publication.
That sounds intriguing. I recently authored a paper on multi-modal emotion recognition, i.e. fusing TEXT data (from Twitter) with VIDEO data to improve sentiment-analysis performance in attention space. I will start working on this issue and try to implement a pretrained CNN for video-based sentiment analysis. @marcgc21 Do you feel that we should proceed with:
Which do you think is closest to the use case you have in mind? I can start working on that. Thanks a lot for your input @marcgc21. Have a great day :D
An option 4 that comes to my mind, which seems reasonable, is: @marcgc21 How does that sound to you? Out of the 4 options, which do you think I should explore?
Hello @marcgc21 @jvJUCA Here's a simple implementation of a VIDEO SENTIMENT TRACKING system. It is nowhere near 100% accurate, but it can serve to estimate the sentiment of USERS from their webcam recordings or other videos.
Link to view the VIDEO sample of the result on STOCK videos I found on the web: Another interesting thing is that this runs in REAL TIME and can be replicated on servers with far less computing power than a standard LLM.
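A minimal sketch of such a real-time tracker is below. OpenCV (`cv2`) is assumed for video capture and overlay; `classify_frame` is a hypothetical stand-in for whatever pretrained emotion CNN is plugged in, and the sliding majority vote is one common way to stabilize noisy per-frame predictions.

```python
# Sketch: real-time webcam sentiment tracking. Assumes OpenCV (cv2) is
# installed; `classify_frame` is a hypothetical placeholder for a real
# pretrained emotion-recognition CNN.
from collections import Counter, deque
from typing import Deque, List

def classify_frame(frame) -> str:
    """Hypothetical stand-in: replace with a real per-frame emotion CNN."""
    return "neutral"

def smooth(labels: List[str], window: int = 5) -> List[str]:
    """Stabilize noisy per-frame predictions with a sliding majority vote."""
    buf: Deque[str] = deque(maxlen=window)
    out: List[str] = []
    for lab in labels:
        buf.append(lab)
        out.append(Counter(buf).most_common(1)[0][0])
    return out

def run_webcam() -> None:
    import cv2  # deferred import: only needed for the live demo
    cap = cv2.VideoCapture(0)          # default webcam
    buf: Deque[str] = deque(maxlen=5)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        buf.append(classify_frame(frame))
        current = Counter(buf).most_common(1)[0][0]
        cv2.putText(frame, current, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("sentiment", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```

The majority-vote window is what keeps the overlay from flickering between emotions frame to frame; widening it trades responsiveness for stability.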