Resonate: A Retrieval Augmented Framework For Meeting Insight Extraction

Resonate

Data Science Capstone

Sartaj Bhuvaji · Prachitee Chouhan · Madhuroopa Irukulla · Jay Singhvi

In the fast-paced professional realm, meetings serve as vital platforms for collaboration and decision-making. Yet amid the vast exchange of information, recalling essential details often proves challenging, which hinders overall productivity. Imagine a scenario where past discussions on User Interface design are essential but cumbersome to retrieve.

Our project aims to tackle this challenge by developing a solution that extracts pivotal insights from historical meetings. Leveraging Retrieval Augmented Generation techniques, our proposed system lets users upload meeting records and pose queries to retrieve relevant information. One core component of the system groups meetings based on their abstractive summaries. Several state-of-the-art clustering algorithms were extensively trained and evaluated. When users pose inquiries, our system will pinpoint the cluster most likely to contain relevant discussions.
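The cluster-selection step described above can be sketched as a nearest-centroid lookup. This is only an illustration under stated assumptions: the cluster labels, the toy 3-dimensional vectors, and the function names `cosine_similarity` and `pick_cluster` are hypothetical, and the clustering algorithm the project actually uses may select clusters differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def pick_cluster(query_vec, centroids):
    """Return the label of the centroid most similar to the query vector."""
    return max(
        centroids,
        key=lambda label: cosine_similarity(query_vec, centroids[label]),
    )


# Toy embeddings (assumption): a real encoder would produce these vectors
# from the abstractive summaries and from the user query.
centroids = {
    "ui_design": [0.9, 0.1, 0.0],
    "budget": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.3, 0.1]
best = pick_cluster(query_vec, centroids)
print(best)
```

Restricting retrieval to one cluster keeps the search space small, at the cost of missing discussions that were assigned to a different cluster.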

Using the Pinecone vector database, we retrieve pertinent conversations within a contextual window. The retrieved conversations and a custom prompt are then passed to a Large Language Model (LLM) to generate precise responses. To optimize the system, we explore diverse encoders and LLMs, fine-tune them, and evaluate each rigorously before integration. Through this approach, we address challenges in conversational meeting summarization and content discovery, and deliver a tailored, high-performance solution designed for user convenience.
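One way to read "within a contextual window" is that each matching utterance is returned together with a few neighbouring utterances from the same transcript. The sketch below shows that idea only; the function name `expand_with_context`, the `window` parameter, and the row layout are assumptions, not the project's actual retrieval code.

```python
def expand_with_context(rows, hit_indices, window=2):
    """Return rows within `window` positions of any hit, in transcript order.

    Overlapping windows are merged so no row is returned twice.
    """
    keep = set()
    for i in hit_indices:
        start = max(0, i - window)
        end = min(len(rows), i + window + 1)
        keep.update(range(start, end))
    return [rows[i] for i in sorted(keep)]


# Ten placeholder utterances; indices 1 and 7 stand in for vector-search hits.
rows = [f"utterance {i}" for i in range(10)]
context = expand_with_context(rows, hit_indices=[1, 7], window=1)
print(context)
```

A larger window gives the LLM more surrounding conversation but also consumes more of its context length.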

Objectives

The user should be able to upload an audio/video meeting file along with a meeting topic.
There can be multiple meeting topics, each with a series of meetings.
The user would then be able to choose a topic, chat with its meetings, and ask any question.

Initial Sketches

RAG Inference

The user would select the meeting topic and ask a question.
Pinecone would retrieve relevant information and feed the LLM with a custom prompt, context, and the user query.
We also plan to add a Semantic Router to route queries according to the user input.
The LLM would then generate the result and answer the question.
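The inference steps above combine a custom prompt, retrieved context, and the user query into one LLM input. A minimal sketch of that assembly follows; the template wording, the function name `build_prompt`, and the row fields (`start_time`, `speaker`, `text`) are assumptions made for illustration, and the project's real prompt is not reproduced here.

```python
# Hypothetical prompt template; the project's actual wording may differ.
PROMPT_TEMPLATE = (
    "You answer questions about past meetings.\n"
    "Use only the context below. If the answer is not there, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(retrieved_rows, question):
    """Format retrieved transcript rows and the user query into one prompt."""
    context = "\n".join(
        f"[{row['start_time']}] {row['speaker']}: {row['text']}"
        for row in retrieved_rows
    )
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Example rows shaped like diarized transcript output (assumed layout).
retrieved = [
    {"start_time": "00:01:05", "speaker": "spk_0",
     "text": "Let's move the menu to the left."},
    {"start_time": "00:01:12", "speaker": "spk_1",
     "text": "Agreed, it matches the mobile layout."},
]
prompt = build_prompt(retrieved, "Where will the menu go?")
print(prompt)
```

The resulting string would be sent to whichever LLM the application is configured with; that call is left out because it depends on the chosen provider.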

Data Store

The diagram below shows how we plan to store data using Pinecone, a popular vector database.
The user would upload meetings in audio/video format.
We would use AWS Transcribe to diarize and transcribe the audio file into timestamp, speaker, and text fields (this is simplified).
We would embed the text data into vectors and upload them to Pinecone serverless.
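The storage steps above can be sketched as turning each transcript row into a record with an id, a vector, and metadata, which is the general dictionary shape Pinecone's upsert accepts. Everything named here is an assumption for illustration: `make_records`, `toy_embed`, the id scheme, and the metadata keys are hypothetical, and `toy_embed` is a stand-in for a real encoder, not a meaningful embedding.

```python
def toy_embed(text):
    """Stand-in for a real encoder: three simple text statistics."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(text.count(" ")), float(vowels)]


def make_records(meeting_id, rows, embed):
    """Build one id/values/metadata record per transcript row."""
    records = []
    for i, row in enumerate(rows):
        records.append({
            "id": f"{meeting_id}-{i}",       # hypothetical id scheme
            "values": embed(row["text"]),    # the embedding vector
            "metadata": {
                "meeting_id": meeting_id,
                "start_time": row["start_time"],
                "speaker": row["speaker"],
                "text": row["text"],
            },
        })
    return records


# Rows shaped like simplified diarized transcription output (assumed layout).
rows = [
    {"start_time": "00:00:03", "speaker": "spk_0", "text": "hi there"},
    {"start_time": "00:00:05", "speaker": "spk_1", "text": "hello"},
]
records = make_records("m1", rows, toy_embed)
print(records[0]["id"], records[0]["values"])
```

Keeping the speaker and timestamp in the metadata lets a later query return who said what and when, without a second lookup.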

Research

We would try multiple vector embedding models, fine-tune LLMs on the custom dataset, and compare the performance of these models.
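One simple way to compare embedding models is a hit rate at k: for each test query, check whether the known relevant record appears among the top k retrieved ids. The sketch below assumes this metric purely for illustration; the source does not state which metrics the project uses, and the query ids, record ids, and encoder names are made up.

```python
def hit_rate_at_k(results, relevant, k=3):
    """Fraction of queries whose relevant id is in the top-k retrieved ids."""
    if not results:
        return 0.0
    hits = sum(
        1 for query, ranked in results.items()
        if relevant[query] in ranked[:k]
    )
    return hits / len(results)


# Hypothetical ground truth: one relevant record id per query.
relevant = {"q1": "m1-3", "q2": "m2-0"}

# Hypothetical ranked retrieval results from two encoders.
encoder_a = {"q1": ["m1-3", "m1-4", "m2-1"], "q2": ["m1-0", "m2-5", "m2-0"]}
encoder_b = {"q1": ["m2-1", "m1-4", "m1-0"], "q2": ["m2-0", "m2-5", "m1-0"]}

scores = {
    "encoder_a": hit_rate_at_k(encoder_a, relevant, k=3),
    "encoder_b": hit_rate_at_k(encoder_b, relevant, k=3),
}
print(scores)
```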

Clustering Framework

Proposed UI

Below is a sketch of the proposed UI.

Getting Started

Running on GitHub Codespace

Create a Codespace with 4 cores.
Press Ctrl+C to cancel the automatic installation of requirements.txt, as it may not install the packages correctly.
Manually install required packages:
```
pip install -r requirements.txt
```
Setting environment variables
- Create a /config/.env file and fill in your environment variables.
- Learn more about config options: README
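Before starting the app, it can help to check that the `.env` file actually defines the values the application needs. The sketch below parses `KEY=VALUE` text and reports missing entries. The key names in `REQUIRED` are hypothetical placeholders, not the project's real configuration options; consult the README for the actual names.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Hypothetical key names (assumption); the real ones are listed in the README.
REQUIRED = ["PINECONE_API_KEY", "LLM_API_KEY"]

sample = 'PINECONE_API_KEY="abc123"\n# LLM credentials\nLLM_API_KEY=\n'
missing = missing_keys(parse_env(sample), REQUIRED)
print(missing)
```

In practice the text would come from reading the `.env` file under `config/` rather than from an inline string.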

Running the pre-requisite script:
```
python init_one_time_utils/pinecone_sample_dataloader.py
```

Run the application:
```
streamlit run app.py
```

Running Locally

Clone the repository:

```
git clone https://github.com/SartajBhuvaji/Resonate.git
```

Set up a virtual environment:
```
python -m venv .venv
```
Activate the virtual environment:
- On Windows:
```
.\.venv\Scripts\Activate.ps1
```
- On Unix or macOS:
```
source .venv/bin/activate
```

Install dependencies:

```
pip install -r requirements.txt --upgrade
```

Setting environment variables:

Create a /config/.env file and fill in your environment variables.

Running the pre-requisite script:

```
python init_one_time_utils/pinecone_sample_dataloader.py
```

Run the application:
```
streamlit run app.py
```

Demo

demo.mp4

Framework

title: Resonate
emoji: 🐨
colorFrom: yellow
colorTo: red
sdk: streamlit
sdk_version: 1.34.0
app_file: app.py
pinned: false

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
