This project demonstrates a web-based application to query a dataset through natural language.
For this purpose, it uses:
- Streamlit to build a data science web app
- PandasAI to generate Pandas code from a natural-language query using OpenAI GPT-3.5
Download the dataset into the data folder at the root of the project.
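As a quick sanity check after the download, a short script can confirm that the data folder exists and contains files. This is only a minimal sketch and not part of the project: the helper name list_dataset_files is hypothetical, and it assumes the script is run from the project root.

```python
from pathlib import Path


def list_dataset_files(data_dir="data"):
    """Return the sorted names of regular files found directly in data_dir.

    Hypothetical helper for illustration. Returns an empty list when the
    folder is missing or contains no files.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    # Subfolders are ignored; only files directly inside data_dir are listed.
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    files = list_dataset_files()
    if files:
        print("Found dataset files:", ", ".join(files))
    else:
        print("No files found in the data folder; download the dataset first.")
```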
If you don't have a Python environment available, you can use the conda package manager, which comes with the Anaconda distribution, to set up a clean one.
Create a new environment and activate it:
conda create -n streamlit-pandasai python=3.9
conda activate streamlit-pandasai
Install the Python dependencies in the activated environment:
pip install -r requirements.txt
Create a new OpenAI API key and assign it to the OPENAI_API_KEY environment variable before running the app.
On Windows (Command Prompt, where quotes would become part of the value):
set OPENAI_API_KEY=sk-...
On Unix:
export OPENAI_API_KEY="sk-..."
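Before launching the app, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch using only the standard library; the function name check_openai_key is hypothetical and not part of the project. The quote check is included because some shells, such as the Windows Command Prompt, keep surrounding quotes as part of the value.

```python
import os


def check_openai_key(env=None):
    """Return a short status message about OPENAI_API_KEY.

    Hypothetical helper for illustration; env defaults to os.environ.
    The key itself is never included in the message.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "")
    if not key.strip():
        return "OPENAI_API_KEY is not set"
    # A leading or trailing quote usually means the shell stored it literally.
    if key[0] in "\"'" or key[-1] in "\"'":
        return "OPENAI_API_KEY contains quote characters; set it without quotes"
    return "OPENAI_API_KEY is set"


if __name__ == "__main__":
    print(check_openai_key())
```

Run it in the same terminal session in which the variable was set, since a variable set this way is only visible to that session.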
Run the Streamlit project:
streamlit run streamlit_app.py