This repository contains the implementation of a search engine for the Information Retrieval and Web Analytics course, part of the Mathematical Engineering in Data Science degree at Pompeu Fabra University. The corpus of documents used is a collection of tweets related to the Farmers Protests in 2021.
The code is executed from the Python notebooks located in the notebooks/ folder. To run the code you will need to install the dependencies specified in the requirements.txt file.
The data folder is empty by default. To use the code you will need to populate it with the corresponding files.
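Before opening the notebooks, it can help to confirm that the data folder has actually been populated. The snippet below is a minimal sketch, not part of the repository: the helper name `list_data_files` is hypothetical, and it assumes the folder is called `data` and sits in the current working directory. It makes no assumption about which files are expected, it only reports what it finds.

```python
from pathlib import Path


def list_data_files(data_dir="data"):
    """Return the sorted names of the visible files found directly in data_dir.

    An empty list means the folder is missing or has not been populated yet.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    # Skip hidden placeholders such as .gitkeep, which do not count as data.
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and not p.name.startswith(".")
    )


if __name__ == "__main__":
    # Assumption: the script is run from the repository root.
    files = list_data_files("data")
    if files:
        print("Found data files:", ", ".join(files))
    else:
        print("The data folder is empty or missing; populate it first.")
```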
Here is a step-by-step example of how to run the code using Jupyter Notebook and venv. Other editors, such as Visual Studio, are also suitable for running the code.
Clone the repository
git clone [email protected]:iv97n/irwa.git
Create the virtual environment
Windows
python -m venv venv
Ubuntu
python3 -m venv venv
Activate the virtual environment
Windows
.\venv\Scripts\activate
Ubuntu
source venv/bin/activate
Install the requirements.txt dependencies
pip install -r requirements.txt
Register the virtual environment as a Jupyter kernel
ipython kernel install --user --name=venv
To select the right kernel for the notebooks in the notebooks/ folder, open Jupyter Notebook, go to Kernel > Change Kernel, and select the venv kernel.
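Once the venv kernel is selected, a quick sanity check from a notebook cell can confirm that the notebook is really running on the virtual environment's interpreter. This is a minimal sketch: the helper name `running_in_virtualenv` is hypothetical, and the check relies only on the standard library attributes `sys.prefix` and `sys.base_prefix`, which differ when Python runs inside an environment created with venv.

```python
import sys


def running_in_virtualenv():
    """Return True when the current interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


# The printed path should point inside the venv folder created earlier.
print("Interpreter:", sys.executable)
print("Inside a virtual environment:", running_in_virtualenv())
```

If the second line prints False, the notebook is most likely still attached to the default kernel rather than the venv one.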