anime-recommender

A simple synopsis-based anime recommender. Synopsis embeddings are computed with BigBird-RoBERTa.

The embeddings work well because the model not only reduces the dimensionality of the data (the synopsis) but also preserves semantic relationships. For example, when asked for anime similar to Evangelion, it recommends many mecha titles, even though it has no knowledge of tags or titles and only reads the synopsis:

image

We can also see that the embeddings recommended by the SVC lie mostly close to each other when projected into 3D space with PCA: image

The hyperplane itself is not visually intuitive, because the embeddings are 768-dimensional, not 3-dimensional. The blue dot among the red ones in the left plot is a reminder that the recommendation uses all dimensions, not only the 3 with the most variance. A lightweight recommendation is also possible using kNN instead of the SVM approach.
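The kNN alternative mentioned above can be sketched in a few lines. This is a minimal illustration, not the code used in this repository: it assumes the embeddings are available as a plain dictionary mapping titles to vectors, and the names `cosine_similarity` and `recommend_knn` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_knn(
    query_title: str,
    embeddings: Dict[str, Sequence[float]],  # assumed layout: title -> vector
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the k titles whose embeddings are most similar to the query's."""
    query = embeddings[query_title]
    scored = [
        (title, cosine_similarity(query, vector))
        for title, vector in embeddings.items()
        if title != query_title  # never recommend the query itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With the real 768-dimensional vectors the call would look the same, for example `recommend_knn("Some Title", embeddings, k=10)`. For ~23k anime a brute-force scan like this is usually fast enough; a vectorized or indexed search would only matter at larger scale.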

Search name matching

We use a BERT model to match the searched name against the anime we have in the database.

The simplest Frontend

This is a very simple frontend written by ChatGPT just to interact better with the API.

image

We have a dedicated server and frontend!

We are using a free AWS machine to host part of this backend (until February 2024, when Amazon will start charging for it 😔). Unfortunately, the free VM (t2.micro) is very limited and can't handle model inference, so the BERT name matching is disabled and a very simple algorithm is used instead.
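The README does not say which simple algorithm replaces BERT on the hosted instance. As one possible lightweight stand-in, the sketch below does fuzzy title matching with Python's standard `difflib`. The function names and the 0.6 cutoff are assumptions made for illustration.

```python
import difflib
from typing import List, Optional


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't matter."""
    return " ".join(title.lower().split())


def match_title(
    query: str, known_titles: List[str], cutoff: float = 0.6
) -> Optional[str]:
    """Return the known title closest to the query, or None if nothing is close.

    Similarity is difflib's character-based ratio, which needs no model
    inference and therefore runs on a very small VM.
    """
    by_normalized = {normalize(title): title for title in known_titles}
    hits = difflib.get_close_matches(
        normalize(query), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[hits[0]] if hits else None
```

Unlike a BERT matcher, this only tolerates typos and casing differences; it will not relate a translated or alternative title to the one stored in the database.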

We also have a dedicated frontend! See: https://gogaido.vercel.app/

image

How to reproduce it

First, download those files and place them under a directory named /data. They contain the precomputed embeddings for ~23k anime.

Additionally, you can download this dataset to populate the database, or compute the embeddings yourself with a different model.

Then, create a MongoDB instance and populate it with:

# Install mongo: see https://www.mongodb.com/docs/mongodb-shell/install/
sudo service mongod start 
python3 misc/pickle_to_mongo.py
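For orientation, a pickle-to-MongoDB loader can be as small as the sketch below. It is not the contents of `misc/pickle_to_mongo.py`. The pickle layout (a dictionary of title to vector), the document fields, and the database and collection names are all assumptions; only `MongoClient` and `insert_many` are actual pymongo calls.

```python
import pickle
from typing import Any, Dict, List, Sequence


def load_pickle(path: str) -> Dict[str, Sequence[float]]:
    """Load a pickled mapping of title -> embedding (assumed layout)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def to_documents(embeddings: Dict[str, Sequence[float]]) -> List[Dict[str, Any]]:
    """Turn the mapping into MongoDB-friendly documents (hypothetical schema)."""
    return [
        # Cast to plain floats: BSON cannot encode NumPy scalar types.
        {"title": title, "embedding": [float(x) for x in vector]}
        for title, vector in embeddings.items()
    ]


def populate(collection: Any, documents: List[Dict[str, Any]]) -> int:
    """Insert the documents and return how many were written."""
    if not documents:  # insert_many rejects an empty list
        return 0
    result = collection.insert_many(documents)
    return len(result.inserted_ids)


def main(pickle_path: str, mongo_uri: str = "mongodb://localhost:27017") -> None:
    """Wire the pieces together; requires pymongo and a running mongod."""
    from pymongo import MongoClient

    client = MongoClient(mongo_uri)
    collection = client["anime"]["embeddings"]  # hypothetical names
    print(populate(collection, to_documents(load_pickle(pickle_path))))
```

Calling `main` with the path of a downloaded pickle file would fill the collection in one batch. The actual script in this repository may use a different schema, so check it before relying on these field names.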

Now you should have a database that the API will use to retrieve data:

image

Then you can run the app with:

uvicorn app:app # or
gunicorn --bind 0.0.0.0:8000 --daemon -k uvicorn.workers.UvicornWorker app:app # deploy it somewhere

Open http://127.0.0.1:8000/home in a browser; the frontend is served at /home.