This is an automated AI meme generator. When you select a meme template, a caption related to that template is generated. The mechanism is similar to a captioning model, but no images were used for training. This is a fun project I created to test the sarcasm levels of a trained model. The application is deployed on AWS.
I was tired of working on industry-level projects with practical use cases. Sometimes you need to take a step back and relax, and look back on why you chose this field. This was a different idea. It might not have a use case, but it is definitely something you can look at and chuckle about with your friends. It was fun creating this project, and sometimes you do need to work on silly projects as well. Here is a real-time clip of the application at work. Here are some other memes generated by the AI. I have to hand it to its sense of humour.
Clone the repository: git clone https://github.com/Shreyz-max/Memes-Generator.git
Change into the project directory: cd Memes-Generator
Create environment: conda create -n meme_generator python=3.8
Activate environment: conda activate meme_generator
Install requirements: pip install -r requirements.txt
Run the application: python3 app.py
If you want to train from scratch, run the following scripts in sequence.
First, clone the dataset repository: git clone https://github.com/schesa/ImgFlip575K_Dataset.git
You can either scrape the data from scratch or use the existing dataset. The script for scraping from scratch is present in the repo itself.
I used the existing dataset, specifically the individual JSON files in the dataset folder.
The first step is to convert these files into a dataset for training.
To do so, run: python3 preprocess.py
Then run: python3 preprocess_captions.py
Finally, to train the model, run: python3 train.py
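As a rough illustration of the first preprocessing step, the sketch below gathers captions from per-template JSON files into one dictionary. It is not the code in preprocess.py. The function name `load_captions` is hypothetical, and the record layout (each file holds a list of records, each with a "boxes" list of caption strings) is an assumption that should be checked against the dataset files.

```python
import json
from pathlib import Path


def load_captions(dataset_dir):
    """Map each template name (the JSON file stem) to its list of captions.

    Assumed layout: every *.json file contains a list of records, and each
    record has a "boxes" list holding the caption text for each box.
    """
    captions = {}
    for path in sorted(Path(dataset_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            records = json.load(f)
        # Skip records that have no caption text at all.
        captions[path.stem] = [r["boxes"] for r in records if r.get("boxes")]
    return captions
```

Keeping the captions grouped by template makes it straightforward to attach a per-template token to each caption later on.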
DistilGPT2 is fine-tuned to generate captions. The model is trained so that each meme template
has its own token assigned to it. You can check out the tokens in special_tokens.py.
Inference works just like any other transformer model: given the category token,
the model predicts the remaining words, which generates a caption. There are also two other tokens.
One marks the end of the sentence and the other marks the end of a box. The end-of-box token
separates the upper and lower captions on the meme template.
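The token scheme described above can be sketched in plain Python. This is only an illustration: the token strings and both helper names are hypothetical, and the real tokens are the ones defined in special_tokens.py.

```python
# Hypothetical token strings; the actual ones live in special_tokens.py.
END_OF_BOX = "<|endofbox|>"
END_OF_TEXT = "<|endoftext|>"


def build_training_text(category_token, boxes):
    """Category token, then each box followed by END_OF_BOX, then END_OF_TEXT."""
    body = "".join(f" {box.strip()} {END_OF_BOX}" for box in boxes)
    return f"{category_token}{body} {END_OF_TEXT}"


def split_into_boxes(generated, category_token):
    """Recover the per-box captions from a generated string."""
    text = generated
    if text.startswith(category_token):
        text = text[len(category_token):]
    # Ignore anything the model produced after the end-of-text token.
    text = text.split(END_OF_TEXT, 1)[0]
    return [part.strip() for part in text.split(END_OF_BOX) if part.strip()]


example = build_training_text("<|drake|>", ["doing homework", "making memes"])
boxes = split_into_boxes(example, "<|drake|>")  # upper caption, lower caption
```

At inference time only the category token is given as the prompt, and a function like `split_into_boxes` would turn the model's continuation into the upper and lower captions.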
The generated text, along with the category, is passed to the Imgflip API
to generate the meme. I added a bit
of front end to make it more interactive.
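A minimal sketch of the request for a two-box meme might look like the following. It assumes the category has already been mapped to an Imgflip template ID. The helper name and the placeholder values are hypothetical, and this is not necessarily how app.py builds the request. The field names follow Imgflip's caption_image endpoint.

```python
IMGFLIP_CAPTION_URL = "https://api.imgflip.com/caption_image"


def build_caption_request(template_id, top_text, bottom_text, username, password):
    """Build the form fields for a two-box caption_image request."""
    return {
        "template_id": str(template_id),
        "username": username,
        "password": password,
        "text0": top_text,     # upper caption
        "text1": bottom_text,  # lower caption
    }


# Placeholder values for illustration only.
payload = build_caption_request("12345", "doing homework", "making memes",
                                "my_user", "my_password")
```

The payload could then be sent with something like `requests.post(IMGFLIP_CAPTION_URL, data=payload)`. On success, the JSON response should contain the URL of the generated meme.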
The application is then deployed to AWS. The link to try it yourself is here.
- Using other, more advanced transformers
- Working on a larger dataset