This project aims to develop a deep neural network model to detect keypoints in an image. The base architecture of the model is pix2seq and we modified it to make it proper for detecting x, y, and a class of a keypoint in an image. We use a pre-trained model in the encoder of the architecture to get the best results.
The input of the model is an image and the output is a sequence of tokens which are representing the coordinates of the keypoints in the input image. The pix2seq model consists of an encoder and a decoder, the encoder gets and image and generates a representation of it. The decoder is a language model which can generate result tokens.
To use the corner detection model, follow these steps:
- Clone the repository to your local machine.
- Install the required dependencies (you can use
pip install -r requirements.txt
command) - Set config and datasets.
- Run train.py or inference.py to train or infer model.
To train the corner detection model included in this project, follow these steps:
- Run the train.py code by using this command:
python train.py --config-path=[path_to_config]
To infere the corner detection model, follow these steps:
- Prepare a config file related to the selected training and set the path in inference.py code.
- Run the inference.py code by using this command:
python inference.py
The following packages and libraries are required to train the scene classification model:
- Python 3.8 or higher
- PyTorch 1.13
- adabelief-pytorch
- pytorch-cosine-annealing-with-warmup
Install the required dependencies by running pip install -r requirements.txt
.
- Creating an API