Mammal detection and behaviour recognition using the MViT model and the MammalNet dataset.
Course Name: Computer Vision (EE511)
Final Presentation: https://drive.google.com/file/d/1r3_nsPgNaiTcLxcW8pyfaWKLTQo_p3pD/view?usp=drive_link
References:
- MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding (https://arxiv.org/pdf/2306.00576.pdf)
- MViT: Multiscale Vision Transformers (https://arxiv.org/pdf/2104.11227.pdf). Jitendra Malik, Facebook AI Research, UC Berkeley