This repository is a fork of the original MLB-YouTube Dataset repository by AJ Piergiovanni, which provides datasets, code, and models for fine-grained activity recognition in baseball videos.
Note
This fork was created because the original repository had gone stale and needed updates to work with modern dependencies and tooling. Use this repository if you want the latest functionality and stability, or if you want to contribute to ongoing improvements.
The original MLB-YouTube Dataset repository has been inactive for several years, making it challenging to use with newer Python and package dependencies. This fork aims to:
- Modernize and maintain the codebase.
- Provide a functional setup compatible with recent dependencies.
- Allow direct installation with pip.
For questions or more details about the original project, see the original repository: https://github.com/piergiaj/mlb-youtube.
To install this forked version:
pip install git+https://github.com/Aakash-Tripathi/mlb-yt-dataset
Note
This package requires ffmpeg to be installed on your system. Please refer to the official FFmpeg website for installation instructions.
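Before running the package, it can help to confirm that ffmpeg is reachable from Python. The following is a minimal sketch using only the standard library; the helper name `find_ffmpeg` is hypothetical and not part of this package, and the check assumes ffmpeg is expected to be on your `PATH`.

```python
import shutil


def find_ffmpeg(executable: str = "ffmpeg"):
    """Return the full path of the executable, or None if it is not on PATH.

    Hypothetical helper for illustration; not part of mlb_dataset.
    """
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg was not found on PATH; install it before using this package.")
    else:
        print(f"ffmpeg found at {path}")
```

If the check reports that ffmpeg is missing, install it and make sure the directory containing the binary is on your `PATH`.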
This package allows users to download and process MLB video datasets directly from YouTube. After installing, import the package to download videos, segment clips, or extract continuous clips:
import mlb_dataset as mlb
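# Download the videos listed in the manifest into data/raw
download_results = mlb.download_all_videos("path/to/manifest.json", "data/raw")
# Extract segmented clips from the downloaded videos into data/segmented
mlb.extract_segmented_clips("path/to/manifest.json", "data/raw", "data/segmented")
# Extract continuous clips from the downloaded videos into data/continuous
mlb.extract_continuous_clips("path/to/manifest.json", "data/raw", "data/continuous")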
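The example above writes to three directories. Whether the package creates them automatically is not stated here, so the sketch below creates them up front as a precaution. The helper name `prepare_output_dirs` is hypothetical and not part of this package; the subdirectory names simply mirror the paths used in the example.

```python
from pathlib import Path


def prepare_output_dirs(root="data"):
    """Create the raw, segmented, and continuous directories under root.

    Hypothetical helper for illustration; not part of mlb_dataset.
    Returns a dict mapping each subdirectory name to its Path.
    """
    root = Path(root)
    dirs = {name: root / name for name in ("raw", "segmented", "continuous")}
    for path in dirs.values():
        # exist_ok=True makes the call safe to repeat on later runs
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling `prepare_output_dirs("data")` before the download and extraction steps leaves `data/raw`, `data/segmented`, and `data/continuous` in place, and calling it again on an existing tree changes nothing.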
The MLB-YouTube dataset was originally created by AJ Piergiovanni and Michael S. Ryoo and introduced in their research paper:
@inproceedings{mlbyoutube2018,
title={Fine-grained Activity Recognition in Baseball Videos},
booktitle={CVPR Workshop on Computer Vision in Sports},
author={AJ Piergiovanni and Michael S. Ryoo},
year={2018}
}
This forked version of the repository is maintained independently for usability purposes. Please refer to the original repository for further references, academic citations, or any in-depth questions regarding the dataset.