Description

Experiments with CLIP-based image search for dataset creation and with foundation models for image autolabeling.

Setup and usage

cd ./dataset_creator/
# create the conda environment
conda env create -f environment.yml
# activate the new environment (its name is set in environment.yml), then install this package in editable mode
pip install -e .
# the existing scripts should now run
python scripts/download_data.py
python scripts/select_dataset.py
python scripts/autolabel_dataset.py

Results

Data Lake
  • python scripts/download_data.py

  • Total images:

(6 images not shown)
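
As a rough illustration of how the data lake could be filled from the Pixabay API, the sketch below builds a search URL and pulls image URLs out of a decoded JSON response. It is not taken from `scripts/download_data.py`. The helper names `build_pixabay_url` and `extract_image_urls` are hypothetical, and the parameter and field names (`key`, `q`, `image_type`, `page`, `per_page`, `hits`, `webformatURL`) reflect the public Pixabay API as I understand it, so check them against the official docs before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public Pixabay image search API.
PIXABAY_ENDPOINT = "https://pixabay.com/api/"


def build_pixabay_url(api_key, query, page=1, per_page=50):
    """Build a search URL. No network request is made here."""
    params = {
        "key": api_key,
        "q": query,
        "image_type": "photo",
        "page": page,
        "per_page": per_page,
    }
    return f"{PIXABAY_ENDPOINT}?{urlencode(params)}"


def extract_image_urls(response_json):
    """Collect image URLs from a decoded response, skipping hits without one."""
    hits = response_json.get("hits", [])
    return [hit["webformatURL"] for hit in hits if "webformatURL" in hit]


# Example with a placeholder key and a hand-written response.
url = build_pixabay_url("YOUR_API_KEY", "red car", page=2, per_page=10)
urls = extract_image_urls({"hits": [{"webformatURL": "https://example.com/a.jpg"}, {"id": 1}]})
```

The URL can then be fetched with any HTTP client, and each returned image URL downloaded into the data lake directory.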

Data Selection
  • python scripts/select_dataset.py

  • Rough Inference Times (RTX 3070 laptop):

    • CLIP image/text embedding: ~0.06 s/it (~15 it/s)
  • Selected images:

(4 images not shown)
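
The selection step can be pictured as a nearest-neighbour search in embedding space: a text or image query is embedded with CLIP, and the images whose embeddings have the highest cosine similarity to the query are kept. The sketch below shows only that ranking step, using toy 3-dimensional vectors in place of real CLIP embeddings. The function names and the `toy_index` data are made up for illustration, and `scripts/select_dataset.py` may work differently.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def search(query_embedding, image_embeddings, top_k=3):
    """Return (path, score) pairs for the top_k most similar images."""
    scored = [
        (path, cosine_similarity(query_embedding, emb))
        for path, emb in image_embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for precomputed CLIP image embeddings.
toy_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
# The query vector stands in for a CLIP text or image embedding.
results = search([1.0, 0.0, 0.0], toy_index, top_k=2)
```

Because CLIP maps text and images into the same space, the same `search` call would serve both text-based and image-based queries. Only the way the query embedding is produced changes.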

Autolabeling
  • python scripts/autolabel_dataset.py (TODO)

  • Rough Inference Times (RTX 3070 laptop):

    • DepthAnything: ~0.35 s/it (~2.8 it/s)
    • GroundingSAM: ~17 s/it (scales roughly linearly with the number of instances to detect in class_onthology)
    • COCA: ~1 s/it
  • Autolabeled images:

(4 images not shown)
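
To get a feel for what the timings above mean for a whole dataset, the sketch below turns the per-image figures into a total runtime estimate. It assumes the three models run one after another on every image and that the rough GroundingSAM figure holds for the ontology in use. Both are simplifying assumptions, and the helper names are hypothetical.

```python
# Rough per-image timings from the list above (RTX 3070 laptop), in seconds.
SECONDS_PER_IMAGE = {
    "DepthAnything": 0.35,
    "GroundingSAM": 17.0,
    "COCA": 1.0,
}


def throughput(seconds_per_image):
    """Convert seconds per image into images per second."""
    return 1.0 / seconds_per_image


def estimate_total_seconds(num_images, timings=SECONDS_PER_IMAGE):
    """Estimate total runtime, assuming the models run sequentially per image."""
    return num_images * sum(timings.values())


total = estimate_total_seconds(100)
print(f"100 images: ~{total:.0f} s (~{total / 60:.0f} min)")
```

Under these assumptions, 100 images take about 1835 s, or roughly half an hour, almost all of it spent in GroundingSAM.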


Ideas / TODOs

  • Script to download files from the internet (Pixabay API)
  • CLIP-based image directory search
    • image-based search
    • text-based search
    • similarity-based filtering
  • Ground-truth (GT) autolabeling
    • Image captions (based on the COCA model)
    • Bounding boxes + instance segmentation (based on GroundingSAM)
    • Depth (based on DepthAnything)
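
One possible reading of "similarity-based filtering" is near-duplicate removal: drop an image when its embedding is too close to one that has already been kept. The sketch below does this greedily with a cosine-similarity threshold. The function name `filter_near_duplicates`, the threshold of 0.95, and the toy vectors are all assumptions for illustration, not something taken from this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def filter_near_duplicates(embeddings, threshold=0.95):
    """Greedily keep images whose similarity to every kept image is below threshold."""
    kept = []
    for name, emb in embeddings.items():
        if all(cosine_similarity(emb, embeddings[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy 2-d vectors standing in for CLIP image embeddings.
toy_embeddings = {
    "a.jpg": [1.0, 0.0],
    "a_copy.jpg": [0.99, 0.05],  # nearly identical to a.jpg
    "b.jpg": [0.0, 1.0],
}
kept = filter_near_duplicates(toy_embeddings, threshold=0.95)
```

The result depends on iteration order, since the first image of each near-duplicate group is the one that survives. The loop is quadratic in the number of kept images, which is fine for small selections but would need an approximate nearest-neighbour index for a large data lake.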

References