This work presents Depth Anything V2, which significantly outperforms V1 in fine-grained detail & robustness. Compared with SD-based models, it offers faster inference, fewer parameters & higher depth accuracy. On top of that, this upgraded repo adds a robust Gradio WebUI as well as image & video .bat scripts for more intuitive CLI usage, if that is your preferred way of working.
- 2024-06-14: Paper, project page, code, models, demo, & benchmark are all released.
- 2024-06-20: The repo has been upgraded & is also now running on .safetensors models instead of .pth models.
- 2024-06-23: Updated the installation process to a simpler one_click_install.bat file. It automatically downloads the depth models into a 'checkpoints' folder & the triton wheel into the repo's main folder, & installs all of the needed dependencies. [Also updated this README.md file to provide more clarity!]
- 2024-06-24: pravdomil has provided a much-needed update to UDAV2 for 16bit image creation in order to make stunning 3D Bas-Reliefs! I am currently updating the gradio webui to include both 16bit single-image & 16bit batch-image creation, which will be pushed in the coming days.
- 2024-06-25: I'm currently working on a beta version of UDAV2 as an automatic1111 extension, which will be released next week, so stay tuned!
- 2024-06-27: A1111 extension released! sd-webui-udav2
- 2024-06-29: Released an updated Forge extension, sd-forge-udav2, to prevent conflicts w/ extensions already installed in Forge!
- 2024-07-01: sd-webui-udav2 has now been added to the extension index.json! You can now install the extension directly inside A1111.
- 2024-07-03: [v1.1.452] sd-webui-controlnet now has a depth_anything_v2 preprocessor🔥! Update transformers dependency to transformers-4.44.1 to use the new depth_anything_v2 controlnet preprocessor.
All you need to do is copy & paste (or right-click) each of the following lines, in order, into cmd & everything will be installed properly.
git clone https://github.com/MackinationsAi/Upgraded-Depth-Anything-V2.git
cd Upgraded-Depth-Anything-V2
one_click_install.bat
That's it! All you have to do now is pick one of the run_-------.bat files, double-click & you're off to depthing!
Run the following commands in your terminal.
git clone https://github.com/MackinationsAi/Upgraded-Depth-Anything-V2.git
cd Upgraded-Depth-Anything-V2
source one_click_install.sh
or
git clone https://github.com/MackinationsAi/Upgraded-Depth-Anything-V2.git
cd Upgraded-Depth-Anything-V2
pip install -r requirements_macos.txt
Then manually download & place all 3 of the Depth Anything V2 models [download links found below] into a folder called checkpoints & you'll be good to go.
To use the upgraded gradio webui locally:
run_gradio.bat
or
python run_gradio.py
You can also try the online gradio demo, though it is FAR less capable than this Upgraded Depth Anything V2 repo.
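If you'd rather wire the same kind of depth demo into your own script, the minimal Gradio sketch below shows the general shape. The predict_depth helper here is only a placeholder (it is not the function used by run_gradio.py); swap in an actual Depth Anything V2 inference call like the Python example further down in this README.

```python
import gradio as gr
import numpy as np

def predict_depth(image: np.ndarray) -> np.ndarray:
    """Placeholder depth function - replace with a real Depth Anything V2 call."""
    depth = image.astype(np.float32).mean(axis=-1)
    depth = (depth - depth.min()) / max(float(depth.max() - depth.min()), 1e-8)
    return (depth * 255).astype(np.uint8)

demo = gr.Interface(
    fn=predict_depth,
    inputs=gr.Image(type="numpy", label="Input image"),
    outputs=gr.Image(type="numpy", label="Depth map"),
    title="Depth Anything V2 (sketch)",
)

if __name__ == "__main__":
    demo.launch()
```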
It works for both single image depth processing & batch image depth processing.
run_image-depth_16bit.bat
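As a rough sketch of what 16bit output means in practice: the raw depth prediction is normalized into the uint16 range & written as a single-channel 16-bit PNG, which preserves far more gradation than an 8-bit map (this is what makes the bas-relief workflow viable). The snippet below is illustrative only & not the exact code in the 16bit scripts.

```python
import cv2
import numpy as np

def save_depth_16bit(depth: np.ndarray, out_path: str) -> None:
    # Normalize the float depth prediction into the full 16-bit range.
    d_min, d_max = float(depth.min()), float(depth.max())
    depth_16 = (depth - d_min) / max(d_max - d_min, 1e-8) * 65535.0
    # PNG supports single-channel 16-bit images, so the extra precision survives on disk.
    cv2.imwrite(out_path, depth_16.astype(np.uint16))

# e.g. save_depth_16bit(model.infer_image(raw_img), "depth_16bit.png")
```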
The images used to make the following depth maps were created using Dreamshaper Turbo.*
It works for both single image depth processing & batch image depth processing.
run_image-depth_8bit.bat
or
python run_image-depth.py --encoder <vits | vitb | vitl> --img-path <path> --outdir <outdir> [--input-size <size>] [--pred-only] [--grayscale]
Options:
- `--img-path`: You can either 1) point it to an image directory storing all images of interest, 2) point it to a single image, or 3) point it to a text file storing all image paths.
- `--input-size` (optional): By default, we use input size 518 for model inference. You can increase the size for even more fine-grained results.
- `--pred-only` (optional): Only save the predicted depth map, without the raw image.
- `--grayscale` (optional): Save the grayscale depth map, without applying the color palette.
For example:
python run_image-depth.py --encoder vitl --img-path assets/examples --outdir depth_vis
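If you prefer calling the model from Python rather than the CLI scripts, the upstream Depth Anything V2 class can be used directly. The sketch below assumes the repo's depth_anything_v2 package is on your path & that the large checkpoint is saved as checkpoints/depth_anything_v2_vitl.safetensors (adjust the filename to whatever one_click_install.bat actually downloaded):

```python
import cv2
import torch
from safetensors.torch import load_file
from depth_anything_v2.dpt import DepthAnythingV2

# ViT-L configuration; see the model table below for the smaller encoders.
model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024])
# Assumed checkpoint filename - match it to the file in your 'checkpoints' folder.
model.load_state_dict(load_file('checkpoints/depth_anything_v2_vitl.safetensors'))
model = model.to('cuda' if torch.cuda.is_available() else 'cpu').eval()

raw_img = cv2.imread('assets/examples/demo01.jpg')  # any image path works here
depth = model.infer_image(raw_img, input_size=518)  # HxW float32 relative depth map
```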
It works for both single video depth processing & batch video depth processing.
run_video-depth.bat
or
python run_video-depth.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis
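Under the hood, video depth is just per-frame image depth stitched back into a video. A minimal OpenCV loop for that is sketched below; it assumes a model loaded as in the Python example above & colourizes each frame before writing it (the actual run_video-depth.py may differ in its details):

```python
import cv2
import numpy as np

def video_to_depth(model, in_path: str, out_path: str) -> None:
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, size)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        depth = model.infer_image(frame)  # HxW float32 relative depth
        depth = (depth - depth.min()) / max(float(depth.max() - depth.min()), 1e-8)
        writer.write(cv2.applyColorMap((depth * 255).astype(np.uint8), cv2.COLORMAP_INFERNO))

    cap.release()
    writer.release()

# e.g. video_to_depth(model, 'assets/examples_video/clip.mp4', 'video_depth_vis/clip_depth.mp4')
```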
We provide three models of varying scales for robust relative depth estimation (the fourth model is still a WIP):
All three models are automatically downloaded to a 'checkpoints' folder in your repo when you run one_click_install.bat. (The download links are only provided here in case you want to download the models for use outside this repo.)
Models | Params | Checkpoints |
---|---|---|
Depth-Anything-V2-Small model | 48.4M | Download |
Depth-Anything-V2-Base model | 190.4M | Download |
Depth-Anything-V2-Large model | 654.9M | Download |
Depth-Anything-V2-Giant model | 1.3B | Coming soon |
Please note that the larger (vitl) model has better temporal consistency on videos.
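For reference, the three released encoders map to different DPT widths. The per-encoder configurations below follow the upstream Depth Anything V2 usage example (treat them as an assumption if the repo's code changes) & make it easy to switch models from the checkpoints folder:

```python
from safetensors.torch import load_file
from depth_anything_v2.dpt import DepthAnythingV2

# Per-encoder DPT configurations, following the upstream Depth Anything V2 usage example.
model_configs = {
    'vits': {'encoder': 'vits', 'features': 64,  'out_channels': [48, 96, 192, 384]},
    'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
    'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]},
}

encoder = 'vitl'  # 'vits' | 'vitb' | 'vitl'
model = DepthAnythingV2(**model_configs[encoder])
# Assumed checkpoint naming - adjust to the filenames in your 'checkpoints' folder.
model.load_state_dict(load_file(f'checkpoints/depth_anything_v2_{encoder}.safetensors'))
model.eval()
```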
This dependency .whl is automatically downloaded to the repo's main folder when you run one_click_install.bat. (The download link is only provided here in case you want to download it for use outside this repo.)
Dependency | Size | Wheel |
---|---|---|
Triton==2.1.0 | 306.7M | Download |
(Once it has been installed & the gradio webui is running properly, you can delete it or use it elsewhere in a similar fashion.)
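If you do end up installing the wheel manually, it is a single pip command pointed at the downloaded file (the exact filename depends on the build you grabbed, so the one below is only illustrative):
pip install triton-2.1.0-cp310-cp310-win_amd64.whl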
- Compared to V1, we have made a minor modification to the DINOv2-DPT architecture (originating from this issue). In V1, we unintentionally used features from the last four layers of DINOv2 for decoding. In V2, we use intermediate features instead. Although this modification did not improve details or accuracy, we decided to follow this common practice. (A small sketch of the difference follows these notes.)
- I will be updating the training scripts to support .safetensors output for pre-trained models in the coming weeks, so stay tuned for more UDAV2 depthing updates!
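To make the V1-vs-V2 feature choice concrete, here is a rough sketch using DINOv2's get_intermediate_layers. The ViT-L layer indices [4, 11, 17, 23] are my understanding of the upstream dpt.py defaults & should be treated as an assumption:

```python
import torch

# DINOv2 ViT-L/14 backbone from torch hub (requires network access on first run).
backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')
x = torch.randn(1, 3, 518, 518)  # 518 is divisible by the 14-pixel patch size

# V1-style: features from the last four transformer blocks.
last_four = backbone.get_intermediate_layers(x, n=4, return_class_token=True)

# V2-style: evenly spaced intermediate blocks instead (assumed ViT-L indices).
intermediate = backbone.get_intermediate_layers(x, n=[4, 11, 17, 23], return_class_token=True)
```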
Lihe Yang1 · Bingyi Kang2† · Zilong Huang2 · Zhen Zhao · Xiaogang Xu · Jiashi Feng2 · Hengshuang Zhao1*
Legend Keys - [ HKU 1 · TikTok 2 · project lead † · corresponding author * ]
Please refer to metric depth estimation &/or the DA-2K benchmark.
Depth-Anything-V2-Small model is under the Apache-2.0 license. Depth-Anything-V2-Base/Large/Giant models are under the CC-BY-NC-4.0 license.
If you find this project useful, please consider citing the work below, give this upgraded repo a star & share it w/ others in the community!
@article{depth_anything_v2,
title={Depth Anything V2},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
journal={arXiv:2406.09414},
year={2024}
}
@inproceedings{depth_anything_v1,
title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
booktitle={CVPR},
year={2024}
}