add depth anything notebook (#1639)

* fix dependencies install in whisper notebook
* add depth anything notebook
* code style
* fix readme
* table of content
* move notebooks
openvinotoolkit · Jan 24, 2024 · 73df723
1 parent a484c4e
commit 73df723

Showing 5 changed files with 907 additions and 1 deletion.
diff --git a/.ci/ignore_pip_conflicts.txt b/.ci/ignore_pip_conflicts.txt
@@ -13,4 +13,5 @@ notebooks/256-bark-text-to-audio/256-bark-text-to-audio.ipynb  # torch==1.13
 notebooks/257-llava-multimodal-chatbot/257-llava-multimodal-chatbot.ipynb # transformers<4.35
 notebooks/257-llava-multimodal-chatbot/257-videollava-multimodal-chatbot.ipynb # transformers<4.35
 notebooks/273-stable-zephyr-3b-chatbot/273-stable-zephyr-3b-chatbot.ipynb # install requirements.txt after clone repo
-notebooks/279-mobilevlm-language-assistant/279-mobilevlm-language-assistant.ipynb # transformers<4.35
+notebooks/279-mobilevlm-language-assistant/279-mobilevlm-language-assistant.ipynb # transformers<4.35
+notebooks/280-depth-anything/280-depth-anything.ipynb # install requirements.txt after clone repo
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -100,6 +100,7 @@ Convolutional
 convolutional
 CoSENT
 CPUs
+cpu
 CRNN
 CSV
 CTC
@@ -140,6 +141,7 @@ denoising
 denormalization
 denormalized
 depainting
+DepthAnything
 detections
 Dettmers
 dev
@@ -766,3 +768,4 @@ ZavyChromaXL
 Zongyuan
 ZeroScope
 zeroscope
+xformers
diff --git a/README.md b/README.md
@@ -52,6 +52,7 @@ Check out the latest notebooks that show how to optimize and deploy popular mode
 | [LLM Instruction following pipeline](notebooks/275-llm-question-answering)<br> | Usage variety of LLM models for answering questions using OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/daafd702-5a42-4f54-ae72-2e4480d73501 width=300> |
 |[Stable Diffusion with IP-Adapter](notebooks/278-stable-diffusion-ip-adapter)<br> | Image conditioning in Stable Diffusion pipeline using IP-Adapter | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/182657d9-2aa3-40b3-9fc4-a90b803419fe width=300> |
 | [MobileVLM](notebooks/279-mobilevlm-language-assistant)<br> | Mobile language assistant with MobileVLM and OpenVINO | |
+| [DepthAnything](notebooks/280-depth-anything)<br>[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2F280-depth-anythingh%2F280-depth-anything.ipynb)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/280-depth-anything/280-depth-anything.ipynb) | Monocular depth estimation with DepthAnything and OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/a9a16658-512f-470c-a33c-0e1f9d0ae72c width=300> |
 
 ## Table of Contents
 
@@ -231,6 +232,7 @@ Demos that demonstrate inference on a particular model.
 | [277-amused-lightweight-text-to-image](notebooks/277-amused-lightweight-text-to-image)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/277-amused-lightweight-text-to-image/277-amused-lightweight-text-to-image.ipynb)<br>| Lightweight image generation with aMUSEd and OpenVINO™ | <img src=https://huggingface.co/amused/amused-256/resolve/main/assets/collage_small.png width=225> | 
 | [278-stable-diffusion-ip-adapter](notebooks/278-stable-diffusion-ip-adapter)<br> | Image conditioning in Stable Diffusion pipeline using IP-Adapter |  <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/182657d9-2aa3-40b3-9fc4-a90b803419fe width=300> |
 | [279-mobilevlm-language-assistant](notebooks/279-mobilevlm-language-assistant)<br> | Mobile language assistant with MobileVLM and OpenVINO | |
+| [280-depth-anything](notebooks/280-depth-anything)<br>[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2F280-depth-anythingh%2F280-depth-anything.ipynb)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/280-depth-anything/280-depth-anything.ipynb) | Monocular Depth Estimation with DepthAnything and OpenVINO |  <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/a9a16658-512f-470c-a33c-0e1f9d0ae72c width=225> |
 
 <div id='-model-training'></div>
 

diff --git a/notebooks/280-depth-anything/280-depth-anything.ipynb b/notebooks/280-depth-anything/280-depth-anything.ipynb
diff --git a/notebooks/280-depth-anything/README.md b/notebooks/280-depth-anything/README.md
@@ -0,0 +1,32 @@
+# Depth estimation with DepthAnything and OpenVINO
+
+[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2F281-depth-anythingh%2F281-depth-anything.ipynb)
+[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/281-depth-anything/281-depth-anything.ipynb)
+
+![depth_map.gif](https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/a9a16658-512f-470c-a33c-0e1f9d0ae72c)
+
+[Depth Anything](https://depth-anything.github.io/) is a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, this project aims to build a simple yet powerful foundation model that handles any image under any circumstances.
+The framework of Depth Anything is shown below. It adopts a standard pipeline to unleash the power of large-scale unlabeled images.
+![image.png](https://depth-anything.github.io/static/images/pipeline.png)
+
+More details about the model can be found on the [project web page](https://depth-anything.github.io/), in the [paper](https://arxiv.org/abs/2401.10891), and in the official [repository](https://github.com/LiheYoung/Depth-Anything).
+
+In this tutorial, we will explore how to convert and run DepthAnything using OpenVINO.
+
+## Notebook Contents
+
+This notebook demonstrates monocular depth estimation with [DepthAnything](https://github.com/LiheYoung/Depth-Anything) in OpenVINO.
+
+The tutorial consists of the following steps:
+- Install prerequisites
+- Load and run PyTorch model inference
+- Convert the model to OpenVINO Intermediate Representation format
+- Run OpenVINO model inference on a single image
+- Run OpenVINO model inference on video
+- Launch interactive demo
+
+## Installation Instructions
+
+This is a self-contained example that relies solely on its own code.<br>
+We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
+For details, please refer to the [Installation Guide](../../README.md).