
Simplified Inference #1045

Closed
wants to merge 39 commits into from

Conversation

glenn-jocher
Member

@glenn-jocher glenn-jocher commented Sep 25, 2020

This PR implements standalone, independent inference classes and methods for pytorch hub and a future pip package.

OpenCV Example:

import cv2
import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).autoshape()

# Image
torch.hub.download_url_to_file('https://github.com/ultralytics/yolov5/blob/master/inference/images/zidane.jpg?raw=true', 'image.jpg')
img = cv2.imread('image.jpg')

# Inference
prediction = model(img[:, :, ::-1], shape=640)  # BGR to RGB

# Plot
for *box, conf, cls in prediction[0]:  # [xy1, xy2], confidence, class
    print('%s %.2f' % (model.names[int(cls)], conf))  # label
    cv2.rectangle(img, pt1=tuple(int(x) for x in box[:2]), pt2=tuple(int(x) for x in box[2:]), color=[255, 255, 255], thickness=2)  # plot (cv2 expects integer pixel coordinates)
cv2.imwrite('results.jpg', img)  # save

Output:

100% 165k/165k [00:00<00:00, 5.11MB/s]

person 0.87
person 0.80
tie 0.78
tie 0.28
True

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Updated dependencies, improved training stability, added NMS control, and streamlined dataset acquisition methods.

📊 Key Changes

  • 📦 Updated NVIDIA PyTorch Docker image from 20.08 to 20.09.
  • 🎛 Changed giou loss term in hyperparameter YAML files to box.
  • 🔄 Modified COCO dataset download script to pack multiple commands and improved readability.
  • 🔧 Simplified VOC dataset download script to be more concise and organized.
  • ✂️ Removed unnecessary import in detect.py and adjusted confidence/threshold defaults.
  • 💡 Moved NMS (Non-Maximum Suppression) layer control from models to separate class for modularity.
  • 🔁 Refined autoShape class to handle cv2/np/PIL/torch inputs uniformly.
  • 📚 Updated requirements.txt by commenting out a specific coremltools version requirement.
  • 🛠 Replaced giou (Generalized Intersection over Union) with box loss throughout the codebase.

🎯 Purpose & Impact

  • 🏗 Ensures the underlying Docker container has the latest stable libraries for PyTorch.
  • 📐 Adjusts hyperparameters to enhance model training stability and performance.
  • 🚀 Streamlines dataset scripts for ease of use and maintainability.
  • 🧠 Introduces modularity to the NMS process, allowing for better future extensions.
  • ✅ Facilitates easier input handling for more flexible model predictions.
  • 🗂 Encourages best practices in dependency management by avoiding hard version locks where not necessary.
  • 🧹 General cleanup and standardization of loss terminology for clearer understanding across the codebase.

These updates prepare the code for future enhancements, promote better practices in managing dependencies, and make it simpler for users to acquire datasets. Such improvements may positively affect the usability, efficiency, and reproducibility of the models for developers and end-users alike.

@glenn-jocher glenn-jocher self-assigned this Sep 25, 2020
@glenn-jocher
Member Author

glenn-jocher commented Sep 25, 2020

PIL Example:

import numpy as np
import torch
from PIL import Image, ImageDraw

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True).autoshape()

# Image
torch.hub.download_url_to_file('https://github.com/ultralytics/yolov5/blob/master/inference/images/zidane.jpg?raw=true', 'image.jpg')
img = Image.open('image.jpg')

# Inference
prediction = model(img, shape=640)

# Plot
draw = ImageDraw.Draw(img)
for *box, conf, cls in prediction[0]:  # [xy1, xy2], confidence, class
    print('%s %.2f' % (model.names[int(cls)], conf))  # label
    draw.rectangle(box, width=3)  # plot
img.save('results.jpg')  # save

Output:

100% 165k/165k [00:00<00:00, 5.16MB/s]

person 0.87
person 0.80
tie 0.78
tie 0.28

@NanoCode012
Contributor

Hi @glenn-jocher, would this support inference on multiple images?

For example, img would be an array of images. Would prediction then be an array of predictions?

Also, I noticed that you lowered the iou and conf defaults; is there a reason?

@glenn-jocher
Member Author

glenn-jocher commented Sep 27, 2020

@NanoCode012 I've just updated it to support batched inference, which I know is a popular request. It autocomputes the minimum inference size from the shape argument. For zidane.jpg and bus.jpg in a batch, for example, it will use 640x640 to accommodate both the vertical and horizontal rectangular images at 640. If it were just bus.jpg it would run at 640x480 vertical, and if it were just zidane.jpg it would run at 384x640 horizontal, so it's optimally shaped under all conditions.

This should super-simplify inference for most custom use cases I think.

# Images
img1 = Image.open('inference/images/zidane.jpg')
img2 = Image.open('inference/images/bus.jpg')

# Batched inference
prediction = model([img1, img2], shape=640)

# Plot
for i, img in enumerate([img1, img2]):
    for *box, conf, cls in prediction[i]:  # [xy1, xy2], confidence, class
        print('%s %.2f' % (model.names[int(cls)], conf))  # label
        ImageDraw.Draw(img).rectangle(box, width=3)  # plot
    img.save('results%g.jpg' % i)  # save
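The minimum-shape autocompute described above can be sketched roughly as follows. `batch_shape` is a hypothetical helper (not the actual autoShape code) and assumes the model requires dimensions that are multiples of the maximum stride, 32:

```python
import math

def batch_shape(sizes, target=640, stride=32):
    """Smallest stride-multiple (height, width) covering every image scaled to `target`.

    sizes: list of (height, width) tuples for the images in the batch.
    """
    shapes = []
    for h, w in sizes:
        m = max(h, w)
        shapes.append((target * h / m, target * w / m))  # scale longest side to target
    hmax = max(s[0] for s in shapes)
    wmax = max(s[1] for s in shapes)
    # pad each dimension up to the nearest multiple of the model stride
    return (math.ceil(hmax / stride) * stride,
            math.ceil(wmax / stride) * stride)

print(batch_shape([(720, 1280)]))               # zidane.jpg alone -> (384, 640)
print(batch_shape([(1080, 810)]))               # bus.jpg alone -> (640, 480)
print(batch_shape([(720, 1280), (1080, 810)]))  # both in one batch -> (640, 640)
```

This reproduces the shapes quoted above: a landscape-only batch runs short and wide, a portrait-only batch tall and narrow, and a mixed batch falls back to the full square.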

@glenn-jocher
Member Author

glenn-jocher commented Sep 27, 2020

@NanoCode012 yes, I also lowered the IoU and confidence thresholds for this NMS module. After playing around with the sliders in iDetection I realized lower values seemed to produce visually better results (qualitatively speaking), so I released the iDetection v7.7 update with 0.4 IoU and 0.2 confidence threshold defaults (which are now manually adjustable with the sliders).

I also separately saw that the official CoreML NMS defaults are 0.45 IoU and 0.25 confidence, so I decided to adopt those values here. I should probably also update the detect.py defaults in this PR to match.
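For readers unfamiliar with how the two thresholds interact, here is a minimal greedy NMS sketch in plain NumPy (illustrative only, not the repository's non_max_suppression) using the 0.25 confidence / 0.45 IoU defaults discussed above:

```python
import numpy as np

def nms(boxes, scores, conf_thres=0.25, iou_thres=0.45):
    """Greedy NMS. boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,)."""
    keep_mask = scores > conf_thres          # 1. drop low-confidence boxes
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]           # 2. highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # IoU of the top-scoring box against all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thres]  # 3. suppress heavily overlapping boxes
    return boxes[keep], scores[keep]
```

Lowering conf_thres keeps more (weaker) detections in step 1, while raising iou_thres allows more overlapping boxes to survive step 3.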

@NanoCode012
Contributor

Hello @glenn-jocher, thanks for the clarification on the IoU and confidence thresholds, as well as the update on batched inference.

@aniltolwani

Will these changes affect standard video inference (detect.py) at all? It would be great to achieve inference times similar to test.py by using batched predictions with video (right now I'm getting 0.014 s per frame).

@glenn-jocher
Member Author

/rebase

@glenn-jocher
Member Author

@aniltolwani yes, you can use this for batched video inference.

The model() here accepts a list of images, so you would construct the batch yourself, i.e. you would read perhaps 16 frames from a cv2 video capture object, place them in a list, and then pass the list for batched inference, repeat for the duration of the video.
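That frame-batching loop can be sketched as follows; the `batches` helper is illustrative and not part of this PR, and the commented cv2 usage is an untested outline:

```python
def batches(frames, batch_size=16):
    """Group a stream of frames into fixed-size lists for batched model() calls."""
    batch = []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:          # final partial batch
        yield batch

# With OpenCV this might look like (untested sketch):
#   cap = cv2.VideoCapture('video.mp4')
#   frames = iter(lambda: cap.read()[1], None)          # yields BGR frames until EOF
#   for batch in batches((f[:, :, ::-1] for f in frames), 16):  # BGR -> RGB
#       prediction = model(batch, shape=640)
```

Each yielded list can be passed straight to the model, with the last batch possibly shorter than 16.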

/rebase

@glenn-jocher
Member Author

glenn-jocher commented Oct 6, 2020

I think I'm going to expand this concept to be a bit more ambitious. If I can make this autoshape wrapper handle the current input format as well, which is just BCHW torch data, then the model could really accept nearly all commonly used input formats: cv2 image, numpy image, PIL image, list of images (for batched inference), and pytorch input (already shaped/letterboxed). I think this would cover the great majority of use cases.
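A toy version of that input dispatch might look like the following. `normalize_inputs` is a hypothetical name, and the real autoShape wrapper would additionally letterbox, normalize, and stack the images:

```python
import numpy as np

def normalize_inputs(x):
    """Dispatch sketch for an autoShape-style wrapper.

    Returns (batch, preprocessed): a list of HWC images plus False, or the
    input unchanged plus True when it is already a BCHW tensor.
    """
    if hasattr(x, 'dim') and callable(x.dim):    # torch.Tensor, duck-typed
        return x, True                           # already BCHW: skip preprocessing
    if isinstance(x, np.ndarray):
        return [x], False                        # single HWC numpy/cv2 image
    if isinstance(x, (list, tuple)):             # batch of mixed image types
        return [np.asarray(im) for im in x], False
    return [np.asarray(x)], False                # PIL.Image or similar
```

The torch branch is what lets the wrapper behave identically to the current model when given already-letterboxed BCHW input.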

@glenn-jocher
Member Author

/rebase

@glenn-jocher
Member Author

glenn-jocher commented Oct 10, 2020

PR is updated with torch input functionality now, so it can optionally behave identically to the current model. The autoShape wrapper comments now show all the input options as well, which I think cover the vast majority of pytorch inference use cases:

    def forward(self, x, shape=640, augment=False, profile=False):
        # supports inference from various sources. For height=720, width=1280, RGB images example inputs are:
        #   opencv:     x = cv2.imread('image.jpg')[:,:,::-1]  # HWC BGR to RGB x(720,1280,3)
        #   PIL:        x = Image.open('image.jpg')  # HWC x(720,1280,3)
        #   numpy:      x = np.zeros((720,1280,3))  # HWC
        #   torch:      x = torch.zeros(16,3,720,1280)  # BCHW
        #   multiple:   x = [Image.open('image1.jpg'), Image.open('image2.jpg'), ...]  # list of images

Test script:

import cv2
import numpy as np
from PIL import Image, ImageDraw

from models.experimental import attempt_load

# Model
# model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model = attempt_load('yolov5s.pt')
model = model.autoshape()  # < ------------------ add autoshape() wrapper

# Image
img1 = Image.open('inference/images/zidane.jpg')  # PIL
img2 = cv2.imread('inference/images/zidane.jpg')[:, :, ::-1]  # opencv (BGR to RGB)
img3 = np.zeros((640, 1280, 3))  # numpy
imgs = [img1, img2, img3]

# Inference
prediction = model(imgs, size=640)  # includes NMS

# Plot
for i, img in enumerate(imgs):
    print('\nImage %g/%g: %s ' % (i + 1, len(imgs), np.asarray(img).shape), end='')  # np.asarray handles PIL images too
    img = Image.fromarray(img.astype(np.uint8)) if isinstance(img, np.ndarray) else img  # from np
    if prediction[i] is not None:  # skip images with no detections
        for *box, conf, cls in prediction[i]:  # [xy1, xy2], confidence, class
            print('%s %.2f, ' % (model.names[int(cls)], conf), end='')  # label
            ImageDraw.Draw(img).rectangle(box, width=3)  # plot
    img.save('results%g.jpg' % i)  # save

Test script output for batched inference at image size 640:

Fusing layers... 
Adding autoShape... 

Image 1/3: (720, 1280, 3) person 0.87, person 0.80, tie 0.78, tie 0.28, 
Image 2/3: (720, 1280, 3) person 0.87, person 0.80, tie 0.78, tie 0.28, 
Image 3/3: (640, 1280, 3) 

@glenn-jocher
Member Author

/rebase

* fix/hyper

* Hyp giou check to train.py

* restore general.py

* train.py overwrite fix

* restore general.py and pep8 update

Co-authored-by: Glenn Jocher <[email protected]>
@glenn-jocher
Member Author

/rebase

glenn-jocher and others added 11 commits October 15, 2020 18:48
img size checks are warnings rather than errors, so current implementation allows improperly formed model inputs.
* comment

* fix parsing

* fix evolve

* folder

* tqdm

* Update train.py

* Update train.py

* reinstate anchors into meta dict

anchor evolution is working correctly now

* reinstate logger

prefer the single line readout for concise logging, which helps simplify notebook and tutorials etc.

Co-authored-by: Glenn Jocher <[email protected]>
@glenn-jocher
Member Author

This PR is rebased, updates complete, CI passing, merging.

@glenn-jocher
Member Author

/rebase

@glenn-jocher
Member Author

Ok, somehow I managed to get this PR stuck in a rebase cycle. Closing and starting from scratch at #1153

@glenn-jocher glenn-jocher deleted the simple_inference branch October 15, 2020 18:11