How to recognize more than one face in one image? #216
Comments
Hello @bit-scientist, I just figured out how to recognise more than one face using facenet-pytorch's face recognition approach. First, build the embeddings as the examples do:

    aligned = []
    names = []
    for x, y in loader:
        x_aligned, prob = mtcnn(x, return_prob=True)
        if x_aligned is not None:
            print('Face detected with probability: {:8f}'.format(prob))
            aligned.append(x_aligned)
            names.append(dataset.idx_to_class[y])
    aligned = torch.stack(aligned).to(device)
    embeddings = resnet(aligned).detach().cpu()

Then, you can save both the embeddings and the names to a file:

    data = [embeddings, names]
    torch.save(data, 'Facenet Pytorch Finetuning embeddings.pt')  # saving data.pt file
Next up, here is how to use the model:

    # importing libraries
    from facenet_pytorch import MTCNN, InceptionResnetV1
    import torch
    from torchvision import datasets
    from torch.utils.data import DataLoader
    from PIL import Image
    import cv2
    import time
    import os

    # Initializing global variables
    def get_device():
        """Returns the best available device for PyTorch."""
        if torch.backends.mps.is_available():
            device = torch.device("mps")
        elif torch.cuda.is_available():
            device = torch.device("cuda:0")
        else:
            device = torch.device("cpu")
        return device

    device = get_device()

    # load the saved [embeddings, names] pair
    load_data = torch.load('Facenet Pytorch Finetuning embeddings.pt', map_location=device)
    embedding_list = load_data[0]
    name_list = load_data[1]
    print(embedding_list.shape)

    resnet = InceptionResnetV1(pretrained='vggface2').eval()
    # resnet.to(device)
    mtcnn = MTCNN(image_size=160, margin=0, min_face_size=20,
                  thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
                  keep_all=True)  # keep_all=True returns every detected face
    cam = cv2.VideoCapture(0)
    while True:
        ret, frame = cam.read()
        if not ret:
            print("failed to grab frame, try again")
            break
        # OpenCV frames are BGR; convert to RGB before handing them to PIL/MTCNN
        img = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        img_cropped_list, prob_list = mtcnn(img, return_prob=True)
        if img_cropped_list is not None:
            boxes, _ = mtcnn.detect(img)
            for i, prob in enumerate(prob_list):
                if prob > 0.90:
                    emb = resnet(img_cropped_list[i].unsqueeze(0)).detach()
                    dist_list = []  # distances to every saved embedding; the minimum identifies the person
                    for idx, emb_db in enumerate(embedding_list):
                        dist = torch.dist(emb.to(device=device), emb_db).item()
                        dist_list.append(dist)
                    min_dist = min(dist_list)  # get minimum dist value
                    min_dist_idx = dist_list.index(min_dist)  # get minimum dist index
                    name = name_list[min_dist_idx]  # get name corresponding to minimum dist
                    box = boxes[i]
                    original_frame = frame.copy()  # storing copy of frame before drawing on it
                    if min_dist < 0.65:
                        min_dist = 1 - min_dist  # show a similarity-like score instead of a distance
                        frame = cv2.putText(frame, name + ' ' + str(min_dist), (int(box[0]), int(box[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 1)
                        print(f"{name} {min_dist}")
                    else:
                        frame = cv2.putText(frame, "Unknown", (int(box[0]), int(box[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 1)
                    frame = cv2.rectangle(frame, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (255, 0, 0), 2)
        cv2.imshow("IMG", frame)
        if cv2.waitKey(1) == ord('q'):
            break
    cam.release()
    cv2.destroyAllWindows()

Thanks! Hope it helps 🤠
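The matching step inside the loop above can be isolated as a small function. This is only a minimal sketch in plain Python rather than torch, so it runs without the model: `euclidean`, `identify`, and the toy 3-d vectors are hypothetical names and data, not part of facenet-pytorch, and the 0.65 threshold simply mirrors the value used above. Real embeddings could be passed in after calling `.tolist()` on the tensors.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(emb, known_embeddings, known_names, threshold=0.65):
    """Return (name, distance) for the closest known embedding.

    The name is "Unknown" when even the closest embedding is not
    within the threshold.
    """
    dists = [euclidean(emb, known) for known in known_embeddings]
    best_idx = min(range(len(dists)), key=dists.__getitem__)
    best_dist = dists[best_idx]
    # known_names[best_idx] is the label saved alongside known_embeddings[best_idx]
    name = known_names[best_idx] if best_dist < threshold else "Unknown"
    return name, best_dist


# Toy 3-d vectors standing in for real face embeddings (hypothetical data).
known_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
known_names = ["alice", "bob"]

match_name, match_dist = identify([0.9, 0.1, 0.0], known_embeddings, known_names)
unknown_name, unknown_dist = identify([0.0, 0.0, 1.0], known_embeddings, known_names)
print(match_name, unknown_name)
```

The name is looked up at the same index as the smallest distance, because the embeddings and names were saved in the same order.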
The inference pipeline given in infer handles one face per image. I have several images, and each image contains many people. I would like to recognize every unique person individually across all images.
I have made an example case below:
Here, I would like to assign IDs (0 to 5) to each character in the given images. I have been thinking about how I can accomplish this.
For now, I can think of one way: loop all images through `mtcnn` (with `keep_all=True`) to get face crops (6 per image), calculate their embeddings with `resnet`, then calculate their distance matrix. But I don't know what to do with these numbers afterwards.
Since each face crop is (for now) considered unique, I will get a 36x36 matrix, right?
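To make the shape concrete, here is a small sketch of the pairwise distance matrix described above, written in plain Python so it runs without the model. The helper name `pairwise_distances` and the 4-d toy vectors are assumptions for illustration only; with the real embeddings stacked in a torch tensor, `torch.cdist(embeddings, embeddings)` should give the same kind of matrix.

```python
import math


def pairwise_distances(embeddings):
    """Return the N x N matrix of Euclidean distances between N embeddings."""
    n = len(embeddings)
    dists = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(embeddings[i], embeddings[j])
            dists[i][j] = d
            dists[j][i] = d  # the matrix is symmetric
    return dists


# Hypothetical stand-in data: 6 images x 6 faces = 36 crops, as 4-d toy vectors.
# Crop k belongs to "person" k % 6, so crops 0, 6, 12, ... are identical here.
embeddings = [[float((k % 6) == axis) for axis in range(4)] for k in range(36)]

dists = pairwise_distances(embeddings)
print(len(dists), len(dists[0]))
```

So 36 crops do give a 36x36 matrix: the diagonal is zero, the matrix is symmetric, and small off-diagonal entries suggest that two crops show the same person.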
How to proceed from here?
I hope to get some help, as I am quite new to the face recognition domain. Thanks!
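One possible way to turn such a distance matrix into person IDs, sketched under assumptions rather than taken from this thread: treat any two crops closer than a threshold as the same person and merge them into groups (single-linkage grouping with a union-find structure). The function name `assign_ids`, the toy matrix, and the 0.65 threshold are all hypothetical; the threshold would need tuning on real data, and this simple rule can chain different people together if one pair falls below it by accident. Standard clustering methods such as DBSCAN or agglomerative clustering are alternatives.

```python
def assign_ids(dists, threshold):
    """Group items whose pairwise distance is below threshold.

    Returns one integer ID per item; items in the same group share an ID.
    """
    n = len(dists)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dists[i][j] < threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two groups

    # Relabel group roots as consecutive IDs in order of first appearance.
    ids, root_to_id = [], {}
    for i in range(n):
        root = find(i)
        ids.append(root_to_id.setdefault(root, len(root_to_id)))
    return ids


# Hypothetical toy matrix: crops 0 and 2 are one person, crops 1 and 3 another.
dists = [
    [0.0, 1.2, 0.3, 1.1],
    [1.2, 0.0, 1.3, 0.4],
    [0.3, 1.3, 0.0, 1.2],
    [1.1, 0.4, 1.2, 0.0],
]
ids = assign_ids(dists, threshold=0.65)
print(ids)
```

With the 36x36 matrix from six images of the same six people, a well-chosen threshold would ideally yield six groups, i.e. IDs 0 to 5.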