Fix CUDA out of memory issue in model.encode by allowing user to transfer to CPU #1717
In issue #487 and issue #522, users were running into OOM errors when the batch size is large, because the embeddings aren't offloaded to the CPU. The PR that addressed this only fixes the case where `convert_to_numpy=True`; if you have `convert_to_numpy=False`, the problem still exists.

In this PR, I added an extra flag that allows the embeddings to be offloaded to the CPU. This gives the user the flexibility to keep embeddings off the GPU (for example, when saving the SentenceTransformer embeddings to disk or holding them in RAM for kNN, which is often the case) instead of keeping all the embeddings on the GPU.
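For context, here is a minimal sketch of the workaround users currently need, which performs the same per-batch offload manually that the new flag would do inside `encode`. The model name and batch size are illustrative, and only the existing `convert_to_tensor` parameter is used:

```python
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = [f"sentence {i}" for i in range(100_000)]

# With convert_to_numpy=False, encode keeps every embedding on the GPU,
# so memory grows with corpus size. Moving each batch to CPU RAM keeps
# only the current batch on the GPU:
all_embeddings = []
batch_size = 256
for start in range(0, len(sentences), batch_size):
    batch = sentences[start:start + batch_size]
    emb = model.encode(batch, convert_to_tensor=True)
    all_embeddings.append(emb.cpu())  # offload this batch to CPU RAM

embeddings = torch.cat(all_embeddings)  # full corpus now lives in RAM
```

The flag in this PR folds that `.cpu()` step into `encode` itself, so users get CPU tensors back without writing their own batching loop.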