[Enhancement] ~2x faster method to convert numpy image array to builtin array #443

Open
Interpause opened this issue Jul 4, 2022 · 2 comments

Comments

@Interpause

img_msg.data.frombytes(cvim.tobytes())

The line above copies the image twice: once when converting the np.ndarray to bytes, and again when reading those bytes into img_msg.data (which is an array.array).

According to https://numpy.org/doc/stable/reference/generated/numpy.ndarray.data.html, np.ndarray.data exposes the array's buffer as a memoryview. Reading directly from it skips the intermediate bytes copy and gives a ~2x speedup, which matters most for large images (tested below with an 8K-resolution image).

>>> import numpy as np
>>> from array import array
>>> from timeit import timeit
>>> img = np.zeros((4320, 7680, 3), dtype=np.uint8)
>>> timeit(lambda: array('B', []).frombytes(img.tobytes()), number=200)
11.108921873000327
>>> timeit(lambda: array('B', []).frombytes(img.data), number=200)
5.20463201199982
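
As a sanity check that the faster path produces the same result, here is a minimal sketch comparing the two conversions on a small uint8 image. The helper names to_array_via_bytes and to_array_via_buffer are hypothetical and only used for illustration; they are not part of cv_bridge.

```python
import numpy as np
from array import array


def to_array_via_bytes(img):
    """Two copies: ndarray -> bytes -> array.array."""
    out = array('B')
    out.frombytes(img.tobytes())
    return out


def to_array_via_buffer(img):
    """One copy: read straight from the ndarray's buffer.

    Assumes img is a C-contiguous uint8 array, as in the timing above.
    """
    out = array('B')
    out.frombytes(img.data)
    return out


# Small stand-in image (hypothetical data) so the comparison runs quickly.
img = np.arange(4 * 5 * 3, dtype=np.uint8).reshape(4, 5, 3)
a = to_array_via_bytes(img)
b = to_array_via_buffer(img)
print(a == b)
```

Both helpers should yield identical array.array contents; only the number of intermediate copies differs.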

I will write a pull request when I have time.

@daggarwa

daggarwa commented Feb 28, 2023

@Interpause Did you happen to create an MR for this optimization? By the way, I found a case where this does not work for me: if the image has a dtype of np.float32, the optimization breaks:

>>> import numpy as np
>>> from array import array
>>> from timeit import timeit
>>> img = np.zeros((4320, 7680, 3), dtype=np.uint8)
>>> timeit(lambda: array('B', []).frombytes(img.tobytes()), number=200)
12.889657152991276
>>> timeit(lambda: array('B', []).frombytes(img.data), number=200)
5.910853378998581
>>> img = np.zeros((4320, 7680, 3), dtype=np.float32)
>>> timeit(lambda: array('B', []).frombytes(img.data), number=200)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/timeit.py", line 233, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
  File "/usr/lib/python3.8/timeit.py", line 177, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
  File "<stdin>", line 1, in <lambda>
TypeError: a bytes-like object is required

Would you have any suggestions on how to work around that?

@Interpause
Author

Interpause commented Mar 5, 2023

The "dtype" of a memoryview can be read from memoryview.format (https://docs.python.org/3/library/stdtypes.html#memoryview.format). So for a numpy array, it would be img.data.format. The formats can be referenced here: https://docs.python.org/3/library/struct.html#format-characters.
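
For illustration, a short sketch that prints the format character and item size NumPy reports for a few common image dtypes. The chosen dtypes are just examples; the format codes follow the struct table linked above.

```python
import numpy as np

# Collect (format, itemsize) as reported by each array's memoryview.
formats = {}
for dtype in (np.uint8, np.uint16, np.float32, np.float64):
    view = np.zeros((2, 2), dtype=dtype).data
    formats[np.dtype(dtype).name] = (view.format, view.itemsize)

for name, (fmt, size) in formats.items():
    print(f"{name}: format={fmt!r}, itemsize={size}")
```

Only the uint8 case has an item size of 1, which is why it is the only one that can be read as plain bytes without a cast.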

There is also memoryview.cast, which can reinterpret the memoryview without copying (https://docs.python.org/3/library/stdtypes.html#memoryview.cast). However, in that case extra care is needed to cast the data back to the correct type on the receiving side.
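
A possible workaround along these lines, as a rough sketch rather than a tested patch: cast the buffer to unsigned bytes before calling frombytes, and reinterpret the bytes with the original dtype and shape on the receiving side. The function names ndarray_to_byte_array and byte_array_to_ndarray are hypothetical, and the sketch assumes the receiver learns the dtype and shape some other way (for example from the message's encoding, height, and width fields).

```python
import numpy as np
from array import array


def ndarray_to_byte_array(img):
    """Copy any-dtype image data into an array('B') with a single copy."""
    # memoryview.cast only works on C-contiguous buffers, so make sure of
    # that first (this is a no-op for arrays that are already contiguous).
    img = np.ascontiguousarray(img)
    out = array('B')
    out.frombytes(img.data.cast('B'))  # the cast itself does not copy
    return out


def byte_array_to_ndarray(data, dtype, shape):
    """Reinterpret the raw bytes using the sender's dtype and shape."""
    # np.frombuffer shares memory with `data` instead of copying it.
    return np.frombuffer(data, dtype=dtype).reshape(shape)


# Hypothetical float32 image used only to exercise the round trip.
img = np.linspace(0.0, 1.0, 24, dtype=np.float32).reshape(2, 4, 3)
data = ndarray_to_byte_array(img)
restored = byte_array_to_ndarray(data, np.float32, img.shape)
print(np.array_equal(restored, img))
```

Note that the restored array is a view onto the array.array, so a caller that needs an independent image would still have to copy it.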

That said, cv2 images are always supposed to be uint8 numpy arrays, so it should not be a problem?

Oh, and I did not end up making the pull request.
