replace bytes of compressed stream with uint8_t #106
Thanks @halehawk! Have a couple minor suggestions below.
Co-authored-by: jakirkham <[email protected]>
Co-authored-by: jakirkham <[email protected]>
It seems |
@jakirkham The reason is that the bit stream is used both for compression and decompression, and it needs to be mutable to support compression. |
Does that mean user provided data is overwritten as part of compression? |
No, user data is not being overwritten. The compressed data is copied and returned here: Line 166 in 0817b08
stream_open , which is a very low-level C function that is called only once during compression to set up the memory buffer where the compressed data is written: Line 155 in 0817b08
I'm not entirely sure I understand the issue being discussed here. Can you please elaborate? |
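A bytes object is read-only, so the pointer would need to be const. It may have worked before with a C compiler warning, though this change highlights that it is a problem, as Cython now makes this a compilation error. Would it be possible to provide API functions that are const, at least in some cases, to respect this constraint? I'm guessing this means other changes under the hood as well. |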
stream_open is used in both compress and decompress, so its argument has to
be writable.
…On Fri, Sep 18, 2020 at 11:00 AM jakirkham ***@***.***> wrote:
A bytes object is read-only. So the pointer would need to be const. It
may have worked before with a C compiler warning. Though this change is
highlighting that is a problem as Cython is making this a compilation error
now. Would it be possible to provide API functions that are const at
least in some cases to respect this constraint? I'm guessing this means
other changes under-the-hood as well.
|
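The read-only constraint discussed above can be observed from plain Python, without zfp or Cython. A minimal sketch using only the standard library: the buffer exported by a bytes object reports itself as read-only, whereas the one exported by a bytearray is writable. The variable names are illustrative only.

```python
# Compare the buffers exported by an immutable bytes object and a mutable bytearray.
ro_view = memoryview(b"\x00\x01\x02")
rw_view = memoryview(bytearray(b"\x00\x01\x02"))

print(ro_view.readonly)  # True
print(rw_view.readonly)  # False

# Writing through the writable view succeeds.
rw_view[0] = 0xFF

# Writing through the read-only view is rejected by the interpreter.
try:
    ro_view[0] = 0xFF
    write_rejected = False
except TypeError:
    write_rejected = True

print(write_rejected)  # True
```

A C function that takes a non-const pointer gives no such protection, which is why handing it memory owned by a bytes object relies on the callee never writing to it.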
Supporting both const and nonconst pointers in the bit stream API would require substantial surgery on all of zfp. Moreover, it would preclude you from compressing to a stream and later decompressing from the same stream (something needed by zfp's compressed-array classes)--two different objects would be needed: one that holds a const pointer, one that holds a mutable pointer. From the C/C++ programmer's perspective, the expectation is that Now, we should be able to safely cast a |
Good to know. Sure we could do that. Just noting that we may want to fix this in the long term. @halehawk, would you be able to give Peter's suggestion a try? Should be a small change to make. |
I just added stream_open(<char *>buffer, ..); is this what you want?
After I submitted the change, the build still failed. @lindstro, you can
check the Xenial error log. @jakirkham
…On Fri, Sep 18, 2020 at 1:38 PM jakirkham ***@***.***> wrote:
Good to know.
Sure we could do that. Just noting that we may want to fix this in the
long term.
@halehawk <https://github.com/halehawk>, would you be able to give
Peter's suggestion a try? Should be a small change to make.
|
For some reason, build #992 is not showing up on CDash (you may be able to check it yourself: https://open.cdash.org/index.php?project=zfp), even though the Travis logs suggest that they were sent there. If they don't show up on CDash soon, I may restart the build. Your changes look right, although I would caution against using Although mostly orthogonal to this issue, the compressed stream needs to be word aligned. I'm not sure if Python makes any guarantees about alignment. I have no reason to believe that alignment has anything to do with these errors; this is just something that occurred to me when discussing pointer types. |
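On the alignment remark: whether a given Python buffer happens to start on a word boundary can be checked at run time. The sketch below uses only the standard library; the 8-byte word size is an assumption (it depends on how zfp was built), and address_of and aligned_copy are hypothetical helper names, not part of zfpy.

```python
import ctypes

WORD_BYTES = 8  # assumption: 64-bit stream words; depends on the zfp build


def address_of(buf: bytearray) -> int:
    """Return the memory address of the first byte of a non-empty bytearray."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def aligned_copy(data, word_bytes: int = WORD_BYTES):
    """Copy data into fresh storage whose first byte sits on a word boundary.

    Returns (view, address): a writable memoryview over the copy and its address.
    """
    raw = bytearray(len(data) + word_bytes)    # over-allocate to leave room to shift
    offset = (-address_of(raw)) % word_bytes   # bytes to skip to reach a boundary
    view = memoryview(raw)[offset:offset + len(data)]
    view[:] = data                             # copy the payload into the aligned region
    return view, address_of(raw) + offset


view, addr = aligned_copy(b"compressed payload")
print(addr % WORD_BYTES == 0)  # True
```

In practice most allocators already return suitably aligned memory, so the computed offset is usually zero; the copy only matters for buffers that start in the middle of a larger allocation, such as slices.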
Looks like some tests but not all are now passing:
|
The Xenial job of build 992 still gave us 5 errors. The log with fewer
errors that Peter showed came from the non-Xenial builds.
My build 993, with all suggested changes, still failed on Xenial.
…On Fri, Sep 18, 2020 at 4:24 PM jakirkham ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In python/zfpy.pyx
<#106 (comment)>:
> cdef zfp_field* field = zfp_field_alloc()
cdef bitstream* bstream = stream_open(
- comp_data_pointer,
+ <char *>comp_data_pointer,
⬇️ Suggested change
- <char *>comp_data_pointer,
+ <void *>comp_data_pointer,
|
Sorry, I copied the wrong log. I'm guessing this is a Cython bug (see, e.g., cython/cython#1605). I don't know what to suggest at this point. @jakirkham Do you have other numcodecs examples that use Cython with |
Though that has been fixed since 0.28. Cython is now deep in the 0.29.x releases. |
python/zfpy.pyx
Outdated
@@ -329,15 +330,15 @@ cpdef np.ndarray _decompress(
    return output

cpdef np.ndarray decompress_numpy(
-    bytes compressed_data,
+    const uint8_t[::1] compressed_data,
Sorry there may have been an extra space in the suggestion before. Maybe that helps?
- const uint8_t[::1] compressed_data,
+ const uint8_t[::1] compressed_data,
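For readers less familiar with typed memoryviews: a parameter declared as const uint8_t[::1] accepts any object that exports a one-dimensional, C-contiguous buffer of unsigned bytes, read-only or not. The pure-Python check below only approximates that rule for illustration; fits_const_uint8_1d is a hypothetical helper, and Cython's actual buffer acquisition is done in generated C code.

```python
from array import array


def fits_const_uint8_1d(obj) -> bool:
    """Rough check: can obj be viewed as a 1-D, C-contiguous buffer of unsigned bytes?"""
    try:
        view = memoryview(obj)
    except TypeError:  # obj does not support the buffer protocol
        return False
    return view.ndim == 1 and view.c_contiguous and view.format == "B"


print(fits_const_uint8_1d(b"abc"))                       # True: read-only is fine for const
print(fits_const_uint8_1d(bytearray(b"abc")))            # True
print(fits_const_uint8_1d(array("B", [1, 2, 3])))        # True
print(fits_const_uint8_1d(array("d", [1.0])))            # False: wrong element type
print(fits_const_uint8_1d(memoryview(bytearray(8))[::2]))  # False: strided, not contiguous
print(fits_const_uint8_1d("abc"))                        # False: str exports no buffer
```

A contiguous NumPy array with dtype uint8 exports the same kind of buffer, which is what makes it possible to pass an ndarray where bytes was previously required.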
Where do we look to see the build failures? Went to the CDash page, but am not familiar with this tool. So wasn't sure where to go from there to see this PR. |
The CDash user interface leaves much to be desired. You first need to find the correct date of the build (in whatever timezone CDash defaults to), which you can do via current/prev/next (e.g., https://open.cdash.org/index.php?project=zfp&date=2020-09-18). Then find the Travis build name you're interested in (e.g., develop-#993.1). Click the entry in the red Fail column (e.g., https://open.cdash.org/viewTest.php?onlyfailed&buildid=6770400), then the "Failed" link (e.g., https://open.cdash.org/test/250854826). This should bring up the error log for that build. |
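Since the first step is finding the right day, the dashboard URL for a given date can also be built directly rather than clicked through. A small sketch, assuming the URL pattern shown in the example link above stays as it is; cdash_day_url is a hypothetical helper name.

```python
from datetime import date
from urllib.parse import urlencode

CDASH_INDEX = "https://open.cdash.org/index.php"


def cdash_day_url(project: str, day: date) -> str:
    """Build the CDash dashboard URL for one project on one calendar day."""
    query = urlencode({"project": project, "date": day.isoformat()})
    return f"{CDASH_INDEX}?{query}"


print(cdash_day_url("zfp", date(2020, 9, 18)))
# https://open.cdash.org/index.php?project=zfp&date=2020-09-18
```

The build and test IDs in the later links are assigned by CDash and still have to be looked up on that page.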
@halehawk Thanks for the offer. By the release script, do you mean generating and uploading the wheels to PyPI? That's done through a separate repo: https://github.com/LLNL/zfpy-wheels. When experimenting with this, we should use TestPyPI; see https://packaging.python.org/guides/using-testpypi/. If this is what you meant, then let's discuss offline the steps involved in doing this. |
I'm not sure I follow. Why is numcodecs needed to test that ndarrays can be passed? I'm reluctant to add unnecessary dependencies to zfp. |
Sure, I will change it as you suggested. Because your code used char* at
first, I thought I should follow it.
…On Wed, Sep 23, 2020 at 9:12 AM Peter Lindstrom ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In python/zfpy.pyx
<#106 (comment)>:
> ):
if compressed_data is None:
raise TypeError("compressed_data cannot be None")
- cdef char* comp_data_pointer = compressed_data
+ cdef const char* comp_data_pointer = <const char *>&compressed_data[0]
Same thing here; const void* or void*.
|
Yes, that's what I want to know.
…On Wed, Sep 23, 2020 at 9:22 AM Peter Lindstrom ***@***.***> wrote:
Is it possible you release a minor release such as 0.5.5.1 (-:? If not,
please merge my PR as soon as possible. I will be glad to help on this
dylib issue. But can you point to me how you can run the release script?
@halehawk <https://github.com/halehawk> Thanks for the offer. By the
release script, do you mean generating and uploading the wheels to PyPI?
That's done through a separate repo: https://github.com/LLNL/zfpy-wheels.
When experimenting with this, we should use TestPyPI; see
https://packaging.python.org/guides/using-testpypi/. If this is what you
meant, then let's discuss offline the steps involved in doing this.
|
Yes, that's my concern too. I will try to test directly with a numpy array.
…On Wed, Sep 23, 2020 at 9:26 AM Peter Lindstrom ***@***.***> wrote:
I did a test on my machine with ensure_ndarray, and it worked. But if I need
to do it officially in test_numpy, then test_numpy.py needs to import
numcodecs.
I'm not sure I follow. Why is numcodecs needed to test that ndarrays can
be passed? I'm reluctant to add unnecessary dependencies to zfp.
|
I added tests, and all checks passed. @lindstro |
tests/python/test_numpy.py
Outdated
    random_array,
    write_header=False,
    **compression_kwargs
)

if isinstance(compressed_array_t, np.ndarray):
I'm not sure I understand the intent here. compressed_array_t is by definition a bytes object, so this conditional should always be false.
Wouldn't the appropriate test be to ensure that we can pass both a bytes object and an ndarray to zfpy.decompress_numpy? What other array-like objects are we trying to support here?
Minor nit: As a C/C++ programmer, I think compressed_array_t is a confusing name as the _t suffix makes it look like a type.
On Wed, Sep 30, 2020 at 12:09 PM Peter Lindstrom ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In tests/python/test_numpy.py
<#106 (comment)>:
> random_array,
write_header=False,
**compression_kwargs
)
-
+ if isinstance(compressed_array_t, np.ndarray):
I'm not sure I understand the intent here. compressed_array_t is by
definition a bytes object, so this conditional should always be false.
No, compressed_array_t could be array objects from numcodecs.zfpy, and
could be a bytes object from your other upstream application tools.
Wouldn't the appropriate test be to ensure that we can pass both a bytes
object and an ndarray to zfpy.decompress_numpy? What other array-like
objects are we trying to support here?
Yes. In your test_numpy.py, there are two places that call _decompress or
decompress_numpy (line 75 and line 104); I added this test at only one of
them.
… Minor nit: As a C/C++ programmer, I think compressed_array_t is a
confusing name as the _t suffix makes it look like a type.
What is your suggestion for the variable name here?
|
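Maybe it's my lack of Python skills, but unless I'm missing something obvious, in this test compressed_array_t is always a bytes object returned by zfpy.compress_numpy. How could it be any other type the way this particular test is written?
My concern is that the conditional is always false (so what purpose does it serve?), and therefore only one branch is taken and tested. That is, only the memoryview path is exercised. |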
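One way to exercise every input type without a conditional is to run the same round trip over a list of containers that all wrap the same compressed bytes. The sketch below is self-contained and uses stand-ins: fake_compress and fake_decompress are hypothetical placeholders for zfpy.compress_numpy and zfpy.decompress_numpy, and the "compression" is just a byte reversal. In the real test one would presumably also add a NumPy uint8 array built from the compressed bytes.

```python
from array import array


def fake_compress(values: bytes) -> bytes:
    """Stand-in for a compressor; like compress_numpy, it returns a bytes object."""
    return bytes(reversed(values))


def fake_decompress(compressed) -> bytes:
    """Stand-in for a decompressor that accepts any 1-D buffer of unsigned bytes."""
    view = memoryview(compressed)
    if view.ndim != 1 or view.format != "B":
        raise TypeError("expected a 1-D buffer of unsigned bytes")
    return bytes(reversed(view.tobytes()))


original = b"zfp test payload"
compressed = fake_compress(original)  # always bytes, as in the test under discussion

# Wrap the same compressed stream in each container type we want to support.
containers = {
    "bytes": compressed,
    "bytearray": bytearray(compressed),
    "array": array("B", compressed),
    "memoryview": memoryview(compressed),
}

# Every container goes through the same code path and must give the same result.
results = {name: fake_decompress(obj) for name, obj in containers.items()}
for name, result in results.items():
    assert result == original, name
```

Because the containers are constructed explicitly, each type is guaranteed to be tested, rather than depending on what the compressor happens to return.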
I will remove one condition.
…On Wed, Sep 30, 2020 at 4:03 PM Peter Lindstrom ***@***.***> wrote:
No, compressed_array_t could be array objects from numcodecs.zfpy, and
could be a bytes object from your other upstream application tools.
Maybe it's my lack of Python skills, but unless I'm missing something
obvious, in this test compressed_array_t is always a bytes object
returned by zfpy.compress_numpy. How could it be any other type the way
this particular test is written?
Wouldn't the appropriate test be to ensure that we can pass both a bytes
object and an ndarray to zfpy.decompress_numpy? What other array-like
objects are we trying to support here?
Yes, in your test_numpy.py, there are two places calling _decompress (line
75 and line 104) or decompress_numpy, I only added this test on one place
of _decompress or decompress_numpy.
My concern is that the conditional is always false (so what purpose does
it serve?), and therefore only one branch is taken and tested. That is,
only the memoryview path is exercised.
|
Thanks Haiying! Also thanks Peter for the review! 😄 |
Yes, big thanks to @halehawk for this PR and for putting up with my picky comments. 😉 Some of the tests are failing on Travis and AppVeyor, but this appears to be caused by some issues with cmocka. I've restarted the failed jobs, and they seem to be passing now. |
Thank you all for such good collaboration. @lindstro @jakirkham
…On Thu, Oct 1, 2020 at 12:41 PM Peter Lindstrom ***@***.***> wrote:
Yes, big thanks to @halehawk <https://github.com/halehawk> for this PR
and for putting up with my picky comments. 😉 Some of the tests are
failing on Travis and AppVeyor, but this appears to be caused by some
issues with cmocka. I've restarted the failed jobs, and they seem to be
passing now.
|
Thanks again for your help, Peter! Do you have an idea of when the next release (with this change) might be? 🙂 |
We're woefully behind schedule on this release, and I don't want to make any promises, but we're working hard to get it done by the end of the year. |
Peter, will the compress and decompress APIs in your new release have changes?
On Nov 16, 2020, at 3:46 PM, Peter Lindstrom ***@***.***> wrote:
We're woefully behind schedule on this release, and I don't want to make any promises, but we're working hard to get it done by the end of the year.
|
No, no changes there. The majority of new features are related to the C++ compressed-array classes and their C wrappers. |
Peter, any update on your new zfp release?
…On Mon, Nov 16, 2020 at 5:08 PM Peter Lindstrom ***@***.***> wrote:
No, no changes there. The majority of new features are related to the C++
compressed-array classes and their C wrappers.
|
We're slowly making progress, but we're not as close as I would have wished. I expect we're still a few weeks away. |
Just checking in here, how are things looking? |
Also, it looks like the latest numcodecs release includes Python 3.9
builds. @lindstro, could you please build zfpy with this Python version as
well?
…On Tue, Feb 2, 2021 at 11:30 AM jakirkham ***@***.***> wrote:
Just checking in here, how are things looking?
|
Slowly making progress on 0.5.6. Sorry to be holding everyone up, but there's just a lot of stuff going into this release. Regarding Python 3.9, I can add that to zfpy-wheels but am unsure whether it needs @da-wad Do you know? |
No worries 🙂 If it is easy to add, it would be great, but if not, I wouldn't worry about it. I don't think the fact that Numcodecs uses Python 3.9 should constrain zfpy. Ah, guessing you are referring to this? Yeah, 2010 makes sense as that is CentOS 6 based. I think the default is 1, which is CentOS 5 based (no one should be using that any more as it no longer receives any updates). They've also added 2014, which is CentOS 7 based. Not sure what GLIBC your users typically need, but continuing to target CentOS 6 makes sense and should give GLIBC 2.12+ support. |
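The manylinux variants mentioned above can be summarized as a small lookup table. The CentOS bases and the 2.12 baseline are from the comment; the glibc 2.5 and 2.17 baselines for manylinux1 and manylinux2014 are taken from the corresponding manylinux PEPs rather than from this thread, and tags_usable_on is a hypothetical helper for illustration.

```python
# manylinux tag -> (base distribution, minimum glibc version as a tuple)
MANYLINUX_BASELINES = {
    "manylinux1": ("CentOS 5", (2, 5)),
    "manylinux2010": ("CentOS 6", (2, 12)),
    "manylinux2014": ("CentOS 7", (2, 17)),
}


def tags_usable_on(glibc_version):
    """List the tags whose glibc baseline is met by the given (major, minor) version."""
    return [
        tag
        for tag, (_, needed) in MANYLINUX_BASELINES.items()
        if glibc_version >= needed
    ]


print(tags_usable_on((2, 12)))  # ['manylinux1', 'manylinux2010']
```

Targeting an older baseline widens the set of systems that can install the wheel, at the cost of building against older toolchains.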
@jakirkham, do the tests of the latest numcodecs allow "conda install
zfpy" instead of "pip install zfpy"? I just wonder whether zfpy and other
compression packages have to be pip-installed or not.
…On Tue, Feb 2, 2021 at 1:01 PM jakirkham ***@***.***> wrote:
No worries 🙂
If it is easy to add, it would be great, but if not wouldn't worry about
it. I don't think the fact that Numcodecs uses Python 3.9 should constrain
zfpy.
Ah guessing you are referring to this
<https://github.com/matthew-brett/multibuild#build-phase>? Yeah 2010
makes sense sense as that is CentOS 6 based. I think the default is 1,
which is CentOS 5 based (no one should be using that any more as it no
longer receives any updates). They've also added 2014, which is CentOS 7
based. Not sure what GLIBC your users typically need, but continuing to
target CentOS 6 makes sense and should give GLIBC 2.12+ support.
|
The issue is that we need a tag. |
On Fri, Mar 12, 2021 at 12:09 PM jakirkham ***@***.***> wrote:
We need a tag is the issue
I didn't get it; could you please elaborate?
|
There is no git tag here including these changes from which to build. We would need that to proceed (even with Conda) |
I replaced the bytes type with const uint8_t[::1]. This way the compressed stream can be checked by numcodecs' ensure_ndarray, and it also avoids the numpy array tobytes conversion. I tested with ensure_ndarray and with the default zfp tests.
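The saving comes from not copying the compressed stream before handing it over. A minimal standard-library sketch of the difference, with bytes(...) playing the role of an ndarray's tobytes() and memoryview playing the role of a buffer-protocol argument; the variable names are illustrative only.

```python
payload = bytearray(b"compressed stream")

copied = bytes(payload)       # like tobytes(): allocates new memory and copies
shared = memoryview(payload)  # buffer-protocol view: refers to the same memory

# Modify the original storage to show which object still tracks it.
payload[0] = ord("C")

print(copied[:1])          # b'c'  (the copy kept the old contents)
print(bytes(shared[:1]))   # b'C'  (the view sees the change, so nothing was copied)
```

For large compressed arrays the avoided copy saves both time and peak memory.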