Using deprecated/dangerous method to set 'data' in numpy array #17
Comments
Yes, that is right. We are making use of the data attribute for in-memory views of parameters. It seems that this will not be possible in the future. This appears to be the source of the memory issues reported on GPy (SheffieldML/GPy#341, SheffieldML/GPy#51), though this is not confirmed. Without this in-memory view, optimization slows down significantly, because getting and setting parameters has to go through Python loops. Can you think of a fix for this issue, or do you know of another way to get in-memory views of arrays? See also the Stack Overflow discussions about this.
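To make the term concrete, here is a minimal sketch of an in-memory view in plain NumPy. The names `lengthscale` and `variance` are made up for illustration, and the sketch uses ordinary slicing rather than the `data` assignment this issue is about, so it is not a description of how paramz does it.

```python
import numpy as np

# Hypothetical flat storage holding all parameters back to back.
param_array = np.zeros(5)

# Hypothetical per-parameter views created by basic slicing (no copy is made).
lengthscale = param_array[0:2]
variance = param_array[2:5]

# Writing through the flat array is visible in the views ...
param_array[:] = [1.0, 2.0, 3.0, 4.0, 5.0]
# ... and writing through a view is visible in the flat array.
variance[0] = 30.0

# Both views refer to the same memory as the flat array.
views_share_memory = (
    np.shares_memory(param_array, lengthscale)
    and np.shares_memory(param_array, variance)
)
```

Because no values are copied per parameter, an optimizer can update the flat array once and every parameter object sees the new values.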
Is there any reasonable workaround at the moment? Or should we just silence the warning in the meantime?
Silencing it is currently the only option, as numpy has not provided a workaround as of yet...
On 30 Dec 2019, at 12:39, Jakub Arnold wrote:
Is there any reasonable workaround at the moment? Or should we just silence the warning in the meantime?
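For anyone who wants to silence it locally, a minimal sketch using only the standard `warnings` module might look like this. The function `legacy_data_assignment` and its message are hypothetical stand-ins for whatever library call emits the warning, and `record=True` is only there so the effect can be inspected.

```python
import warnings

def legacy_data_assignment():
    # Hypothetical stand-in for library code that emits the deprecation warning.
    warnings.warn("assigning to 'data' is deprecated", DeprecationWarning)
    return "done"

with warnings.catch_warnings(record=True) as caught:
    # Start from a known state in which every warning would be shown ...
    warnings.simplefilter("always")
    # ... then ignore DeprecationWarning only; the newest filter is checked first.
    warnings.simplefilter("ignore", category=DeprecationWarning)
    result = legacy_data_assignment()

# Outside the "with" block the previous warning filters are restored.
```

Scoping the filter with `catch_warnings` keeps other deprecation warnings visible elsewhere in the program, which seems safer than a global filter.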
@mzwiessele Is there an upstream issue in numpy that we're waiting on to be fixed, so it can be tracked here?
Unfortunately there is only a vague issue:
numpy/numpy#7093
Which links to this thread:
https://mail.scipy.org/pipermail/numpy-discussion/2016-January/074708.html
On 12 Jan 2020, at 23:21, Jakub Arnold wrote:
@mzwiessele Is there an upstream issue in numpy that we're waiting on to be fixed, so it can be tracked here?
Hi there, I think I have worked out a possible solution for this problem, but I am not sure it is easy to integrate into the existing framework.
Although this seems reasonable to me, I am not sure it won't break some internal mechanics of the framework, which I honestly don't understand in depth yet (I get the overall mechanics of the system, but I am still trying to wrap my head around the logic behind the implementation of IndexOperations).
An alternative idea would be to decouple the description of a parameter from its actual data storage.
This is my impression, but it could use some correction: the issue appears to boil down to having a pointer "*A" (the
It took some reading to understand @g-bassetto's suggestion, and it makes some sense to me: essentially, we should provide a wrapper around the ndarray.data buffer that keeps it stable while allowing us to get and set the array's contents. Does anyone see any issues with this approach? Not only would it get rid of the warning, it appears to address the root cause of the warning by more carefully handling the
The only change I might propose is the addition of some initialization of the buffer that would do away with the
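As a rough illustration of the wrapper idea (not @g-bassetto's actual code, and not an existing paramz class), a sketch could look like the following. `StableBuffer` and its method names are hypothetical; the only NumPy calls used are `np.zeros`, `np.asarray`, and `np.copyto`.

```python
import numpy as np

class StableBuffer:
    """Hypothetical wrapper that owns one array and never rebinds its memory."""

    def __init__(self, size):
        # Allocate once; every view handed out later refers to this memory.
        self._buffer = np.zeros(size)

    def view(self, start, stop):
        # Basic slicing returns a view, not a copy.
        return self._buffer[start:stop]

    def get(self):
        # Return a copy so callers do not hold an alias by accident.
        return self._buffer.copy()

    def set(self, values):
        # Overwrite the contents in place; raises if the shapes are incompatible.
        np.copyto(self._buffer, np.asarray(values, dtype=self._buffer.dtype))

buf = StableBuffer(4)
first_half = buf.view(0, 2)
buf.set([1.0, 2.0, 3.0, 4.0])
```

The view taken before `set` still reflects the new contents, because the underlying memory is only written to and never replaced.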
Edit: |
Also, the link to the numpy mailing list thread above is broken for me; here's a currently working link for anyone else having the same issue: http://numpy-discussion.10968.n7.nabble.com/deprecating-assignment-to-ndarray-data-td42176.html
Why not do an in-place modification of the memory underlying the array, e.g.
Or, even simpler, why not just do a raw copy? Would something like the following work? (from https://github.com/sods/paramz/blob/master/paramz/core/parameter_core.py#L290)
I'm guessing this approach incurs the performance issue discussed towards the top of the thread, though, by using Python loops. It seems like we might have two additional approaches:
The second approach is brutally just
By assigning the
EDIT: If I'm not too misled, the second solution I've sketched out matches up with @g-bassetto's second solution.
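For reference, a minimal sketch of what overwriting an array's memory in place looks like in plain NumPy (not necessarily the snippets originally posted here; `param` and `new_values` are placeholder names):

```python
import numpy as np

param = np.zeros(3)
view = param[:2]                     # a view onto the first two entries
address_before = param.ctypes.data   # address of the array's first element

new_values = np.array([1.0, 2.0, 3.0])

# In-place overwrite: copies the values into the existing memory
# instead of pointing the array at a different buffer.
param[...] = new_values

address_after = param.ctypes.data
```

Since the address is unchanged, views created earlier stay valid and see the new values, which is the property the `data` assignment was being used for.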
@ekalosak your second approach matches my second one, as you guessed. The reason why I think that having the
Do you think that what I said is sensible?
I think what you've said is sensible, but it also might be out of scope for the narrow issue. That said, if you see the current narrow issue as a stepping stone to backend agnosticism, then that just adds motivation to solve it. If I've understood correctly, there's an issue with simply directly reassigning the
If the Param were decoupled from the np.ndarray, I can sense we might be able to maintain memory continuity, but it's not clear how we'd do it aside from directly modifying the array's values in place, as is done now. We could gain more confidence that there aren't any stale views of the data that could cause segfaults, but at the same time, decoupling seems like a potentially unnecessary additional level of abstraction.
At the same time, @g-bassetto , I wonder if it wouldn't be easier to just implement EDIT
Results:
I don't quite understand why we wouldn't use the built-in slice overwriting. Seems like the
In NumPy 2.0,
In numpy 1.12.0, release notes:
This appears to happen where we assign directly to the data of param_array and gradient_full. I've noticed warnings from