Fix for #2954 and alternative way to give values to memoized function #2955

pzinn · 2023-10-08T06:34:26Z

fix for memoizeValues issue #2954: now all memoized functions have the same function body. I ended up removing initialValues from the codeHelper since they appear in the list of symbols anyway.
I also implemented the suggestion in the comments of memoizeValues issue #2954 to give values to a memoized function after defining it and added an example in the doc.

d-torrance · 2023-10-08T17:10:15Z

Would it make things any simpler if memoize returned instances of some new subclass of FunctionClosure, maybe MemoizedFunction?

pzinn · 2023-10-08T23:03:51Z

yeah that's another option, I've been thinking about it. I see nothing wrong with it (in fact, I'm experimenting with something like this for mode-dependent functions like print or show). It'd be an easy change, something like adding

MemoizedFunctionClosure = new Type of FunctionClosure
...
new MemoizedFunctionClosure from ( x -> ( ... ) )

I don't know what others' opinions are...

mahrud · 2023-10-09T03:39:47Z

Like I said on Zulip, I wish memoize didn't cache results in non-garbage collected places, and until that is fixed I'd rather keep its use limited (e.g. strictly not used by packages, only for personal use).

I also pointed out that one problem of the Function Thing = ... notation is that it's really hard to search for in the code, and hence hard to figure out where something is being incorrectly cached when debugging. I used to find it cute, but because of this now I'd rather remove that syntax entirely (or at least limit it to personal use, not packages and the core).

pzinn · 2023-10-09T03:43:06Z

I don't understand your comment on storefuns: currently there's no interaction between memoize and storefuns.
As to the second part, I think this is fixable, so rather than remove or not use the feature, one should improve debugging tools.

mahrud · 2023-10-09T03:47:20Z

I don't understand your comment on storefuns: currently there's no interaction between memoize and storefuns.

Sorry, I got confused. I meant this non-garbage collected hash table in remember.m2:

          values := new MutableHashTable;

As to the second part, I think this is fixable, so rather than remove or not use the feature, one should improve debugging tools.

How do you propose debugging the situation that I described above? It's pretty much impossible to use search/grep/regex to find this. If you have a suggestion you should add that to the documentation as well.

pzinn · 2023-10-09T03:55:08Z

maybe I don't understand how garbage collecting works in M2. in the stupid example

fib=memoize(n -> if n <= 1 then 1 else fib(n-1) + fib(n-2))

why wouldn't values get gc-ed when say fib is removed?

mahrud · 2023-10-09T17:11:08Z

The trouble is that the memoized function is itself almost never removed. Suppose you have a memoized function myRank which takes in modules, and you have functions that generate semi-temporary modules over new rings and return the result of myRank. If people use memoize then the temporary rings and modules remain alive inside the values hash table and are not garbage collected until myRank is removed! You might say the correct thing is to use cacheValue, but there's no way to prevent users from memoizing functions that take in modules or rings!

pzinn · 2023-10-09T23:37:36Z

OK, there are cases when memoize shouldn't be used, and cases when it's perfectly fine (say, when arguments are numbers), we can all agree.

pzinn · 2023-10-09T23:43:45Z

incidentally I notice two strategies for accessing the closure of a function: either using frame, or using localDictionaries. what are the pros/cons of each?

pzinn · 2023-10-11T05:44:32Z

Also I would like to hear others' thoughts on @d-torrance 's suggestion to introduce say MemoizedFunctionClosure. It would definitely make the code of memoizeClear and memoizeValues cleaner and would in particular fix the issue that this PR is for.

pzinn · 2023-10-26T10:01:33Z

I've been thinking about the pros and cons of memoize vs cacheValue.
Maybe I'm mistaken, please correct me if I'm wrong (@mahrud ?) but one major issue of cacheValue is when it applies to objects which we can create multiple copies of, like modules. These objects will be undistinguishable with === and yet they'll each have their own cache table... which means if we compute something for one of them, it will not be memorized for use for any of the other undistinguishable objects. On the other hand, memoize will cache them all at once.

DanGrayson · 2023-10-26T19:46:41Z

Like I said on Zulip, I wish memoize didn't cache results in non-garbage collected places (so in particular not storefuns), and until that is fixed I'd rather keep its use limited (e.g. strictly not used by packages, only for personal use).

There are no non-garbage-collected places.

mahrud · 2023-10-26T21:03:42Z

There are no non-garbage-collected places.

What happens if you run this code?

f = memoize(n -> koszul vars(QQ[x_1..x_5]))
scan(10000, f)

The output is null, but the rings, modules, and complexes, are not garbage collected.

mahrud · 2023-10-26T21:22:25Z

Here's a more realistic example:

mbasis = memoize basis
hilb = (n, d) -> numcols mbasis(d, QQ(monoid[vars(1..n)]))
apply(100, hilb_5)

You can watch memoize eat up all of your memory.

To be clear, mbasis is a function I do use locally because with careful programming it can be extremely useful, but it would 100% be abused if something like this was in the Core.

mahrud · 2023-10-26T21:37:03Z

These objects will be undistinguishable with === and yet they'll each have their own cache table... which means if we compute something for one of them, it will not be memorized for use for any of the other undistinguishable objects. On the other hand, memoize will cache them all at once.

You're not wrong, but like in my example above, this is too easy to abuse if it were used in Core.

My preferred solution for this would be having an extra context object storing the cached information which can be garbage collected, so the user is forced to keep track of it. Maybe if instances of your proposed MemoizedFunctionClosure by default accepted an optional context object, like this:

f = memoize(n -> (...))
ctx = new MutableHashTable
f(10, Context => ctx)

In fact, the context object can be the parent ring, so that when the ring goes out of scope (and hence also the module), so does the cached data.

pzinn · 2023-10-26T23:09:30Z

for the record, this whole discussion is not directly related to the PR, which does not change the way memoize works, but simply fixes a bug and adds a minor feature.

mahrud · 2023-10-28T04:58:34Z

Agreed. I'm not objecting to this PR. I do like the added features, and would use them personally.

As to the second part, I think this is fixable, so rather than remove or not use the feature, one should improve debugging tools.

The specific reason I prefer avoiding using the Function Thing = ... notation in Core and packages is very negative experiences with bugs caused by this line in particular:

M2/M2/Macaulay2/m2/pushforward.m2

Lines 138 to 139 in 86d6c70

    
           -- cache poincare 
        
           poincare cokernel m1 = hf);

I think @mikestillman also mentioned that it's really difficult to debug issues with this kind of caching, so I'm curious if you have ideals for how to improve debugging tools.

pzinn · 2023-10-28T05:05:37Z

I do want to think about this debugging issue some more.
Your example reminded me of this 5-year old bug report: #732 -- I checked and the bug is still there ;-)

mahrud · 2023-11-11T22:47:21Z

Could you rebase on top of upstream/development and force push instead of having multiple merge commits? This way the history remains linear.

pzinn · 2023-11-11T22:52:01Z

what's the exact rebase command that's needed? so I don't spend too much time on this...

pzinn · 2023-11-11T22:55:18Z

BTW the only reason I'm doing all these pulls, as you probably guessed, is that the build keeps failing for reasons that are most likely unrelated to the PR itself.

mahrud · 2023-11-11T23:04:16Z

I was actually confused about why you merged in the first place, since it shows that your first commit (before the merges) passed all tests:

The command is:

git fetch upstream
git rebase upstream/development
git push --force

If rebase runs into conflicts and you don't know how to fix them, you could also just go back in history to the first commit that passed:

git reset --hard HEAD~2
git push --force

pzinn · 2023-11-12T00:41:31Z

thanks, done. and now I'm really confused, thought the builds had always failed...

mahrud mentioned this pull request Oct 21, 2023

Add one-argument hilbertFunction method that returns a function #2971

Merged

fix memoizedValues + alternative memoize initial values

b257a59

pzinn force-pushed the main branch from 7fa7fb5 to b257a59 Compare November 12, 2023 00:40

DanGrayson merged commit 3be5db8 into Macaulay2:development Nov 21, 2023
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix for #2954 and alternative way to give values to memoized function #2955

Fix for #2954 and alternative way to give values to memoized function #2955

pzinn commented Oct 8, 2023

d-torrance commented Oct 8, 2023

pzinn commented Oct 8, 2023

mahrud commented Oct 9, 2023 •

edited

Loading

pzinn commented Oct 9, 2023

mahrud commented Oct 9, 2023

pzinn commented Oct 9, 2023

mahrud commented Oct 9, 2023

pzinn commented Oct 9, 2023

pzinn commented Oct 9, 2023

pzinn commented Oct 11, 2023 •

edited

Loading

pzinn commented Oct 26, 2023

DanGrayson commented Oct 26, 2023

mahrud commented Oct 26, 2023 •

edited

Loading

mahrud commented Oct 26, 2023

mahrud commented Oct 26, 2023

pzinn commented Oct 26, 2023 •

edited

Loading

mahrud commented Oct 28, 2023

pzinn commented Oct 28, 2023

mahrud commented Nov 11, 2023

pzinn commented Nov 11, 2023

pzinn commented Nov 11, 2023

mahrud commented Nov 11, 2023 •

edited

Loading

pzinn commented Nov 12, 2023

Fix for #2954 and alternative way to give values to memoized function #2955

Fix for #2954 and alternative way to give values to memoized function #2955

Conversation

pzinn commented Oct 8, 2023

d-torrance commented Oct 8, 2023

pzinn commented Oct 8, 2023

mahrud commented Oct 9, 2023 • edited Loading

pzinn commented Oct 9, 2023

mahrud commented Oct 9, 2023

pzinn commented Oct 9, 2023

mahrud commented Oct 9, 2023

pzinn commented Oct 9, 2023

pzinn commented Oct 9, 2023

pzinn commented Oct 11, 2023 • edited Loading

pzinn commented Oct 26, 2023

DanGrayson commented Oct 26, 2023

mahrud commented Oct 26, 2023 • edited Loading

mahrud commented Oct 26, 2023

mahrud commented Oct 26, 2023

pzinn commented Oct 26, 2023 • edited Loading

mahrud commented Oct 28, 2023

pzinn commented Oct 28, 2023

mahrud commented Nov 11, 2023

pzinn commented Nov 11, 2023

pzinn commented Nov 11, 2023

mahrud commented Nov 11, 2023 • edited Loading

pzinn commented Nov 12, 2023

mahrud commented Oct 9, 2023 •

edited

Loading

pzinn commented Oct 11, 2023 •

edited

Loading

mahrud commented Oct 26, 2023 •

edited

Loading

pzinn commented Oct 26, 2023 •

edited

Loading

mahrud commented Nov 11, 2023 •

edited

Loading