Add dev docs on caching parent objects #3619

Merged
merged 1 commit into from
Apr 22, 2024
1 change: 1 addition & 0 deletions docs/doc.main
@@ -264,6 +264,7 @@
"DeveloperDocumentation/documentation.md",
"DeveloperDocumentation/printing_details.md",
"DeveloperDocumentation/debugging.md",
"DeveloperDocumentation/caching.md",
"DeveloperDocumentation/serialization.md",
"DeveloperDocumentation/design_decisions.md",
"DeveloperDocumentation/gap_integration.md",
105 changes: 105 additions & 0 deletions docs/src/DeveloperDocumentation/caching.md
@@ -0,0 +1,105 @@
```@meta
CurrentModule = Oscar
```

# Caching parent objects in OSCAR

Many functions in OSCAR that construct parent objects (such as rings, modules,
groups, etc.) have an optional keyword argument `cached::Bool`. If set to
`true` then the object is put into a cache, and when the construction function
is later called again with identical inputs, then the cached object is
returned instead of creating a new object. In contrast, when `cached` is set to
`false`, a new object is returned each time.

Example:
```jldoctest
julia> R1, = polynomial_ring(QQ, :x; cached = true);

julia> R2, = polynomial_ring(QQ, :x; cached = true);

julia> R1 === R2 # identical as both were created with `cached = true`
true

julia> R3, = polynomial_ring(QQ, :x; cached = false);

julia> R1 === R3 # not identical as R3 was created with `cached = false`
false

julia> R4, = polynomial_ring(QQ, :y; cached = true);

julia> R1 === R4 # not identical despite `cached = true` due to differing variable names
false
```

## Why cache parent objects?

The main reason for supporting caching of parent objects is **user convenience**:
experience shows that most mathematicians (especially those who are not also
programmers, though it really affects everyone) are surprised if, say,
`QQ[:x] == QQ[:x]` produces `false`.

For interactive use, it is often simply convenient: e.g. in the following example,
we use `map_coefficients` to map polynomials over the integers to polynomials
over a finite field, and the results can be added -- this is only possible because
the new polynomials have the same parent, thanks to caching.
```jldoctest
julia> Zx, x = ZZ["x"]
(Univariate polynomial ring in x over ZZ, x)

julia> F = GF(2);

julia> map_coefficients(F, x^2) + map_coefficients(F, x)
x^2 + x
```

Caching parents also has downsides. E.g. all those cached objects take up memory which
in some cases can add up to significant amounts.
Comment on lines +55 to +56
Member

@joschmitt joschmitt Apr 17, 2024

Tommy also mentioned issues with parallelization/thread safety as a (possible) downside to me.

(And I don't really understand how the memory could fill up if we use these WeakValueDicts? But that's more for my curiosity.)

Member Author

Unfortunately not all our caches actually use WeakValueDict. We introduced it much later and e.g. Hecke still uses Dict and IdDict caches quite a bit (I just grepped for get_cached! to find examples for caches and then checked their definitions). Might be a good idea for somebody to clean that up.

Member

I hope that I got all caching dictionaries which weren't WeakValueDicts in Nemocas/Nemo.jl#1724 and thofma/Hecke.jl#1460.



## Rules for implementations

In the following we describe some rules related to caching for people implementing
parent constructor functions.

1. Don't use caching in code inside OSCAR (caching is for end users!)
   - i.e., code inside OSCAR by default should always construct rings with `cached = false`.
   - In other words: internal code should not rely on caching being active.
     Usually the need for using cached parents can be overcome by allowing callers to
     pass in a parent object as an additional function argument. One may still provide a
     default value for that as a user convenience, but these default parents then should
     be created with `cached = false`.
   - Rationale: this avoids clogging the system with cached objects the user never asked
     for. It also eliminates sources of bugs: a cached ring may have attributes assigned
     that modify its behavior in a way that is completely unexpected in code dealing
     with a "newly created" ring.
2. All end-user facing constructors should have a `cached::Bool` keyword argument
   with a default value, regardless of whether caching is actually supported or not.
   - if caching is supported, then `cached` should default to *true*
   - if caching is not supported, then `cached` should default to *false*
   - Rationale: this allows us to comply proactively with the first rule: when creating
     a parent object, you always pass in `cached = false`. If not all constructors
     support this, we can't comply with it. Even if a constructor does not support
     caching right now, this might change in the future. So by allowing the `cached`
     argument in all cases, we can write future-proof code.
3. Caches must not overflow
   - the simplest solution to achieve this is to use an `AbstractAlgebra.CacheDictType`
     instance (which really is an alias for `WeakValueDict`) together with `get_cached!`;
     such a cache automatically drops objects once nothing outside the cache references
     them anymore
   - Alternatively one may offer a manual way for users to "flush" caches, but beware
     the problems this can cause when code relies on parents being cached -- yet another
     reason for rule 1.
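
To make rules 2 and 3 concrete, here is a minimal sketch in plain Julia. It is
not how OSCAR or AbstractAlgebra implement their caches: `ToyRing`,
`TOY_RING_CACHE` and `toy_ring` are hypothetical names invented for this
illustration, and a `Dict` of `WeakRef` values merely stands in for a
`WeakValueDict`.
```julia
# Hypothetical toy parent type; `mutable` so that every instance has its own identity.
mutable struct ToyRing
    var::Symbol
end

# The cache stores only weak references, so an entry never keeps its ring alive (rule 3).
# This toy cache is not thread-safe and does not prune dead entries.
const TOY_RING_CACHE = Dict{Symbol, WeakRef}()

# Rule 2: a `cached::Bool` keyword with a default value.
function toy_ring(var::Symbol; cached::Bool = true)
    # `cached = false` bypasses the cache and always builds a fresh object.
    cached || return ToyRing(var)
    ref = get(TOY_RING_CACHE, var, nothing)
    if ref !== nothing
        R = ref.value              # `nothing` once the ring has been garbage collected
        R isa ToyRing && return R
    end
    R = ToyRing(var)
    TOY_RING_CACHE[var] = WeakRef(R)
    return R
end

R1 = toy_ring(:x)
R2 = toy_ring(:x)
R3 = toy_ring(:x; cached = false)
R1 === R2   # true: R1 is still referenced, so the cached entry is alive
R1 === R3   # false: R3 was created with `cached = false`
```
Real code should prefer the existing `CacheDictType` and `get_cached!` helpers
over a hand-rolled cache like this one.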

For convenience, `Hecke` also defines these "standard rings" for use in functions
like `cyclotomic_polynomial`:
```
module Globals
using Hecke
const Qx, _ = polynomial_ring(FlintQQ, "x", cached = false)
const Zx, _ = polynomial_ring(FlintZZ, "x", cached = false)
const Zxy, _ = polynomial_ring(FlintZZ, ["x", "y"], cached = false)
end
```
You can use these in your own code as well, or imitate this pattern if convenient.
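
As a rough sketch of imitating this pattern while following rule 1, a package
could define its own uncached default ring and let callers pass in a different
parent. The module `MyGlobals` and the function `shifted_square` are
hypothetical names used only for this illustration.
```julia
using Oscar

# Hypothetical module holding an uncached "standard ring", built once at load time.
module MyGlobals
using Oscar
const Qt, _ = polynomial_ring(QQ, :t; cached = false)
end

# Hypothetical internal helper: the parent is an explicit argument, and the
# default is the uncached ring above, so nothing here relies on caching.
function shifted_square(a::Int, R = MyGlobals.Qt)
    t = gen(R)
    return (t + a)^2
end

f = shifted_square(1)                          # lives in MyGlobals.Qt
S, x = polynomial_ring(QQ, :x; cached = false)
g = shifted_square(2, S)                       # lives in the caller's ring S
```
Because the parent is always passed explicitly or taken from the fixed default,
results of repeated calls share a parent without any cache lookup.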

As always, if in doubt what to do, please ask.