Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename is_pinned, pin_object and unpin_object #917

Open
wks opened this issue Aug 22, 2023 · 2 comments
Open

Rename is_pinned, pin_object and unpin_object #917

wks opened this issue Aug 22, 2023 · 2 comments
Labels
P-normal Priority: Normal.

Comments

@wks
Copy link
Collaborator

wks commented Aug 22, 2023

In the memory_manager module, there are functions is_pinned, pin_object and unpin_object. Those functions operate on the pin bit metadata.

One problem with those functions is that those function names give the user an impression that it is "the right way" or "the only way" to pin object. However, with the PR #897, it will no longer be true. The preferred way to pin an object and also keep the object alive is adding those objects into the non-transitively pining root set (previously known as "black" roots). The pinning bits are still useful for pinning objects while not keeping them alive.

We should rename those functions so that it is clear that those functions operate on the "pin bits" metadata.

  • is_pinned -> is_pin_bit_set keep as is_pinned, or remove
  • pin_object -> set_pin_bit set_pinned
  • unpin_object -> clear_pin_bit unset_pinned

Since it is an API-breaking change, we can move those functions to the util::metadata::pin_bit module and make those functions public in order to make the API idiomatic to Rust (#658).

Update: Instead of mentioning "pin bit" in the API, we can tell the user that each object has a "pinned" state which can be set and unset. The state may be implemented as the pin bit (ObjectModel::LOCAL_PINNING_BIT_SPEC), but it is a local metadata, and spaces don't necessarily use it.

Update: We may consider removing the is_pinned function. See comments below.

@qinsoon
Copy link
Member

qinsoon commented Aug 22, 2023

I think we discussed about the pin bit before. At that point, the idea was that we should not expose the existence of pin bit, but rather provide an abstract pinning semantic. The reason behind it was that we may not actually have a 'pin bit' for some policies. For example, we may use one state in the 2-bit forwarding bits to represent the pinned state. And for non moving policies, they do not even need a pin bit. However, at that point, we did not have 'transitively pin'. So all the discussion was about the normal pinning semantic that pins one object.

I think now we just need a term to describe the normal pinning, to differentiate from the transitively pinning. If we use 'pin bit', probably we probably would like to describe it as a 'logical' pin bit, and it does not mean we actually use a 'pin bit'. But that also sounds confusing. My preference is that we find a term to describe the semantic.

@wks
Copy link
Collaborator Author

wks commented Aug 23, 2023

We may consider removing the is_pinned API.

TL;DR: The user cannot reliably use is_pinned for anything. And we don't need to call is_pinned in order to decide if we need to unpin an object after pinning.

Problems

is_pinned has two problems. One is inconsistent semantics, and the other is the TOCTTOU problem.

Inconsistent semantics

Its semantics is not consistent with the other two API, namely pin_object and unpin_object. From the user's point of view, the function name is_pinned can be interpreted in two ways:

  1. It returns the status whether an object is pinned.
  2. It returns whether the VM binding has called pin_object on that object before.

It makes a different for spaces that never move objects, such as MarkSweepSpace. For interpretation (1), if an object is in such space, is_pinned should always return true, but pin_object should always fails (returns false, in the sense that it is already pinned). For interpretation (2), is_pinned should always return false because the user can never successfully call pin_object.

It is tempting for VM bindings to use is_pinned to test if it needs to unpin an object after pinning it. However, there is another problem.

Time-of-check to time-of-use (TOCTTOU)

FYI: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

In a multi-threaded environment, multiple threads can attempt to pin or unpin objects concurrently. If one thread calls is_pinned and it returns true, it may attempt to unpin the object. But many things can happen between it sees is_pinned returning true and it actually calls unpin_object. A second thread can unpin the same object, and a third thread can pin the same object again. If object pinning is needed in a multi-threaded environment, the VM binding needs to do the synchronisation properly.

Because of this, is_pinned cannot be used for anything. For example, it cannot be used for native interface. If it see an object already pinned, the VM binding cannot simply pass it to the C code without making sure other threads do not unpin the same object before the C code returns. The recommended way for pinning objects for FFI (pin and keep alive) is adding such objects into the pinned root set (formerly known as "black roots"). See this PR: #897

Who is responsible for unpinning?

How to decide whether any component of the VM binding (a thread, or a step of processing) should unpin an object at a certain stage?

A rule of thumb is that whoever called pin_object and observed it returning true should call unpin_object. In other words, the return value of pin_object dictates the responsibility to call unpin_object.

For example, if the VM binding maintains a list of objects to be pinned during GC (for example, children of "potentially pinning parents". See #690), but some objects have already be pinned during mutator phase, the VM binding wants to make sure that after the GC, any objects that have been pinned before GC remain pinned, and any object that were not pinned before GC remain un-pinned. To do this, the VM binding can enumerate objects to be pinned, call pin_object on each of them, and record the objects for which pin_object returned true (See this example). At the end of GC, the VM binding only unpins those recorded objects (See this example). In this way, the VM binding can unpin objects that need to be unpinned without calling is_pinned.

@qinsoon qinsoon added the P-normal Priority: Normal. label Oct 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-normal Priority: Normal.
Projects
None yet
Development

No branches or pull requests

2 participants