
Handle un-update-able edges #690

Open
wks opened this issue Nov 3, 2022 · 0 comments
Labels
A-meta Area: Meta issues for the repository P-normal Priority: Normal.

wks commented Nov 3, 2022

Some VMs, such as the official Ruby interpreter, contain edges (roots or fields) that cannot be updated during GC, for various reasons.

  • Stacks are scanned conservatively.
  • Some global data structures contain fields that cannot be updated.
  • Some objects contain fields that cannot be updated.
    • We call such an object a "potential pinning parent", or PPP for short.

(The reason why a particular VM cannot update those roots or fields varies. For Ruby, see this issue: mmtk/mmtk-ruby#18)

Any object pointed to by one or more of those edges (roots or fields) cannot be moved. Such objects are pinned for the duration of this GC only; they may or may not be pinned in the next GC.

In the diagram below, red arrows represent pinning edges (roots or fields), and objects pointed to by red arrows are red, meaning that they cannot move. Note that:

  1. Pinning is not transitive.
  2. Being a PPP doesn't mean the PPP itself cannot move.
  3. Not all fields of a PPP are pinning fields.
  4. An object can be pointed to by pinning and non-pinning edges at the same time.
  5. Although dead objects will never be used again, we have to pin objects pointed to by dead PPPs because we don't know they are dead until the end of GC, after the transitive closure is computed and finalisable objects are resurrected.
  6. Pinning doesn't keep an object alive. An object may be pinned only by dead PPPs.

[Diagram: pin2 - 1]

If we know an object is pinned beforehand, when tracing through an edge to that object, we will not move that object.

However, during the computation of the transitive closure, un-update-able (pinning) edges are gradually discovered. So if an object is first reached from an update-able edge, then from an un-update-able edge, the object may have been moved when first reached, breaking the second edge.

For this reason, the GC must be able to identify all unmovable objects before tracing through any update-able edges.
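The ordering requirement can be illustrated with a small sketch. It is a toy model, not MMTk code: `ObjId`, `Edge`, `find_pinned` and `trace` are hypothetical names, and all edges are assumed to be known up front, whereas a real GC discovers them while tracing.

```rust
use std::collections::HashSet;

/// Hypothetical toy object id; not an MMTk type.
type ObjId = usize;

/// An edge to `target`. `updatable` is false for pinning edges.
#[derive(Clone, Copy)]
struct Edge {
    target: ObjId,
    updatable: bool,
}

/// Phase 1: collect every object referenced by an un-update-able edge.
fn find_pinned(edges: &[Edge]) -> HashSet<ObjId> {
    edges
        .iter()
        .filter(|e| !e.updatable)
        .map(|e| e.target)
        .collect()
}

/// Phase 2: visit edges in discovery order and return the objects that
/// would be moved. An object is moved only if it is not in `pinned`.
fn trace(edges: &[Edge], pinned: &HashSet<ObjId>) -> HashSet<ObjId> {
    let mut moved = HashSet::new();
    for e in edges {
        if !pinned.contains(&e.target) {
            moved.insert(e.target);
        }
    }
    moved
}

fn main() {
    // Object 7 is reached first through an update-able edge and then
    // through a pinning edge. Object 8 is only reached through an
    // update-able edge.
    let edges = [
        Edge { target: 7, updatable: true },
        Edge { target: 7, updatable: false },
        Edge { target: 8, updatable: true },
    ];
    let pinned = find_pinned(&edges);
    let moved = trace(&edges, &pinned);
    assert!(!moved.contains(&7));
    assert!(moved.contains(&8));
}
```

Without the first phase, object 7 would already have been moved by the time its pinning edge was visited.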

Identify objects pointed by un-update-able edges

There are two ways to identify objects pinned by un-update-able edges:

  1. Via write barriers. When updating an un-update-able edge (root or field), we record the pointed object.
  2. Via parents. We enumerate all objects that have un-update-able fields, and pin the objects pointed to by those un-update-able fields. We also enumerate un-update-able roots and pin the objects they point to.

Although method (1) is generally possible, it is not practical for Ruby, because most PPPs come from third-party C extensions which we cannot control. Those C extensions were developed long before Ruby introduced copying GC, and are therefore not aware of object movement. What's worse, the official Ruby documentation discourages C extension developers from inserting write barriers.

Method (2) is viable for Ruby. Only a few kinds of objects can be PPPs (see mmtk/mmtk-ruby#18). Therefore, the Ruby VM can record an object into a list of PPPs when

  1. an instance of a known PPP type is created, or when
  2. an object (which wasn't a PPP when created) becomes a PPP.

The list of PPPs can be traversed before GC to pin those objects. The list can be updated after the GC to remove dead objects from the list.
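A minimal sketch of such a list follows, assuming a hypothetical VM-side `PppRegistry`. The closures `pinning_children` and `is_live` stand in for VM-specific field scanning and GC liveness queries; none of these names come from Ruby or MMTk.

```rust
use std::collections::HashSet;

/// Hypothetical toy object id; not an MMTk type.
type ObjId = usize;

/// Hypothetical VM-side registry of potential pinning parents (PPPs).
#[derive(Default)]
struct PppRegistry {
    ppps: Vec<ObjId>,
}

impl PppRegistry {
    /// Called when an instance of a known PPP type is created,
    /// or when an existing object becomes a PPP.
    fn register(&mut self, obj: ObjId) {
        self.ppps.push(obj);
    }

    /// Before GC: gather every object referenced by an un-update-able
    /// field of a registered PPP. These objects must not move in this GC.
    fn pin_children(&self, pinning_children: impl Fn(ObjId) -> Vec<ObjId>) -> HashSet<ObjId> {
        let mut pinned = HashSet::new();
        for &ppp in &self.ppps {
            pinned.extend(pinning_children(ppp));
        }
        pinned
    }

    /// After GC: drop PPPs that did not survive.
    fn remove_dead(&mut self, is_live: impl Fn(ObjId) -> bool) {
        self.ppps.retain(|&p| is_live(p));
    }
}

fn main() {
    let mut reg = PppRegistry::default();
    reg.register(1);
    reg.register(2);

    // PPP 1 pins objects 10 and 11; PPP 2 pins object 11.
    let pinned = reg.pin_children(|p| if p == 1 { vec![10, 11] } else { vec![11] });
    assert_eq!(pinned.len(), 2);

    // Suppose only PPP 2 survives the GC.
    reg.remove_dead(|p| p == 2);
    assert_eq!(reg.ppps, vec![2]);
}
```

The sketch does not deduplicate registrations and does not handle PPPs that themselves move; a real list would also need its entries forwarded to their new addresses.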

Metadata and API needed

Pinning for handling un-update-able edges needs one bit of metadata per object. This metadata can be in the header or on the side, but it is set by the VM, not MMTk core, because only the VM knows whether an object can move or not. It is generally easier to put it into the header because that doesn't need MMTk core's assistance. However, for VMs that don't have vacant header bits (such as Ruby), MMTk core may help the VM allocate one-bit-per-object side metadata for it to use.
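To make the "on the side" option concrete, here is a rough sketch of a one-bit-per-object side table. It is an illustration only and does not use MMTk's actual side metadata mechanism; `PinBits`, `GRANULE` and the 8-byte object alignment are assumptions.

```rust
/// Assumed minimum object alignment in bytes (hypothetical).
const GRANULE: usize = 8;

/// Hypothetical side table with one pin bit per `GRANULE`-aligned slot.
struct PinBits {
    base: usize,
    bits: Vec<u64>,
}

impl PinBits {
    /// Cover `heap_bytes` bytes of heap starting at address `base`.
    fn new(base: usize, heap_bytes: usize) -> Self {
        let slots = heap_bytes / GRANULE;
        PinBits { base, bits: vec![0; (slots + 63) / 64] }
    }

    /// Map an object address to a (word index, bit mask) pair.
    fn index(&self, addr: usize) -> (usize, u64) {
        let slot = (addr - self.base) / GRANULE;
        (slot / 64, 1u64 << (slot % 64))
    }

    fn pin(&mut self, addr: usize) {
        let (word, mask) = self.index(addr);
        self.bits[word] |= mask;
    }

    fn unpin(&mut self, addr: usize) {
        let (word, mask) = self.index(addr);
        self.bits[word] &= !mask;
    }

    fn is_pinned(&self, addr: usize) -> bool {
        let (word, mask) = self.index(addr);
        self.bits[word] & mask != 0
    }
}

fn main() {
    let mut pins = PinBits::new(0x1000, 1024);
    pins.pin(0x1008);
    assert!(pins.is_pinned(0x1008));
    assert!(!pins.is_pinned(0x1010));
    pins.unpin(0x1008);
    assert!(!pins.is_pinned(0x1008));
}
```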

API-wise, we need to add a method to the VM binding API to ask the VM whether an object can move.

trait ObjectModel {
    /// Return true if the given `object` can be moved in this GC.
    /// This function is called during GC.  If it returns false, the object will not be moved.
    /// This function is only called by copying GC that supports object pinning, such as Immix.
    fn can_move(object: ObjectReference) -> bool;
}
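How a copying space might consult this method can be sketched as follows. This is a self-contained toy: `ObjectReference` and `ObjectModel` here are local stand-ins that mirror the proposed API rather than the MMTk types, and `ToyVm`, `PINNED` and `trace_object` are hypothetical.

```rust
use std::sync::Mutex;

/// Local stand-in for MMTk's object reference type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjectReference(usize);

/// Local mirror of the proposed API.
trait ObjectModel {
    fn can_move(object: ObjectReference) -> bool;
}

/// Hypothetical pin set maintained by the VM before GC.
static PINNED: Mutex<Vec<usize>> = Mutex::new(Vec::new());

struct ToyVm;

impl ObjectModel for ToyVm {
    fn can_move(object: ObjectReference) -> bool {
        !PINNED.lock().unwrap().contains(&object.0)
    }
}

/// Toy tracing step: copy the object only if the VM allows it to move.
/// Returns the (possibly new) location of the object.
fn trace_object<VM, F>(object: ObjectReference, copy: F) -> ObjectReference
where
    VM: ObjectModel,
    F: FnOnce(ObjectReference) -> ObjectReference,
{
    if VM::can_move(object) {
        copy(object)
    } else {
        object
    }
}

fn main() {
    PINNED.lock().unwrap().push(0x100);
    // Pretend that copying relocates an object by 0x1000 bytes.
    let copy = |o: ObjectReference| ObjectReference(o.0 + 0x1000);

    let pinned = trace_object::<ToyVm, _>(ObjectReference(0x100), copy);
    let movable = trace_object::<ToyVm, _>(ObjectReference(0x200), copy);
    assert_eq!(pinned, ObjectReference(0x100));
    assert_eq!(movable, ObjectReference(0x1200));
}
```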

We also need two hooks to be executed before and after GC.

  1. One to tell the VM to traverse the PPP list and the un-update-able roots before GC to pin objects.
    • (update: We don't need to pin un-update-able roots here. They are roots, and can be delivered through RootsWorkFactory::create_process_node_roots_work which implies pinning.)
  2. The other to tell the VM to visit the PPP list again after GC to unpin objects and remove dead objects.

The VM needs to implement two API functions, one for each of these two points in time.

trait Collection {
    /// Called before scanning roots to pin objects for this GC.
    fn early_pin_objects();

    /// Called after the liveness and the new location of objects are determined.
    /// This gives the VM a chance to update the pinning set.
    /// `update` is a call-back function.  It takes an object reference.
    /// -   If the object is live, return `Some(newref)` where `newref` is its new location;
    /// -   If it is dead, return None.
    fn update_pin_set<F>(update: F) where F: Fn(ObjectReference) -> Option<ObjectReference>;
}
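On the VM side, the post-GC hook could be implemented roughly like this. The sketch assumes the VM keeps its PPPs in a plain vector; `PppList` is hypothetical, `ObjectReference` is a local stand-in, and the method takes `&mut self` for simplicity whereas the proposed API is a static trait function.

```rust
/// Local stand-in for MMTk's object reference type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjectReference(usize);

/// Hypothetical VM-side list of potential pinning parents.
struct PppList(Vec<ObjectReference>);

impl PppList {
    /// Apply the GC-provided `update` call-back to every entry.
    /// Live entries are replaced by their new location and dead
    /// entries are dropped from the list.
    fn update_pin_set<F>(&mut self, update: F)
    where
        F: Fn(ObjectReference) -> Option<ObjectReference>,
    {
        self.0 = self.0.iter().filter_map(|&obj| update(obj)).collect();
    }
}

fn main() {
    let mut list = PppList(vec![
        ObjectReference(1),
        ObjectReference(2),
        ObjectReference(3),
    ]);

    // Toy outcome of a GC: object 1 stays in place, object 2 is dead,
    // and object 3 has moved to address 30.
    list.update_pin_set(|obj| match obj.0 {
        1 => Some(obj),
        3 => Some(ObjectReference(30)),
        _ => None,
    });

    assert_eq!(list.0, vec![ObjectReference(1), ObjectReference(30)]);
}
```

The same pass is a natural place for the VM to clear the pin bits it set before the GC.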

Interaction with mutator-level pinning

Some programming languages support explicit pinning. For example, in C#, the fixed statement can be used to pin a byte buffer and pass it to system calls.

unsafe {
    byte[] buffer = new byte[...];
    fixed (byte* ptr = buffer) {
        // buffer is pinned, and ptr is the address of buffer.
    }
}

The difference between mutator-level pinning and Ruby's un-update-able fields is that mutator-level pinning also guarantees that the pinned object stays alive (so that the object is valid throughout the fixed block). Ruby's un-update-able fields do not keep objects alive, because the PPPs that have such fields may already be dead during this GC, and will not pin the same object in the next GC.

Mutator-level pinning can be handled by the existing API RootsWorkFactory::create_process_node_roots_work. The VM can call this API during Scanning::scan_vm_specific_roots or Scanning::scan_stack_roots to deliver such "node roots", which will be implicitly pinned by the GC.

On the GC side, mutator-level pinning can be handled by Immix's mark bit (LOCAL_MARK_BIT_SPEC), which forbids moving an object (see ImmixSpace::trace_object_with_opportunistic_copy). Un-update-able fields, however, need to be handled by the new ObjectModel::can_move API because they don't keep the object alive.
