Add doc file for JITServer AOT cache #20902

Merged: 1 commit, Jan 15, 2025


cjjdespres
Contributor

The doc/compiler/jitserver/AOTCache.md file contains some draft documentation on the JITServer AOT cache feature. Some of the sections still need to be cleaned up and expanded.

@cjjdespres
Contributor Author

Attn @mpirvu. This is adapted from my comment in #16721 (comment) on the changes to the AOT cache to allow it to function without a client SCC available. I added more information about the structure of the AOT cache in general, and erred on the side of too much detail. @dsouzai may also be interested.

Some of it is a work in progress, but please take a look and suggest improvements if you like.

mpirvu added the comp:doc and comp:jitserver (Artifacts related to JIT-as-a-Service project) labels on Jan 9, 2025
mpirvu self-assigned this on Jan 9, 2025
Contributor

mpirvu left a comment


Looks good. Please expand the two TODOs in the doc


# The JITServer AOT cache

The JITServer AOT cache is a feature of the JITServer that allows a server to share the same data (currently only AOT-compiled methods) with multiple client JVMs. This sort of sharing can save resources at the server, since it can serve cached data in response to a request instead of computing it on-demand, and can as a consequence improve server response times and client performance. The JITServer AOT cache relies on the existing AOT code relocation infrastructure, so we will go over the relevant properties of that mechanism in the next section, before moving on to an overview of the different components of the JITServer AOT cache.
Contributor


ROMClasses are also shared, so why not say directly "allows a server to share AOT-compiled methods with multiple client JVMs"?


Traditionally, the JITServer compiled AOT code on a per-client basis, and required a local SCC in order to perform these compilations. Whenever a relocation record needed a client offset, it would send the data to the client, have it store the data in its local SCC, get the offset to the data back, and store that offset in the relocation record. The resulting compiled method could be sent to the client and stored in its local SCC without issue, and the client could relocate it normally with its relo runtime and local SCC. This is still what happens when the client does not request that the compilation involve the JITServer AOT cache.
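
To make the traditional flow concrete, here is a minimal C++ sketch of the round trip described above. It is not OpenJ9 code: `ToyLocalSCC`, `ToyRelocationRecord`, and `compileForClient` are hypothetical names, the SCC is modelled as a simple data-to-offset table, and the network round trip to the client is modelled as a direct call.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one client's local SCC (not the OpenJ9 API).
class ToyLocalSCC {
public:
    // Store the data if it is not already present and return its offset.
    std::uint64_t store(const std::string &data) {
        auto it = offsetByData_.find(data);
        if (it != offsetByData_.end())
            return it->second;
        std::uint64_t offset = nextOffset_;
        nextOffset_ += data.size() + 1; // +1 so that every entry gets a distinct offset
        offsetByData_.emplace(data, offset);
        dataByOffset_.emplace(offset, data);
        return offset;
    }

    // What the client's relo runtime would do: turn an offset back into data.
    const std::string *lookup(std::uint64_t offset) const {
        auto it = dataByOffset_.find(offset);
        return it == dataByOffset_.end() ? nullptr : &it->second;
    }

private:
    std::uint64_t nextOffset_ = 0;
    std::unordered_map<std::string, std::uint64_t> offsetByData_;
    std::unordered_map<std::uint64_t, std::string> dataByOffset_;
};

// Hypothetical relocation record: it only carries an offset into one client's SCC.
struct ToyRelocationRecord {
    std::uint64_t sccOffset;
};

// Server side of a per-client AOT compilation. Each piece of data that needs a
// client offset would cost one message exchange; here it is a direct call.
std::vector<ToyRelocationRecord> compileForClient(ToyLocalSCC &clientSCC,
                                                  const std::vector<std::string> &neededData) {
    std::vector<ToyRelocationRecord> records;
    for (const auto &data : neededData)
        records.push_back(ToyRelocationRecord{clientSCC.store(data)});
    return records;
}
```

In this toy model the offsets are only meaningful relative to the SCC that produced them, so the same compilation performed for two clients whose SCCs have different contents can yield different relocation records.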

## Serialization records and the problem of sharing AOT code between JITServer clients
Contributor


"... between JITServer clients with different local SCCs"
Otherwise, if all clients connect to the same local SCC, things work.

@dsouzai
Contributor

dsouzai commented Jan 10, 2025

Thanks for writing this up, reading both the original paper and your doc has been very helpful in increasing my understanding of the overall structure and process of the AOTCache.

I had a couple of questions:

For the default case (deserializing without a local SCC), conceptually does the client need to do the deserialization i.e., theoretically, couldn't the server pre-emptively deserialize the records so that the client only needs to do relocation?

For the ClassChain deserialization records, why does it need to be a "RAM class chain"? Wouldn't the ROM classes be sufficient as it is in a local AOT compilation?

@cjjdespres
Contributor Author

For the default case (deserializing without a local SCC), conceptually does the client need to do the deserialization i.e., theoretically, couldn't the server pre-emptively deserialize the records so that the client only needs to do relocation?

Yes, the server could do this. The deserializer currently does three things: (1) maps serialization records to valid client objects, (2) updates offsets in relo records, and (3) gives the relo runtime a way of turning relo record offsets into client objects.

The server could do (1), though it would need to request any missing loader and class hierarchy information from the client for an AOT cache load. Maybe all of that information could be sent to the server by the client during class loading (in batches?). That might make tracking the dependencies of methods in the AOT cache at the server easier, if that were a goal, because the server would have a reasonably up-to-date picture of the client's hierarchy. The server would build up a cached record->client object map like the client deserializer does now.

For (2) and (3), there are a few choices I can think of:

  • Server relocates the methods completely and sends them to the client for it to install and use directly.
  • Server sends an offset->pointer mapping along with the methods, so a suitably-modified TR_J9DeserializerSharedCache could use it during relocation.
  • Server replaces the offsets in relo records with actual pointers to what the relo runtime would want given that offset, so, for instance, a Class offset would be replaced by a client J9Class pointer. We considered this when I was implementing the new deserializer, I think. The TR_J9DeserializerSharedCache would be replaced with a trivial SCC interface that returns the "offsets" it's given.

I should mention that the only reason (2) is necessary right now is because we don't flag AOT-cached methods as using client SCC offsets or server serialization record ID offsets, and a single server can store methods compiled for clients that do and do not ignore their local SCC. If either were changed then we wouldn't need (2) right now.
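
For illustration only, the three deserializer responsibilities listed above could be sketched in C++ as follows. This is a toy model, not the OpenJ9 implementation: `ToyDeserializer`, `ToyReloRecord`, and the idea that step (2) swaps a server record ID for an index into a client-local table are all assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using RecordId = std::uint64_t;     // hypothetical serialization record ID
using ClientObject = const void *;  // stand-in for a J9Class *, J9Method *, ...

// Hypothetical relo record: the offset field initially holds a server record ID.
struct ToyReloRecord {
    std::uint64_t offset;
};

class ToyDeserializer {
public:
    // (1) Remember which client object a serialization record resolved to.
    void cacheRecord(RecordId id, ClientObject obj) { resolved_[id] = obj; }

    // (2) Replace the record ID in each relo record with a client-local handle.
    // Returns false, leaving the records untouched, if any record is unresolved.
    bool updateOffsets(std::vector<ToyReloRecord> &records) {
        for (const auto &record : records)
            if (resolved_.find(record.offset) == resolved_.end())
                return false;
        for (auto &record : records) {
            table_.push_back(resolved_.at(record.offset));
            record.offset = table_.size() - 1;
        }
        return true;
    }

    // (3) What the relo runtime would call to turn an offset into a client object.
    ClientObject pointerFromOffset(std::uint64_t handle) const {
        return handle < table_.size() ? table_[static_cast<std::size_t>(handle)] : nullptr;
    }

private:
    std::unordered_map<RecordId, ClientObject> resolved_;
    std::vector<ClientObject> table_;
};
```

In terms of this sketch, the options listed above differ in where each step runs: the server could perform step (1) itself and then either ship the equivalent of `table_` with the method or write the pointers straight into the records.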

For the ClassChain deserialization records, why does it need to be a "RAM class chain"? Wouldn't the ROM classes be sufficient as it is in a local AOT compilation?

Yes, in reality it's the ROM classes that are of interest. RAM class chains were used before I started working on the AOT cache, actually, even when the client was required to use a local SCC. I think RAM class chains end up arising in AOT cache compilations because the server caches a lot of things by J9Class *, and builds up a J9Class * -> class serialization record association during compilation. They come up during deserialization because the client resolves class records into J9Class * values. In both cases you end up dealing with vector<J9Class *> values regularly.
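
A rough C++ sketch of how a chain of class record IDs might be resolved into a vector of class pointers and checked against the actual chain of the first class. The names `ToyClass`, `actualChain`, and `resolveClassChain` are hypothetical, and the hierarchy is assumed to be available as a pre-ordered list; the real data structures are different.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

using RecordId = std::uint64_t; // hypothetical Class serialization record ID

// Hypothetical stand-in for a J9Class.
struct ToyClass {
    std::string name;
    // Superclasses and interfaces, assumed to be listed in class chain order.
    std::vector<const ToyClass *> hierarchy;
};

using RamClassChain = std::vector<const ToyClass *>;

// The chain of a class: the class itself followed by its whole hierarchy.
RamClassChain actualChain(const ToyClass *clazz) {
    RamClassChain chain{clazz};
    chain.insert(chain.end(), clazz->hierarchy.begin(), clazz->hierarchy.end());
    return chain;
}

// Resolve a list of class record IDs into already-cached classes, then accept
// the result only if it equals the actual chain of the first class.
std::optional<RamClassChain> resolveClassChain(
        const std::vector<RecordId> &classRecordIds,
        const std::unordered_map<RecordId, const ToyClass *> &cachedClasses) {
    RamClassChain chain;
    for (RecordId id : classRecordIds) {
        auto it = cachedClasses.find(id);
        if (it == cachedClasses.end())
            return std::nullopt; // a class record has not been resolved yet
        chain.push_back(it->second);
    }
    if (chain.empty() || chain != actualChain(chain.front()))
        return std::nullopt; // the client's hierarchy differs from the recorded one
    return chain;
}
```

Comparing pointers is enough in this sketch because each cached class is assumed to have already been matched to its class record individually.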

cjjdespres marked this pull request as ready for review on January 14, 2025, 18:47
@cjjdespres
Contributor Author

I think that's it.

Contributor

mpirvu left a comment


Looks good. Please correct a couple of small typos and I will merge it.

1. `ClassLoader`. The dynamic JVM entity associated to it is a `J9ClassLoader`. This record identifies a loader using the name of the first class it loaded, which is *very* heuristic, and when the record is deserialized into a `J9ClassLoader`, that loader will not be guaranteed to resemble the compile-time loader much at all. The deserializer gets a candidate for this record by looking one up in the persistent class loader table by name.
2. `Class`. The dynamic JVM entity associated to it is a `J9Class`. This record identifies a class using the serialization record ID of the `ClassLoader` record for its defining class loader, the name of the class, and the SHA-256 hash of (a normalized representation of) its `J9ROMClass`. The deserializer gets a candidate for this record by looking up the class by name in the class loader that was cached for the loader record ID, and then checking that the hashes of the two classes' ROM classes match. See the `JITServerHelpers::packROMClass()` method for the details of ROM class packing. We need to transform the ROM class before hashing because the ROM class contains both too little information (it can omit Unicode strings) and too much information (it can contain intermediate class info and method debug info; these things are irrelevant to the JIT, so stripping them out improves `Class` deserialization success rates and, in turn, AOT cache deserialization success rates).
3. `Method`. The dynamic JVM entity associated to it is a `J9Method`. This record identifies a method using the serialization record ID of the `Class` record for its defining class, and the index of the method in its defining class. The deserializer gets a candidate by looking up the method by index in the `J9Class` that was already cached for the class record ID.
4. `ClassChain`. The dynamic JVM entity associated to it is a "RAM class chain", which is a vector whose first entry is a `J9Class *`, and whose subsequent entries are the entire class/interface hierarchy of that first class. (These are in thes ame order as local SCC class chains). This record identifies this chain with a list of the `Class` serialization record IDsfor these classes. The deserializer resolves this record by retrieving the already-cached classes for those class record IDs and making sure that the resulting chain is equal to the actual chain of the first class. You can think of this record (and its dependencies) as being a JVM-independent arbitrary class validation record.
Contributor


Small typos:
"These are in thes ame order"
"IDsfor"

4. `ClassChain`. The dynamic JVM entity associated to it is a "RAM class chain", which is a vector whose first entry is a `J9Class *`, and whose subsequent entries are the entire class/interface hierarchy of that first class. (These are in the same order as local SCC class chains.) This record identifies this chain with a list of the `Class` serialization record IDs for these classes. The deserializer resolves this record by retrieving the already-cached classes for those class record IDs and making sure that the resulting chain is equal to the actual chain of the first class. You can think of this record (and its dependencies) as being a JVM-independent arbitrary class validation record.
5. `WellKnownClasses`. This record type is not associated to a particular JVM entity or relo data offset. Rather, it is a list of `ClassChain` record IDs that is used to reconstruct the well-known classes chain of an AOT method compiled with SVM.
6. `AOTHeader`. This record type is not associated to a particular JVM entity or relo data offset. Rather, it records the entire `TR_AOTHeader` of a method to check for compatibility during deserialization.
7. `Thunk`. This record type is not associated to a particular relo data offset. Rather, it records an entire J2I thunk referenced in the compiled code of a method so the client can relocate and install it during deserialization, and so have the thunk be available for the relo runtime to find during relocation.
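
As a hedged illustration of the lookups described for `Class` and `Method` records, the following C++ sketch resolves a class by loader record ID, name, and hash, and then a method by index. Everything here is a toy stand-in: `ToyLoader`, `ToyClass`, `ToyRecordCache`, and the record structs are hypothetical, and a plain string takes the place of the SHA-256 hash of the packed ROM class.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

using RecordId = std::uint64_t; // hypothetical serialization record ID

struct ToyMethod { std::string name; };
struct ToyClass {
    std::string name;
    std::string packedRomClassHash; // stand-in for the hash of the packed ROM class
    std::vector<ToyMethod> methods;
};
struct ToyLoader { std::unordered_map<std::string, ToyClass> classesByName; };

// Hypothetical record layouts, reduced to the fields mentioned in the list above.
struct ClassRecord { RecordId loaderRecordId; std::string name; std::string packedRomClassHash; };
struct MethodRecord { RecordId classRecordId; std::size_t index; };

class ToyRecordCache {
public:
    void cacheLoader(RecordId id, const ToyLoader *loader) { loaders_[id] = loader; }

    // Look the class up by name in the loader cached for the loader record ID,
    // then accept it only if the hashes match. Caches the result under `id`.
    const ToyClass *resolveClass(RecordId id, const ClassRecord &rec) {
        auto loaderIt = loaders_.find(rec.loaderRecordId);
        if (loaderIt == loaders_.end())
            return nullptr;
        const auto &classes = loaderIt->second->classesByName;
        auto classIt = classes.find(rec.name);
        if (classIt == classes.end() || classIt->second.packedRomClassHash != rec.packedRomClassHash)
            return nullptr;
        classes_[id] = &classIt->second;
        return &classIt->second;
    }

    // Look the method up by index in the class cached for the class record ID.
    const ToyMethod *resolveMethod(const MethodRecord &rec) const {
        auto it = classes_.find(rec.classRecordId);
        if (it == classes_.end() || rec.index >= it->second->methods.size())
            return nullptr;
        return &it->second->methods[rec.index];
    }

private:
    std::unordered_map<RecordId, const ToyLoader *> loaders_;
    std::unordered_map<RecordId, const ToyClass *> classes_;
};
```

The sketch keeps the dependency order implied by the list: a `Method` record can only be resolved after its `Class` record, which in turn needs its `ClassLoader` record.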
Contributor


This doesn't read well: "and so have the thunk be available"

Signed-off-by: Christian Despres <[email protected]>
mpirvu merged commit 81a325b into eclipse-openj9:master on Jan 15, 2025
2 checks passed
Labels
comp:doc comp:jitserver Artifacts related to JIT-as-a-Service project
Projects
Status: Done

3 participants