Overhaul implementation of for-generators #844
Motivation:
- Perform same exception handling for every implementation of PklRootNode.execute().
- Avoid code duplication.

Changes:
- Change PklRootNode.execute() to be a final method that performs exception handling and calls abstract method executeImpl(), which is implemented by subclasses.
- Remove executeBody() methods, which served a similar purpose but were more limited.
- Remove duplicate exception handling code.

Result:
- More reliable exception handling. This should fix known problems such as misclassifying stack overflows as internal errors and displaying errors without a stack trace.
- Less code duplication.
Motivation:
* fix known bugs and limitations of for-generators
* improve code health by removing complex workarounds

Changes:
* simplify AstBuilder code related to for-generators
  * track for-generators via `SymbolTable.enterForGenerator()`
  * add `RestoreForBindingsNode` during initial AST construction instead of calling `MemberNode.replaceBody()` later on
  * simplify some unnecessarily complex code
* remove workarounds and band-aids such as:
  * `isInIterable`
  * `executeAndSetEagerly`
  * adding dummy slots in `AmendFunctionNode`
* overhaul implementation of for-generators
  * store keys and values of for-generator iterations in regular instead of auxiliary frame slots
    * set them via `TypeNode.executeAndSet()`
    * `ResolveVariableNode` no longer needs to search auxiliary slots
    * `Read(Enclosing)AuxiliarySlot` is no longer needed
  * at the start of each for-generator iteration, create a new `VirtualFrame` that is a copy of the current frame (arguments + slots) and stores the iteration key and value in additional slots
  * execute for-generator iteration with the newly created frame
    * `childNode.execute(newFrame)`
    * Pkl objects created during the iteration will materialize this frame
  * store newly created frames in `owner.extraStorage` if their for-generator slots may be accessed when a generated member is executed
    * resolving variable names to for-generator variables at parse time would make this analysis more precise
  * when a generated member is executed,
    * retrieve the corresponding frame stored in `owner.extraStorage`
    * copy the retrieved frame's for-generator slots into slots of the current frame

Result:
* for-generators are implemented in a correct, reasonably simple, and reasonably efficient way
  * complexity is fully contained within package `generator` and `AstBuilder`
* for-generator keys and values can be accessed from all nested scopes:
  * key and value expressions of generated members
  * condition expressions of nested when-generators
  * iterable expressions of nested for-generators
* for-generator keys and values can be accessed from within objects created by the expressions listed above
* sibling for-generators can use the same key/value variable names
* parent/child for-generators can use the same key/value variable names
* fixes apple#741

Limitations not addressed in this PR:
* object spreading is eager in values
  This should be easy to fix.
* for-generators are eager in values
  I think this could be fixed by:
  * resolving variable names to for-generator variables at parse time
  * replacing every access to a for-generator's `value` with `iterable[key]`
* for/when-generator bodies can't have local properties/methods
  I think this could be fixed by:
  * resolving variable names to local properties/methods at parse time
  * internally renaming generated local properties/methods to avoid name clashes
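The per-iteration frame copy described above can be illustrated with a minimal plain-Java model. This is only a sketch under stated assumptions: plain arrays stand in for Truffle frames, lambdas stand in for lazily evaluated generated members, and `ForGeneratorFrameSketch`, `newIterationFrame`, and `generate` are hypothetical names that do not exist in Pkl or Truffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Plain-Java model of the per-iteration frame copy; not Truffle code. */
final class ForGeneratorFrameSketch {
  /** Hypothetical stand-in for creating an iteration frame from the current frame. */
  static Object[] newIterationFrame(Object[] current, Object key, Object value) {
    // Copy all existing slots, then append two extra slots for key and value.
    Object[] copy = Arrays.copyOf(current, current.length + 2);
    copy[current.length] = key;
    copy[current.length + 1] = value;
    return copy;
  }

  /** Generates one lazy member per element; each member reads its own frame. */
  static List<Supplier<Object>> generate(Object[] enclosing, List<String> iterable) {
    List<Supplier<Object>> members = new ArrayList<>();
    for (int i = 0; i < iterable.size(); i++) {
      Object[] frame = newIterationFrame(enclosing, i, iterable.get(i));
      int valueSlot = enclosing.length + 1;
      // The member captures `frame`, so evaluating it later still sees
      // the value bound during this particular iteration.
      members.add(() -> frame[0] + ":" + frame[valueSlot]);
    }
    return members;
  }

  public static void main(String[] args) {
    List<Supplier<Object>> members =
        generate(new Object[] {"outer"}, List.of("a", "b", "c"));
    for (Supplier<Object> member : members) {
      System.out.println(member.get());
    }
  }
}
```

Because every iteration gets its own copy, members that are evaluated after the loop has finished still observe their own key and value, which is the property the description relies on when frames are stored and later restored.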
Thanks for the PR! This is a good improvement in how for-generators work!
I did a first pass, take a look at my comments.
Also, this is slower than the current implementation. Some quick numbers:
// test.pkl
amends "pkl:Benchmark"

outputBenchmarks {
  ["for-generator"] {
    sourceModule = import("test2.pkl")
  }
}

// test2.pkl
res {
  for (i in IntSeq(1, 10000)) {
    i
  }
}
Running this benchmark produces:
Current Pkl:
outputBenchmarks {
  ["for-generator"] {
    iterations = 15
    repetitions = 50
    min = 2.86.ms
    max = 3.01.ms
    mean = 2.96.ms
    stdev = 0.05.ms
    error = 0.03.ms
  }
}
This PR:
outputBenchmarks {
  ["for-generator"] {
    iterations = 15
    repetitions = 50
    min = 3.5.ms
    max = 3.88.ms
    mean = 3.64.ms
    stdev = 0.1.ms
    error = 0.06.ms
  }
}
Tested on macOS/M1 using the native executable (built with -DreleaseBuild=true).
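For scale, the two reported `mean` values can be compared directly. This is plain arithmetic on the numbers above, not an additional measurement:

```latex
\[
\frac{3.64\,\text{ms} - 2.96\,\text{ms}}{2.96\,\text{ms}}
  = \frac{0.68}{2.96}
  \approx 0.23
\]
```

That corresponds to a slowdown of roughly 23% on this micro-benchmark.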
private static FrameDescriptor.Builder newFrameDescriptorBuilder(FrameDescriptor descriptor) {
  var builder = FrameDescriptor.newBuilder();
  for (int i = 0; i < descriptor.getNumberOfSlots(); i++) {
Suggested change:
- for (int i = 0; i < descriptor.getNumberOfSlots(); i++) {
+ for (var i = 0; i < descriptor.getNumberOfSlots(); i++) {
// Only a subset of members have their frames stored (`GeneratorMemberNode.isFrameStored`).
// Frames are stored in `owner.extraStorage` and retrieved by `RestoreForBindingsNode`
// when members are executed.
private final EconomicMap<Object, MaterializedFrame> generatorFrames;
This code:
foo {
  for (i in IntSeq(1, 100)) {
    i
  }
}
results in this generatorFrames map (shown in Pkl syntax), where there is really only one frame (represented as THE_FRAME):
Map(
  1, THE_FRAME,
  2, THE_FRAME,
  3, THE_FRAME,
  4, THE_FRAME,
  5, THE_FRAME,
  // ...
  100, THE_FRAME
)
I'm not sure this is the right model; there are many members for just this one frame, so it's doing a lot of extra allocation.
Using for-generator keys just to look up the same materialized frame seems excessive.
Food for thought: maybe the lookup key could be a synthesized name for the for-generator. It doesn't matter much what this name is, but it should be simple enough to use for lookup in RestoreForBindingsNode.
Ah, sorry... disregard that comment. I've had too much eggnog (or that's my excuse, anyway).
One issue I do see here, though, is that we are materializing the same frame multiple times in the case of many generator members in the same for body. E.g.
foo {
  for (i in someList) {
    i + 1
    i + 2
    i + 3
  }
}
And, according to the contract of the API, this allocates a new frame. So, let's guard against it; something like this should work:
diff --git a/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/ObjectData.java b/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/ObjectData.java
index 03ae904c2..0bce42c8c 100644
--- a/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/ObjectData.java
+++ b/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/ObjectData.java
@@ -24,6 +24,7 @@ import org.pkl.core.ast.member.ObjectMember;
import org.pkl.core.runtime.VmObject;
import org.pkl.core.runtime.VmUtils;
import org.pkl.core.util.EconomicMaps;
+import org.pkl.core.util.Nullable;
/** Data collected by {@link GeneratorObjectLiteralNode} to generate a `VmObject`. */
public final class ObjectData {
@@ -36,12 +37,14 @@ public final class ObjectData {
private final EconomicMap<Object, MaterializedFrame> generatorFrames;
// The object's number of elements.
private int length;
+ private @Nullable MaterializedFrame currentFrame;
ObjectData(int parentLength) {
// optimize for memory usage by not estimating minimum size
members = EconomicMaps.create();
generatorFrames = EconomicMaps.create();
length = parentLength;
+ currentFrame = null;
}
UnmodifiableEconomicMap<Object, ObjectMember> members() {
@@ -56,6 +59,10 @@ public final class ObjectData {
return generatorFrames.isEmpty();
}
+ void resetForBindings() {
+ currentFrame = null;
+ }
+
void addElement(VirtualFrame frame, ObjectMember member, GeneratorMemberNode node) {
addMember(frame, (long) length, member, node);
length += 1;
@@ -70,8 +77,11 @@ public final class ObjectData {
CompilerDirectives.transferToInterpreter();
throw node.duplicateDefinition(key, member);
}
+ if (currentFrame == null) {
+ currentFrame = frame.materialize();
+ }
if (node.isFrameStored) {
- EconomicMaps.put(generatorFrames, key, frame.materialize());
+ EconomicMaps.put(generatorFrames, key, currentFrame);
}
}
diff --git a/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorForNode.java b/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorForNode.java
index 865b6c632..b01afbe94 100644
--- a/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorForNode.java
+++ b/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorForNode.java
@@ -68,7 +68,7 @@ public abstract class GeneratorForNode extends GeneratorMemberNode {
@Override
public final void execute(VirtualFrame frame, Object parent, ObjectData data) {
- initialize(frame);
+ initialize(frame, data);
executeWithIterable(frame, parent, data, iterableNode.executeGeneric(frame));
}
@@ -164,7 +164,8 @@ public abstract class GeneratorForNode extends GeneratorMemberNode {
}
}
- private void initialize(VirtualFrame frame) {
+ private void initialize(VirtualFrame frame, ObjectData data) {
+ data.resetForBindings();
if (unresolvedKeyTypeNode != null) {
CompilerDirectives.transferToInterpreterAndInvalidate();
var keySlot = frame.getFrameDescriptor().getNumberOfSlots();
if (EconomicMaps.put(data.members, key, member) != null) {
  CompilerDirectives.transferToInterpreter();
  throw duplicateDefinition(key, member);
}
What's the rationale behind getting rid of this check?
It's not removed, just moved from node classes to class ObjectData (see method addMember).
Ah, I see that you called this out in your PR description. So, effectively, this relaxes our current rule that you cannot reuse the same for-generator variable name in a nested for-generator.
This code, which is currently invalid, becomes valid:
obj {
  for (bar in something) {
    for (bar in somethingElse) { ... }
  }
}
I don't see any issues with this, and it lines up with how other languages work (for loops can create nested scopes that shadow outer variables).
CC @stackoverflow @holzensp for comments.
BTW, re: this comment:
> sibling for-generators can use the same key/value variable names
This is possible today, too.
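The shadowing behavior discussed above (an inner `bar` hiding an outer `bar` for the duration of the nested for-generator) can be modeled with a stack of scopes. This is a generic illustration of lexical shadowing, written as a sketch; it does not reflect how Pkl's `SymbolTable` or frame slots are implemented, and `ShadowingSketch`, `resolve`, and `demo` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

/** Generic model of nested scopes with shadowing; not Pkl's actual implementation. */
final class ShadowingSketch {
  /** Resolves `name` against a stack of scopes, innermost scope first. */
  static Object resolve(Deque<Map<String, Object>> scopes, String name) {
    // ArrayDeque iterates from the most recently pushed element to the oldest.
    for (Map<String, Object> scope : scopes) {
      if (scope.containsKey(name)) {
        return scope.get(name);
      }
    }
    throw new IllegalArgumentException("Cannot find variable: " + name);
  }

  /** Walks through entering and leaving a nested generator that reuses the name `bar`. */
  static String demo() {
    Deque<Map<String, Object>> scopes = new ArrayDeque<>();
    scopes.push(Map.of("bar", "outer")); // for (bar in something)
    Object before = resolve(scopes, "bar");
    scopes.push(Map.of("bar", "inner")); // nested for (bar in somethingElse)
    Object inside = resolve(scopes, "bar");
    scopes.pop(); // leaving the nested for-generator
    Object after = resolve(scopes, "bar");
    return before + "," + inside + "," + after;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The outer binding is untouched by the inner one and becomes visible again once the nested scope is left.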
var convertedKey = member.isProp() ? key.toString() : key;
// TODO: Executing iteration behind a Truffle boundary is bad for performance.
// This and similar cases will be fixed in an upcoming PR that replaces method
// `(forceAnd)iterateMemberValues` with cursor-based external iterators.
Definitely interested in a future PR of yours that addresses this.
But, we currently have a (small) optimization in evaluateMembers that this PR removes. I think we should bring that back.
There are several places in the codebase that call Node.execute from behind a Truffle boundary. The PR that fixes all of them has been ready from my side for over a month, but due to the slow review progress, I haven't sent it yet. Please let's not bring back this workaround; it's not worth it and will only slow us down further.
Okay; don't need to block here!
Correctness has a performance cost here. However, we can probably also find benchmarks where the new implementation performs better than the old one. For example,
This instantiates 100 frames. Each frame captures one iteration's key/value binding. I'm pretty confident this is the right model (even discussed it in Truffle Slack). It's a more efficient form of having a root node for the loop body and calling it 100 times, which would also result in 100 frames. The old implementation did a similar allocation for every iteration: https://github.com/apple/pkl/blob/main/pkl-core/src/main/java/org/pkl/core/ast/expression/generator/GeneratorMemberNode.java#L115-L117
Re: performance regression:
I tested these changes against some real-world code, and this is what time has to say about it:
Pkl 0.27:
________________________________________________________
Executed in 299.86 secs fish external
usr time 29.09 mins 0.11 millis 29.09 mins
sys time 1.01 mins 2.81 millis 1.01 mins
These changes:
________________________________________________________
Executed in 297.81 secs fish external
usr time 28.74 mins 53.00 micros 28.74 mins
sys time 1.00 mins 889.00 micros 1.00 mins
So, all in all, the changes introduced here are actually slightly faster in my one run (although the difference is not large enough to rule out noise), so my performance concerns really aren't that high.
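To put the two wall-clock times above in proportion (plain arithmetic on the reported numbers, from a single run each):

```latex
\[
\frac{299.86\,\text{s} - 297.81\,\text{s}}{299.86\,\text{s}}
  = \frac{2.05}{299.86}
  \approx 0.007
\]
```

That is a difference of about 0.7%, which is small enough to be run-to-run noise.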
I made another pass. I still need to get through more of this code, but want to provide some more feedback rather than delay too much more.
// when members are executed.
private final EconomicMap<Object, MaterializedFrame> generatorFrames;
// The object's number of elements.
private int length;
[nit] turn all of these into JavaDoc comments
Why do you prefer Javadoc for internal code? (Using Javadoc for internal code will become less painful with Java 23 Markdown doc comments.)
Even if these don't get turned into actual javadoc, it's nice because the IDE provides insight here (you get hover-over docs, for example)
* Copies `numberOfLocalsToCopy` locals from `sourceFrame`, starting at `firstSourceSlot`, to
* `targetFrame`, starting at `firstTargetSlot`.
Suggested change:
- * Copies `numberOfLocalsToCopy` locals from `sourceFrame`, starting at `firstSourceSlot`, to
- * `targetFrame`, starting at `firstTargetSlot`.
+ * Copies {@code numberOfLocalsToCopy} locals from {@code sourceFrame}, starting at {@code firstSourceSlot}, to
+ * {@code targetFrame}, starting at {@code firstTargetSlot}.
for (int i = 0; i < numberOfLocalsToCopy; i++) {
  var sourceSlot = firstSourceSlot + i;
  var targetSlot = firstTargetSlot + i;
  // If, for a particular call site of this method,
  // slot kinds of `sourceDescriptor` will reach a steady state,
  // then slot kinds of `targetDescriptor` will too.
  var slotKind = sourceDescriptor.getSlotKind(sourceSlot);
  switch (slotKind) {
    case Boolean -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Boolean);
      targetFrame.setBoolean(targetSlot, sourceFrame.getBoolean(sourceSlot));
    }
    case Long -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Long);
      targetFrame.setLong(targetSlot, sourceFrame.getLong(sourceSlot));
    }
    case Double -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Double);
      targetFrame.setDouble(targetSlot, sourceFrame.getDouble(sourceSlot));
    }
    case Object -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Object);
      targetFrame.setObject(
          targetSlot,
          sourceFrame instanceof MaterializedFrame
              // Even though sourceDescriptor.getSlotKind is now Object,
              // it may have been a primitive kind when `sourceFrame`'s local was written.
              // Hence we need to read the local with getValue() instead of getObject().
              ? sourceFrame.getValue(sourceSlot)
              : sourceFrame.getObject(sourceSlot));
    }
    default -> {
      CompilerDirectives.transferToInterpreter();
      throw new VmExceptionBuilder().bug("Unexpected FrameSlotKind: " + slotKind).build();
    }
  }
}
Re: this comment:
> Even though sourceDescriptor.getSlotKind is now Object,
> it may have been a primitive kind when `sourceFrame`'s local was written.
> Hence we need to read the local with getValue() instead of getObject().
I wonder if there is a deeper issue with how we are using frame descriptors. Right now, our WriteFrameSlotNode will update the frame descriptor, but I'm not sure if that's totally right. For example, com.oracle.truffle.api.impl.FrameWithoutBoxing#clear doesn't update the frame descriptor either (see https://graalvm.slack.com/archives/CNQSB2DHD/p1675722351829269 for more details).
In any case, you can avoid the weird edge case by looking at the frame slot tag instead:
for (var i = 0; i < numberOfLocalsToCopy; i++) {
  var sourceSlot = firstSourceSlot + i;
  var targetSlot = firstTargetSlot + i;
  var slotKind = FrameSlotKind.fromTag(sourceFrame.getTag(sourceSlot));
  switch (slotKind) {
    case Boolean -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Boolean);
      targetFrame.setBoolean(targetSlot, sourceFrame.getBoolean(sourceSlot));
    }
    case Long -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Long);
      targetFrame.setLong(targetSlot, sourceFrame.getLong(sourceSlot));
    }
    case Double -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Double);
      targetFrame.setDouble(targetSlot, sourceFrame.getDouble(sourceSlot));
    }
    case Object -> {
      targetDescriptor.setSlotKind(targetSlot, FrameSlotKind.Object);
      targetFrame.setObject(targetSlot, sourceFrame.getObject(sourceSlot));
    }
    default -> {
      CompilerDirectives.transferToInterpreter();
      throw new VmExceptionBuilder().bug("Unexpected FrameSlotKind: " + slotKind).build();
    }
  }
}
> One issue I do see here, though, is that we are materializing the same frame multiple times in the case of many generator members in the same for body
Multiple calls to frame.materialize() are guaranteed to return the same materialized frame (which is a reference to the mutable virtual frame, not an immutable snapshot thereof). This seems important for Pkl performance because otherwise, every new X {...} expression within the same root node would create a new materialized frame.
> Right now, our WriteFrameSlotNode will update the frame descriptor, but I'm not sure if that's totally right.
WriteFrameSlotNode updating the frame descriptor is how type profiling of locals works in Truffle. (The goal is to avoid boxing of primitives, esp. in interpreted code. I don't know if this profiling, which isn't free, improves Pkl real-world performance.) Note that the slot kind can only change from primitive type to Object, i.e., it can only become more general.
> In any case, you can avoid the weird edge case by looking at the frame slot tag instead:
I considered this but concluded it's incorrect. Source frames that share the same descriptor may be copied in an order that differs from the order they were executed/profiled in. If we ignore the source descriptor's slot kind during copying, the target descriptor's slot kind isn't guaranteed to reach a steady state. For example, long-long-long-Object-Object-Object may turn into Object-long-Object-long-Object-long. As far as I understand, this isn't desirable.
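The steady-state argument above can be illustrated with a small plain-Java model. It is a sketch under simplifying assumptions, not Truffle code: `SlotKindSketch`, `Kind`, `generalize`, `flipsFollowingTags`, and `flipsFollowingDescriptor` are hypothetical names, only two kinds are modeled, and the target kind is assumed to change only by generalizing, which is what mirroring a descriptor that itself only generalizes amounts to.

```java
import java.util.List;

/** Plain-Java model of slot-kind stability when copying frames; not Truffle code. */
final class SlotKindSketch {
  enum Kind { LONG, OBJECT }

  /** Descriptor rule: a slot kind may only become more general (LONG -> OBJECT). */
  static Kind generalize(Kind current, Kind written) {
    return (current == Kind.OBJECT || written == Kind.OBJECT) ? Kind.OBJECT : Kind.LONG;
  }

  /** Number of target-kind changes if the target is set to each copied frame's own tag. */
  static int flipsFollowingTags(List<Kind> tagsInCopyOrder) {
    int flips = 0;
    Kind target = tagsInCopyOrder.get(0);
    for (Kind tag : tagsInCopyOrder) {
      if (tag != target) {
        flips++;
        target = tag;
      }
    }
    return flips;
  }

  /** Number of target-kind changes if the target kind is only ever generalized. */
  static int flipsFollowingDescriptor(List<Kind> tagsInCopyOrder) {
    int flips = 0;
    Kind target = tagsInCopyOrder.get(0);
    for (Kind tag : tagsInCopyOrder) {
      Kind next = generalize(target, tag);
      if (next != target) {
        flips++;
        target = next;
      }
    }
    return flips;
  }

  public static void main(String[] args) {
    // Copy order from the comment above: Object-long-Object-long-Object-long.
    List<Kind> copyOrder =
        List.of(Kind.OBJECT, Kind.LONG, Kind.OBJECT, Kind.LONG, Kind.OBJECT, Kind.LONG);
    System.out.println("following tags: " + flipsFollowingTags(copyOrder));
    System.out.println("following descriptor: " + flipsFollowingDescriptor(copyOrder));
  }
}
```

In this model, following per-frame tags keeps changing the target kind for as long as the copy order alternates, whereas a kind that can only generalize changes at most once and then stays put.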
> Multiple calls to frame.materialize() are guaranteed to return the same materialized frame (which is a reference to the mutable virtual frame, not an immutable snapshot thereof).
I can't find any docs that guarantee this. Can you provide a reference?
Their own docs say it returns a new frame, even if the implementation disagrees.
> I considered this but concluded it's incorrect. Source frames that share the same descriptor may be copied in an order that differs from the order they were executed/profiled in. If we ignore the source descriptor's slot kind during copying, the target descriptor's slot kind isn't guaranteed to reach a steady state. For example, long-long-long-Object-Object-Object may turn into Object-long-Object-long-Object-long. As far as I understand, this isn't desirable.
Ah, I see. That makes sense!
In that case, I don't understand this ternary:
sourceFrame instanceof MaterializedFrame
    ? sourceFrame.getValue(sourceSlot)
    : sourceFrame.getObject(sourceSlot)
Why is it safe to call sourceFrame.getObject if it is not a MaterializedFrame? And, in practice, it's always true; FrameWithoutBoxing implements MaterializedFrame.
> I can't find any docs that guarantee this. Can you provide a reference?
No, but I'm sure it's true. In interpreted code, frame.materialize() always returns frame. In Graal JITted code, frame.materialize() always yields the same MaterializedFrame instance. Feel free to double-check by asking in Truffle Slack.
> Why is it safe to call sourceFrame.getObject if it is not a MaterializedFrame?
Because then sourceFrame is a VirtualFrame, which means that copyLocals is called while sourceFrame is active (a VirtualFrame may only be used within RootNode.execute), which guarantees that sourceFrame.getDescriptor accurately describes sourceFrame.
> And, in practice, it's always true;
It's always true in interpreted code, where frames are regular Java objects allocated on the Java heap. Once Graal JIT kicks in, VirtualFrame is optimized away (that's what makes it "virtual").
> No, but I'm sure it's true. In interpreted code, frame.materialize() always returns frame. In Graal JITted code, frame.materialize() always yields the same MaterializedFrame instance. Feel free to double-check by asking in Truffle Slack.
Yup, seems like you're right; reference: https://graalvm.slack.com/archives/CNQSB2DHD/p1735839284090239
> Because then sourceFrame is a VirtualFrame, which means that copyLocals is called while sourceFrame is active (a VirtualFrame may only be used within RootNode.execute), which guarantees that sourceFrame.getDescriptor accurately describes sourceFrame.
I think I'm missing something here. I can cause an active frame's descriptor to disagree with the frame slot's values, e.g.
execute(VirtualFrame frame) {
  frame.getFrameDescriptor().setSlotKind(0, FrameSlotKind.Object);
  frame.setBoolean(0, true);
}
I can then pass this frame to copyLocals. What guarantees that the descriptor accurately describes the frame?
Based on: #837