Caused by: java.lang.AssertionError
at org.qcri.rheem.core.optimizer.enumeration.StageAssignmentTraversal.<init>(StageAssignmentTraversal.java:92)
at org.qcri.rheem.core.optimizer.enumeration.StageAssignmentTraversal.assignStages(StageAssignmentTraversal.java:112)
at org.qcri.rheem.core.plan.executionplan.ExecutionPlan.createFrom(ExecutionPlan.java:222)
at org.qcri.rheem.core.api.Job.createInitialExecutionPlan(Job.java:382)
at org.qcri.rheem.core.api.Job.doExecute(Job.java:247)
... 37 more
Apparently, this problem appears because inputDataQuanta and the map call are connected twice: via the regular data flow and via the broadcast. If one inserts a map(x->x) before the map call or before broadcasting, the example works fine.
The above test reproduces the bug, which should be fixed.
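As a rough illustration of the diagnosis above, the following toy model in plain Java shows the plan shape that seems to trigger the assertion and the shape produced by the work-around. All class and method names here (`DoubleConnectionSketch`, `Edge`, `hasDoubleConnection`) are hypothetical and not part of Rheem, and the check is only a stand-in for whatever `StageAssignmentTraversal` actually asserts.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: hypothetical types, not Rheem classes. */
public class DoubleConnectionSketch {

    /** A directed edge from a producer to a consumer; kind is "data" or "broadcast". */
    static final class Edge {
        final String producer;
        final String consumer;
        final String kind;

        Edge(String producer, String consumer, String kind) {
            this.producer = producer;
            this.consumer = consumer;
            this.kind = kind;
        }
    }

    /** Returns true if some producer feeds the same consumer through more than one edge. */
    static boolean hasDoubleConnection(List<Edge> plan) {
        Set<List<String>> seen = new HashSet<>();
        for (Edge e : plan) {
            // Set.add returns false when the (producer, consumer) pair was already present.
            if (!seen.add(Arrays.asList(e.producer, e.consumer))) {
                return true;
            }
        }
        return false;
    }

    /** Failing shape: inputDataQuanta reaches map as regular input and as broadcast. */
    static List<Edge> failingPlan() {
        return Arrays.asList(
                new Edge("inputDataQuanta", "map", "data"),
                new Edge("inputDataQuanta", "map", "broadcast"));
    }

    /** Work-around shape: an identity map(x->x) sits on the broadcast path. */
    static List<Edge> workaroundPlan() {
        return Arrays.asList(
                new Edge("inputDataQuanta", "map", "data"),
                new Edge("inputDataQuanta", "identity", "data"),
                new Edge("identity", "map", "broadcast"));
    }

    public static void main(String[] args) {
        System.out.println(hasDoubleConnection(failingPlan()));    // true
        System.out.println(hasDoubleConnection(workaroundPlan())); // false
    }
}
```

In this model the identity operator gives the broadcast path its own producer, so no producer/consumer pair occurs twice. That would match the observation that inserting map(x->x) makes the example work.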
I started working on this issue in branch rheem-44.
I pinpointed that both the optimizer and the executor assume that each input channel is fed into each operator only once. The code above breaks this assumption. I added changes to make the optimizer aware that an input channel may be accessed twice. However, the test above still fails during execution (in the maintenance of the execution lineage). I am stopping work on this for now for two reasons:
I feel that there are many more potential problems with consuming a channel twice. If we do not fix all of them, we might run into bugs later that are even harder to spot and reproduce (e.g., in the re-optimization).
The issue is too specific and the work-around too easy to justify putting a lot of effort into it.
So, please feel free to pick up this issue if you feel like it. 😉