Acquire StateMachine is overwrite saved to DB state machine to initial state #1142

Open
ksilisk opened this issue Mar 8, 2024 · 2 comments
Labels
status/need-triage Team needs to triage and take a first look


ksilisk commented Mar 8, 2024

When using acquireStateMachine in DefaultStateMachineService, the persisted machine is overwritten with a new machine context before the restore logic is reached, so the restored machine is always in the initial state.

After delving deeper into this, I found the source of the problem.

The problem is that AbstractStateMachineFactory starts the machine in delegateAutoStartup when the autoStartup property is true, and starting the machine overwrites its persisted state.

At this point the interceptors have been added to every region, which is correct.

List<StateMachineInterceptor<S,E>> interceptors = stateMachineModel.getConfigurationData().getStateMachineInterceptors();
if (interceptors != null) {
    for (final StateMachineInterceptor<S, E> interceptor : interceptors) {
        // add persisting interceptor hooks to all regions
        RegionPersistingInterceptorAdapter<S, E> adapter = new RegionPersistingInterceptorAdapter<>(interceptor);
        machine.getStateMachineAccessor().doWithAllRegions(function -> function.addStateMachineInterceptor(adapter));
    }
}

The problem comes later: once the newly created state machine is started, callPostStateChangeInterceptors is called in AbstractStateMachine.

private void callPostStateChangeInterceptors(State<S,E> state, Message<E> message, Transition<S,E> transition, StateMachine<S, E> stateMachine) {
    try {
        getStateMachineInterceptors().postStateChange(state, message, transition, this, stateMachine);
    } catch (Exception e) {
        log.warn("Interceptors threw exception in post state change", e);
    }
}

Finally, AbstractPersistingStateMachineInterceptor writes the initial state of the machine created by AbstractStateMachineFactory to the DB.

write(buildStateMachineContext(stateMachine, rootStateMachine, state, message), (T)stateMachine.getId());

The problem is also visible in Redis with the MONITOR command.

[Screenshot: Redis MONITOR output]

DefaultStateMachineService first creates a new machine from the stateMachineFactory and only then restores it from the database, but by the time the machine has been created, the persisted state has already been overwritten with the initial state.

public StateMachine<S, E> acquireStateMachine(String machineId, boolean start) {
    log.info("Acquiring machine with id " + machineId);
    StateMachine<S, E> stateMachine;
    // naive sync to handle concurrency with release
    synchronized (machines) {
        stateMachine = machines.get(machineId);
        if (stateMachine == null) {
            log.info("Getting new machine from factory with id " + machineId);
            stateMachine = stateMachineFactory.getStateMachine(machineId);
            if (stateMachinePersist != null) {
                try {
                    StateMachineContext<S, E> stateMachineContext = stateMachinePersist.read(machineId);
                    stateMachine = restoreStateMachine(stateMachine, stateMachineContext);
                } catch (Exception e) {
                    log.error("Error handling context", e);
                    throw new StateMachineException("Unable to read context from store", e);
                }
            }
            machines.put(machineId, stateMachine);
        }
    }
    // handle start outside of sync as it might take some time and would block other machines acquire
    return handleStart(stateMachine, start);
}
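
To make the ordering problem easier to see, here is a minimal toy model in plain Java. It is only a sketch: ToyStore, ToyMachine, getStateMachine and acquire are hypothetical stand-ins, not Spring Statemachine types, and the state names are made up. It assumes, as described above, that starting a machine persists its initial state and that restoring does not write anything.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the ordering problem; none of these classes are Spring Statemachine types. */
class OverwriteDemo {

    /** Stands in for the persistent store (Redis in this issue). */
    static final class ToyStore {
        private final Map<String, String> states = new HashMap<>();

        void write(String machineId, String state) { states.put(machineId, state); }

        String read(String machineId) { return states.get(machineId); }
    }

    /** Stands in for a machine whose persisting interceptor writes on every state change. */
    static final class ToyMachine {
        private final String id;
        private final ToyStore store;
        private String state;

        ToyMachine(String id, ToyStore store) {
            this.id = id;
            this.store = store;
        }

        /** Entering the initial state is a state change, so it is persisted too. */
        void start() {
            state = "INITIAL";
            store.write(id, state);
        }

        /** Restoring only changes the in-memory state; it does not write to the store. */
        void restore(String persisted) {
            if (persisted != null) {
                state = persisted;
            }
        }

        String getState() { return state; }
    }

    /** Stands in for a factory that may start the machine right after creating it. */
    static ToyMachine getStateMachine(String id, ToyStore store, boolean autoStartup) {
        ToyMachine machine = new ToyMachine(id, store);
        if (autoStartup) {
            machine.start();
        }
        return machine;
    }

    /** Same order as acquireStateMachine above: create first, then read and restore. */
    static String acquire(String id, ToyStore store, boolean autoStartup) {
        ToyMachine machine = getStateMachine(id, store, autoStartup);
        machine.restore(store.read(id));
        return machine.getState();
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.write("order-1", "PAID");                       // previously persisted state
        System.out.println(acquire("order-1", store, true));  // prints INITIAL
    }
}
```

With autoStartup set to false in this toy model, the same call returns PAID, because nothing is written to the store before it is read.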

I think reordering the calls in acquireStateMachine in DefaultStateMachineService could solve this problem.

Can this problem be solved this way?
If so, I'm ready to start working on it.

Thanks!)

@github-actions github-actions bot added the status/need-triage Team needs to triage and take a first look label Mar 8, 2024

ksilisk commented Mar 8, 2024

For other developers facing the same problem, I found two ways to solve this:

  1. Disable the autoStartup property, if possible
  2. Write your own implementation of StateMachineService that changes the order in which a machine is created and restored from the DB
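
A rough sketch of the second workaround, again as a toy model in plain Java rather than real Spring Statemachine code. ToyMachine and acquire are hypothetical names, and the model assumes that starting a machine persists its initial state. The idea is to read the persisted state before the machine is created, so the auto-start can no longer destroy the only copy.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy sketch of a reordered acquire; these are not Spring Statemachine types. */
class ReorderedAcquireDemo {

    /** Toy machine: starting it persists the initial state, as described in this issue. */
    static final class ToyMachine {
        final String id;
        final Map<String, String> store;
        String state;

        ToyMachine(String id, Map<String, String> store) {
            this.id = id;
            this.store = store;
        }

        void start() {
            state = "INITIAL";
            store.put(id, state);
        }

        /** Restoring only changes the in-memory state. */
        void restore(String persisted) {
            state = persisted;
        }
    }

    /** Reads the persisted state before the auto-starting creation can overwrite it. */
    static ToyMachine acquire(String id, Map<String, String> store) {
        String persisted = store.get(id);               // 1. read first
        ToyMachine machine = new ToyMachine(id, store);
        machine.start();                                // 2. create and auto-start (overwrites the store)
        if (persisted != null) {
            machine.restore(persisted);                 // 3. restore from the copy read in step 1
        }
        return machine;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("order-1", "PAID");                   // previously persisted state
        ToyMachine machine = acquire("order-1", store);
        System.out.println(machine.state);              // prints PAID
    }
}
```

Note that in this toy model the store itself still holds INITIAL after acquire, until the next state change is persisted. Whether that matters in a real service would need to be checked.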


qdrin commented Mar 12, 2024

Hello, I'm running into the same thing. Restoring a machine with 3 orthogonal regions gives a weird result: the "left" region is restored normally to its saved state, and the two others are restored to their initial states.
I guess this issue was fixed a bit earlier in #998 but not released. I'd like to solve this problem anyway.
