Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added new handling of terminal states in GenerativeBelief MDP #559

Merged
merged 3 commits into from
Jul 12, 2024

Conversation

zsunberg
Copy link
Member

Termination in GenerativeBeliefMDPs has been a pain point (see #555, #544). This PR changes the default behavior so that when a terminal state is sampled from a belief state, the new state is terminalstate.

@zsunberg
Copy link
Member Author

@WhiffleFish , @himanshugupta1009 , @the-one-and-only-jackson , @FlyingWorkshop you have all dealt with this recently. What do you think?

A downside is that now the state type for GenerativeBeliefMDP is the union of the belief type and TerminalState.

One thing to note is that I added a pdf(b, s) == 0 check to isterminal:

isterminal(bmdp::GenerativeBeliefMDP, b) = all(s -> isterminal(bmdp.pomdp, s) || pdf(b, s) == 0.0, support(b))

so, perhaps that alone could fix most of the problems? Maybe the default behavior should be to just keep trying to update and throw a warning to try terminal_behavior = TerminalStateTerminalBehavior if there is an error.

@the-one-and-only-jackson

A downside is that now the state type for GenerativeBeliefMDP is the union of the belief type and TerminalState.

I think a union type is necessary if there is a possibility of calling gen on a belief with terminal states in its support, so this seems like a reasonable solution.

In cases where this is not possible (i.e. a terminal state will cause a "terminal" observation, leading to a terminal belief) ContinueTerminalBehavior can be used to preserve type stability, and should be noted in the documentation.

One thing to note is that I added a pdf(b, s) == 0 check to isterminal

Is an identically zero check necessary, or should there be a tolerance check instead?

@zsunberg
Copy link
Member Author

Ok, given @the-one-and-only-jackson 's concurrence, I think this is a good idea. I'll finish the docs and merge. If anyone wants me to stop, let me know soon.

Is an identically zero check necessary, or should there be a tolerance check instead?

I think identically zero is better.

@zsunberg zsunberg merged commit d18d9d5 into master Jul 12, 2024
15 checks passed
@zsunberg zsunberg deleted the gbmdp-terminal branch July 12, 2024 19:56
zsunberg added a commit that referenced this pull request Jul 12, 2024
bump POMDPTools version after #559
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants