Fixed reward overloading bug which results in incorrect rewards being used by POMDPs.solve #14
It looks like this reward overload is preventing the SARSOP solver from producing any meaningful solution. I was running this code on the RockSample task; when I inspected the rewards written to model.pomdpx, all of them were zero, because s and s' are always the same state. My environment is:
macOS 11.2.2
Julia 1.3.1
(v1.3) pkg> st
Status ~/.julia/environments/v1.3/Project.toml
[8bb6e9a1] BeliefUpdaters v0.2.1
[159f3aea] Cairo v1.0.5
[a81c6b42] Compose v0.9.2
[a0b5b9ef] Cxx v0.4.0
[4b033969] DiscreteValueIteration v0.4.5
[7f35509c] POMDPGifs v0.1.0
[abefb91b] POMDPModelChecking v0.0.0
[08074719] POMDPModelTools v0.3.2
[182e52fb] POMDPPolicies v0.3.3
[e0d0a172] POMDPSimulators v0.3.9
[a93abf59] POMDPs v0.9.1
[92933f4c] ProgressMeter v1.4.0
[295af30f] Revise v3.1.12
[de008ff0] RockSample v0.1.3
[cef570c6] SARSOP v0.5.4
[f11abc24] Spot v0.1.1
[10745b16] Statistics
I also saw the same issue on a Red Hat Linux server. Removing this reward line fixes the issue.
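
To illustrate how rewards can collapse to zero when s and s' are always the same state, here is a minimal, self-contained sketch. It does not use POMDPs.jl or the RockSample code: `ToyModel`, `EXIT_REWARD`, `next_state`, and both `reward` methods are hypothetical names made up for the example. It assumes the removed overload forwarded a three-argument reward to the four-argument one with s' set to s, which is one way the symptom described above could arise.

```julia
# Toy stand-in for a model; states are 1..3 and state 3 is terminal.
struct ToyModel end

const EXIT_REWARD = 10.0

# The reward depends on the next state sp: entering the terminal state pays out.
reward(m::ToyModel, s::Int, a::Int, sp::Int) =
    (s != 3 && sp == 3) ? EXIT_REWARD : 0.0

# Hypothetical problematic overload: it substitutes sp = s,
# so the transition into the terminal state is never observed.
reward(m::ToyModel, s::Int, a::Int) = reward(m, s, a, s)

# Deterministic toy transition: action 1 moves one state forward, action 0 stays.
next_state(m::ToyModel, s::Int, a::Int) = min(s + a, 3)

m = ToyModel()
states = 1:3
actions = 0:1

# Rewards a writer would record if it called the three-argument form.
three_arg = [reward(m, s, a) for s in states, a in actions]

# Rewards along the actual transitions, using the four-argument form.
four_arg = [reward(m, s, a, next_state(m, s, a)) for s in states, a in actions]

println(all(iszero, three_arg))  # true: every recorded reward is zero
println(maximum(four_arg))       # 10.0: the exit reward is visible here
```

In this toy setup, any consumer that only calls the three-argument method sees an all-zero reward table, which would leave a solver with nothing to optimize.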