-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix blackouts feature error and add new unit tests #41
Conversation
…n by zero Additionally, refactored code to make it more robust and readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems a lot better than previous ones, thank you! I think for a future one it might benefit from having the bug fix commit really focused i.e. the only code that changes in that commit is code that needs to change to fix the bug, and any refactoring is done separately before / after. That would make it really simple for a reviewer to understand what you're changing, and think about how edge cases might affect the bug.
Sorry if it's just me, but I'm struggling to understand what was broken here:
the division by zero issue seems to be caused by the code in the LightStatus method - the condition step == _nextFrameSwitch * agentDecisionInterval
How does this condition cause a division by zero error, and how does the change you've made stop that from happening?
When episodeLength == 0, it’s impossible to have valid blackout values in the list because any blackout value (negative or non-negative) causes the validation to fail. This is due to the validation logic that checks if each blackout value is less than 0 or greater than or equal to episodeLength. With episodeLength == 0, any blackout value will fail this validation because it cannot be less than 0 and is not less than 0. As a result, blackouts didd not work when episodeLength (timeLimit) is zero. This issue was caused by a miscalculation in the validation logic. The updated code now correctly handles this case by ensuring that episodeLength is greater than zero. This prevents configurations where blackouts cannot function due to an episode length of zero. By validating that episodeLength is greater than zero, the code re-enables the functionality of blackouts and properly validates the new blackouts list. This is the most probable cause in my experience. Note that the logic is still sound. It's just when timeLimit ==0, this bug is apparent. |
- removed unnecessary exception handling. - removed explicit call to console.write to avoid unnecessary console logs The method is now more streamlined and clearer.
Sorry, I'm still not sure I get it. What's being divided by zero?
I thought the error was a zero division error. Is there a validation failure and then a zero division error?
How does this square with setting |
I think it's understandable to be confused as I think I wasn't too clear nor comprehensive on what was going on. However, for me it's clear what the problem is and how it's been addressed. Let's get you up to scratch. With the YAML provided above, let's go step by step (this is using the current main branch code) to reproduce and understand the problem and the fix.
This is the result of the division by zero error just below the console log: The stack trace:
" Now, we know that there is indeed a division by zero error in the trace. It's being called in As I've explained in the PR description what the error was, I'll put it here for completeness:
The PR mentions the specific error produced as we've reproduced. Now, the missing link here is the other part of why I think this was happening, specifically this point:
The problem is that AAI doesnt know what to do with episodeLength == 0 and blackouts is initialised (as a list). This is the reason why episodeLength == 0 results in the blackouts feature not working properly FOR THIS use case, explained above like so:
Blackouts for all other timeLimit value works fine. Remember as this was also the main reason why this PR was raised. Furthermore, I myself wasn't aware of this functionality - as the code did not have logic to handle the case of timeLimit == 0 with blackouts intervals (which was removed in earlier modifications of the script as a mistake or to fix the previous bug). I hope I shed some new light into this problem and the fix. I don't know if there is a need to simplify it further. I think the main thing to extract from this PR is that the lights.cs script and the functionality is more robust as a result and the original ability to have timeLimit ==0 and the blackouts working for this value works as intended now. Edit: To clarify, the episode will go infinite if timeLimit ==0 for any arena, but this isnt coupled with blackouts nor does it have anything to do with it other than the case then |
…eNumber() Now cleaner and readable.
The "Blackout interval is invalid" error is not a result of the divide by zero error; it is caused by the explicit throw in Lights.cs here. The zero division error happens in the second after "Blackout interval is invalid" - I think Unity is trying to handle the error we're throwing in an unhelpful way (The "Couldn't connect ... will perform inference instead" logline at the end of your first screenshot is interesting in that respect). So I feel that although these zero division errors are something that it would be worth looking into, I think they're a red herring and not actually related to the bug we're trying to fix here. -- Would it be right to say that this commit introduced extra validation on blackout values, that didn't take into account the possibility of If so, could this PR have been an update of line 59 of Lights.cs to something like this? (With appropriate unit test changes)
|
Unity will throw a generic error like division by zero when arena 0 isnt added to the arena configurations (the error makes sense as you can't divide by 0, but it's generic in the sense that it's not helpful much).
The root cause of this error is two fold:
In this code, were checking if each value in the list is more or equal to timeLimit:
This will result in an error because we're not accepting episodeLength as a value of 0.
The proposed changes already account for cases to handle such as this so and the tests are updated already in my previous commits. The original tests didnt account for episodeLenth == 0 as I didn't know this was an acceptable value due to the lack of the feature being used and no documentation, which is why the bug propagated this far. This has been resolved in code, hand-in-hand with better tests. I hope this makes sense now. |
Okay, thank you I think I've gotten my head around it now! Does this change still keep the validation in the code block in your most recent comment when the timeLimit isn't set to zero (i.e. throw if As a side note, I wonder if we should really only throw if |
Only throw if blackout > _episodeLength, as blackout == _episodeLength is acceptable. --- Edge case would be if: blackouts: [10, 100] -> meaning exactly equal to timeLimit (assuming timeLimit == 100) An invalid case would be: (blackouts: [10, 101] -> Invalid, as the blackout exceeds the timeLimit of 100.)
The new check throws an exception only if the blackout sequence has a value greater than _episodeLength, allowing a blackout to extend up to and including the end of the episode (blackout == _episodeLength). See latest commit. |
Okay, thank you for chatting through with me. Assuming this tests alright I'm happy with this one! |
PR Description
From AAI 3.1.3 upwards it seems the ability to use
blackouts
withtimeLimit = 0
is broken.Specific Errors Produced:
Lights.cs
here:In the LightStatus method, the step is compared against _nextFrameSwitch * agentDecisionInterval. If agentDecisionInterval is 0 (which is possible if timeLimit is 0), it will result in a division by zero error. Furthermore, the error occurs because If timeLimit is set to 0 and blackouts are specified (can be any value), it creates a contradiction. The blackouts are meant to occur at specific steps within the episode, but an episode with a timeLimit of 0 has no steps to begin with.
The error propagates to a division by zero error further in the stack trace, as described below.
Indirectly, the division by zero is caused by the code in the LightStatus method - the condition step == _nextFrameSwitch * agentDecisionInterval can lead to a division by zero if agentDecisionInterval is zero (which was the case). Unity is simply unable to load the first arena into the arena configurations list as a result of the primary error.
Proposed change(s)
This PR fixes the error, introduces more robust code, and more tests for
Lights.cs
script (blackouts feature).The proposed PR code in the Lights.cs file handles the scenario where timeLimit is 0 and blackouts are used correctly. It allows for infinite blackout patterns using negative values in the blackout sequence, and the episodeLength (timeLimit) does not impact the blackout functionality.
It adds robust code to handle edge cases such as:
Useful links (Github issues, discussion threads, etc.)
#42
Types of change(s)
Checklist
Other comments (if any)
Tests:
InfiniteEnumerator_IncrementValueOnMoveNext
Tests if the InfiniteEnumerator correctly increments the value when MoveNext() is called.
InfiniteEnumerator_ResetToZero
Tests if the InfiniteEnumerator resets to zero when Reset() is called.
LightsSwitch_InitializeWithValidParameters
Tests if the LightsSwitch can be initialized with valid parameters without throwing an exception.
LightsSwitch_HandleInfiniteBlackouts
Tests if the LightsSwitch correctly handles infinite blackouts.
LightsSwitch_ResetBehavior
Tests if the LightsSwitch resets its state correctly when Reset() is called.
LightsSwitch_HandleEmptyBlackoutList
Tests if the LightsSwitch handles an empty blackout list correctly.
LightsSwitch_HandleDifferentAgentDecisionIntervals
Tests if the LightsSwitch handles different agent decision intervals correctly.
LightsSwitch_ThrowExceptionForInvalidBlackoutSequence
Tests if the LightsSwitch throws an exception for an invalid blackout sequence.
LightsSwitch_HandleMultipleBlackouts
Tests if the LightsSwitch handles multiple blackouts correctly.
LightsSwitch_HandleInfiniteBlackoutsWithNegativeNumber
Tests if the LightsSwitch handles infinite blackouts with a negative number correctly.
LightsSwitch_ThrowExceptionForNegativeEpisodeLength
Tests if the LightsSwitch throws an exception for a negative episode length.
Yaml config example with error reproducible:
Screenshots (if any)