Auto fill history planes #452
Comments
There are several graphs made by Trevor in this thread. They show that auto-filling with the same position is an improvement over auto-filling with zeros, and it is much easier than trying to guess a real history.
I think this is worth implementing and experimenting with once things stabilize (they have started to stabilize now, but switching on resignation and the switch to lc0 are already in the queue).
Auto-filling with the same position seems like a good first step. It is easy to implement and seems to work very well too. I would propose adding that first and then benchmarking auto-fill (generated history) against same-fill. If the results are comparable, same-fill should be preferred.
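To make the two padding options concrete, here is a minimal sketch of same-fill versus zero-fill. It is not the code from either engine: `fill_history`, `HISTORY_LEN`, and the use of `None` to stand for an all-zero set of planes are hypothetical names chosen for illustration. Padding with the oldest known position when more than one position is available is an assumption that generalizes the cold-start case.

```python
from typing import List, Optional, TypeVar

P = TypeVar("P")

# Slots 0..7, where slot 0 is the current position (hypothetical constant).
HISTORY_LEN = 8


def fill_history(known: List[P], mode: str = "same",
                 length: int = HISTORY_LEN) -> List[Optional[P]]:
    """Pad a partial history (most recent first) up to `length` slots.

    mode="same": repeat the oldest known position in the missing slots.
    mode="zero": leave the missing slots empty (None stands for zero planes).
    """
    if not known:
        raise ValueError("at least the current position is required")
    slots: List[Optional[P]] = list(known[:length])
    if mode == "same":
        pad: Optional[P] = slots[-1]
    elif mode == "zero":
        pad = None
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    slots.extend([pad] * (length - len(slots)))
    return slots
```

For a cold-start position, `fill_history([pos])` yields eight copies of `pos`, while `fill_history([pos], mode="zero")` yields `pos` followed by seven empty slots.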
I created a PR with some additional details here: #632. While I completely agree that this is less important than waiting for things to stabilize, I also think it's very unfortunate that people keep testing and forming opinions of Leela's strength based on cold-start positions. This is an easy improvement for that. As noted in the PR, though, if the start position isn't handled as a special case, Leela will behave slightly differently in self-play training data generation and in match play at the opening (copying the history appears to have a slight flattening effect on the policy near the start position), but I don't think that effect will remain significant for long.
I created a similar PR for copying the last position into the history for lc0.
Regarding dropping history completely: I've seen evidence that Leela's network cares a lot about the previous position, but not much about the others (2 through 7, where 0 is current). First point of evidence: the input weights for the previous position are big, but those for all earlier positions are small. Second point of evidence: there just isn't much of a difference in the second node evaluation when history 2 through 7 is copied versus left all zero, as seen in the NextMove value comparisons here: However, copying is still better than leaving those positions zero. The fact that Leela's network seems to care a lot about the previous position, and only the previous position, suggests that it is in fact using that feature, and that removing history completely would have a detrimental effect. Furthermore, removing history completely would prevent the network from seeing en passant moves. So I'd think that, at a minimum, an en passant plane would have to be added, which would be fairly disruptive.
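As a rough illustration of why the previous position carries en passant information, the sketch below recovers the en passant target square by comparing two consecutive positions. The board representation (a dict from square names such as "e4" to "P" for a white pawn and "p" for a black pawn) and the function name `en_passant_target` are assumptions made for this example, not anything taken from the engine.

```python
from typing import Dict, Optional

# Hypothetical board type: square name -> piece letter, empty squares omitted.
Board = Dict[str, str]


def en_passant_target(prev: Board, cur: Board) -> Optional[str]:
    """Return the square a pawn just skipped over, or None.

    Assumes `prev` and `cur` are consecutive legal positions. A double push
    is the only single move that empties a pawn's start square and puts a
    pawn of the same color two ranks ahead on the same file.
    """
    for pawn, start, end, mid in (("P", "2", "4", "3"), ("p", "7", "5", "6")):
        for file in "abcdefgh":
            double_push = (
                prev.get(file + start) == pawn
                and file + end not in prev
                and cur.get(file + end) == pawn
                and file + start not in cur
            )
            if double_push:
                return file + mid
    return None
```

With a zero-filled or copied previous position, `prev` and `cur` never differ by a double push, so the function returns `None` and the en passant right cannot be recovered from the planes alone.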
Besides en passant, you would probably also need another plane that indicates whether a move is going to be a repetition. You might get away with ignoring twofold repetitions, but threefold repetitions are important; otherwise the network depends completely on search for that. Not providing twofold or threefold information might result in missed draws. Disruptive changes like that would need a completely new bootstrap of the NN. I'm glad only the first history plane is really used; that indicates the network is not making any stupid use of the history planes (like assuming a piece is still on a square because it was there before).
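A repetition indicator of the kind described could be derived roughly as follows. This is only a sketch: `repetition_count` is a hypothetical helper, and it assumes each position is represented by a hashable key that already encodes piece placement, side to move, castling rights, and en passant rights.

```python
from typing import Hashable, Sequence


def repetition_count(history: Sequence[Hashable]) -> int:
    """Count how often the current position occurred earlier in the game.

    `history[0]` is the current position and later entries are older.
    A result of 1 means a twofold repetition, 2 means threefold.
    """
    if not history:
        return 0
    current = history[0]
    return sum(1 for past in history[1:] if past == current)
```

A plane could then be set to all ones when the count reaches 1, and a second plane when it reaches 2, which would let the network see a pending threefold draw without relying on search.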
Instead of going drastic like in #443, it would be cool if somebody added the ability for Leela to produce some random history when it is missing. As long as the last move of this history is a capture or a pawn move, it should not affect the threefold repetition rule.
It might not be possible to do this for all positions, but for any legal position it shouldn't be too hard. The moves do not need to be realistic as long as they are legal, right?