Copy/repeat history planes #632

Open, wants to merge 2 commits into base: next

so-much-meta

Many people want Leela to perform well in cold-start positions. However, the network evaluations suffer when there is no history. I propose that when position history is not available, we simply copy the planes from the oldest available position in the history.

This PR implements that. The effects of the change (policy evals at the position, plus value evals after bestmove) are shown for the win-at-chess positions, as tracked at win-at-chess tracking.
See: win-at-chess--history-compare.pdf
I've also looked at the effect of the rule50 input in cold-start positions.
See: win-at-chess--history--rule50.pdf

The comparisons above show that copying history is a significant improvement for cold-start positions.

There is potential for a negative impact at the start position and the moves immediately following it. However, judging by the evaluations of the start position and of Black's first move, the impact seems to be minimal and should quickly be trained away, and MCTS will mostly result in the same move selection anyway. The policy appears to flatten, so the immediate impact would simply be slightly more diverse opening choices.

Of course, it would be possible to track the start position node and not fill history when the oldest node in the history is the start position. Alternatively, the behavior could simply be made an engine parameter. However, applying the fill consistently is the simplest option.
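To make the proposal and the two alternatives concrete, here is a minimal Python sketch. It is not the code in this PR: the function name, the flags, and the slot count of 8 are assumptions made for illustration, and each "position" stands in for whatever set of input planes the engine would encode for it.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

# Assumed number of history slots in the network input (illustrative).
HISTORY_LENGTH = 8


def fill_history(
    positions: Sequence[T],
    length: int = HISTORY_LENGTH,
    copy_oldest: bool = True,
    oldest_is_startpos: bool = False,
    skip_fill_at_startpos: bool = False,
) -> List[Optional[T]]:
    """Return exactly `length` history slots, newest first.

    positions: the known positions, newest first (at least one entry).
    Missing slots are None (empty planes) unless copying is enabled,
    in which case they repeat the oldest known position.
    """
    if not positions:
        raise ValueError("need at least the current position")

    # Keep at most `length` real positions.
    slots: List[Optional[T]] = list(positions[:length])

    # Hypothetical alternative: leave slots empty when the oldest
    # known position is the real start position.
    fill = copy_oldest and not (skip_fill_at_startpos and oldest_is_startpos)

    pad: Optional[T] = slots[-1] if fill else None
    slots.extend([pad] * (length - len(slots)))
    return slots
```

With the defaults, `fill_history(["p2", "p1"], length=4)` repeats `"p1"` into the two missing slots. Passing `copy_oldest=False` reproduces the no-history behavior, and the two startpos flags together model the "track the start position" alternative.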

Here is the go nodes 800 evaluation at startpos without history:

info string    b3 ->       3   (V: 48.76%) (N:  1.21%) PV: b3 e5 Bb2 Nc6
info string    a3 ->       3   (V: 49.24%) (N:  1.20%) PV: a3 e5 c4
info string    d3 ->       3   (V: 49.70%) (N:  1.00%) PV: d3 d5 Nf3 Nf6
info string   Nc3 ->       4   (V: 49.80%) (N:  1.18%) PV: Nc3 d5 e4 d4 Nce2
info string    c3 ->       5   (V: 49.72%) (N:  1.62%) PV: c3 e5 d4 e4
info string    e3 ->      14   (V: 50.81%) (N:  2.92%) PV: e3 Nf6 Nf3 g6 d4
info string    g3 ->      15   (V: 50.26%) (N:  3.68%) PV: g3 d5 Nf3 Nf6 Bg2 g6 d4 Bg7 O-O
info string   Nf3 ->      46   (V: 51.31%) (N:  7.27%) PV: Nf3 d5 d4 Nf6 c4 e6 cxd5 exd5 Nc3 c6 Bf4 Bd6
info string    c4 ->      52   (V: 51.18%) (N:  8.62%) PV: c4 e5 g3 g6 d4 exd4 Qxd4 Nf6 Nc3 Bg7
info string    d4 ->     185   (V: 51.46%) (N: 21.35%) PV: d4 Nf6 c4 e6 Nf3 d5 cxd5 exd5 Nc3 c6 Bf4 Bd6 Bxd6 Qxd6 e3 O-O Qc2 Bg4 Ne5
info string    e4 ->     327   (V: 51.58%) (N: 44.47%) PV: e4 c5 Nf3 d6 Bb5+ Bd7 Bxd7+ Qxd7 O-O Nf6 Re1 e6 c3 Nc6 d4 cxd4 cxd4 d5 e5 Ne4
info string stm White winrate 51.38%

Here is the go nodes 800 evaluation at startpos with copied/fake history:

info string    a3 ->       3   (V: 48.92%) (N:  1.13%) PV: a3 e5 c4
info string    b3 ->       3   (V: 48.98%) (N:  1.20%) PV: b3 e5 Bb2 Nc6
info string    d3 ->       4   (V: 49.80%) (N:  1.20%) PV: d3 d5 Nf3 Nf6 Nbd2
info string    c3 ->       5   (V: 49.39%) (N:  1.62%) PV: c3 e5 d4 e4 c4
info string   Nc3 ->       6   (V: 50.13%) (N:  1.43%) PV: Nc3 d5 e4 d4 Nce2
info string    g3 ->       7   (V: 50.13%) (N:  1.83%) PV: g3 d5 Nf3 Nf6 Bg2
info string    e3 ->      11   (V: 50.77%) (N:  2.30%) PV: e3 Nf6 d4 g6 c4
info string   Nf3 ->      52   (V: 51.31%) (N:  8.33%) PV: Nf3 d5 d4 Nf6 c4 e6 cxd5 exd5 Nc3 c6 Bf4 Bd6
info string    c4 ->      68   (V: 51.29%) (N: 10.65%) PV: c4 c5 Nf3 Nf6 Nc3 e6 d4 cxd4
info string    d4 ->     270   (V: 51.41%) (N: 29.48%) PV: d4 d5 c4 e6 Nf3 Nf6 cxd5 exd5 Nc3 c6 Bf4 Bd6 Bxd6 Qxd6 e3 O-O Qc2 Bg4 Ne5
info string    e4 ->     321   (V: 51.55%) (N: 35.75%) PV: e4 c5 Nf3 d6 Bb5+ Bd7 Bxd7+ Qxd7 O-O Nf6 Re1 e6 c3 Nc6 d4 cxd4 cxd4 d5
info string stm White winrate 51.37%
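For anyone who wants to quantify the flattening rather than eyeball it, the two dumps can be parsed and compared. The sketch below is only an illustration: the helper names are made up, it assumes the N column is the network's policy prior for the move (the output itself does not say so), and because only the listed moves are available, the entropy is computed over priors renormalized to those moves.

```python
import math
import re
from typing import List, NamedTuple


class MoveStat(NamedTuple):
    move: str
    visits: int
    value_pct: float
    prior_pct: float  # assumed to be the policy prior (the N column)


# Matches lines such as:
# info string    e4 ->     327   (V: 51.58%) (N: 44.47%) PV: e4 c5 ...
LINE_RE = re.compile(
    r"info string\s+(\S+)\s+->\s+(\d+)\s+"
    r"\(V:\s*([\d.]+)%\)\s+\(N:\s*([\d.]+)%\)"
)


def parse_move_stats(log: str) -> List[MoveStat]:
    """Extract per-move statistics; lines that do not match are skipped."""
    stats = []
    for line in log.splitlines():
        m = LINE_RE.search(line)
        if m:
            stats.append(
                MoveStat(m.group(1), int(m.group(2)),
                         float(m.group(3)), float(m.group(4)))
            )
    return stats


def listed_prior_entropy(stats: List[MoveStat]) -> float:
    """Shannon entropy (nats) of the priors, renormalized over listed moves."""
    total = sum(s.prior_pct for s in stats)
    probs = [s.prior_pct / total for s in stats if s.prior_pct > 0]
    return -sum(p * math.log(p) for p in probs)
```

Running `listed_prior_entropy(parse_move_stats(dump))` on each dump gives one number per run; a higher value for the copied-history run would be consistent with a flatter policy over the listed moves.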

killerducky changed the base branch from master to next on May 19, 2018, 23:19

so-much-meta commented May 20, 2018

Note: per a message from Alexander Lyashuk, this should be considered just an experiment (not for merge) until a consistent approach is decided upon for both lc0 and lczero.

See also: #633
