Analyze blunders, part 2 #164
Here's a game from the original thread that still applies today, as far as I understand it.
ID: QueenTrap
Comments: Specifically, there are two potential ways to try to get the queen out after Rg6 that Leela at first thinks can work. There's 22. Qxh7, but that falls to 22...Rxg2+ and a discovered attack on the queen - a Leela weakness. There's also 22. Rxe5, which fails to 22...Rxg7, when white cannot take the black queen on f5 because that no longer defends against black's mate threat (Rxe1#) -- setting up a recapture that fails because it stops defending against a mate threat is another Leela tactical weakness.
ID485: It still has e6g6 buried pretty deep, and thus it's never avoiding this trap.
ID 10067: It still takes 192K nodes from the blunder position to avoid g3g7.
I find it interesting that the test nets have similar tactical weaknesses to the original net - which suggests these patterns are just "hard" for the NN to learn? Perhaps starting with a larger net will help with that, or perhaps certain positions just require a lot of nodes to overcome tactical weaknesses. I don't think there's a "bug" here to fix, but it's an instructive position.
I'm hoping that upping the training cpuct from 1.2 will help prevent this. These searches used the default lc0 search parameters, right?
I read that it is reproducible on all networks, including main nets, which are trained with a "high" cpuct, so there is no evidence that a different cpuct would help. I don't object to changing cpuct for training runs, but don't expect it to change things much.
There are no such nets that are as strong as recent main or test nets.
This was all done with default search parameters. And yes, it's been a problem for hundreds of networks, including several of the test nets (I haven't tested them all). There are a few simpler positions from the original thread that showcase individual tactical weaknesses. They have shown improvement since about ID 450 or so: instead of the right move, or the refutation move, having a policy of 0.2% or worse, they're now up over 1%, so UCT can find them in 1-2k nodes, which is reasonable. But with a complex situation like this one, it has to overcome a bad policy down several lines, and the difference in eval is high, so it takes a lot of nodes to move the eval enough even after hitting on the right moves.
Got a blunder in a test game today. Time control was 30 moves in 60 minutes: https://lichess.org/MoD5vHSJ
Test10 ID 10104: in the following game it played 48...h5??, missing the tactic Rf8+ Kh7 Qc2+ Be4 Qxe4 Rxe4 (removing the pin; this is one of the two classic themes of Leela's tactical blunders) gxh3.
Hardware for Leela was a GTX 1070 Ti, Lc0 (1 July) was used, test10 ID 10104, and time control 40/2 repeating. Leela was just unlucky, as the following analysis of the position (of the FEN, not the PGN, but with PGN analysis it also avoids h5, in exactly 8 seconds too) shows. Lc0 Test10 10104:
Net-520-20180727 Blunder?
ID TCEC-13.23.2.1 During TCEC, Leela versus Senpai. It's not a blunder but a spike of evaluation from 0.81 to 4.71 and then back to 0.33.
The eval spiked at move 27.Qf5. https://pasteboard.co/HydTAQf.png You can try to reproduce with: ./lc0 --nncache=2000000 --verbose-move-stats position fen 4rrk1/pp3pp1/3R1n1p/2q1nB2/5B2/2P2P2/P1Q2PKP/3R4 b - - 10 22 moves f6h5 f4g3 h5f6 d1d4 b7b5 f5d3 a7a5 d3e2 b5b4
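That reproduction line mixes command-line flags with a UCI command; split into an actual session it would look roughly like the sketch below (the go limit is an illustrative placeholder, not part of the original report):

./lc0 --nncache=2000000 --verbose-move-stats
# then, over the UCI interface:
position fen 4rrk1/pp3pp1/3R1n1p/2q1nB2/5B2/2P2P2/P1Q2PKP/3R4 b - - 10 22 moves f6h5 f4g3 h5f6 d1d4 b7b5 f5d3 a7a5 d3e2 b5b4
go nodes 100000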
ID TCEC-13.23.2.2 Bad move: 79. Rxc3 [Event "TCEC Season 13 - Division 4"]
(Tablebase draw)
ID: TCEC - Season 13 - Division 3 - Game 19.1. Black: LC0 0.16.1 (TCEC version). Move 29...Bg4 is wrong. Net 10776 with release 16.0 finds Bc8 after calculating 3,000,000 moves. Why did version 16.1 with net 10520 not find this solution?
In the final position of the following PGN, Leela as black has just lost a Knight, and the position is dead lost for black. So she should give a big positive score (test10 nets do that; for example, 11089 gives scores around +5.00). Leela = Lc0v17 cuda, default settings, 20230 net, with a GTX 1070 Ti in infinite analysis mode.
[Event "?"]
Yet perhaps this is not a real issue and test20 just has a very different mapping of the evaluation scores, so this tiny-looking +0.60 corresponds to +5.0 for test10 nets. Furthermore, the 20230 net doesn't want to play, even for a moment, the 5...Be7?? move that gives up the Knight.
Some default settings are not good for high node counts; cpuct in particular should be higher than the default. Have you tried using the CCCC settings (but with table) and seeing if that fixes it?
About the game Leela-Fizbo at CCCC:
Leela = Lc0v17 cuda, default settings, 11089 net, with a GTX 1070 Ti, WITH 3-, 4-, 5-, and 6-man syzygy TBs, in infinite analysis mode. It keeps 133.h5 up to 1,300,000 nodes with around 60,000 TB hits, then goes to 133.Qb5 and then to 133.Qc7 with around 230,000 TB hits.
After 133.Qc7, following Leela's recommendations again for both players: yet after 4,200,000 nodes and 250,000 TB hits Leela does not find either and wants to play Qd8+, which just draws! Lc0v17 11089:
No blunders anymore.
(part 1 was here)
This is to gather fresh examples of blunders (./lc0, on nets trained in July 2018 or later).

Important!
When reporting positions to analyze, please use the following form. It makes it easier to see what's problematic with the position:
lc0/lczero version, operating system, and non-default parameters (number of threads, batch size, fpu reduction, etc).

(old text below)
There are many reports on forums asking about blunders, and the answers so far have been something along the lines of "it's fine, it will learn eventually, we don't know exactly why it happens".
I think at this point it makes sense to actually look into them, to confirm that there are no blind spots in training. For that we need to:
run with training parameters (--temperature=1.0 --noise) to see how the training data would look for this position.

Eventually all of this would be nice to have as a single command, but we can start manually.
For lc0, that can be done this way: run with --verbose-move-stats -t 1 --minibatch-size=1 --no-smart-pruning (unless you want to debug specifically with other settings). Then run the UCI interface and give it a position command for the position in question (PGN moves can be converted to UCI notation using pgn-extract -Wuci). Then run a small search.
See the results, add some more nodes by running the search again, and look at how the counters change; a sketch of the whole session is shown below.
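A minimal sketch of that workflow, assuming the standard lc0 UCI loop; the FEN and moves are borrowed from the eval-spike report above, and the node counts are arbitrary placeholders:

# start lc0 with the debugging flags listed above
./lc0 --verbose-move-stats -t 1 --minibatch-size=1 --no-smart-pruning

# over UCI: set up the position (moves in UCI notation, e.g. from pgn-extract -Wuci)
position fen 4rrk1/pp3pp1/3R1n1p/2q1nB2/5B2/2P2P2/P1Q2PKP/3R4 b - - 10 22 moves f6h5 f4g3 h5f6

# search a few nodes, read the per-move stats, then grow the tree and compare
go nodes 1
go nodes 100
go nodes 10000

Depending on the lc0 version, the position command may need to be re-sent before each go; the point is simply to watch the counters printed by --verbose-move-stats (visit counts, policy priors, Q values) change as nodes are added.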
Counters:
Help wanted: