Merge pull request #564 from glinscott/next

Merge next v0.9 into master
glinscott · May 9, 2018 · 39009b4
2 parents 51e03d6 + fd182c4
commit 39009b4
Showing 47 changed files with 3,879 additions and 844 deletions.
diff --git a/README.md b/README.md
@@ -43,7 +43,7 @@ Of course, we also appreciate code reviews, pull requests and Windows testers!
 ## Example of compiling - Ubuntu 16.04
 
     # Install dependencies
-    sudo apt install g++ git libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev
+    sudo apt install cmake g++ git libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev
 
     # Test for OpenCL support & compatibility
     sudo apt install clinfo && clinfo
@@ -76,78 +76,78 @@ The weights from the distributed training are downloadable from http://lczero.or
 
 Weights that we trained to prove the engine was solid are here https://github.com/glinscott/lczero-weights. The best weights obtained through supervised learning on a human dataset were with elo ratings > 2000.
 
-# Training a new net using self-play
+# Training
 
-Running the Training is not required to help the project, only the central server needs to do this.
-The distributed part is running the client to create self-play games. Those games are uploaded on
-http://lczero.org, and used as the input to the training process.
+The training pipeline resides in `training/tf` and requires TensorFlow running on Linux (Ubuntu 16.04 in this case).
 
-After compiling lczero (see below), try the following:
-```
-cd build
-cp ../scripts/train.sh .
-./train.sh
-```
-
-This should launch lczero in training mode.  It will begin self-play games, using the weights from weights.txt (initial weights can be downloaded from the repo above).  The training data will be written into the data subdirectory.
-
-Once you have enough games, you can simply kill the process.
+## Data preparation
 
-To run the training process, you need to have CUDA and Tensorflow installed.
-See the instructions on the Tensorflow page (I used the pip installation method
-into a virtual environment).  NOTE: You need a GPU accelerated version of
-Tensorflow to train, the CPU version doesn't support the input data format that
-is used.
+In order to start a training session you first need to download training data from http://lczero.org/training_data. The data is packed in tar.gz archives, each containing 10,000 games, or chunks as we call them. Preparing the data requires the following steps:
 
-Then, make sure to set up your config. Important fields to edit are the path the
-network is stored in, and the path to the input data.
 ```
-cd training/tf
-./parse.py configs/your-config.yaml
+tar -xzf games11160000.tar.gz
+ls training.* | parallel gzip {}
 ```
 
-That will bring up Tensorflow and start running training. You can look at the config file in `training/tf/configs/example.yaml` to get an idea of all the configurable parameters. This config file is meant to be a unified configuration for all the executable pythonscripts in the training directory.  After starting the above command, you should see output like this:
-```
-2018-01-12 09:57:00.089784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0, compute capability: 5.2)
-2018-01-12 09:57:13.126277: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 43496 of 65536
-2018-01-12 09:57:18.175088: I tensorflow/core/kernels/shuffle_dataset_op.cc:121] Shuffle buffer filled.
-step 100, policy loss=7.25049 mse=0.0988732 reg=0.254439 (0 pos/s)
-step 200, policy loss=6.80895 mse=0.0904644 reg=0.255358 (3676.48 pos/s)
-step 300, policy loss=6.33088 mse=0.0823623 reg=0.256656 (3652.74 pos/s)
-step 400, policy loss=5.86768 mse=0.0748837 reg=0.258076 (3525.1 pos/s)
-step 500, policy loss=5.42553 mse=0.0680195 reg=0.259414 (3537.3 pos/s)
-step 600, policy loss=5.0178 mse=0.0618027 reg=0.260582 (3600.92 pos/s)
+This repacks each chunk into a gzipped file ready to be parsed by the training pipeline. Note that the `parallel` command uses all your cores and can be installed with `apt-get install parallel`.
+
+## Training pipeline
+
+Now that the data is in the right format, you can configure a training pipeline. The configuration lives in a YAML file; see `training/tf/configs/example.yaml`:
+
+```yaml
+%YAML 1.2
+---
+name: 'kb1-64x6'                       # ideally no spaces
+gpu: 0                                 # gpu id to process on
+
+dataset: 
+  num_chunks: 100000                   # number of newest chunks to parse
+  train_ratio: 0.90                    # training set ratio
+  input: '/path/to/chunks/*/draw/'     # supports glob
+
+training:
+    batch_size: 2048                   # training batch
+    total_steps: 140000                # terminate after these steps
+    shuffle_size: 524288               # size of the shuffle buffer
+    lr_values:                         # list of learning rates
+        - 0.02
+        - 0.002
+        - 0.0005
+    lr_boundaries:                     # list of boundaries
+        - 100000
+        - 130000
+    policy_loss_weight: 1.0            # weight of policy loss
+    value_loss_weight: 1.0             # weight of value loss
+    path: '/path/to/store/networks'    # network storage dir
+
+model:
+  filters: 64
+  residual_blocks: 6
 ...
-step 4000, training accuracy=96.9141%, mse=0.00218292
-Model saved in file: /home/gary/tmp/leela-chess/training/tf/leelaz-model-4000
 ```
 
-It saves out the new model every 4000 steps.  To evaluate the model, you can play it against itself or another AI:
-```
-cd src
-cp ../training/tf/leelaz-model-4000.txt ./newweights.txt
-cd ../scripts
-./run.sh
-```
+The configuration is fairly self-explanatory. If you're new to training, I suggest looking at the [machine learning glossary](https://developers.google.com/machine-learning/glossary/) by Google. Now you can invoke training with the following command:
 
-This runs an evaluation match using [cutechess-cli](https://github.com/cutechess/cutechess).
+```bash
+./train.py --cfg configs/example.yaml --output /tmp/mymodel.txt
+```
 
-## Supervised training
+This will initialize the pipeline and start training a new neural network. You can view progress by invoking tensorboard:
 
-If you have expert games you wish to train from in PGN, you can generate
-training data from those for the network to learn from.  Run:
-```
-./lczero --supervise games.pgn
+```bash
+tensorboard --logdir leelalogs
 ```
-That will create a folder `supervise-games`, with the training data.  You can
-then train a network against that as usual.
 
-## Stopping/starting training
+If you now point your browser at localhost:6006 you'll see the training progress as the training steps pass by. Have fun!
 
-It is safe to kill the training process and restart it at any time.  It will
-automatically resume using the tensorflow checkpoint.
+## Restoring models
+
+The training pipeline will automatically restore from a previous model if one exists in your `training:path`, as configured in your YAML config. To initialize from a raw `weights.txt` file, use `training/tf/net_to_model.py`, which will create a checkpoint for you.
+
+## Supervised training
 
-You can use this to adjust learning rates, etc.
+Generating training data from PGN files is currently broken and has low priority; feel free to create a PR.
 
 # Other projects
 

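Note on the `lr_values` / `lr_boundaries` pair in the example config above: the two lists read like a piecewise-constant learning-rate schedule, with one more value than there are boundaries. The sketch below illustrates that reading only. `LearningRateAt` is a hypothetical helper, not something in the repository, and the boundary convention (a value applies while `step <= boundary`) is an assumption borrowed from how TensorFlow documents its piecewise-constant schedule. Whether `train.py` applies exactly this rule is not shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository): returns the learning rate
// for a given training step under a piecewise-constant schedule.
// Assumption: values[i] applies while step <= boundaries[i]; the last value
// applies to every step after the final boundary.
double LearningRateAt(long step, const std::vector<long>& boundaries,
                      const std::vector<double>& values) {
  // A schedule with N boundaries needs N + 1 values.
  assert(values.size() == boundaries.size() + 1);
  for (std::size_t i = 0; i < boundaries.size(); ++i) {
    if (step <= boundaries[i]) return values[i];
  }
  return values.back();
}
```

With the numbers from `example.yaml` (boundaries 100000 and 130000, values 0.02, 0.002 and 0.0005), this reading gives 0.02 up to and including step 100000, 0.002 up to and including step 130000, and 0.0005 for the remaining steps until `total_steps`.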
diff --git a/go/src/client/main.go b/go/src/client/main.go
@@ -78,7 +78,7 @@ func getExtraParams() map[string]string {
 	return map[string]string{
 		"user":     *USER,
 		"password": *PASSWORD,
-		"version":  "8",
+		"version":  "9",
 	}
 }
 

diff --git a/lc0/LC0VSProj/LC0VSProj.vcxproj b/lc0/LC0VSProj/LC0VSProj.vcxproj
@@ -13,11 +13,13 @@
   <ItemGroup>
     <ClCompile Include="..\src\chess\bitboard.cc" />
     <ClCompile Include="..\src\chess\board.cc" />
+    <ClCompile Include="..\src\chess\position.cc" />
     <ClCompile Include="..\src\engine.cc" />
     <ClCompile Include="..\src\main.cc" />
     <ClCompile Include="..\src\mcts\node.cc" />
     <ClCompile Include="..\src\mcts\search.cc" />
     <ClCompile Include="..\src\neural\cache.cc" />
+    <ClCompile Include="..\src\neural\encoder.cc" />
     <ClCompile Include="..\src\neural\factory.cc" />
     <ClCompile Include="..\src\neural\loader.cc" />
     <ClCompile Include="..\src\neural\network_mux.cc" />

diff --git a/lc0/meson.build b/lc0/meson.build
@@ -1,7 +1,6 @@
-project('lc0', 'cpp')
-  #        default_options : ['cpp_std=c++17'])
+project('lc0', 'cpp', default_options : ['cpp_std=c++14'])
 
-add_global_arguments('-std=c++17', '-Wthread-safety', language : 'cpp')
+add_global_arguments('-Wthread-safety', language : 'cpp')
 cc = meson.get_compiler('cpp')
 
 # Installed from https://github.com/FloopCZ/tensorflow_cc
@@ -26,10 +25,11 @@ tensorflow_cc = declare_dependency(
 deps = []
 deps += tensorflow_cc
 deps += cc.find_library('stdc++fs')
+deps += cc.find_library('pthread')
 deps += cc.find_library('libcublas', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
 deps += cc.find_library('libcudnn', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
 deps += cc.find_library('libcudart', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
-# deps += dependency('libprofiler')
+# deps += cc.find_library('libprofiler', dirs: ['/usr/local/lib'])
 
 nvcc = find_program('nvcc')
 cuda_files = [
@@ -44,9 +44,11 @@ cuda_gen = generator(nvcc,
 files = [
   'src/chess/bitboard.cc',
   'src/chess/board.cc',
+  'src/chess/position.cc',
   'src/mcts/node.cc',
   'src/mcts/search.cc',
   'src/neural/cache.cc',
+  'src/neural/encoder.cc',
   'src/neural/factory.cc',
   'src/neural/loader.cc',
   'src/neural/writer.cc',

diff --git a/lc0/src/chess/board.cc b/lc0/src/chess/board.cc
@@ -171,7 +171,7 @@ static const Move::Promotion kPromotions[] = {
 
 }  // namespace
 
-MoveList ChessBoard::GeneratePseudovalidMoves() const {
+MoveList ChessBoard::GeneratePseudolegalMoves() const {
   MoveList result;
   for (auto source : our_pieces_) {
     // King
@@ -501,8 +501,88 @@ bool ChessBoard::IsUnderAttack(BoardSquare square) const {
   return false;
 }
 
-std::vector<MoveExecution> ChessBoard::GenerateValidMoves() const {
-  MoveList move_list = GeneratePseudovalidMoves();
+bool ChessBoard::IsLegalMove(Move move, bool was_under_check) const {
+  const auto& from = move.from();
+  const auto& to = move.to();
+
+  // If we are already under check, also apply move and check if valid.
+  // TODO(mooskagh) Optimize this case
+  if (was_under_check) {
+    ChessBoard board(*this);
+    board.ApplyMove(move);
+    return !board.IsUnderCheck();
+  }
+
+  // En passant. Complex but rare. Just apply
+  // and check that we are not under check.
+  if (from.row() == 4 && pawns_.get(from) && from.col() != to.col() &&
+      pawns_.get(7, to.col())) {
+    ChessBoard board(*this);
+    board.ApplyMove(move);
+    return !board.IsUnderCheck();
+  }
+
+  // If it's a king's move, check that the destination
+  // is not under attack.
+  if (from == our_king_) {
+    // Castlings were checked earlier.
+    if (std::abs(static_cast<int>(from.col()) - static_cast<int>(to.col())) > 1)
+      return true;
+    return !IsUnderAttack(to);
+  }
+
+  // Now check whether the piece was pinned. If it was, check that after the
+  // move it is still on the line of attack.
+  int dx = from.col() - our_king_.col();
+  int dy = from.row() - our_king_.row();
+
+  // If it's not on the same file/rank/diagonal as our king, cannot be pinned.
+  if (dx != 0 && dy != 0 && std::abs(dx) != std::abs(dy)) return true;
+  dx = (dx > 0) - (dx < 0);  // Sign.
+  dy = (dy > 0) - (dy < 0);
+  auto col = our_king_.col();
+  auto row = our_king_.row();
+  while (true) {
+    col += dx;
+    row += dy;
+    // Attacking line left board, good.
+    if (!BoardSquare::IsValid(row, col)) return true;
+    const BoardSquare square(row, col);
+    // The source square of the move is now free.
+    if (square == from) continue;
+    // The destination square of the move now holds our piece. King is not
+    // under attack.
+    if (square == to) return true;
+    // Our piece on the line. Not under attack.
+    if (our_pieces_.get(square)) return true;
+    if (their_pieces_.get(square)) {
+      if (dx == 0 || dy == 0) {
+        // Have to be afraid of rook-like piece.
+        return !rooks_.get(square);
+      } else {
+        // Have to be afraid of bishop-like piece.
+        return !bishops_.get(square);
+      }
+      return true;
+    }
+  }
+}
+
+MoveList ChessBoard::GenerateLegalMoves() const {
+  const bool was_under_check = IsUnderCheck();
+  MoveList move_list = GeneratePseudolegalMoves();
+  MoveList result;
+  result.reserve(move_list.size());
+
+  for (Move m : move_list) {
+    if (IsLegalMove(m, was_under_check)) result.emplace_back(m);
+  }
+
+  return result;
+}
+
+std::vector<MoveExecution> ChessBoard::GenerateLegalMovesAndPositions() const {
+  MoveList move_list = GeneratePseudolegalMoves();
   std::vector<MoveExecution> result;
 
   for (const auto& move : move_list) {
@@ -624,24 +704,22 @@ bool ChessBoard::HasMatingMaterial() const {
   int our = __builtin_popcountll(our_pieces_.as_int());
   int their = __builtin_popcountll(their_pieces_.as_int());
 #endif
-  if (our > 2 || their > 2) {
+  if (our + their < 4) {
+    // K v K, K+B v K, K+N v K.
+    return false;
+  }
+  if (!our_knights().empty() || !their_knights().empty()) {
     return true;
   }
 
-  if (our == 1 || their == 1) return false;
-
-  bool odd_bishop = false;
-  bool even_bishop = false;
-  int bishop_count = 0;
-  for (auto x : bishops_) {
-    ++bishop_count;
-    if (x.as_int() % 2)
-      odd_bishop = true;
-    else
-      even_bishop = true;
-  }
-  if (bishop_count > 1 && (even_bishop != odd_bishop)) return false;
-  return true;
+  // Only kings and bishops remain.
+
+  constexpr BitBoard kLightSquares(0x55AA55AA55AA55AAULL);
+  constexpr BitBoard kDarkSquares(0xAA55AA55AA55AA55ULL);
+
+  bool light_bishop = bishops_.intersects(kLightSquares);
+  bool dark_bishop = bishops_.intersects(kDarkSquares);
+  return light_bishop && dark_bishop;
 }
 
 string ChessBoard::DebugString() const {
@@ -691,4 +769,4 @@ string ChessBoard::DebugString() const {
   return result;
 }
 
-}  // namespace lczero
+}  // namespace lczero
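Note on the new `HasMatingMaterial()` tail: once only kings and bishops remain, the function returns true only if bishops stand on both square colours. The standalone sketch below rebuilds the light-square mask from first principles as a sanity check of the two constants in the diff. It assumes the square index is `row * 8 + col` with a1 at bit 0, which is not visible in this diff. `LightSquareMask` and `BishopsOnBothColours` are hypothetical names for illustration and operate on raw 64-bit integers rather than the project's `BitBoard` class.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: bit index = row * 8 + col, with a1 = bit 0 and h8 = bit 63.
// Under that layout a1 is dark, so a square is light when (row + col) is odd.
inline std::uint64_t LightSquareMask() {
  std::uint64_t mask = 0;
  for (int row = 0; row < 8; ++row) {
    for (int col = 0; col < 8; ++col) {
      if ((row + col) % 2 == 1) mask |= std::uint64_t{1} << (row * 8 + col);
    }
  }
  return mask;
}

// Hypothetical stand-in for the final check in HasMatingMaterial():
// true only if at least one bishop sits on each square colour.
inline bool BishopsOnBothColours(std::uint64_t bishops) {
  const std::uint64_t light = LightSquareMask();
  return (bishops & light) != 0 && (bishops & ~light) != 0;
}
```

Under the stated layout, `LightSquareMask()` evaluates to `0x55AA55AA55AA55AA` and its complement to `0xAA55AA55AA55AA55`, matching `kLightSquares` and `kDarkSquares`. Bishops on c1 and f1 (bits 2 and 5) satisfy the check, while bishops on c1 and e3 (bits 2 and 20), both dark squares, do not.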
diff --git a/lc0/src/chess/board.h b/lc0/src/chess/board.h
@@ -47,7 +47,7 @@ class ChessBoard {
 
   // Generates list of possible moves for "ours" (white), but may leave king
   // under check.
-  MoveList GeneratePseudovalidMoves() const;
+  MoveList GeneratePseudolegalMoves() const;
   // Applies the move. (Only for "ours" (white)). Returns true if 50 moves
   // counter should be removed.
   bool ApplyMove(Move move);
@@ -56,9 +56,14 @@ class ChessBoard {
   // Checks if "our" (white) king is under check.
   bool IsUnderCheck() const { return IsUnderAttack(our_king_); }
   // Checks whether at least one of the sides has mating material.
+
   bool HasMatingMaterial() const;
-  // Returns a list of valid moves and board positions after the move is made.
-  std::vector<MoveExecution> GenerateValidMoves() const;
+  // Generates legal moves.
+  MoveList GenerateLegalMoves() const;
+  // Check whether pseudolegal move is legal.
+  bool IsLegalMove(Move move, bool was_under_check) const;
+  // Returns a list of legal moves and board positions after the move is made.
+  std::vector<MoveExecution> GenerateLegalMovesAndPositions() const;
 
   uint64_t Hash() const {
     return HashCat({our_pieces_.as_int(), their_pieces_.as_int(),