Merge pull request #564 from glinscott/next
Merge next v0.9 into master
killerducky authored May 9, 2018
2 parents 51e03d6 + fd182c4 commit 39009b4
Showing 47 changed files with 3,879 additions and 844 deletions.
112 changes: 56 additions & 56 deletions README.md
@@ -43,7 +43,7 @@ Of course, we also appreciate code reviews, pull requests and Windows testers!
## Example of compiling - Ubuntu 16.04

# Install dependencies
sudo apt install g++ git libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev
sudo apt install cmake g++ git libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev

# Test for OpenCL support & compatibility
sudo apt install clinfo && clinfo
@@ -76,78 +76,78 @@ The weights from the distributed training are downloadable from http://lczero.or

Weights that we trained to prove the engine was solid are here https://github.com/glinscott/lczero-weights. The best weights obtained through supervised learning on a human dataset were with elo ratings > 2000.

# Training a new net using self-play
# Training

Running the Training is not required to help the project, only the central server needs to do this.
The distributed part is running the client to create self-play games. Those games are uploaded on
http://lczero.org, and used as the input to the training process.
The training pipeline resides in `training/tf` and requires TensorFlow running on Linux (Ubuntu 16.04 in this case).

After compiling lczero (see below), try the following:
```
cd build
cp ../scripts/train.sh .
./train.sh
```

This should launch lczero in training mode. It will begin self-play games, using the weights from weights.txt (initial weights can be downloaded from the repo above). The training data will be written into the data subdirectory.

Once you have enough games, you can simply kill the process.
## Data preparation

To run the training process, you need to have CUDA and Tensorflow installed.
See the instructions on the Tensorflow page (I used the pip installation method
into a virtual environment). NOTE: You need a GPU accelerated version of
Tensorflow to train, the CPU version doesn't support the input data format that
is used.
To start a training session, you first need to download training data from http://lczero.org/training_data. This data is packed in tar.gz archives, each containing 10'000 games (or chunks, as we call them). Preparing data requires the following steps:

Then, make sure to set up your config. Important fields to edit are the path the
network is stored in, and the path to the input data.
```
cd training/tf
./parse.py configs/your-config.yaml
tar -xzf games11160000.tar.gz
ls training.* | parallel gzip {}
```

That will bring up Tensorflow and start running training. You can look at the config file in `training/tf/configs/example.yaml` to get an idea of all the configurable parameters. This config file is meant to be a unified configuration for all the executable pythonscripts in the training directory. After starting the above command, you should see output like this:
```
2018-01-12 09:57:00.089784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0, compute capability: 5.2)
2018-01-12 09:57:13.126277: I tensorflow/core/kernels/shuffle_dataset_op.cc:110] Filling up shuffle buffer (this may take a while): 43496 of 65536
2018-01-12 09:57:18.175088: I tensorflow/core/kernels/shuffle_dataset_op.cc:121] Shuffle buffer filled.
step 100, policy loss=7.25049 mse=0.0988732 reg=0.254439 (0 pos/s)
step 200, policy loss=6.80895 mse=0.0904644 reg=0.255358 (3676.48 pos/s)
step 300, policy loss=6.33088 mse=0.0823623 reg=0.256656 (3652.74 pos/s)
step 400, policy loss=5.86768 mse=0.0748837 reg=0.258076 (3525.1 pos/s)
step 500, policy loss=5.42553 mse=0.0680195 reg=0.259414 (3537.3 pos/s)
step 600, policy loss=5.0178 mse=0.0618027 reg=0.260582 (3600.92 pos/s)
This repacks each chunk into a gzipped file ready to be parsed by the training pipeline. Note that the `parallel` command uses all your cores and can be installed with `apt-get install parallel`.

## Training pipeline

Now that the data is in the right format, you can configure a training pipeline. The configuration lives in a yaml file; see `training/tf/configs/example.yaml`:

```yaml
%YAML 1.2
---
name: 'kb1-64x6' # ideally no spaces
gpu: 0 # gpu id to process on

dataset:
  num_chunks: 100000                   # newest nof chunks to parse
  train_ratio: 0.90                    # trainingset ratio
  input: '/path/to/chunks/*/draw/'     # supports glob

training:
  batch_size: 2048                     # training batch
  total_steps: 140000                  # terminate after these steps
  shuffle_size: 524288                 # size of the shuffle buffer
  lr_values:                           # list of learning rates
    - 0.02
    - 0.002
    - 0.0005
  lr_boundaries:                       # list of boundaries
    - 100000
    - 130000
  policy_loss_weight: 1.0              # weight of policy loss
  value_loss_weight: 1.0               # weight of value loss
  path: '/path/to/store/networks'      # network storage dir

model:
  filters: 64
  residual_blocks: 6
...
step 4000, training accuracy=96.9141%, mse=0.00218292
Model saved in file: /home/gary/tmp/leela-chess/training/tf/leelaz-model-4000
```

It saves out the new model every 4000 steps. To evaluate the model, you can play it against itself or another AI:
```
cd src
cp ../training/tf/leelaz-model-4000.txt ./newweights.txt
cd ../scripts
./run.sh
```
The configuration is fairly self-explanatory. If you're new to training, I suggest looking at the [machine learning glossary](https://developers.google.com/machine-learning/glossary/) by Google. Now you can invoke training with the following command:

This runs an evaluation match using [cutechess-cli](https://github.com/cutechess/cutechess).
```bash
./train.py --cfg configs/example.yaml --output /tmp/mymodel.txt
```

## Supervised training
This will initialize the pipeline and start training a new neural network. You can view progress by invoking tensorboard:

If you have expert games you wish to train from in PGN, you can generate
training data from those for the network to learn from. Run:
```
./lczero --supervise games.pgn
```bash
tensorboard --logdir leelalogs
```
That will create a folder `supervise-games`, with the training data. You can
then train a network against that as usual.

## Stopping/starting training
If you now point your browser at localhost:6006, you'll see the training progress as the training steps pass by. Have fun!

It is safe to kill the training process and restart it at any time. It will
automatically resume using the tensorflow checkpoint.
## Restoring models

The training pipeline will automatically restore from a previous model if one exists in your `training:path`, as configured by your yaml config. To initialize from a raw `weights.txt` file, use `training/tf/net_to_model.py`, which will create a checkpoint for you.

## Supervised training

You can use this to adjust learning rates, etc.
Generating training data from PGN files is currently broken and has low priority; feel free to create a PR.

# Other projects

2 changes: 1 addition & 1 deletion go/src/client/main.go
@@ -78,7 +78,7 @@ func getExtraParams() map[string]string {
return map[string]string{
"user": *USER,
"password": *PASSWORD,
"version": "8",
"version": "9",
}
}

2 changes: 2 additions & 0 deletions lc0/LC0VSProj/LC0VSProj.vcxproj
@@ -13,11 +13,13 @@
<ItemGroup>
<ClCompile Include="..\src\chess\bitboard.cc" />
<ClCompile Include="..\src\chess\board.cc" />
<ClCompile Include="..\src\chess\position.cc" />
<ClCompile Include="..\src\engine.cc" />
<ClCompile Include="..\src\main.cc" />
<ClCompile Include="..\src\mcts\node.cc" />
<ClCompile Include="..\src\mcts\search.cc" />
<ClCompile Include="..\src\neural\cache.cc" />
<ClCompile Include="..\src\neural\encoder.cc" />
<ClCompile Include="..\src\neural\factory.cc" />
<ClCompile Include="..\src\neural\loader.cc" />
<ClCompile Include="..\src\neural\network_mux.cc" />
10 changes: 6 additions & 4 deletions lc0/meson.build
@@ -1,7 +1,6 @@
project('lc0', 'cpp')
# default_options : ['cpp_std=c++17'])
project('lc0', 'cpp', default_options : ['cpp_std=c++14'])

add_global_arguments('-std=c++17', '-Wthread-safety', language : 'cpp')
add_global_arguments('-Wthread-safety', language : 'cpp')
cc = meson.get_compiler('cpp')

# Installed from https://github.com/FloopCZ/tensorflow_cc
@@ -26,10 +25,11 @@ tensorflow_cc = declare_dependency(
deps = []
deps += tensorflow_cc
deps += cc.find_library('stdc++fs')
deps += cc.find_library('pthread')
deps += cc.find_library('libcublas', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
deps += cc.find_library('libcudnn', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
deps += cc.find_library('libcudart', dirs: ['/opt/cuda/lib64/', '/usr/local/cuda/lib64/'])
# deps += dependency('libprofiler')
# deps += cc.find_library('libprofiler', dirs: ['/usr/local/lib'])

nvcc = find_program('nvcc')
cuda_files = [
@@ -44,9 +44,11 @@ cuda_gen = generator(nvcc,
files = [
'src/chess/bitboard.cc',
'src/chess/board.cc',
'src/chess/position.cc',
'src/mcts/node.cc',
'src/mcts/search.cc',
'src/neural/cache.cc',
'src/neural/encoder.cc',
'src/neural/factory.cc',
'src/neural/loader.cc',
'src/neural/writer.cc',
116 changes: 97 additions & 19 deletions lc0/src/chess/board.cc
@@ -171,7 +171,7 @@ static const Move::Promotion kPromotions[] = {

} // namespace

MoveList ChessBoard::GeneratePseudovalidMoves() const {
MoveList ChessBoard::GeneratePseudolegalMoves() const {
MoveList result;
for (auto source : our_pieces_) {
// King
@@ -501,8 +501,88 @@ bool ChessBoard::IsUnderAttack(BoardSquare square) const {
return false;
}

std::vector<MoveExecution> ChessBoard::GenerateValidMoves() const {
MoveList move_list = GeneratePseudovalidMoves();
bool ChessBoard::IsLegalMove(Move move, bool was_under_check) const {
  const auto& from = move.from();
  const auto& to = move.to();

  // If we are already under check, also apply move and check if valid.
  // TODO(mooskagh) Optimize this case
  if (was_under_check) {
    ChessBoard board(*this);
    board.ApplyMove(move);
    return !board.IsUnderCheck();
  }

  // En passant. Complex but rare. Just apply
  // and check that we are not under check.
  if (from.row() == 4 && pawns_.get(from) && from.col() != to.col() &&
      pawns_.get(7, to.col())) {
    ChessBoard board(*this);
    board.ApplyMove(move);
    return !board.IsUnderCheck();
  }

  // If it's a king move, check that the destination
  // is not under attack.
  if (from == our_king_) {
    // Castlings were checked earlier.
    if (std::abs(static_cast<int>(from.col()) - static_cast<int>(to.col())) > 1)
      return true;
    return !IsUnderAttack(to);
  }

  // Now check whether the piece was pinned. If it was, check that after the
  // move it is still on the line of attack.
  int dx = from.col() - our_king_.col();
  int dy = from.row() - our_king_.row();

  // If it's not on the same file/rank/diagonal as our king, cannot be pinned.
  if (dx != 0 && dy != 0 && std::abs(dx) != std::abs(dy)) return true;
  dx = (dx > 0) - (dx < 0);  // Sign.
  dy = (dy > 0) - (dy < 0);
  auto col = our_king_.col();
  auto row = our_king_.row();
  while (true) {
    col += dx;
    row += dy;
    // Attacking line left board, good.
    if (!BoardSquare::IsValid(row, col)) return true;
    const BoardSquare square(row, col);
    // The source square of the move is now free.
    if (square == from) continue;
    // The destination square of the move is our piece. King is not under
    // attack.
    if (square == to) return true;
    // Our piece on the line. Not under attack.
    if (our_pieces_.get(square)) return true;
    if (their_pieces_.get(square)) {
      if (dx == 0 || dy == 0) {
        // Have to be afraid of rook-like piece.
        return !rooks_.get(square);
      } else {
        // Have to be afraid of bishop-like piece.
        return !bishops_.get(square);
      }
      return true;
    }
  }
}
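
The pin check in `IsLegalMove` rests on two small pieces of arithmetic: deciding whether the moved piece shares a file, rank or diagonal with the king, and reducing the offset to a unit step for walking that line. A minimal standalone sketch of just that arithmetic follows. It uses plain `int` offsets instead of `BoardSquare`, and the names `Sign` and `SharesLineWithKing` are hypothetical helpers, not functions from this commit.

```cpp
#include <cassert>
#include <cstdlib>

// Returns -1, 0 or +1 depending on the sign of x
// (the same branch-free expression used in the diff).
inline int Sign(int x) { return (x > 0) - (x < 0); }

// dx, dy: offset of the moved piece from our king (piece minus king).
// Returns true if the piece is on the king's file, rank or diagonal,
// i.e. if it could be pinned at all. Hypothetical helper for illustration.
inline bool SharesLineWithKing(int dx, int dy) {
  assert(dx != 0 || dy != 0);  // The piece is assumed not to be the king.
  return dx == 0 || dy == 0 || std::abs(dx) == std::abs(dy);
}
```

With these, a piece at offset `(3, -3)` is on a diagonal and the walk from the king proceeds in steps of `(Sign(3), Sign(-3))`, that is `(1, -1)`. A piece at offset `(1, 2)` is a knight's jump away and can never be pinned.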

MoveList ChessBoard::GenerateLegalMoves() const {
  const bool was_under_check = IsUnderCheck();
  MoveList move_list = GeneratePseudolegalMoves();
  MoveList result;
  result.reserve(move_list.size());

  for (Move m : move_list) {
    if (IsLegalMove(m, was_under_check)) result.emplace_back(m);
  }

  return result;
}

std::vector<MoveExecution> ChessBoard::GenerateLegalMovesAndPositions() const {
MoveList move_list = GeneratePseudolegalMoves();
std::vector<MoveExecution> result;

for (const auto& move : move_list) {
@@ -624,24 +704,22 @@ bool ChessBoard::HasMatingMaterial() const {
int our = __builtin_popcountll(our_pieces_.as_int());
int their = __builtin_popcountll(their_pieces_.as_int());
#endif
if (our > 2 || their > 2) {
if (our + their < 4) {
// K v K, K+B v K, K+N v K.
return false;
}
if (!our_knights().empty() || !their_knights().empty()) {
return true;
}

if (our == 1 || their == 1) return false;

bool odd_bishop = false;
bool even_bishop = false;
int bishop_count = 0;
for (auto x : bishops_) {
++bishop_count;
if (x.as_int() % 2)
odd_bishop = true;
else
even_bishop = true;
}
if (bishop_count > 1 && (even_bishop != odd_bishop)) return false;
return true;
// Only kings and bishops remain.

constexpr BitBoard kLightSquares(0x55AA55AA55AA55AAULL);
constexpr BitBoard kDarkSquares(0xAA55AA55AA55AA55ULL);

bool light_bishop = bishops_.intersects(kLightSquares);
bool dark_bishop = bishops_.intersects(kDarkSquares);
return light_bishop && dark_bishop;
}
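
The new bishop test replaces the parity loop with two bitboard masks. A standalone sketch of the same idea on a raw `uint64_t` is shown here. It assumes bit index `row * 8 + col` with a1 at bit 0, which is an assumption about the `BitBoard` layout that is not shown in this diff. `SquareBit` and `BishopsOnBothColors` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Same constants as in the diff; under the assumed layout a1 (bit 0) is dark.
constexpr std::uint64_t kLightSquares = 0x55AA55AA55AA55AAULL;
constexpr std::uint64_t kDarkSquares = 0xAA55AA55AA55AA55ULL;

// Bit for a square, assuming index = row * 8 + col (hypothetical helper).
inline std::uint64_t SquareBit(int row, int col) {
  assert(row >= 0 && row < 8 && col >= 0 && col < 8);
  return 1ULL << (row * 8 + col);
}

// True if the given bishop set occupies both square colours.
inline bool BishopsOnBothColors(std::uint64_t bishops) {
  return (bishops & kLightSquares) != 0 && (bishops & kDarkSquares) != 0;
}
```

Under that layout, bishops on c1 and f1 stand on opposite colours, while bishops on c1 and a3 are both on dark squares, so only the first pair passes the test.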

string ChessBoard::DebugString() const {
@@ -691,4 +769,4 @@ string ChessBoard::DebugString() const {
return result;
}

} // namespace lczero
} // namespace lczero
11 changes: 8 additions & 3 deletions lc0/src/chess/board.h
@@ -47,7 +47,7 @@ class ChessBoard {

// Generates list of possible moves for "ours" (white), but may leave king
// under check.
MoveList GeneratePseudovalidMoves() const;
MoveList GeneratePseudolegalMoves() const;
// Applies the move. (Only for "ours" (white)). Returns true if 50 moves
// counter should be removed.
bool ApplyMove(Move move);
@@ -56,9 +56,14 @@ class ChessBoard {
// Checks if "our" (white) king is under check.
bool IsUnderCheck() const { return IsUnderAttack(our_king_); }
// Checks whether at least one of the sides has mating material.

bool HasMatingMaterial() const;
// Returns a list of valid moves and board positions after the move is made.
std::vector<MoveExecution> GenerateValidMoves() const;
// Generates legal moves.
MoveList GenerateLegalMoves() const;
// Check whether pseudolegal move is legal.
bool IsLegalMove(Move move, bool was_under_check) const;
// Returns a list of legal moves and board positions after the move is made.
std::vector<MoveExecution> GenerateLegalMovesAndPositions() const;

uint64_t Hash() const {
return HashCat({our_pieces_.as_int(), their_pieces_.as_int(),