Reorganize markdown docs
IlyaMuravjov committed May 5, 2024
1 parent e06f67d commit 7494806
Showing 8 changed files with 426 additions and 251 deletions.
63 changes: 10 additions & 53 deletions README.md
@@ -1,76 +1,33 @@
## CFPQ_PyAlgo

The CFPQ_PyAlgo is a repository for developing, testing and evaluating solvers for
Formal-Language-Constrained Path Problems, such as Context-Free Path Queries and Regular Path Queries.
Formal-Language-Constrained Path Problems, such as Context-Free Path Queries (CFPQ) and Regular Path Queries (RPQ).

All algorithms are based on the [GraphBLAS](http://graphblas.org/index.php?title=Graph_BLAS_Forum) framework that allows to represent graphs as matrices
All algorithms are based on the [GraphBLAS](http://graphblas.org) framework that allows to represent graphs as matrices
and work with them in terms of linear algebra.

## Installation
First of all you need to clone repository with its submodules:

```bash
git clone --recurse-submodules -b murav/optimize-matrix https://github.com/JetBrains-Research/CFPQ_PyAlgo.git
cd CFPQ_PyAlgo/
git submodule init
git submodule update
```
Then the easiest way to get started is to use Docker. An alternative is to install everything directly.

### Using Docker
The first way to start is to use Docker:

```bash
# build docker image
docker build --tag cfpq_py_algo .

# run docker container
docker run --rm -it -v ${PWD}:/CFPQ_PyAlgo cfpq_py_algo bash
```
After it, you can develop everything locally and run tests and benchmarks inside the container.
Also, you can use PyCharm Professional and [configure an interpreter using Docker](https://www.jetbrains.com/help/pycharm/using-docker-as-a-remote-interpreter.html).

### Direct install
The other way is to install everything into your local Python 3.9 interpreter or virtual environment.

First of all you need to install [pygraphblas](https://github.com/michelp/pygraphblas) package.
```bash
pip3 install pygraphblas==5.1.8.0
```
Secondly you need to install `cfpq_data_devtools` package and other requirements:

```bash
cd deps/CFPQ_Data
pip3 install -r requirements.txt
python3 setup.py install --user

cd ../../
pip3 install pygraphblas==5.1.8.0 # optional (needed for legacy algorithms and their tests)
pip3 install -r requirements.txt
```
To check if the installation was successful you can run simple tests
```bash
python3 -m pytest test -v -m "CI"
```
For the installation instructions, refer to [docs/install.md](docs/install.md).

## CLI
CFPQ_Algo provides a command line interface for running
CFPQ_PyAlgo provides a command line interface for running
all-pairs CFPQ solver with relation query semantics.

See [cfpq_cli/README](cfpq_cli/README.md) for more details.
For more details, refer to [docs/cli.md](docs/cli.md).

## Evaluation

CFPQ_PyAlgo provides scripts for performing evaluating performance
of various CFPQ solvers (icluding third-party ones).
CFPQ_PyAlgo provides scripts for evaluating performance
of various CFPQ solvers (including third-party ones).

See [cfpq_eval/README](cfpq_eval/README.md) for more details.
For more details, refer to [docs/eval.md](docs/eval.md).

## Project structure
The global project structure is the following:
The global project structure is the following.

```
├── cfpq_algo - new optimized CFPQ algorithm implementations
├── cfpq_algo - FastMatrixCFPQ and MatrixCFPQ algorithms implementations
├── cfpq_cli - scripts for running CFPQ algorithms
├── cfpq_eval - scripts for evaluating performance of various CFPQ solvers (including third-party ones)
├── cfpq_matrix - matrix wrappers that improve performance of operations with matrices
```
105 changes: 3 additions & 102 deletions cfpq_cli/README.md
@@ -1,106 +1,7 @@
# CFPQ_CLI
## CFPQ CLI

The `cfpq_cli` module provides a Command Line Interface (CLI) for solving
Context-Free Language Reachability (CFL-r) problem for all vertex pairs
[the Context-Free Language Reachability (CFL-r) problem](../docs/clfr_problem) for all vertex pairs
in a graph with respect to a specified context-free grammar.

## Getting Started

Ensure the CFPQ_PyAlgo project is properly set up on your system before using the CLI.
Setup instructions are available in the project's main [README](../README.md).

## Usage

### Running the Script

For detailed information on script options, execute the following command:

```bash
cd .. # Should be run from CFPQ_PyAlgo project root directory
python3 -m cfpq_cli.run_all_pairs_cflr --help
```

The basic command usage is as follows:

```
python3 -m cfpq_cli.run_all_pairs_cflr [OPTIONS] ALGORITHM GRAPH GRAMMAR
```

- `ALGORITHM` selects the algorithm. The available options are `IncrementalAllPairsCFLReachabilityMatrix` and `NonIncrementalAllPairsCFLReachabilityMatrix`.
- `GRAPH` specifies the path to the graph file.
- `GRAMMAR` indicates the path to the grammar file.

#### Optional Arguments

- `--time-limit TIME_LIMIT` sets the maximum execution time in seconds.
- `--out OUT` specifies the output file for saving vertex pairs.
- `--disable-optimize-block-matrix` disables the optimization of block matrices.
- `--disable-optimize-empty` disables the optimization for empty matrices.
- `--disable-lazy-add` disables lazy addition optimization.
- `--disable-optimize-format` disables optimization of matrix formats.

### Example

To solve the CFL-r problem using the incremental algorithm with a 60-second time limit for the graph
[indexed_tree.g](../test/pocr_data/indexed_an_bn/indexed_tree.g) and the grammar
[an_bn_indexed.cnf](../test/pocr_data/indexed_an_bn/an_bn_indexed.cnf), saving the results to
[results.txt](../results.txt), execute:

```bash
cd .. # Should be run from CFPQ_PyAlgo project root directory
python3 -m cfpq_cli.run_all_pairs_cflr \
IncrementalAllPairsCFLReachabilityMatrix \
test/pocr_data/indexed_an_bn/indexed_tree.g \
test/pocr_data/indexed_an_bn/an_bn_indexed.cnf \
--time-limit 60 \
--out results.txt
```
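
When the CLI is driven from a script, the same invocation can be assembled programmatically. The following is a minimal sketch that uses only the positional arguments and the `--time-limit` and `--out` options listed above; the helper name `build_cflr_command` is hypothetical and not part of CFPQ_PyAlgo.

```python
from typing import List, Optional


def build_cflr_command(
    algorithm: str,
    graph: str,
    grammar: str,
    time_limit: Optional[int] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for cfpq_cli.run_all_pairs_cflr.

    Hypothetical helper: it only mirrors the usage line documented above.
    """
    command = ["python3", "-m", "cfpq_cli.run_all_pairs_cflr", algorithm, graph, grammar]
    if time_limit is not None:
        # Maximum execution time in seconds.
        command += ["--time-limit", str(time_limit)]
    if out is not None:
        # Output file for the resulting vertex pairs.
        command += ["--out", out]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` with the working directory set to the CFPQ_PyAlgo project root.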

### Grammar Format

The grammar file should be formatted with each production rule on a separate line, adhering to the following schema:

```
<LEFT_SYMBOL> [RIGHT_SYMBOL_1] [RIGHT_SYMBOL_2]
```

- `<LEFT_SYMBOL>`: the symbol on the left-hand side of a production rule.
- `[RIGHT_SYMBOL_1]` and `[RIGHT_SYMBOL_2]`: the symbols on the right-hand side of the production rule; each of them is optional.
- The symbols must be separated by whitespace.
- The last two lines specify the start symbol in the format:
```
Count:
<START_SYMBOL>
```

#### Example
```
S AS_i b_i
AS_i a_i S
S c
Count:
S
```
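
To make the grammar format concrete, here is a minimal parsing sketch written only from the description above. It is not the parser used by CFPQ_PyAlgo; the names `Grammar` and `parse_grammar` are hypothetical, and skipping blank lines is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Grammar:
    """Hypothetical container: a start symbol plus (left, right-hand side) rules."""
    start: str
    rules: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)


def parse_grammar(text: str) -> Grammar:
    # Split every non-blank line into whitespace-separated symbols.
    lines = [line.split() for line in text.splitlines() if line.strip()]
    # The last two lines must be "Count:" followed by the start symbol.
    if len(lines) < 2 or lines[-2] != ["Count:"] or len(lines[-1]) != 1:
        raise ValueError("expected 'Count:' followed by the start symbol")
    rules = []
    for symbols in lines[:-2]:
        # One left symbol and at most two right symbols per production rule.
        if len(symbols) > 3:
            raise ValueError(f"too many symbols in rule: {' '.join(symbols)}")
        rules.append((symbols[0], tuple(symbols[1:])))
    return Grammar(start=lines[-1][0], rules=rules)
```

Applied to the example grammar above, this yields the start symbol `S` and three rules, the last of which has the single right-hand symbol `c`.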

### Graph Format

The graph file should represent edges using the format:

```
<EDGE_SOURCE> <EDGE_DESTINATION> <EDGE_LABEL> [LABEL_INDEX]
```

- `<EDGE_SOURCE>` and `<EDGE_DESTINATION>`: specify the source and destination nodes of an edge.
- `<EDGE_LABEL>`: the label associated with the edge.
- `[LABEL_INDEX]`: an optional index for labels with subscripts, indicating the subscript value.
- The symbols must be separated by whitespace.
- Labels with subscripts must end with "\_i". For example, an edge $1 \xrightarrow{x_{10}} 2$ is denoted by `1 2 x_i 10`.

#### Example
```
1 2 a_i 1
2 3 b_i 1
2 4 b_i 2
1 5 c
```
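
The graph format can be read with a few lines of code. The sketch below follows the description above and is not the loader used by CFPQ_PyAlgo; the names `Edge` and `parse_graph` are hypothetical, and treating vertices and label indices as integers is an assumption based on the example.

```python
from typing import List, NamedTuple, Optional


class Edge(NamedTuple):
    """Hypothetical edge record; index is None for labels without subscripts."""
    source: int
    destination: int
    label: str
    index: Optional[int] = None


def parse_graph(text: str) -> List[Edge]:
    edges = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # Assumption: blank lines are ignored.
        if len(parts) not in (3, 4):
            raise ValueError(f"malformed edge line: {line!r}")
        # The optional fourth field carries the subscript value of an "_i" label.
        index = int(parts[3]) if len(parts) == 4 else None
        edges.append(Edge(int(parts[0]), int(parts[1]), parts[2], index))
    return edges
```

For the example graph above, the first line becomes an edge from vertex 1 to vertex 2 with label `a_i` and index 1, while the last line becomes an edge labelled `c` with no index.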
For more details, refer to [docs/cli.md](../docs/cli.md).
100 changes: 4 additions & 96 deletions cfpq_eval/README.md
@@ -1,98 +1,6 @@
# CFPQ Evaluation
## CFPQ Evaluator

The `cfpq_eval` module evaluates performance of various CFPQ solvers,
integrating with both CFPQ_PyAlgo itself and third-party tools.
The `cfpq_eval` module is responsible for evaluating performance of various Context-Free Path Querying (CFPQ) solvers,
including both CFPQ_PyAlgo itself and third-party tools.

## Setting up the environment

Build and run a Docker container for evaluation using [Dockerfile-all-tools](../Dockerfile-all-tools).

Build Docker image:
```bash
cd .. # Should be run from CFPQ_PyAlgo project root directory

# Load base image
wget -O pearl.tar.gz https://figshare.com/ndownloader/files/42214812
docker load --input pearl.tar.gz
rm pearl.tar.gz

# Build eval image
docker build -f Dockerfile-all-tools -t cfpq/py_algo_eval .
```

Run Docker container:
```bash
docker run -it cfpq/py_algo_eval bash
```

## Running the Script

For detailed information on evaluation script options, execute the following command:

```bash
cd .. # Should be run from CFPQ_PyAlgo project root directory
python3 -m cfpq_eval.eval_all_pairs_cflr --help
```

The basic command usage is as follows:

```
# Should be run in cfpq_eval Docker container
python3 -m cfpq_eval.eval_all_pairs_cflr algo_config.csv data_config.csv results_path [--rounds ROUNDS] [--timeout TIMEOUT]
```

- `algo_config.csv` specifies algorithm configurations (e.g. `configs/algo/fast_matrix_cfpq.csv`).
- `data_config.csv` specifies the dataset (e.g. `configs/data/small_examples.csv`).
- `results_path` specifies path for saving raw results.
- `--rounds` sets the number of runs per configuration (default is 1).
- `--timeout` limits each configuration's execution time in seconds (optional).

## Configuration Files

### Premade Configurations

The CFPQ_eval Docker image includes premade configurations located in the `/py_algo/configs` folder.

### Algorithm Configuration

The `algo_config.csv` outlines algorithms and settings. Supported algorithms:

- `IncrementalAllPairsCFLReachabilityMatrix`
- `NonIncrementalAllPairsCFLReachabilityMatrix`
- `pocr`
- `pearl`
- `graspan`
- `gigascale`

For matrix-based algorithms, the options described in [cfpq_cli/README](../cfpq_cli/README.md)
can be used to alter the behaviour.

#### Example

```
algo_name,algo_settings
"Matrix (some optimizations disabled)",IncrementalAllPairsCFLReachabilityMatrix --disable-optimize-empty --disable-lazy-add
"pocr",pocr
```
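
To illustrate how such a file could be consumed, the sketch below reads the two columns with the standard `csv` module. It assumes that `algo_settings` consists of an algorithm name followed by optional command line flags, as in the example above; the function name `read_algo_config` is hypothetical and this is not the evaluator's actual loader.

```python
import csv
import io
import shlex
from typing import List, Tuple


def read_algo_config(text: str) -> List[Tuple[str, str, List[str]]]:
    """Return (display name, algorithm, options) for each row of the config."""
    configs = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: the first token names the algorithm, the rest are its options.
        algorithm, *options = shlex.split(row["algo_settings"])
        configs.append((row["algo_name"], algorithm, options))
    return configs
```

For the example above, the first row yields the algorithm `IncrementalAllPairsCFLReachabilityMatrix` with two options, and the second row yields `pocr` with none.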

### Data Configuration

The `data_config.csv` file pairs graph and grammar files.
The referenced files should be in the format described in [cfpq_cli/README](../cfpq_cli/README.md).

#### Example

```
graph_path,grammar_path
data/graphs/aa/leela.g,data/grammars/aa.cnf
data/graphs/java/eclipse.g,data/grammars/java_points_to.cnf
```

## Interpreting Results

Raw data is saved to `results_path`, while a quick summary, including mean execution time,
memory usage, and output size, is printed to the standard output stream.

## Custom Tools Integration

Custom CFPQ solvers can be evaluated by implementing the `AllPairsCflrToolRunner` interface
and updating the `run_appropriate_all_pairs_cflr_tool()` function.
For more details, refer to [docs/eval.md](../docs/eval.md).
10 changes: 10 additions & 0 deletions docs/README.md
@@ -0,0 +1,10 @@
## CFPQ_PyAlgo Documentation

This folder contains documentation for CFPQ_PyAlgo project.

## Contents

- [Solver installation](install.md)
- [Solver usage](cli.md)
- [Performance evaluator installation](eval_install.md)
- [Performance evaluator usage](eval.md)