Inductive Logic Programming (ILP) Grammar and Linter

This repository defines a simple grammar (cmd/ILPLang.g4) and a command-line tool that lints datasets for formatting problems.

Overview

The goal is a linter binary that points out issues found while tokenizing or parsing a dataset.

Example 1: No Errors

When the dataset is well-formatted, the linter prints nothing:

smokes(person1).
friends(person1,person2).
friends(person2,person1).

./linter -tokens -file=examples/pos/pos1.txt
./linter -file=examples/pos/pos1.txt
# (No output for either case)

Example 2: Bad Data

When the data contains something that cannot be recognized, the problems are reported on stderr:

friends(person1,person2).
Bad Data.

./linter -tokens -file=examples/neg/neg1.txt
line 2:0 token recognition error at: 'B'
line 2:3 token recognition error at: ' '
line 2:4 token recognition error at: 'D'
./linter -file=examples/neg/neg1.txt
line 2:0 token recognition error at: 'B'
line 2:3 token recognition error at: ' '
line 2:4 token recognition error at: 'D'
line 2:5 missing '(' at 'ata'
line 2:8 mismatched input '.' expecting {')', ','}
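Because each diagnostic above follows a "line L:C message" shape, a wrapper script or CI step could turn captured stderr into structured records. The following is a minimal sketch under that assumption; the Diagnostic type and ParseDiagnostics function are hypothetical names for illustration and are not part of the linter itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Diagnostic is a hypothetical record for one line of linter output.
type Diagnostic struct {
	Line    int
	Column  int
	Message string
}

// diagPattern matches the "line L:C message" shape seen in the examples above.
// It is an assumption about the output format, not a documented contract.
var diagPattern = regexp.MustCompile(`^line (\d+):(\d+) (.*)$`)

// ParseDiagnostics extracts diagnostics from captured stderr text,
// skipping any line that does not follow the expected shape.
func ParseDiagnostics(stderr string) []Diagnostic {
	var out []Diagnostic
	for _, raw := range strings.Split(stderr, "\n") {
		m := diagPattern.FindStringSubmatch(strings.TrimSpace(raw))
		if m == nil {
			continue
		}
		line, errL := strconv.Atoi(m[1])
		col, errC := strconv.Atoi(m[2])
		if errL != nil || errC != nil {
			continue
		}
		out = append(out, Diagnostic{Line: line, Column: col, Message: m[3]})
	}
	return out
}

func main() {
	// Two of the messages from Example 2, as they might be captured from stderr.
	stderr := "line 2:0 token recognition error at: 'B'\n" +
		"line 2:5 missing '(' at 'ata'\n"
	for _, d := range ParseDiagnostics(stderr) {
		fmt.Printf("%d:%d %s\n", d.Line, d.Column, d.Message)
	}
}
```

Since a clean dataset produces no output, an empty slice from ParseDiagnostics can be treated as "no problems found" for messages in this format.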

Example 3: Regression Examples

The parser also recognizes regressionExample values, which are used in regression datasets.

The parser does not check whether an entire dataset is consistent (regressionExample values in the examples labeled as positive, empty negative examples, and facts), but that check could be implemented fairly easily elsewhere.

regressionExample(medv(id100),33.2).
regressionExample(medv(id101),27.5).
regressionExample(medv(id10),18.9).
regressionExample(medv(id102),26.5).
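The dataset-level check that the linter leaves out could look roughly like the sketch below. It assumes the positive and negative examples have already been read into slices of lines; the CheckRegressionDataset function and its messages are hypothetical, and the prefix test is only a coarse stand-in for real parsing.

```go
package main

import (
	"fmt"
	"strings"
)

// CheckRegressionDataset is a hypothetical dataset-level check. It reports a
// problem for every non-empty positive line that is not a regressionExample
// and for every non-empty negative line.
func CheckRegressionDataset(pos, neg []string) []string {
	var problems []string
	for i, line := range pos {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue
		}
		if !strings.HasPrefix(trimmed, "regressionExample(") {
			problems = append(problems,
				fmt.Sprintf("pos line %d: expected regressionExample(...)", i+1))
		}
	}
	for i, line := range neg {
		if strings.TrimSpace(line) != "" {
			problems = append(problems,
				fmt.Sprintf("neg line %d: negative examples should be empty", i+1))
		}
	}
	return problems
}

func main() {
	pos := []string{
		"regressionExample(medv(id100),33.2).",
		"regressionExample(medv(id101),27.5).",
	}
	neg := []string{}
	for _, p := range CheckRegressionDataset(pos, neg) {
		fmt.Println(p)
	}
}
```

Keeping this check separate from the grammar means the linter stays a per-file syntax tool, while cross-file rules live in whatever program assembles the dataset.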

Usage

Download a Binary

Precompiled binaries are listed on the GitHub Releases page, and the latest version can be downloaded with these links:

Platform	Link
Linux/amd64	Download
macOS/amd64	Download
Windows/amd64	Download

Build from Source

Building requires a Go compiler.

cd cmd
go build

Copies of the generated ANTLR parser files are committed to the repository. Rebuilding them requires the ANTLR parser generator.

make clean
make linter

Limitations

The grammar is currently extremely conservative: the only tokens allowed are lowercase characters, integers, and underscores.

a(x_1,y_1).
b(x_1).
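To get a quick feel for what this restriction accepts, the sketch below approximates it with a regular expression. This is an assumption-laden stand-in, not the ILPLang.g4 grammar: the LooksLikeFact function and its pattern are hypothetical, cover only flat facts like the two above, and do not handle regressionExample lines.

```go
package main

import (
	"fmt"
	"regexp"
)

// factPattern approximates a flat fact such as "a(x_1,y_1).":
// a name, then a parenthesized, comma-separated argument list, then a period.
// Names and arguments are limited to lowercase letters, digits, and underscores.
var factPattern = regexp.MustCompile(`^[a-z0-9_]+\([a-z0-9_]+(?:,[a-z0-9_]+)*\)\.$`)

// LooksLikeFact reports whether a line fits the approximate pattern.
// It is a rough pre-check only; the real linter remains the authority.
func LooksLikeFact(line string) bool {
	return factPattern.MatchString(line)
}

func main() {
	for _, line := range []string{"a(x_1,y_1).", "b(x_1).", "Bad Data."} {
		fmt.Printf("%q -> %v\n", line, LooksLikeFact(line))
	}
}
```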

Contributions

Alexander L. Hayes - Indiana University, Bloomington

Some ideas were taken from the FOPC_MLN_ILP_Parser developed by Jude Shavlik and Trevor Walker (and possibly contributed to by many others who went unnamed in the source code). Several versions of their tokenizers (StreamTokenizerJWS and StreamTokenizerTAW) and parser are currently used in other projects.
