Releases: digitalheir/java-probabilistic-earley-parser
v0.10.0
v0.9.12
- Added lenient scanning options -scanmode drop and -scanmode wildcard for handling tokens whose terminal type the grammar cannot find. See #7
You can use this project as a library in your Java application or as a standalone command-line app.
Using the app from the command line
We define a grammar in a .cfg file.
By default, the parser will assume that you distinguish non-terminals from terminals by capitalizing them. You can use a custom category handler if you call the API from Java code.
# grammar.cfg
S -> NP VP (1.0) # specify probability between 0 and 1 by appending between parentheses
NP -> D N # probability defaults to 1.0
VP → V NP # Use '->' or '→'
D → the
N → noses (0.7)
V → noses (0.3)
V → sniff (0.9)
N → sniff (0.1)
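To make the rule syntax concrete, here is a minimal sketch of how one line of this format could be read. It is not the library's own parser: the class `CfgRuleSketch` and its `parse` method are hypothetical names introduced for illustration. The sketch assumes that `#` always starts a comment and that the optional probability is the last thing on the line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the library): reads one line of the .cfg format. */
final class CfgRuleSketch {
    final String left;          // left-hand side, e.g. "NP"
    final List<String> right;   // right-hand side symbols, e.g. ["D", "N"]
    final double probability;   // 1.0 when no "(p)" suffix is given

    // LHS, arrow ('->' or U+2192), RHS, then an optional "(p)" at the end of the line.
    private static final Pattern RULE =
            Pattern.compile("^(\\S+)\\s*(?:->|\u2192)\\s*(.+?)\\s*(?:\\(([0-9.]+)\\))?$");

    private CfgRuleSketch(String left, List<String> right, double probability) {
        this.left = left;
        this.right = right;
        this.probability = probability;
    }

    /** Returns null for blank or comment-only lines. */
    static CfgRuleSketch parse(String line) {
        // Assumption: everything after '#' is a comment.
        int hash = line.indexOf('#');
        String body = (hash >= 0 ? line.substring(0, hash) : line).trim();
        if (body.isEmpty()) {
            return null;
        }
        Matcher m = RULE.matcher(body);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a rule: " + line);
        }
        double p = m.group(3) == null ? 1.0 : Double.parseDouble(m.group(3));
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("Probability must be between 0 and 1: " + line);
        }
        List<String> rhs = Arrays.asList(m.group(2).trim().split("\\s+"));
        return new CfgRuleSketch(m.group(1), rhs, p);
    }

    public static void main(String[] args) {
        CfgRuleSketch rule = parse("N \u2192 noses (0.7)");
        System.out.println(rule.left + " -> " + rule.right + " with probability " + rule.probability);
    }
}
```

A terminal that itself contains `#` or ends in a parenthesized number would be misread by this sketch; the real parser may treat such cases differently.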
Execute the runnable jar in a terminal:
java -jar probabilistic-earley-parser-0.9.12-jar-with-dependencies.jar -i grammar.cfg -goal S the noses sniff the noses
This gives the Viterbi parse of the sentence "the noses sniff the noses":
0.44099999999999995 (= 0.7 * 0.7 * 0.9)
└── <start>
    └── S
        ├── NP
        │   ├── D
        │   │   └── the (the)
        │   └── N
        │       └── noses (noses)
        └── VP
            ├── V
            │   └── sniff (sniff)
            └── NP
                ├── D
                │   └── the (the)
                └── N
                    └── noses (noses)
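The score above is the product of the probabilities of every rule used in the tree. Rules without an explicit probability contribute 1.0, so only the two N → noses rules (0.7 each) and V → sniff (0.9) change the result. The sketch below reproduces that arithmetic with plain doubles; `ViterbiScoreSketch` and `derivationProbability` are hypothetical names, and the library may compute the score differently internally. The trailing digits in the reported score, rather than an exact 0.441, come from binary floating-point rounding.

```java
/** Hypothetical illustration (not library code): probability of a single derivation. */
final class ViterbiScoreSketch {

    /** Multiplies the probabilities of all rules used in one derivation. */
    static double derivationProbability(double... ruleProbabilities) {
        double product = 1.0;
        for (double p : ruleProbabilities) {
            product *= p;
        }
        return product;
    }

    public static void main(String[] args) {
        double score = derivationProbability(
                1.0, // S -> NP VP
                1.0, // NP -> D N
                1.0, // D -> the
                0.7, // N -> noses
                1.0, // VP -> V NP
                0.9, // V -> sniff
                1.0, // NP -> D N
                1.0, // D -> the
                0.7  // N -> noses
        );
        System.out.println(score);
    }
}
```

With this grammar the sentence has only one derivation, because "sniff" must be a V for S -> NP VP to apply, so the Viterbi parse is also the only parse.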
v0.9.11
- Added command line functionality; include runnable jar with dependency included
- Fixed parsing of rule probability from .cfg files
You can use this project as a library in your Java application or as a standalone command-line app.
By default, the parser will assume that you distinguish non-terminals from terminals by capitalizing them. You can also add a custom category handler if you call the API from Java code.
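As an illustration of the default convention, the check below treats a symbol as a non-terminal when its first character is upper-case. This is an assumed reading of "capitalizing them", not the library's actual code; `CategoryConventionSketch` and `isNonTerminal` are hypothetical names, and the interface for plugging in a custom category handler is not shown here.

```java
/** Hypothetical illustration (not library code) of the capitalization convention. */
final class CategoryConventionSketch {

    /** Assumption: a symbol counts as a non-terminal if its first character is upper-case. */
    static boolean isNonTerminal(String symbol) {
        return !symbol.isEmpty() && Character.isUpperCase(symbol.codePointAt(0));
    }

    public static void main(String[] args) {
        for (String symbol : new String[] {"S", "NP", "the", "noses"}) {
            String kind = isNonTerminal(symbol) ? "non-terminal" : "terminal";
            System.out.println(symbol + ": " + kind);
        }
    }
}
```

Under this convention a capitalized word in the input, such as a proper noun, would need a custom handler or lower-casing before it could be used as a terminal.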
Create a UTF-8-encoded .cfg file that contains your grammar, such as the following:
# grammar.cfg
S -> NP VP (1.0) # specify probability between 0 and 1 by appending between parentheses
NP -> D N # probability defaults to 1.0
VP → V NP # Use '->' or '→'
D → the
N → noses (0.7)
V → noses (0.3)
V → sniff (0.9)
N → sniff (0.1)
Execute the runnable jar in a terminal:
java -jar probabilistic-earley-parser-0.9.11-jar-with-dependencies.jar -i grammar.cfg -goal S the noses sniff the noses
This gives the Viterbi parse of the sentence "the noses sniff the noses":
0.44099999999999995 (= 0.7 * 0.7 * 0.9)
└── <start>
    └── S
        ├── NP
        │   ├── D
        │   │   └── the (the)
        │   └── N
        │       └── noses (noses)
        └── VP
            ├── V
            │   └── sniff (sniff)
            └── NP
                ├── D
                │   └── the (the)
                └── N
                    └── noses (noses)