jweese edited this page Oct 21, 2010 · 5 revisions

This page describes all features currently implemented in Thrax. For instructions on implementing your own, see feature function implementation. Each entry on this page has the following form:

Feature name

Label: feature name as shown in output "name=value"

Included in: the value to add to the features key in thrax.conf to include this feature

Mathematical description.

Probability of source phrase given target phrase

Label: SourcePhraseGivenTarget

Included in: phrase

For a rule like \( X \to \langle \alpha, \beta \rangle \), let \( c(\cdot) \) be the number of times a particular phrase has been seen among all the extracted rules. Then we calculate \( p(\alpha \mid \beta) = \frac{c(\alpha, \beta)}{c(\beta)} \), and the value of this feature is \( -\log p(\alpha \mid \beta) \).

Probability of target phrase given source phrase

Label: TargetPhraseGivenSource

Included in: phrase

Just as in SourcePhraseGivenTarget above, except the calculation is \( -\log \frac{c(\alpha, \beta)}{c(\alpha)} \).
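The two phrase features can be illustrated with a short Python sketch. This is not Thrax's implementation; it only mirrors the relative-frequency formulas given for these features. The function name `phrase_features`, the representation of rules as `(source, target)` string pairs, and the use of the natural logarithm are assumptions made for illustration (the page does not state the base of the logarithm).

```python
import math
from collections import Counter


def phrase_features(rules):
    """Compute both phrase features for each distinct (source, target) pair.

    `rules` is a hypothetical list of (source, target) phrase pairs,
    one entry per extracted rule occurrence.
    """
    rules = list(rules)
    pair_count = Counter(rules)                   # c(alpha, beta)
    source_count = Counter(s for s, _ in rules)   # c(alpha)
    target_count = Counter(t for _, t in rules)   # c(beta)

    features = {}
    for (src, tgt), c in pair_count.items():
        features[(src, tgt)] = {
            # -log p(alpha | beta) = -log( c(alpha, beta) / c(beta) )
            "SourcePhraseGivenTarget": -math.log(c / target_count[tgt]),
            # -log p(beta | alpha) = -log( c(alpha, beta) / c(alpha) )
            "TargetPhraseGivenSource": -math.log(c / source_count[src]),
        }
    return features


# Toy data, invented for this example.
example_rules = [
    ("la maison", "the house"),
    ("la maison", "the house"),
    ("la maison", "the home"),
    ("maison", "house"),
]
example_features = phrase_features(example_rules)
```

In this toy data, "the house" is only ever paired with "la maison", so p(α|β) = 1 and SourcePhraseGivenTarget is zero for that pair. "la maison" occurs three times, twice with "the house", so TargetPhraseGivenSource is −log(2/3). Because the values are negative log probabilities, lower values correspond to more probable pairs.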

Lexical probability of source given target

Label: LexprobSourceGivenTarget

Included in: lex

Lexical probability of target given source

Label: LexprobTargetGivenSource

Included in: lex

Does the source side have adjacent nonterminal symbols?

Label: Adjacent

Included in: samt

Is this rule purely lexical?

Label: Lexical

Included in: samt

Is this rule purely abstract?

Label: Abstract

Included in: samt

Does the rule contain an X nonterminal?

Label: ContainsX

Included in: samt

Does the rule consume source terminal symbols without producing target output?

Label: SourceTerminalsButNoTarget

Included in: samt

Does the rule produce target output without consuming source terminals?

Label: TargetTerminalsButNoSource

Included in: samt

Is the rule monotonic, or is there reordering?

Label: Monotonic

Included in: samt
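The samt features above are yes/no questions about the shape of a rule. The sketch below shows one plausible reading of each question in Python. It is an illustration under stated assumptions, not Thrax's code: the token format `[LABEL,INDEX]` for nonterminals, the helper names `parse_nonterminal` and `samt_predicates`, and the decision to inspect only the right-hand side of the rule are all hypothetical. The page also does not say how Thrax encodes the answers numerically, so the sketch simply returns booleans.

```python
import re

# Assumed nonterminal token format, e.g. "[X,1]" or "[NP,2]".
NONTERMINAL = re.compile(r"^\[([^,\]]+),(\d+)\]$")


def parse_nonterminal(token):
    """Return (label, index) for a nonterminal token, or None for a terminal."""
    m = NONTERMINAL.match(token)
    return (m.group(1), int(m.group(2))) if m else None


def samt_predicates(source, target):
    """Answer the samt questions for a rule given as two token lists."""
    src = [parse_nonterminal(t) for t in source]
    tgt = [parse_nonterminal(t) for t in target]

    src_has_terminals = any(x is None for x in src)
    tgt_has_terminals = any(x is None for x in tgt)
    nonterminals = [x for x in src + tgt if x is not None]

    # Order in which co-indexed nonterminals appear on each side.
    src_order = [x[1] for x in src if x is not None]
    tgt_order = [x[1] for x in tgt if x is not None]

    return {
        "Adjacent": any(a is not None and b is not None
                        for a, b in zip(src, src[1:])),
        "Lexical": not nonterminals,
        "Abstract": not src_has_terminals and not tgt_has_terminals,
        "ContainsX": any(label == "X" for label, _ in nonterminals),
        "SourceTerminalsButNoTarget": src_has_terminals and not tgt_has_terminals,
        "TargetTerminalsButNoSource": tgt_has_terminals and not src_has_terminals,
        "Monotonic": src_order == tgt_order,
    }


# A made-up rule: two adjacent nonterminals plus a source word, swapped on the target side.
reordering_rule = samt_predicates(["[NP,1]", "[X,2]", "de"], ["[X,2]", "[NP,1]"])
# A made-up purely lexical rule.
lexical_rule = samt_predicates(["maison"], ["house"])
```

For the first rule this reading gives Adjacent, ContainsX, and SourceTerminalsButNoTarget as true and Monotonic as false; for the second, Lexical and Monotonic are true and the rest are false.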
