Feature functions
This page is a description of all currently-implemented features in thrax. For instructions on implementing your own, see feature function implementation. Entries on this page are in this form:
Label: feature name as shown in output "name=value"
Name: which value to add to features key in thrax.conf to include this feature
Mathematical description.
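As an illustration of how the Name field is used, a features line in thrax.conf might look like the fragment below. This is only a sketch: it assumes that thrax.conf consists of whitespace-separated key/value lines and that the features key takes a space-separated list of names, so check an existing configuration file for the exact syntax.

```
# hypothetical thrax.conf fragment: enable three of the features listed on this page
features    e2fphrase f2ephrase rarity
```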
Label: SourcePhraseGivenTarget
Name: e2fphrase
For a rule like \( X \to \langle \alpha, \beta \rangle \), let \( c(\cdot) \) be the number of times a particular phrase or phrase pair has been seen among all the extracted rules. Then we calculate \( p(\alpha | \beta) = \frac{c(\alpha,\beta)}{c(\beta)} \), and the value of this feature is \( -\log p(\alpha | \beta) \).
Label: TargetPhraseGivenSource
Name: f2ephrase
Just as in SourcePhraseGivenTarget above, except that the calculation is \( -\log \frac{c(\alpha,\beta)}{c(\alpha)} \), that is, \( -\log p(\beta | \alpha) \).
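The two phrase-probability features can be sketched in a few lines of Python. This is not Thrax's implementation: the toy rule list, the function names, and the choice to count every extracted rule instance once are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy extracted rules as (source, target) phrase pairs; the data is invented.
rules = [
    ("la maison", "the house"),
    ("la maison", "the house"),
    ("la maison", "the home"),
    ("maison", "the house"),
]

# c(alpha, beta), c(alpha) and c(beta), counted over all extracted rules.
pair_count = Counter(rules)
source_count = Counter(src for src, _ in rules)
target_count = Counter(tgt for _, tgt in rules)


def source_phrase_given_target(src: str, tgt: str) -> float:
    """-log p(alpha | beta) = -log( c(alpha, beta) / c(beta) )."""
    return -math.log(pair_count[(src, tgt)] / target_count[tgt])


def target_phrase_given_source(src: str, tgt: str) -> float:
    """-log p(beta | alpha) = -log( c(alpha, beta) / c(alpha) )."""
    return -math.log(pair_count[(src, tgt)] / source_count[src])


e2f = source_phrase_given_target("la maison", "the home")
f2e = target_phrase_given_source("la maison", "the home")
print(f"SourcePhraseGivenTarget={e2f:.4f} TargetPhraseGivenSource={f2e:.4f}")
```

In this toy data "the home" only ever appears with "la maison", so the first feature is zero, while "la maison" appears with two different targets, so the second feature is positive. The two directions generally differ for the same rule.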
Label: LexprobSourceGivenTarget, LexprobTargetGivenSource
Name: lexprob
Label: Adjacent
Name: adjacent
Label: Lexical
Name: lexical
Label: Abstract
Name: abstract
Label: ContainsX
Name: x-rule
Label: SourceTerminalsButNoTarget
Name: source-terminals-without-target
Label: TargetTerminalsButNoSource
Name: target-terminals-without-source
Label: Monotonic
Name: monotonic
Label: PhrasePenalty
Name: phrase-penalty
Label: TargetWords
Name: target-word-count
Label: RarityPenalty
Name: rarity
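The value of this feature is \( \exp(1 - c(r)) \), where \( c(r) \) is the number of times the rule \( r \) has been seen.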
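The rarity formula can be tried out with a short Python sketch. It assumes that c(r) is the number of times the rule r was extracted; the function name `rarity_penalty` is hypothetical and not part of Thrax.

```python
import math


def rarity_penalty(rule_count: int) -> float:
    """exp(1 - c(r)): equals 1 for a rule seen once, decays toward 0 as the count grows."""
    if rule_count < 1:
        raise ValueError("an extracted rule must have been seen at least once")
    return math.exp(1 - rule_count)


for count in (1, 2, 5):
    print(f"c(r)={count} RarityPenalty={rarity_penalty(count):.4f}")
```

Under this assumption the feature mostly separates rules seen once or twice from everything else, because the value falls off exponentially with the count.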
Label: UnalignedSource, UnalignedTarget
Name: unaligned-count