stephantul edited this page Feb 21, 2020 · 1 revision

pattern.en parser

The English parser can be invoked from the command line. Either the pattern module must be installed (i.e., located in /site-packages; see the installation instructions), or the current working directory must be the one that contains the pattern folder.

> python -m pattern.en -f file.txt

If no options are given, a full parse is executed (i.e., tokenization, tagging, chunking, relations and lemmata). Otherwise, you need to explicitly list every required option:

| Option | Long form | Description |
|---|---|---|
| `-O` | `--tokenize` | Tokenize the input. |
| `-T` | `--tags` | Parse part-of-speech tags. |
| `-C` | `--chunks` | Parse chunks and `PNP` tags. |
| `-R` | `--relations` | Parse verb/predicate relations. |
| `-L` | `--lemmata` | Parse lemmata (was → be). |
| `-f` | `--file` | Input file path. |
| `-s` | `--string` | Input string. |
| `-e` | `--encoding` | Specify character encoding (utf-8 by default). |
| `-v` | `--version` | Print current version of Pattern. |

Short options can be concatenated (e.g., `-OT` for tokenization and tagging). Also note the xml option, which produces XML output:

> python -m pattern.en xml -OT -s 'The black cat sat on the mat.'
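If the parser needs to be driven from a script rather than typed by hand, the same command line can be assembled in Python. The following is only a minimal sketch: `build_command` and `run_parser` are hypothetical helper names, not part of Pattern, and the position of `xml` directly after the module name is assumed from the example above.

```python
import subprocess
import sys

# Short flags as listed in the options above.
SHORT_FLAGS = {
    "tokenize": "O",
    "tags": "T",
    "chunks": "C",
    "relations": "R",
    "lemmata": "L",
}


def build_command(text, steps=(), xml=False, module="pattern.en"):
    """Return the argv list for one parser call (hypothetical helper)."""
    unknown = [s for s in steps if s not in SHORT_FLAGS]
    if unknown:
        raise ValueError("unknown parser steps: " + ", ".join(unknown))
    argv = [sys.executable, "-m", module]
    if xml:
        # Assumption: "xml" comes right after the module, as in the example.
        argv.append("xml")
    if steps:
        # Concatenate the short options, e.g. ("tokenize", "tags") -> "-OT".
        argv.append("-" + "".join(SHORT_FLAGS[s] for s in steps))
    # With no steps listed, no flag is passed and a full parse is requested.
    argv += ["-s", text]
    return argv


def run_parser(text, steps=(), xml=False, module="pattern.en"):
    """Run the parser and return its output; Pattern must be installed."""
    result = subprocess.run(
        build_command(text, steps, xml, module),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_parser('The black cat sat on the mat.', steps=("tokenize", "tags"), xml=True)` would then issue the same command as the example above. Passing the arguments as a list avoids shell quoting problems with apostrophes in the input string.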

pattern.es | de | fr | it | nl parsers

The parsers for the other languages work in the same way, including the xml option for XML output.

> python -m pattern.es -s 'El gato negro se sienta en la estera.'

> python -m pattern.de -s 'Die schwarze Katze liegt auf der Matte.'

> python -m pattern.fr -s "Le chat noir s'était assis sur le tapis."

> python -m pattern.it -s 'Il gatto nero faceva le fusa.'

> python -m pattern.nl -s 'De zwarte kat zat op de mat.'
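Because only the module name changes between languages, the commands above can be generated in a loop. This is a rough sketch under the assumption that each language module accepts `-s` exactly as shown; `commands_for` and `SAMPLES` are illustrative names, not part of Pattern.

```python
import sys

# Sample sentences taken from the commands above, keyed by language code.
SAMPLES = {
    "es": "El gato negro se sienta en la estera.",
    "de": "Die schwarze Katze liegt auf der Matte.",
    "fr": "Le chat noir s'était assis sur le tapis.",
    "it": "Il gatto nero faceva le fusa.",
    "nl": "De zwarte kat zat op de mat.",
}


def commands_for(samples):
    """Map each language code to the argv list of its parser command."""
    return {
        lang: [sys.executable, "-m", "pattern." + lang, "-s", text]
        for lang, text in samples.items()
    }


if __name__ == "__main__":
    # Print the commands instead of running them, so that the sketch
    # works even where Pattern is not installed.
    for lang, argv in commands_for(SAMPLES).items():
        print(lang, argv[1:])
```

Each argv list can be handed to `subprocess.run` to execute the corresponding parser once Pattern is installed.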