GitHub - fastnlp/JointCwsParser: Code for "A Unified Model for Joint Chinese Word Segmentation and Dependency Parsing"

A Unified Model for Joint Chinese Word Segmentation and Dependency Parsing

This is the code for the paper "A Unified Model for Joint Chinese Word Segmentation and Dependency Parsing".

Requirements

This project requires the Python natural language processing package fastNLP. You can install it with the following command:

pip install fastNLP

Data

Your data should be in the following format:

1	中国	_	NR	NR	_	4	nn	_	_
2	残疾人	_	NN	NN	_	4	nn	_	_
3	体育	_	NN	NN	_	4	nn	_	_
4	事业	_	NN	NN	_	5	nsubj	_	_
5	方兴未艾	_	VV	VV	_	0	root	_	_

1	新华社	_	NR	NR	_	12	dep	_	_

Counting columns from 0, the 1st, 3rd, 6th, and 7th columns hold the word, POS tag, dependency head, and dependency label, respectively. An empty line separates two instances.
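
The column layout above can be illustrated with a small reader. This is only a minimal sketch, not the loader this repository uses: the names `Token` and `read_conll` are hypothetical, and it assumes the columns are tab-separated as in the sample.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One row of the file (hypothetical container, for illustration only)."""
    word: str
    pos: str
    head: int
    label: str


def read_conll(text: str) -> List[List[Token]]:
    """Split CoNLL-style text into instances, one list of tokens per sentence."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current instance.
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split("\t")
        # 0-based columns: 1 = word, 3 = POS tag, 6 = head index, 7 = label.
        current.append(Token(cols[1], cols[3], int(cols[6]), cols[7]))
    if current:
        sentences.append(current)
    return sentences


# The rows from the sample above, written with explicit tab separators.
sample = (
    "1\t中国\t_\tNR\tNR\t_\t4\tnn\t_\t_\n"
    "2\t残疾人\t_\tNN\tNN\t_\t4\tnn\t_\t_\n"
    "3\t体育\t_\tNN\tNN\t_\t4\tnn\t_\t_\n"
    "4\t事业\t_\tNN\tNN\t_\t5\tnsubj\t_\t_\n"
    "5\t方兴未艾\t_\tVV\tVV\t_\t0\troot\t_\t_\n"
    "\n"
    "1\t新华社\t_\tNR\tNR\t_\t12\tdep\t_\t_\n"
)

sentences = read_conll(sample)
for token in sentences[0]:
    print(token.word, token.pos, token.head, token.label)
```

A head index of 0 marks the root of the sentence; any other value is the 1-based position of the head word within the same instance.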

Place your data in the following directory structure:

-JointCwsParser
    ...
    -train.py
    -train_bert.py
-data
    -ctb5
        -train.conll
        -dev.conll
        -test.conll
    -ctb7
        -...
    -ctb9
        -...
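
Before training, it may help to confirm that the files are where the layout above puts them. The following is a minimal sketch under stated assumptions: `expected_files` and `missing_files` are hypothetical helpers, not part of this repository, the check covers only the three files listed for ctb5, and it assumes the script is run from inside `JointCwsParser` so that the data folder is reachable as `../data`.

```python
from pathlib import Path
from typing import List

# File names taken from the ctb5 entry in the layout above.
SPLITS = ("train", "dev", "test")


def expected_files(data_root: Path, dataset: str) -> List[Path]:
    """Return the paths where the three splits of a dataset are expected."""
    return [data_root / dataset / f"{split}.conll" for split in SPLITS]


def missing_files(data_root: Path, dataset: str) -> List[Path]:
    """Return the expected paths that do not exist as regular files."""
    return [p for p in expected_files(data_root, dataset) if not p.is_file()]


# Assumption: the current directory is JointCwsParser, a sibling of data.
missing = missing_files(Path("..") / "data", "ctb5")
if missing:
    print("Missing files:")
    for path in missing:
        print("  ", path)
else:
    print("All ctb5 files found.")
```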

We use code from https://github.com/hankcs/TreebankPreprocessing to convert the original format into the CoNLL format.

Run the code

You can run the code directly with

python train.py --dataset ctb5

or

python train_bert.py --dataset ctb5

fastNLP will download the pretrained embeddings or BERT weights automatically.
