[WIP] Train Vietnamese Dependency Parsing
Vu Anh edited this page Jul 1, 2021 · 3 revisions
In this work, we build a Vietnamese dependency parser: a graph-based parser with biaffine attention, trained on the VLSP 2020 Dependency Parsing dataset.
Input vectors
The input vector is composed of two parts: the word embedding and the CharLSTM word representation vector of
Biaffine Attention Mechanism
Compute the score of a dependency via biaffine attention:
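As a rough illustration of the scoring step, the sketch below assumes the common deep biaffine form, in which the score of word j being the head of word i is `head[j]^T U dep[i] + head[j]^T u`. The names `biaffine_scores`, `predict_heads`, `U` and `u` are hypothetical and chosen for this example; the exact formulation used on this page may differ.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def biaffine_scores(dep: Matrix, head: Matrix, U: Matrix, u: Vector) -> Matrix:
    """Return scores[i][j], the score of word j being the head of word i.

    Assumed form: scores[i][j] = head[j]^T U dep[i] + head[j]^T u
    dep:  one dependent-role vector per word (n x dep_dim)
    head: one head-role vector per word      (n x head_dim)
    U:    bilinear weight matrix             (head_dim x dep_dim)
    u:    bias vector on the head side       (head_dim)
    """
    scores = []
    for d in dep:
        # U d: map the dependent vector into the head space
        Ud = [sum(U[r][c] * d[c] for c in range(len(d))) for r in range(len(U))]
        # dot product of every head vector with (U d + u)
        row = [sum(h[r] * (Ud[r] + u[r]) for r in range(len(h))) for h in head]
        scores.append(row)
    return scores


def predict_heads(scores: Matrix) -> List[int]:
    """Greedy decoding: pick the highest-scoring head for each word."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Toy example with two words and 2-dimensional representations
dep = [[1.0, 0.0], [0.0, 1.0]]
head = [[1.0, 0.0], [0.0, 1.0]]
U = [[0.0, 1.0], [1.0, 0.0]]
u = [0.0, 0.0]
scores = biaffine_scores(dep, head, U, u)
heads = predict_heads(scores)
```

Greedy decoding is used here only to keep the sketch short; it does not guarantee a well-formed tree, which a real parser would enforce with a tree-decoding algorithm.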
Parameter settings
Model parameters
Component | Hyper-Parameter | Value |
---|---|---|
Embedding (BERT) | n_bert_layers | 4 |
Embedding (BERT) | dimension | 768 |
LSTM (Encoder) | n_lstm_hidden | 400 |
LSTM (Encoder) | n_lstm_layers | 3 |
LSTM (Encoder) | lstm_dropout | 0.33 |
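For reference, the model parameters above could be collected in a single configuration object. This is a minimal sketch: the class name `ParserConfig` and the field name `bert_dim` (for "dimension") are hypothetical, and only the values come from the table.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ParserConfig:
    # Embedding (BERT)
    n_bert_layers: int = 4
    bert_dim: int = 768  # "dimension" in the table
    # LSTM encoder
    n_lstm_hidden: int = 400
    n_lstm_layers: int = 3
    lstm_dropout: float = 0.33


config = ParserConfig()
# asdict() gives a plain dict, convenient for logging the run configuration
config_dict = asdict(config)
```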
Training Parameters
Hyper-Parameter | Value |
---|---|
optimizer | Adam |
Choosing the right batch_size (5000) helped us a lot.
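The sketch below shows one way such a batch size can be applied, assuming batch_size counts tokens per batch rather than sentences. That interpretation is an assumption (the page does not say which unit is meant), and the function name `batch_by_tokens` is hypothetical.

```python
from typing import List

Sentence = List[str]


def batch_by_tokens(sentences: List[Sentence], batch_size: int = 5000) -> List[List[Sentence]]:
    """Group sentences so that each batch holds at most batch_size tokens.

    A single sentence longer than batch_size still gets a batch of its own.
    """
    batches: List[List[Sentence]] = []
    current: List[Sentence] = []
    n_tokens = 0
    # Sort by length so sentences in a batch need little padding
    for sent in sorted(sentences, key=len):
        if current and n_tokens + len(sent) > batch_size:
            batches.append(current)
            current, n_tokens = [], 0
        current.append(sent)
        n_tokens += len(sent)
    if current:
        batches.append(current)
    return batches


# Toy example with a token budget of 5
toy = [["a"] * 3, ["b"] * 2, ["c"] * 4]
toy_batches = batch_by_tokens(toy, batch_size=5)
```

With a token budget, short sentences are packed many to a batch and long sentences few to a batch, so the amount of computation per step stays roughly constant.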
VLSP 2020 Dataset
Train: 8151 sentences, Test: 1122 sentences
- Logging with wandb is very handy: we can easily watch logs and loss curves with nearly zero setup.