Implementation of the EMNLP 2018 paper "Multi-Head Attention with Disagreement Regularization" and the NAACL 2019 paper "Information Aggregation for Multi-Head Attention with Routing-by-Agreement", based on the THUMT toolkit.
More details, including data and pre-trained models, will be added later.