- Although QA systems are improving, Most existing methods focus on specific datasets and not much is known as to how these methods generalize to out of domain and unseen RC tasks.
- The paper builds a model on top XLnet and then finetuned on multiple RC datasets.
- Finetuning the XLNET model is done by an additional MLP layer.
- The experiments are run on GPUs and TPUs. The results for Training on single dataset but testing on all other dataset are shown below: