Medical Question Understanding

1. Papers included in this repository

  • "A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding"
    Khalil Mrini, Franck Dernoncourt, Seunghyun Yoon, Trung Bui, Walter Chang, Emilia Farcas, Ndapa Nakashole
    ACL 2021 (Main, Long Paper)
    PDF | ACL Anthology | BibTeX | Video Presentation
  • "Joint Summarization-Entailment Optimization for Consumer Health Question Understanding"
    Khalil Mrini, Franck Dernoncourt, Walter Chang, Emilia Farcas, Ndapa Nakashole
    NAACL 2021 workshop on NLP for Medical Conversations (NLPMC)
    Best Student Paper Award
    PDF | ACL Anthology | BibTeX
  • "UCSD-Adobe at MEDIQA 2021: Transfer Learning and Answer Sentence Selection for Medical Summarization"
    Khalil Mrini, Franck Dernoncourt, Seunghyun Yoon, Trung Bui, Walter Chang, Emilia Farcas, Ndapa Nakashole
    NAACL 2021 workshop on Biomedical NLP (BioNLP)
    PDF | ACL Anthology | BibTeX
2. Installation

Use the following commands to install this package:

    pip install --editable ./
    pip install transformers
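
    # Optional sanity check (a minimal sketch): both imports should succeed if the
    # editable install above provides the fairseq package.
    python -c "import fairseq, transformers; print(fairseq.__version__, transformers.__version__)"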
    

Download the pre-trained BART model (fine-tuned on XSum) as follows:

    mkdir models
    cd models
    wget https://dl.fbaipublicfiles.com/fairseq/models/bart.large.xsum.tar.gz
    tar -xzvf bart.large.xsum.tar.gz
    rm bart.large.xsum.tar.gz
    cd ..
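
    # Optional check: the archive is expected to unpack to bart.large.xsum/ with model.pt inside
    # (this is also the path that BART_PATH points to in the training commands below).
    ls -lh models/bart.large.xsum/model.pt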
    

3. Data preprocessing

Preprocess the MeQSum dataset as follows, where "MeQSum" is the dataset folder placed in the parent directory of this repository (i.e., reachable as ../MeQSum from the repository root):

    sh ./examples/joint_rqe_sum/preprocess_MeQSum.sh .. MeQSum
    mv MeQSum-bin ../MeQSum-bin
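
    # The BPE encoding and binarization steps below reference encoder.json, vocab.bpe, and dict.txt
    # in the working directory. If they are not already present, they can usually be fetched from
    # fairseq's public GPT-2 BPE files (URLs taken from the standard fairseq BART recipe):
    wget -N https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json
    wget -N https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe
    wget -N https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/dict.txt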
    
    TASK=../MeQSum
    for SPLIT in train dev
    do
      for LANG in source target
      do
        python -m examples.roberta.multiprocessing_bpe_encoder \
        --encoder-json encoder.json \
        --vocab-bpe vocab.bpe \
        --inputs "$TASK/$SPLIT.$LANG" \
        --outputs "$TASK/$SPLIT.bpe.$LANG" \
        --workers 60 \
        --keep-empty;
      done
    done
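
    # Optional check: after BPE encoding, each split should have matching line counts
    # for its source and target files.
    wc -l ../MeQSum/train.bpe.source ../MeQSum/train.bpe.target ../MeQSum/dev.bpe.source ../MeQSum/dev.bpe.target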
    
    fairseq-preprocess \
      --source-lang "source" \
      --target-lang "target" \
      --trainpref "${TASK}/train.bpe" \
      --validpref "${TASK}/dev.bpe" \
      --destdir "${TASK}-bin/" \
      --workers 60 \
      --srcdict dict.txt \
      --tgtdict dict.txt;
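
    # After binarization, ../MeQSum-bin should contain the train/valid .bin and .idx files
    # together with the source/target dictionaries used by fairseq-train below.
    ls ../MeQSum-bin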
    

4. Training Commands

To train the model with the Gradually Soft Parameter Sharing loss on MeQSum:

    TOTAL_NUM_UPDATES=810 
    WARMUP_UPDATES=81
    LR=3e-05
    MAX_TOKENS=512
    UPDATE_FREQ=4
    BART_PATH=./models/bart.large.xsum/model.pt
    
    CUDA_VISIBLE_DEVICES=1 fairseq-train ../MeQSum-bin \
        --restore-file $BART_PATH \
        --max-tokens $MAX_TOKENS \
        --task joint_rqe_sum \
        --source-lang source --target-lang target \
        --truncate-source \
        --layernorm-embedding \
        --share-all-embeddings \
        --share-decoder-input-output-embed \
        --batch-size 8 \
        --reset-optimizer --reset-dataloader --reset-meters \
        --required-batch-size-multiple 1 \
        --arch bart_large_gradsoft \
        --criterion grad_soft --add-prev-output-tokens \
        --label-smoothing 0.1 \
        --dropout 0.1 --attention-dropout 0.1 \
        --weight-decay 0.01 --optimizer adam --adam-betas "(0.9, 0.999)" --adam-eps 1e-08 \
        --clip-norm 0.1 \
        --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_NUM_UPDATES --warmup-updates $WARMUP_UPDATES \
        --update-freq $UPDATE_FREQ \
        --skip-invalid-size-inputs-valid-test \
        --find-unused-parameters;
    

To train the model with the Joint Summarization-Entailment loss on MeQSum:

    TOTAL_NUM_UPDATES=810 
    WARMUP_UPDATES=81
    LR=3e-05
    MAX_TOKENS=512
    UPDATE_FREQ=4
    BART_PATH=./models/bart.large.xsum/model.pt
    
    CUDA_VISIBLE_DEVICES=0 fairseq-train ../MeQSum-bin \
        --restore-file $BART_PATH \
        --max-tokens $MAX_TOKENS \
        --task joint_rqe_sum \
        --source-lang source --target-lang target \
        --truncate-source \
        --layernorm-embedding \
        --share-all-embeddings \
        --share-decoder-input-output-embed \
        --beam 2 \
        --batch-size 8 \
        --reset-optimizer --reset-dataloader --reset-meters \
        --required-batch-size-multiple 1 \
        --arch bart_large \
        --criterion joint_rqe_sum --add-prev-output-tokens \
        --label-smoothing 0.1 \
        --dropout 0.1 --attention-dropout 0.1 \
        --weight-decay 0.01 --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-08 --no-epoch-checkpoints --no-last-checkpoints \
        --clip-norm 0.1 \
        --max-epoch 10 \
        --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_NUM_UPDATES --warmup-updates $WARMUP_UPDATES \
        --update-freq $UPDATE_FREQ \
        --skip-invalid-size-inputs-valid-test \
        --find-unused-parameters;
    

To train the model with the Summarization loss on MeQSum:

    TOTAL_NUM_UPDATES=5400 
    WARMUP_UPDATES=81
    LR=3e-05
    MAX_TOKENS=512
    UPDATE_FREQ=4
    BART_PATH=./models/bart.large.xsum/model.pt
    
    CUDA_VISIBLE_DEVICES=1 fairseq-train ../MeQSum-bin \
        --restore-file $BART_PATH \
        --max-tokens $MAX_TOKENS \
        --task translation \
        --source-lang source --target-lang target \
        --truncate-source \
        --layernorm-embedding \
        --share-all-embeddings \
        --share-decoder-input-output-embed \
        --batch-size 8 \
        --reset-optimizer --reset-dataloader --reset-meters \
        --required-batch-size-multiple 1 \
        --arch bart_large \
        --criterion label_smoothed_cross_entropy \
        --label-smoothing 0.1 --max-epoch 100 \
        --dropout 0.1 --attention-dropout 0.1 \
        --weight-decay 0.01 --optimizer adam --adam-betas "(0.9, 0.999)" --adam-eps 1e-08 \
        --clip-norm 0.1 \
        --lr-scheduler polynomial_decay --lr $LR --total-num-update $TOTAL_NUM_UPDATES --warmup-updates $WARMUP_UPDATES \
        --update-freq $UPDATE_FREQ \
        --skip-invalid-size-inputs-valid-test --no-epoch-checkpoints --no-last-checkpoints \
        --find-unused-parameters;
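
    # A minimal generation sketch once training has finished (assumptions: checkpoints went to
    # fairseq's default ./checkpoints/ directory, checkpoint_best.pt exists, and the generation
    # hyperparameters and example question are only illustrative; adjust paths and values to your run):
    python -c "from fairseq.models.bart import BARTModel; bart = BARTModel.from_pretrained('checkpoints/', checkpoint_file='checkpoint_best.pt', data_name_or_path='../MeQSum-bin'); bart.eval(); print(bart.sample(['I was prescribed amoxicillin for an ear infection. Is it safe to take it together with ibuprofen?'], beam=4, lenpen=2.0, max_len_b=60, min_len=5, no_repeat_ngram_size=3))"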
    

Fairseq License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Fairseq Citation

Please cite as:

    @inproceedings{ott2019fairseq,
      title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
      author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
      booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
      year = {2019},
    }