Zero-shot inference with t-few does not produce the scores and inferences #21

Open
hansfarrell opened this issue Apr 20, 2024 · 1 comment

Comments

@hansfarrell

Hi @stefanhgm,

I am able to run the t-few code for inference on all the datasets with any number of shots except zero. Following your instructions, I changed num_shot to 0 in the .sh file that runs pl_train.py, but the output experiment folder does not contain the expected dev_scores.json or the inferences. It works for the other shot counts; the issue only occurs for zero-shot. I've attached the terminal output from the run for your reference.
[Image: Screenshot 2024-04-20 113030]
[Image: Screenshot 2024-04-20 113107]

@stefanhgm
Contributor

stefanhgm commented May 27, 2024

Hello @hansfarrell,

I am very sorry for getting back to you so late. It is still a bit difficult to diagnose your problem with this information. I checked the zero-shot case again; here is my minimal bin/few-shot-pretrained-100k.sh file for income and t03b, which works for me:

#!/bin/bash
allow_skip_exp=True
eval_before_training=True
balanced_ibc=True

train_batch_size=4
grad_accum_factor=1

lr=0.003
re='^[0-9]+$'

cuda_device=0

# Set adaptively
num_steps=0
eval_epoch_interval=0

for model in 't03b' # 't03b'
do
  # For zero-shot set to '0', for all to 'all'
  for num_shot in 0
  do
    # Datasets: car, income, heart, diabetes, jungle, bank, blood, calhousing, creditg
    for dataset in income
    do
      # Zero-shot
      eval_before_training=True
      num_steps=0
      # Few-shot
      # [...]

      for seed in 42
      do
        # Print the command to run
        # [...]
        # Run the command
        CUDA_VISIBLE_DEVICES=${cuda_device} CONFIG_PATH=/root/t-few/configs HF_HOME=/root/.cache/huggingface \
        python -m src.pl_train -c ${model}.json+ia3.json+global.json -k dataset=${dataset} load_weight="pretrained_checkpoints/${model}_ia3_finish.pt" num_steps=${num_steps} num_shot=${num_shot} \
        exp_name=${model}_${dataset}_numshot${num_shot}_seed${seed}_ia3_pretrained100k few_shot_random_seed=${seed} seed=${seed} allow_skip_exp=${allow_skip_exp} eval_before_training=${eval_before_training} eval_epoch_interval=${eval_epoch_interval} \
        batch_size=${train_batch_size} grad_accum_factor=${grad_accum_factor} lr=${lr}
      done
    done
  done
done
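
The zero-shot branch of the script above sets num_steps=0 and eval_before_training=True. Presumably, with no training steps, the scores can only come from the evaluation that runs before training, so both values matter. A minimal sketch of a pre-flight check for this follows; check_zero_shot_settings is a hypothetical helper, not part of t-few, and it only mirrors the values shown under the "# Zero-shot" comment:

```shell
#!/bin/bash
# Hypothetical helper (not part of t-few): warn when the zero-shot settings
# differ from the ones used in the script above.
check_zero_shot_settings() {
  local num_shot="$1" num_steps="$2" eval_before_training="$3"

  # Only the zero-shot case is checked; few-shot settings are left alone.
  if [ "$num_shot" != "0" ]; then
    return 0
  fi

  if [ "$num_steps" != "0" ] || [ "$eval_before_training" != "True" ]; then
    echo "zero-shot run expects num_steps=0 and eval_before_training=True" >&2
    return 1
  fi
  return 0
}
```

It could be called just before the python command, for example as `check_zero_shot_settings "${num_shot}" "${num_steps}" "${eval_before_training}" || exit 1`.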

Running this script produces the following output, and I also get the file dev_scores.json in the directory exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/:

Validation sanity check: 100%|██████████| 611/611 [01:05<00:00, 5.97it/s]
{"AUC": 0.7616058157520148, "PR": 0.4978171030717906, "micro_f1": 0.2939911966424404, "macro_f1": 0.2707494088356843, "accuracy": 0.2939911966424404, "num": 9769, "num_steps": -1, "score_gt": 0.41221203334451834, "score_cand": 0.3321806071468185}
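
To confirm quickly whether a run produced its scores, something like the following could be used. This is only a sketch: show_dev_scores is a hypothetical helper, and it assumes dev_scores.json holds one JSON record per line, which I have not verified beyond the output above.

```shell
#!/bin/bash
# Hypothetical helper: report whether an experiment directory contains a
# non-empty dev_scores.json and, if so, print its last line.
show_dev_scores() {
  local exp_dir="$1"
  local scores_file="${exp_dir}/dev_scores.json"

  if [ ! -s "$scores_file" ]; then
    echo "missing or empty: ${scores_file}" >&2
    return 1
  fi

  # Assumption: one JSON record per line, newest last.
  tail -n 1 "$scores_file"
}
```

For the run above, the call would be `show_dev_scores exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k`.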

I hope that helps!
