Reproduce precision #54

Open
HHHH17 opened this issue Apr 3, 2023 · 0 comments

HHHH17 commented Apr 3, 2023

Dear author, I have tried to reproduce the CVPR 2022 COSNet model, but in the XE training stage the highest CIDEr only reached 1.260, and after SC training it only reached 1.393, which is much lower than the 1.411 reported in the paper. I post my XE training config below; can you help me find any mistakes? Thank you.

CUDNN_BENCHMARK: true
DATALOADER:
  ANNO_FOLDER: ../open_source_dataset/mscoco_dataset
  ATTRIBUTE_FILE: ''
  FEATS_FOLDER: ../open_source_dataset/mscoco_dataset/features/CLIP_RN101_49
  FILE_PATHS: []
  GV_FEAT_FILE: ''
  INF_BATCH_SIZE: 200
  MAX_FEAT_NUM: 50
  NEGATIVE_SIZE: -1
  NUM_WORKERS: 6
  RELATION_FILE: ''
  SAMPLE_IDS: []
  SAMPLE_PROB: 0.2
  SEQ_PER_SAMPLE: 5
  TEST_BATCH_SIZE: 32
  TRAIN_BATCH_SIZE: 8
  USE_GLOBAL_V: true
DATASETS:
  TEST: MSCoCoCOSNetDataset
  TRAIN: MSCoCoCOSNetDataset
  VAL: MSCoCoCOSNetDataset
DECODE_STRATEGY:
  BEAM_SIZE: 3
  NAME: BeamSearcher
ENGINE:
  NAME: DefaultTrainer
INFERENCE:
  GENERATION_MODE: true
  ID_KEY: image_id
  NAME: COCOEvaler
  TEST_ANNFILE: ../open_source_dataset/mscoco_dataset/captions_test5k.json
  TEST_EVAL_START: -1
  VALUE: caption
  VAL_ANNFILE: ../open_source_dataset/mscoco_dataset/captions_val5k.json
  VAL_EVAL_START: -1
  VOCAB: ../open_source_dataset/mscoco_dataset/vocabulary.txt
LOSSES:
  LABELSMOOTHING: 0.1
  MARGIN: 0.2
  MAX_VIOLATION: true
  NAMES:
  - LabelSmoothing
  - SemComphderLoss
LR_SCHEDULER:
  FACTOR: 1.0
  GAMMA: 0.1
  MIN_LR: 1.0e-05
  MODEL_SIZE: 512
  NAME: NoamLR
  STEPS:
  - 3
  STEP_SIZE: 3
  WARMUP: 20000
  WARMUP_FACTOR: 0.0
  WARMUP_METHOD: linear
MODEL:
  BERT:
    ATTENTION_PROBS_DROPOUT_PROB: 0.1
    FFN_DROPOUT_PROB: 0.2
    G_LAYER_DROP: 0.0
    HIDDEN_ACT: relu
    HIDDEN_DROPOUT_PROB: 0.1
    HIDDEN_SIZE: 512
    INTERMEDIATE_DROP: 0.2
    INTERMEDIATE_SIZE: 2048
    LAYER_DROP: 0.0
    NUM_ATTENTION_HEADS: 8
    NUM_GENERATION_LAYERS: 6
    NUM_HIDDEN_LAYERS: 6
    NUM_UNDERSTANDING_LAYERS: 6
    U_LAYER_DROP: 0.0
    V_LAYER_DROP: 0.0
    V_NUM_HIDDEN_LAYERS: 6
    V_TARGET_SIZE: 0
  COSNET:
    FILTER_WEIGHT: 1.0
    MAX_POS: 26
    NUM_CLASSES: 906
    NUM_SEMCOMPHDER_LAYERS: 3
    RECONSTRUCT_WEIGHT: 0.1
    SLOT_SIZE: 6
  DECODER: COSNetDecoder
  DECODER_DIM: 512
  DEVICE: cuda
  EMA_DECAY: 0.9999
  ENCODER: COSNetEncoder
  ENCODER_DIM: 512
  ENSEMBLE_WEIGHTS:
  - ''
  ITM_NEG_PROB: 0.5
  MAX_SEQ_LEN: 20
  META_ARCHITECTURE: TransformerEncoderDecoder
  MODEL_WEIGHTS:
  - 1.0
  - 1.0
  PREDICTOR: BasePredictor
  PRED_DROPOUT: 0.5
  PRETRAINING:
    DO_LOWER_CASE: true
    FROM_PRETRAINED: bert-base-uncased
    MODEL_NAME: bert-base-uncased
  TOKEN_EMBED:
    ACTIVATION: none
    DIM: 512
    DROPOUT: 0.1
    ELU_ALPHA: 0.5
    NAME: TokenBaseEmbedding
    POSITION: SinusoidEncoding
    POSITION_MAX_LEN: 5000
    TYPE_VOCAB_SIZE: 0
    USE_NORM: true
  USE_EMA: false
  VISUAL_EMBED:
    ACTIVATION: relu
    DROPOUT: 0.5
    ELU_ALPHA: 0.5
    G_IN_DIM: 512
    IN_DIM: 2048
    LOCATION_SIZE: 0
    NAME: VisualGridEmbedding
    OUT_DIM: 512
    USE_NORM: true
  VOCAB_SIZE: 10200
  V_PREDICTOR: ''
  WEIGHTS: ''
OUTPUT_DIR: ./cosnet_output_baseline
SCHEDULED_SAMPLING:
  INC_EVERY_EPOCH: 3
  INC_PROB: 0.05
  MAX_PROB: 0.5
  START_EPOCH: 9999
SCORER:
  CIDER_CACHED: ../open_source_dataset/mscoco_dataset/mscoco_train_cider.pkl
  EOS_ID: 0
  GT_PATH: ../open_source_dataset/mscoco_dataset/mscoco_train_gts.pkl
  NAME: BaseScorer
  TYPES:
  - Cider
  WEIGHTS:
  - 1.0
SEED: -1
SOLVER:
  ALPHA: 0.99
  AMSGRAD: false
  BASE_LR: 0.0005
  BETAS:
  - 0.9
  - 0.999
  BIAS_LR_FACTOR: 1.0
  CENTERED: false
  CHECKPOINT_PERIOD: 1
  DAMPENING: 0.0
  EPOCH: 35
  EPS: 1.0e-08
  EVAL_PERIOD: 1
  GRAD_CLIP: 0.1
  GRAD_CLIP_TYPE: value
  INITIAL_ACCUMULATOR_VALUE: 0.0
  LR_DECAY: 0.0
  MOMENTUM: 0.9
  NAME: Adam
  NESTEROV: 0.0
  NORM_TYPE: 2.0
  WEIGHT_DECAY: 0.0
  WEIGHT_DECAY_BIAS: 0.0
  WEIGHT_DECAY_NORM: 0.0
  WRITE_PERIOD: 20
VERSION: 1
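
A quick sanity check on the schedule in this config: with NoamLR, MODEL_SIZE 512, WARMUP 20000 and FACTOR 1.0, the peak learning rate works out to roughly 3.1e-4 at step 20000, and how quickly training reaches it depends on the effective batch size (TRAIN_BATCH_SIZE 8 here, times the number of GPUs). Below is a minimal sketch assuming the standard Noam formula; whether BASE_LR or the GPU count additionally scales it is trainer-specific, so that part is an assumption rather than something taken from the repo.

def noam_lr(step: int, model_size: int = 512, factor: float = 1.0, warmup: int = 20000) -> float:
    """Learning rate at a given optimizer step under the standard Noam schedule."""
    step = max(step, 1)  # guard against division by zero at step 0
    return factor * model_size ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

if __name__ == "__main__":
    # The peak occurs at step == warmup; with a larger effective batch the same
    # epoch count covers fewer optimizer steps, so the schedule may stay in warmup longer.
    for step in (1000, 5000, 20000, 50000, 100000):
        print(f"step {step:>6}: lr = {noam_lr(step):.6f}")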