[Reimplementation] KeyError when running evaluation #35

Open
3 tasks done
chrimidt opened this issue Jul 12, 2024 · 3 comments

@chrimidt

Prerequisite

💬 Describe the reimplementation questions

Hi, I'm trying to evaluate the model, but I'm running into a KeyError on "start_coord" when running

python ./tools/test.py "configs\cfinet\faster_rcnn_r50_fpn_cfinet_1x.py" "work_dirs\epoch_12.pth" --eval bbox

using the pre-trained weights.

This is the output when running:

(my_env) C:\Users\min_s\cvproject\CFINet>python ./tools/test.py "C:\Users\min_s\cvproject\CFINet\configs\cfinet\faster_rcnn_r50_fpn_cfinet_1x.py" "C:\Users\min_s\cvproject\CFINet\work_dirs\epoch_12.pth" --eval bbox
c:\users\min_s\cvproject\cfinet\mmdet\utils\setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
c:\users\min_s\cvproject\cfinet\mmdet\utils\setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
loading annotations into memory...
Done (t=0.45s)
creating index...
index created!
loading annotations into memory...
Done (t=0.28s)
creating index...
index created!
c:\users\min_s\cvproject\cfinet\mmdet\models\losses\iou_loss.py:266: UserWarning: DeprecationWarning: Setting "linear=True" in IOULoss is deprecated, please use "mode=linear" instead.
warnings.warn('DeprecationWarning: Setting "linear=True" in '
c:\users\min_s\cvproject\cfinet\mmdet\models\dense_heads\anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
c:\users\min_s\cvproject\cfinet\mmdet\models\dense_heads\anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
load checkpoint from local path: C:\Users\min_s\cvproject\CFINet\work_dirs\epoch_12.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias

missing keys in source state_dict: rpn_head.stages.0.rpn_conv.conv.weight, rpn_head.stages.0.rpn_conv.conv.bias, rpn_head.stages.0.rpn_reg.weight, rpn_head.stages.0.rpn_reg.bias, rpn_head.stages.1.rpn_conv.conv.weight, rpn_head.stages.1.rpn_cls.weight, rpn_head.stages.1.rpn_cls.bias, rpn_head.stages.1.rpn_reg.weight, rpn_head.stages.1.rpn_reg.bias

[ ] 0/7428, elapsed: 0s, ETA:C:\Users\min_s\cvproject\my_env\lib\site-packages\torch\functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 7428/7428, 10.6 task/s, elapsed: 699s, ETA: 0s

Merge detected results of patch for whole image evaluating...
Traceback (most recent call last):
  File "./tools/test.py", line 287, in <module>
    main()
  File "./tools/test.py", line 279, in main
    metric = dataset.evaluate(outputs, **eval_kwargs)
  File "c:\users\min_s\cvproject\cfinet\mmdet\datasets\sodad.py", line 506, in evaluate
    merged_results = self.merge_dets(
  File "c:\users\min_s\cvproject\cfinet\mmdet\datasets\sodad.py", line 426, in merge_dets
    x_start, y_start = data_info['start_coord']
KeyError: 'start_coord'

Any clue as to what I'm missing here? Thank you!

Environment

Same as described in the repository.

sys.platform: win32
Python: 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3070 Ti Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
NVCC: Cuda compilation tools, release 11.3, V11.3.58
MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.40.33812 for x64
GCC: n/a
PyTorch: 1.10.0+cu113
PyTorch compiling details: PyTorch built with:

  • C++ Version: 199711
  • MSVC 192829337
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 2019
  • LAPACK is enabled (usually provided by MKL)
  • CPU capability usage: AVX2
  • CUDA Runtime 11.3
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.2
  • Magma 2.5.4
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=C:/w/b/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/w/b/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON,

TorchVision: 0.11.0+cu113
OpenCV: 4.10.0
MMCV: 1.5.0
MMCV Compiler: MSVC 192930140
MMCV CUDA Compiler: 11.3
MMDetection: 2.26.0+2167eeb

Expected results

No response

Additional information

In configs/_base_/datasets/sodad.py:

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'divData/Annotations/train.json',
        img_prefix=data_root + 'divData/Images/',
        pipeline=train_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/train.json'
    ),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'rawData/Annotations/val.json',
        img_prefix=data_root + 'rawData/Images/',
        pipeline=test_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/val_wo_ignore.json'
    ),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'rawData/Annotations/test.json',
        img_prefix=data_root + 'rawData/Images/',
        pipeline=test_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/test_wo_ignore.json'
    ))

@chrimidt changed the title from "[Reimplementation]" to "[Reimplementation] KeyError when running evaluation" on Jul 12, 2024
@shaunyuan22
Owner

The ann_file and img_prefix in the val and test sections should point to the divided data (divData), not the raw data (rawData).
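
A quick way to tell which kind of annotation file a config points at is to look for the key that the failing line in merge_dets reads. The sketch below is only an illustration, not code from the repository. It assumes that the annotations are COCO-style JSON and that the divided-data files store start_coord on each entry of the "images" list; the helper name and the miniature dictionaries are hypothetical.

```python
def images_missing_start_coord(coco_dict):
    """Return file names of image entries that lack a 'start_coord' key.

    Assumption: the patch offset is stored per entry in the COCO-style
    'images' list. An empty result means every entry carries the key.
    """
    return [
        img.get("file_name", "<unnamed>")
        for img in coco_dict.get("images", [])
        if "start_coord" not in img
    ]


# Hypothetical miniature annotation dicts, for illustration only.
raw_like = {"images": [{"id": 1, "file_name": "00001.jpg"}]}
div_like = {
    "images": [{"id": 1, "file_name": "00001_0.jpg", "start_coord": [0, 0]}]
}

print(images_missing_start_coord(raw_like))  # entries without the key
print(images_missing_start_coord(div_like))  # empty list if all have it
```

On a real file, the dictionary would come from json.load on the path given as ann_file. A non-empty result for the val or test file would be consistent with the KeyError reported above.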

@chrimidt
Author

Thank you for your reply, that worked! I had mixed up the data somehow. However, I suspect I'm still not evaluating on the correct data, because the evaluation finishes with these results:

(my_env) C:\Users\min_s\cvproject\CFINet>python ./tools/test.py "C:\Users\min_s\cvproject\CFINet\configs\cfinet\faster_rcnn_r50_fpn_cfinet_1x.py" "C:\Users\min_s\cvproject\CFINet\configs\cfinet\faster_rcnn_r50_fpn_fi_roi_head_0822\epoch_12.pth" --work-dir C:\Users\min_s\cvproject\CFINet\work_dirs\Test --eval bbox
c:\users\min_s\cvproject\cfinet\mmdet\utils\setup_env.py:38: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
c:\users\min_s\cvproject\cfinet\mmdet\utils\setup_env.py:48: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
loading annotations into memory...
Done (t=1.72s)
creating index...
index created!
loading annotations into memory...
Done (t=0.44s)
creating index...
index created!
c:\users\min_s\cvproject\cfinet\mmdet\models\losses\iou_loss.py:266: UserWarning: DeprecationWarning: Setting "linear=True" in IOULoss is deprecated, please use "mode=linear" instead.
warnings.warn('DeprecationWarning: Setting "linear=True" in '
c:\users\min_s\cvproject\cfinet\mmdet\models\dense_heads\anchor_head.py:116: UserWarning: DeprecationWarning: num_anchors is deprecated, for consistency or also use num_base_priors instead
warnings.warn('DeprecationWarning: num_anchors is deprecated, '
c:\users\min_s\cvproject\cfinet\mmdet\models\dense_heads\anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
load checkpoint from local path: C:\Users\min_s\cvproject\CFINet\configs\cfinet\faster_rcnn_r50_fpn_fi_roi_head_0822\epoch_12.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: rpn_head.rpn_conv.weight, rpn_head.rpn_conv.bias, rpn_head.rpn_cls.weight, rpn_head.rpn_cls.bias, rpn_head.rpn_reg.weight, rpn_head.rpn_reg.bias

missing keys in source state_dict: rpn_head.stages.0.rpn_conv.conv.weight, rpn_head.stages.0.rpn_conv.conv.bias, rpn_head.stages.0.rpn_reg.weight, rpn_head.stages.0.rpn_reg.bias, rpn_head.stages.1.rpn_conv.conv.weight, rpn_head.stages.1.rpn_cls.weight, rpn_head.stages.1.rpn_cls.bias, rpn_head.stages.1.rpn_reg.weight, rpn_head.stages.1.rpn_reg.bias

[ ] 0/161579, elapsed: 0s, ETA:C:\Users\min_s\cvproject\my_env\lib\site-packages\torch\functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 161579/161579, 10.2 task/s, elapsed: 15886s, ETA: 0s

Merge detected results of patch for whole image evaluating...
Merge results completed, it costs 21.4 seconds.

Evaluating bbox...
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=3.94s).
Accumulating evaluation results...
DONE (t=0.66s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= Small | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= Small | maxDets=100 ] = 0.003
Average Precision (AP) @[ IoU=0.75 | area= Small | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50:0.95 | area= eS | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50:0.95 | area= rS | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= gS | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=Normal | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= Small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= Small | maxDets=300 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= Small | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= eS | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= rS | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= gS | maxDets=1000 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=Normal | maxDets=1000 ] = 0.000

+---------------+-------+----------------+-------+--------------+-------+
| category | AP | category | AP | category | AP |
+---------------+-------+----------------+-------+--------------+-------+
| people | 0.000 | rider | 0.000 | bicycle | 0.000 |
| motor | 0.000 | vehicle | 0.003 | traffic-sign | 0.001 |
| traffic-light | 0.002 | traffic-camera | 0.000 | warning-cone | 0.000 |
+---------------+-------+----------------+-------+--------------+-------+
{'bbox_AP': 0.001, 'bbox_AP_50': 0.003, 'bbox_AP_75': 0.001, 'bbox_AP_eS': 0.001, 'bbox_AP_rS': 0.0, 'bbox_AP_gS': 0.0, 'bbox_AP_Normal': 0.0, 'bbox_mAP_copypaste': '0.001 0.003 0.001 0.001 0.000 0.000 0.000 '}

The config file:

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'divData/Annotations/train.json',
        img_prefix=data_root + 'divData/Images/train',
        pipeline=train_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/train.json'
    ),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'divData/Annotations/val.json',
        img_prefix=data_root + 'divData/Images/val',
        pipeline=test_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/val_wo_ignore.json'
    ),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'divData/Annotations/test.json',
        img_prefix=data_root + 'divData/Images/test',
        pipeline=test_pipeline,
        ori_ann_file=data_root + 'rawData/Annotations/test_wo_ignore.json'
    ))

Do you have any idea what could be wrong? I doubt the results are meant to be close to 0.

I'm hoping for your reply.
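
Both logs above also report that the model and the loaded state dict do not match exactly: the checkpoint contains rpn_head.rpn_* keys, while the model expects rpn_head.stages.* keys. Parameters reported as missing keep their initial values after a non-strict load, so this mismatch might contribute to the near-zero AP, although nothing in the thread confirms it. A minimal sketch of how such a comparison could be reproduced on plain key lists follows; the function name and the backbone key are hypothetical, and the rpn_head keys are copied from the log.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare two collections of parameter names.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not use
    """
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)
    unexpected = sorted(ckpt_set - model_set)
    return missing, unexpected


# rpn_head names are taken from the log; the backbone name is a
# hypothetical placeholder for a parameter present on both sides.
model_keys = [
    "backbone.conv1.weight",
    "rpn_head.stages.0.rpn_reg.weight",
    "rpn_head.stages.1.rpn_cls.weight",
]
checkpoint_keys = [
    "backbone.conv1.weight",
    "rpn_head.rpn_conv.weight",
    "rpn_head.rpn_cls.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With PyTorch, the two key lists would typically come from model.state_dict() and from the dictionary returned by torch.load on the checkpoint; whether the weights sit under a nested 'state_dict' entry depends on how the checkpoint was saved.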

@shaunyuan22
Owner

shaunyuan22 commented Jul 17, 2024

It seems that the training failed. What batch size and learning rate did you use? The training log would help to figure out this problem.
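
As background to the batch size and learning rate question: MMDetection 2.x configs commonly follow the linear scaling rule, where the learning rate is scaled in proportion to the total batch size. The sketch below assumes the usual MMDetection reference point of lr=0.02 for a total batch of 16 (8 GPUs with 2 images each); the CFINet config may use different values, so treat the numbers as placeholders. The function name is hypothetical.

```python
def scaled_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linear scaling rule: lr grows in proportion to the total batch size."""
    total_batch = num_gpus * samples_per_gpu
    return base_lr * total_batch / base_batch_size


# Assumed reference point: lr=0.02 at batch size 16 (not taken from this thread).
# samples_per_gpu=2 matches the config quoted above; one GPU matches the
# environment listing.
single_gpu_lr = scaled_lr(base_lr=0.02, base_batch_size=16,
                          num_gpus=1, samples_per_gpu=2)
print(single_gpu_lr)
```

Under these assumptions, training on a single GPU with the reference learning rate left unchanged would use a rate eight times larger than the rule suggests.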
