Average Precision Calculation does not calculate area under curve #46

Open
JFT90 opened this issue Feb 2, 2023 · 6 comments

@JFT90 commented Feb 2, 2023

In the metric code, the AP value seems to be calculated incorrectly.
https://github.com/valeoai/RADIal/blob/main/FFTRadNet/utils/metrics.py#L199

AP should be calculated as the area under the precision-recall curve instead of just taking the mean over precisions:
https://en.wikipedia.org/w/index.php?title=Information_retrieval&oldid=793358396#Average_precision
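
For a quick illustration (with made-up recall/precision values, not taken from the repo), the two quantities can differ noticeably:

import numpy as np

# hypothetical operating points, sorted by ascending recall
recalls = np.array([0.2, 0.5, 0.8])
precisions = np.array([0.9, 0.7, 0.6])

mean_precision = precisions.mean()  # what the current code effectively does: ~0.733
# step-wise area under the PR curve: 0.2*0.9 + 0.3*0.7 + 0.3*0.6 = 0.57
auc = np.sum(np.diff(np.concatenate(([0.0], recalls))) * precisions)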

@ArthurOuaknine (Collaborator)

Hello.
In practice, the area under the precision-recall curve is approximated by computing the average of the precisions at discretized values of recall (see your own Wikipedia link, section "Average precision", equation 4).
In the object detection framework, these recall thresholds are equivalent to Intersection over Union (IoU) thresholds.
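
For reference, the quantity being approximated is the area under the precision-recall curve,

$$\text{AP} = \int_0^1 p(r)\,dr,$$

and the discretization simply replaces this integral with an average of precisions taken at a fixed set of recall levels.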

The common practice in object detection (PASCAL VOC, COCO and so on) is to define average precision either with a set of IoU thresholds, applying the same equation as explained on the Wikipedia page (e.g. $\text{AP}^{0.1 : 0.9}$), or with a single IoU threshold (e.g. $\text{AP}^{0.5}$).

In our case, we consider only a single IoU threshold (0.5) and compute the AP as the average of the precisions of the kept predictions. There is no error; this is a basic metric in the PASCAL VOC benchmark. Feel free to propose your own evaluation metric if you prefer.

For further details, please have a look at this link.
Hope this helps.
Arthur

@JFT90 commented Feb 3, 2023

Hi,
Thanks for your quick reply.
I don't think your code does what is stated in either your link (see: https://pyimagesearch.com/2022/05/02/mean-average-precision-map-using-the-coco-evaluator/#H3Calc) or equation 4 on Wikipedia.

If you want to take the mean, you need to calculate the interpolated precisions first, over equally spaced recall levels:

$$\text{AP} = \frac{1}{11} \sum_{r \in \{0, 0.1, \ldots, 1.0\}} p_{\text{interp}}(r)$$

$$p_{\text{interp}}(r) = \max_{\tilde{r} \ge r} p(\tilde{r})$$

I think in your code the interpolated precision is not calculated (second equation).

For example, check out lines 135-138 in the nuScenes devkit for interpolation of the precision on equally spaced recall levels (a rough sketch of this idea follows below):
https://github.com/nutonomy/nuscenes-devkit/blob/9cba5925dc51b5c855252ee031eee5d6422f556b/python-sdk/nuscenes/eval/detection/algo.py#L135
Or the PIXOR implementation:
https://github.com/philip-huang/PIXOR/blob/2f64e84f72d915f8b5b80d4615bc9a0c881b5dab/srcs/postprocess.py#L159
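
A rough sketch of that interpolation idea (a hypothetical helper, not the actual devkit code; it assumes recalls is sorted in ascending order):

import numpy as np

def ap_at_equally_spaced_recalls(recalls, precisions, n_points=101):
    # sample the precision at equally spaced recall levels and take the mean;
    # recall levels beyond the maximum achieved recall contribute zero precision
    rec_interp = np.linspace(0.0, 1.0, n_points)
    prec_interp = np.interp(rec_interp, recalls, precisions, right=0.0)
    return float(prec_interp.mean())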

I don't want to suggest an additional evaluation metric but want to correct the existing one.

Best regards!

Edit: To make it clearer: you are only "integrating" over a limited recall range. The recalls for the confidence thresholds from 0.1 to 0.9 (for your pretrained model) are: [0.7301587301587301, 0.7753968253968254, 0.8015873015873016, 0.8158730158730159, 0.8285714285714286, 0.8420634920634921, 0.8523809523809524, 0.8690476190476191, 0.8809523809523809]

@ArthurOuaknine (Collaborator)

Ok I see.
It should be related to this loop:

for threshold in np.arange(0.1,0.96,0.1):

The current thresholds are [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9].
The code should be for threshold in np.arange(0, 1.1, 0.1): to include the thresholds 0.0 and 1.0.
I did not implement this code, so I don't know why this choice was made, but it looks like we're missing two threshold values.
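
For illustration (assuming standard NumPy behavior), the two ranges give:

import numpy as np

print(np.arange(0.1, 0.96, 0.1))  # 9 thresholds: 0.1, 0.2, ..., 0.9
print(np.arange(0.0, 1.1, 0.1))   # 11 thresholds: 0.0, 0.1, ..., 1.0
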
@jrebut any idea?

Thanks for pointing this out.

@JFT90 commented Feb 5, 2023

I expect that just adding more confidence thresholds will still not correctly sample the precision-recall curve.
Maybe add this code to correctly compute the AP from the recalls and precisions (similar to the PIXOR implementation):

import numpy as np


def get_envelope(precisions):
    """Compute the precision envelope (make precision monotonically non-increasing).
    Args:
      precisions: precision values ordered by ascending recall.
    Returns:
      The enveloped precisions, where each entry is the maximum precision
      achieved at any equal or higher recall.
    """
    for i in range(precisions.size - 1, 0, -1):
        precisions[i - 1] = np.maximum(precisions[i - 1], precisions[i])
    return precisions


def get_ap(recalls, precisions):
    """Calculate average precision as the area under the precision-recall curve.
    Args:
      recalls: sorted recall list - ascending order
      precisions: precision list in order corresponding to sorted recall list
    Returns (float): average precision.
    """
    # first append sentinel values at both ends
    recalls = np.concatenate(([0.0], recalls, [1.0]))
    precisions = np.concatenate(([0.0], precisions, [0.0]))

    precisions = get_envelope(precisions)

    # to calculate area under PR curve, look for points where X axis (recall) changes value
    i = np.where(recalls[1:] != recalls[:-1])[0]

    # and sum (\Delta recall) * prec
    ap = np.sum((recalls[i + 1] - recalls[i]) * precisions[i + 1])
    return ap

Change the loop to ensure correct sorting:
for threshold in np.linspace(0.001,0.95,19)[::-1]:

and instead of np.mean(perfs['precision']):

ap = get_ap(perfs['recall'], perfs['precision'])
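
For instance, a hypothetical end-to-end check (with made-up recall/precision values, not tied to the repo's evaluation loop) would look like:

import numpy as np

# precision/recall pairs collected over descending confidence thresholds,
# so that recall is in ascending order as get_ap expects
perfs = {
    'recall':    np.array([0.10, 0.35, 0.60, 0.80, 0.88]),
    'precision': np.array([0.95, 0.90, 0.80, 0.70, 0.60]),
}

ap = get_ap(perfs['recall'], perfs['precision'])
print(ap)  # area under the enveloped PR curve, ~0.708 for these numbers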

@ArthurOuaknine (Collaborator)

Thanks for your valuable comment; it is appreciated.
I will leave this issue open and let @jrebut do the integration and verify the performance.

@JFT90 commented Feb 6, 2023

You are welcome. I think the issue is currently set to closed, so it might be good to reopen it.

@ArthurOuaknine reopened this Feb 6, 2023