When a bin has samples of the same probability #19

forrestbao · 2024-05-14T01:22:44Z

Hi,

I just came across an interesting corner case: some bins have samples of the same probability.

The code below reproduces it.

import calibration as cal

# Two-class predicted probabilities for 12 samples
model_probs = [[0.5507, 0.4493],
               [0.8764, 0.1236],
               [0.1822, 0.8178],
               [0.3814, 0.6186],
               [0.9725, 0.0275],
               [0.281,  0.719 ],
               [0.8817, 0.1183],
               [0.8193, 0.1807],
               [0.4806, 0.5194],
               [0.9415, 0.0585],
               [0.4648, 0.5352],
               [0.9561, 0.0439]]
labels = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]

calibrator = cal.PlattBinnerMarginalCalibrator(len(labels), num_bins=4)
calibrator.train_calibration(model_probs, labels)
# Inspect the learned bin edges (private attribute)
print(calibrator._bins)

The shape of the first row in calibrator._bins is (3,) instead of (4,) as expected.

We looked into the cause and found that the last two bins contain samples with the same probability.
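To illustrate the mechanism we suspect, here is a minimal standalone sketch. It is not the library's code: `equal_mass_bin_edges` is a hypothetical helper, and it assumes that bin edges are taken as midpoints between neighbouring equal-sized chunks of the sorted probabilities and that duplicate edges are then dropped. It also assumes there are at least `num_bins` samples.

```python
import numpy as np

def equal_mass_bin_edges(probs, num_bins):
    """Hypothetical helper: upper bin edges for equal-mass binning.

    Assumption: edges are midpoints between adjacent chunks of the sorted
    values, the last edge is 1.0, and duplicate edges are removed.
    """
    sorted_probs = np.sort(np.asarray(probs, dtype=float))
    chunks = np.array_split(sorted_probs, num_bins)
    edges = [float((chunks[i][-1] + chunks[i + 1][0]) / 2.0)
             for i in range(len(chunks) - 1)]
    edges.append(1.0)
    # Deduplication is what silently reduces the number of bins.
    return sorted(set(edges))

distinct = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
tied = [0.1, 0.2, 0.3, 0.4, 1.0, 1.0, 1.0, 1.0]

print(len(equal_mass_bin_edges(distinct, 4)))  # 4 edges
print(len(equal_mass_bin_edges(tied, 4)))      # 3 edges: the last two collapse
```

In the `tied` input, the midpoint between the last two chunks is already 1.0, so it coincides with the final edge and one bin disappears, which matches the (3,) shape we observed.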

We are wondering whether, in such a case, an error should be raised or a small amount of noise should be added to the probabilities.
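The two options could look roughly like the sketch below. Everything here is hypothetical and independent of the library: `raw_bin_edges`, `strict_bin_edges`, and `jittered_bin_edges` are names made up for illustration, the midpoint-based edge rule is an assumption, and the noise scale of 1e-6 is arbitrary. Noise is subtracted rather than added so that values tied at 1.0 stay inside [0, 1].

```python
import numpy as np

def raw_bin_edges(probs, num_bins):
    """Hypothetical: midpoint edges between equal-sized chunks, no dedup."""
    chunks = np.array_split(np.sort(np.asarray(probs, dtype=float)), num_bins)
    edges = [float((chunks[i][-1] + chunks[i + 1][0]) / 2.0)
             for i in range(num_bins - 1)]
    return edges + [1.0]

def strict_bin_edges(probs, num_bins):
    """Option 1: raise instead of silently returning fewer bins."""
    edges = raw_bin_edges(probs, num_bins)
    if len(set(edges)) < num_bins:
        raise ValueError(
            f"only {len(set(edges))} distinct bin edges for num_bins={num_bins}; "
            "the input probably contains tied probabilities"
        )
    return edges

def jittered_bin_edges(probs, num_bins, scale=1e-6, seed=0):
    """Option 2: break ties with tiny noise before binning."""
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)
    # Subtract noise so ties at 1.0 move inward, then clip at 0.
    noisy = np.clip(probs - rng.uniform(0.0, scale, size=probs.shape), 0.0, 1.0)
    return strict_bin_edges(noisy, num_bins)

tied = [0.1, 0.2, 0.3, 0.4, 1.0, 1.0, 1.0, 1.0]

try:
    strict_bin_edges(tied, 4)
except ValueError as err:
    print("error:", err)

print(len(jittered_bin_edges(tied, 4)))  # 4 edges after tie-breaking
```

Raising keeps the behavior explicit, while jitter keeps the requested number of bins at the cost of slightly perturbing the inputs. We are not sure which of the two fits the intended semantics of the calibrator better.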
