Discussion of False Positives

As discussed in the paper, humans are not well suited to judge the realism of LiDAR point clouds. Here we give several examples of true and false positives identified in the study.

False Positives

As discussed in Section IV-C-3, the voting mechanism sets a threshold, V, for determining whether a test case is a false positive: a test case is marked as a false positive if V or more SUTs failed on it. In Table IV we use V=3 as the threshold. Below we examine several examples of the false positives found:
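The voting mechanism can be sketched as follows. This is an illustrative implementation only; the function and variable names (`find_false_positives`, `failures`) are assumptions, not taken from the study's code.

```python
# Sketch of the voting mechanism from Section IV-C-3: a test case is flagged
# as a false positive when V or more SUTs fail on it.
# `failures` maps each test case ID to the set of SUTs that failed on it
# (hypothetical structure for illustration).

def find_false_positives(failures: dict[str, set[str]], v: int = 3) -> set[str]:
    """Return the test cases on which V or more SUTs failed."""
    return {test_id for test_id, failed in failures.items() if len(failed) >= v}

failures = {
    "t1": {"Cylinder3D", "SPVNAS", "SalsaNext"},  # 3 SUTs failed -> flagged at V=3
    "t2": {"JS3C-Net"},                           # 1 SUT failed  -> kept as a real failure
}

find_false_positives(failures, v=3)  # {"t1"}
```

Raising V makes the check stricter: fewer test cases are discarded as false positives.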

Accuracy Metric

Under the accuracy metric we find only 6 of the 173 failures are false positives. We examine two examples below:
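For context, the accuracy metric can be sketched as per-point labeling agreement. This is an assumption for illustration; the study's exact definition is given in the paper.

```python
import numpy as np

# Illustrative sketch only: assuming the accuracy metric is the fraction of
# points whose predicted semantic label matches the ground-truth label.
def point_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    return float(np.mean(pred == truth))

pred  = np.array([1, 1, 2, 3, 3])  # hypothetical per-point class labels
truth = np.array([1, 1, 2, 2, 3])
point_accuracy(pred, truth)        # 4 of 5 points correct -> 0.8
```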

Example 1

In the below example, the mutation takes a point cloud containing a truck and lowers the intensity of the truck's points well below their original values. Looking at the image, we can clearly see that the truck now appears much darker; however, it appears to be at a similar intensity level to the car behind it in the background. This clearly demonstrates how humans are ill-suited to judging realism for LiDAR, as we have no inherent intuition for intensities.

Original Labeling

Original Intensities

With Vehicle Intensities Altered

The JS3C-Net (17.5%), SalsaNext (17.2%), and SqueezeNetV3 (14.9%) models failed on this test. Each SUT mislabeled different regions of the truck, variously confusing it for a truck, car, or bus. Further, these mislabelings appear in splotches that do not seem to correlate with any specific part of the vehicle.

JS3C-Net Performance

SalsaNext Performance

SqueezeNetV3 Performance

Example 2

In the below example, the mutation takes a point cloud in which the ego vehicle is driving on a street and adds a car partially on the street and partially in the grass. Visually inspecting the new image, there is no clear indication of why this might be unrealistic. The added vehicle sits partially off the roadway, but this is a plausible location for a car; perhaps it had an accident and pulled off the road into the grass. The car appears large; however, it is at the same distance as in its original point cloud, so this is realistic. There are a few regions around the car with no LiDAR readings (shown in black), but these occur at various places around the image, so while they may be part of the issue, it is not possible to intuit that directly.

Original Labeling

With Vehicle Added

The Cylinder3D (15.1%), SPVNAS (7.0%), and SalsaNext (8.1%) models failed on this test. Their labelings are shown below. Each SUT mislabeled different regions of the added vehicle, variously confusing it for a car, bus, or building.

Cylinder3D Performance on new PC

SPVNAS Performance on new PC

SalsaNext Performance on new PC

Jaccard Metric

Under the Jaccard metric we find only 175 of the 1813 failures are false positives. We examine two examples below:

Example 1

In the below example, the mutation adjusts the scale of the truck (purple) in the upper right of the image. This is the only example where our intuition may tell us this could be a false positive. We can see that in the updated image, the truck appears to have shrunk and left a hole shown in black. Further investigation is required to understand what might have caused this issue.

Original

With Vehicle Scale Adjusted

The Cylinder3D (5.7), SPVNAS (7.6), and JS3C-Net (7.5) models failed this test under the Jaccard metric. In each case we can see that the SUT mislabeled the truck as either a bus (blue) or building (orange), which led to a large change in the Jaccard metric since there are no other trucks in the scene.
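The effect described above can be illustrated with a small sketch, assuming the Jaccard metric is the per-class intersection-over-union (IoU) of predicted versus ground-truth point labels. When the scene's only truck is mislabeled, the truck class has zero intersection, so its IoU collapses to 0. The function and labels here are hypothetical, not from the study's code.

```python
import numpy as np

# Per-class Jaccard index (IoU) over point labels: |pred ∩ truth| / |pred ∪ truth|.
def class_iou(pred: np.ndarray, truth: np.ndarray, cls: int) -> float:
    inter = np.sum((pred == cls) & (truth == cls))
    union = np.sum((pred == cls) | (truth == cls))
    return float(inter / union) if union else 1.0  # empty class: treat as perfect

TRUCK, BUS, ROAD = 0, 1, 2
truth = np.array([TRUCK, TRUCK, TRUCK, ROAD, ROAD, ROAD])
pred  = np.array([BUS,   BUS,   BUS,   ROAD, ROAD, ROAD])  # only truck mislabeled as bus

class_iou(pred, truth, TRUCK)  # 0.0 -- the single truck dominates its class score
class_iou(pred, truth, ROAD)   # 1.0 -- road points are unaffected
```

Because the truck is the only instance of its class, one mislabeled object swings that class's score from 1.0 to 0.0, which is why these failures register so strongly under the Jaccard metric.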

Cylinder3D Performance

SPVNAS Performance

JS3C-Net Performance

Example 2

In the below example, the mutation adds a mirrored sign on the sidewalk on the right side of the image. Although sidewalks are typically free of signs, this placement is still physically feasible in rare circumstances, so it is difficult for a human to judge whether there is something unrealistic about how this sign appears in this context. This may demonstrate how this method of determining false positives can overestimate the number of false positives: a very difficult but realistic test case may cause multiple failures and thus be marked as a false positive. In this case, perhaps the sign is entirely realistic, but its uncommon placement yielded multiple failures.

Original

With Added Mirrored Sign

The Cylinder3D (10.2), SalsaNext (10.0), and SqueezeNetV3 (9.6) models failed this test under the Jaccard metric. In all three cases the model mislabeled the added sign as a building; since there are no other signs in the image, missing this class leads to a large change in the Jaccard metric.

Cylinder3D Performance

SalsaNext Performance

SqueezeNetV3 Performance

Different choices of V

Note that the columns below are not cumulative: each "k SUTs" column counts test cases on which exactly k SUTs failed. When evaluating false positives as in the study, a test case is a false positive when the number of failing SUTs is greater than or equal to V, so sum a column and all columns to its right to obtain the aggregate false-positive count for that choice of V.

| Mutation | Accuracy Total | 1 SUT | 2 SUTs | 3 SUTs | 4 SUTs | 5 SUTs | Jaccard Total | 1 SUT | 2 SUTs | 3 SUTs | 4 SUTs | 5 SUTs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Add Rotate | 41 | 27 | 12 | 1 | 1 | --- | 711 | 430 | 204 | 56 | 16 | 5 |
| Add Mirror Rotate | 44 | 35 | 7 | 2 | --- | --- | 751 | 433 | 239 | 61 | 12 | 6 |
| Remove | 6 | 6 | --- | --- | --- | --- | 26 | 25 | 1 | --- | --- | --- |
| Vehicle Intensity | 65 | 55 | 8 | 2 | --- | --- | 182 | 151 | 27 | 4 | --- | --- |
| Vehicle Deform | 5 | 5 | --- | --- | --- | --- | 10 | 10 | --- | --- | --- | --- |
| Vehicle Scale | 12 | 12 | --- | --- | --- | --- | 95 | 70 | 11 | 5 | 9 | --- |
| Sign Replace | 0 | --- | --- | --- | --- | --- | 38 | 34 | 3 | 1 | --- | --- |
| Total | 173 | 140 | 27 | 5 | 1 | --- | 1813 | 1153 | 485 | 127 | 37 | 11 |
| % | | (81%) | (16%) | (3%) | (1%) | --- | | (64%) | (27%) | (7%) | (2%) | (1%) |
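Converting the exact-V columns into the aggregate (greater-than-or-equal-to V) counts can be sketched as below; the values are the accuracy-metric totals from the table, and the function name is illustrative.

```python
# Test cases on which exactly k SUTs failed (accuracy-metric "Total" row above).
exact = {1: 140, 2: 27, 3: 5, 4: 1, 5: 0}

def at_least(exact: dict[int, int], v: int) -> int:
    """Sum the column for V and every column to its right."""
    return sum(count for k, count in exact.items() if k >= v)

at_least(exact, 3)  # 5 + 1 + 0 = 6 false positives at V=3
at_least(exact, 1)  # 173, the full set of accuracy-metric failures
```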