-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ShapeIncreasingDistanceValidator #1083
Conversation
- additional unit test - update documentation to match latest development
return shape.shapePtLon() == otherShape.shapePtLon() | ||
&& shape.shapePtLat() == otherShape.shapePtLat(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the errors we've been seeing a lot is when there are two shape points that are very close to each other (<1 meter apart) and the shape distance was equal. It could've been that the shape distance units in a particular feed were in kilometers and that due to a rounding process the distance was equal and therefore probably an OK value. It would be great if this check here could return true if the coordinates are less than 1 meter apart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question is why 1 meter, why not 80cm or 1.20m?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 meter is arbitrary honestly. Maybe @barbeau can provide some more educated insights into GPS precision and measurement.
RULES.md
Outdated
|
||
<a name="EqualShapeDistanceNotice"/> | ||
|
||
When sorted by `shape.shape_pt_sequence`, two consecutive shape points must not have equal values for `shape_dist_traveled` if their GPS coordinates are the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should read "if their GPS coordinates are not the same". It's an error to have equal distances with different coordinates, it's a warning to have equal distances and equal coordinates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only other concern is that at least for our feed, which generates equal distances every stop pair, we get 10s of thousands of these notices per validation. I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should read "if their GPS coordinates are not the same".
👍🏾
It's an error to have equal distances with different coordinates, it's a warning to have equal distances and equal coordinates.
Or if said coordinates are actually less more than 1 meter away according to #1083 (comment).
My only other concern is that at least for our feed, which generates equal distances every stop pair, we get 10s of thousands of these notices per validation. I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
As of now a maximum of 1000 of notices are exported by type and severity, and the validation report gives information about the total number of notices per type and severity.
gtfs-validator/core/src/main/java/org/mobilitydata/gtfsvalidator/notice/NoticeContainer.java
Line 54 in a871d4b
private static final int MAX_EXPORTS_PER_NOTICE_TYPE_AND_SEVERITY = 1_000; |
gtfs-validator/core/src/main/java/org/mobilitydata/gtfsvalidator/notice/NoticeContainer.java
Lines 190 to 193 in a871d4b
if (i > maxExportsPerNoticeTypeAndSeverity) { | |
// Do not export too many notices for this type. | |
break; | |
} |
I'd love to have the ability to output an abbreviated json validation report that reported the number of times each notices happened, but only the first X occurrences.
I see this as separate from the bug fixed here. We could allow users to reduce this limit in a separate PR by making the limit user configurable. cc @isabelle-dr.
- additional unit test - update documentation to match latest development
Some stats: As of now:
Before the fix:
Also, with this fix no additional error type is introduced (i.e. no dataset go from valid to invalid). cc @isabelle-dr |
Hello, A few thoughts/questions about the original issue and the modification of
Here are other options we could look at:
This makes a lot of sense. As @lionel-nj mentioned, I'd be interested to look at what would be the best threshold to use.
|
I'm in favor of having two distinct rules with different error levels.
I like option 2.
I suppose this could technically be OK in a 3-dimensional sense if there was a transit vehicle that traveled along a pattern that involved an elevator with different "stops" at different levels. However, I don't think that currently exists anywhere yet, so it may be best to keep this as an error.
I'll defer to @barbeau's expertise here.
Yes, this should be an error, but I'd be surprised if it occurred in the wild. |
This would be ideal (i.e., create the threshold based on a percentile based on real data), but to my knowledge not easy to calculate at the moment without a lot of manual effort. One way to approach this is based on precision of decimal places for latitude and longitude. From http://wiki.gis.com/wiki/index.php/Decimal_degrees:
So, if you cap the number of decimal places to 5 the smallest difference between values you can represent is 1.11 meters. Therefore, it would be possible for two values less than 1.11 meters apart to be capped/rounded to the same latitude and longitude value. So perhaps 1.11 meters is a good value? That should capture any "same" values that result from precision/rounding issues at 5 decimal places. That's also better accuracy that you'd typically see with an off-the-shelf consumer-grade GNSS unit capturing a vehicle path in motion. |
I prefer option 2 here.
We have historically scheduled holds at stops as two consecutive departures from the same stop, so those would have shown up as equal distance in stop_times. I believe we now schedule those holds as actual holds with arrival and departure times or post-process them to merge the pair of stops into an arrival and departure time for one stop. The spec seems to allow our old practice, so we'd expect that equal stop time distances would be allowed, if not encouraged.
1.11 m is fine, but realistically, anything less that 13 m (length of our regular buses) is basically the same from an operations perspective.
If the shape contains a loop then you could definitely have two different shape points with the same coordinates and different |
Thanks for the comments on this PR 🙏
Good point. I was thinking of consecutive stops there, which would be an error I think. Let's keep this as a potential candidate for later if we see a need.
Opened #1084 to follow up. So in conclusion, we agree on the option 2:
|
- create EqualShapeDistanceSameCoordinatesNotice - create EqualShapeDistanceDiffCoordinatesNotice - update docstrings
- update unit tests
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
return shape.shapePtLon() != otherShape.shapePtLon() | ||
&& shape.shapePtLat() != otherShape.shapePtLat() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can have different coordinates while either lon XOR lat is identical. Shouldn't this be !(shape.shapePtLon() == otherShape.shapePtLon() && shape.shapePtLat() == otherShape.shapePtLat())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be !(shape.shapePtLon() == otherShape.shapePtLon() && shape.shapePtLat() == otherShape.shapePtLat())
Thanks for flagging that! Modified in 574cd7b.
if (MAX_DISTANCE_SHAPEPOINTS_METERS | ||
< getDistanceMeters(curr.shapePtLatLon(), prev.shapePtLatLon())) { | ||
noticeContainer.addValidationNotice( | ||
new EqualShapeDistanceDiffCoordinatesNotice(prev, curr)); | ||
} | ||
} else if (curr.shapePtLat() == prev.shapePtLat() | ||
&& curr.shapePtLon() == prev.shapePtLon()) { | ||
// equal shape_dist_traveled and same coordinates | ||
noticeContainer.addValidationNotice( | ||
new DecreasingOrEqualShapeDistanceNotice( | ||
curr.shapeId(), | ||
curr.csvRowNumber(), | ||
curr.shapeDistTraveled(), | ||
curr.shapePtSequence(), | ||
prev.csvRowNumber(), | ||
prev.shapeDistTraveled(), | ||
prev.shapePtSequence())); | ||
new EqualShapeDistanceSameCoordinatesNotice(prev, curr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if coordinates are different and getDistanceMeters(curr.shapePtLatLon(), prev.shapePtLatLon()) <= MAX_DISTANCE_SHAPEPOINTS_METERS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, no notice should be triggered, see:
Line 103 in 737adab
public void shapeWithEqualDistance_closeGpsCoordinates_shouldNotGenerateNotice() { |
This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions and comments in-line before approval :)
.../main/java/org/mobilitydata/gtfsvalidator/validator/StopTimeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/mobilitydata/gtfsvalidator/validator/ShapeIncreasingDistanceValidator.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Maxime Armstrong <[email protected]>
…ilityData/gtfs-validator into fix-ShapeIncreasingDistanceValidator
This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: |
.../main/java/org/mobilitydata/gtfsvalidator/validator/StopTimeIncreasingDistanceValidator.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
} | ||
|
||
@Test | ||
public void noShapeDistTravelled_shouldNotGenerateNotice() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great addition 🚀
Thank you for your contribution ! @evansiroky @botanize @maximearmstrong @isabelle-dr |
@@ -34,7 +36,7 @@ public static GtfsShape createShapePoint( | |||
double shapePtLat, | |||
double shapePtLon, | |||
int shapePtSequence, | |||
double shapeDistTraveled) { | |||
Double shapeDistTraveled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for coming in late here, but is there a reason this was changed from double
to Double
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was done in order to support testing agains null value.
closes #1070
Summary:
This PR provides support to: make warning-level notice when an equal
shape_dist_traveled
value is encountered and the previous row in the sequence had the same GPS coordinate.Expected behavior:
DecreasingShapeDistanceNotice
as anERROR
: should be issued whenshape_dist_traveled
decreases between two consecutive shape points (based onshape_pt_sequence
);EqualShapeDistanceDiffCoordinatesNotice
as anERROR
: should be issued whenshape_dist_traveled
is equal between two consecutives shape points that have different GPS coordinates;EqualShapeDistanceSameCoordinatesNotice
as aWARNING
: should be issued whenshape_dist_traveled
is equal between two consecutives shape points that have the same GPS coordinates.@evansiroky @botanize your comments and feedback are more than welcome :)
Please make sure these boxes are checked before submitting your pull request - thanks!
gradle test
to make sure you didn't break anything[ ] Include screenshot(s) showing how this pull request works and fixes the issue(s)