Allow segmentations to be passed in vector format #114

Open
lupinthief opened this issue Nov 7, 2023 · 5 comments

@lupinthief

  • traccuracy version: 0.0.2
  • Python version: 3.8
  • Operating System: Windows 11

Description

For geospatial applications, for cases where the GT and prediction domains don't necessarily match, and to minimise memory use and data storage volumes, it would be useful if segmentations could be passed in vector format rather than as raster/np.array masks.

Looking through the codebase, I think this could be achieved by effectively bypassing the regionprops stage for label extraction and instead passing a pd.DataFrame with label, t, and geometry columns, where the geometry column is a coordinate sequence representing a polygon.
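
For illustration, the input might look something like this (a rough sketch; the exact column names and coordinate convention would be up for discussion):

```python
import pandas as pd

# Hypothetical vector-format segmentation: one row per object per frame,
# with the polygon stored as a coordinate sequence instead of a raster mask.
segmentation = pd.DataFrame(
    {
        "label": [1, 2],
        "t": [0, 0],
        "geometry": [
            [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
            [(5.0, 5.0), (15.0, 5.0), (15.0, 15.0), (5.0, 15.0)],
        ],
    }
)
```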

Matching would then require calculating intersections of these polygons. Shapely handles this nicely for geodata and seems the obvious way to go, though it could probably be implemented without adding the (presumably optional) dependency.
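
In sketch form, the overlap computation with Shapely is only a few lines (illustrative; `polygon_iou` is not an existing traccuracy function):

```python
from shapely.geometry import Polygon

def polygon_iou(coords_a, coords_b):
    """Intersection-over-union of two polygons given as coordinate sequences."""
    a, b = Polygon(coords_a), Polygon(coords_b)
    if not a.intersects(b):
        return 0.0
    inter = a.intersection(b).area
    return inter / (a.area + b.area - inter)

# Two unit squares offset by 0.5 overlap with IoU = 1/3
print(polygon_iou([(0, 0), (1, 0), (1, 1), (0, 1)],
                  [(0.5, 0), (1.5, 0), (1.5, 1), (0.5, 1)]))
```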

@cmalinmayor
Collaborator

Thanks for the interesting idea; we haven't yet considered segmentations that aren't masks. Can you give an example dataset with segmentations in a vector format that we could use for dev/testing?

I don't think this is at the top of our list to support right now, but we can see what the other devs think. I'd be happy to revisit this feature after Version 1 release once we have all the basics of the library implemented. I agree that this change does require being more flexible with the type of the segmentation that we store in the TrackingGraph, so I will keep that in mind as we solidify the API.

@lupinthief
Author

Thanks for the response. Absolutely understand that this is probably a bit left-field at the moment, and it may always be. I'm hoping I might be able to do a bit of work on it in the next few weeks.

Here is a sample of the GT dataset I'm working with. It represents iceberg movement around Greenland and is stored as a JSON string. If you read it into a DataFrame, you'll see a 'str_geom' column that defines the geometry, plus columns for ID, parent, t, x, y, and z. I'm trying to use btrack to track the icebergs automatically.

CI2D3_subset_subset_for_btrack.txt
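
In case it helps, reading it in looks roughly like this (a sketch: it assumes pandas can parse the JSON directly and that 'str_geom' holds WKT strings; both may need adjusting for the actual file):

```python
import pandas as pd
from shapely import wkt  # assumes the geometry strings are WKT

# Sketch: load the attached JSON and parse geometries into shapely polygons.
df = pd.read_json("CI2D3_subset_subset_for_btrack.txt")
df["geometry"] = df["str_geom"].apply(wkt.loads)
```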

It looks like the TrackingGraph doesn't care what the segmentation is at the moment, just the loaders and matchers, or am I wrong?

@cmalinmayor
Collaborator

> It looks like the TrackingGraph doesn't care what the segmentation is at the moment, just the loaders and matchers, or am I wrong?

The segmentation is just stored in the TrackingGraph, with a docstring that says it is an np.array. You are correct that it can truly be any type, as long as the loader and the matcher agree on what it is. Nothing would need to change in the TrackingGraph except that we should update the docstring. I more meant that since we are still finalizing the API, it is good to know that we should keep them decoupled.

> Here is a sample of the GT dataset I'm working with. It represents iceberg movement around Greenland and is stored as a JSON string. If you read it into a DataFrame, you'll see a 'str_geom' column that defines the geometry, plus columns for ID, parent, t, x, y, and z. I'm trying to use btrack to track the icebergs automatically.

Thank you!! Glad you found this library to help evaluate the results 🙂. If you can write a loader/matcher that works for your data, even if we don't merge it, you should still be able to run the metrics! We are actively working on documenting the meaning of all the metrics more thoroughly, so hopefully soon it will be even more obvious which ones are most relevant for icebergs vs cells.

@DragaDoncila
Collaborator

Thanks for this issue @lupinthief! I think exploring other domains for traccuracy could be a cool idea!

It sounds like right now the main difference is the data representation, so I tend to agree with @cmalinmayor that a loader is what would unlock your workflow. If it is possible (and not very inefficient or otherwise impractical) to load your polygons into a dense numpy array, then I think even the matching would work fine, since we just need pixel-wise classification. But I realize that may be a non-starter for your data, in which case a matcher would also be required.
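
For what it's worth, the rasterisation itself could be quite small, e.g. (a minimal sketch, assuming 2D polygons with (row, col) vertices that fit inside a known image shape):

```python
import numpy as np
from skimage.draw import polygon as draw_polygon

def rasterize(polygons, labels, shape):
    """Burn labelled polygons into a dense label image (sketch)."""
    mask = np.zeros(shape, dtype=np.int32)
    for coords, label in zip(polygons, labels):
        rows, cols = zip(*coords)
        rr, cc = draw_polygon(rows, cols, shape=shape)
        mask[rr, cc] = label
    return mask
```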

@lupinthief
Author

Just a quick update on this: I've got a somewhat hacky vector-based loader and matcher working. Converting to numpy arrays would be feasible but, in the long run, seems like an inefficient way to handle and transport these data, so I've stuck with vector.

A similar approach may be helpful for issue #134, allowing point-to-polygon comparisons as well as point-to-point comparisons within a given search radius. It could also easily be adapted to compute actual IoUs rather than bounding-box IoUs.
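
For the point-based case in #134, the core check might look like this (hypothetical helper, not part of traccuracy):

```python
from shapely.geometry import Point, Polygon

def point_match(gt_geom, pred_xy, radius):
    """True if a predicted point falls inside a GT polygon, or lies within
    `radius` of a GT point (sketch for vector-format matching)."""
    p = Point(pred_xy)
    if isinstance(gt_geom, Polygon):
        return gt_geom.contains(p)
    return Point(gt_geom).distance(p) <= radius
```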
