Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v.fill.holes: Add new tool to remove inner isles, keep outer boundary #2486

Merged
merged 10 commits into from
Jul 22, 2023

Conversation

wenzeslaus
Copy link
Member

@wenzeslaus wenzeslaus commented Jul 20, 2022

Example

Input

Screenshot from 2022-07-20 13-02-31

Output

Screenshot from 2022-07-20 13-02-57

@wenzeslaus
Copy link
Member Author

I would like for this to go to core rather than the grass-addons repo because it seems like basic functionality, but I have seen some topology issues with complex data (will post on that later).

@wenzeslaus wenzeslaus added enhancement New feature or request vector Related to vector data processing labels Jul 20, 2022
@wenzeslaus
Copy link
Member Author

What I'm missing here? I have an issue with the current implementation. If there are touching areas, it does not create a valid topology: some of the areas are not build. The same happens to be me with real-word large data and small hand-drawn data. Any comments or questions welcome.

Failing example

Input

Screenshot from 2022-07-22 15-48-44

Output

Screenshot from 2022-07-22 15-48-51

Output: boundaries and centroids

Screenshot from 2022-07-22 15-50-29

Output: vector digitizer rendering

Screenshot from 2022-07-22 15-50-56

v.build output

Number of centroids exceeds number of areas: 4 > 2
Number of incorrect boundaries: 2
Number of centroids outside area: 2 

Data

GitHub does not take the GRASS pack format even with other file extensions I tried, so it is a zipped .pack file: test2.zip

@wenzeslaus wenzeslaus added this to the 8.4.0 milestone Jul 25, 2022
@neteler
Copy link
Member

neteler commented Aug 28, 2022

Does @metzm have a suggestion here on the topology issue?

Comment on lines 214 to 207
Vect_set_release_support(&input);
Vect_set_release_support(&output);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed, the operating system does it better and faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the performance on a vector with 186,940 areas (180,015 after the operation). Here are its stats:

nodes=227263
boundaries=352424
centroids=166520
areas=186940
islands=61779
primitives=518944

I used perf stat for time and valgrind for the leaked memory. Here are the results:

memory release perf stat -r 10 seconds time elapsed (3x10 runs) valgrind LEAK SUMMARY still reachable; definitively lost
enabled 24.39, stddev 0.18 (24.181, 24.488, 24.513) 8,297,834 bytes in 145 blocks; 191,032 bytes in 184,529 blocks
disabled 23.97, stddev 0.36 (23.557, 24.212, 24.151) 248,489,855 bytes in 6,580,166 blocks blocks; 6,579 bytes in 76 blocks
difference (enabled minus disabled) 0.42 (1.7%) -240,192,021 (regression) ; 177,950 (improvement)

I don't think the results here support the idea that OS will be "much faster" as the old doc/debugging.txt claims in a note from 2005 (OSGeo/grass-legacy@fb92c0f). The difference of averages is smaller than 2*stddev.

However, given the overall lack of memory handling and a policy to tackle that, I'm fine leaving that out for the small performance gain (for now).

vector/v.fill.holes/main.c Outdated Show resolved Hide resolved
@wenzeslaus wenzeslaus modified the milestones: 8.3.0, 8.4.0 Feb 10, 2023
@wenzeslaus
Copy link
Member Author

The new version works! Thanks @metzm for the pointers!

The updated version contains documentation including images and notebook to generate them. I included some notes on topology because it was not what I expected. It as pytest-based tests.

Some features this could have, but I do not plan to implement:

  • maximum (or minimum) size to remove
  • keep features of other vector types (e.g. keep points)
  • keep boundaries with categories (may represent lines which are at the same time boundaries)
  • copy all attribute tables with layer=-1 (Now it copies only relevant rows of the attribute table for the selected layer. This usually should be the whole table because all areas with centroids are kept, so wanting to simply preserve everything makes sense.)

From my perspective, this is ready to merged. Reviews welcome.

@wenzeslaus wenzeslaus marked this pull request as ready for review July 14, 2023 17:54
@wenzeslaus wenzeslaus changed the title v.fill.holes: Remove isles, keep outer boundary v.fill.holes: Remove inner isles, keep outer boundary Jul 17, 2023
@petrasovaa
Copy link
Contributor

Perhaps change the PR name to make it clear it's a completely new tool.

@wenzeslaus wenzeslaus changed the title v.fill.holes: Remove inner isles, keep outer boundary v.fill.holes: Add new tool to remove inner isles, keep outer boundary Jul 18, 2023
@wenzeslaus wenzeslaus merged commit 3083e78 into OSGeo:main Jul 22, 2023
@wenzeslaus wenzeslaus deleted the v_fill_holes branch July 22, 2023 01:31
landam pushed a commit to landam/grass that referenced this pull request Oct 25, 2023
…OSGeo#2486)

* Write boundaries of areas with centroids.
* Writes each boundary only once using index of already written boundaries.
* Copies attributes.
* Has tests for removal of holes are preservation of attributes.
* Has documentation including an image and a notebook to generate the image.
* Does not free memory at the end as in many existing modules.
neteler pushed a commit to nilason/grass that referenced this pull request Nov 7, 2023
…OSGeo#2486)

* Write boundaries of areas with centroids.
* Writes each boundary only once using index of already written boundaries.
* Copies attributes.
* Has tests for removal of holes are preservation of attributes.
* Has documentation including an image and a notebook to generate the image.
* Does not free memory at the end as in many existing modules.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request vector Related to vector data processing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants