Add elbow_chart to help determine ideal class size #32
Hey, @mthh.
Loved the experience using JenksPy, especially the model-like interface by @yasirroni. My use case involved trying to figure out the ideal class size (similar to this) and I had some code on hand, so I thought it might be a good idea to integrate it into the package. I'm not sure if there is a methodological precedent for choosing the class size from an elbow chart of goodness of variance fit, but the added convenience should be nice for people who need it.
At a high level, the elbow_chart function takes in the data array, a lower bound (defaults to 2), and an upper bound. It then computes the goodness of variance fit (GVF) for each class size between the bounds (inclusive of both) and returns an elbow chart along with the results.
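For reviewers who want to see the idea without opening the diff, here is a rough standard-library sketch of the GVF loop described above. It is not the code in this PR: the names gvf, elbow_results, and equal_interval_breaks are hypothetical, the plotting step is left out, and equal-interval breaks are used only as a stand-in so the snippet runs without JenksPy installed. It assumes breaks come as a list of n_classes + 1 bounds (minimum, inner upper bounds, maximum) with each upper bound inclusive.

```python
from bisect import bisect_left
from statistics import fmean


def gvf(values, breaks):
    """Goodness of variance fit: 1 - SDCM / SDAM.

    `breaks` is assumed to hold n_classes + 1 bounds, min and max included.
    """
    overall_mean = fmean(values)
    # SDAM: squared deviations from the mean of the whole array.
    sdam = sum((v - overall_mean) ** 2 for v in values)
    if sdam == 0:
        return 1.0  # all values identical, any classification fits perfectly

    inner = breaks[1:-1]  # upper bounds of every class except the last
    groups = [[] for _ in range(len(inner) + 1)]
    for v in values:
        # bisect_left puts a value equal to a bound into the lower class.
        groups[bisect_left(inner, v)].append(v)

    # SDCM: squared deviations from each class mean, summed over classes.
    sdcm = 0.0
    for group in groups:
        if group:
            class_mean = fmean(group)
            sdcm += sum((v - class_mean) ** 2 for v in group)
    return 1 - sdcm / sdam


def elbow_results(values, breaks_fn, upper, lower=2):
    """Return [(n_classes, gvf), ...] for lower..upper, both inclusive."""
    if not 2 <= lower <= upper:
        raise ValueError("expected 2 <= lower <= upper")
    return [(k, gvf(values, breaks_fn(values, k))) for k in range(lower, upper + 1)]


def equal_interval_breaks(values, n_classes):
    """Stand-in classifier (NOT natural breaks), used only for illustration."""
    lo, hi = min(values), max(values)
    return [lo + (hi - lo) * i / n_classes for i in range(n_classes + 1)]


if __name__ == "__main__":
    data = [1, 2, 3, 10, 11, 12, 20, 21, 22]
    for k, score in elbow_results(data, equal_interval_breaks, upper=4):
        print(k, round(score, 4))
```

To try this with natural breaks, a wrapper around jenkspy.jenks_breaks could be passed as breaks_fn; the keyword for the number of classes has differed between JenksPy releases, so check the installed version. Plotting the returned pairs (class size on the x-axis, GVF on the y-axis) gives the elbow chart.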
How Function Operates
Changes Made
Things to Discuss
Apologies for the long message, but wanted to ensure clarity. I also have ideas about integrating parallel processing in the for loop for additional performance gains, and also a sampling strategy (as mentioned here), but I think that can wait for another PR.
Example Usage