point_id, cluster_labels and ID_col are all confusing and create problems. #105

Open
ArcticSnow opened this issue Jan 23, 2024 · 0 comments

Comments

@ArcticSnow
Owner

I am running into problems caused by confusion around the definition and usage of the variables df_centroids.point_id and df_centroids.cluster_labels, and around the implementation of non-numeric point_id values.

I think this must be clarified:
df_centroids is the central table that keeps track of the points at which downscaling occurs. Each point has a name, lat, lon and many other attributes.

  • point_id: originally loosely defined as both the index of each point in the table and a surrogate for the point's name. It will be split in two:
    • point_name, a string,
    • point_ind, an integer corresponding to the table index df_centroids.index
  • cluster_labels: output of the k-means algorithm. This is an integer.
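
A minimal sketch of the proposed split, assuming df_centroids is a pandas DataFrame. The sample values and the helper name split_point_id are hypothetical and only illustrate the idea; this is not the project's actual implementation.

```python
import pandas as pd

# Hypothetical centroid table; all values are made up for illustration.
df_centroids = pd.DataFrame({
    "point_id": ["A", "B", "C"],   # legacy column mixing the name and index roles
    "lat": [46.0, 46.5, 47.0],
    "lon": [7.0, 7.5, 8.0],
    "cluster_labels": [2, 0, 1],   # integer labels, e.g. from k-means
})

def split_point_id(df):
    """Replace the ambiguous point_id with point_name (str) and point_ind (int).

    split_point_id is a hypothetical helper, not an existing function.
    """
    # Reset to a clean 0..n-1 index so point_ind matches the table index.
    out = df.reset_index(drop=True).copy()
    out["point_name"] = out["point_id"].astype(str)
    out["point_ind"] = out.index.to_numpy()
    out["cluster_labels"] = out["cluster_labels"].astype(int)
    return out.drop(columns="point_id")

df_new = split_point_id(df_centroids)
```

With this layout, point_name can hold any non-numeric label, while point_ind and cluster_labels stay integers and can be used safely for positional lookups.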