checks geodataframe for crs before conversion #1459

Merged
merged 7 commits into from
Dec 2, 2024

Contributor

@Azaya89 Azaya89 commented Nov 25, 2024

@Azaya89 Azaya89 requested review from ahuang11 and maximlt November 25, 2024 15:30
@Azaya89 Azaya89 self-assigned this Nov 25, 2024

codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 87.50000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 88.96%. Comparing base (9078378) to head (599aa48).
Report is 4 commits behind head on main.

Files with missing lines            Patch %   Lines
hvplot/tests/testgeowithoutgv.py    88.23%    2 Missing ⚠️
hvplot/converter.py                 85.71%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1459      +/-   ##
==========================================
+ Coverage   88.94%   88.96%   +0.01%     
==========================================
  Files          52       52              
  Lines        7781     7800      +19     
==========================================
+ Hits         6921     6939      +18     
- Misses        860      861       +1     

@Azaya89 Azaya89 requested review from hoxbro and ahuang11 November 25, 2024 17:03
Member

@maximlt maximlt left a comment

So we don't automatically project with tiles=True when the data is:

  • a lazy dataset
  • a spatialpandas object

I think we need to update the docs to mention the second item; the first one is already documented.

Side note but I think I just realized that GeoDataFrame means:

  • Geometry DataFrame for spatialpandas
  • GeoSpatial DataFrame for geopandas

Could that be true? :)

@Azaya89
Contributor Author

Azaya89 commented Nov 26, 2024

So we don't automatically project with tiles=True when the data is:

  • a spatialpandas object

We do project when tiles=True if data is a spatialpandas object. See test in L101:

        bk_plot = bk_renderer.get_plot(plot)
        assert bk_plot.projection == 'mercator'  # projection enabled due to `tiles=True`

Side note but I think I just realized that GeoDataFrame means:

  • Geometry DataFrame for spatialpandas
  • GeoSpatial DataFrame for geopandas

Could that be true? :)

I think so. See https://github.com/holoviz/hvplot/blob/main/hvplot/util.py#L496

@Azaya89 Azaya89 requested a review from maximlt November 26, 2024 11:15
@maximlt
Member

maximlt commented Nov 26, 2024

We do project when tiles=True if data is a spatialpandas object. See test in L101:

@Azaya89, no :) The data is presented on a map with Mercator coordinates but hvPlot doesn't convert the coordinates to Mercator, so they all go on vacation to the null island! I guess that, when the data is a spatialpandas object, we should apply the same logic as for a standard Pandas DataFrame object. This is where is_geodataframe is a bit misleading, as far as I can see spatialpandas objects don't contain any geospatial metadata (e.g. CRS) contrary to geopandas objects.

Full Code

import geopandas as gpd
import spatialpandas as spd
import hvplot.pandas

# hack to get spatialpandas to work (already fixed, not in a final release yet)
import numpy as np
np.VisibleDeprecationWarning = np.exceptions.VisibleDeprecationWarning

data = {
    'City': ['London', 'Paris', 'Berlin', 'Madrid', 'Rome', 'Vienna', 'Warsaw', 'Amsterdam'],
    'Country': ['United Kingdom', 'France', 'Germany', 'Spain', 'Italy', 'Austria', 'Poland', 'Netherlands'],
    'Latitude': [51.5074, 48.8566, 52.5200, 40.4168, 41.9028, 48.2082, 52.2297, 52.3676],
    'Longitude': [-0.1278, 2.3522, 13.4050, -3.7038, 12.4964, 16.3738, 21.0122, 4.9041]
}
gdf = gpd.GeoDataFrame(
    data,
    geometry=gpd.points_from_xy(data['Longitude'], data['Latitude']),
    crs="EPSG:4326",
)

gdf.hvplot.points(tiles=True, color='red')

sdf = spd.GeoDataFrame(gdf)
sdf.hvplot.points(tiles=True, color='red')
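
To illustrate why unprojected coordinates end up next to Null Island on a tiled map, here is a minimal sketch of the standard spherical Web Mercator (EPSG:3857) forward formula in plain Python. The helper name `lonlat_to_web_mercator` is made up for this example; it is not part of hvPlot, geopandas, or spatialpandas, and it is not how hvPlot performs the conversion.

```python
import math

# Sphere radius used by Web Mercator (the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Hypothetical helper: convert lon/lat in degrees to Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# London, taken from the example data above.
x, y = lonlat_to_web_mercator(-0.1278, 51.5074)
print(f"projected: x={x:.0f} m, y={y:.0f} m")

# Without the conversion, the raw degrees are read as meters, so London is
# drawn at (-0.1278 m, 51.5074 m), a few tens of meters from (0, 0).
```

The projected values are thousands to millions of meters away from the origin, whereas the raw degrees for all eight cities, read as meters, fall within about 60 meters of it. That is consistent with the points collapsing onto a single spot in the spatialpandas plot.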

@Azaya89
Contributor Author

Azaya89 commented Nov 26, 2024

This is where is_geodataframe is a bit misleading, as far as I can see spatialpandas objects don't contain any geospatial metadata (e.g. CRS) contrary to geopandas objects.

Is having a _geometry column not considered geospatial metadata? https://github.com/holoviz/spatialpandas/blob/main/spatialpandas/geodataframe.py#L17

@maximlt
Member

maximlt commented Nov 26, 2024

This is where is_geodataframe is a bit misleading, as far as I can see spatialpandas objects don't contain any geospatial metadata (e.g. CRS) contrary to geopandas objects.

Is having a _geometry column not considered geospatial metadata? https://github.com/holoviz/spatialpandas/blob/main/spatialpandas/geodataframe.py#L17

That certainly means it has something geo-related but not necessarily geographic, which is what we need here to perform the conversion.

Member

@jbednar jbednar left a comment

Looks right to me!

Member

@maximlt maximlt left a comment

is_geodataframe returns True when the data is a spatialpandas.dask.DaskGeoDataFrame, so the code enters a branch where it might do some operation, while we want to skip auto-projecting when we deal with lazy objects (elif not is_lazy_data(...)). It has no bad consequence at the moment since spatialpandas objects have no crs attribute. However, I don't find that very clean and future-proof. Let's chat tomorrow @Azaya89 how to improve this since it's also related to #1396 (this PR doesn't link to or report any issue so I'm not sure what it was fixing).

        if is_geodataframe(data):
            if getattr(data, 'crs', None) is not None:
                data = data.to_crs(epsg=3857)
            return data, x, y
        elif not is_lazy_data(data):
            ...
        return data, x, y
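
One possible way to make the intent explicit would be to test for laziness first and reproject only when a CRS is actually present. The sketch below is not the merged code and not necessarily what was agreed on afterwards: it uses stand-in classes and a simplified `is_lazy_data` so that it runs without geopandas or spatialpandas, and `maybe_project` is a hypothetical name.

```python
class FakeGeoFrame:
    """Stand-in for a geopandas.GeoDataFrame: exposes `crs` and `to_crs`."""

    def __init__(self, crs=None):
        self.crs = crs

    def to_crs(self, epsg):
        # Return a new object tagged with the target CRS.
        return FakeGeoFrame(crs=f"EPSG:{epsg}")


class FakeSpatialFrame:
    """Stand-in for a spatialpandas object: no `crs` attribute at all."""


class FakeLazyFrame(FakeGeoFrame):
    """Stand-in for a lazy (e.g. Dask-backed) geo object."""

    is_lazy = True


def is_lazy_data(data):
    # Simplified assumption for this sketch; the real hvPlot helper
    # inspects the type of the data instead of a flag.
    return getattr(data, "is_lazy", False)


def maybe_project(data):
    """Hypothetical helper: reproject to EPSG:3857 only when it is safe."""
    if is_lazy_data(data):
        # Never auto-project lazy objects.
        return data
    if getattr(data, "crs", None) is not None:
        return data.to_crs(epsg=3857)
    # No CRS information: leave the data untouched.
    return data
```

With this ordering, a lazy object is returned as-is even if it later gains a `crs` attribute, and objects without a CRS pass through unchanged instead of raising an AttributeError.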

@Azaya89
Contributor Author

Azaya89 commented Nov 28, 2024

Let's chat tomorrow @Azaya89 how to improve this since it's also related to #1396 (this PR doesn't link to or report any issue so I'm not sure what it was fixing).

OK, although I mentioned what the #1396 PR was fixing here holoviz-topics/examples#386 (comment)

@maximlt
Member

maximlt commented Nov 28, 2024

Let's chat tomorrow @Azaya89 how to improve this since it's also related to #1396 (this PR doesn't link to or report any issue so I'm not sure what it was fixing).

OK, although I mentioned what the #1396 PR was fixing here holoviz-topics/examples#386 (comment)

Ok nice it's mentioned in another place! But ideally there should be an issue in hvPlot, or at the very least a description of the issue in the PR that fixes it. That makes the job of the release manager easier for instance.

@Azaya89
Contributor Author

Azaya89 commented Nov 28, 2024

Ok nice it's mentioned in another place! But ideally there should be an issue in hvPlot, or at the very least a description of the issue in the PR that fixes it. That makes the job of the release manager easier for instance.

Alright. Maybe I can add a description now in the PR...

@Azaya89 Azaya89 requested a review from maximlt November 29, 2024 14:50
@maximlt maximlt merged commit 19206c5 into holoviz:main Dec 2, 2024
9 checks passed
@Azaya89 Azaya89 deleted the azaya/1457 branch December 2, 2024 15:31
Development

Successfully merging this pull request may close these issues.

"AttributeError: 'DataFrame' object has no attribute 'crs'" for spatialpandas polygons
5 participants