Wrong output cardinality #80

Open
CourchesneA opened this issue May 25, 2023 · 3 comments

Comments

@CourchesneA

Hi,

Using a simple query for historical weather data for the past year, I get a lot of duplicated values.

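import datetime
from env_canada import ECHistoricalRange  # import path assumed from the env_canada package

# Daily data for station 10761 over calendar year 2022
ec = ECHistoricalRange(station_id=10761, timeframe="daily", daterange=(datetime.datetime(2022, 1, 1), datetime.datetime(2022, 12, 31)))
weather_data = ec.get_data()
print(weather_data.shape)
# Output: (4380, 30)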

I was expecting 365 rows, but I get 4380, which are exactly the same records repeated 12 times. I assume there is a loop iterating over months somewhere, but I'm not sure where.
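
As a stopgap on the caller's side, the repeated rows can be counted and dropped with pandas. This is only a sketch on synthetic data: the `date` and `mean_temp` columns are hypothetical stand-ins, not the library's real column names, and it assumes the 12 copies are identical row for row.

```python
import pandas as pd

# Synthetic stand-in for the returned frame: one year of daily rows, repeated 12 times.
# Column names are hypothetical, chosen only for illustration.
days = pd.date_range("2022-01-01", "2022-12-31", freq="D")
one_year = pd.DataFrame({"date": days, "mean_temp": range(len(days))})
weather_data = pd.concat([one_year] * 12, ignore_index=True)

# Count rows that are exact copies of an earlier row.
n_dupes = int(weather_data.duplicated().sum())

# Keep the first occurrence of each row and renumber the index.
deduped = weather_data.drop_duplicates().reset_index(drop=True)

print(weather_data.shape, n_dupes, deduped.shape)
```

If the dates live in the index rather than in a column, `drop_duplicates()` would ignore them and could merge distinct days that happen to share values; in that case `weather_data[~weather_data.index.duplicated()]` is likely the safer filter.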

@michaeldavie
Owner

For some reason it seems like the API is returning a full year when only a month is requested. @darrenwiens are you able to take a look at this?

@fitzb
Contributor

fitzb commented Jan 7, 2025

This is caused by pulling the yearly data down for every month when it isn't required. You are getting exactly 12*365 rows. If you set the timeframe to 2 (daily), the month parameter is ignored, so you get the same year of data 12 times. It should be simple enough to wrap the retrieval in some conditionals and then filter the data frame to drop the values outside the desired range. For daily summaries you want to iterate over the years of the span and filter to keep the rows between the start and end dates. You will also need to deduplicate the rows. I will take a swing at this and add some tests to check it.

curl "https://climate.weather.gc.ca/climate_data/bulk_data_e.html?stationID=10761&Year=2022&Month=1&format=csv&timeframe=2&submit=Download+Data" | wc -l
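
The approach described above (one request per year, then filter and deduplicate) might look roughly like the sketch below. It is not the code from the actual fix: `fetch_year` and `daily_range` are hypothetical names, and `fetch_year` fakes the download by generating a full year of rows, mimicking how the daily endpoint ignores the month.

```python
import datetime
import pandas as pd


def fetch_year(station_id: int, year: int) -> pd.DataFrame:
    """Hypothetical stand-in for one daily bulk download; always returns the whole year."""
    days = pd.date_range(f"{year}-01-01", f"{year}-12-31", freq="D")
    return pd.DataFrame({"date": days, "station_id": station_id})


def daily_range(station_id: int, start: datetime.datetime, end: datetime.datetime) -> pd.DataFrame:
    # One request per calendar year in the span, instead of one per month.
    frames = [fetch_year(station_id, year) for year in range(start.year, end.year + 1)]
    df = pd.concat(frames, ignore_index=True)

    # Keep only the rows between the requested start and end dates (inclusive).
    mask = (df["date"] >= pd.Timestamp(start)) & (df["date"] <= pd.Timestamp(end))

    # Drop any repeated days as a safety net, then renumber the index.
    return df.loc[mask].drop_duplicates(subset="date").reset_index(drop=True)


full_year = daily_range(10761, datetime.datetime(2022, 1, 1), datetime.datetime(2022, 12, 31))
across_years = daily_range(10761, datetime.datetime(2021, 12, 15), datetime.datetime(2022, 1, 15))
print(full_year.shape, across_years.shape)
```

Looping over years rather than months keeps the request count proportional to the span, and the date filter handles ranges that start or end mid-year.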

@fitzb
Contributor

fitzb commented Jan 9, 2025

This is fixed by #97. v0.8.0 released.
