Error when downloading funding data - "cannot convert input with unit 's'" #11

Open
ewald-florian opened this issue Mar 14, 2024 · 4 comments
@ewald-florian

  • Lake API version: 0.12.0
  • Python version: 3.11.7
  • Operating System: Ubuntu version: 20.04.3 LTS

Description

Downloading PERP funding data fails because the API tries to convert the "next_funding_time" column with pd.to_datetime, and the conversion breaks because the data is not in Unix format.

Reproduce Error

from datetime import datetime

import lakeapi

table = "funding"
exchange = "BINANCE_FUTURES"
trading_pair = "BTC-USDT-PERP"

start_date = datetime(2023, 1, 1, 0, 0)
end_date = datetime(2023, 12, 31, 0, 0)

df = lakeapi.load_data(
    table=table,
    start=start_date,
    end=end_date,
    symbols=[trading_pair],
    exchanges=[exchange],
    drop_partition_cols=True,
)

Error Message

cannot convert input with unit 's'

Cause of Trouble

lake-api/main.py line 216

if "next_funding_time" in df.columns:
    df["next_funding_time"] = pd.to_datetime(df["next_funding_time"], unit="s", cache=True)

Problem

The content of the "next_funding_time" column is presumably not a Unix timestamp but the number of nanoseconds until the next funding time, i.e. a time difference rather than a timestamp. I have not read the Binance API documentation; this is just the first explanation that came to mind.
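
A quick magnitude check can help test a hypothesis like this. The sketch below is not taken from the lake-api code base, and the sample values are hypothetical rather than real Lake data. It only illustrates that a value on the nanosecond scale is far too large to be read as a count of seconds in a nanosecond-resolution int64 datetime, which would be consistent with the error above (the exact exception pandas raises may depend on the pandas version).

```python
import numpy as np

# Hypothetical raw values, chosen for illustration only.
SECONDS_VALUE = 1_672_531_200               # 2023-01-01 00:00:00 UTC as Unix seconds
NANOSECONDS_VALUE = SECONDS_VALUE * 10**9   # the same instant expressed in nanoseconds

# Largest number of seconds that still fits into an int64 nanosecond counter.
MAX_SECONDS_AS_NS = np.iinfo(np.int64).max // 10**9


def fits_as_seconds(value: int) -> bool:
    """Return True if `value`, read as seconds, fits a nanosecond-resolution datetime."""
    return abs(value) <= MAX_SECONDS_AS_NS


print(fits_as_seconds(SECONDS_VALUE))      # a plausible seconds timestamp
print(fits_as_seconds(NANOSECONDS_VALUE))  # a nanosecond-scale value read as seconds
```

Running the same check on a few raw values of the column before any conversion would show which scale the data is actually on.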

Potential Solution

Just leave "next_funding_time" in its plain format or optionally rename it to something like "ns_to_next_funding_time".

if "next_funding_time" in df.columns:
    df.rename(columns={"next_funding_time": "ns_to_next_funding_time"}, inplace=True)

Alternatively, "next_funding_time" could be added to origin_time to obtain a timestamp column. However, at least in the limited samples I checked, next_funding_time does not match the time difference to the actual next funding data point, so I don't think this would add useful information.

@ewald-florian
Author

I have now experimented a bit more with the data and found that "next_funding_time" is in fact a Unix timestamp, and I can convert it after loading using the original syntax:
pd.to_datetime(df["next_funding_time"], unit="s", cache=True)
It only breaks during the download process. I have closed my pull request, since the change solved the problem for me but is obviously not an adequate general fix.

@leftys

leftys commented Mar 15, 2024

That's weird. I just tried with recent Binance futures funding rates and they work fine, including the to_datetime conversion. Maybe some older data causes the conversion to break; I will investigate further.

@leftys leftys self-assigned this Mar 15, 2024
@leftys

leftys commented Mar 15, 2024

It seems older Binance futures data store the funding timestamps in nanoseconds, so unit has to be set to 'ns'. I introduced this bug a few versions back when I started automatically converting the timestamp to a pandas datetime.
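
For anyone who has to handle such mixed files on their own side, one possible approach is to guess the unit from the magnitude of the values. This is only a sketch and not the fix shipped in lakeapi 0.13.0: the helper names `guess_epoch_unit` and `to_datetime_auto_unit` are hypothetical, the thresholds assume timestamps reasonably close to the present day, and each column is assumed to use a single unit throughout.

```python
import pandas as pd


def guess_epoch_unit(values: pd.Series) -> str:
    """Guess the epoch unit of numeric timestamps from their magnitude."""
    magnitude = values.dropna().abs().max()
    # An empty or all-NaN column has no magnitude; fall back to seconds.
    if pd.isna(magnitude) or magnitude < 1e11:
        return "s"
    if magnitude < 1e14:
        return "ms"
    if magnitude < 1e17:
        return "us"
    return "ns"


def to_datetime_auto_unit(values: pd.Series) -> pd.Series:
    """Convert epoch values to datetimes using the guessed unit."""
    return pd.to_datetime(values, unit=guess_epoch_unit(values))


# Illustrative inputs: the same instant in nanoseconds and in seconds.
old_style = pd.Series([1_672_531_200_000_000_000])
new_style = pd.Series([1_672_531_200])

print(to_datetime_auto_unit(old_style))
print(to_datetime_auto_unit(new_style))
```

A column that mixes units across rows would need a per-row normalization instead, which this sketch does not attempt.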

I released a fix in lakeapi 0.13.0, check it out!

@ewald-florian
Author

Thanks for fixing this so quickly! I just tested the exact same request with version 0.13.0 and it works without errors now.
