Auto update parquets #20

@nup002 nup002 commented Mar 5, 2021

This pull request adds new functionality:

  • Optional direct update of parquet files
    Parquet files can be updated directly, without intermediate CSV files. This is much more convenient for users who have downloaded the parquet files from Kaggle and wish to keep them up to date.

  • Improved output
    A progress bar is displayed while candles are fetched, showing how many candles will be retrieved and how long the retrieval is expected to take.

  • Additional user settings
    The user can set global variables at the top of Main.py to control how the script behaves.
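As an illustration of the settings bullet above, here is a minimal sketch of what top-level settings in a script like Main.py could look like, together with a helper that prints them at startup. All setting names and default values below are hypothetical; the PR text does not list the actual ones.

```python
from pathlib import Path

# Hypothetical setting names and values, for illustration only.
UPDATE_PARQUETS = True                  # update parquet files directly, without CSVs
SKIP_INACTIVE_PAIRS = True              # skip pairs that are no longer traded
COMPRESSED_FOLDER = Path("compressed")  # folder holding the parquet files
DATA_FOLDER = Path("data")              # folder holding the CSV files


def describe_settings(settings):
    """Return one aligned 'NAME = value' line per setting."""
    width = max(len(name) for name in settings)
    return "\n".join(
        f"{name:<{width}} = {value}" for name, value in settings.items()
    )


if __name__ == "__main__":
    # Print the active options once, before any work starts.
    print(describe_settings({
        "UPDATE_PARQUETS": UPDATE_PARQUETS,
        "SKIP_INACTIVE_PAIRS": SKIP_INACTIVE_PAIRS,
        "COMPRESSED_FOLDER": COMPRESSED_FOLDER,
        "DATA_FOLDER": DATA_FOLDER,
    }))
```

Keeping the options as module-level constants means a user only has to edit the first few lines of the script, and printing them at startup makes it easy to confirm which mode a run used.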

The code has also been refactored in order to more easily implement the changes listed above. The functionality of the code has not been changed, and the script can be used just as before this pull request.
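The direct parquet update described in the first bullet presumably comes down to two steps: find where the stored data ends, then merge the newly fetched candles into it. The sketch below shows only that logic on plain in-memory rows. It is not the code from this PR; the row layout (timestamp in milliseconds first) and both function names are assumptions, and reading or writing the parquet file itself would be left to a parquet library.

```python
# Assumed row layout: (timestamp_ms, open, high, low, close, volume).
ONE_MINUTE_MS = 60_000


def next_start_ms(existing, interval_ms=ONE_MINUTE_MS):
    """Timestamp to resume fetching from, or None if nothing is stored yet."""
    if not existing:
        return None
    return max(row[0] for row in existing) + interval_ms


def merge_candles(existing, fetched):
    """Combine stored and fetched candles: one row per timestamp, sorted."""
    by_timestamp = {row[0]: row for row in existing}
    # Newly fetched rows replace stored rows with the same timestamp.
    by_timestamp.update({row[0]: row for row in fetched})
    return [by_timestamp[ts] for ts in sorted(by_timestamp)]


if __name__ == "__main__":
    stored = [(0, 1.0, 1.0, 1.0, 1.0, 5.0), (60_000, 1.0, 2.0, 1.0, 2.0, 3.0)]
    fresh = [(120_000, 2.0, 2.5, 2.0, 2.5, 1.0)]
    print(next_start_ms(stored))
    print(len(merge_candles(stored, fresh)))
```

Deduplicating on the timestamp keeps the update safe to re-run: fetching an overlapping range a second time does not create duplicate rows.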

yo2x and others added 10 commits January 13, 2021 17:02
…tion_errors

[ADD] Added support for catching other types of connection errors
…tion_errors

[FIX] added a timeout of 30 seconds by default
Added progressbar to data retrieval.
Removed unnecessary prints.
Refactored code further.
Implemented direct-to-parquet candles update.
Compressed (parquet) and data (csv) folders specified with top level settings.
Removed code related to development.
…eded the local time and would cause a progressbar exception.
@nup002
Author

nup002 commented Mar 5, 2021

I have noticed a few minor issues that I will attempt to correct. Please wait with the merge.

@gosuto-inzasheru
Member

Wooooow this has been on my list for so long (#13), really appreciate this! I wouldn't even mind getting rid of the csv files completely—they just take up space. I will wait for your final PR and dig through the code. Impression so far is awesome!

New setting: Optional skipping of inactive pairs.
User options are printed when program runs.
@nup002
Author

nup002 commented Mar 5, 2021

Happy you like it. A few more issues to iron out.

@nup002
Author

nup002 commented Mar 6, 2021

Okay, it's ready on my end. There were no additional issues that had to be fixed.

@gosuto-inzasheru gosuto-inzasheru changed the base branch from master to dev-nup002-auto-parquets March 15, 2021 16:53
@gosuto-inzasheru gosuto-inzasheru merged commit 56d141d into onchainification:dev-nup002-auto-parquets Mar 15, 2021
if bar is not None:
    time_covered = datetime.fromtimestamp(last_timestamp / 1000) - start_datetime
    minutes_covered = int(time_covered.total_seconds()/60)
    bar.max_value = max(int((datetime.now() - start_datetime).total_seconds()/60), minutes_covered)
Member

@nup002 can I ask what you are doing here? It seems to introduce a bug where the max of the bar can become higher than what is communicated in line 154 (total_minutes_of_data).

Wouldn't it be better for consistency to stick to max(bar.max_value, minutes_covered)?

Author

Yes, max(bar.max_value, minutes_covered) is a much more elegant solution.
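For reference, the suggested change could look roughly like the sketch below. The Bar class is a hypothetical stand-in that only carries max_value, and the function name is made up for this example; the variable names otherwise follow the reviewed snippet.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Bar:
    """Hypothetical stand-in for the progress bar; only max_value is used."""
    max_value: int


def update_bar_max(bar, last_timestamp, start_datetime):
    """Raise bar.max_value only if the fetched data runs past the current maximum."""
    if bar is None:
        return
    # last_timestamp is in milliseconds, as in the reviewed snippet.
    time_covered = datetime.fromtimestamp(last_timestamp / 1000) - start_datetime
    minutes_covered = int(time_covered.total_seconds() / 60)
    # Start from the maximum that was already communicated to the user,
    # instead of recomputing it from datetime.now().
    bar.max_value = max(bar.max_value, minutes_covered)
```

With this version the maximum stays at the total that was announced to the user unless the data actually extends beyond it, so it no longer creeps upward with the wall clock while the script runs.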

3 participants