-
-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn on imposing max rows and fix for curve #1408
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1408 +/- ##
==========================================
+ Coverage 87.39% 88.81% +1.41%
==========================================
Files 50 51 +1
Lines 7490 7618 +128
==========================================
+ Hits 6546 6766 +220
+ Misses 944 852 -92 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I feel like
MAX_ROWS
could be increased, 10000 is kind of low specially considering that now HoloViews uses the webGL backend by default for Bokeh. How about 100000? - Just for discussion for now, do you think we should expose this kind of setting in the explorer? Maybe in an advanced tab?
hvplot/ui.py
Outdated
@@ -688,10 +688,22 @@ def _plot(self): | |||
if len(df) > MAX_ROWS and not ( | |||
self.kind in KINDS['stats'] or kwargs.get('rasterize') or kwargs.get('datashade') | |||
): | |||
df = df.sample(n=MAX_ROWS) | |||
if self.kind == 'line': | |||
param.main.param.warning( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about displaying these warnings in the alert box in addition/in place of as a programmatic warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to do df = df.sample(n=MAX_ROWS).sort_index()
. Otherwise the line plot will not work as it should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree with sort_index; the x may not always be index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. is there an option to add sort_index as an option?
something like df.hvplot.explorer(x='x', y='y', kind='line', sort_index=True)
where the default is sort_index=False
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it probably makes more sense to do sort_values(selected_x)
when sampling for line, but I still think head
is better, and perhaps a slider for sample size
Closes #1406
Adds a warning and checks if kind == curve to prevent random sampling; only sample head.