-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: Revise and rework user-guide/expressions #19360
base: main
Are you sure you want to change the base?
docs: Revise and rework user-guide/expressions #19360
Conversation
2dd881c
to
4041c8b
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #19360 +/- ##
==========================================
- Coverage 80.24% 80.21% -0.04%
==========================================
Files 1523 1523
Lines 209545 209853 +308
Branches 2434 2434
==========================================
+ Hits 168148 168324 +176
- Misses 40842 40974 +132
Partials 555 555 ☔ View full report in Codecov by Sentry. |
@stinodego could you review this one this week? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the points below for some general feedback. Take these as suggestions, not law :)
- Overview page: There are quite a lot of subchapters - perhaps we can group them in a sensible way at least in the overview (e.g. put all the namespaces (struct, string, ...) together, put all operation types together (window, folds, ...), you get the picture).
- Basic operations: n_unique feels a bit out of place here - it would fit better in the Aggregation section.
- Expression expansion: The naming sections feels out of place - it's good to mention suffix/prefix but otherwise it's not related to expression expansion.
- Expression expansion: We shouldn't list the full selector API in the user guide - we have the API reference for this.
- Feels like plugins should be its own chapter - not part of Expressions.
- NumPy functions: feels like this shouldn't be under expressions, rather in an 'interop' chapter somewhere.
- A lot of Rust examples seem to be missing still.
- Nice that we can better display math now!
!!! note | ||
The function `all` is syntactic sugar for `pl.col("*")`, but since the argument `"*"` is a special case and `pl.all` reads more like English, the usage of `pl.all` is preferred. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This note doesn't render correctly. I think it needs an empty line after !!! note
and indentation.
There are a bunch of other notes / warnings that have the same problem. I won't note all of them - just do a quick search and replace and you'll find them.
# --8<-- [end:col-exclude] | ||
|
||
# --8<-- [start:duplicate-error] | ||
import polars.exceptions as plexc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the syntax from polars.exceptions import DuplicateError
is nicer, and I would like our user guide to use this 'recommended' way of importing exceptions.
I realize this is subjective and honestly I don't have any very strong arguments - just thought I'd note it.
|
||
| Operator | Operation | | ||
| -------- | -------------------- | | ||
| `A | B` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't render correctly.
--8<-- "python/user-guide/expressions/lists.py:array-example" | ||
``` | ||
|
||
The example above shows how to use the parameter `schema_overrides` to specify that the columns “bit_flags” and “tic_tac_toe” have the data type `Array`, parametrised by the data type of the elements contained within and by the shape of each array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a good example for schema_overrides
- it's better to just use schema
here.
|
||
{{code_block('user-guide/concepts/data-types/categoricals','example',[])}} | ||
|
||
From the code block above you can see that the data type `Enum` requires the categories to be provided upfront, while the data type `Categorical` infers the categories at runtime. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code block does not render.
--8<-- "docs/source/user-guide/expressions/speed_rank_by_type.svg" | ||
</div> | ||
|
||
Note how the row for the Pokémon “Goldbat” has a “Speed” value of `90`, which is greater than the value `80` of the Pokémon “Venusaur”, and yet the latter was ranked 1 because “Goldbat” and “Venusar” do not share the same value for the column “Type 1”. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Pokemon is called Golbat, not Goldbat. Very important!
PLEASE do not review this PR by looking at the diff.
Checkout the PR and build the docs with
mkdocs serve
.TODO: