Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Revise and rework user-guide/expressions #19360

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rodrigogiraoserrao
Copy link
Collaborator

@rodrigogiraoserrao rodrigogiraoserrao commented Oct 21, 2024

PLEASE do not review this PR by looking at the diff.
Checkout the PR and build the docs with mkdocs serve.

TODO:

  • revise page on expression plugins
  • revise page on user-defined Python functions

@rodrigogiraoserrao rodrigogiraoserrao changed the title docs: revise and rework user-guide/expressions docs: Revise and rework user-guide/expressions Oct 21, 2024
@github-actions github-actions bot added documentation Improvements or additions to documentation python Related to Python Polars rust Related to Rust Polars and removed title needs formatting labels Oct 21, 2024
Copy link

codecov bot commented Oct 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.21%. Comparing base (8a76dad) to head (4041c8b).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #19360      +/-   ##
==========================================
- Coverage   80.24%   80.21%   -0.04%     
==========================================
  Files        1523     1523              
  Lines      209545   209853     +308     
  Branches     2434     2434              
==========================================
+ Hits       168148   168324     +176     
- Misses      40842    40974     +132     
  Partials      555      555              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ritchie46
Copy link
Member

@stinodego could you review this one this week?

Copy link
Member

@stinodego stinodego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the points below for some general feedback. Take these as suggestions, not law :)

  • Overview page: There are quite a lot of subchapters - perhaps we can group them in a sensible way at least in the overview (e.g. put all the namespaces (struct, string, ...) together, put all operation types together (window, folds, ...), you get the picture).
  • Basic operations: n_unique feels a bit out of place here - it would fit better in the Aggregation section.
  • Expression expansion: The naming sections feels out of place - it's good to mention suffix/prefix but otherwise it's not related to expression expansion.
  • Expression expansion: We shouldn't list the full selector API in the user guide - we have the API reference for this.
  • Feels like plugins should be its own chapter - not part of Expressions.
  • NumPy functions: feels like this shouldn't be under expressions, rather in an 'interop' chapter somewhere.
  • A lot of Rust examples seem to be missing still.
  • Nice that we can better display math now!

Comment on lines +101 to +102
!!! note
The function `all` is syntactic sugar for `pl.col("*")`, but since the argument `"*"` is a special case and `pl.all` reads more like English, the usage of `pl.all` is preferred.
Copy link
Member

@stinodego stinodego Oct 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note doesn't render correctly. I think it needs an empty line after !!! note and indentation.

There are a bunch of other notes / warnings that have the same problem. I won't note all of them - just do a quick search and replace and you'll find them.

# --8<-- [end:col-exclude]

# --8<-- [start:duplicate-error]
import polars.exceptions as plexc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the syntax from polars.exceptions import DuplicateError is nicer, and I would like our user guide to use this 'recommended' way of importing exceptions.

I realize this is subjective and honestly I don't have any very strong arguments - just thought I'd note it.


| Operator | Operation |
| -------- | -------------------- |
| `A | B` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't render correctly.

--8<-- "python/user-guide/expressions/lists.py:array-example"
```

The example above shows how to use the parameter `schema_overrides` to specify that the columns “bit_flags” and “tic_tac_toe” have the data type `Array`, parametrised by the data type of the elements contained within and by the shape of each array.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good example for schema_overrides - it's better to just use schema here.


{{code_block('user-guide/concepts/data-types/categoricals','example',[])}}

From the code block above you can see that the data type `Enum` requires the categories to be provided upfront, while the data type `Categorical` infers the categories at runtime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code block does not render.

--8<-- "docs/source/user-guide/expressions/speed_rank_by_type.svg"
</div>

Note how the row for the Pokémon “Goldbat” has a “Speed” value of `90`, which is greater than the value `80` of the Pokémon “Venusaur”, and yet the latter was ranked 1 because “Goldbat” and “Venusar” do not share the same value for the column “Type 1”.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Pokemon is called Golbat, not Goldbat. Very important!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants