Commit

Merge branch 'main' into struct-named-elements
etiennebacher authored May 7, 2024
2 parents a1faf84 + d6b0b62 commit 8ca2570
Showing 81 changed files with 3,290 additions and 723 deletions.
47 changes: 22 additions & 25 deletions .github/workflows/docs.yaml
@@ -49,7 +49,7 @@ env:
permissions: read-all

jobs:
documentation:
build:
runs-on: ubuntu-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
@@ -99,34 +99,31 @@ jobs:
run: |
task setup-python-tools
- name: Setup Pages
uses: actions/configure-pages@v5

- name: Build docs
run: task build-website

- name: upload docs
if: ${{ github.event_name == 'pull_request' }}
uses: actions/upload-artifact@v4
if: always()
uses: actions/upload-pages-artifact@v3
with:
name: docs
path: docs

- uses: webfactory/ssh-agent@<version>
env:
DEPLOY_DOCS: ${{ secrets.DEPLOY_DOCS }}
if: ${{ (github.event_name != 'pull_request') && (github.repository_owner == 'pola-rs') }}
with:
ssh-private-key: ${{ secrets.DEPLOY_DOCS }}

# https://www.mkdocs.org/user-guide/deploying-your-docs/
- name: Build site and deploy to GitHub pages
env:
DEPLOY_DOCS: ${{ secrets.DEPLOY_DOCS }}
if: ${{ (github.event_name != 'pull_request') && (github.repository_owner == 'pola-rs') }}
uses: JamesIves/github-pages-deploy-action@v4
with:
clean: true
branch: main
folder: docs
repository-name: rpolars/rpolars.github.io
ssh-key: true
clean-exclude: |
.nojekyll
deploy:
permissions:
contents: read
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: ubuntu-latest
needs: build
if: ${{ (github.event_name != 'pull_request') }}
steps:
- name: Deploy to GitHub Pages
id: deployment
if: ${{ (github.event_name != 'pull_request') }}
uses: actions/deploy-pages@v4
2 changes: 1 addition & 1 deletion .github/workflows/mega-linter.yaml
@@ -35,7 +35,7 @@ jobs:
id: ml
# You can override MegaLinter flavor used to have faster performances
# More info at https://megalinter.io/flavors/
uses: oxsecurity/megalinter/flavors/cupcake@v7.10.0
uses: oxsecurity/megalinter/flavors/cupcake@v7.11.1
env:
# All available variables are described in documentation
# https://megalinter.io/configuration/
2 changes: 2 additions & 0 deletions .github/workflows/release-lib.yaml
@@ -131,6 +131,8 @@ jobs:
r: devel
- os: macos-14
r: oldrel-1
- os: windows-latest
r: devel

permissions:
contents: read
4 changes: 2 additions & 2 deletions .lycheeignore
@@ -1,5 +1,5 @@
https://megalinter.io/configuration/
https://r-lib.github.io/p/pak/stable/%s/%s/%s
https://megalinter.io/flavors/
https://rpolars.github.io/vignettes
https://rpolars.github.io/man
https://pola-rs.github.io/r-polars/vignettes
https://pola-rs.github.io/r-polars/man
10 changes: 5 additions & 5 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: polars
Title: Lightning-Fast 'DataFrame' Library
Version: 0.16.1.9000
Version: 0.16.3.9000
Depends: R (>= 4.2)
Imports: utils, codetools, methods
Authors@R:
@@ -18,11 +18,11 @@ Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
SystemRequirements: Cargo (rustc package manager), cmake
URL: https://rpolars.github.io/,
URL: https://pola-rs.github.io/r-polars/,
https://github.com/pola-rs/r-polars,
https://rpolars.r-universe.dev/polars
Suggests:
arrow,
arrow (>= 15.0.1),
bench,
bit64,
callr,
@@ -32,7 +32,7 @@ Suggests:
jsonlite,
knitr,
lubridate,
nanoarrow,
nanoarrow (>= 0.4.0),
nycflights13,
patrick,
pillar,
@@ -118,5 +118,5 @@ Collate:
'zzz.R'
Config/rextendr/version: 0.3.1
VignetteBuilder: knitr
Config/polars/LibVersion: 0.39.2
Config/polars/LibVersion: 0.39.3
Config/polars/RustToolchainVersion: nightly-2024-04-15
2 changes: 1 addition & 1 deletion DEVELOPMENT.md
@@ -34,7 +34,7 @@ About Rust code for R packages, see also
## Implementing new functions on the Rust side

Here are the steps required for an example contribution, where we are implementing the
[cosine expression](https://rpolars.github.io/man/Expr_cos.html):
[cosine expression](https://pola-rs.github.io/r-polars/man/Expr_cos.html):

1. Look up the [polars.Expr.cos method in py-polars documentation](https://pola-rs.github.io/polars/py-polars/html/reference/expressions/api/polars.Expr.cos.html).
2. Press the `[source]` button to see the [Python implementation](https://github.com/pola-rs/polars/blob/d23bbd2f14f1cd7ae2e27e1954a2dc4276501eef/py-polars/polars/expr/expr.py#L5892-L5914)
49 changes: 47 additions & 2 deletions NEWS.md
@@ -1,14 +1,59 @@
# NEWS

## polars (development version)
## Polars R Package (development version)

### Breaking changes

- `pl$Struct()` now only accepts named inputs and objects of class `RPolarsField`.
For example, `pl$Struct(pl$Boolean)` doesn't work anymore and should be named
like `pl$Struct(a = pl$Boolean)` (#1053).

## polars 0.16.1
### New features

- `pl$read_ipc()` can read a raw vector of Apache Arrow IPC file (#1072).
- New method `<DataFrame>$to_raw_ipc()` to serialize a DataFrame to a raw vector
of Apache Arrow IPC file format (#1072).
- New method `<LazyFrame>$serialize()` to serialize a LazyFrame to a character
vector of JSON representation (#1073).
- New function `pl$deserialize_lf()` to deserialize a LazyFrame from a character
vector of JSON representation (#1073).
- New methods `$str$head()` and `$str$tail()` (#1074).
- New S3 methods `nanoarrow::as_nanoarrow_array_stream()` and `nanoarrow::infer_nanoarrow_schema()`
for `RPolarsSeries` (#1076).
- New method `$dt$is_leap_year()` (#1077).

## Polars R Package 0.16.3

### New features

- New method `<SQLContext>$register_globals()` (#1064).
- New experimental method `$sql()` for DataFrame and LazyFrame (#1065).

### Miscellaneous

- Move the API document website to the new place (#1067, #1068).
Access to the old website is set to redirect to the top page of the new website.
- Old URL: `https://rpolars.github.io/`
- New URL: `https://pola-rs.github.io/r-polars/`

## Polars R Package 0.16.2

### New features

- `$cut()` and `$qcut()` to bin continuous values into discrete categories (#1057).
- `pl$scan_parquet()` and `pl$read_parquet()` can read data from the internet by specifying a URL
to the first argument (#1056, @andyquinterom).
- `pl$scan_parquet()` and `pl$read_parquet()` gain an argument `storage_options`
to scan/read data via cloud storage providers (GCP, AWS, Azure). Note that this
support is experimental (#1056, @andyquinterom).
- Add support for the `Enum` datatype via `pl$Enum()` (#1061).

### Bug fixes

- In some read/scan functions, downloading files could fail if the URL was too
long. This is now fixed (#1049, @DyfanJones).

## Polars R Package 0.16.1

This is a small hot-fix release to update dependent Rust polars to 0.39.1 (#1042).

53 changes: 53 additions & 0 deletions R/dataframe__frame.R
@@ -1982,6 +1982,8 @@ DataFrame_write_csv = function(
#' This functionality is considered **unstable**.
#' It may be changed at any point without it being considered a breaking change.
#' @rdname IO_write_ipc
#' @seealso
#' - [`<DataFrame>$to_raw_ipc()`][DataFrame_to_raw_ipc]
#' @examples
#' dat = pl$DataFrame(mtcars)
#'
@@ -2435,3 +2437,54 @@ DataFrame_clear = function(n = 0) {

out
}


# TODO: we can't use % in the SQL query
# <https://github.com/r-lib/roxygen2/issues/1616>
#' Execute a SQL query against the DataFrame
#'
#' @inherit LazyFrame_sql description details params seealso
#' @inherit pl_DataFrame return
#' @examplesIf polars_info()$features$sql
#' df1 = pl$DataFrame(
#' a = 1:3,
#' b = c("zz", "yy", "xx"),
#' c = as.Date(c("1999-12-31", "2010-10-10", "2077-08-08"))
#' )
#'
#' # Query the DataFrame using SQL:
#' df1$sql("SELECT c, b FROM self WHERE a > 1")
#'
#' # Join two DataFrames using SQL.
#' df2 = pl$DataFrame(a = 3:1, d = c(125, -654, 888))
#' df1$sql(
#' "
#' SELECT self.*, d
#' FROM self
#' INNER JOIN df2 USING (a)
#' WHERE a > 1 AND EXTRACT(year FROM c) < 2050
#' "
#' )
#'
#' # Apply transformations to a DataFrame using SQL, aliasing "self" to "frame".
#' df1$sql(
#' query = r"(
#' SELECT
#' a,
#' MOD(a, 2) == 0 AS a_is_even,
#' CONCAT_WS(':', b, b) AS b_b,
#' EXTRACT(year FROM c) AS year,
#' 0::float AS 'zero'
#' FROM frame
#' )",
#' table_name = "frame"
#' )
DataFrame_sql = function(query, ..., table_name = NULL, envir = parent.frame()) {
self$lazy()$sql(
query,
table_name = table_name,
envir = envir
)$collect() |>
result() |>
unwrap("in $sql():")
}
55 changes: 55 additions & 0 deletions R/datatype.R
@@ -145,6 +145,7 @@ DataType_constructors = function() {
list(
Array = DataType_Array,
Categorical = DataType_Categorical,
Enum = DataType_Enum,
Datetime = DataType_Datetime,
Duration = DataType_Duration,
List = DataType_List,
@@ -358,6 +359,60 @@ DataType_Categorical = function(ordering = "physical") {
.pr$DataType$new_categorical(ordering) |> unwrap()
}

#' Create Enum DataType
#'
#' An `Enum` is a fixed set categorical encoding of a set of strings. It is
#' similar to the [`Categorical`][DataType_Categorical] data type, but the
#' categories are explicitly provided by the user and cannot be modified.
#'
#' This functionality is **unstable**. It is a work-in-progress feature and may
#' not always work as expected. It may be changed at any point without it being
#' considered a breaking change.
#'
#' @param categories A character vector specifying the categories of the variable.
#'
#' @return An Enum DataType
#' @examples
#' pl$DataFrame(
#' x = c("Polar", "Panda", "Brown", "Brown", "Polar"),
#' schema = list(x = pl$Enum(c("Polar", "Panda", "Brown")))
#' )
#'
#' # All values of the variable have to be in the categories
#' dtype = pl$Enum(c("Polar", "Panda", "Brown"))
#' tryCatch(
#' pl$DataFrame(
#' x = c("Polar", "Panda", "Brown", "Brown", "Polar", "Black"),
#' schema = list(x = dtype)
#' ),
#' error = function(e) e
#' )
#'
#' # Comparing two Enum is only valid if they have the same categories
#' df = pl$DataFrame(
#' x = c("Polar", "Panda", "Brown", "Brown", "Polar"),
#' y = c("Polar", "Polar", "Polar", "Brown", "Brown"),
#' z = c("Polar", "Polar", "Polar", "Brown", "Brown"),
#' schema = list(
#' x = pl$Enum(c("Polar", "Panda", "Brown")),
#' y = pl$Enum(c("Polar", "Panda", "Brown")),
#' z = pl$Enum(c("Polar", "Black", "Brown"))
#' )
#' )
#'
#' # Same categories
#' df$with_columns(x_eq_y = pl$col("x") == pl$col("y"))
#'
#' # Different categories
#' tryCatch(
#' df$with_columns(x_eq_z = pl$col("x") == pl$col("z")),
#' error = function(e) e
#' )
DataType_Enum = function(categories) {
.pr$DataType$new_enum(categories) |> unwrap()
}


#' Check whether the data type is a temporal type
#'
#' @return A logical value
3 changes: 1 addition & 2 deletions R/dotdotdot.R
@@ -68,8 +68,7 @@ unpack_bool_expr_result = function(...) {
if (!is.null(names(l))) {
Err_plain(
"Detected a named input.",
"This usually means that you've used `=` instead of `==`.",
"Some names seen:", head(names(l))
"This usually means that you've used `=` instead of `==`."
)
} else {
l |>
2 changes: 1 addition & 1 deletion R/error__rpolarserr.R
@@ -56,7 +56,7 @@ bad_robj = function(r) {
}

Err_plain = function(...) {
Err(.pr$Err$new()$plain(paste0(..., collapse = " ")))
Err(.pr$Err$new()$plain(paste(..., collapse = " ")))
}
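
A small base-R sketch of what the `paste0()` to `paste()` swap in `Err_plain()` appears to change, assuming the first of the two lines in the hunk above is the removed one and the second is the added one. The message strings are copied from the `R/dotdotdot.R` hunk; the variable names (`part1`, `old_msg`, `seen`, and so on) are illustrative only and are not part of the package.

```r
# Message fragments as passed to Err_plain() in R/dotdotdot.R.
part1 = "Detected a named input."
part2 = "This usually means that you've used `=` instead of `==`."

# paste0() joins its arguments with no separator. `collapse` only joins the
# elements of the resulting vector, which has length 1 here, so the two
# sentences run together.
old_msg = paste0(part1, part2, collapse = " ")

# paste() defaults to sep = " ", so the sentences are separated by a space.
new_msg = paste(part1, part2, collapse = " ")

print(old_msg)
print(new_msg)

# A vector argument is recycled against the scalar ones before collapsing,
# so the scalar label is repeated once per element.
# `seen` is a hypothetical vector of argument names.
seen = c("a", "b")
recycled = paste("Some names seen:", seen, collapse = " ")
print(recycled)
```

With `paste()`, callers can pass sentence fragments without adding trailing spaces themselves. The last call shows that a non-scalar argument would still need to be collapsed by the caller before being passed in.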

# short hand for extracting an error context in unit testing, will raise error if not an RPolarsErr