Library Request: cuDF #8318

Open
4 tasks done
AlexCatarino opened this issue Sep 11, 2024 · 1 comment

Comments

@AlexCatarino
Member

cuDF (pronounced "KOO-dee-eff") is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data.

Test:

import cudf

tips_df = cudf.read_csv("https://github.com/plotly/datasets/raw/master/tips.csv")
tips_df["tip_percentage"] = tips_df["tip"] / tips_df["total_bill"] * 100

# display average tip by dining party size
print(tips_df.groupby("size").tip_percentage.mean())

Gives us:

No module named 'cudf'
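
Before running the test above, it can help to confirm which libraries the environment actually ships. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the list of names checked are illustrative assumptions, not part of LEAN or cuDF.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be located (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            # find_spec locates a top-level module without importing it.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for a missing parent package or a malformed module entry.
            found = False
        if not found:
            missing.append(name)
    return missing


# Illustrative names only; adjust to whatever the environment should provide.
print(missing_modules(["cudf", "pandas", "polars"]))
```

In an environment like the one reported here, `cudf` would be expected to appear in the printed list.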

Checklist

  • I have completely filled out this template
  • I have confirmed that this issue exists on the current master branch
  • I have confirmed that this is not a duplicate issue by searching issues
  • I have provided detailed steps to reproduce the issue

@beckernick

Hi! I came across this issue due to the cuDF reference. I work on cuDF and other RAPIDS projects at NVIDIA.

In addition to being a GPU DataFrame library in its own right, cuDF can provide zero-code-change GPU acceleration for pandas (via cudf.pandas) and, as of yesterday, Polars (via its GPU engine).

%load_ext cudf.pandas  # or, for Python scripts: python -m cudf.pandas script.py

import pandas as pd

df = pd.read_parquet(filepath)  # filepath: path to your Parquet file

(df[["Registration State", "Violation Description"]]
 .value_counts()
 .groupby("Registration State")
 .head(1)
 .sort_index()
 .reset_index()
)

import polars as pl

ldf = pl.LazyFrame({"a": [1.242, 1.535]})

print(
    ldf.select(
        pl.col("a").round(1)
    ).collect(engine="gpu")
)
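
For plain Python scripts, the command-line route mentioned for cudf.pandas is `python -m cudf.pandas script.py`. Here is a small sketch of how a launcher might choose between that and a regular CPU run, depending on whether cuDF is installed. The function name `script_command` and the script name `analysis.py` are hypothetical and only for illustration.

```python
import importlib.util
import sys


def script_command(script_path):
    """Build the argv for running a script, using cudf.pandas when cuDF is present."""
    try:
        has_cudf = importlib.util.find_spec("cudf") is not None
    except (ImportError, ValueError):
        has_cudf = False

    if has_cudf:
        # Run the unmodified pandas script through the cudf.pandas accelerator.
        return [sys.executable, "-m", "cudf.pandas", script_path]
    # cuDF is not installed: fall back to a normal CPU run.
    return [sys.executable, script_path]


# "analysis.py" is a placeholder script name.
print(script_command("analysis.py"))
```

The resulting list can be passed to `subprocess.run`. The script itself stays the same in both cases, which is the point of the zero-code-change mode.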

Would love to see these capabilities available for LEAN users. Happy to try to help answer any questions that might come up if you or anyone else explores this.
