Spark DataFrames of type pyspark.sql.connect.dataframe.DataFrame raise an error #1677

Open
3 tasks done
invalidarg opened this issue Nov 20, 2024 · 0 comments

Current Behaviour

With Spark Connect (e.g. on Databricks), DataFrames are of type pyspark.sql.connect.dataframe.DataFrame rather than pyspark.sql.dataframe.DataFrame.

When df is of type pyspark.sql.connect.dataframe.DataFrame,

a = ProfileReport(df)

raises

TypeCheckError: argument "df" (pyspark.sql.connect.dataframe.DataFrame) did not match any element in the union:
  pandas.core.frame.DataFrame: is not an instance of pandas.core.frame.DataFrame
  pyspark.sql.dataframe.DataFrame: is not an instance of pyspark.sql.dataframe.DataFrame
  NoneType: is not an instance of NoneType

Related:
https://community.databricks.com/t5/data-engineering/pyspark-sql-connect-dataframe-dataframe-vs-pyspark-sql-dataframe/td-p/71055

Expected Behaviour

The type checking should allow pyspark.sql.connect.dataframe.DataFrame as well.
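
To illustrate the kind of check that would accept both classes, here is a minimal sketch. It is not ydata-profiling's actual implementation: `SPARK_DATAFRAME_TYPES` and `is_spark_dataframe` are hypothetical names, and the check matches on fully qualified class names so that it runs without importing pyspark or the Spark Connect extras.

```python
# Hypothetical helper, not part of ydata-profiling.
# Fully qualified names of the DataFrame classes that should count as "Spark".
SPARK_DATAFRAME_TYPES = {
    "pyspark.sql.dataframe.DataFrame",          # classic Spark session
    "pyspark.sql.connect.dataframe.DataFrame",  # Spark Connect session
}


def is_spark_dataframe(obj: object) -> bool:
    """Return True if obj's class, or any of its base classes, is a known Spark DataFrame."""
    return any(
        f"{cls.__module__}.{cls.__qualname__}" in SPARK_DATAFRAME_TYPES
        for cls in type(obj).__mro__
    )
```

A name-based check like this avoids a hard import of `pyspark.sql.connect`, which may not be installed everywhere. If importing it is acceptable, adding the Spark Connect class to the type union would presumably work as well.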

Data Description

N/A

Code that reproduces the bug

# On a Databricks cluster with Spark Connect
from ydata_profiling import ProfileReport

df = spark.table('mytable')  # returns a pyspark.sql.connect.dataframe.DataFrame
a = ProfileReport(df)  # raises TypeCheckError

pandas-profiling version

v4.12.0

Dependencies

pyspark==3.5.3

OS

No response

Checklist

  • There is not yet another bug report for this issue in the issue tracker
  • The problem is reproducible from this bug report. This guide can help to craft a minimal bug report.
  • The issue has not been resolved by the entries listed under Common Issues.