
Fix tests failures in collection_ops_test.py #11011

Open
Tracked by #11004
razajafri opened this issue Jun 8, 2024 · 0 comments · May be fixed by #11414
Assignees
Labels: bug (Something isn't working), Spark 4.0+ (Spark 4.0+ issues)

Comments
@razajafri (Collaborator) commented on Jun 8, 2024

FAILED ../../../../integration_tests/src/main/python/collection_ops_test.py::test_sequence_too_long_sequence
razajafri added the bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) labels on Jun 8, 2024
razajafri changed the title from "Fix tests failures in cast_test.py" to "Fix tests failures in cmp_test.py" on Jun 8, 2024
razajafri changed the title from "Fix tests failures in cmp_test.py" to "Fix tests failures in collection_ops_test.py" on Jun 8, 2024
razajafri added the Spark 4.0+ (Spark 4.0+ issues) label on Jun 8, 2024
mattahrens removed the ? - Needs Triage (Need team to review and classify) label on Jun 11, 2024
mythrocks self-assigned this on Aug 27, 2024
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Aug 30, 2024
Fixes NVIDIA#11011.

This commit fixes the failures in `collection_ops_tests` on Spark 4.0.

On all versions of Spark, when a sequence is collected with more rows than MAX_INT,
an exception is thrown indicating that the collected sequence/array is
larger than permissible. The versions of Spark differ in the contents of
that exception message.

On Spark 4, the error message now carries more information than in all
prior versions, including:
1. The name of the operation causing the error
2. The errant sequence size

This commit introduces a shim to make this new information available in
the exception.

Note that this shim does not fit cleanly in RapidsErrorUtils, because
there are differences within major Spark versions. For instance, Spark
3.4.0 and 3.4.1 produce a different message than 3.4.2 and 3.4.3.
Likewise, the message differs across 3.5.0, 3.5.1, and 3.5.2.

Signed-off-by: MithunR <[email protected]>
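
As a rough illustration of the approach described in the commit message (not the actual spark-rapids shim; the object name, method name, and message texts below are all hypothetical assumptions), a version-gated helper might build the expected error message like this:

```scala
// Hypothetical sketch only: names, structure, and message wording are assumptions,
// not the actual spark-rapids shim or the exact Spark error strings.
object SequenceSizeErrorShim {
  // Upper bound on the number of elements Spark allows when materializing an array.
  private val MaxAllowedSize: Int = Int.MaxValue

  // Build the error message expected for an over-long sequence, varying by the
  // Spark version the plugin runs against.
  def tooLongSequenceMessage(sequenceSize: Long, functionName: String): String = {
    val majorVersion = org.apache.spark.SPARK_VERSION.split("\\.").head.toInt
    if (majorVersion >= 4) {
      // Spark 4.x: the message includes both the operation name and the errant size.
      s"Can't create array in $functionName with $sequenceSize elements; " +
        s"exceeding the array size limit $MaxAllowedSize."
    } else {
      // Pre-4.0: only the size limit is reported, and the exact wording varies
      // between maintenance releases (e.g. 3.4.0/3.4.1 vs. 3.4.2/3.4.3).
      s"Too long sequence: $sequenceSize. Should be <= $MaxAllowedSize"
    }
  }
}
```

In the real plugin, the per-version variants would more likely live in version-specific shim source sets rather than behind a runtime version check; the runtime check above is only to keep the sketch self-contained.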