
[BUG] ParquetCachedBatchSerializer does not grab the GPU semaphore and does not have retry blocks #11989

Open
revans2 opened this issue Jan 21, 2025 · 2 comments · May be fixed by #11991
Assignees: revans2
Labels: bug (Something isn't working)

Comments

revans2 (Collaborator) commented Jan 21, 2025

Describe the bug
The ParquetCachedBatchSerializer has no code in it to grab the GPU semaphore before it reads in data using the GPU, and GpuInMemoryTableScanExec does not do this either. This means we can end up with more tasks allocating GPU memory at once than the semaphore is supposed to allow, increasing the memory load on the GPU.

To make things worse, if we do run out of memory, the serializer code has no retry block in it, so we are relying entirely on spilling to solve the problem.

Steps/Code to reproduce bug

// produce 32 GiB of uncompressed data (RLE should make the cached version very small)
val df = spark.range(4294967296L).cache()
// Generate the cached data and process it...
df.filter("id > 100").selectExpr("COUNT(DISTINCT id)").show()
// First run passes, so run it again to read the cached data...
df.filter("id > 100").selectExpr("COUNT(DISTINCT id)").show()
// The second run fails with an OOM:
25/01/21 19:09:55 ERROR Executor: Exception in task 6.0 in stage 3.0 (TID 23)
java.lang.OutOfMemoryError: Could not allocate native memory: std::bad_alloc: out_of_memory: RMM failure at:/home/roberte/src/spark-rapids-jni/target/libcudf/cmake-build/_deps/rmm-src/include/rmm/mr/device/limiting_resource_adaptor.hpp:152: Exceeded memory limit
	at ai.rapids.cudf.Table.readParquet(Native Method)
	at ai.rapids.cudf.Table.readParquet(Table.java:1433)
	at ai.rapids.cudf.Table.readParquet(Table.java:1400)
	at ai.rapids.cudf.Table.readParquet(Table.java:1413)
	at com.nvidia.spark.rapids.ParquetCachedBatchSerializer.$anonfun$convertCachedBatchToColumnarInternal$1(ParquetCachedBatchSerializer.scala:500)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at com.nvidia.spark.rapids.CollectTimeIterator.$anonfun$hasNext$1(GpuMetrics.scala:282)
	at com.nvidia.spark.rapids.CollectTimeIterator.$anonfun$hasNext$1$adapted(GpuMetrics.scala:281)
...

Expected behavior
The query should be able to run successfully even when GPU memory is low.
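
For illustration, here is a minimal sketch of the shape a fix could take, assuming the existing GpuSemaphore.acquireIfNecessary and RmmRapidsRetryIterator.withRetryNoSplit helpers from spark-rapids; the wrapper method readCachedBatchOnGpu and its parameters are hypothetical, and this is not the actual patch:

// Hypothetical sketch only, not the actual fix. Assumes the existing
// GpuSemaphore and RmmRapidsRetryIterator helpers from spark-rapids.
import ai.rapids.cudf.{HostMemoryBuffer, ParquetOptions, Table}
import com.nvidia.spark.rapids.GpuSemaphore
import com.nvidia.spark.rapids.RmmRapidsRetryIterator.withRetryNoSplit
import org.apache.spark.TaskContext

def readCachedBatchOnGpu(opts: ParquetOptions, buf: HostMemoryBuffer,
    offset: Long, len: Long): Table = {
  // Block until this task holds the GPU semaphore, so concurrent tasks
  // cannot oversubscribe GPU memory while decoding the cached batch.
  GpuSemaphore.acquireIfNecessary(TaskContext.get())
  // Wrap the allocation-heavy read in a retry block so an OOM can be
  // retried (after the framework spills other buffers) instead of
  // failing the task outright.
  withRetryNoSplit {
    Table.readParquet(opts, buf, offset, len)
  }
}

Making the retry actually effective presumably also requires the inputs it works over to be spillable, which may be why the retry half is the more complicated part (see the follow-up comments below).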

revans2 added the "? - Needs Triage (Need team to review and classify)" and "bug (Something isn't working)" labels Jan 21, 2025
revans2 (Collaborator, Author) commented Jan 21, 2025

Note that this reproduction only triggers the retry case on a GPU with less than the 32 GiB needed to hold the uncompressed data.

revans2 (Collaborator, Author) commented Jan 21, 2025

I think I have a fix for the semaphore problem, but I need some more time to evaluate it. I will probably then split this up and file a separate issue for the retry code, as that looks to be quite a bit more complicated.

revans2 self-assigned this Jan 21, 2025
mattahrens removed the "? - Needs Triage (Need team to review and classify)" label Jan 21, 2025