Describe the feature
Divide the RSS store partition into slices, with a default slice size of 10GB.
Motivation
Avoid a single, extremely large partition consuming the I/O of a whole node/disk.
Describe the solution
Divide the RSS store partition into slices, defaulting to 10GB per slice, so that a 21GB partition is stored as 3 slices (see the sketch below).
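A rough sketch of the slicing arithmetic only; the names below (`SliceMath`, `sliceCount`, `sliceIndexFor`) are hypothetical and are not Uniffle's actual API:

```scala
// Illustration only: these names are hypothetical, not Uniffle's real classes.
object SliceMath {
  // Default slice size: 10 GB.
  val DefaultSliceSizeBytes: Long = 10L * 1024 * 1024 * 1024

  // Number of slices needed to hold a partition of the given size (ceiling division).
  def sliceCount(partitionSizeBytes: Long,
                 sliceSizeBytes: Long = DefaultSliceSizeBytes): Int =
    ((partitionSizeBytes + sliceSizeBytes - 1) / sliceSizeBytes).toInt

  // Slice index that a byte offset within the partition falls into.
  def sliceIndexFor(offsetBytes: Long,
                    sliceSizeBytes: Long = DefaultSliceSizeBytes): Int =
    (offsetBytes / sliceSizeBytes).toInt
}

// Example: a 21 GB partition needs ceil(21 / 10) = 3 slices.
// SliceMath.sliceCount(21L * 1024 * 1024 * 1024)  // == 3
```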
Additional context
https://docs.google.com/document/d/1R9LcPIkmWml0aD3rQbhKgO9qWBSE9NknRCyhljC09Uw/edit?usp=sharing
Commit 051a247: [#2086] feat(spark): Support cut partition to slices and served by multiply server (#2093)

### What changes were proposed in this pull request?
Support storing a sliced partition across multiple shuffle servers.

Limitation:
- Only the Netty mode has been tested so far.

### Why are the changes needed?
Fix: #2086

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Start multiple shuffle servers and a coordinator locally.
- Start a local Spark standalone environment.
- Start spark-shell and execute `test.scala`:

```Console
bin/spark-shell --master spark://localhost:7077 --deploy-mode client \
  --conf spark.rss.client.reassign.blockRetryMaxTimes=3 \
  --conf spark.rss.writer.buffer.spill.size=30 \
  --conf spark.rss.client.reassign.enabled=true \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
  --conf spark.rss.coordinator.quorum=localhost:19999 \
  --conf spark.rss.storage.type=LOCALFILE \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.rss.test.mode.enable=true \
  --conf spark.rss.client.type=GRPC_NETTY \
  --conf spark.sql.shuffle.partitions=1 \
  -i test.scala
```

- test.scala

```scala
val data = sc.parallelize(Seq(
  ("A", 1), ("B", 2), ("C", 3), ("A", 4), ("B", 5), ("A", 6), ("A", 7),
  ("A", 7), ("A", 7), ("A", 7), ("A", 7), ("A", 7), ("A", 7), ("A", 7),
  ("A", 7), ("A", 7), ("A", 7), ("A", 7), ("A", 7)))
val result = data.reduceByKey(_ + _)
result.collect().foreach(println)
System.exit(0)
```

<img width="410" alt="image" src="https://github.com/user-attachments/assets/7c72fa3e-cfb5-4361-9875-a82b6aeeedfb">
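To make the "served by multiple servers" idea concrete, here is a heavily simplified sketch; the round-robin assignment, `ShuffleServer`, `assignSlices`, and the port numbers are all assumptions for illustration, not the client logic actually implemented in #2093:

```scala
// Hypothetical illustration: spread the slices of one partition across
// shuffle servers round-robin. NOT the actual assignment logic of #2093.
case class ShuffleServer(host: String, port: Int)

def assignSlices(sliceCount: Int, servers: Seq[ShuffleServer]): Map[Int, ShuffleServer] = {
  require(servers.nonEmpty, "need at least one shuffle server")
  (0 until sliceCount).map(i => i -> servers(i % servers.size)).toMap
}

// Example: the 3 slices of a 21 GB partition spread over 2 local servers
// (ports are arbitrary placeholders).
val servers = Seq(ShuffleServer("localhost", 21000), ShuffleServer("localhost", 21001))
// assignSlices(3, servers) maps slice 0 -> :21000, slice 1 -> :21001, slice 2 -> :21000
```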