WIP: Some Benchmarks for get_json_object #10729

revans2 · 2024-04-22T13:29:43Z

This depends on #10728 and I don't know if this is where or how we want to deal with benchmarks. I am also happy to move this to another repo if we want to.

Signed-off-by: Robert (Bobby) Evans <[email protected]>

abellina · 2024-04-22T14:49:11Z

benchark/get_json_object_stress_run.scala

+ * limitations under the License.
+ */
+
+val input = "/data/tmp/SCALE_FROM_JSON"


do we need this file?

abellina · 2024-04-22T14:50:07Z

benchark/get_json_object_stress_gen.scala

+val numRows = 3000000
+//val nullProbability = 0.1
+val nullProbability = 0.0001
+val output = "/data/tmp/SCALE_FROM_JSON"


both input and output assume there's a /data/ folder. That may be fine and can be improved upon later, but perhaps a note would be good?
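
One way to address this without changing the current behaviour is a small sketch like the one below. It assumes a hypothetical `BENCH_DATA_DIR` environment variable and a hypothetical helper `benchPath`, neither of which is part of this PR. The default still resolves to the `/data/tmp` location the scripts use today:

```scala
// Hypothetical sketch: BENCH_DATA_DIR and benchPath are assumed names, not part of this PR.
// NOTE: by default the benchmark scripts expect a writable /data/tmp directory.
def benchPath(env: Map[String, String] = sys.env): String = {
  // Fall back to the currently hard-coded location when the variable is unset.
  val baseDir = env.getOrElse("BENCH_DATA_DIR", "/data/tmp")
  s"$baseDir/SCALE_FROM_JSON"
}

// The gen script would write here and the run script would read from the same place.
val output = benchPath()
val input = benchPath()
```

Sharing one helper between the two scripts would keep `input` and `output` from drifting apart.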

abellina · 2024-04-22T14:52:49Z

I think it's probably simple to have a set of micro benchmarks tied to data gen in spark-rapids, especially with the Spark dependencies. We should add a runner in spark-rapids-benchmarks that triggers the correct code path using a jar and a given Spark.

ttnghia · 2024-04-22T23:37:14Z

FYI: we have a related benchmark implemented in spark-rapids-jni: NVIDIA/spark-rapids-jni#1952

revans2 added 2 commits April 22, 2024 08:15

Let big data gen set nullability recursively

af402d8

Signed-off-by: Robert (Bobby) Evans <[email protected]>

Add in some benchmarks for get_json_object

967001b

Signed-off-by: Robert (Bobby) Evans <[email protected]>

abellina reviewed Apr 22, 2024

sameerz added the performance label Apr 23, 2024

revans2 mentioned this pull request May 3, 2024

Adjust the launch bounds to get_json_object to avoid spilling NVIDIA/spark-rapids-jni#2015

Merged

revans2 changed the base branch from branch-24.06 to branch-24.10 July 30, 2024 18:17

revans2 changed the base branch from branch-24.10 to branch-24.12 October 9, 2024 13:28
