abstract common base of SQL micro-benchmarks to reduce boilerplate and standardize parameters #17383

Open. Wants to merge 3 commits into base: master

clintropolis (Member) commented Oct 19, 2024

changes:

  • adds SqlBenchmarkDatasets which contains commonly used benchmark data generator schemas
  • adds SqlBaseBenchmark which contains common benchmark segment generation methods for any benchmark using SqlBenchmarkDatasets
  • adds SqlBaseQueryBenchmark and SqlBasePlanBenchmark for benchmarks measuring queries and planning respectively
  • migrates all existing SQL JMH benchmarks to extend SqlBaseQueryBenchmark, which substantially reduces the boilerplate needed to create a benchmark and allows multiple datasources to be used within a single benchmark file
  • adjusts the data generators to accept an ObjectMapper, so the same mapper can be used for both benchmark queries and segment generation instead of registering the same things with two separate mappers
  • adds SqlProjectionsBenchmark and SqlComplexMetricsColumnsBenchmark for measuring projections and complex metric compression, respectively

Common options are:

  • schemaType - "explicit" or "auto", to test differences between columns created with explicit dimension schemas and the AutoTypeColumnSchema used by schema discovery (where, among other things, numeric columns have indexes)
  • storageType - "MMAP", "INCREMENTAL", "FRAME_COLUMNAR", "FRAME_ROW" for testing various backing "segment" types
  • stringEncoding - "UTF8", "FRONT_CODED_DEFAULT_V1", "FRONT_CODED_16_V1", for testing different string encoding strategies (only applies to "MMAP" storageType)
  • complexMetricCompression - "none", "lz4" for testing different complex metric compression in IndexSpec (only applies to "MMAP" storageType)

Most query benchmarks also have a numbered query parameter; the exception is SqlGroupByBenchmark, which instead has a groupingDimension parameter.

Example:

DRUID_BENCHMARK_CACHE_DIR=./tmp java --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED --add-exports=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED -server -jar benchmarks/target/benchmarks.jar org.apache.druid.benchmark.query.SqlProjectionsBenchmark -p stringEncoding=UTF8 -p schemaType=explicit -p storageType=MMAP -p complexCompression=lz4 -p query=0
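
Since the one-line command is hard to read and edit, here is a minimal bash sketch of the same invocation with the JVM flags and JMH parameters held in arrays. It relies on two standard JMH behaviours: a `-p` flag accepts comma-separated values and runs every combination, and any parameter left out runs with all of its declared values. The `DRY_RUN` variable is a hypothetical switch introduced only for this sketch; by default the script prints the command instead of running it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# JVM flags copied from the example command above.
jvm_flags=(
  --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED
  --add-exports=java.base/jdk.internal.ref=ALL-UNNAMED
  --add-opens=java.base/java.nio=ALL-UNNAMED
  --add-opens=java.base/sun.nio.ch=ALL-UNNAMED
  --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED
  --add-opens=java.base/java.io=ALL-UNNAMED
  --add-opens=java.base/java.lang=ALL-UNNAMED
  --add-opens=java.base/java.util=ALL-UNNAMED
  --add-opens=jdk.management/com.sun.management.internal=ALL-UNNAMED
  -server
)

# One -p per parameter; comma-separated values are swept by JMH.
jmh_params=(
  -p stringEncoding=UTF8,FRONT_CODED_DEFAULT_V1
  -p schemaType=explicit
  -p storageType=MMAP
  -p query=0
)

cmd=(
  java "${jvm_flags[@]}"
  -jar benchmarks/target/benchmarks.jar
  org.apache.druid.benchmark.query.SqlProjectionsBenchmark
  "${jmh_params[@]}"
)

# DRY_RUN is a hypothetical variable for this sketch, not part of the PR.
if [ "${DRY_RUN:-1}" = "1" ]; then
  # Print the assembled command, shell-quoted, without executing it.
  printf '%q ' "${cmd[@]}"
  printf '\n'
else
  DRUID_BENCHMARK_CACHE_DIR=./tmp "${cmd[@]}"
fi
```

Running the script with `DRY_RUN=0` from the repository root, after building `benchmarks/target/benchmarks.jar`, would execute the benchmark for both string encodings in one JMH run.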
