martin2038: first submission #665

Merged Feb 2, 2024 (12 commits)

Conversation

@martin1847
Contributor

martin1847 commented Jan 30, 2024

Check List:

  • You have run ./mvnw verify and the project builds successfully
  • Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
  • All formatting changes by the build are committed
  • Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
  • Output matches that of calculate_average_baseline.sh
  • For new entries, or after substantial changes: When implementing custom hash structures, please point to where you deal with hash collisions (line number)
  • Execution time: 5.531 total
  • Execution time of reference implementation: 3:12.23 total
  • Apple M2 Pro/32GB

@martin1847
Contributor Author

Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-1.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-10.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-10000-unique-keys.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-2.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-20.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-3.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-boundaries.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-complex-utf8.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-dot.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-rounding.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-short.txt
Validating calculate_average_martin2038.sh -- src/test/resources/samples/measurements-shortest.txt

@martin1847
Contributor Author

Maven formatting is done.
Please approve, thanks.

@gunnarmorling
Owner

Getting this error when running:

Fatal error: Failed to leave the current IsolateThread context and to detach the current thread. (code 12)

We've had it before, I think it masks an actual other exception. I don't know though what it is. It occurs when running the test on 32 cores, maybe this trips up your chunking logic somehow?

@martin1847
Contributor Author

martin1847 commented Jan 31, 2024

Maybe. I have switched to JVM mode instead of the native image.
Please review again.

@gunnarmorling
Owner

Can run it now, but it produces incorrect output for the 10K keyset test (see create_measurements_3.sh).

@martin1847
Contributor Author

I have fixed round(mean), which was making the average differ in the last digit.
Please try again, thanks.
Hoping to make it in before closing time.
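
The thread does not show the actual rounding change, so the following is only a sketch of one way a last-digit mismatch in the mean can be avoided. It assumes that measurements are accumulated as integer tenths (not stated in this thread) and that the expected output rounds the mean to one decimal with ties going toward positive infinity, which is how the challenge rules read as far as I know. `MeanRounding` and `roundedMean` are hypothetical names.

```java
public class MeanRounding {
    /**
     * Mean of values that were summed as integer tenths (e.g. 12.3 is stored as 123),
     * rounded to one decimal place. Math.round rounds ties toward positive infinity,
     * so -7.5 tenths becomes -7 tenths, not -8.
     */
    public static double roundedMean(long sumTenths, long count) {
        return Math.round((double) sumTenths / count) / 10.0;
    }

    public static void main(String[] args) {
        // Two readings, 0.7 and 0.8, stored as 7 and 8 tenths.
        System.out.println(roundedMean(15, 2));
        // Same magnitudes with negative sign.
        System.out.println(roundedMean(-15, 2));
    }
}
```

Rounding once, on the final mean, avoids the drift that can appear when the sum and the mean are each rounded separately.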

@gunnarmorling
Owner

Still failing. Getting this exception for the 10K 1B rows set:

Exception in thread "main" java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE
	at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1293)
	at dev.morling.onebrc.CalculateAverage_martin2038.lambda$main$0(CalculateAverage_martin2038.java:96)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1709)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:960)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:934)
	at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:759)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:676)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:264)
	at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:677)
	at dev.morling.onebrc.CalculateAverage_martin2038.main(CalculateAverage_martin2038.java:117)

@martin1847
Contributor Author

Previously I used availableProcessors to split the file. When the CPU count is low (<8), the chunk size becomes greater than Integer.MAX_VALUE (2 GB). The 10K-key, 1B-row file is about 16 GB, and my M2 Mac has 10 CPUs.
This is fixed now.
I hope you can give me one last chance!
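
A minimal sketch of the idea behind the fix, not the code from this PR: divide the file by the worker count, then cap each chunk at Integer.MAX_VALUE so that FileChannel.map accepts it. `ChunkPlanner`, `Chunk`, and `plan` are hypothetical names, and aligning chunk boundaries to line ends is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkPlanner {
    /** Half-open byte range [start, end) of the input file. */
    public record Chunk(long start, long end) {
        public long size() {
            return end - start;
        }
    }

    // FileChannel.map throws IllegalArgumentException for sizes above this limit.
    static final long MAX_CHUNK = Integer.MAX_VALUE;

    public static List<Chunk> plan(long fileSize, int workers) {
        // Even split across workers (rounded up), then capped at the mappable limit.
        long perWorker = (fileSize + workers - 1) / workers;
        long chunkSize = Math.max(1, Math.min(perWorker, MAX_CHUNK));
        List<Chunk> chunks = new ArrayList<>();
        for (long start = 0; start < fileSize; start += chunkSize) {
            chunks.add(new Chunk(start, Math.min(start + chunkSize, fileSize)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        long fileSize = 16L * 1024 * 1024 * 1024; // roughly the 16 GB file mentioned above
        List<Chunk> chunks = plan(fileSize, 4);   // few workers: the cap takes effect
        System.out.println(chunks.size() + " chunks, largest " + chunks.get(0).size() + " bytes");
    }
}
```

With few workers this yields more chunks than workers, so the chunks would be handed out as tasks rather than one per thread. A real splitter would also move each boundary to the next newline and leave headroom under the cap for that adjustment.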

@gunnarmorling gunnarmorling merged commit f02279d into gunnarmorling:main Feb 2, 2024
1 check passed
@gunnarmorling
Owner

Looking good now. 00:09.725.
