-
Notifications
You must be signed in to change notification settings - Fork 0
/
version_timing.txt
85 lines (69 loc) · 2.8 KB
/
version_timing.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
Here are some basic performance numbers, when running the program with no
arguments on a Radeon VII GPU. The most-recent measurements are at the bottom
of this file.
Performance at commit a112cb108b2a9eb3ef8f64138c93b7696262079b:
(This is the version with the "old" complicated kernel.)
Creating 1000x1000 image, 50 samples per thread, 100 max iterations.
Calculating image...
Approximate memory needed: 9.129 MiB GPU, 9.537 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
3007 Buddhabrot passes took 10.003783 seconds.
Max value: 102438, scale: 0.639753
Done! Saving image.
Output image saved: output.pgm
Performance at commit 5d15d53696d30bc984b87ff3860c9af1eacb041e:
(This is the version with the "new" simplified kernel.)
Creating 1000x1000 image, 50 samples per thread, 100 max iterations.
Calculating image...
Approximate memory needed: 9.129 MiB GPU, 9.537 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
2877 Buddhabrot passes took 10.004008 seconds.
Max value: 100974, scale: 0.649028
Done! Saving image.
Output image saved: output.pgm
Performance at commit b95671388df259cc7f4c7514e9e907d2f88cb8a6:
(This is the version with inlined functions and small optimizations.)
Creating 1000x1000 image, 50 samples per thread, 100 max iterations.
Calculating image...
Approximate memory needed: 9.129 MiB GPU, 9.537 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
2904 Buddhabrot passes took 10.003585 seconds.
Max value: 101955, scale: 0.642784
Done! Saving image.
Output image saved: output.pgm
Performance at commit f407a418cca13dd59c518d2193d9f38098016693:
(This is the version where "#pragma unroll 4" was added to the Mandelbrot-set
loops.)
Creating 1000x1000 image, 50 samples per thread, 100 max iterations.
Calculating image...
Approximate memory needed: 9.129 MiB GPU, 9.537 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
2976 Buddhabrot passes took 10.004047 seconds.
Max value: 104509, scale: 0.627075
Done! Saving image.
Output image saved: output.pgm
Performance at commit 58fbe99907105142497322425ea8c73241d0b500:
(This is the version after increasing the block count to 512.)
Creating 1000x1000 image, 100 max iterations.
Calculating image...
Approximate memory needed: 19.629 MiB GPU, 9.537 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
812 Buddhabrot passes took 10.010447 seconds.
Max value: 227153, scale: 0.288506
Done! Saving image.
Output image saved: output.pgm
Performance at commit 456d23546f7ebcdfb92aa2512bf379d9732be11b:
(This is the version after reducing the internal pixel buffer to 32 bit.)
Calculating image...
Approximate memory needed: 15.815 MiB GPU, 5.722 MiB CPU
Calculating Buddhabrot.
Running for 10.000 seconds.
818 Buddhabrot passes took 10.009091 seconds.
Max value: 228951, scale: 0.286240
Done! Saving image.
Output image saved: output.pgm