Test vectorization #61

Open
tkoskela opened this issue May 19, 2020 · 0 comments

tkoskela commented May 19, 2020

The Particle State Update section runs about 2.5x faster on a Skylake processor than on a Haswell processor with the same nominal clock speed (17.8 s versus 45.5 s in the single-thread runs below). One possible explanation is the use of AVX-512 vector instructions on Skylake. It would be interesting to test whether this is the case.
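
One way to start testing this is to inspect the native code Julia generates for a hot loop and see which vector registers it uses. The sketch below is only an illustration: `axpy!` is a hypothetical stand-in kernel (the actual Particle State Update loop is not shown in this issue) and `widest_vector_registers` is a helper name invented here. It assumes an x86 machine, where `zmm` registers indicate 512-bit (AVX-512) code and `ymm` registers indicate 256-bit (AVX/AVX2) code.

```julia
using InteractiveUtils  # provides code_native outside the REPL

# Hypothetical stand-in kernel, NOT the real Particle State Update code.
function axpy!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    @inbounds @simd for i in eachindex(x, y)
        y[i] += a * x[i]
    end
    return y
end

# Return the widest x86 vector register class found in the native code
# generated for f with the given argument types.
function widest_vector_registers(f, types)
    asm = sprint(io -> code_native(io, f, types))
    occursin("zmm", asm) && return :zmm   # 512-bit registers (AVX-512)
    occursin("ymm", asm) && return :ymm   # 256-bit registers (AVX/AVX2)
    occursin("xmm", asm) && return :xmm   # 128-bit registers (SSE, also scalar FP)
    return :none                          # e.g. on non-x86 hardware
end

println("CPU detected by Julia: ", Sys.CPU_NAME)
println("Widest registers in axpy!: ",
        widest_vector_registers(axpy!, (Vector{Float64}, Float64, Vector{Float64})))
```

Running this on both machines would show whether the detected CPU differs and whether 512-bit registers are emitted at all. Note that `xmm` registers also appear in scalar floating-point code, so only `ymm` or `zmm` point to wide vectorization. Applying the same check to the real update function would be the next step.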

Single thread Haswell Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz:

 julia> tdac(TDAC.tdac_params(; nprt = 64, nobs = 64, enable_timers = true));
────────────────────────────────────────────────────────────────────────────────
                                         Time                   Allocations      
                                 ──────────────────────   ───────────────────────
        Tot / % measured:             77.2s / 100%            11.4GiB / 100%     
 Section                 ncalls     time   %tot     avg     alloc   %tot      avg
 ────────────────────────────────────────────────────────────────────────────────
 Particle State Update       20    45.5s  59.0%   2.28s   3.00MiB  0.03%   154KiB
 Process Noise            1.28k    27.8s  36.0%  21.7ms   10.7GiB  93.7%  8.55MiB
 Initialization               1    1.47s  1.90%   1.47s    698MiB  5.98%   698MiB
 True State Update           20    931ms  1.20%  46.5ms   42.8KiB  0.00%  2.14KiB
 Resample                    20    774ms  1.00%  38.7ms   12.2KiB  0.00%     624B
 Particle Variance           20    343ms  0.44%  17.2ms   36.6MiB  0.31%  1.83MiB
 Particle Mean               20    181ms  0.23%  9.05ms     0.00B  0.00%    0.00B
 State Copy                  20    126ms  0.16%  6.32ms      640B  0.00%    32.0B
 Weights                     20   20.8ms  0.03%  1.04ms   2.53MiB  0.02%   130KiB
 Observations             1.30k   15.7ms  0.02%  12.1μs    280KiB  0.00%     221B
 Observation Noise        1.28k   2.50ms  0.00%  1.96μs   60.0KiB  0.00%    48.0B
 ────────────────────────────────────────────────────────────────────────────────

Single thread Skylake Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz:

julia> tdac(TDAC.tdac_params(; nprt = 64, nobs = 64, enable_timers = true));
────────────────────────────────────────────────────────────────────────────────
                                         Time                   Allocations      
                                 ──────────────────────   ───────────────────────
        Tot / % measured:             46.2s / 100%            11.4GiB / 100%     
 Section                 ncalls     time   %tot     avg     alloc   %tot      avg
 ────────────────────────────────────────────────────────────────────────────────
 Process Noise            1.28k    25.1s  54.2%  19.6ms   10.7GiB  93.5%  8.55MiB
 Particle State Update       20    17.8s  38.5%   890ms   4.48MiB  0.04%   229KiB
 Initialization               1    2.13s  4.61%   2.13s    698MiB  5.97%   698MiB
 Resample                    20    382ms  0.83%  19.1ms   12.2KiB  0.00%     624B
 True State Update           20    300ms  0.65%  15.0ms   42.8KiB  0.00%  2.14KiB
 Particle Variance           20    208ms  0.45%  10.4ms   36.6MiB  0.31%  1.83MiB
 State Copy                  20    130ms  0.28%  6.48ms      640B  0.00%    32.0B
 Particle Mean               20   91.8ms  0.20%  4.59ms     0.00B  0.00%    0.00B
 Observations             1.30k   17.6ms  0.04%  13.5μs    280KiB  0.00%     221B
 Weights                     20   3.94ms  0.01%   197μs   2.53MiB  0.02%   130KiB
 Observation Noise        1.28k   2.73ms  0.01%  2.13μs   60.0KiB  0.00%    48.0B
 ────────────────────────────────────────────────────────────────────────────────
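
To see which sections account for the difference, the per-section speedups can be computed from the two tables above. This is a small sketch using times copied by hand from those tables (converted to seconds); the dictionary names `haswell` and `skylake` are chosen here for illustration.

```julia
# Section times in seconds, copied from the two timer tables above.
haswell = Dict(
    "Particle State Update" => 45.5,
    "Process Noise"         => 27.8,
    "True State Update"     => 0.931,
    "Resample"              => 0.774,
    "Particle Variance"     => 0.343,
    "Particle Mean"         => 0.181,
)
skylake = Dict(
    "Particle State Update" => 17.8,
    "Process Noise"         => 25.1,
    "True State Update"     => 0.300,
    "Resample"              => 0.382,
    "Particle Variance"     => 0.208,
    "Particle Mean"         => 0.0918,
)

# Speedup of the Skylake run relative to the Haswell run, per section.
speedup = Dict(name => haswell[name] / skylake[name] for name in keys(haswell))

# Print sections from largest to smallest speedup.
for (name, s) in sort(collect(speedup); by = last, rev = true)
    println(rpad(name, 24), round(s; digits = 2), "x")
end
```

With these numbers the speedup is far from uniform: Process Noise barely changes, while Particle State Update and True State Update improve by roughly 2.5x to 3x. That is consistent with a section-specific effect such as vectorization, although it does not prove it.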