University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking
Dear Grader!
As you may not know, I registered late for the class (on the last day of registration). I negotiated due dates with Professor Shehzan via email, and my Project 1 due date is Sept 22. However, as you can see, I submitted about 26 hours late. I know there are 4 late days without penalty, but I only want to use ONE on this project; I don't want to spend a second late day just for the extra 2 hours. You can deduct points for being late by one day (hopefully only 2 hours!).
Thanks
- Xiao Wei
- Tested on: Windows 10, i9-9900k @ 3.6GHz 16.0GB, RTX 2080 SUPER 16GB
Simulation gif
For each implementation, how does changing the number of boids affect performance? Why do you think this is?
Performance drops as the number of boids increases. Each boid in the flock takes up a thread and the computational resources needed to update its velocity and position. More boids means more work.
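To make "more boids, more work" concrete, here is a rough cost model written as a host-side C++ sketch. It is not the project's code: `naiveChecksPerStep`, `gridChecksPerStep`, and the `avgBoidsPerCell` parameter are hypothetical names, and the grid estimate assumes boids are spread evenly across cells.

```cpp
#include <cstdint>

// Hypothetical cost model, not taken from the project source.
// Brute force: every boid inspects every other boid once per step.
std::uint64_t naiveChecksPerStep(std::uint64_t numBoids) {
    if (numBoids == 0) return 0;
    return numBoids * (numBoids - 1);
}

// Uniform grid: every boid inspects only the boids found in a fixed
// number of nearby cells. avgBoidsPerCell is an assumed average density.
std::uint64_t gridChecksPerStep(std::uint64_t numBoids,
                                std::uint64_t cellsInspected,
                                std::uint64_t avgBoidsPerCell) {
    return numBoids * cellsInspected * avgBoidsPerCell;
}
```

Under this model, going from 5,000 to 10,000 boids raises the brute-force count from 24,995,000 to 99,990,000 checks per step, roughly 4x. The grid estimate only doubles if the density per cell stays the same, which would not hold if more boids are packed into the same scene volume.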
For each implementation, how does changing the block count and block size affect performance? Why do you think this is?
Changing the block size does not change performance drastically. Acquiring more blocks only requires a simple operation.
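The "simple operation" can be sketched as a ceiling division. This assumes the block count is derived from the boid count and the block size, which is the usual pattern for one-thread-per-boid kernels; `blocksForBoids` and `idleThreads` are hypothetical helper names used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: number of blocks needed so that every boid gets
// one thread. This is integer ceiling division.
std::uint32_t blocksForBoids(std::uint32_t numBoids, std::uint32_t blockSize) {
    return (numBoids + blockSize - 1) / blockSize;
}

// Threads launched in the last block that have no boid to process.
// A kernel would normally guard these with an index bounds check.
std::uint32_t idleThreads(std::uint32_t numBoids, std::uint32_t blockSize) {
    return blocksForBoids(numBoids, blockSize) * blockSize - numBoids;
}
```

For 5,000 boids, a block size of 128 gives 40 blocks and a block size of 256 gives 20 blocks. Both launch 5,120 threads in total, so the amount of work is the same either way, which fits the observation that block size has little effect here.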
For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?
Yes, performance improved. I expected this outcome, since the coherent version no longer has to go through dev_particleArrayIndices to reach each boid's data.
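The difference between the two grid versions can be sketched on the host side as follows. This is a simplified illustration, not the project's kernel: positions are reduced to a single `float` per boid, and `reshuffle` and `sortedIndices` are hypothetical names, with `sortedIndices` standing in for the role of dev_particleArrayIndices after sorting by grid cell.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: copy boid data into cell-sorted order once per step.
// Afterwards coherent[i] holds what data[sortedIndices[i]] held, so the
// neighbor search can read coherent[i] directly, without the extra lookup.
std::vector<float> reshuffle(const std::vector<float>& data,
                             const std::vector<int>& sortedIndices) {
    std::vector<float> coherent(sortedIndices.size());
    for (std::size_t i = 0; i < sortedIndices.size(); ++i) {
        coherent[i] = data[static_cast<std::size_t>(sortedIndices[i])];
    }
    return coherent;
}
```

The scattered version pays for the `data[sortedIndices[i]]` indirection on every neighbor read. The coherent version pays for it once in the reshuffle, and boids in the same cell then sit next to each other in memory.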
Did changing cell width and checking 27 vs 8 neighboring cells affect performance? Why or why not? Be careful: it is insufficient (and possibly incorrect) to say that 27-cell is slower simply because there are more cells to check!
Checking 27 cells was actually faster in my runs. My guess is that the smaller cells used in the 27-cell configuration give finer granularity for parallel work.
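One way to reason about this result is to compare the volume of space each configuration searches. The sketch below assumes the 8-cell setup uses a cell width of twice the neighborhood radius and the 27-cell setup uses a cell width equal to the radius; those widths are assumptions, and `searchVolume` is a hypothetical helper.

```cpp
// Hypothetical helper: volume of the cube of cells inspected per boid,
// given how many cells are checked along each axis and the cell width.
double searchVolume(int cellsPerAxis, double cellWidth) {
    const double side = cellsPerAxis * cellWidth;
    return side * side * side;
}
```

With a neighborhood radius of 1, the 8-cell search covers `searchVolume(2, 2.0)`, which is 64 cubic units, while the 27-cell search covers `searchVolume(3, 1.0)`, which is 27 cubic units. Under these assumptions the 27-cell version visits more cells but a smaller region, so it may test fewer boids that are out of range.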