Let me preface my comments by saying that this is just my understanding of how things work. I haven't inspected or even run this code myself.
To keep the focus on IO and to make it possible to test without GPUs, the benchmark does not require a real GPU. This makes sense. However, a sleep() is used to simulate the GPU computation time, which seems unnecessary. If the IO and the GPU emulation do not overlap, as appears to be the case, the sleep could simply be removed. Otherwise, it is just a waste of time: it may be necessary to simulate the IO-compute flow of a real code, but it isn’t necessary to simulate the wallclock runtime. Even if there is some overlap between IO and GPU emulation, long-duration sleeps could be short-circuited once the IO for the stage is complete. The metrics for compute time, IO time, fill time, and samples/sec could then be calculated as if the full wallclock time had been spent, if necessary.
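To make the suggestion concrete, here is a minimal sketch of the accounting I have in mind. It is not taken from the benchmark's code, which I have not read: `StageMetrics`, `run_stage`, and `read_batch` are hypothetical names, and the overlap case is simplified to "each step costs whichever of IO or compute is longer". The emulated compute time is credited to the metrics rather than slept, so the only real waiting is the IO itself.

```python
import time
from dataclasses import dataclass


@dataclass
class StageMetrics:
    io_seconds: float = 0.0        # measured: real time spent waiting on IO
    compute_seconds: float = 0.0   # credited: emulated GPU time, never slept
    reported_seconds: float = 0.0  # wallclock as if the sleeps had really run
    samples: int = 0

    @property
    def samples_per_second(self) -> float:
        if self.reported_seconds == 0:
            return 0.0
        return self.samples / self.reported_seconds


def run_stage(read_batch, num_batches, batch_size, compute_seconds, overlap=False):
    """Run one stage, crediting emulated compute time instead of sleeping.

    read_batch is a hypothetical callable that performs the IO for one batch.
    overlap=True models IO that runs concurrently with the emulated compute
    (an assumption about the pipeline, not something confirmed in the code).
    """
    metrics = StageMetrics()
    for _ in range(num_batches):
        start = time.perf_counter()
        read_batch()  # the only real waiting in the step
        io = time.perf_counter() - start

        metrics.io_seconds += io
        metrics.compute_seconds += compute_seconds
        if overlap:
            # A full sleep would have hidden the IO unless the IO took longer.
            metrics.reported_seconds += max(io, compute_seconds)
        else:
            # Sequential IO then compute: the two times simply add up.
            metrics.reported_seconds += io + compute_seconds
        metrics.samples += batch_size
    return metrics
```

With something like this, a stage configured for one second of emulated compute per batch finishes in roughly the time its IO takes, while `reported_seconds` and `samples_per_second` still reflect the runtime a real sleep would have produced. Fill time is not modeled here.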