Remove the use of CUDA API wrappers #386

Closed · 20 tasks done
makortel opened this issue Sep 11, 2019 · 21 comments

makortel commented Sep 11, 2019

This issue is to track progress on removing the use of the CUDA API wrappers. The library turned out not to be that useful (see some discussion in #279 (comment)).

We are currently using the following components (extracted with git grep, so the list may be incomplete):

@makortel

@fwyzard About cuda::throw_if_error(): given that we are mostly using cudaCheck(), how about changing cudaCheck() so that a compile-time flag selects whether it calls abort() or throws an exception? (That way CMSSW master would throw an exception, while Patatrack and private forks could still use abort().)
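
A minimal sketch of what such a compile-time switch could look like (the CUDA_CHECK_ABORT macro and the helper names here are hypothetical, not the actual CMSSW implementation):

    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>
    #include <stdexcept>
    #include <string>

    // Hypothetical sketch: defining CUDA_CHECK_ABORT keeps the old abort()
    // behaviour at compile time; without it, cudaCheck() throws instead.
    inline void cudaCheckImpl(cudaError_t result, const char* file, int line) {
      if (result == cudaSuccess)
        return;
      std::string msg = std::string(file) + ":" + std::to_string(line) +
                        " CUDA error: " + cudaGetErrorString(result);
    #ifdef CUDA_CHECK_ABORT
      std::fprintf(stderr, "%s\n", msg.c_str());
      std::abort();
    #else
      throw std::runtime_error(msg);
    #endif
    }

    #define cudaCheck(call) cudaCheckImpl((call), __FILE__, __LINE__)

Call sites would stay unchanged either way, e.g. cudaCheck(cudaMalloc(&ptr, size));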

@makortel

I'll start with removing the streams and events.

fwyzard added the task label Sep 14, 2019

fwyzard commented Sep 14, 2019

OK.
I'll have a look at providing an equivalent of cuda::launch().


fwyzard commented Sep 14, 2019

cuda::launch() does not support function objects or lambdas - just plain functions, right?

@makortel

> cuda::launch() does not support function objects or lambdas - just plain functions, right?

I believe it is just a wrapper for the kernel launch <<<...>>>, so the passed function has to be a __global__ function (or whatever a CUDA kernel launch supports).

@makortel

The first part of the streams and events work is done in #389.


fwyzard commented Sep 17, 2019

@makortel I have prepared an alternative for cuda::launch().

Do you prefer to keep the same syntax, e.g.

launch(kernel, {gridDim, blockDim, sharedMem = 0, stream = nullptr}, args...);

or something more like what Cupla uses, e.g.

launch(kernel)(gridDim, blockDim, sharedMem = 0, stream = nullptr)(args...);

?

@makortel

@fwyzard I don't have a clear preference. (I would probably have gone with the bare kernel launch syntax, i.e. <<<...>>>, but I'm really fine with almost any syntax, since its use is not that widespread.)


fwyzard commented Sep 18, 2019

OK, I'll stick to the current syntax then, since it is simpler to implement.

The advantage over the kernel<<<...>>>(...) syntax is that launch(kernel, {...}, ...) can also be used with the host compiler.
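
A minimal sketch of how such a wrapper can stay usable from host-compiled code (illustrative only, not the actual implementation, and using the cudaCheck() helper sketched earlier): it can be built on cudaLaunchKernel(), which is a plain runtime API call and therefore compiles with the host compiler, unlike the <<<...>>> syntax that requires nvcc.

    #include <cuda_runtime.h>
    #include <cstddef>

    struct LaunchParameters {
      dim3 gridDim;
      dim3 blockDim;
      size_t sharedMem = 0;
      cudaStream_t stream = nullptr;
    };

    // The kernel must be a plain __global__ function (no function objects
    // or lambdas), matching the discussion above.
    template <typename... Args>
    void launch(void (*kernel)(Args...), LaunchParameters params, Args... args) {
      // The trailing nullptr keeps the array non-empty for zero-argument kernels.
      void* argPtrs[] = {static_cast<void*>(&args)..., nullptr};
      cudaCheck(cudaLaunchKernel(reinterpret_cast<const void*>(kernel),
                                 params.gridDim, params.blockDim, argPtrs,
                                 params.sharedMem, params.stream));
    }

A call would then look like launch(myKernel, {gridDim, blockDim}, devPtr, n); for a __global__ void myKernel(float*, int).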

fwyzard assigned makortel, fwyzard and waredjeb and unassigned makortel and fwyzard Oct 24, 2019

fwyzard commented Oct 24, 2019

I have asked @waredjeb to look into the cuda::memory operations, replacing the following (a rough before/after sketch follows the list):

  • cuda::memory::device::make_unique() with cudautils::make_device_unique()
  • cuda::memory[::async]::copy() with cudaMemcpy[Async]()
  • cuda::memory[::async]::zero() and ::set() with cudaMemset[Async]()
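
A rough before/after illustration of those replacements, using the cudaCheck() helper discussed in this thread (the buffer names are hypothetical and the wrapper calls in the comments are paraphrased; note that cudaMemcpyAsync needs an explicit direction, e.g. cudaMemcpyHostToDevice or cudaMemcpyDefault, which the wrappers deduced):

    #include <cuda_runtime.h>
    #include <cstddef>

    void copyAndZero(float* d_buf, const float* h_buf, size_t bytes, cudaStream_t stream) {
      // was (roughly): cuda::memory::async::copy(d_buf, h_buf, bytes, stream);
      cudaCheck(cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, stream));

      // was (roughly): cuda::memory::device::async::zero(d_buf, bytes, stream);
      cudaCheck(cudaMemsetAsync(d_buf, 0, bytes, stream));
    }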

@makortel

> cuda::memory::device::make_unique() with cudautils::make_device_unique()

cudautils::make_device_unique() is not a direct replacement, as it (currently) requires a CUDA stream (although I'm thinking of adding a variant that caches the allocation but is not "tied" to a CUDA stream).

IIRC all calls to cuda::memory::device::make_unique() are in unit tests. I was thinking that maybe the API wrappers could be acceptable there. On the other hand, using the CUDAStreamCache to get the stream for the allocator is not too hard either.
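For illustration, a hypothetical sketch of such a stream-taking make_device_unique() (not the actual cudautils implementation; it reuses the cudaCheck() helper from above):

    #include <cuda_runtime.h>
    #include <cstddef>
    #include <memory>

    template <typename T>
    using device_unique_ptr = std::unique_ptr<T, void (*)(T*)>;

    // Hypothetical: allocate n elements of device memory, associated with a
    // CUDA stream. A real caching allocator would record the stream and
    // reuse the block once the stream's queued work has completed.
    template <typename T>
    device_unique_ptr<T> make_device_unique(size_t n, cudaStream_t stream) {
      T* ptr = nullptr;
      cudaCheck(cudaMalloc(&ptr, n * sizeof(T)));
      (void)stream;  // placeholder; see the comment above
      return device_unique_ptr<T>(ptr, [](T* p) { cudaFree(p); });
    }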


fwyzard commented Oct 24, 2019

Would it work to explicitly pass stream 0 (actually nullptr) to use the default stream?

@makortel

> Would it work to explicitly pass stream 0 (actually nullptr) to use the default stream?

Probably, since all *Async API calls accept that (right?).


fwyzard commented Oct 24, 2019

I think they do.
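
For what it's worth, a tiny example of that pattern (hypothetical buffer name; passing nullptr as the stream argument targets the default stream):

    #include <cuda_runtime.h>
    #include <cstddef>

    void zeroOnDefaultStream(float* d_buf, size_t bytes) {
      // nullptr (stream 0) makes the *Async call run on the default stream
      cudaCheck(cudaMemsetAsync(d_buf, 0, bytes, nullptr));
    }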


fwyzard commented Oct 28, 2019

> @fwyzard About cuda::throw_if_error(): given that we are mostly using cudaCheck(), how about changing cudaCheck() so that a compile-time flag selects whether it calls abort() or throws an exception? (That way CMSSW master would throw an exception, while Patatrack and private forks could still use abort().)

Sorry @makortel, it looks like I never answered you: given that an uncaught exception results in an abort() anyway, can we just make the replacement for good?

@makortel

@fwyzard

> given that an uncaught exception results in an abort() anyway, can we just make the replacement for good?

I didn't follow whether you meant "cudaCheck() should throw an exception" or "cudaCheck() should abort".

(also, exceptions from places where we call CUDA APIs should get caught by the framework)


fwyzard commented Oct 28, 2019

I meant, cudaCheck() should just throw exceptions.

> exceptions from places where we call CUDA APIs should get caught by the framework

Right... but they would still result in a stack trace and in the job ending, wouldn't they?

Edit: see #398.


makortel commented Nov 7, 2019

#404 completes the work for streams and events.


makortel commented Nov 7, 2019

I can take care of cuda::device::current::scoped_override_t<> after #404 gets merged.
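
A plausible shape for such a replacement, sketched here as a simple RAII guard (names and details are illustrative, not necessarily the eventual implementation; cudaCheck() is the helper from earlier in the thread):

    #include <cuda_runtime.h>

    // Set the current CUDA device for the lifetime of the object,
    // restoring the previously current device on scope exit.
    class ScopedSetDevice {
    public:
      explicit ScopedSetDevice(int device) {
        cudaCheck(cudaGetDevice(&previous_));
        cudaCheck(cudaSetDevice(device));
      }
      ~ScopedSetDevice() {
        // Best effort in the destructor: ignore the error code on restore.
        cudaSetDevice(previous_);
      }
    private:
      int previous_;
    };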


waredjeb commented Nov 7, 2019

I can work on the remaining cuda::device calls!


fwyzard commented Nov 27, 2019

The last references to the CUDA API Wrappers were removed via #417.

fwyzard added the fixed label Nov 27, 2019
fwyzard closed this as completed Nov 27, 2019