CUDA.@profile: DataFrames post-processing needs to be optimized #2567
Labels: enhancement, help wanted, performance
Sanity checks (read this first, then remove this section)
Make sure you're reporting a bug; for general questions, please use Discourse or
Slack.
If you're dealing with a performance issue, make sure you disable scalar iteration
(CUDA.allowscalar(false)). Only file an issue if that shows scalar iteration
happening in CUDA.jl or Base Julia, as opposed to your own code.
If you're seeing an error message, follow the error message instructions, if any
(e.g. inspect code with @device_code_warntype). If you can't solve the problem
using that information, make sure to post it as part of the issue.
Always ensure you're using the latest version of CUDA.jl, and if possible, please
check the master branch to see if your issue hasn't been resolved yet.
If your bug is still valid, please go ahead and fill out the template below.
Describe the bug
Wrapping a computation that takes 20 seconds with CUDA.@profile requires multiple
minutes of post-processing in my case.
Trace:
To reproduce
N/A
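No reproducer was provided with the report. As a rough sketch only, the overhead could be measured by timing the same workload with and without CUDA.@profile. The `workload` function, its array size, and the launch count `n` are hypothetical stand-ins, not the original computation. The sketch also assumes that most of the post-processing happens before the macro returns, so displaying the returned profile may add further time that is not measured here.

```julia
using CUDA

# Hypothetical workload: many small kernel launches, so the profiler
# has a large number of events to record and post-process.
function workload(n)
    a = CUDA.rand(Float32, 1024)
    for _ in 1:n
        a .= a .+ 1f0          # each broadcast launches one kernel
    end
    CUDA.synchronize()         # wait for all queued kernels to finish
    return nothing
end

n = 10_000                     # assumption: increase to make the effect more visible

# Warm up both code paths so compilation time is not measured.
workload(10)
CUDA.@profile trace=true workload(10)

# Time the plain run and the profiled run of the same workload.
t_plain = @elapsed workload(n)
t_profiled = @elapsed begin
    CUDA.@profile trace=true workload(n)
end

println("plain:    ", round(t_plain; digits=3), " s")
println("profiled: ", round(t_profiled; digits=3), " s")
println("overhead: ", round(t_profiled - t_plain; digits=3), " s")
```

If the overhead grows much faster than `n`, that would point at the post-processing step rather than at the capture itself.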
Expected behavior
A clear and concise description of what you expected to happen.
Version info
Details on Julia:
Details on CUDA:
Additional context
Add any other context about the problem here.