Implement interface for data transfer across GPU devices. #2308
Conversation
Also, things seem to be working fine here, except when I try to directly print the new model:

julia> using Flux, CUDA

julia> m = Dense(2 => 3);

julia> device1 = Flux.get_device(CUDABackend, UInt(1));

julia> m = m |> device1
Dense(2 => 3)       # 9 parameters
Error showing value of type Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}:
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/tVtYo/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/tVtYo/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuMemcpyDtoHAsync_v2
@ ~/.julia/packages/CUDA/tVtYo/lib/utils/call.jl:26 [inlined]
[4] #unsafe_copyto!#8
@ ~/.julia/packages/CUDA/tVtYo/lib/cudadrv/memory.jl:397 [inlined]
[5] (::CUDA.var"#1014#1015"{Bool, Vector{Bool}, Int64, CuArray{Bool, 2, CUDA.Mem.DeviceBuffer}, Int64, Int64})()
@ CUDA ~/.julia/packages/CUDA/tVtYo/src/array.jl:482
[6] #context!#887
@ ~/.julia/packages/CUDA/tVtYo/lib/cudadrv/state.jl:170 [inlined]
[7] context!
@ ~/.julia/packages/CUDA/tVtYo/lib/cudadrv/state.jl:165 [inlined]
[8] unsafe_copyto!(dest::Vector{Bool}, doffs::Int64, src::CuArray{Bool, 2, CUDA.Mem.DeviceBuffer}, soffs::Int64, n::Int64)
@ CUDA ~/.julia/packages/CUDA/tVtYo/src/array.jl:475
[9] copyto!
@ ~/.julia/packages/CUDA/tVtYo/src/array.jl:429 [inlined]
[10] getindex
@ ~/.julia/packages/GPUArrays/5XhED/src/host/indexing.jl:12 [inlined]
[11] macro expansion
@ ~/.julia/packages/GPUArraysCore/uOYfN/src/GPUArraysCore.jl:136 [inlined]
[12] _mapreduce(f::ComposedFunction{typeof(!), typeof(iszero)}, op::typeof(|), As::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}; dims::Colon, init::Nothing)
@ GPUArrays ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:73
[13] _mapreduce
@ ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:35 [inlined]
[14] #mapreduce#29
@ ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:31 [inlined]
[15] mapreduce
@ ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:31 [inlined]
[16] any
@ ~/.julia/packages/GPUArrays/5XhED/src/host/mapreduce.jl:82 [inlined]
[17] _any
@ ~/fluxml/Flux.jl/src/layers/show.jl:129 [inlined]
[18] (::Flux.var"#330#331"{ComposedFunction{typeof(!), typeof(iszero)}})(x::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Flux ~/fluxml/Flux.jl/src/layers/show.jl:131
[19] _any(f::Flux.var"#330#331"{ComposedFunction{typeof(!), typeof(iszero)}}, itr::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}}, #unused#::Colon)
@ Base ./reduce.jl:1215
[20] any
@ ./reduce.jl:1210 [inlined]
[21] _any
@ ~/fluxml/Flux.jl/src/layers/show.jl:131 [inlined]
[22] _all
@ ~/fluxml/Flux.jl/src/layers/show.jl:135 [inlined]
[23] _nan_show(io::IOContext{Base.TTY}, x::Zygote.Params{Zygote.Buffer{Any, Vector{Any}}})
@ Flux ~/fluxml/Flux.jl/src/layers/show.jl:120
[24] _layer_show(io::IOContext{Base.TTY}, layer::Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, indent::Int64, name::Nothing)
@ Flux ~/fluxml/Flux.jl/src/layers/show.jl:86
[25] _layer_show
@ ~/fluxml/Flux.jl/src/layers/show.jl:75 [inlined]
[26] show(io::IOContext{Base.TTY}, m::MIME{Symbol("text/plain")}, x::Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
@ Flux ~/fluxml/Flux.jl/src/layers/show.jl:67
[27] (::REPL.var"#55#56"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:276
[28] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:557
[29] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:262
[30] display
@ ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:281 [inlined]
[31] display(x::Any)
@ Base.Multimedia ./multimedia.jl:340
[32] #invokelatest#2
@ ./essentials.jl:816 [inlined]
[33] invokelatest
@ ./essentials.jl:813 [inlined]
[34] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:305
[35] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:287
[36] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:557
[37] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:285
[38] (::REPL.var"#do_respond#80"{Bool, Bool, REPL.var"#93#103"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:899
[39] #invokelatest#2
@ ./essentials.jl:816 [inlined]
[40] invokelatest
@ ./essentials.jl:813 [inlined]
[41] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
@ REPL.LineEdit ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/LineEdit.jl:2647
[42] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
@ REPL ~/julia-1.9.2/share/julia/stdlib/v1.9/REPL/src/REPL.jl:1300
[43] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
@ REPL ./task.jl:514

I'm not exactly sure where the illegal memory access is happening. If I manually try to run the code for …
I believe you have to switch to …
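(The reply above is cut off in this capture. Given the stack trace, one plausible reading is that the task-local CUDA device has to match the device holding the parameters before the model is displayed; CUDA.jl's `device!` performs that switch. A minimal sketch, with illustrative device indices — not necessarily the fix the commenter meant:)

```julia
using Flux, CUDA

m = Dense(2 => 3)
device1 = Flux.get_device(CUDABackend, UInt(1))
m = m |> device1                      # parameters now live on CUDA device 1

CUDA.device!(1)                       # make device 1 the task-local device...
show(stdout, MIME"text/plain"(), m)   # ...so display reads memory from the right context
```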
Care to try this for the other GPU backends as well?
Yes, will update the PR. Thanks for the suggestions.
I think the interface should be simpler. I suggest `device = get_device("CUDA", 1)`.
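(A minimal sketch of what that simpler, string-keyed entry point could look like. It forwards to the `Flux.get_device(CUDABackend, ::UInt)` method shown earlier in this thread; the backend-name mapping and the error message are illustrative assumptions, not the PR's actual implementation:)

```julia
# Hypothetical front end: map a backend name string to the backend-specific getter.
function get_device(backend::String, idx::Integer)
    backend == "CUDA" && return Flux.get_device(CUDABackend, UInt(idx))
    # "AMD" and "Metal" would forward to their backend types analogously.
    error("Unsupported GPU backend: $backend")
end

device = get_device("CUDA", 1)   # reads like the reviewer's suggestion
```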
Hi @ToucheSir. I can surely attempt to implement this for Metal and AMD, but I don't have access to these GPUs (so I won't be able to test on them). Just wondering: is there any way for me to test AMD/Metal code?
Codecov Report

Patch coverage: …

Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #2308      +/-   ##
==========================================
- Coverage   79.47%   75.23%    -4.24%
==========================================
  Files          31       31
  Lines        1749     1805      +56
==========================================
- Hits         1390     1358      -32
- Misses        359      447      +88

☔ View full report in Codecov by Sentry.
Probably not locally, but since you'll be writing tests those should be auto-tested by CI. Also happy to help test the AMDGPU routines locally once those are in a good place.
I've implemented the same interface for … Any nice way to get the ordinal for a Metal device? Maybe we can match our device in the …
I wasn't aware Metal.jl had this limitation. If there's no clean way to do it now, we can always disable specific device selection for it.
Okay, I don't see a clean way to do this at the moment. So I'll skip adding this functionality to Metal for now.
Added some tests for CUDA and AMD. How do they look? Next I'll add some documentation.
Getting a weird error in Buildkite, which says: …
ext/FluxCUDAExt/functor.jl (outdated):

adapt_storage(to::FluxCUDAAdaptor, x) = CUDA.cu(x)
function adapt_storage(to::FluxCUDAAdaptor, x::AbstractArray)
    typeof(to.ordinal) <: Nothing && return CUDA.cu(x)
Suggested change:

-    typeof(to.ordinal) <: Nothing && return CUDA.cu(x)
+    to.ordinal === nothing && return CUDA.cu(x)
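(For context, a minimal sketch of what the ordinal-aware method could look like with the suggested change applied. The `device!`-based body is an assumption built on CUDA.jl's public API, not the PR's exact code; `FluxCUDAAdaptor` and its `ordinal` field come from the diff above:)

```julia
using CUDA
import Adapt

# Hypothetical sketch: move x to the CUDA device named by the adaptor's ordinal,
# then restore the caller's device so the switch doesn't leak.
function Adapt.adapt_storage(to::FluxCUDAAdaptor, x::AbstractArray)
    to.ordinal === nothing && return CUDA.cu(x)  # no device requested: default path
    old_device = CUDA.device()   # remember the caller's current device
    CUDA.device!(to.ordinal)     # switch to the requested (0-based) ordinal
    x_new = CUDA.cu(x)           # the copy is allocated on the now-current device
    CUDA.device!(old_device)     # restore the caller's device
    return x_new
end
```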
This seems good to go. You could remove the draft status.
Sure. I'll add some documentation.
Completed everything. Please let me know if something needs to be changed.
This PR addresses issue #2302. Here, the goal is to pass device information in the adaptors (`FluxCUDAAdaptor`, `FluxAMDAdaptor` and `FluxMetalAdaptor` respectively), which will then be used to do data transfer at the parameter level only.

Also, new `Flux.get_device` methods will be added in the extensions, which can return a `Flux.AbstractDevice` with a specific backend and ordinal.

Example for data movement with CUDA:
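(The collapsed example from the PR description isn't preserved in this capture; the sketch below reconstructs the usage shown earlier in the thread and is illustrative, not the description's verbatim snippet. The `typeof` output matches the array type seen in the stack trace above.)

```julia
julia> using Flux, CUDA

julia> m = Dense(2 => 3);                               # parameters start on the CPU

julia> device1 = Flux.get_device(CUDABackend, UInt(1)); # device object for CUDA device 1

julia> m = m |> device1;                                # parameters now live on CUDA device 1

julia> typeof(m.weight)
CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
```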
cc @CarloLucibello @ToucheSir @darsnack.
PR Checklist

- … `adapt_storage` for data types other than `AbstractArray`?