I am running the Dagger benchmarks on an AMD Zen system with 64 cores. The "raw" benchmarks are working fine, but the "dagger" benchmarks abort with an error (see below).
I am using Julia 1.9:
julia> versioninfo()
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × AMD EPYC 7302 16-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver2)
  Threads: 1 on 32 virtual cores
Environment:
  LD_LIBRARY_PATH = /cm/shared/apps/slurm/current/lib64/slurm:/cm/shared/apps/slurm/current/lib64
This is likely due to JuliaLang/Distributed.jl#73, since you're combining multiple Distributed processes with multithreading. You could try re-running with Julia master and JuliaLang/Distributed.jl#4 to avoid this issue. Note that I haven't yet tested Dagger on the latest Julia builds, so things may be broken elsewhere.
Even with those PRs, I still get a variety of fun errors and runtime assertions. Some appear to be Dagger bugs, so I'll try to narrow those down first.
OK, I have working benchmarks with JuliaLang/Distributed.jl#4 and a local branch based on Dagger master (master itself should be fine). As long as the head node (worker 1) runs with only the default single thread, all is well, at least at a small scale of 2 workers with 2 threads each. Using multiple threads on worker 1 causes me to hit assertions in julia-debug, but those are very likely Julia bugs that only Dagger happens to trigger (related to WeakRef usage, which is known to be buggy).
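For anyone trying to reproduce that configuration, here is a minimal sketch. It assumes local workers started through the standard Distributed `addprocs` call and a head process launched with the default single thread (for example `julia --threads=1`). The actual benchmark script may set its workers up differently, for instance through a cluster manager, so treat this only as an illustration of the thread layout described above.

```julia
using Distributed

# Assumption: the head process (worker 1) was started with the default
# single thread, e.g. `julia --threads=1`.

# Start 2 local workers, each with 2 threads.
addprocs(2; exeflags="--threads=2")

# Report the thread count seen by each process, including the head process.
for pid in procs()
    n = remotecall_fetch(Threads.nthreads, pid)
    println("process ", pid, ": ", n, " thread(s)")
end
```

If the head process reports more than one thread here, the setup matches the case that hit assertions rather than the working one.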
I am also using a recent version of Dagger (Dagger v0.18.3). The error message is: