-
I compiled my RPC servers in debug mode and noticed that the first RPC backend specified in the list of RPC servers crashes because the number of nodes changes. Any idea why this would happen?
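In case it helps anyone reproduce this, a debug build of the RPC backend can be configured roughly like the sketch below (it assumes a recent llama.cpp checkout, where the RPC backend sits behind the `GGML_RPC` CMake option; the host, port, and target names are illustrative):

```sh
# Configure with debug symbols and the RPC backend enabled
cmake -B build -DCMAKE_BUILD_TYPE=Debug -DGGML_RPC=ON
cmake --build build --target rpc-server llama-server

# Run the backend under gdb to get a backtrace on the crash
gdb --args ./build/bin/rpc-server -H 0.0.0.0 -p 50052
```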
-
Hello,
I'm having some issues with `llama-server` benchmarking with RPC backends. When running with local GPUs there are only some issues, but whenever `llama-server` is running with RPC, the RPC backend will crash with a `segmentation fault` after the second iteration. Any advice on how to get the segmentation faults to stop? I'm running the line below for the RPC backends:
I've tried messing around with the defragmentation and smaller batch sizes with `llama-server`, but it doesn't seem to help. The line I use to run my server is:
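For reference, a typical invocation looks something like this sketch (the host addresses, port, model path, and tuning values below are placeholders, not the exact values used in this report):

```sh
# On each RPC machine: start an RPC backend (placeholder host/port)
rpc-server -H 0.0.0.0 -p 50052

# On the head node: point llama-server at the RPC backends.
# --defrag-thold and -b/-ub are the defragmentation and batch-size
# knobs mentioned above.
llama-server -m ./model.gguf \
  --rpc 192.168.1.10:50052,192.168.1.11:50052 \
  -ngl 99 -b 512 -ub 256 --defrag-thold 0.1
```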
Below is a screenshot of my "successful" run using 8 local AMD GPUs. I noticed that the data sent and received is `0 B/s`. This is something I'd like to confirm with anyone, to see if they had similar results.

The only way I could get this to work was modifying `script.js` and adding logic to handle `sse.open` for when `llama-server` returns `[DONE]`.
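A minimal sketch of that kind of change (not the exact patch; it assumes `sse` is an `EventSource`, and relies on llama-server's OpenAI-compatible streaming endpoints terminating the stream with a literal `[DONE]` sentinel):

```js
// Sketch of the script.js change, not the exact patch.
// Assumes `sse` is an EventSource connected to a streaming endpoint.
sse.addEventListener("message", (event) => {
  // The OpenAI-compatible stream ends with a bare "[DONE]" sentinel
  // that is not JSON; parsing it throws, and an EventSource left open
  // will keep auto-reconnecting.
  if (event.data.trim() === "[DONE]") {
    sse.close(); // end of stream: close cleanly instead of reconnecting
    return;
  }
  const chunk = JSON.parse(event.data); // regular streamed token payload
  appendToken(chunk); // hypothetical helper that renders the token
});
```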
Below are a few lines that are generated when using `llama-server` without defragmentation. This is with using the 8 local GPUs only: