Negative values for tokens per sec for Mixtral benchmarks #1856
Unanswered
prasad-nair-amd
asked this question in
Q&A
Replies: 0 comments
I am using the NVIDIA-provided benchmark scripts to run the Mixtral 8x7B model on an H100. Why is it reporting a negative value for token throughput?
[BENCHMARK] num_samples 1
[BENCHMARK] num_error_samples 0
[BENCHMARK] num_samples 1
[BENCHMARK] total_latency(ms) 4610.37
[BENCHMARK] seq_throughput(seq/sec) 0.22
[BENCHMARK] token_throughput(token/sec) -2385.71
[BENCHMARK] avg_sequence_latency(ms) 4608.56
[BENCHMARK] max_sequence_latency(ms) 4608.56
[BENCHMARK] min_sequence_latency(ms) 4608.56
[BENCHMARK] p99_sequence_latency(ms) 4608.56
[BENCHMARK] p90_sequence_latency(ms) 4608.56
[BENCHMARK] p50_sequence_latency(ms) 4608.56
[BENCHMARK] avg_time_to_first_token(ms) 173.11
[BENCHMARK] max_time_to_first_token(ms) 173.11
[BENCHMARK] min_time_to_first_token(ms) 173.11
[BENCHMARK] p99_time_to_first_token(ms) 173.11
[BENCHMARK] p90_time_to_first_token(ms) 173.11
[BENCHMARK] p50_time_to_first_token(ms) 173.11
[BENCHMARK] avg_inter_token_latency(ms) 0.00
[BENCHMARK] max_inter_token_latency(ms) 0.00
[BENCHMARK] min_inter_token_latency(ms) 0.00
[BENCHMARK] p99_inter_token_latency(ms) 0.00
[BENCHMARK] p90_inter_token_latency(ms) 0.00
[BENCHMARK] p50_inter_token_latency(ms) 0.00
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
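One quick sanity check on the numbers above is to back out the token count that the reported throughput implies. The sketch below assumes token_throughput is computed as total output tokens divided by total latency; that formula is an assumption, not something confirmed from the benchmark source, and the helper name `implied_token_count` is hypothetical.

```python
def implied_token_count(token_throughput: float, total_latency_ms: float) -> float:
    """Back out the token count, ASSUMING throughput = tokens / latency_in_seconds."""
    return token_throughput * (total_latency_ms / 1000.0)


# Values copied from the [BENCHMARK] lines in the log above.
token_throughput = -2385.71   # token_throughput(token/sec)
total_latency_ms = 4610.37    # total_latency(ms)
num_samples = 1               # num_samples

tokens = implied_token_count(token_throughput, total_latency_ms)
print(f"implied output token count: {tokens:.1f}")

# Cross-check: seq_throughput should be num_samples / latency, matching the log's 0.22.
seq_throughput = num_samples / (total_latency_ms / 1000.0)
print(f"recomputed seq_throughput: {seq_throughput:.2f}")
```

Under that assumption the implied token count comes out at roughly -10999, while the recomputed seq_throughput matches the reported 0.22. That would suggest the latency measurement is fine and the token count itself is negative, possibly from a subtraction such as output length minus input length. This is a guess; the log alone does not show how the count is derived.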