Improve docs
Signed-off-by: Igor Gitman <[email protected]>
Kipok committed Dec 12, 2024
1 parent 66d2899 commit 58f3b90
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/pipelines/llm-as-a-judge.md
@@ -26,8 +26,10 @@ ns generate \
++input_dir=/workspace/test-eval/eval-results/math
```

-This will run the judge pipeline on all data inside `eval-results/math` folder and judge solutions from `output.jsonl` file.
+This will run the judge pipeline on the data inside the `eval-results/math` folder and judge solutions from the `output.jsonl` file.
+If you ran the benchmark with N samples (e.g. using `math:8`) and want to judge all of them, add `--num_random_seeds=8`.
+Note that if you want to judge both greedy generations and samples, you'd need to run the command twice.

In this example we use gpt-4o from OpenAI, but you can use Llama-405B (which you can host on a cluster yourself) or any
other model. If you have multiple benchmarks, you would need to run the command multiple times.
After the judge pipeline has finished, you can see the results by running
