Actions: stanford-crfm/helm

Scenario tests

82 workflow runs

Comments addressed for MMLU-PRO Non COT
Scenario tests #59: Pull request #3125 synchronize by siyagoel
October 31, 2024 23:57 11m 23s siyagoel/mmluprofinal
Comments addressed for MMLU-PRO Non COT
Scenario tests #58: Pull request #3125 synchronize by siyagoel
October 31, 2024 23:46 8m 25s siyagoel/mmluprofinal
Changed MMLU Pro for Non-COT Version (#3108)
Scenario tests #57: Commit b92b93f pushed by siyagoel
October 31, 2024 21:35 8m 16s main
Adding the IFEval scenario
Scenario tests #56: Pull request #3122 synchronize by liamjxu
October 31, 2024 20:08 7m 47s jialiang/ifeval
Adding the IFEval scenario
Scenario tests #55: Pull request #3122 synchronize by liamjxu
October 31, 2024 19:46 5m 8s jialiang/ifeval
Scenario tests
Scenario tests #54: Scheduled
October 31, 2024 15:35 11m 26s main
Adding the IFEval scenario
Scenario tests #53: Pull request #3122 synchronize by liamjxu
October 31, 2024 06:48 5m 27s jialiang/ifeval
Adding the IFEval scenario
Scenario tests #52: Pull request #3122 opened by liamjxu
October 31, 2024 06:46 5m 6s jialiang/ifeval
Scenario tests
Scenario tests #51: Scheduled
October 30, 2024 15:35 10m 13s main
Scenario tests
Scenario tests #50: Scheduled
October 29, 2024 15:36 10m 56s main
Changed MMLU Pro for Non-COT Version
Scenario tests #49: Pull request #3108 opened by siyagoel
October 28, 2024 21:41 8m 27s siyagoel/mmluprofinal
Scenario tests
Scenario tests #48: Scheduled
October 28, 2024 15:35 11m 58s main
Scenario tests
Scenario tests #47: Scheduled
October 27, 2024 15:34 11m 5s main
Scenario tests
Scenario tests #46: Scheduled
October 26, 2024 15:34 13m 41s main
Scenario tests
Scenario tests #45: Scheduled
October 25, 2024 15:35 11m 12s main
Scenario tests
Scenario tests #44: Scheduled
October 24, 2024 15:35 11m 18s main
Scenario tests
Scenario tests #43: Scheduled
October 23, 2024 15:35 11m 10s main
Adding train examples to GPQA (#3078)
Scenario tests #42: Commit d21e0f5 pushed by yifanmai
October 22, 2024 20:29 10m 17s main
Add Gold Commodity News scenario (#3065)
Scenario tests #41: Commit 364be45 pushed by yifanmai
October 22, 2024 18:22 9m 37s main
Add Gold Commodity News scenario
Scenario tests #40: Pull request #3065 synchronize by yifanmai
October 22, 2024 16:47 9m 48s yifanmai/fix-gold-commodity-news
Few-shot run_specs for GPQA, CoT not included yet
Scenario tests #39: Pull request #3079 synchronize by liamjxu
October 22, 2024 05:30 9m 33s jialiang/gpqa_run_specs
Few-shot run_specs for GPQA, CoT not included yet
Scenario tests #38: Pull request #3079 opened by liamjxu
October 22, 2024 05:21 7m 42s jialiang/gpqa_run_specs
Adding train examples to GPQA
Scenario tests #37: Pull request #3078 opened by liamjxu
October 21, 2024 23:15 9m 33s jialiang/gpqa_include_train
GPQA scenario and tests added (#3068)
Scenario tests #36: Commit f5ca627 pushed by yifanmai
October 21, 2024 17:49 10m 27s main
GPQA scenario and tests added
Scenario tests #35: Pull request #3068 synchronize by liamjxu
October 21, 2024 07:41 10m 5s jialiang/add_gpqa