Results and logs of LLM-KG-Bench runs described in the article "Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering", Meyer et al., to appear in the SEMANTICS 2023 poster track proceedings.
We collected data in multiple runs, each resulting in files with the date and time of the experiment start in the filename (see the loading sketch after the list):
- result files (`.json`, `.txt`, `.yaml`; same info in different serializations) containing task, response and evaluation data
- model log files (`.jsonnl` in directory `modelLog`) containing details on the LLM interaction
- full log files (`.log` in directory `log`) containing the debug log for the runs