The collaborative AI arena is intended to serve as a basis for evaluating collaboration between AI and humans. The idea is to design and evaluate tasks that are performed with input from both sides, and to determine whether the collaboration was successful. Users should be presented with a task that they perform together with the AI, and a metric should be used to evaluate the different
This repository contains most of the code necessary to add a Model or a Task.
There are two tasks, poetry and tangram, currently available as examples. Poetry has a frontend created in ReactJS, and tangram has one in VueJS that embeds a Godot game.
The prompts for the tasks are located at task_template/app/task_examples.
For the poetry task, the prompt is in poetry.py; for the tangram task, it is in tangram.py.
The current prompts serve as a starting point for the tasks, but they are by no means optimized. We therefore highly recommend that you experiment with prompt engineering until they fit whatever task you have in mind.
There are two example frontends. One is for a poetry task and is built in ReactJS; the other is for a tangram game and uses VueJS with an embedded Godot game.
The main idea of the task is that the user submits the theme/objective for the poem and then collaborates with the model to complete it. However, this task is not limited to poem writing, as you can freely switch between a line-by-line rendering (poem) and a continuous rendering (paragraph). For a more detailed explanation, please go here
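To illustrate the difference between the two rendering modes, here is a minimal sketch. It is not taken from the poetry frontend (which is written in ReactJS); the function name `render_text` and the mode names `"poem"` and `"paragraph"` are assumptions made for this example only.

```python
from typing import List


def render_text(lines: List[str], mode: str = "poem") -> str:
    """Join collaboratively written lines for display.

    Hypothetical helper, not part of the repository.
    mode="poem" puts each contribution on its own line;
    mode="paragraph" joins them into one continuous block.
    """
    # Drop empty contributions and surrounding whitespace.
    cleaned = [line.strip() for line in lines if line.strip()]
    if mode == "poem":
        return "\n".join(cleaned)
    if mode == "paragraph":
        return " ".join(cleaned)
    raise ValueError(f"unknown mode: {mode!r}")
```

The same list of human and AI contributions can be shown either way, which is why the task is not tied to poems.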
The idea is that the human and the AI work together to build something with a tangram game. At the moment, the provided game is restricted to two pieces, so there is only a limited number of options. The game represents just one possibility and can be replaced by whatever the user wants. For a more detailed explanation, please go here
At the moment, the system supports two models, gpt4-turbo and gpt4-o, which are located inside the folder model_template/models.
You can switch between the two models by setting the variable ai_model in the file model_template/model.py to either OpenAIImageModel() or OpenAIModel().
To add your own models to the system, follow the template located in the file basemodel.py and use the existing model files as guidance.
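As a rough illustration of the template-plus-subclass pattern, the sketch below defines a stand-in base class and one custom model. Everything in it is hypothetical: `SketchBaseModel`, `EchoModel` and the method `get_response` are invented for this example, and the actual interface in basemodel.py may look quite different, so check that file before writing a real model.

```python
from abc import ABC, abstractmethod


class SketchBaseModel(ABC):
    """Hypothetical stand-in for the template in basemodel.py."""

    @abstractmethod
    def get_response(self, prompt: str) -> str:
        """Return the model's reply to a prompt (assumed method name)."""


class EchoModel(SketchBaseModel):
    """Toy model that repeats the prompt; handy for testing without an API key."""

    def get_response(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Mirrors how the source describes selecting a model via the ai_model variable.
ai_model = EchoModel()
```

A model that makes no network calls, like the toy one above, can be useful for checking the rest of the pipeline before switching to a real model.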
The task_template and model_template folders contain template applications for deployment on the AI Builder infrastructure.
They show how to implement a model and a task, and they supply most of the necessary infrastructure so that users need only minimal changes to adapt their own code.
Details are provided in the respective README files.
To test things locally and see if they work, we provide a docker compose file along with a simple orchestrator.
To run locally, you need Docker installed.
You also need the following environment variables set:
OPENAI_API_KEY - an OpenAI access key.
SSL_KEY - a valid OpenSSL certificate key
SSL_CERTIFICATE - a valid OpenSSL certificate
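Before starting the containers, it can help to confirm that all three variables are set. The following is a small sketch of such a check; the helper `missing_env_vars` is not part of the repository and is only an illustration.

```python
import os
from typing import List, Mapping, Optional

# Variable names taken from the list above.
REQUIRED_VARS = ("OPENAI_API_KEY", "SSL_KEY", "SSL_CERTIFICATE")


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty.

    Hypothetical helper; `env` defaults to the current process environment.
    """
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the current environment; an empty list means everything needed is present.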
The template model currently uses an Aalto-specific endpoint for computation.
You will likely need to change the model used in model_template/model.py to an OpenAI model and use that for testing.
We provide two tasks that can be used, either a poetry task or a tangram task. To run them call:
docker compose -f docker-compose_tangram.yaml up --build
for the tangram task and
docker compose -f docker-compose_poetry.yaml up --build
for the poetry task.
Only one of the tasks can be run at a time.
After that, the frontend should be accessible via https://localhost:8062.
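The two commands above differ only in the compose file name, so a launcher script could build them from the task name. This is a sketch under that assumption; `compose_command` is a hypothetical helper, and the file names follow the `docker-compose_<task>.yaml` pattern shown above.

```python
from typing import List

# The two example tasks named in this README.
TASKS = ("tangram", "poetry")


def compose_command(task: str) -> List[str]:
    """Build the docker compose invocation for one task (hypothetical helper)."""
    if task not in TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {TASKS}")
    return ["docker", "compose", "-f", f"docker-compose_{task}.yaml", "up", "--build"]
```

The returned list can be passed to `subprocess.run(compose_command("poetry"), check=True)`. Taking a single task name reflects the constraint that only one task runs at a time.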