Self-adaptive large language models (LLMs) aim to address the limitations of traditional fine-tuning, which is often computationally expensive and static in its ability to handle diverse tasks.
We are excited to introduce Transformer², a novel self-adaptation framework that adapts LLMs for unseen tasks in real time by selectively adjusting only the singular components of their weight matrices. During inference, Transformer² employs a two-pass mechanism: first, a dispatch system identifies the task properties, and then task-specific "expert" vectors, trained using reinforcement learning, are dynamically mixed to obtain targeted behavior for the incoming prompt.
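To make the two-pass mechanism concrete, here is a minimal PyTorch sketch of the core idea, not the repository's actual implementation: an expert vector rescales the singular values of a weight matrix (the "singular components" above), and at inference time the dispatch weights mix several expert vectors into one. The names `svf_adapt` and `mix_experts` and the toy dispatch weights are illustrative assumptions.

```python
import torch

def svf_adapt(W: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Rescale the singular values of W by an expert vector z.

    W: (m, n) weight matrix; z: (min(m, n),) learned scaling vector.
    """
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ torch.diag(S * z) @ Vh

def mix_experts(W: torch.Tensor, experts: list[torch.Tensor],
                alphas: torch.Tensor) -> torch.Tensor:
    """Second pass: blend expert vectors with dispatch weights `alphas`
    (e.g., produced by a first-pass task classifier), then adapt W."""
    z = sum(a * z_k for a, z_k in zip(alphas, experts))
    return svf_adapt(W, z)

# Toy usage: two experts, with the dispatch favoring the first.
W = torch.randn(8, 4)
experts = [torch.ones(4), torch.full((4,), 0.5)]
alphas = torch.tensor([0.7, 0.3])
W_adapted = mix_experts(W, experts, alphas)
```

In the paper, the expert vectors are trained with reinforcement learning on individual tasks, and the mixing weights come from the first-pass task identification; the toy values above only illustrate the mixing arithmetic.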
Clone the repository and set up the environment:

```bash
git clone https://github.com/SakanaAI/self-adaptive-llms
cd self-adaptive-llms

# Create and activate a conda environment
conda create -n t2 python=3.11 -y
conda activate t2

# Install Python dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Install the fishfarm evaluation package in editable mode
cd evaluation/fishfarm
pip install -e .
```
We provide example scripts for both training and evaluation. Please change the arguments in the provided scripts to choose among models and tasks.

To train a task-specific expert vector, run:

```bash
bash scripts/train_task_expert.sh
```
Classification experts can be loaded by specifying `CLS_EXPERT_PATH` in the evaluation script; a hedged example follows the commands below.
For prompt-based evaluation:

```bash
bash scripts/eval_prompt_based.sh
```

For few-shot evaluation:

```bash
bash scripts/eval_few_shot.sh
```
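As an illustration only, since the exact variable handling lives inside the scripts themselves, pointing the evaluation at a trained classification expert might look like the following; the checkpoint path is a placeholder:

```bash
# Illustrative snippet (assumed to be edited inside, e.g.,
# scripts/eval_prompt_based.sh): point CLS_EXPERT_PATH at your
# trained classification expert checkpoint.
CLS_EXPERT_PATH=/path/to/cls_expert_checkpoint
```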
If you find Transformer² useful for your research, please cite it using the following BibTeX:
```bibtex
@misc{sun2025texttransformer2selfadaptivellms,
      title={$\text{Transformer}^2$: Self-adaptive LLMs},
      author={Qi Sun and Edoardo Cetin and Yujin Tang},
      year={2025},
      eprint={2501.06252},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2501.06252},
}
```