Refactor PEER #123

oleksost · 2024-10-18T15:56:36Z

make sure PEER can be saved in a library
PEER container is also a modifier

TODO:

make sure the router is also saved
make sure training config is also saved
make sure it can also be loaded from a library

sordonia · 2024-10-24T12:59:42Z

mttl/models/containers/peer_container.py

+@dataclass
+class PEERConfig(ModifierConfig):
+    n_heads: int = 8
+    moe__num_experts: int = 100


sordonia · 2024-10-24T13:00:21Z

mttl/models/lightning/expert_module.py

+    def experts_names(self):
+        return self.model.experts_names
+
+    def get_expert_instance(self, name):


This is only available for multi-expert models; I am not sure we should add this dependency here.

sordonia · 2024-10-24T13:01:06Z

projects/modular_llm/eval_library.py

@@ -186,6 +190,10 @@ def run_eval(args: EvaluationConfig):
        module = MultiExpertModule(**vars(expert.training_config)).to("cuda")
        module.add_expert_instance(expert, is_default=True)

+    elif args.merge_or_route in ["peer"]:


can you explain what you are trying to do?

sordonia · 2024-10-24T13:02:03Z

projects/modular_llm/train_experts.py

@@ -200,7 +201,8 @@ def upload_library(expert_library, module):
                if isinstance(module, MoEModule):
                    with expert_library.batched_commit():
                        for expert_name in module.experts_names:
-                            expert = module.get_expert_instance(expert_name)
+                            expert: Expert = module.get_expert_instance(expert_name)
+                            expert.expert_info.training_config = args


Remove this. It is not a good idea to store a complex object in training_config; it will be transformed to a Dict in the next PR :)
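
To illustrate the concern, here is a minimal sketch of storing a plain dict instead of the live arguments object. It assumes the training arguments are a dataclass; `TrainArgs`, `ExpertInfo`, and `attach_training_config` are hypothetical stand-ins and not the mttl classes.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TrainArgs:
    # Hypothetical stand-in for the real training arguments object.
    learning_rate: float = 1e-4
    model_modifier: str = "peer"
    extra: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ExpertInfo:
    # Hypothetical stand-in: holds a name and a serializable config only.
    expert_name: str
    training_config: Optional[Dict[str, Any]] = None


def attach_training_config(info: ExpertInfo, args: TrainArgs) -> ExpertInfo:
    # asdict() recursively converts the dataclass into plain dicts and lists,
    # so the stored config no longer references the live args object.
    info.training_config = asdict(args)
    return info


info = attach_training_config(ExpertInfo("e0"), TrainArgs(learning_rate=3e-4))
```

A plain dict of this kind can be written to JSON or YAML without custom serializers, which is likely what makes it easier to keep in a library than the full arguments object.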

sordonia · 2024-10-24T13:03:20Z

mttl/models/containers/base.py

@@ -35,6 +35,11 @@ def __init__(self, config, layer, selector=None):
        self.selector = selector or TaskNameSelector()
        self._default_expert_name = None
        self.expert_infos = {}
+        self.experts = nn.ModuleDict({})


not all containers have "experts", see my last PR on making LoRA faster

sordonia · 2024-10-24T13:03:32Z

mttl/models/containers/base.py

+        self.experts = nn.ModuleDict({})
+
+    @property
+    def num_experts(self):


let's use len(self)
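
A minimal sketch of what this could look like, assuming the container counts experts from its `expert_infos` dict (the real base class may compute its length differently). `ToyContainer` and `add_expert` are hypothetical names used only for illustration.

```python
from typing import Dict


class ToyContainer:
    # Hypothetical stand-in for an expert container; not the mttl base class.
    def __init__(self) -> None:
        self.expert_infos: Dict[str, dict] = {}

    def add_expert(self, name: str, info: dict) -> None:
        self.expert_infos[name] = info

    def __len__(self) -> int:
        # Count registered experts from the bookkeeping dict, so the count
        # does not depend on an `experts` ModuleDict being present.
        return len(self.expert_infos)


container = ToyContainer()
container.add_expert("e0", {})
container.add_expert("e1", {})
n_experts = len(container)
```

One thing to keep in mind with this design: defining `__len__` makes an empty container falsy, so checks such as `if container:` behave differently from `if container is not None:`.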

oleksost added 2 commits October 18, 2024 10:59

peer contained is modifier

0f2c6f9

make sure peer can be stored to library

2ff6d89

oleksost requested a review from sordonia October 18, 2024 15:56

oleksost added 4 commits October 18, 2024 11:57

black

8abb4de

black

9082f0a

save and load peer expert

8937c66

comment

002cb28

sordonia reviewed Oct 24, 2024
