Dear @flowersteam, I am trying to reproduce your results for coursework, and I have run into a number of issues with the code. Here is a list of what I have found so far.
Importing
Several files within the `experiments` folder, e.g. `experiments/agents/ppo/base_ppo_agent.py`, use imports of the form `from experiments.agents.etc import etc`. These imports fail because, as per my tests, the files themselves already live inside the `experiments` folder. Simply dropping the `experiments` prefix from the import path fixes these issues; see the sketch below.
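For illustration, a minimal before/after (I am assuming `base_ppo_agent.py` defines `BasePPOAgent`, as its name and the superclass mentioned below suggest; any import carrying the `experiments.` prefix needs the same one-line change):

```python
# Before: fails, because the script already runs from inside experiments/
# from experiments.agents.ppo.base_ppo_agent import BasePPOAgent

# After: resolves correctly relative to the experiments/ folder
from agents.ppo.base_ppo_agent import BasePPOAgent
```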
Class Naming
Some classes fail to import because, I believe, they have been renamed over time. This is the case for, e.g., `ValueModuleFn` (imported in `post-training_tests.py`), which I assume is now called `ValueHeadModuleFn`.
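The corresponding one-line fix in `post-training_tests.py` (the module path here is an assumption for illustration; only the class name changes):

```python
# Old class name, no longer present in the codebase:
# from train_language_agent import ValueModuleFn

# Renamed class (module path assumed unchanged):
from train_language_agent import ValueHeadModuleFn
```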
Training Using Lamorel (on one GPU)
After fixing these minor issues, I managed to launch training with the default single-GPU config, using the Flan-T5 small LLM and running Lamorel as two separate server-client instances as described in #11, but obtained odd results: the reward stays flat at 0.0 for all episodes. I attach my logs std.txt and final results log.csv for clarity.
Testing Using Lamorel (on one GPU)
I could not manage to run the `post-training_tests.py` script the same way as the training script, as it has several issues of its own. Using once again the standard Lamorel config for one GPU, two server-client instances, and the previously trained model, errors occur when initializing the `LLMPPOAgent` named `algo` at line 390. Once again, I believe the initialization signature has changed over time, as several parameters that require specification are missing. For instance, `num_frames_per_proc` defaults to `None`, but keeping that default raises an error when `self.obss` is initialized in the `BasePPOAgent` superclass. The required logging information (`saving_path_logs`, `saving_path_model`, `id_expe`) is not passed either. Also, some optional parameters (`reshape_reward`, `subgoals`) are not passed by keyword (e.g., `LLMPPOAgent(..., reshape_reward=reshape_reward)`), so their values end up bound to the wrong parameters when the object is constructed. Again, these were easy fixes; a sketch of the corrected call follows this paragraph. However, I then got stuck on an error occurring at line 310, whose full log is reported below the sketch.
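For concreteness, this is roughly the call I ended up with (every value here is a placeholder from my run, not a repository default; only the parameters discussed above are spelled out, and the remaining arguments stay as in the original script):

```python
# Sketch of the corrected LLMPPOAgent initialization (post-training_tests.py, line 390).
algo = LLMPPOAgent(
    envs=envs,                          # environments, as in the original call
    lm_server=lm_server,
    num_frames_per_proc=40,             # placeholder; must be set explicitly, since the
                                        # default None breaks self.obss in BasePPOAgent
    reshape_reward=reshape_reward,      # passed by keyword so it binds to the right parameter
    subgoals=subgoals,                  # likewise
    saving_path_logs=saving_path_logs,  # required logging information
    saving_path_model=saving_path_model,
    id_expe=id_expe,
)
```

The full log from that run: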
/Absolute/Path/To/Grounding_LLMs_with_online_RL/lamorel/lamorel/src/lamorel_launcher/launch.py:15: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path='', config_name='')
/Absolute/Path/To/miniconda3/envs/dlp/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[DATE TIME][root][INFO] - Using nproc_per_node=1.
/Absolute/Path/To/Grounding_LLMs_with_online_RL/experiments/train_language_agent.py:373: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path='config', config_name='config')
/Absolute/Path/To/Grounding_LLMs_with_online_RL/experiments/post-training_tests.py:281: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path='config', config_name='config')
/Absolute/Path/To/miniconda3/envs/dlp/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[DATE TIME][lamorel_logger][INFO] - Init rl group for process 1
[DATE TIME][lamorel_logger][INFO] - Init llm group for process 1
[DATE TIME][lamorel_logger][INFO] - Init rl-llm group for process 1
[DATE TIME][lamorel_logger][INFO] - Using CPU on process 1 (index 0)
Parallelising HF LLM on 1 devices
Loading model t5-small
/Absolute/Path/To/miniconda3/envs/dlp/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Error executing job with overrides: ['rl_script_args.path=/Absolute/Path/To/Grounding_LLMs_with_online_RL/experiments/post-training_tests.py', 'lamorel_args.accelerate_args.machine_rank=1']
Traceback (most recent call last):
File "/Absolute/Path/To/Grounding_LLMs_with_online_RL/experiments/post-training_tests.py", line 310, in main
lm_server = Caller(config_args.lamorel_args, custom_updater=LoadSpecificWeightsUpdater,
File "/Absolute/Path/To/Grounding_LLMs_with_online_RL/lamorel/lamorel/src/lamorel/caller.py", line 53, in __init__
Server(
File "/Absolute/Path/To/Grounding_LLMs_with_online_RL/lamorel/lamorel/src/lamorel/server/server.py", line 60, in __init__
self._updater.set_llm_module(
TypeError: BaseUpdater.set_llm_module() missing 1 required positional argument: 'llm_module'
Since running through the Lamorel launcher is the advised procedure, I find it difficult to investigate this error with my usual debugger-based workflow.
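If I read the traceback correctly, `set_llm_module` is invoked on the updater class rather than on an instance, so `llm_module` gets consumed as `self` and Python reports it as missing. My best guess, assuming newer Lamorel versions expect an updater instance where the script still passes the class, is a change like this (untested):

```python
# Guess at a fix: pass an *instance* of the custom updater rather than the class
# itself, so that set_llm_module(llm_module) becomes a bound method call.
lm_server = Caller(
    config_args.lamorel_args,
    custom_updater=LoadSpecificWeightsUpdater(),  # was: custom_updater=LoadSpecificWeightsUpdater
    # ...remaining keyword arguments unchanged from the original call
)
```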
If you can reproduce the issue and help me solve it, I would be very thankful. I find your paper very insightful.
Thank you for your time!
Thanks for reaching out and spotting all these issues!
I opened a PR.
Importing
It should be fixed now.
Class Naming
It should also be fixed.
Training Using Lamorel (on one GPU)
I fixed lamorel's initialization in train_language_agent.py so that the client and server can be launched together (see here).
Regarding your results, you seem to have run only 1k steps. I am not surprised the agent has not solved a single episode: in the paper (Figure 14), Flan-T5 small reached a 0.2 success rate only after approximately 10k steps. Also, we used 32 environments in our experiments, leading to much more diversity in each batch.