
Experiment of observation conditioned decoder #6

Open
ssam2s opened this issue Oct 28, 2024 · 3 comments

Comments

@ssam2s

ssam2s commented Oct 28, 2024

Thank you for your great work!

I'm conducting various experiments to condition the decoder on observations.

In your ablation study for the observation-conditioned decoder, were all hyperparameters the same as in the released code? Also, how were the observation tokens constructed?

In some conditioning experiments, I've observed cases where the autoencoder's grad_norm increases. Could this indicate potential issues with training?

Congratulations on having your paper accepted at a top conference!

@atharvamete
Collaborator

Thank you for your kind words!

were all hyperparameters the same as in the released code?

Yes, to ensure a fair comparison we used the same hyperparameters as the ones reported in Appendix B1. In our current implementation we construct one token per observation timestep in stage 2, so for this ablation we just append those observation tokens to the skill tokens from the encoder and let the decoder cross-attend to all tokens (obs + skill tokens) combined.
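For anyone reproducing this, here is a minimal PyTorch sketch of the idea described above: one token per observation timestep, concatenated with the skill tokens, and a decoder that cross-attends over the combined memory. All module names, shapes, and dimensions below are illustrative assumptions, not the repository's actual classes or config values.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: batch, obs timesteps, skill tokens, embedding dim.
B, T_obs, T_skill, D = 8, 10, 4, 256

obs_proj = nn.Linear(512, D)  # one token per observation timestep
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=4,
)

obs_features  = torch.randn(B, T_obs, 512)    # stand-in for encoded observations
skill_tokens  = torch.randn(B, T_skill, D)    # stand-in for the encoder's skill tokens
action_queries = torch.randn(B, 16, D)        # stand-in decoder queries (e.g. action chunk of 16)

# Append obs tokens to the skill tokens; the decoder cross-attends
# over the combined (obs + skill) memory.
obs_tokens = obs_proj(obs_features)                    # (B, T_obs, D)
memory = torch.cat([obs_tokens, skill_tokens], dim=1)  # (B, T_obs + T_skill, D)
decoded = decoder(tgt=action_queries, memory=memory)   # (B, 16, D)
```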

I've observed cases where the autoencoder's grad_norm increases. Could this indicate potential issues with training?

Can you confirm whether this is during stage 0, 1, or 2 training? Also, by increase do you mean it is blowing up to some very high value?
If you want to train/finetune the autoencoder, then an increase in grad norm is expected.
In the current codebase, stage 0 is autoencoder-only training, stage 1 is prior training, and stage 2 finetunes the prior along with the autoencoder depending on whether you have l1_loss_scale set to a non-zero value. You can also explicitly freeze the autoencoder params if you don't want to train it (see the sketch below).
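As a minimal sketch of that last point, assuming the policy object exposes its autoencoder as a submodule named `autoencoder` (the attribute name here is hypothetical), freezing it before building the optimizer looks like this:

```python
# Freeze the autoencoder so only the prior is updated during finetuning.
# "model.autoencoder" is an assumed attribute name; use whatever your
# policy object actually calls its autoencoder submodule.
for p in model.autoencoder.parameters():
    p.requires_grad = False

# Sanity check: the optimizer should now only receive prior parameters.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors remain")
```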

@ssam2s
Author

ssam2s commented Oct 31, 2024

Thank you for your answer!

Can you confirm whether this is during stage 0, 1, or 2 training?

For various experiments, I conditioned the decoder on observations during the stage 0 training process, and I ended up with results that contradicted those obtained using the provided code. Additionally, when I used this pretrained autoencoder in stage 1, the success rate was nearly zero. There could be multiple reasons behind this, but I wanted to ask if you might have any insights into possible causes.

[image attached]

@atharvamete
Collaborator

Something feels off. Could you please elaborate on exactly which results were contradicted? If the success rate is zero, you should check out the rollout videos; by default it should save some in the evaluation run directory, or else try setting the n_video param under rollout in the eval config.
