
Experiment of observation conditioned decoder #6

Open
ssam2s opened this issue Oct 28, 2024 · 3 comments

Comments

@ssam2s

ssam2s commented Oct 28, 2024

Thank you for your great work!

I'm conducting various experiments to condition the decoder on observations.

In your ablation study for the observation-conditioned decoder, were all hyperparameters the same as in the released code? Also, how were the observation tokens constructed?

In some conditioning experiments, I've observed cases where the autoencoder's grad_norm increases. Could this indicate potential issues with training?

Congratulations on having your paper accepted at a top conference!

@atharvamete
Collaborator

Thank you for your kind words!

were all hyperparameters the same as in the released code?

Yes, to ensure a fair comparison we used the same hyperparameters as the ones reported in Appendix B1. In our current implementation we construct one token per observation timestep in stage 2, so for this ablation we just append those observation tokens to the skill tokens from the encoder and let the decoder cross-attend to all tokens (obs + skill tokens) combined.
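For anyone reproducing this, here is a minimal PyTorch sketch of the idea described above: one token per observation timestep, concatenated with the skill tokens, and a decoder that cross-attends over the combined memory. All module names, shapes, and dimensions below are illustrative assumptions, not the repository's actual classes or config values.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: batch, obs timesteps, skill tokens, embedding dim.
B, T_obs, T_skill, D = 8, 10, 4, 256

obs_proj = nn.Linear(512, D)  # one token per observation timestep
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=D, nhead=8, batch_first=True),
    num_layers=4,
)

obs_features  = torch.randn(B, T_obs, 512)    # stand-in for encoded observations
skill_tokens  = torch.randn(B, T_skill, D)    # stand-in for the encoder's skill tokens
action_queries = torch.randn(B, 16, D)        # stand-in decoder queries (e.g. action chunk of 16)

# Append obs tokens to the skill tokens; the decoder cross-attends
# over the combined (obs + skill) memory.
obs_tokens = obs_proj(obs_features)                    # (B, T_obs, D)
memory = torch.cat([obs_tokens, skill_tokens], dim=1)  # (B, T_obs + T_skill, D)
decoded = decoder(tgt=action_queries, memory=memory)   # (B, 16, D)
```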

I've observed cases where the autoencoder's grad_norm increases. Could this indicate potential issues with training?

Can you confirm whether this is during stage 0, 1, or 2 training? Also, by increase do you mean it is blowing up to some very high value?
If you want to train/finetune the autoencoder, then an increase in grad norm is expected.
In the current codebase, stage 0 is autoencoder-only training, stage 1 is prior training, and stage 2 finetunes the prior along with the autoencoder depending on whether you have l1_loss_scale set to a non-zero value. You can also explicitly freeze the autoencoder params if you don't want to train it (see the sketch below).
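As a minimal sketch of that last point, assuming the policy object exposes its autoencoder as a submodule named `autoencoder` (the attribute name here is hypothetical), freezing it before building the optimizer looks like this:

```python
# Freeze the autoencoder so only the prior is updated during finetuning.
# "model.autoencoder" is an assumed attribute name; use whatever your
# policy object actually calls its autoencoder submodule.
for p in model.autoencoder.parameters():
    p.requires_grad = False

# Sanity check: the optimizer should now only receive prior parameters.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors remain")
```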

@ssam2s
Author

ssam2s commented Oct 31, 2024

Thank you for your answer!

Can you confirm whether this is during stage 0, 1, or 2 training?

For various experiments, I conditioned the decoder on observations during the stage 0 training process, and I ended up with results that contradicted those obtained using the provided code. Additionally, when I used this pretrained autoencoder in stage 1, the success rate was nearly zero. There could be multiple reasons behind this, but I wanted to ask if you might have any insights into possible causes.

[image attached]

@atharvamete
Collaborator

Something feels off. Could you please elaborate on exactly which results were contradicted? If the success rate is zero, you should check out the rollout videos; by default it should save some in the evaluation run directory, or else try setting the n_video param under rollout in the eval config.
