Per this, my understanding is that the `gas` config in neox doesn't do anything; it shouldn't be used and should be removed. We should be using `gradient_accumulation_steps` instead.
It appears that all existing pythia configs set `gas` to 1, which is the default for `gradient_accumulation_steps` anyway, so this does not matter for them. Per that same search, some of the old eval results specifically show `gas` at 2. That would be a serious error: if the expectation was that `gas` did something, the effective batch size would be half of what was intended.
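To make the halving concrete, here is a minimal sketch of the arithmetic. It assumes the usual data-parallel relation (effective batch size = micro batch per GPU × gradient accumulation steps × number of GPUs). The micro batch size and GPU count are hypothetical values chosen for illustration, not taken from any pythia config.

```python
def effective_batch_size(micro_batch_per_gpu: int,
                         grad_accum_steps: int,
                         num_gpus: int) -> int:
    """Samples consumed per optimizer step under plain data parallelism."""
    return micro_batch_per_gpu * grad_accum_steps * num_gpus


# Hypothetical values, for illustration only.
MICRO_BATCH_PER_GPU = 4
NUM_GPUS = 8

# What someone would expect if `gas: 2` were honored.
expected = effective_batch_size(MICRO_BATCH_PER_GPU, 2, NUM_GPUS)

# What they would get if `gas` is ignored and
# gradient_accumulation_steps stays at its default of 1.
actual = effective_batch_size(MICRO_BATCH_PER_GPU, 1, NUM_GPUS)

print(f"expected={expected}, actual={actual}, ratio={actual / expected}")
```

With these numbers the expected batch is 64 samples per step and the actual batch is 32, i.e. half.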
I am not putting in a PR to replace `gas` with `gradient_accumulation_steps` because these configs are references for the settings of existing artifacts. It's not clear to me that they should be fixed to be "correct", or, if they are, what the correct steps would be to make sure they're preserved as references on those artifacts once the configuration is fixed going forward.