tsmixer was originally reported as two different models: tsmixer-basic (which allows for past covariates and is called simply tsmixer in the paper) and tsmixer-ext, which allows for past, future, and static covariates. All results in the paper except those for the M5 dataset used tsmixer-basic. The darts implementation is based on tsmixer-ext.
However, tsmixer-ext isn't identical to tsmixer-basic when there are no static or future covariates. The key difference is that tsmixer-basic projects to output_chunk_length in the final layer, effectively encoding the historical data while maintaining its time dimension, whereas tsmixer-ext projects the historical and static data to output_chunk_length in the first layer. I don't think this is optimal, as it limits the usefulness of the residual connections. My testing with the original google-research source code shows a regression of about 10% in both MAE and MSE on the weather dataset when the temporal projection step is moved to the top of the model.
If the maintainers think this would be valuable, I can implement it. I think the most sensible way to do so would be to add a project_first=True keyword.
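To make the difference concrete, here is a minimal sketch that only tracks the length of the time axis through the model. It does no actual mixing and is not the darts or google-research code; the function name `trace_time_dim` and the `num_blocks` parameter are hypothetical, and it assumes each mixer block preserves the time length it receives.

```python
def trace_time_dim(input_chunk_length, output_chunk_length, num_blocks, project_first):
    """Return the time-axis length seen by each mixer block, then the output length.

    Shape bookkeeping only: assumes every mixer block keeps the time
    length it receives (hypothetical simplification of both variants).
    """
    time_len = input_chunk_length
    seen = []
    if project_first:
        # tsmixer-ext style: project to the forecast horizon before any mixing.
        time_len = output_chunk_length
    for _ in range(num_blocks):
        # Each block (and its residual connection) operates at the current length.
        seen.append(time_len)
    if not project_first:
        # tsmixer-basic style: project to the forecast horizon in the final layer.
        time_len = output_chunk_length
    seen.append(time_len)
    return seen


# Lookback of 96 steps, horizon of 24 steps, two mixer blocks.
ext_like = trace_time_dim(96, 24, num_blocks=2, project_first=True)
basic_like = trace_time_dim(96, 24, num_blocks=2, project_first=False)
```

With project_first=True the blocks only ever see the already-projected 24-step axis, while with project_first=False they mix over the full 96-step history before the final projection.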
If you think you can elegantly make the tsmixer-basic architecture available through the TSMixerModel API/constructor, it would certainly be valuable to have a variant of this model that performs better when no future covariates are available, which happens in many situations. I would perhaps call the argument first_layer_projection rather than project_first, but we can discuss that in your PR.
You will also need to add checks in the fit() method so that an error is raised if first_layer_projection=False and future/static covariates are provided.
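A rough sketch of what such a check could look like, written as a standalone function so it does not depend on darts internals. The function name `validate_projection_args` and the `has_static_covariates` flag are hypothetical; how the real fit() method detects static covariates is left to the PR.

```python
def validate_projection_args(first_layer_projection, future_covariates=None,
                             has_static_covariates=False):
    """Raise ValueError if covariates are given that the basic variant cannot use.

    Hypothetical helper: a real implementation would inspect the series
    passed to fit() instead of taking a boolean flag.
    """
    if first_layer_projection:
        # tsmixer-ext style: all covariate types are allowed.
        return
    unsupported = []
    if future_covariates is not None:
        unsupported.append("future covariates")
    if has_static_covariates:
        unsupported.append("static covariates")
    if unsupported:
        raise ValueError(
            "first_layer_projection=False does not support "
            + " or ".join(unsupported)
            + "; set first_layer_projection=True to use them."
        )
```

Past covariates are deliberately not checked here, since tsmixer-basic allows them.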