
[BUG] LinearRegression get_estimator method's required arguments appear to do nothing #2494

Open
a097123 opened this issue Aug 9, 2024 · 3 comments
Labels
improvement New feature or improvement

Comments


a097123 commented Aug 9, 2024

Describe the bug

No matter what is passed into the get_estimator method of the LinearRegression class the same model object is returned.

To Reproduce

from darts import TimeSeries
from darts.models.forecasting.linear_regression_model import RegressionModel
from darts.utils.timeseries_generation import linear_timeseries

import numpy as np

trend = linear_timeseries(start_value=0, end_value=100, length=100)
noise = TimeSeries.from_times_and_values(
    trend.time_index, np.random.normal(0, 5, size=trend.values().shape)
)

series = trend + noise

model = RegressionModel(lags=3, output_chunk_length=2)
model.fit(series)

print(id(model.get_estimator(1, 1)))
print(id(model.get_estimator(2, 2)))
print(id(model.get_estimator(-1e7, -1e7)))
print(id(model.get_estimator(1e7, 1e7)))
13699865568
13699865568
13699865568
13699865568

Expected behavior
My assumption was that a Darts model would build one underlying model per future time step, i.e. "direct forecasting". get_estimator takes 2 arguments:

    def get_estimator(self, horizon: int, target_dim: int):
        """Returns the estimator that forecasts the `horizon`th step of the `target_dim`th target component.

        The model is returned directly if it supports multi-output natively.

        Parameters
        ----------
        horizon
            The index of the forecasting point within `output_chunk_length`.
        target_dim
            The index of the target component.
        """

No matter which arguments are passed in, the method will:

  1. return an object without an error, even if the inputs are ridiculous
  2. return the same model object (not sure which horizon's model this is).

Both are unexpected to me.

System (please complete the following information):

  • Python version: 3.12
  • darts version: 0.30.0

Additional context
New to darts.

@a097123 a097123 added bug Something isn't working triage Issue waiting for triaging labels Aug 9, 2024
@a097123
Author

a097123 commented Aug 9, 2024

Update: I now see that this is due to this if statement. It's currently unclear to me what makes a model object a subclass of MultiOutputRegressor.

I still feel that logging.info is invisible to almost all users, and that having required args that are meaningless is misleading.

@madtoinou madtoinou removed bug Something isn't working triage Issue waiting for triaging labels Aug 12, 2024
@madtoinou
Collaborator

Hi @a097123,

Darts relies on the sklearn implementation for all the regression models. The MultiOutputRegressor class is implemented at that level, so we do not have control over it, but Darts sometimes wraps "single output" models in this class so that they support multivariate series or output_chunk_length > 1.

In your code snippet, you implicitly use sklearn's LinearRegression model, which inherits from the MultiOutputMixin class. Because this model supports multi-output out of the box, the model is returned directly, as stated in the docstring (hence the unique ID).
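To make the class relationships above concrete, here is a minimal sketch using only sklearn. It only illustrates which sklearn classes carry MultiOutputMixin and which are MultiOutputRegressor wrappers; it does not reproduce how Darts decides whether to wrap a model, and SVR is chosen here merely as an example of a single-output regressor.

```python
from sklearn.base import MultiOutputMixin
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

linear = LinearRegression()
svr = SVR()

# LinearRegression carries the mixin, so it can fit a 2-D target natively.
linear_is_native = isinstance(linear, MultiOutputMixin)

# It is not a MultiOutputRegressor: that class is a wrapper, not a base class.
linear_is_wrapper = isinstance(linear, MultiOutputRegressor)

# SVR only handles a 1-D target and does not carry the mixin.
svr_is_native = isinstance(svr, MultiOutputMixin)

# Wrapping a single-output model yields one fitted copy per target column.
wrapped_svr = MultiOutputRegressor(SVR())
wrapped_is_wrapper = isinstance(wrapped_svr, MultiOutputRegressor)

print(linear_is_native, linear_is_wrapper, svr_is_native, wrapped_is_wrapper)
```

So a model object is only an instance of MultiOutputRegressor when something explicitly wrapped it; a natively multi-output model such as LinearRegression never is.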

If you look at model.model.coef_, you will see that there is one set of coefficients for each position in output_chunk_length (in accordance with your assumption, since multi_models=True by default), but there is no straightforward way to access a specific estimator from the model.
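The point about coef_ can be checked with sklearn alone. The sketch below builds the lagged feature matrix by hand for lags=3 and output_chunk_length=2; that tabularization is an assumption made for illustration and is not the Darts code path. It then compares a natively multi-output LinearRegression with the same model wrapped in MultiOutputRegressor.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

rng = np.random.default_rng(0)
y = np.linspace(0, 100, 100) + rng.normal(0, 5, size=100)

lags, output_chunk_length = 3, 2

# Hand-rolled tabularization (illustrative assumption, not Darts internals):
# each feature row holds `lags` past values, each target row holds the
# next `output_chunk_length` values.
n_rows = len(y) - lags - output_chunk_length + 1
X = np.stack([y[i : i + lags] for i in range(n_rows)])
Y = np.stack([y[i + lags : i + lags + output_chunk_length] for i in range(n_rows)])

# One estimator object, one row of coefficients per forecast step.
native = LinearRegression().fit(X, Y)
print(native.coef_.shape)  # (2, 3)

# The wrapper instead stores one fitted estimator per forecast step.
wrapped = MultiOutputRegressor(LinearRegression()).fit(X, Y)
print(len(wrapped.estimators_))  # 2

# For ordinary least squares the two views agree step by step.
for step in range(output_chunk_length):
    assert np.allclose(native.coef_[step], wrapped.estimators_[step].coef_, atol=1e-6)
```

In other words, row i of coef_ plays the role that a separate per-step estimator would play, even though only a single model object exists.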

We could move the sanity check one level in the method to avoid the unexpected behavior you reported.

Changing the logging message from info to warning is also a possibility. I did not want to make this look too alarming, but the current behavior seems to be counter-intuitive.
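A rough sketch of what the two suggestions could look like together is shown below. Everything in it is hypothetical: get_estimator_sketch, NativeModel, WrappedModel, the explicit output_chunk_length and n_targets parameters, and the ordering of estimators_ are assumptions made for illustration, not the Darts implementation.

```python
import logging

logger = logging.getLogger("get_estimator_sketch")


def get_estimator_sketch(model, horizon, target_dim, output_chunk_length, n_targets):
    """Hypothetical stand-in for the lookup discussed above (not Darts code)."""
    # Validate first, so out-of-range arguments fail for every kind of model.
    if not 0 <= horizon < output_chunk_length:
        raise ValueError(
            f"`horizon` must be in [0, {output_chunk_length}), got {horizon}."
        )
    if not 0 <= target_dim < n_targets:
        raise ValueError(f"`target_dim` must be in [0, {n_targets}), got {target_dim}.")

    estimators = getattr(model, "estimators_", None)
    if estimators is None:
        # Natively multi-output: warn (rather than log at info level) that
        # the arguments do not select anything.
        logger.warning(
            "Model supports multi-output natively; a single estimator "
            "forecasts all horizons and components."
        )
        return model

    # Assumed ordering of the wrapped estimators: one per (horizon, component).
    return estimators[horizon * n_targets + target_dim]


class NativeModel:
    """Placeholder for a natively multi-output regressor."""


class WrappedModel:
    """Placeholder for a wrapper holding one estimator per output."""

    def __init__(self, n_outputs):
        self.estimators_ = [f"estimator_{i}" for i in range(n_outputs)]


native = NativeModel()
wrapped = WrappedModel(2)
print(get_estimator_sketch(native, 1, 0, 2, 1) is native)
print(get_estimator_sketch(wrapped, 1, 0, 2, 1))
```

With the checks placed before the shortcut, a call such as get_estimator_sketch(native, 1e7, 0, 2, 1) raises a ValueError instead of silently returning the model.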

Note: the Darts LinearRegressionModel class would be more appropriate if this is indeed the model you want to use, since it supports some additional features such as probabilistic forecasting.

@madtoinou madtoinou added the improvement New feature or improvement label Aug 12, 2024
@a097123
Copy link
Author

a097123 commented Aug 12, 2024

@madtoinou Thank you for the detailed response! This helps me understand what the class is doing much better. Also appreciate the suggestion of LinearRegressionModel and why it might be better.

I personally think that warning is better here but I am new to the lib and you might know better than me what a darts user might expect there. This might just be a weird edge case only a noob would hit before taking a more conventional approach.
