How to plot segments with fit_breaks information #96

Open
esakhib opened this issue Apr 22, 2022 · 4 comments


esakhib commented Apr 22, 2022

Hello. Thank you for your awesome library! I have a question about how to plot segments with fit_breaks.

I have the signal [x_in; y_in] (the 1st pic) and then use

my_pwlf = pwlf.PiecewiseLinFit(x_in, y_in)
my_pwlf.fitfast(breaks_num, pop=50)
y_out = my_pwlf.predict(x_in, beta=my_pwlf.beta, breaks=my_pwlf.fit_breaks)

then I compute segments lines like this:

x_line = []
y_line = []
for i in np.arange(my_pwlf.n_segments):
    x_line_idxs = np.where(np.logical_and(my_pwlf.fit_breaks[i] <= x_in, x_in<= my_pwlf.fit_breaks[i + 1]))[0]

    x_line.append(x_in[x_line_idxs])

    y_line.append(get_y_lines(my_pwlf, i + 1, x_in[x_line_idxs]))

where get_y_lines is:

def get_y_lines(pwlf_, segment_number, x):
    """https://jekel.me/2018/Continous-piecewise-linear-regression/.
    """

    for line in np.arange(segment_number):
        if line == 0:
            y_values = pwlf_.beta[0] + (pwlf_.beta[1]) * (x - pwlf_.fit_breaks[0])
        else:
            y_values += (pwlf_.beta[line + 1]) * (x - pwlf_.fit_breaks[line])

    return y_values

The question is: why are there no signal points between the first two breaks (the 2nd pic)? Is this correct, or am I doing it the wrong way?

The first picture:
[image]

The second picture:
[image]
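The get_y_lines helper above accumulates slope changes segment by segment. A minimal sketch of the same continuous piecewise linear formula, written so it can be evaluated at any x in one call, is shown below. It assumes the parameterization described in the linked blog post (beta[0] is the value at the first break, beta[1] the first slope, and each later beta a change in slope at an interior break). The function name eval_continuous_pwl and the numeric values are hypothetical and not taken from the issue.

```python
import numpy as np


def eval_continuous_pwl(x, beta, breaks):
    """Evaluate a continuous piecewise linear model at x.

    Assumed parameterization: beta[0] is the value at breaks[0], beta[1]
    is the slope of the first segment, and beta[k + 1] is the change in
    slope at the interior break breaks[k] for k >= 1.
    """
    x = np.asarray(x, dtype=float)
    y = beta[0] + beta[1] * (x - breaks[0])
    # Only interior breaks add a hinge; the last break is the domain end.
    for k in range(1, len(breaks) - 1):
        y = y + beta[k + 1] * np.maximum(x - breaks[k], 0.0)
    return y


# Hypothetical parameters for a two-segment model (not from the issue).
breaks = np.array([0.0, 2.0, 5.0])
beta = np.array([1.0, 0.5, -1.5])

# Evaluating at the breaks themselves gives the exact segment end points,
# so plotting (breaks, y_at_breaks) as a line draws every segment.
y_at_breaks = eval_continuous_pwl(breaks, beta, breaks)
```

With a fitted model, passing my_pwlf.beta and my_pwlf.fit_breaks to this function should give the same end points as my_pwlf.predict(my_pwlf.fit_breaks), without searching the data for points that fall inside each segment.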

Owner

cjekel commented Apr 22, 2022

The question is: why are there no signal points between the first two breaks (the 2nd pic)? Is this correct, or am I doing it the wrong way?

It's probably a bad local minimum. You can try increasing the initial population to see whether you get the same result.

my_pwlf.fitfast(breaks_num, pop=200)

Your intuition is correct though, you would expect a line to at minimum connect two data points.

Is it possible that your breaks_num is one more than what is needed to fit your data?

What are the red breakpoints in your plot? A known solution?


You don't need to specify beta and breakpoints after you perform a fit. It's optional, because you may have saved parameters from a previous fit.

my_pwlf = pwlf.PiecewiseLinFit(x_in, y_in)
my_pwlf.fitfast(breaks_num, pop=50)
y_out = my_pwlf.predict(x_in)

Author

esakhib commented Apr 22, 2022

@cjekel
Is it possible that your breaks_num is one more than what is needed to fit your data?
Yes, it is. I have a lot of signals like this one with different shapes, and I want to simplify each signal into segments (I always set breaks_num=10 because I don't compute the required number).

What are the red breakpoints in your plot? A known solution?
The red breakpoints are the first and the last values from x_line and y_line above.

I noticed that if I predict on np.linspace(min(x_input), max(x_input), 1000) I get better results (the case shown in the 2nd pic above does not occur).

Owner

cjekel commented Apr 23, 2022

If you are curious as to why there is a pwlf breakpoint here:
image
but without a slope change, it's because of how you calculate the first and last line segment.

x_line = []
y_line = []
x_hat = np.linspace(min(x_in), max(x_in), n_samples)
for i in np.arange(my_pwlf.n_segments):
    x_line_idxs = np.where(np.logical_and(my_pwlf.fit_breaks[i] <= x_hat, x_hat<= my_pwlf.fit_breaks[i + 1]))[0]
    # The indices were computed on x_hat, so index x_hat (not x_in) here.
    x_line.append(x_hat[x_line_idxs])
    y_line.append(get_y_lines(my_pwlf, i + 1, x_hat[x_line_idxs]))

The segment start and end points found this way will converge to the pwlf breakpoints as n_samples -> infinity. This is just because you are searching for line start and end points using the discretized data, while pwlf breakpoints are continuous variables that can fall anywhere from x_in.min() to x_in.max().
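The convergence with n_samples can be illustrated with a small numpy-only sketch. The breakpoint values are hypothetical stand-ins for my_pwlf.fit_breaks, and the helper names grid_segment_ends and max_gap are made up for this example.

```python
import numpy as np

# Hypothetical breakpoints; in practice these would be my_pwlf.fit_breaks.
breaks = np.array([0.0, 2.37, 7.81, 10.0])


def grid_segment_ends(breaks, n_samples):
    """Return the first and last grid point found inside each segment."""
    x_hat = np.linspace(breaks[0], breaks[-1], n_samples)
    ends = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        inside = x_hat[(lo <= x_hat) & (x_hat <= hi)]
        ends.append((inside[0], inside[-1]))
    return np.array(ends)


def max_gap(breaks, n_samples):
    """Largest distance between a grid-based segment end and its break."""
    ends = grid_segment_ends(breaks, n_samples)
    start_gap = np.abs(ends[:, 0] - breaks[:-1]).max()
    end_gap = np.abs(ends[:, 1] - breaks[1:]).max()
    return max(start_gap, end_gap)


# A coarse grid misses the breaks by a visible margin; a fine grid does not.
gap_coarse = max_gap(breaks, 11)
gap_fine = max_gap(breaks, 10001)
```

The gap is bounded by the grid spacing, which is why a segment drawn from sparse data points can appear to start or stop short of the breakpoint.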


In your application, is it important for breakpoints to only occur at data points? If so I have a branch somewhere that has an algorithm for this.

Can you show me just the raw signal of data points, and pwlf predict as a line with np.linspace(min(x_input), max(x_input), 1000)?


It looks like the pwlf fit is giving you (near) zero error with that potential single fictitious breakpoint. Additionally, it looks like you could move that breakpoint throughout the problem and still have a fit that results in near zero error. This would imply that there is more than one (non-unique) solution for that specific number of line segments. I don't know what your application is, but I'm inclined to say this is not a big issue as long as you have more data points than beta parameters.
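The non-uniqueness can be reproduced on synthetic data. The sketch below does not use pwlf; it assumes the same continuous piecewise linear basis and solves for the slopes with ordinary least squares at fixed breaks. The data, the helper name design_matrix, and the break locations are all hypothetical.

```python
import numpy as np


def design_matrix(x, breaks):
    """Columns: intercept, first slope, then one hinge per interior break."""
    cols = [np.ones_like(x), x - breaks[0]]
    for b in breaks[1:-1]:
        cols.append(np.maximum(x - b, 0.0))
    return np.column_stack(cols)


# Synthetic signal with exactly one real kink at x = 5 (two true segments).
x = np.linspace(0.0, 10.0, 41)
y = np.where(x < 5.0, 1.0 + 2.0 * x, 11.0 - 1.0 * (x - 5.0))

# Request three segments: the real break at 5 plus one extra break that is
# moved to several locations. Every placement fits the data equally well.
sse = {}
for extra in (1.0, 2.5, 4.0):
    breaks = np.array([0.0, extra, 5.0, 10.0])
    A = design_matrix(x, breaks)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse[extra] = float(np.sum((A @ beta - y) ** 2))
```

Each placement of the extra break gives a sum of squared errors that is zero up to round-off, because the fitted slope change at that break is zero. An optimizer has no reason to prefer one location over another, so the extra break can land where there are no data points.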

Author

esakhib commented Apr 24, 2022

@cjekel it's clear now why there's the breakpoint where no points. thank you very much!

  1. yes, it is important. it would be nice if you can help me with this algorithm
  2. yes, sure
    image
    image
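For readers who need breakpoints that coincide with data points, one simple heuristic is to snap each fitted break to the nearest x value. This is only a sketch of that heuristic; it is not the algorithm from the branch mentioned earlier in the thread, which is not shown here. The function name snap_breaks_to_data and the sample values are hypothetical.

```python
import numpy as np


def snap_breaks_to_data(x_data, breaks):
    """Move each breakpoint to the closest value in x_data."""
    x_data = np.asarray(x_data, dtype=float)
    breaks = np.asarray(breaks, dtype=float)
    # For every break, find the index of the nearest data point.
    nearest = np.abs(x_data[:, None] - breaks[None, :]).argmin(axis=0)
    # np.unique sorts the result and merges breaks that collapsed onto
    # the same data point.
    return np.unique(x_data[nearest])


# Hypothetical inputs (not taken from the issue).
x_in = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 10.0])
fit_breaks = np.array([0.0, 2.2, 2.4, 7.2, 10.0])

snapped = snap_breaks_to_data(x_in, fit_breaks)
```

The snapped breaks could then be passed to my_pwlf.fit_with_breaks(snapped) to refit the slopes at those fixed locations. Snapping after the fit is not guaranteed to give the best fit among all data-point breakpoints, and merged breaks reduce the number of segments.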
