Cyclical features when variables are negative - does it make sense? #578

solegalli · 2022-12-13T12:21:49Z

Not sure we can make a variable with negative values cyclical with sine and cosine. We need to investigate further and, if it does not make sense, add a safeguard.

VascoSch92 · 2023-08-27T13:28:27Z

I think the main problem is: what happens if the max value is 0? Is an error thrown?

In that case we could also take the max of the absolute value of the column and still get an encoding that makes sense.

solegalli · 2023-08-28T06:55:10Z

Good point.

The absolute maximum would work OK only when the entire variable is negative. If it has both positive and negative numbers, it will break the cyclicity.

VascoSch92 · 2023-08-28T07:38:16Z

Yes, you are right. I was only talking about the case where the max of the column is 0.

From what I understand, you would like a solution that doesn't differentiate between cases, right?

I'm thinking that you could always translate the column by the max of the absolute value of the column. This should not change the cyclical embedding, as sin and cos are periodic functions.

But I have to check this last sentence ;-)
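A quick way to check that last sentence, as a minimal sketch in plain Python. The `encode` and `close` helpers and the period of 12 are assumptions made up for illustration, not part of the transformer. A translation leaves the (sin, cos) pair unchanged only when it is a whole multiple of the period used as divisor; any other translation rotates the points on the circle.

```python
import math

def encode(value, period):
    """Map a value to a point on the unit circle for a given period."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def close(p, q, tol=1e-9):
    """Compare two (sin, cos) pairs up to floating-point error."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

period = 12.0   # hypothetical period, e.g. months of the year
x = 3.0

# Translating by exactly one period gives the same point on the circle.
same_after_full_period = close(encode(x, period), encode(x + period, period))

# Translating by an arbitrary amount (here 5) gives a different point.
same_after_other_shift = close(encode(x, period), encode(x + 5.0, period))

print(same_after_full_period, same_after_other_shift)
```

So translating by the max of the absolute value is only harmless if that amount happens to be a multiple of the divisor.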

VascoSch92 · 2023-08-28T20:02:18Z

I thought about the problem, and this is my conclusion:

There is a clean way to eliminate the problem of negative values. It suffices to take
max_value = | min(column) - max(column) |

This works because we are looking for the length of the domain of the cyclic feature, so that we can embed it in the circle using the periodic functions (sin(...), cos(...)). The length of the domain is necessary to embed the column bijectively.

In fact, this is what we were already doing by taking the max of the column: we were assuming that the minimum was 0.
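As a rough sketch of this idea in plain Python (the function name `encode_with_range` and the example column are hypothetical, chosen only for illustration), the divisor is the length of the observed range rather than the maximum:

```python
import math

def encode_with_range(column):
    """Encode values as (sin, cos) pairs, using the range length as divisor."""
    span = abs(min(column) - max(column))
    if span == 0:
        # A constant column has no range to spread around the circle.
        raise ValueError("column is constant; the range length is 0")
    return [
        (math.sin(2 * math.pi * x / span), math.cos(2 * math.pi * x / span))
        for x in column
    ]

# Hypothetical all-negative column whose maximum is 0.
column = [-3, -2, -1, 0]
points = encode_with_range(column)

# The angles span exactly one full turn, so the minimum and the maximum
# land on the same point of the circle.
first, last = points[0], points[-1]
endpoints_coincide = all(abs(a - b) < 1e-9 for a, b in zip(first, last))
print(endpoints_coincide)
```

Note that a column whose maximum is 0 no longer causes a division by zero here, but the two endpoints of the range are mapped to the same point.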

With this approach, we can elegantly tackle all the cases :-)
What do you think?

Let me know if my explanation was clear or if you need more details.

solegalli · 2023-09-12T14:26:50Z

Hey @VascoSch92

I am not sure I understand the solution.

We basically want to squeeze the variable between 0 and 1, so that when we multiply by 2pi it is between 0 and 2pi.

So far, the transformer assumes that the minimum is always 0 and the maximum is always positive, so we divide by the max, which squeezes the variable between 0 and 1, and then we multiply by 2pi.

If the variable has negative and positive values, like say it ranges from -5 to 10, then we want -5 to be taken to 0 and 10 to be taken to 1, in a way that does not alter the distribution.
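To illustrate why dividing by the max is a problem for such a variable, here is a minimal sketch in plain Python. The helper names and the example column are assumptions for illustration only. With a range of -5 to 10 and a divisor of 10, the angles cover one and a half turns, so distinct values can collide on the circle.

```python
import math

def encode_with_max(value, max_value):
    """Divide by the maximum, multiply by 2*pi, and take (sin, cos)."""
    angle = 2 * math.pi * value / max_value
    return math.sin(angle), math.cos(angle)

def close(p, q, tol=1e-9):
    """Compare two (sin, cos) pairs up to floating-point error."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

column = [-5, 0, 5, 10]   # hypothetical variable ranging from -5 to 10
max_value = max(column)   # 10, as in the divide-by-max approach

# -5 maps to the angle -pi and 5 maps to the angle +pi: the same point.
collision = close(encode_with_max(-5, max_value), encode_with_max(5, max_value))
print(collision)
```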

One option could be to subtract the minimum value from the entire distribution, so that it starts at 0, and then find the maximum and use the transformer as it is used now.

Another option could be to scale to the min and max like the MinMaxScaler:

(X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
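A minimal sketch of this min-max option in plain Python, applied to the -5 to 10 example. The function name `min_max_cyclical` is hypothetical, and this is only an illustration of the formula above, not the transformer's code.

```python
import math

def min_max_cyclical(column):
    """Scale to [0, 1] with min-max scaling, then take (sin, cos) of 2*pi*scaled."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("column is constant; cannot min-max scale")
    scaled = [(x - lo) / (hi - lo) for x in column]
    points = [(math.sin(2 * math.pi * s), math.cos(2 * math.pi * s)) for s in scaled]
    return scaled, points

# -5 is taken to 0 and 10 is taken to 1.
scaled, points = min_max_cyclical([-5, 0, 5, 10])
print(scaled[0], scaled[-1])

# When the minimum is already 0, this reduces to dividing by the maximum.
scaled_non_negative, _ = min_max_cyclical([0, 5, 10])
print(scaled_non_negative == [x / 10 for x in [0, 5, 10]])
```

Because the formula reduces to dividing by the max when the minimum is 0, results for such variables would stay the same under this option.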

Whatever we do, ideally we do not want to break backwards compatibility.

Does your solution return the results I expect / describe here as well?

VascoSch92 · 2023-09-12T15:31:56Z

Hey @solegalli

> We basically want to squeeze the variable between 0 and 1, so that when we multiply by 2pi it is between 0 and 2pi.
>
> So far, the transformer assumes that the minimum is always 0 and the maximum is always positive, so we divide by the max, which squeezes the variable between 0 and 1, and then we multiply by 2pi.

Exactly, we want to do that because the interval [0, 2pi] is one period of sin and cos.

> If the variable has negative and positive values, like say it ranges from -5 to 10, then we want -5 to be taken to 0 and 10 to be taken to 1, in a way that does not alter the distribution.

If you look at my suggestion, we have | min(column) - max(column) | = | -5 - 10 | = 15. Dividing by 15, -5 goes to -1/3 and 10 goes to 2/3, so the values span exactly one period. After multiplying by 2pi, -5 is mapped to (sin(-2pi/3), cos(-2pi/3)) and 10 is mapped to (sin(4pi/3), cos(4pi/3)), which is the same point because of periodicity (and the other values fall in between).

> Whatever we do, ideally we do not want to break backwards compatibility.
>
> Does your solution return the results I expect / describe here as well?

I understand your point and, of course, the idea is to have a user-friendly (not overcomplicated) transformer.

Therefore, my suggestion is just to check whether the column is positive and raise an error otherwise.
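A safeguard along these lines could look roughly like the sketch below. The function name `check_no_negative_values` and the error message are hypothetical, and treating 0 as allowed is an assumption, made because the transformer is described above as assuming a minimum of 0.

```python
def check_no_negative_values(column, name="column"):
    """Raise a ValueError if any value is negative (0 is allowed here by assumption)."""
    negatives = [x for x in column if x < 0]
    if negatives:
        raise ValueError(
            f"{name} contains {len(negatives)} negative value(s); "
            "cyclical encoding expects values between 0 and the maximum."
        )

check_no_negative_values([0, 1, 2, 3], name="hour")   # passes silently

try:
    check_no_negative_values([-5, 0, 10], name="offset")
except ValueError as err:
    message = str(err)
    print(message)
```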

solegalli mentioned this issue May 18, 2024

cyclical features may not always get the right period #765

Closed
