Long story short, extract_features is not working as expected: it produces a lot of invalid values (np.nan, np.inf, -np.inf).
Steps to reproduce the issue:
1. Have an input dataframe in the format used by tsfresh, with multiple time series of different variables, plus a second dataframe with the classes for the IDs in the input dataframe (I'm working on a multivariate time series classification problem). Call them X_train_ts and y_train_ts.
2. Apply tsfresh to extract the relevant features for classification: features_filtered_direct = extract_relevant_features(X_train_ts, y_train_ts, column_id='ID', column_sort='week')
3. Extract the settings object from the calculated relevant features: chosen_features = tsfresh.feature_extraction.settings.from_columns(features_filtered_direct)
4. Now use the extracted settings on the same input X_train_ts to try to get the same features_filtered_direct object: features = extract_features(X_train_ts, column_id='ID', column_sort='week', kind_to_fc_parameters=chosen_features)
5. The above command produces a different dataframe with a lot of invalid values: features.isin([np.nan, np.inf, -np.inf]).sum().sort_values()
@ognjenantonijevic I noticed that extract_relevant_features calls extract_features with impute_function=impute, which may be the difference you are observing. See if the result changes when you add that argument to your step #4. I got the same error as you when trying the steps on the robot_execution_failures dataset, and the mismatch went away once I added it.
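For reference, here is a minimal sketch of what that change to step #4 could look like. The helper name extract_like_relevant is made up for illustration, and the default column names 'ID' and 'week' are simply taken from the issue; the only real change is passing impute_function=impute, which replaces NaN and infinite values column by column.

```python
def extract_like_relevant(timeseries, settings, column_id="ID", column_sort="week"):
    """Re-extract the features listed in `settings`, imputing invalid values.

    Hypothetical helper (not part of tsfresh): it wraps extract_features and
    adds impute_function=impute, as suggested in the comment above.
    """
    # Imported inside the function so the helper can be defined even in an
    # environment where tsfresh is not installed.
    from tsfresh import extract_features
    from tsfresh.utilities.dataframe_functions import impute

    return extract_features(
        timeseries,
        column_id=column_id,
        column_sort=column_sort,
        kind_to_fc_parameters=settings,  # e.g. the result of from_columns(...)
        impute_function=impute,          # replaces NaN / +inf / -inf per column
    )


# Usage with the names from the issue (assumes X_train_ts and chosen_features
# already exist, and that numpy is imported as np):
#   features = extract_like_relevant(X_train_ts, chosen_features)
#   features.isin([np.nan, np.inf, -np.inf]).sum().sum()
```

The column order of the result may still differ from features_filtered_direct, so reindexing by the original columns before comparing the two dataframes is probably a good idea.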