---
description: >-
  Coming up with features is difficult, time-consuming, requires expert
  knowledge. "Applied machine learning" is basically feature engineering. —
  Andrew Ng
---

# Feature Engineering

## Useful Approaches

### Automated

### Scaling

scikit-learn:

> Indeed many estimators are designed with the assumption that each feature takes values close to zero or more importantly that all features vary on comparable scales. In particular, metric-based and gradient-based estimators often assume approximately standardized data (centered features with unit variances). A notable exception are decision tree-based estimators that are robust to arbitrary scaling of the data.
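A minimal sketch of this kind of standardization using scikit-learn's `StandardScaler`; the toy feature matrix is invented for illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two hypothetical features on very different scales,
# e.g. age in years vs. income in dollars.
X = np.array([[25,  40_000],
              [32,  85_000],
              [47, 120_000],
              [51,  62_000]], dtype=float)

# Center each feature and scale it to unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))  # approximately [0, 0]
print(X_scaled.std(axis=0))   # approximately [1, 1]
```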

machinelearningmastery.com:

> **Decompose Categorical Attributes**
>
> Imagine you have a categorical attribute, like “Item_Color”, that can be Red, Blue or Unknown.
>
> Unknown may be special, but to a model, it looks like just another color choice. It might be beneficial to better expose this information.
>
> You could create a new binary feature called “Has_Color” and assign it a value of “1” when an item has a color and “0” when the color is unknown.
>
> Going a step further, you could create a binary feature for each value that Item_Color has. This would be three binary attributes: Is_Red, Is_Blue and Is_Unknown.
>
> These additional features could be used instead of the Item_Color feature (if you wanted to try a simpler linear model) or in addition to it (if you wanted to get more out of something like a decision tree).
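A sketch of both ideas with pandas; the tiny DataFrame is a made-up stand-in for the example above, with `None` marking an unknown color:

```python
import pandas as pd

df = pd.DataFrame({"Item_Color": ["Red", "Blue", None, "Red"]})

# Binary flag: 1 when an item has a known color, 0 when it is unknown.
df["Has_Color"] = df["Item_Color"].notna().astype(int)

# One binary indicator per value: Is_Red, Is_Blue, Is_Unknown.
df["Item_Color"] = df["Item_Color"].fillna("Unknown")
df = pd.concat([df, pd.get_dummies(df["Item_Color"], prefix="Is")], axis=1)
print(df)
```

Scikit-learn's `OneHotEncoder` achieves the same decomposition when you need it inside a preprocessing pipeline.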

2nd place in a Kaggle competition:

> I calculated the lag between "date_first_booking" and "date_account_created" and divided this lag feature into four categories (0, [1, 365], [-349, 0), NA).
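A rough sketch of how such a lag feature could be built with pandas; the bucket boundaries follow the quote, and the small DataFrame is invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "date_account_created": ["2014-01-01", "2014-03-10", "2014-06-05", "2014-07-01"],
    "date_first_booking":   ["2014-01-01", "2014-04-02", None,         "2013-07-20"],
})

# Lag in days between first booking and account creation
# (NaT for users who never booked becomes NaN here).
lag_days = (pd.to_datetime(df["date_first_booking"])
            - pd.to_datetime(df["date_account_created"])).dt.days

def lag_bucket(days):
    # The four categories from the quote: 0, [1, 365], [-349, 0), NA.
    if pd.isna(days):
        return "NA"
    if days == 0:
        return "0"
    if 1 <= days <= 365:
        return "[1, 365]"
    if -349 <= days < 0:
        return "[-349, 0)"
    return "other"  # lags outside the quoted ranges, if any

df["booking_lag_category"] = lag_days.map(lag_bucket)
print(df[["date_account_created", "date_first_booking", "booking_lag_category"]])
```

Binning the raw lag into a handful of categories like this lets a model treat "booked immediately", "booked later", "booked before signing up", and "never booked" as qualitatively different behaviors rather than points on one numeric scale.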