Commit 26a647f

author
peicheng
committed
updated
1 parent 5c705f8 commit 26a647f

File tree

1 file changed

+10
-12
lines changed

FeatureEngineering.md

+10-12
Original file line number | Diff line number | Diff line change
@@ -8,28 +8,26 @@ We can combine tables, transform features to create new features.
88
We can even do so with help. Let's see [featuretools](https://www.featuretools.com/)
99

1010
After we have created "meaningful" features, we are ready to transform their format.
11-
### OneHot Encoding: Dealing with discrete features
1211

13-
### Mathematical Operations: Creating New Features
1412

15-
#### Single Column
16-
Apply functions such as log, sqrt, pow, or other functions that take 1 input.
17-
18-
#### Two Columns
19-
Apply product, ratio, or other transformations that take 2 or more inputs.
20-
21-
### Bucketing: Dealing with continuous feature
13+
## Dimension Reduction (Yin-side)
2214

23-
### NLP: dealing with text feature
15+
There are many available tools for this.
2416

25-
## Dimension Reduction (Yin-side)
17+
### Statistics: remove highly correlated data
18+
We can do this automatically using [featuretools](https://www.featuretools.com/).
19+
Or we can remove them by hand.
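
Removing correlated features by hand could look like the following minimal sketch. The helper name `drop_highly_correlated`, the 0.95 threshold, and the toy columns are all assumptions made for illustration; they do not come from this document or from featuretools.

```python
import numpy as np
import pandas as pd


def drop_highly_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    """Drop one column from each pair whose absolute Pearson correlation exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so that each pair is examined once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Toy data (hypothetical): "a_scaled" is almost a linear copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame(
    {
        "a": a,
        "a_scaled": 2.0 * a + 0.01 * rng.normal(size=200),
        "b": rng.normal(size=200),
    }
)
reduced = drop_highly_correlated(demo)
```

Of each correlated pair, the column that appears later in the frame is the one dropped, so column order decides which feature survives.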
2620

2721
### Principal Component Analysis (PCA)
22+
This is a classic and fast method, but it has its limitations.
23+
Remember to standardize your data before you do this.
24+
This runs quickly with sklearn.
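
As a minimal sketch of standardizing first and then applying PCA with sklearn: the random data, the choice of 2 components, and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (hypothetical): five features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])

# Standardize each feature, then project onto the top 2 principal components.
pca_pipeline = make_pipeline(StandardScaler(), PCA(n_components=2))
X_reduced = pca_pipeline.fit_transform(X)

# Fraction of the (standardized) variance captured by each kept component.
explained = pca_pipeline.named_steps["pca"].explained_variance_ratio_
```

Without the scaler, the feature with the largest scale would dominate the first component, which is why the standardization step comes first.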
2825

2926
### auto-encoding using deep neural networks
27+
We need sufficient data to do this well, and it is more complicated than PCA.
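
The idea can be sketched with a small network trained to reconstruct its own input. This sketch uses sklearn's `MLPRegressor` as a stand-in for a deep learning framework, which is an assumption and not something the document prescribes; the layer sizes, the `encode` helper, and the synthetic data are also hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Toy data (hypothetical): 6 observed features generated from 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X_raw = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(500, 6))
X_ae = StandardScaler().fit_transform(X_raw)

# Train the network to reproduce its input through a 2-unit bottleneck.
autoencoder = MLPRegressor(
    hidden_layer_sizes=(8, 2, 8), activation="tanh", max_iter=500, random_state=0
)
autoencoder.fit(X_ae, X_ae)


def encode(model: MLPRegressor, X: np.ndarray, n_encoder_layers: int = 2) -> np.ndarray:
    """Run only the encoder half (up to the bottleneck) of the fitted network."""
    h = X
    weights = model.coefs_[:n_encoder_layers]
    biases = model.intercepts_[:n_encoder_layers]
    for W, b in zip(weights, biases):
        h = np.tanh(h @ W + b)  # same activation as the hidden layers above
    return h


codes = encode(autoencoder, X_ae)  # 2-dimensional representation of each row
```

For real workloads a dedicated framework would normally be used instead; the point here is only the shape of the approach: reconstruct the input, then keep the bottleneck activations as the reduced features.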
3028

31-
### Feature selection
3229

30+
### Feature selection
3331
Before the final task, we could try to solve a representative task. Use feature importance from a model, usually a tree-based one, to select features, or use SelectKBest (e.g. [here](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html)).
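
Both routes can be sketched with sklearn as follows. The synthetic classification task, `k=3`, the `f_classif` score function, and the random forest are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif

# Toy representative task (hypothetical): 10 features, 3 of them informative.
X_fs, y_fs = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

# Route 1: univariate scoring, keep the k best-scoring features.
selector = SelectKBest(score_func=f_classif, k=3)
X_kbest = selector.fit_transform(X_fs, y_fs)
kbest_idx = selector.get_support(indices=True)

# Route 2: rank features by importance from a tree-based model.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_fs, y_fs)
top_idx = np.argsort(forest.feature_importances_)[::-1][:3]
```

The two routes need not agree: univariate scores look at each feature alone, while tree importances also reflect interactions between features.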
3432
In addition, we can do selection while training, for example with LASSO regularization.
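
As a minimal sketch of selection during training with LASSO: the synthetic regression task and `alpha=1.0` are assumptions; in practice the regularization strength would be tuned, for example by cross-validation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Toy regression task (hypothetical): 10 features, 3 of them informative.
X_l, y_l = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)
X_l = StandardScaler().fit_transform(X_l)

# The L1 penalty drives some coefficients to exactly zero during fitting.
lasso = Lasso(alpha=1.0).fit(X_l, y_l)

# Features with a nonzero coefficient are the ones the model kept.
kept = np.flatnonzero(lasso.coef_)
```

A larger `alpha` zeroes out more coefficients, so it acts as the knob for how aggressive the selection is.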
3533
