1 file changed: +16 -2 lines changed
How to select the right algorithm for the task
==============================================

+ To conclude this session, here are some practical hints for selecting
+ the right algorithm when facing a practical problem.

- Some practical hints for selecting the right algorithm when facing
- a practical problem.
+ - If the data is high-dimensional and sparse (e.g. text data), linear
+   classifiers with a bit of regularization will work well most of the time.

+ - If the data is dense and low to medium dimensional, try to further reduce
+   the dimensionality (with PCA, for instance) and try both linear and
+   non-linear models (e.g. SVC with an RBF kernel).
+
+ - ``SVC`` with a Gaussian RBF kernel and ``KMeans`` clustering can
+   benefit a lot from data normalization (with ``PCA`` or ``RandomizedPCA``
+   with ``whiten=True``). Try various values for ``n_components`` with grid
+   search to be sure not to truncate the data too aggressively.
+
+ - There is no free lunch: the best algorithm is data dependent. If
+   you try many different models, reserve a held-out evaluation set
+   that is not used during the model selection process.
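
The last two hints can be combined in a short sketch. This is only an illustration under stated assumptions, not part of the change itself: the digits dataset, the 25% held-out split, the candidate ``n_components`` values, and the 3-fold cross-validation are all arbitrary choices made for the example. It also assumes a reasonably recent scikit-learn, where ``GridSearchCV`` and ``train_test_split`` live in ``sklearn.model_selection``, and it uses plain ``PCA`` rather than ``RandomizedPCA``.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Dense, medium-dimensional data (64 pixel features per sample).
# The dataset is an assumption made for this example only.
X, y = load_digits(return_X_y=True)

# Reserve a held-out evaluation set that model selection never sees.
X_dev, X_eval, y_dev, y_eval = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Whitened PCA for normalization, followed by an SVC with an RBF kernel.
pipeline = Pipeline([
    ("pca", PCA(whiten=True, random_state=0)),
    ("svc", SVC(kernel="rbf")),
])

# Candidate values are illustrative assumptions, not recommendations.
param_grid = {"pca__n_components": [8, 16, 32, 48]}

# Model selection uses cross-validation on the development set only.
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_dev, y_dev)

best_n_components = search.best_params_["pca__n_components"]

# Score the refitted best pipeline once, on the untouched held-out set.
held_out_accuracy = search.score(X_eval, y_eval)

print("best n_components:", best_n_components)
print("held-out accuracy: %.3f" % held_out_accuracy)
```

Because the held-out set plays no role in choosing ``n_components``, its score is a less optimistic estimate than the cross-validation score reported by the grid search.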