Commit c9e087b (1 parent: d4afaed)

much friendlier to read now... add actual words instead of just code
3 files changed: +244 −176 lines

sklearn-101/README.md (+6, −28)

@@ -3,9 +3,7 @@
 2014-04-04, Josh Montague
 
 
-A short and basic introduction to the ``sklearn`` API interface and a couple of very simple examples of using an estimator on some built-in sample data.
-
-The capability of the full [``sklearn`` package](http://scikit-learn.org/stable/index.html) is pretty mind-blowing; this Notebook aims for the lowest hanging fruit, because the same framework is used for the advanced use-cases. This is certainly one of the strengths of ``sklearn``. Note that these materials do not go into explaining *what* the various estimators are doing or how the algorithm works. For those discussions, definitely see the other materials in [this repository](https://github.com/DrSkippy27/Data-Science-45min-Intros) and the [official documentation](http://scikit-learn.org/stable/documentation.html).
+A short and basic introduction to the ``sklearn`` API interface and a couple of very simple examples of using an estimator on some built-in sample data (k-nearest neighbors and linear regression).
 
 This session was built using:
 
@@ -15,34 +13,14 @@ This session was built using:
 - numpy 1.8
 - sklearn 0.14
 
-----
-
-## Overview
-
-The majority of this material was collected by combining pieces of the official docs (which are possibly the pinnacle of package documentation) and assorted other online materials. To get some of the terminology down, the general framework is as follows...
+-----
 
-Generally, we have $n$ samples of data, and trying to predict properties of unknown data. If each sample is more than a single number, it has *features*
 
-### supervised learning
-
-- regression
-
-- ex
-
-- classification
-
-- ex
-
+The capability of the full [``sklearn`` package](http://scikit-learn.org/stable/index.html) is pretty mind-blowing; this Notebook aims for the lowest hanging fruit, because the same framework is used for the advanced use-cases. This is certainly one of the strengths of ``sklearn``. Note that these materials do not go into explaining *what* the various estimators are doing or how the algorithm works. For those discussions, definitely see the other materials in [this repository](https://github.com/DrSkippy27/Data-Science-45min-Intros) and the [official documentation](http://scikit-learn.org/stable/documentation.html).
 
-### unsupervised learning
+The majority of this material was collected by combining pieces of the official docs (which are possibly the pinnacle of package documentation) and assorted other online materials. Instead of replicating a bunch of awesome information here, I'll suggest you read the [Quick Start](http://scikit-learn.org/stable/tutorial/basic/tutorial.html) and as much of the [tutorial](http://scikit-learn.org/stable/tutorial/statistical_inference/index.html) as you like before getting started with this.
 
-- clustering
+If you want to explore the IPython Notebook without running Python on your own machine, you can also view it at [nbviewer]().
 
-- ex
-
-- dimensionality reduction
+Enjoy!
 
-- ex
-
-
-training v. testing sets
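The commit's new intro says the notebook covers k-nearest neighbors and linear regression on sklearn's built-in sample data. As a rough sketch of that shared estimator API (my own illustration, not code from the notebook; the specific dataset choices are assumptions), the pattern looks like:

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Every sklearn estimator follows the same pattern:
# instantiate -> fit(X, y) -> predict(X_new).

# Classification: k-nearest neighbors on the built-in iris data.
iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(iris.data, iris.target)
labels = knn.predict(iris.data[:5])      # predicted class labels

# Regression: ordinary least squares on the built-in diabetes data.
diabetes = load_diabetes()
reg = LinearRegression()
reg.fit(diabetes.data, diabetes.target)
values = reg.predict(diabetes.data[:5])  # predicted continuous values
```

The same fit/predict interface carries over to the advanced estimators, which is the strength of the framework the README highlights.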

sklearn-101/iris_knn.png (43.3 KB, binary image; diff not shown)

sklearn-101/sklearn-101.ipynb (+238, −148): large diffs are not rendered by default.

0 commit comments
