Commit 5339710

many README updates
1 parent 0ca3cbb commit 5339710

2 files changed: +12 -36 lines changed

README.md

+12 -36
@@ -1,21 +1,8 @@
 ## Tutorial: Machine Learning with Text in scikit-learn
 
-Presented by [Kevin Markham](http://www.dataschool.io/about/) at PyCon 2016 (Portland, Oregon)
+Presented by [Kevin Markham](http://www.dataschool.io/about/) at PyCon on May 28, 2016. Watch the complete [tutorial video](https://www.youtube.com/watch?v=ZiKMIuYidY0&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=10) on YouTube.
 
-### Files
-
-* Tutorial: [notebook](tutorial.ipynb), [notebook with output](tutorial_with_output.ipynb), [script](tutorial.py), [SMS dataset](data/sms.tsv)
-* Exercise: [notebook](exercise.ipynb), [notebook with solution](exercise_solution.ipynb), [script](exercise.py), [script with solution](exercise_solution.py), [Yelp dataset](data/yelp.csv)
-
-### Welcome!
-
-This repository contains the data files and the notebooks/scripts that you will need for the tutorial.
-
-A detailed description of the tutorial is below, including a list of **required software** and **knowledge prerequisites**. If you need a refresher on any of the prerequisite material, I have listed my recommended resources.
-
-Due to slow Internet connections at the conference, you should plan to download this repository and install the required software **before arriving at the conference**.
-
-I look forward to meeting you on **Saturday, May 28 at 9:00am**! Please email me at [[email protected]](mailto:[email protected]) if you have any questions at all.
+[![Watch the complete tutorial video on YouTube](youtube.jpg)](https://www.youtube.com/watch?v=ZiKMIuYidY0&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=10 "Machine Learning with Text in scikit-learn - PyCon 2016")
 
 ### Description
 
@@ -31,6 +18,12 @@ Attendees will need to bring a laptop with [scikit-learn](http://scikit-learn.or
 
 I will be leading the tutorial using the IPython/Jupyter notebook, and have added a pre-written notebook to this repository. I have also created a Python script that is identical to the notebook, which you can use in the Python environment of your choice.
 
+### Tutorial Files
+
+* IPython/Jupyter notebooks: [tutorial.ipynb](tutorial.ipynb), [tutorial_with_output.ipynb](tutorial_with_output.ipynb), [exercise.ipynb](exercise.ipynb), [exercise_solution.ipynb](exercise_solution.ipynb)
+* Python scripts: [tutorial.py](tutorial.py), [exercise.py](exercise.py), [exercise_solution.py](exercise_solution.py)
+* Datasets: [data/sms.tsv](data/sms.tsv), [data/yelp.csv](data/yelp.csv)
+
 ### Prerequisite Knowledge
 
 Attendees to this tutorial should be comfortable working in Python, should understand the basic principles of machine learning, and should have at least basic experience with both pandas and scikit-learn. However, no knowledge of advanced mathematics is required.
@@ -60,27 +53,10 @@ In this tutorial, we'll answer all of those questions, and more! We'll start by
 
 Kevin Markham is the founder of [Data School](http://www.dataschool.io/) and the former lead instructor for [General Assembly's Data Science course](https://github.com/justmarkham/DAT8) in Washington, DC. He is passionate about teaching data science to people who are new to the field, regardless of their educational and professional backgrounds, and he enjoys teaching both online and in the classroom. Kevin's professional focus is supervised machine learning, which led him to create the popular [scikit-learn video series](https://github.com/justmarkham/scikit-learn-videos) for Kaggle. He has a degree in Computer Engineering from Vanderbilt University.
 
-### Tutorial Introduction
-
-* Required files for today:
-* Clone or download this repository: [http://bit.ly/pycon2016](http://bit.ly/pycon2016)
-* IPython/Jupyter notebooks ([tutorial.ipynb](tutorial.ipynb), [exercise.ipynb](exercise.ipynb)) or Python scripts ([tutorial.py](tutorial.py), [exercise.py](exercise.py))
-* Datasets in the `data` subdirectory ([sms.tsv](data/sms.tsv), [yelp.csv](data/yelp.csv))
-* Required software for today:
-* [scikit-learn](http://scikit-learn.org/stable/install.html) and [pandas](http://pandas.pydata.org/pandas-docs/stable/install.html) (and their dependencies)
-* [Anaconda distribution of Python](https://www.continuum.io/downloads) is an easy way to install both of these
-* Both Python 2 and 3 are welcome
-* Flash drives are available with Anaconda installers and tutorial files
-* About me:
-* Founder of Data School: [blog](http://www.dataschool.io/), [YouTube](https://youtube.com/user/dataschool)
-* Twitter: [@justmarkham](https://twitter.com/justmarkham)
-
-* How the tutorial will work
-* What we'll be learning today
-* What I expect you already know
-* Agenda
-
-### Related Resources
+
+* Twitter: [@justmarkham](https://twitter.com/justmarkham)
+
+### Recommended Resources
 
 **Text classification:**
 * Read Paul Graham's classic post, [A Plan for Spam](http://www.paulgraham.com/spam.html), for an overview of a basic text classification system using a Bayesian approach. (He also wrote a [follow-up post](http://www.paulgraham.com/better.html) about how he improved his spam filter.)

youtube.jpg (79 KB)