zstumgoren edited this page Mar 11, 2011 · 31 revisions

Welcome to the LearningPython wiki!

Basic materials

We'll be using Learning Python, 4th edition (important!) by Mark Lutz. It's about $35 and 1,100 pages. If you've ordered the book but need to get started faster, you can check out the Kindle version (Kindle app available for most computer operating systems), which has an extensive sample.

We're also using Python 2.7, which is similar to the version 2.6 discussed in the book. The book also covers Python 3.x, but many of the libraries we need don't support it yet.

Besides simply working through the book, we'll be working on special projects, including cleaning data and Web scraping. We'll keep in touch through the mailing list, which is tagged with [PythonJournos]. You can also check the mailing list archives for more information.

Install and Configuration

Package Installation

There are a number of ways to install third-party libraries for Python. Generally, the easiest method is to use a package manager that also handles downloading and installing dependencies (additional libraries that a particular module depends on). Below are the two standard package managers for Python:

Learning Python, Discussion Leaders

  • Chapters 1-3: On your own
  • Chapter 4 - Dave Gulliver (reading due Jan. 10)
  • Chapter 5 - Chris Schnaars (reading due Jan. 17)
  • Chapter 6 - Ron Campbell (reading due Jan. 24)
  • Chapter 7 - Juan-Pablo Velez (reading due Jan. 31)
  • Chapter 8 - Michelle Minkoff (reading due Feb. 7)
  • Chapter 9 - Derek Willis (reading due Feb. 14)
  • Chapter 10 - Anthony DeBarros (reading due Feb. 21)
  • Chapter 11 - Brian Bowling (reading due March 7; bumped a week due to the NICAR conference)
  • Chapter 12 - Jamie Smith Hopkins (reading due March 14)
  • Chapter 13 - Serdar Tumgoren (reading due March 21)
  • Chapter 14 - TBA (reading due March 28)
  • Chapter 15 - TBA (reading due April 4)

Tutorials

  • Install/Config (Serdar)
  • Python's Interactive Interpreter: A Sandbox for Exploring Data and Code (Serdar)
  • Working with CSVs (Serdar)
  • Working with Excel (Serdar)
  • Working with Databases (Serdar)
  • Data Wrangling 101: Scrape, Clean, Output (Serdar) -- Demonstrate how to scrape data from a multi-level website, perform some basic data cleaning and processing, and then output the data as CSV and/or insert it into a database. Aron Pilhofer has volunteered a test site with data for scraping.
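To give a feel for the scrape, clean, output flow in the Data Wrangling tutorial, here is a minimal sketch that uses only the standard library. It is not the tutorial's actual code: the sample HTML, the CellParser class and the clean() helper are all hypothetical, and a real scrape would fetch pages from a live site and would probably use a dedicated parsing library.

```python
import csv
import io
import sys

try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7

# Hypothetical HTML standing in for a page fetched from a website.
SAMPLE_HTML = """
<table>
  <tr><td> Smith, Jane </td><td>$1,200</td></tr>
  <tr><td>Doe, John</td><td> $950 </td></tr>
</table>
"""


class CellParser(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])        # start a new row
        elif tag == "td":
            self.in_cell = True
            self.rows[-1].append("")    # start a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data   # append text to the current cell


def clean(row):
    """Trim whitespace and turn a string like '$1,200' into the integer 1200."""
    name, amount = row
    return [name.strip(), int(amount.strip().lstrip("$").replace(",", ""))]


# Scrape: pull the raw cell text out of the HTML.
parser = CellParser()
parser.feed(SAMPLE_HTML)

# Clean: normalize each row.
cleaned = [clean(row) for row in parser.rows]

# Output: write CSV to an in-memory buffer (a real script would open a file).
# The csv module wants a bytes buffer on Python 2 and a text buffer on Python 3.
buf = io.BytesIO() if sys.version_info[0] == 2 else io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "amount"])
writer.writerows(cleaned)
csv_text = buf.getvalue()
print(csv_text)
```

Note that the csv module quotes the name field automatically because it contains a comma, which is one reason to use it rather than joining strings by hand.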

Resources

Database Access

Python interacts with database backends using various database interfaces, or APIs. There are different database APIs for each type of database, whether you're dealing with MySQL, Postgres, Oracle, SQLite, etc. Most of these APIs conform to a common standard known as the Python DB-API 2.0. This standardization means that you can use the same Python syntax to query databases, regardless of which API and database back end you're using. There are some minor exceptions, but on the whole the syntax is pretty uniform across the board.
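As a rough illustration of the DB-API pattern (connect, get a cursor, execute, fetch), here is a small sketch using sqlite3, which ships with Python. The table and the names in it are made up for the example. One of the minor differences between adapters is the placeholder style: sqlite3 uses "?", while other adapters may use a different marker, so check the adapter's documentation before reusing a query as-is.

```python
import sqlite3

# An in-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and data, for illustration only.
cur.execute("CREATE TABLE reporters (name TEXT, beat TEXT)")
rows = [("Ann", "courts"), ("Bob", "schools"), ("Cy", "courts")]
cur.executemany("INSERT INTO reporters (name, beat) VALUES (?, ?)", rows)
conn.commit()

# Pass values as parameters rather than building the SQL string by hand.
cur.execute("SELECT name FROM reporters WHERE beat = ? ORDER BY name", ("courts",))
courts_reporters = [row[0] for row in cur.fetchall()]
print(courts_reporters)

cur.close()
conn.close()
```

With another DB-API adapter, the connect() arguments would change (host, user, database name and so on), but the cursor, execute and fetchall calls should look much the same.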

If the developers of the official database API don't offer a version that works with your operating system and/or version of Python, you can try searching for unofficial binaries at sites like those listed below. Be warned that these binaries are not officially supported, documented, etc.

MySQL

python-mysqldb is Python's primary database adapter for MySQL. Unfortunately, as of February 2011, it supported only Python versions 2.3-2.6. There are various unofficial binaries floating around for newer Python versions, so those might be worth a try.

Installing python-mysqldb can be tricky, especially on Windows. Below are some links that might be useful.

Web Scraping
