-
Notifications
You must be signed in to change notification settings - Fork 14
abresler/NBA-Data-Stuff
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This repository is a general set of tools for parsing and analyzing NBA play-by-play data; methods are primarly those from 82games.com and basketballvalue.com, among others; data is primarily obtained from basketballvalue.com, though apparently espn has all of the data available online as well; ########################################################################################### Use: 24/03/12 Update: Just added some files to pull pages from ESPN corresponding to games' play-by-play data and box score data (mainly to get the starters, though can also use to confirm successful extraction of play-by-play data by comparing player stats). Can either provide the actual ESPN game ids, or can just provide a date (YYYYMMDD), and the code grabs the pages for games that occured on that date; more to come; 20/03/12 Update: Starting to make changes for interfacing w/ a local MySQL database. Also, going to extend the "player" class to account for "player actions", i.e. time-stamped game actions. Players will be provided with a unique ID for each game they participate in. These will be linked back to a "global" player ID for tracking over season / life (need to check into ESPN's ID assignment for players; I'm sure they have unique codings, and it would be reasonable to use these, for obvious reasons). This, I think, will reduce the effort required for more in-depth stat and performance analysis. The database for the NBA data will consist of 5 main tables initially, with other possible additions: a) Complete copy of as-is ESPN play-by-play lines with dummy index b) Modified play-by-play data for each game-player with dummy index so that given a player-game id, one can easily access the corresponding set of actions for that individual; also set up such that one can easily obtain set of all actions for a game from the table; entries are parsed such that events can easily be separated by type, etc., with no processing, only queries; c) A [player-game id, game-id, actions reference, player-global-id, player-game-stats] table to relate each player-game-id to the global player, the corresponding in-game actions, the corresponding game, and get a game summary of performance d) A [player-global-id, stats-life, player-game-id-list] table e) 02/03/12 Update: Just finished with some changes to the code, and it's up and running w/ a new 'player' and 'game' class to carry around / organize data while parsing the play-by-play (will upload new stuff soon). Still need to work out a minutes / time tracker for players and a "who's on the floor" tracker as well. 29/02/12 The NBA_runparse.py used to (yesterday, before I started changing stuff around and adding a player and game class to track data over a season) run the play-by-play parsing for a single game, obtaining the typical set of game stats (except for minutes played. I still need to create that module. Also, I need to add a module for tracking who is on the court at a given time. This would be trivial if the play-by-play listed the starting 5, but no). In other words, there is no way to run the code in the manner intended at the moment. ########################################################################################### General Resources: data downloads: http://basketballvalue.com/downloads.php espn play-by-play w/ shot distance: # not quite the correct form for this... http://sports.espn.go.com/nba/gamepackage/data/shot?gameId= phpbb: http://apbr.org/metrics/ mit sloan bb papers: http://www.sloansportsconference.com/?page_id=462 82Games.com http://82games.com/index.htm http://www.82games.com/comm30.htm These seem useful too: http://www.basketball-reference.com/ http://www.sportscity.com/
About
basic parsing and analysis of NBA data
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published