Code and data

Aaditya Dar edited this page Jun 30, 2019 · 7 revisions

Rules copied from https://web.stanford.edu/~gentzkow/research/CodeAndData.pdf

  • Automation

    • Automate everything that can be automated
    • Write a single script that executes all code from beginning to end
  • Directories

    • Separate directories by function
    • Separate files into inputs and outputs
    • Make directories portable
  • Keys

    • Store cleaned data in tables with unique, non-missing keys
    • Keep data normalized as far into your code pipeline as you can
  • Abstraction

    • Abstract to eliminate redundancy
    • Abstract to improve clarity
    • Otherwise, don’t abstract
  • Documentation

    • Don’t write documentation you will not maintain
    • Code should be self-documenting
  • Management

    • Manage tasks with a task management system
    • E-mail is not a task management system
  • Code Style

    • Make your functions shy
    • Order your functions for linear reading
    • Use descriptive names
    • Pay special attention to coding algebra
    • Make logical switches intuitive
    • Be consistent
    • Check for errors
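
The "Keys" and "Check for errors" rules can be combined into one small check that runs every time the pipeline runs. The following is a minimal sketch in Python, not code from the guide: the function name `check_keys`, its parameters, and the toy rows are all hypothetical, chosen only for illustration. It stops with an error if any row has a missing key value or if two rows share the same key.

```python
from collections import Counter


def check_keys(rows, key_fields):
    """Raise ValueError unless every row has a complete, unique key.

    rows is a list of dicts (one per table row) and key_fields lists the
    columns that together form the key. Both names are hypothetical.
    Returns the number of rows checked.
    """
    keys = []
    for i, row in enumerate(rows):
        key = tuple(row.get(field) for field in key_fields)
        # Treat None and the empty string as missing key values.
        if any(value is None or value == "" for value in key):
            raise ValueError(f"row {i} has a missing key value: {key}")
        keys.append(key)

    duplicates = [key for key, count in Counter(keys).items() if count > 1]
    if duplicates:
        raise ValueError(f"duplicate keys: {duplicates}")
    return len(keys)


# Made-up rows for illustration only; the key is (state, county).
toy_rows = [
    {"state": "A", "county": "x", "population": 10},
    {"state": "A", "county": "y", "population": 20},
    {"state": "B", "county": "x", "population": 30},
]

check_keys(toy_rows, ["state", "county"])
```

Calling a check like this right after each cleaning step means a broken key halts the run instead of silently propagating into later merges.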

Source: Matthew Gentzkow and Jesse M. Shapiro, Code and Data for the Social Sciences: A Practitioner's Guide