pairing.tex

\documentclass[letterpaper]{article}
\usepackage[pass]{geometry}
\usepackage{amsthm}
\usepackage{csquotes}
\usepackage{hyperref}
\usepackage[indent=16pt]{parskip}
\usepackage[table]{xcolor}

\renewcommand{\arraystretch}{2}

\newtheorem{theorem}{Theorem}[section]
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{corollary}[theorem]{Corollary}

\theoremstyle{definition}
\newtheorem{definition}[theorem]{Definition}

\title{Why Pairing is Faster}
\author{Michael Barinek, Tyson Gern, \& Darren Platt}
\date{\today}

\begin{document}
    \maketitle

    \begin{abstract}
        We have long discussed how pair programming, among several benefits, accelerates knowledge sharing, ensures
        collective code ownership, and increases overall developer productivity~\cite{fowler:pairing}.
        These benefits outweigh the previously assumed decrease in velocity and high cost of having two people do
        the work of one.
        Recently, we have been able to show that pairing does not negatively impact velocity.
        On the contrary, pairing can improve velocity significantly even in the short term.
        We complete more features faster by pairing because the code reviews happen in real time rather than
        asynchronously.
    \end{abstract}


    \section{Introduction}\label{sec:introduction}

    Learning any new product domain or language takes time.
    Collaborating with individuals that have a solid understanding of product domain and programming language can be
    highly beneficial for newcomers to a product.
    Pair programming, or having the two developers work side-by-side on a common problem or set of features, is one form
    of this collaboration.

    However, once developers have a solid understanding of product domain and technology stack, the question changes.
    What happens when individuals understand both product domain and technology?
    Should they still pair?

    The simple answer might be yes, because both domain and technology are constantly changing and there is always more
    to learn.
    In addition, the benefits of collective ownership alone might lead individuals to continue to pair.

    \begin{displayquote}[\cite{fowler:pairing}][\textit{Birgitta Böckeler and Nina Siessegger}]
        ``{[Pairing]} increases the chances that anyone on the team feels comfortable changing the code almost anywhere.''
    \end{displayquote}

    Many organizations attempt to layer on processes to ensure code quality and security, however in practice this
    creates an increase in the length of feedback cycles that has the opposite effect.
    In fact, research has found that lightweight change approval processes are the best way to ensure stability in a
    software product.

    \begin{displayquote}[\cite{forsgren2018}][\textit{Nicole Forsgren, Jez Humble, and Gene Kim}]
    ``external approvals were negatively correlated with lead time, deployment frequency, and restore time, and had no
    correlation with change fail rate.
    In short, approval by an external body (such as a manager or CAB) simply doesn't work to increase the stability of
    production systems, measured by the time to restore service and change fail rate.
    However, it certainly slows things down.
    It is, in fact, worse than having no change approval process at all.''
    \end{displayquote}

    In effect, pair programming promotes this concept of lightweight change approval process with fast feedback cycles.
    This alone would be an argument to continue to pair, however we have recently discovered that the largest impact
    comes from the ability to work fast with fewer parallel workstreams.


    \section{Discussion}\label{sec:discussion}

    \subsection{Serial work}\label{subsec:serial-work}

    Consider a typical 8-hour day, and assume that we have two features: $F1$, which takes four hours to complete, and
    $F2$, which takes three hours to complete.
    As is common, these features must be completed sequentially ($F1$ must be successfully completed before starting
    work on $F2$).

    \subsubsection{Pairing}\label{subsubsec:serial-pairing}

    When pairing, the day follows the pattern outlined Table~\ref{tab:serial-pair}.

    \begin{table}[h]
        \centering
        \tiny
        \begin{tabular}{ |l|l|l|l|l|l|l|l|l|l| }
            \hline
            & 8am                     & 9am                     & 10am                    & 11am                    & Lunch & 1pm                    & 2pm                    & 3pm                    & 4pm \\
            \hline
            Phil  & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ &       &      &      &      &     \\
            \hline
            Gerry & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ &       &      &      &      &     \\
            \hline
            Phil  &                         &                         &                         &                         &       & \cellcolor{red!10}$F2$ & \cellcolor{red!10}$F2$ & \cellcolor{red!10}$F2$ &     \\
            \hline
            Gerry &                         &                         &                         &                         &       & \cellcolor{red!10}$F2$ & \cellcolor{red!10}$F2$ & \cellcolor{red!10}$F2$ &     \\
            \hline
        \end{tabular}
        \caption{A typical day pairing}
        \label{tab:serial-pair}
    \end{table}

    Each team member has spent seven hours working, for a total of 14 person-hours.
    The actual time needed to complete the two features is seven hours.
    Review is thorough and quality is high.

    \subsubsection{Soloing}\label{subsubsec:serial-soloing}

    When soloing with pull-requests, the day follows the pattern outlined Table~\ref{tab:serial-solo}.

    \begin{table}[h]
        \centering
        \tiny
        \begin{tabular}{ |l|l|l|l|l|l|l|l|l|l| }
            \hline
            & 8am                     & 9am                     & 10am                    & 11am                    & Lunch & 1pm                            & 2pm                            & 3pm                            & 4pm \\
            \hline
            Phil  & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ &       &        & \cellcolor{blue!10}Rework $F1$ &        &     \\
            \hline
            Gerry &                         &                         &                         &                         &       & \cellcolor{blue!10}Review $F1$ &                                & \cellcolor{blue!10}Review $F1$ &     \\
            \hline
        \end{tabular}
        \caption{A typical day soloing}
        \label{tab:serial-solo}
    \end{table}

    Again, review is thorough and quality is high.
    Each team member has spent seven hours working, so $F1$ has taken 14 person-hours to complete.
    Unfortunately, the actual time needed to complete just one feature is seven hours, so roughly half the work was
    completed.
    We have found that, in practice, developers are not ready to review work immediately upon request,
    so the timeline in Table~\ref{tab:serial-solo-realistic} is more realistic.

    \begin{table}[h]
        \centering
        \tiny
        \begin{tabular}{ |l|l|l|l|l|l|l|l|l|l|l| }
            \hline
            & 8am                     & 9am                     & 10am                    & 11am                    & Lunch & 1pm & 2pm                            & 3pm & 4pm                            & Evening                        \\
            \hline
            Phil  & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ &       &     &        &     & \cellcolor{blue!10}Rework $F1$ &         \\
            \hline
            Gerry &                         &                         &                         &                         &       &     & \cellcolor{blue!10}Review $F1$ &     &                                & \cellcolor{blue!10}Review $F1$ \\
            \hline
        \end{tabular}
        \caption{A typical day soloing}
        \label{tab:serial-solo-realistic}
    \end{table}

    In this case between eight and ten hours of actual time are needed to complete a story.
    We have also found evening review is very common;
    an attempt to keep work flowing at the risk of reducing quality.

    \subsection{Parallel work}\label{subsec:parallel-work}

    While some work must be done serially, it is also common to have features in a backlog that can be worked on in
    parallel.
    Consider the same 8-hour day with two features: $F1$, which takes four hours to complete, and
    $F2$, which takes three hours to complete.
    This time, assume that the features can be completed in parallel.

    \subsubsection{Pairing}\label{subsubsec:parallel-pairing}

    When pairing, the day follows the same pattern outlined Table~\ref{tab:serial-pair}.

    \subsubsection{Soloing}\label{subsubsec:parallel-soloing}

    When soloing with pull-requests, the day follows the pattern outlined Table~\ref{tab:parallel-solo}.

    \begin{table}[h]
        \centering
        \tiny
        \begin{tabular}{ |l|l|l|l|l|l|l|l|l|l| }
            \hline
            & 8am                     & 9am                     & 10am                    & 11am                    & Lunch & 1pm                            & 2pm                            & 3pm                            & 4pm \\
            \hline
            Phil  & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ & \cellcolor{blue!10}$F1$ &       & \cellcolor{red!10}Review $F2$ & \cellcolor{blue!10}Rework $F1$ & \cellcolor{red!10}Review $F2$ &     \\
            \hline
            Gerry & \cellcolor{red!10}$F2$  & \cellcolor{red!10}$F2$  & \cellcolor{red!10}$F2$  & &       & \cellcolor{blue!10}Review $F1$ & \cellcolor{red!10}Rework $F2$ & \cellcolor{blue!10}Review $F1$ &     \\
            \hline
        \end{tabular}
        \caption{A typical day soloing}
        \label{tab:parallel-solo}
    \end{table}

    Here, the results are more similar to pairing.
    Each team member has spent seven hours working, for a total of 14 person-hours.
    The actual time needed to complete the two features is seven hours.

    \subsection{Comparison}\label{subsec:comparison}

    When paring the serial or parallel nature of features has no impact on the actual time needed to
    complete them.
    When soloing, features must be parallelizable to achieve the same productivity.
    Each communication point has the risk of being delayed or a potential misunderstanding.

    So why is pairing faster?
    Because work is not always parallelizable, and attempting to parallelize too much comes with significant risk.
    Simply put: Pairing reduces technical execution risk.
    This includes both the risks associated with non-parallelizable features and the risks of imperfect communication
    and coordination.

    To go fast soloing with pull-requests you constantly need one stream of work for every developer.
    Based on the past few decades of building software for startups and enterprise companies, we are unable to name a
    single product that has had absolutely no dependencies between features.


    \section{Formal proof}\label{sec:formal-proof}

    \subsection{Workflows}\label{subsec:workflows}

    We begin with a series of fundamental definitions to help us define the pairing and pull request workflows.

    \subsubsection{User Stories}\label{subsubsec:user-stories}

    \begin{definition}
        A \textit{user story} is an independently-deployable unit of work that delivers business value.
    \end{definition}

    User stories are the smallest unit of work tracked by software teams.
    By keeping user stories small, teams ensure that business owners have full visibility and control over the order
    in which stories are delivered.
    This ensures that teams are always working on the most important features.

    \begin{definition}
        A \textit{well-functioning team} splits their work into the smallest user stories as possible in order to stay
        flexible and deliver incremental value.
        The code for each user story should be able to be completed within four to eight hours.
    \end{definition}

    In order to track the size of user stories, well-functioning teams estimate story size.

    \begin{definition}
        The \textit{point estimate} of a user story represents the well-functioning team's judgment of the story's
        complexity.
    \end{definition}

    While point estimates are not tied directly to time, the number of points that a well-functioning team can complete
    in a week is relatively constant.
    For the sake of simplicity we'll refer to $n$-hour stories (which take developers $n$ hours to code), assuming that
    the time/complexity correlation has been calculated.

    It is important to call out that a user story is not finished until other developers can build off its results and
    it is deployed to a production-like environment.
    This occurs when the code is merged into the mainline branch.

    \begin{definition}
        A user story is \textit{finished} once the code is integrated into the mainline branch.
    \end{definition}

    \subsubsection{Pair Programming}\label{subsubsec:pairing}

    Pair programming is style of work where two developer work together on a user story while controlling the same machine.
    Code review happens in real time during pair programming;
    as one developer authors the code the other constantly evaluates design decisions and watches for errors.
    This short feedback loop allows for high quality code and rapid knowledge transfer.

    \begin{definition}
        The \textit{pairing workflow} consists of developers working in pairs, conducting live code review, and
        committing to the mainline branch once the code is complete.
    \end{definition}

    As code review happens in real time, the code written for a story is ready to integrate into the mainline as soon as
    it is complete.

    \begin{lemma}
        \label{lemma:pair}
        Two developers working in the pairing workflow will finish an $n$-hour story in $n$ hours.
    \end{lemma}
    \begin{proof}
        Each $n$-hour story takes $n$ hours to code.
        Since code review is completed live while pairing, the code is merged into the mainline branch in $n$ hours.
    \end{proof}

    \subsubsection{Pull Requests}\label{subsubsec:pull-requests}

    In a pull request workflow, developers work individually, completing the code for a given user story before a formal
    code review.
    Once the code is complete, it must be approved by another developer on the team before it can be merged into the
    mainline branch.

    \begin{definition}
        In the \textit{pull request workflow} each developer works individually and submits the code for each user story
        asynchronous review by others through pull requests.
        Another developer thoroughly reviews the pull request and suggests changes.
        The author reviews these changes, updates the pull request, and merges the code into the mainline branch.
    \end{definition}

    Other developers on the team are often busy with other tasks, so there is often a delay before the code is reviewed.

    \begin{lemma}
        \label{lemma:solo}
        A developer working in the pull request workflow will complete an $n$-hour story in $2n$ hours.
    \end{lemma}
    \begin{proof}
        In order to complete a thorough code review, the reviewer spend at least $\frac{n}{4}$ hours reviewing the pull
        request, and the author must the same amount of time reviewing the changes and applying updates.
        To allow for the asynchronous nature of the pull request workflow, we must allow for at least $\frac{n}{2}$
        hours in waiting and communication time.
    \end{proof}

    \subsection{Parallelization}\label{subsec:parallelization}

    We next examine the shape of a team's work, or how developers complete a series of user stories over time.

    \begin{definition}
        A \textit{backlog} is an ordered collection of user stories.
    \end{definition}

    In any non-trivial project there are dependencies between stories that prevent one story from being started until
    another story has been completed.

    \begin{definition}
        A user story $\alpha_1$ \textit{depends on} a user story $\alpha_2$ if developers must wait to start work on
        $\alpha_1$ until $\alpha_2$ is finished.
    \end{definition}

    The pace at which a team moves through a backlog is directly tied to the amount of business value that the team
    delivers.
    Teams strive to deliver value at the fastest pace that is sustainable over an extended period of time.

    \begin{definition}
        A team's \textit{velocity} is the speed at which it delivers user value.
        We measure velocity as the rate at which the teams finishes stories, where each story is weighted by its
        estimate.
        If, in a given day, a team completes stories $\{\alpha_1, \alpha_2,\dots,\alpha_k\}$ with estimates
        $\{n_1, n_2,\dots,n_k\}$ then the team's daily velocity is $\sum_1^k n_k$.
    \end{definition}

    In a team of multiple developers, velocity is maximized if all developers are able to stay busy with story work.
    This is easier said than done.
    Teams must minimize the cases where the complex web of dependencies between stories in a backlog block some
    developers from story work.

    \begin{definition}
        A backlog is \textit{parallelizable} into $n$ workstreams (at a given point in time) if there are $n$ stories
        able to be started, reviewed, or revised.
    \end{definition}

    In the fortunate case that stories are highly parallelizable, teams move quickly regardless of their working model.

    \begin{lemma}
        \label{lemma:parallel}
        A team of $2n$ developers with a backlog parallelizable into at least $2n$ workstreams has an average daily velocity of $4n$,
        regardless of working model (assuming an 8-hour day).
    \end{lemma}
    \begin{proof}
        By Lemma~\ref{lemma:pair} each pair will complete $8n$ worth of stories in an 8-hour day and by
        Lemma~\ref{lemma:solo} each developer will complete $4n$ worth of stories in an 8-hour day, giving each team an
        average daily velocity of $4n$.
    \end{proof}

    The power of pairing is the ability for a team to work quickly on a backlog, even when parallelization is limited.

    \begin{theorem}
        A well-functioning team of developers working as pairs is as fast or faster than a team of solo developers using
        pull requests.
    \end{theorem}
    \begin{proof}
        Suppose a team of $2n$ developers has a backlog that is parallelizable into $m$ workstreams.
        We will examine three different cases for the size $m$, and show that the team's velocity working in pairs is at
        least as fast as the team's velocity using pull requests.

        Suppose that $m\geq 2n$.
        Then by Lemma~\ref{lemma:parallel} the team achieves an equal velocity regardless of working model.

        Next, suppose that $n\leq m < 2n$.
        In a team working in a pairing model, each of the $n$ pairs are able to work unrestricted, so the team has a
        daily velocity of $4n$.
        In a team working with pull requests, only $m$ developers are busy at any point in time, so the team has a daily
        velocity of $2m$.
        Using the original assumption that $m < 2n$ we see that $2m < 4n$, so the team working in the pairing model is
        faster.

        Finally, suppose that $0 < m < n$
        In a team working in a pairing model, $m$ pairs are able to work unrestricted, thus the team has a daily
        velocity of $4m$.
        In a team working with pull requests, only $m$ developers are busy at any point in time, so the team has a daily
        velocity of $2m$.
        As before, the team working in a pairing model is faster.
    \end{proof}


    \section{Summary}\label{sec:summary}

    When paring, dependencies between stories in a backlog have less of an impact on team velocity.
    The impact of dependencies is much higher in a pull request workflow, meaning stories must be highly parallelizable
    to achieve the same productivity.

    \bibliographystyle{plain}
    \bibliography{refs}
\end{document}