chapter_notes/BDA_notes_ch12.tex

\documentclass[a4paper,11pt,english]{article}

%\usepackage{babel}
\usepackage[utf8]{inputenc}
%\usepackage[T1]{fontenc}
\usepackage{newtxtext} % times
\usepackage{microtype}
\usepackage{amsmath}
\usepackage[varqu,varl]{inconsolata} % typewriter
\usepackage{url}
\urlstyle{same}
\usepackage{xcolor}

\usepackage[bookmarks=false]{hyperref}
\hypersetup{%
  bookmarksopen=true,
  bookmarksnumbered=true,
  pdftitle={Bayesian data analysis},
  pdfsubject={Comments},
  pdfauthor={Aki Vehtari},
  pdfkeywords={Bayesian probability theory, Bayesian inference, Bayesian data analysis},
  pdfstartview={FitH -32768},
  colorlinks=true,
  linkcolor=blue,
  citecolor=blue,
  filecolor=blue,
  urlcolor=blue
}


% if not draft, smaller printable area makes the paper more readable
\topmargin -4mm
\oddsidemargin 0mm
\textheight 225mm
\textwidth 160mm

%\parskip=\baselineskip

\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\var}{var}
\DeclareMathOperator{\Sd}{Sd}
\DeclareMathOperator{\sd}{sd}
\DeclareMathOperator{\Bin}{Bin}
\DeclareMathOperator{\Beta}{Beta}
\DeclareMathOperator{\Invchi2}{Inv-\chi^2}
\DeclareMathOperator{\NInvchi2}{N-Inv-\chi^2}
\DeclareMathOperator{\logit}{logit}
\DeclareMathOperator{\N}{N}
\DeclareMathOperator{\U}{U}
\DeclareMathOperator{\tr}{tr}
%\DeclareMathOperator{\Pr}{Pr}
\DeclareMathOperator{\trace}{trace}

\pagestyle{empty}

\begin{document}
\thispagestyle{empty}

\section*{Bayesian data analysis -- reading instructions 12} 
\smallskip
{\bf Aki Vehtari}
\smallskip

\subsection*{Chapter 12}

Outline of the chapter 12
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item {\color{gray}12.1 Efficient Gibbs samplers (not part of the course)}
\item {\color{gray}12.2 Efficient Metropolis jump rules (not part of the course)}
\item {\color{gray}12.3 Further extensions to Gibbs and Metropolis (not part of the course)}
\item 12.4 Hamiltonian Monte Carlo (used in Stan)
\item 12.5 Hamiltonian dynamics for a simple hierarchical model (read through)
\item 12.6 Stan: developing a computing environment (read through)
\end{list}

\noindent
There are only 8 pages to read (sections 12.4-12.6) what is inside Stan.\\


\noindent
R and Python demos at \url{https://avehtari.github.io/BDA_course_Aalto/demos.html}
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item demo12\_1 (in R): The No-U-Turn Sampler (NUTS) / Dynamic Hamiltonian Monte Carlo
\item CmdStanR Demos \url{http://avehtari.github.io/BDA_R_demos/demos_rstan/cmdstanr_demo.html}
\item PyStan demos \url{https://github.com/avehtari/BDA_py_demos/blob/master/demos_pystan/pystan_demo.ipynb}
\end{list}


\subsection*{MCMC animations}

These don't include the specific version of dynamic HMC in Stan, but are useful illustrations anyway.
 \begin{itemize}
\item demo12\_1 (in R): The No-U-Turn Sampler (NUTS) / Dynamic Hamiltonian Monte Carlo
 \item Markov Chains: Why Walk When You Can Flow?\\ \url{http://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/}
 \item MCMC animation site by Chi Feng \url{https://chi-feng.github.io/mcmc-demo/}
 \end{itemize}


\subsection*{Hamiltonian Monte Carlo}

An excellent review of static HMC (the number of steps in dynamic
simulation are not adaptively selected) is
\begin{itemize}
\item Radford Neal (2011). MCMC using Hamiltonian dynamics. In Brooks
  et al (ed), {\em Handbook of Markov Chain Monte Carlo}, Chapman \&
  Hall / CRC Press. Preprint \url{https://arxiv.org/pdf/1206.1901.pdf}.
\end{itemize}

Stan uses a variant of dynamic Hamiltonian Monte Carlo (using adaptive
number of steps in the dynamic simulation), which has been further
developed since BDA3 was published. The first dynamic HMC variant was NUTS
\begin{itemize}
\item Matthew D. Hoffman, Andrew Gelman (2014). The No-U-Turn Sampler:
  Adaptively Setting Path Lengths in Hamiltonian Monte Carlo. {\em
    JMLR},
  15:1593--1623 \url{http://jmlr.org/papers/v15/hoffman14a.html}.
\end{itemize}
The No-U-Turn Sampler gave the name NUTS which you can see often
associated with Stan, but the current variant implemented
in Stan has some further developments described (mostly) in
\begin{itemize}
\item Michael Betancourt (2018). A Conceptual Introduction to
  Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434
  \url{https://arxiv.org/abs/1701.02434}.
\end{itemize}

\subsection*{Divergences and BFMI}

\begin{itemize}
\item Divergence diagnostic checks whether the discretized dynamic
simulation has problems due to fast varying density. See more in a
case study
\url{http://mc-stan.org/users/documentation/case-studies/divergences_and_bias.html}
\item Brief Guide to Stan's Warnings
  \url{https://mc-stan.org/misc/warnings.html} provides summary of
  available convergence diagnostics in Stan,d how to interpret them,
  and suggestions for solving sampling problems.
\end{itemize}

\subsection*{Further information about Stan}

\begin{itemize}
\item \url{http://mc-stan.org/} \& \url{http://mc-stan.org/documentation/}
  \begin{itemize}
  \item I recommend to start with these
    \begin{itemize}
    \item Bob Carpenter, Andrew Gelman, Matt Hoffman, Daniel Lee, Ben
      Goodrich, Michael Betancourt, Marcus A. Brubaker, Jiqiang Guo,
      Peter Li, and Allen Riddell (2015) In press for Journal of
      Statistical Software. Stan: A Probabilistic Programming
      Language. \url{http://www.stat.columbia.edu/~gelman/research/published/stan-paper-revision-feb2015.pdf}
    \item Andrew Gelman, Daniel Lee, and Jiqiang Guo (2015) Stan: A
      probabilistic programming language for Bayesian inference and
      optimization. In press, Journal of Educational and Behavior
      Science. \url{http://www.stat.columbia.edu/~gelman/research/published/stan_jebs_2.pdf}
    \end{itemize}
  \item Basics of Bayesian inference and Stan, parts 1+2 Jonah Gabry \& Lauren Kennedy \url{https://www.youtube.com/playlist?list=PLuwyh42iHquU4hUBQs20hkBsKSMrp6H0J}
  \item Modeling Language User's Guide and Reference Manual (more complete reference with lot's of examples) \url{https://github.com/stan-dev/stan/releases/download/v2.17.0/stan-reference-2.17.0.pdf}
  \end{itemize}
\end{itemize}

\subsection*{Compiler and transpiler}

This is a minor comment on the terminology. As a shorthand it's common to see mentioned just Stan compiler, but sometimes the transpiler term is also mentioned as in the slides for this part.

Wikipedia (\url{https://en.wikipedia.org/wiki/Source-to-source_compiler}):\\
\textit{A source-to-source translator, source-to-source compiler (S2S compiler), transcompiler, or transpiler[1] is a type of translator that takes the source code of a program written in a programming language as its input and produces an equivalent source code in the same or a different programming language. A source-to-source translator converts between programming languages that operate at approximately the same level of abstraction, while a traditional compiler translates from a higher level programming language to a lower level programming language.}

So it is more accurate to say that the Stan model code is first transpiled to a C++ code, and then that C++ code is compiled to machine code to create an executable program. Cool thing about the new stanc3 transpiler (\url{https://github.com/stan-dev/stanc3}) is that it can create also, for example, LLVM IR or Tensorflow code.

Using transpiler and compiler allows to develop Stan language to be good for writing models, but get the benefit of speed and external libraries of C++, Tensorflow, and whatever comes in the future.

\end{document}

%%% Local Variables: 
%%% TeX-PDF-mode: t
%%% TeX-master: t
%%% End: