Bayesian statistics is conceptually very simple: we have the knowns and the unknowns, and we use Bayes' theorem to condition the latter on the former. If we are lucky, this process reduces our uncertainty about the unknowns. Generally, we refer to the knowns as data and treat them as constants, and to the unknowns as parameters and treat them as probability distributions.
PyMC3 primer
a library for probabilistic programming
uses NumPy and Theano
Theano is a numerical computation library (widely used for deep learning) that supplies the automatic differentiation PyMC3 needs for sampling
Theano also compiles the code to C for faster execution
Theano is no longer developed, but the PyMC devs are currently maintaining it
the next version of PyMC will use a different backend
Flipping coins the PyMC3 way
make synthetic coin flipping data, but we know the real $\theta$ value
np.random.seed(123)
trials = 4
theta_real = 0.35
data = stats.bernoulli.rvs(p=theta_real, size=trials)
data
array([1, 0, 0, 0])
Model specification
need to specify the likelihood function and prior probability distribution
likelihood: binomial distribution with $n=1$ and $p=\theta$
prior: beta distribution with $\alpha=1$ and $\beta=1$
this beta distribution is equivalent to a uniform distribution on $[0,1]$
$$
\theta \sim \text{Beta}(\alpha, \beta) \\
y \sim \text{Bern}(p=\theta)
$$
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [θ]
100.00% [4000/4000 00:05<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 16 seconds.
Pushing the inference button
the last line of the model specification above is "pressing the inference button"
asks for 1,000 samples from the posterior distribution
Summarizing the posterior
use plot_trace() to see the distribution of sampled values for $\theta$ and the MCMC chains
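`plot_trace()` comes from ArviZ. A sketch with synthetic samples standing in for a real PyMC3 trace:

```python
import numpy as np
import arviz as az

# synthetic posterior samples standing in for a trace,
# shaped (chains, draws) as ArviZ expects
rng = np.random.default_rng(0)
fake_trace = {"θ": rng.beta(2, 4, size=(2, 1000))}

# one row per variable: a KDE of the samples (left) and the
# raw chain values in sampling order (right)
axes = az.plot_trace(fake_trace)
```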
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, µ]
100.00% [4000/4000 00:04<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 12 seconds.
(trace plot: KDE and sampled values for µ and σ)
/usr/local/Caskroom/miniconda/base/envs/bayesian-analysis-with-python_e2/lib/python3.9/site-packages/pymc3/sampling.py:1707: UserWarning: samples parameter is smaller than nchains times ndraws, some draws and/or chains may not be represented in the returned posterior predictive sample
warnings.warn(
100.00% [100/100 00:00<00:00]
arviz.data.io_pymc3 - WARNING - posterior predictive variable y's shape not compatible with number of chains and draws. This can mean that some draws or even whole chains are not represented.
from the plot, see that the mean is shifted to the right and variance is quite high
see a better model for this data in the next section
Robust inferences
Student's t-distribution
instead of removing outliers, replace the Gaussian with a Student's t-distribution
has three parameters: location (the mean), scale, and degrees of freedom ($\nu \in (0, \infty)$)
call $\nu$ the normality parameter because it controls how "normal-like" the distribution is
$\nu = 1$: heavy tails; this special case is the Cauchy (Lorentz) distribution
as $\nu$ gets larger, the distribution approaches a Gaussian
for $\nu \leq 1$: no defined mean
for $\nu \leq 2$: no defined variance (or std. dev.)
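These tail and moment properties are easy to check numerically with SciPy (illustrative values only):

```python
from scipy import stats

# ν = 1 is the Cauchy distribution: far more mass in the tails than a Gaussian
tail_t1 = stats.t(df=1).sf(10)   # P(X > 10)
tail_norm = stats.norm.sf(10)

# for ν > 2 the variance is ν / (ν - 2), which approaches the Gaussian's 1
var_t30 = stats.t(df=30).var()
```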
with pm.Model() as model_t:
    µ = pm.Uniform("µ", 40, 75)
    σ = pm.HalfNormal("σ", sd=10)
    ν = pm.Exponential("ν", 1/30)  # Exponential takes the rate (1/mean), so ν has mean 30
    y = pm.StudentT("y", mu=µ, sd=σ, nu=ν, observed=data)
    trace_t = pm.sample(1000)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [ν, σ, µ]
100.00% [4000/4000 00:07<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 14 seconds.
/usr/local/Caskroom/miniconda/base/envs/bayesian-analysis-with-python_e2/lib/python3.9/site-packages/pymc3/sampling.py:1707: UserWarning: samples parameter is smaller than nchains times ndraws, some draws and/or chains may not be represented in the returned posterior predictive sample
warnings.warn(
100.00% [100/100 00:00<00:00]
arviz.data.io_pymc3 - WARNING - posterior predictive variable y's shape not compatible with number of chains and draws. This can mean that some draws or even whole chains are not represented.
Groups comparison
this author emphasizes using effect size and uncertainty over just presenting a p-value (or some other yes/no indicator)
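One way to report an effect size together with its uncertainty is to compute it per posterior sample, e.g. Cohen's d. A sketch with made-up posterior samples (the group names and numbers are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical posterior samples for two group means and a pooled std. dev.
μ_a = rng.normal(52.0, 0.5, size=2000)
μ_b = rng.normal(50.0, 0.5, size=2000)
σ_pool = rng.normal(3.0, 0.2, size=2000)

# computed sample-by-sample, Cohen's d carries the full posterior uncertainty,
# so we can report a distribution or credible interval rather than a bare number
cohens_d = (μ_a - μ_b) / σ_pool
d_mean, d_hdi = cohens_d.mean(), np.percentile(cohens_d, [3, 97])
```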
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, µ]
100.00% [12000/12000 00:11<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 5_000 draw iterations (2_000 + 10_000 draws total) took 20 seconds.
defined 2 hyper-priors that will influence the beta prior
instead of putting hyper-priors on the parameters $\alpha$ and $\beta$, define them indirectly through $\mu$ (the mean of the beta distribution) and $\kappa$ (the precision, or concentration, of the beta distribution; larger $\kappa$ means a narrower distribution)
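The mapping from $(\mu, \kappa)$ back to the usual parameters is $\alpha = \mu\kappa$ and $\beta = (1-\mu)\kappa$. A small sketch (the helper name `beta_params` is made up):

```python
def beta_params(mu, kappa):
    """Convert a beta distribution's mean/concentration into (alpha, beta)."""
    alpha = mu * kappa
    beta = (1 - mu) * kappa
    return alpha, beta

# e.g. mean 0.3 with concentration 10 gives Beta(3, 7);
# the implied mean α / (α + β) recovers µ exactly
a, b = beta_params(0.3, 10.0)
```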
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [θ, κ, µ]
100.00% [6000/6000 00:08<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 15 seconds.
diff = cs_data.theo.values - cs_data.exp.values
idx = pd.Categorical(cs_data["aa"]).codes
groups = len(np.unique(idx))
print(f"There are {groups} different amino acids in the data.")
There are 19 different amino acids in the data.
for comparison, we will build both a non-hierarchical model and a hierarchical model
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, µ]
100.00% [4000/4000 00:09<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 15 seconds.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [σ, µ, σ_µ, µ_µ]
100.00% [4000/4000 00:10<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 17 seconds.
/usr/local/Caskroom/miniconda/base/envs/bayesian-analysis-with-python_e2/lib/python3.9/site-packages/arviz/data/io_pymc3.py:87: FutureWarning: Using `from_pymc3` without the model will be deprecated in a future release. Not using the model will return less accurate and less useful results. Make sure you use the model argument or call from_pymc3 within a model context.
warnings.warn(