classify setosa vs. versicolor using sepal_length as the only predictor
encode the species as 0 and 1, respectively
df = iris.query("species == ('setosa', 'versicolor')")
y_0 = pd.Categorical(df["species"]).codes
x_n = "sepal_length"
x_0 = df[x_n].values
x_c = x_0 - x_0.mean()  # center the data
two deterministic variables in this model:
θ: output of the logistic function applied to µ
bd: the "decision boundary", the value of the predictor variable that separates the two classes
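The two deterministic quantities above can be sketched in plain NumPy (the values of `alpha` and `beta` here are made up for illustration, not the model's posterior estimates):

```python
import numpy as np

def logistic(z):
    """Inverse of the logit: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical point estimates, for illustration only
alpha, beta = 0.5, 4.0

# θ over a grid of centered predictor values
x = np.linspace(-2, 2, 5)
theta = logistic(alpha + beta * x)

# the decision boundary is where θ = 0.5, i.e. where α + β·x = 0
bd = -alpha / beta
```

Because `logistic(0) = 0.5`, any observation with a centered predictor above `bd` is assigned the class coded as 1.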
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [β, α]
100.00% [4000/4000 00:04<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 17 seconds.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, alpha]
100.00% [6000/6000 00:23<00:00 Sampling 2 chains, 7 divergences]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 40 seconds.
There were 7 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
Interpreting the coefficients of a logistic regression
because the model uses a non-linear link function, the effect of $\beta$ on the probability is a non-linear function of $x$
if $\beta$ is positive, then as $x$ increases, so too does $\Pr(y=1)$, but non-linearly
some algebra to understand the effect of a coefficient
$$
\theta = \text{logistic}(\alpha + X \beta), \qquad \text{logit}(z) = \log\left(\frac{z}{1-z}\right) \\
\text{logit}(\theta) = \alpha + X \beta \\
\log\left(\frac{\theta}{1-\theta}\right) = \alpha + X \beta \\
\log\left(\frac{\Pr(y=1)}{1-\Pr(y=1)}\right) = \alpha + X \beta
$$
recall: $\frac{\Pr(y=1)}{1 - \Pr(y=1)}$ = "odds"
"In a logistic regression, the $\beta$ coefficient encodes the increase in log-odds units by unit increase of the $x$ variable."
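That statement can be checked numerically: push a linear predictor through the logistic function, convert the resulting probability back to log-odds, and compare. The coefficient values here are made up for illustration:

```python
import numpy as np

alpha, beta = -1.0, 2.0  # hypothetical coefficients

def log_odds(x):
    # probability via the logistic function, then back to log-odds
    p = 1.0 / (1.0 + np.exp(-(alpha + beta * x)))
    return np.log(p / (1 - p))

# a unit increase in x raises the log-odds by exactly beta
delta = log_odds(1.3) - log_odds(0.3)
```

The difference in log-odds is constant in $x$ even though the difference in probability is not.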
Dealing with correlated variables
the author recommends scaling and standardizing all non-categorical variables, then using a Student's t-distribution for the prior
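A minimal sketch of that standardization step (the sample values are arbitrary):

```python
import numpy as np

def standardize(x):
    """Center to mean 0 and scale to standard deviation 1."""
    return (x - x.mean()) / x.std()

x = np.array([4.9, 5.1, 5.8, 6.3, 7.0])  # arbitrary example values
z = standardize(x)
```

After standardization, all predictors live on the same scale, so one weakly informative prior can serve for every coefficient.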
Dealing with unbalanced classes
logistic regression has difficulty finding the decision boundary when the classes are unbalanced
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, alpha]
100.00% [6000/6000 00:14<00:00 Sampling 2 chains, 102 divergences]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 23 seconds.
There were 68 divergences after tuning. Increase `target_accept` or reparameterize.
There were 34 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 25% for some parameters.
options to fix the problem of unbalanced data:
collect equal amounts of all classes (not always possible)
add prior information to help constrain the model
check the uncertainty of the model and run PPCs to see if the results are useful
create alternative models (explained later in this chapter)
Softmax regression
softmax regression is one way to generalize logistic regression to more than two classes
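A plain NumPy sketch of the softmax function; with two classes it reduces to the logistic function, which is the sense in which softmax regression generalizes logistic regression:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1])  # one score per class
probs = softmax(scores)             # probabilities summing to 1

# with two classes, softmax of (z, 0) equals logistic(z)
z = 1.7
two_class = softmax(np.array([z, 0.0]))[0]
logistic = 1.0 / (1.0 + np.exp(-z))
```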
Discriminative and generative models
a decision boundary can also be found generatively: compute the mean of each class and take the average of the two means as the boundary
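A small sketch of that mean-averaging boundary on simulated data (the class means and spreads are invented, loosely mimicking sepal lengths; this is not the book's dataset):

```python
import numpy as np

rng = np.random.default_rng(42)
setosa = rng.normal(5.0, 0.35, size=50)      # simulated "setosa" sepal lengths
versicolor = rng.normal(5.9, 0.50, size=50)  # simulated "versicolor" sepal lengths

# boundary as the midpoint of the two class means
bd = (setosa.mean() + versicolor.mean()) / 2
```

Unlike the logistic regression, this boundary comes from modeling each class separately rather than modeling $\Pr(y=1 \mid x)$ directly.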
Poisson regression
useful for count data
discrete, non-negative integers
Poisson distribution
the number of expected events within a given amount of time
assumes events occur independently of each other and at a fixed rate
parameterized by a single value $\mu$ (often denoted $\lambda$)
probability mass function of the Poisson distribution:
$$\Pr(X = x \mid \mu) = \frac{e^{-\mu} \mu^x}{x!}$$
$\mu$: average number of events per unit of time/space
the Poisson distribution is a special case of the binomial distribution when the number of trials $n$ is very large and the probability of success $p$ is very low
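A quick numeric check of that limiting relationship, using the standard PMFs (the choice of $\mu = 2$ and $n = 10{,}000$ is arbitrary):

```python
from math import comb, exp, factorial

def poisson_pmf(k, mu):
    return exp(-mu) * mu**k / factorial(k)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# with n large and p = mu/n small, Binomial(n, p) approaches Poisson(mu)
mu, n = 2.0, 10_000
p = mu / n
errors = [abs(binom_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(6)]
```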
The zero-inflated Poisson model
use when the data contain extra 0's that are not "real" zeros from the Poisson process (e.g., zeros due to missing data)
mixture of 2 processes:
one modeled by a Poisson with probability $\psi$
one giving extra zeros with probability $1 - \psi$
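The two-process mixture above can be written directly as a PMF; this is a sketch of the standard zero-inflated Poisson, not code from the book:

```python
from math import exp, factorial

def zip_pmf(x, psi, mu):
    """Zero-inflated Poisson: Poisson(mu) with probability psi,
    an extra structural zero with probability 1 - psi."""
    poisson = exp(-mu) * mu**x / factorial(x)
    if x == 0:
        return (1 - psi) + psi * poisson
    return psi * poisson

# the mixture is still a proper distribution: probabilities sum to 1
total = sum(zip_pmf(x, psi=0.7, mu=3.0) for x in range(50))
```

Setting $\psi = 1$ recovers the plain Poisson, so the model nests the simpler one.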
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [theta, psi]
100.00% [6000/6000 00:06<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 17 seconds.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [beta, alpha, psi]
100.00% [6000/6000 00:11<00:00 Sampling 2 chains, 0 divergences]
Sampling 2 chains for 1_000 tune and 2_000 draw iterations (2_000 + 4_000 draws total) took 17 seconds.
The acceptance probability does not match the target. It is 0.8982315566653549, but should be close to 0.8. Try to increase the number of tuning steps.
100.00% [4000/4000 00:03<00:00]