A Simple Illustration of the Most Probable Point, and an example use case: calculating the quantile of an output function.
The source files to knit the R-markdown are in the RMD folder, and the knitted html is here. The following is a trimmed-down version of the RMD for proper display on GitHub; please see the html for the complete version.
Despite criticisms, and though it goes by different names, the Most Probable Point (MPP) is a fascinating concept that arose from the field of structural reliability. Other names include Most Probable Failure Point, Most Central Limit-State Point, Most Central Failure Point, and Design Point; the last is deemed the most satisfactory. Nevertheless, we call it MPP in this markdown.
In structural reliability, a performance function $g(\mathbf{x})$ characterizes the response of the system: a negative value indicates failure and a positive value implies that the system is safe. The border between the negative and positive domains is called the limit state: $g(\mathbf{x}) = 0$.
Reliability is concerned with the probability of not failing, i.e. the complement of the probability of failure $P_f$: $R = 1 - P_f$.
The most common method to calculate the probability of failure is Monte Carlo simulation. However, the required number of samples increases drastically as $P_f$ gets small.
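To get a feel for the cost: the Monte Carlo estimate of a probability $p$ has a coefficient of variation of $\sqrt{(1-p)/(Np)}$, so the sample size needed for a fixed relative error grows roughly like $1/p$. A rough sketch (the probabilities and the 10% target below are our own illustrative choices):

``` r
# samples needed so that the Monte Carlo estimate of p has a 10% coefficient
# of variation, using CoV = sqrt((1-p)/(N*p))
p = c(1e-2, 1e-4, pnorm(-4))
N = (1 - p) / (0.1^2 * p)
data.frame(p = p, N_required = ceiling(N))
```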
The MPP provides an approximation to $P_f$ with a computation cost that does not depend on the magnitude of $P_f$, because finding the MPP is an optimization problem.
Hasofer and Lind prescribed a procedure to obtain the MPP:
- Transform all of the input random variables from the physical space into uncorrelated standardized normal random variables.
- Find the MPP. The distance from the origin to the MPP in the standardized uncorrelated space is called the Hasofer-Lind index $\beta$, also known as the reliability index.
With the First Order Reliability Method (FORM) approximation of the limit state, the probability of failure is given by $P_f \approx \Phi(-\beta)$, where $\Phi$ is the cumulative distribution function (CDF) of the standard normal distribution.
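In R, $\Phi(-\beta)$ is simply `pnorm(-beta)`; for instance, $\beta = 4$ corresponds to $P_f \approx 3.2 \times 10^{-5}$:

``` r
# FORM probability of failure for a few reliability indices
beta = 1:4
data.frame(beta = beta, Pf_FORM = pnorm(-beta))
```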
In a very restrictive case, if the performance function $g$ is linear, and therefore the limit state $g = 0$ is a hyper-plane, then the probability of failure is given exactly by $P_f = \Phi(-\beta)$, where $\beta$ is the distance from the origin to the MPP on the hyper-plane of the linear limit-state function in the uncorrelated, standardized, normally distributed input space. In more relaxed cases, when the performance function is non-linear, or even when the performance function is linear but the transformation of the input variables results in a non-linear limit-state function, $\Phi(-\beta)$ becomes only an approximation to $P_f$.
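As a quick sanity check of this exact linear case, consider a hypothetical linear performance function $g(\mathbf{u}) = b_0 + \mathbf{a}^T \mathbf{u}$ in the standardized space (the coefficients below are arbitrary, not the example studied later). The distance from the origin to the hyper-plane $g = 0$ is $\beta = b_0 / \lVert \mathbf{a} \rVert$:

``` r
set.seed(1)
# hypothetical linear performance function g(u) = b0 + a1*u1 + a2*u2
b0 = 6; a = c(1, 2)
beta = b0 / sqrt(sum(a^2))   # distance from the origin to the hyper-plane g = 0
pf_form = pnorm(-beta)       # exact for a linear limit state
# Monte Carlo check
N = 1e6
u = matrix(rnorm(2 * N), ncol = 2)
pf_mc = mean(b0 + u %*% a < 0)
c(beta = beta, pf_form = pf_form, pf_mc = pf_mc)
```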
Since $\Phi(-\beta)$ is simply a CDF value, which is (approximately) $P_f$, and since $\beta$ is the radius of the hyper-sphere of the standardized input random variables on which the MPP lies, another subtle use case of the MPP is that we can find a quantile of the output function $g$ by evaluating $g$ at the MPP. For example, if we want the output quantile at 4-sigma (i.e. at probability $\Phi(4) \approx 0.9999683$), we can find the MPP $\mathbf{u}^*$ where the limit-state surface is tangent to the radius-4 hyper-sphere of the standardized input random variables; the output quantile at 4-sigma is then $g(\mathbf{u}^*)$.
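For the same kind of hypothetical linear $g$, we have $g(\mathbf{U}) \sim \mathcal{N}(b_0, \lVert \mathbf{a} \rVert)$ in closed form, so we can see the quantile-by-MPP idea work exactly (a minimal sketch, with arbitrary coefficients):

``` r
# hypothetical linear g(u) = b0 + a1*u1 + a2*u2, so g(U) ~ Normal(b0, |a|)
b0 = 6; a = c(1, 2)
u_mpp   = 4 * a / sqrt(sum(a^2))   # MPP on the 4-sigma sphere, along the gradient
g_mpp   = b0 + sum(a * u_mpp)      # output quantile from the MPP
g_exact = qnorm(pnorm(4), mean = b0, sd = sqrt(sum(a^2)))
c(g_mpp = g_mpp, g_exact = g_exact)  # the two agree exactly
```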
The example given in the remaining sections solves for the output quantile instead of calculating $P_f$.
We picked a 2-D input example for easy visualization. Consider a non-linear performance function of two inputs, one log-normally distributed and the other normally distributed, with a known correlation coefficient between the two:

$$g(x_1, x_2) = (x_1 + x_2)^2 + 3\,(-x_1 + x_2)^2,$$

with $X_1 \sim \mathrm{Lognormal}(\mu_1 = 1.5,\ \sigma_1 = 0.5)$ (log-scale parameters), $X_2 \sim \mathcal{N}(\mu_2 = 7,\ \sigma_2 = 1.5)$, and $\rho$ = 0.65. Find the 4-sigma quantiles of $g$ using the MPP and verify them with Monte Carlo simulations.
The first step of the Hasofer-Lind procedure is to express $g(x_1, x_2)$ as $g(u_1, u_2)$, with $u_1, u_2$ being uncorrelated standard normally distributed inputs.

1. Start from the uncorrelated standardized inputs $u_1, u_2$.

2. Correlate $u_1, u_2$ to get $z_1, z_2$, with $z_1, z_2$ still being standard normally distributed, using the yet-to-be-determined $\rho_0$:
$$z_1 = u_1, \qquad z_2 = \rho_0\, u_1 + \sqrt{1 - \rho_0^2}\; u_2.$$

3. Scale and shift: $v_1 = \sigma_1 z_1 + \mu_1$, $v_2 = \sigma_2 z_2 + \mu_2$.

4. Transform normal into lognormal for the first input: $x_1 = e^{v_1}$, $x_2 = v_2$.

Given $X_1 = e^{V_1}$ with $V_1 \sim \mathcal{N}(\mu_1, \sigma_1)$, we can calculate its mean and variance:
$$\mu_{X_1} = e^{\mu_1 + \sigma_1^2/2}, \qquad \mathrm{Var}(X_1) = \left(e^{\sigma_1^2} - 1\right) e^{2\mu_1 + \sigma_1^2}.$$

Using the semi-empirical equations provided in Der Kiureghian and Liu (1986) for a (lognormal, normal) pair, with $\delta_{X_1} = \sqrt{\mathrm{Var}(X_1)}/\mu_{X_1}$ the coefficient of variation:
$$\rho_0 = \rho\; \frac{\delta_{X_1}}{\sqrt{\ln\left(1 + \delta_{X_1}^2\right)}}.$$

Then we can calculate $\rho_0$ to be substituted into step 2 above. Following steps 1 to 4 outlined above, $g$ becomes:
$$g(u_1, u_2) = \left(e^{0.5 u_1 + 1.5} + 1.5\left(\rho_0 u_1 + \sqrt{1-\rho_0^2}\, u_2\right) + 7\right)^2 + 3\left(-e^{0.5 u_1 + 1.5} + 1.5\left(\rho_0 u_1 + \sqrt{1-\rho_0^2}\, u_2\right) + 7\right)^2.$$
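As a spot check of the substitution (our own, not part of the procedure), we can evaluate $g$ both through the explicit $u \to z \to v \to x$ chain and through the transformed expression above at a random point; the two must agree. Here we hard-code $\rho_0 = 0.6928225$, the value derived in the next chunk, and the names `g_x`, `g_u` are ours:

``` r
set.seed(42)
m1 = 1.5; s1 = 0.5; m2 = 7; s2 = 1.5
rho_0 = 0.6928225            # as derived in the next code chunk
u = rnorm(2)
# route 1: transform u -> z -> v -> x, then evaluate g in the physical space
z1 = u[1]; z2 = rho_0*u[1] + sqrt(1 - rho_0^2)*u[2]
x1 = exp(s1*z1 + m1); x2 = s2*z2 + m2
g_x = (x1 + x2)^2 + 3*(-x1 + x2)^2
# route 2: the transformed g(u1, u2) above
g_u = (exp(0.5*u[1]+1.5) + (1.5*(rho_0*u[1]+sqrt(1-rho_0^2)*u[2])+7))^2 +
      3*(-exp(0.5*u[1]+1.5) + (1.5*(rho_0*u[1]+sqrt(1-rho_0^2)*u[2])+7))^2
c(g_x = g_x, g_u = g_u)      # identical up to floating point
```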
The second step of the Hasofer-Lind procedure is to find the MPP. The following code shows a crude iterative method to find the MPP on the 4-sigma hyper-sphere of the $(u_1, u_2)$ inputs, such that the vector from the origin to the MPP, $\mathbf{u}^*$, is parallel (for the maximum of $g$, i.e. the $+4\sigma$ quantile) or anti-parallel (for the minimum of $g$, i.e. the $-4\sigma$ quantile) to the gradient at the MPP, $\nabla g(\mathbf{u}^*)$.
Let us first calculate $\rho_0$, which involves the log-normal variable $X_1$:
#lognormal: m1=1.5, s1=0.5
m1= 1.5; s1= 0.5
m_x1= exp(m1 + 0.5*s1^2)                 # mean of the lognormal X1
var_x1= (exp(s1^2)-1)*exp(2*m1+s1^2)     # variance of the lognormal X1
delta_x1= sqrt(var_x1)/m_x1              # coefficient of variation of X1
f_x1= delta_x1/sqrt(log(1+delta_x1^2))   # Kiureghian-Liu factor for a (lognormal, normal) pair
rho= 0.65
rho_0= f_x1* rho
print (rho_0)
## [1] 0.6928225
We will set up the functions to calculate $g(\mathbf{u})$ and its gradient $\nabla g(\mathbf{u})$.
# calculate L2-norm
f_norm <- function(x) {
  return ( ((x %*% x)[1, 1])^0.5 )
}
# g in the transformed u-space
f_g <- function(u) {
  u1= u[1]
  u2= u[2]
  return ( (exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))^2 + 3*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))^2 )
}
# f_dg returns the u1 and u2 components of the gradient of g
f_dg <- function(u) {
  u1= u[1]
  u2= u[2]
  return (c(
    (2*(exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(0.5*exp(0.5*u1+1.5)+1.5*rho_0) + 6*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(-0.5*exp(0.5*u1+1.5)+1.5*rho_0)) ,
    (2*(exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(1.5*sqrt(1-rho_0^2)) + 6*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(1.5*sqrt(1-rho_0^2)) )) )
}
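Since the gradient was derived by hand, a cheap safeguard (our own check) is to compare `f_dg` against a central finite-difference gradient at an arbitrary test point; `fd_grad` is a helper we introduce here:

``` r
# central finite-difference gradient of f_g at a test point
fd_grad <- function(u, h = 1e-6) {
  c((f_g(u + c(h, 0)) - f_g(u - c(h, 0))) / (2*h),
    (f_g(u + c(0, h)) - f_g(u - c(0, h))) / (2*h))
}
u_test = c(1, -2)
rbind(analytic = f_dg(u_test), numeric = fd_grad(u_test))
```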
r= 4
# u1 and u2 are the initial guess of a point on the 4-sigma circle
u1= 4
u2= 0
u_ = c(u1,u2)
for (i in 1:10) {
  g= f_g(u_)
  dg_ = f_dg(u_)
  # dot: cosine between the position vector and the gradient (-> 1 at convergence)
  dot= ((u_/r) %*% (dg_/f_norm(dg_) ) )[1,1]
  print (c(u_, g, dot))
  # un is the updated point. There are two MPPs, at +/-4 sigma; we run this
  # algorithm with +/- the gradient: un = u_/r +/- dg_/f_norm(dg_)
  un = u_/r + dg_/f_norm(dg_)
  u_ = un/f_norm(un) * r
}
## [1] 4.0000000 0.0000000 3406.5736598 0.9999159
## [1] 3.99991587 -0.02594262 3407.48601719 0.99997857
## [1] 3.99980952 -0.03903605 3407.71848632 0.99999454
## [1] 3.99973957 -0.04564397 3407.77770205 0.99999861
## [1] 3.99970012 -0.04897873 3407.79278408 0.99999965
## [1] 3.99967916 -0.05066164 3407.79662523 0.99999991
## [1] 3.99966831 -0.05151093 3407.79760350 0.99999998
## [1] 3.99966277 -0.05193952 3407.79785264 0.99999999
## [1] 3.99965996 -0.05215581 3407.79791608 1.00000000
## [1] 3.99965853 -0.05226496 3407.79793224 1.00000000
From the code chunk above, the MPP for $+4\sigma$ is $(u_1^*, u_2^*) = (3.99965853, -0.05226496)$, and $g(\mathbf{u}^*) = 3407.798$.
r= 4
# u1 and u2 are the initial guess of a point on the 4-sigma circle
u1= 4
u2= 0
u_ = c(u1,u2)
for (i in 1:10) {
  g= f_g(u_)
  dg_ = f_dg(u_)
  # dot: cosine between the position vector and the gradient (-> -1 at convergence)
  dot= ((u_/r) %*% (dg_/f_norm(dg_) ) )[1,1]
  print (c(u_, g, dot))
  # un is the updated point; this time we step along minus the gradient
  un = u_/r - dg_/f_norm(dg_)
  u_ = un/f_norm(un) * r
}
## [1] 4.0000000 0.0000000 3406.5736598 0.9999159
## [1] 0.02594262 3.99991587 391.87421829 0.82363021
## [1] -3.8117762 1.2125850 65.8802208 -0.3835513
## [1] -3.8435730 -1.1076763 10.0469124 -0.8515261
## [1] -3.3963520 -2.1130057 4.4191063 -0.9812408
## [1] -3.1757451 -2.4320039 4.0103315 -0.9995167
## [1] -3.1375572 -2.4810753 4.0016311 -0.9999968
## [1] -3.134428 -2.485027 4.001576 -1.000000
## [1] -3.134211 -2.485301 4.001575 -1.000000
## [1] -3.134196 -2.485320 4.001575 -1.000000
From the code chunk above, the MPP for $-4\sigma$ is $(u_1^*, u_2^*) = (-3.134196, -2.485320)$, and $g(\mathbf{u}^*) = 4.001575$.
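Because the input space is only 2-D, we can also cross-check both values by brute-force optimization of $g$ along the 4-sigma circle, a sketch using base R `optimize` (not part of the Hasofer-Lind procedure; the search brackets are chosen around the two MPP angles found above):

``` r
# g along the 4-sigma circle, parameterized by the angle t: u = 4*(cos t, sin t)
g_circle <- function(t) f_g(4 * c(cos(t), sin(t)))
optimize(g_circle, interval = c(-pi, -1))$objective                 # minimum, ~ 4.0016
optimize(g_circle, interval = c(-1, 1), maximum = TRUE)$objective   # maximum, ~ 3407.8
```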
To recap, the $-4\sigma$ and $+4\sigma$ quantiles of $g$ approximated by the MPP method are 4.001575 and 3407.79793224, respectively. Let us now turn to Monte Carlo simulation (MCS) to verify them.
We pick the number of samples $N = 3 \times 10^7$ to have enough points to estimate the $\pm 4\sigma$ quantiles, and follow the transformation of variables in steps 1-4 of the Hasofer-Lind procedure outlined before. Please be aware that the results may change when you re-run the code chunk.
# number of samples
N= 3e7
# u-space: standardized, uncorrelated
u1= rnorm(N)
u2= rnorm(N)
#lognormal: m1=1.5, s1=0.5
#normal: m2=7, s2=1.5
m1= 1.5; s1= 0.5
m2= 7; s2= 1.5
m_x1= exp(m1 + 0.5*s1^2)
var_x1= (exp(s1^2)-1)*exp(2*m1+s1^2)
delta_x1= sqrt(var_x1)/m_x1
f_x1= delta_x1/sqrt(log(1+delta_x1^2))
rho= 0.65
rho_0= f_x1* rho
#z-space: standardized, correlated
z1= u1
z2= rho_0 * u1 + sqrt(1-rho_0^2) * u2
#v-space: scaled and shifted
v1= s1*z1 + m1
v2= s2*z2 + m2
#x-space: lognormal x1, normal x2
x1= exp(v1)
x2= v2
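Before plotting, a quick numeric spot check (our own addition) of the sample moments and correlation against their theoretical targets:

``` r
# sample moments and correlation vs. their theoretical targets
rbind(sample      = c(mean_x1 = mean(x1), sd_x1 = sd(x1),
                      mean_x2 = mean(x2), sd_x2 = sd(x2), cor = cor(x1, x2)),
      theoretical = c(m_x1, sqrt(var_x1), m2, s2, rho))
```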
Let us verify that we have $X_1 \sim \mathrm{Lognormal}(\mu_1 = 1.5, \sigma_1 = 0.5)$, $X_2 \sim \mathcal{N}(\mu_2 = 7, \sigma_2 = 1.5)$, and $\rho$ = 0.65 after the variable transformations. We plot the histograms of $X_1$ and $X_2$ below, overlaid with the corresponding densities (red curves), i.e. $\mathrm{Lognormal}(1.5, 0.5)$ and $\mathcal{N}(7, 1.5)$:
library(ggplot2)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.1.2
plothistx1 <- ggplot(data= data.frame(X1 = x1)) +
geom_histogram(mapping=aes(x=X1, y=..density..), fill="steelblue", color="black", bins=50) +
stat_function(fun=dlnorm, args=list(meanlog=m1, sdlog=s1), color="red", size=1) +
xlim(0, 20)
plothistx2 <- ggplot(data= data.frame(X2 = x2)) +
geom_histogram(mapping=aes(x=X2, y=..density..), fill="steelblue", colour="black", bins=50) +
stat_function(fun=dnorm, args=list(mean=m2, sd=s2), color="red", size=1) +
xlim(m2-6*s2, m2+6*s2)
ggarrange(plothistx1, plothistx2, ncol=2, nrow=1, labels=c("X1", "X2"))
## Warning: Removed 41703 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).
## Warning: Removed 2 rows containing missing values (geom_bar).
cor(x1,x2)
## [1] 0.6499332
Let us calculate the quantiles:
# calculate g(x)
g= (x1+x2)^2 + 3*(-x1+x2)^2
quantile(g, pnorm(-4))
## 0.003167124%
## 4.76026
quantile(g, pnorm(4))
## 99.99683%
## 3410.564
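The run-to-run variability is expected: only about 950 of the $3 \times 10^7$ samples lie beyond each 4-sigma point, so the empirical extreme quantiles are noisy:

``` r
# expected number of samples beyond each 4-sigma point
N * pnorm(-4)   # ~ 950 out of 3e7
```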
I am getting the $-4\sigma$ quantile of $g$ to be 4.631767, and the $+4\sigma$ quantile of $g$ to be 3402.384 in one of the runs. And of course the results may change as you re-run the code chunk.
gdf= data.frame(G=g)
qqall <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line(color="red")
qqn4 <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line() + xlim(-4.1, -3.9) + ylim(3.5, 5)
qqp4 <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line() + xlim(3.9, 4.1) + ylim(3300, 3500)
ggarrange(qqall, qqn4, qqp4, ncol=3, nrow=1)
## Warning: Removed 29999543 rows containing missing values (geom_point).
## Warning: Removed 2 row(s) containing missing values (geom_path).
## Warning: Removed 29999736 rows containing missing values (geom_point).
## Warning: Removed 2 row(s) containing missing values (geom_path).
Notice that the $+4\sigma$ quantile approximated from the MPP is relatively close to that from the (ideal) MCS, whereas the $-4\sigma$ quantile from the MPP approximation is not that accurate. We will see why now.
# contour of g over the standardized uncorrelated u-space
zxy = expand.grid(seq(-6,6,0.1),seq(-6,6,0.1))
g = matrix( (exp(s1*zxy$Var1+m1)+(s2*rho_0*zxy$Var1+s2*sqrt(1-rho_0^2)*zxy$Var2+m2))^2 + 3*(- exp(s1*zxy$Var1+m1) + (s2*rho_0*zxy$Var1+s2*sqrt(1-rho_0^2)*zxy$Var2+m2))^2 , length(seq(-6,6,0.1)), length(seq(-6,6,0.1)) )
contour(seq(-6,6,0.1),seq(-6,6,0.1),g, levels=c(2, 4.001575, 30, 100, 300, 1000, 3407.79793777, 5000), asp=1, xlim=c(-6,6), ylim=c(-6,6))
# plotting the 4 sigma circle
r= 4
t= seq(-pi,pi,2*pi/360)
xr= r*cos(t)
yr= r*sin(t)
lines(xr, yr, col= "red")
# MPP points
points(c(-3.134196, 3.99965853),c(-2.485320, -0.05226496), pch=19, col="blue")
abline(v=seq(-6,6), h=seq(-6,6), lty=2, col='lightgray')
From the contour plot above, and since this is a simple 2-D input example, we can tell that the contour (i.e. the limit state) through the $+4\sigma$ MPP point (G = 3407.798) is close to a linear hyper-plane, whereas the contour through the $-4\sigma$ MPP point (G = 4.001575) is much curvier and is not approximated well by a linear hyper-plane.
In summary, we illustrated that the Hasofer-Lind MPP procedure can be used to approximate an output quantile, as long as the limit state can be approximated by a hyper-plane.