Merge pull request #485 from spsanderson/development
towards #471
spsanderson authored May 4, 2024
2 parents 07bda16 + 5f05725 commit 5ed579a
Showing 50 changed files with 430 additions and 26 deletions.
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -143,7 +143,9 @@ export(util_uniform_stats_tbl)
export(util_weibull_aic)
export(util_weibull_param_estimate)
export(util_weibull_stats_tbl)
export(util_zero_truncated_poisson_param_estimate)
export(util_ztn_binomial_param_estimate)
export(util_ztp_aic)
importFrom(data.table,.SD)
importFrom(data.table,as.data.table)
importFrom(data.table,melt)
2 changes: 2 additions & 0 deletions NEWS.md
@@ -6,6 +6,8 @@ None
## New Features
1. #468 - Add function `util_negative_binomial_aic()` to calculate the AIC for the negative binomial distribution.
2. #470 - Add functions `util_ztn_binomial_param_estimate()` and `util_rztnbinom_aic()` to estimate the parameters and calculate the AIC for the zero-truncated negative binomial distribution.
3. #467 - Add function `util_zero_truncated_poisson_param_estimate()` to estimate
the parameters of the zero-truncated Poisson distribution.

## Minor Improvements and Fixes
1. Fix #468 - Update `util_negative_binomial_param_estimate()` to add the use of
127 changes: 127 additions & 0 deletions R/est-param-ztpois.R
@@ -0,0 +1,127 @@
#' Estimate Zero Truncated Poisson Parameters
#'
#' @family Parameter Estimation
#' @family Poisson
#'
#' @author Steven P. Sanderson II, MPH
#'
#' @details
#'
#' This function estimates the parameter lambda of a Zero-Truncated Poisson distribution
#' based on a vector of positive integer values `.x` (zeros have probability zero
#' under zero truncation). The Zero-Truncated Poisson distribution is a discrete
#' probability distribution that models the number of events occurring in a fixed
#' interval of time, given that at least one event has occurred.
#'
#' The estimation is performed by minimizing the negative log-likelihood of the observed
#' data `.x` under the Zero-Truncated Poisson model. The negative log-likelihood function
#' used for optimization is defined as:
#'
#' \deqn{-\sum_{i=1}^{n} \log P(X_i = x_i \mid X_i > 0, \lambda)}{-sum(log(P(X_i = x_i | X_i > 0, lambda)))}
#'
#' where \eqn{x_i}{x_i} are the observed values in `.x` and \eqn{\lambda}{lambda}
#' is the parameter of the Zero-Truncated Poisson distribution.
#'
#' The optimization process uses the `optim` function to find the value of `lambda`
#' that minimizes this negative log-likelihood. The chosen optimization method is Brent's
#' method (`method = "Brent"`) within a specified interval `[0, max(.x)]`.
#'
#' If `.auto_gen_empirical` is set to `TRUE`, the function will generate empirical data
#' statistics using `tidy_empirical()` for the input data `.x` and then combine this
#' empirical data with the estimated Zero-Truncated Poisson distribution using
#' `tidy_combine_distributions()`. This combined data can be accessed via the
#' `$combined_data_tbl` element of the function output.
#'
#' The function returns a tibble containing the estimated parameter `lambda` along
#' with other summary statistics of the input data (sample size, minimum, maximum).
#'
#' @description This function will attempt to estimate the Zero Truncated Poisson
#' lambda parameter given some vector of values `.x`. The function will return a
#' tibble output, and if the parameter `.auto_gen_empirical` is set to `TRUE`
#' then the empirical data given to the parameter `.x` will be run through the
#' `tidy_empirical()` function and combined with the estimated Zero Truncated
#' Poisson data.
#'
#' @param .x The vector of data to be passed to the function. Must be positive
#' integers, since the Zero-Truncated Poisson distribution has no mass at zero.
#' @param .auto_gen_empirical This is a boolean value of TRUE/FALSE with default
#' set to TRUE. This will automatically create the `tidy_empirical()` output
#' for the `.x` parameter and use the `tidy_combine_distributions()`. The user
#' can then plot out the data using `$combined_data_tbl` from the function output.
#'
#' @examples
#' library(dplyr)
#' library(ggplot2)
#'
#' tc <- tidy_zero_truncated_poisson() |> pull(y)
#' output <- util_zero_truncated_poisson_param_estimate(tc)
#'
#' output$parameter_tbl
#'
#' output$combined_data_tbl |>
#' tidy_combined_autoplot()
#'
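#' @return
#' A list containing `parameter_tbl`, a tibble with the estimated `lambda` and
#' summary statistics of `.x`, and, when `.auto_gen_empirical = TRUE`,
#' `combined_data_tbl`, the empirical data combined with the fitted distribution.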
#'
#' @name util_zero_truncated_poisson_param_estimate
NULL

#' @export
#' @rdname util_zero_truncated_poisson_param_estimate

util_zero_truncated_poisson_param_estimate <- function(.x, .auto_gen_empirical = TRUE) {

  # Tidyeval ----
  x_term <- as.numeric(.x)
  minx <- min(x_term)
  maxx <- max(x_term)
  n <- length(x_term)

  # Negative log-likelihood of the zero-truncated Poisson for a given lambda
  neg_loglik <- function(lambda, data) {
    -sum(actuar::dztpois(data, lambda = lambda, log = TRUE))
  }

  # Optimize to find the lambda that minimizes the negative log-likelihood
  optim_result <- stats::optim(
    par = 1, fn = neg_loglik, data = x_term,
    method = "Brent",
    lower = 0, upper = maxx
  )

  # Extract estimated lambda
  lambda_est <- optim_result$par

  # Combined empirical and fitted data ----
  if (.auto_gen_empirical) {
    te <- tidy_empirical(.x = x_term)
    td <- tidy_zero_truncated_poisson(.n = n, .lambda = round(lambda_est, 3))
    combined_tbl <- tidy_combine_distributions(te, td)
  }

  # Parameter tibble ----
  ret <- dplyr::tibble(
    dist_type = "Zero Truncated Poisson",
    samp_size = n,
    min = minx,
    max = maxx,
    lambda = lambda_est
  )

  # Return ----
  attr(ret, "tibble_type") <- "parameter_estimation"
  attr(ret, "family") <- "zero truncated poisson"
  attr(ret, "x_term") <- .x
  attr(ret, "n") <- n

  if (.auto_gen_empirical) {
    output <- list(
      combined_data_tbl = combined_tbl,
      parameter_tbl = ret
    )
  } else {
    output <- list(
      parameter_tbl = ret
    )
  }

  return(output)
}
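
Review note: the Brent search above can be cross-checked without `actuar`. The sketch below is only an illustration, not the package's implementation. It writes the zero-truncated Poisson log-pmf in base R as `dpois(x, lambda) / (1 - exp(-lambda))` and compares the optimizer's result with the root of the score equation `mean(x) = lambda / (1 - exp(-lambda))`. The helper `dztpois_log()` and the sample `x` are hypothetical, and the lower bound `1e-8` is an assumption made to keep the log-pmf finite.

```r
# Zero-truncated Poisson log-pmf in base R (hypothetical helper):
# P(X = x | X > 0) = dpois(x, lambda) / (1 - exp(-lambda)), for x = 1, 2, ...
dztpois_log <- function(x, lambda) {
  stats::dpois(x, lambda, log = TRUE) - log1p(-exp(-lambda))
}

# Illustrative sample of strictly positive counts (made-up data)
x <- c(1, 2, 2, 3, 1, 4, 2, 5, 3, 1)

# Route 1: minimise the negative log-likelihood with Brent's method
nll <- function(lambda, data) -sum(dztpois_log(data, lambda))
fit <- stats::optim(
  par = 1, fn = nll, data = x,
  method = "Brent", lower = 1e-8, upper = max(x)
)

# Route 2: solve the score equation mean(x) = lambda / (1 - exp(-lambda))
score <- function(lambda) lambda / (1 - exp(-lambda)) - mean(x)
root <- stats::uniroot(score, lower = 1e-8, upper = max(x), tol = 1e-10)

# The two estimates should agree up to optimizer tolerance
c(optim = fit$par, uniroot = root$root)
```

Because `lambda / (1 - exp(-lambda))` is always larger than `lambda`, the maximum likelihood estimate is smaller than `mean(x)`, so `max(x)` is a safe upper bound for the search interval.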
67 changes: 67 additions & 0 deletions R/utils-aic-ztpoisson.R
@@ -0,0 +1,67 @@
#' Calculate Akaike Information Criterion (AIC) for Zero-Truncated Poisson Distribution
#'
#' This function calculates the Akaike Information Criterion (AIC) for a zero-truncated Poisson distribution fitted to the provided data.
#'
#' @family Utility
#' @author Steven P. Sanderson II, MPH
#'
#' @description
#' This function estimates the parameter `lambda` of a zero-truncated Poisson distribution from the provided data using maximum likelihood estimation,
#' and then calculates the AIC value based on the fitted distribution.
#'
#' @param .x A numeric vector containing the data to be fitted to a zero-truncated Poisson distribution.
#'
#' @examples
#' library(actuar)
#'
#' # Example 1: Calculate AIC for a sample dataset
#' set.seed(123)
#' x <- rztpois(30, lambda = 3)
#' util_ztp_aic(x)
#'
#' @return
#' The AIC value calculated from the zero-truncated Poisson distribution fitted to the provided data.
#'
#' @name util_ztp_aic
NULL

#' @export
#' @rdname util_ztp_aic

util_ztp_aic <- function(.x) {
  # Validate input
  if (!is.numeric(.x) || any(!is.na(.x) & .x != as.integer(.x)) || any(.x < 0)) {
    stop("Input data (.x) must be a numeric vector of non-negative integers.")
  }

  x <- as.numeric(.x)

  # Get an initial estimate of lambda from the TidyDensity parameter estimator
  pe <- TidyDensity::util_zero_truncated_poisson_param_estimate(x)$parameter_tbl

  # Negative log-likelihood function for the zero-truncated Poisson distribution
  nll <- function(par, data) {
    lambda <- par[1]
    -sum(actuar::dztpois(data, lambda = lambda, log = TRUE))
  }

  # Fit the zero-truncated Poisson distribution to the sample data (optimization)
  fit_ztp <- stats::optim(
    pe$lambda,
    nll,
    data = x,
    method = "Brent",
    lower = 0,
    upper = 1000
  )

  # Extract the log-likelihood and the number of estimated parameters
  logLik_ztp <- -fit_ztp$value
  k_ztp <- 1 # The zero-truncated Poisson distribution has a single parameter (lambda)

  # Calculate AIC
  AIC_ztp <- 2 * k_ztp - 2 * logLik_ztp

  # Return AIC value
  return(AIC_ztp)
}
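
Review note: the AIC arithmetic can be checked by hand with base R only. The sketch below assumes the usual definition `AIC = 2 * k - 2 * logLik` with `k = 1`, since `lambda` is the only estimated parameter. `dztpois_log()` and `ztp_aic_sketch()` are hypothetical helpers, not part of TidyDensity or `actuar`, and the search interval `c(1e-8, max(x) + 1)` is an assumption chosen so the interval stays non-degenerate when every observation equals 1.

```r
# Zero-truncated Poisson log-pmf in base R (hypothetical helper)
dztpois_log <- function(x, lambda) {
  stats::dpois(x, lambda, log = TRUE) - log1p(-exp(-lambda))
}

# Hypothetical helper: AIC of a zero-truncated Poisson fit with k = 1
ztp_aic_sketch <- function(x) {
  stopifnot(is.numeric(x), all(x >= 1), all(x == round(x)))
  nll <- function(lambda) -sum(dztpois_log(x, lambda))
  # One-dimensional minimisation of the negative log-likelihood
  fit <- stats::optimize(nll, interval = c(1e-8, max(x) + 1))
  k <- 1
  # fit$objective is the minimised negative log-likelihood, so -2 * logLik = 2 * objective
  2 * k + 2 * fit$objective
}

# Illustrative sample of strictly positive counts (made-up data)
x <- c(1, 2, 2, 3, 1, 4, 2, 5, 3, 1)
aic_ztp <- ztp_aic_sketch(x)
aic_ztp
```

Counting the columns of the parameter tibble instead of the number of estimated parameters would shift every AIC value by a constant, which is harmless within one distribution but distorts comparisons across distributions with different parameter counts.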
Generated files (not rendered by default):

3 changes: 2 additions & 1 deletion man/check_duplicate_rows.Rd
3 changes: 2 additions & 1 deletion man/convert_to_ts.Rd
3 changes: 2 additions & 1 deletion man/quantile_normalize.Rd
3 changes: 2 additions & 1 deletion man/tidy_mcmc_sampling.Rd
3 changes: 2 additions & 1 deletion man/tidy_poisson.Rd
3 changes: 2 additions & 1 deletion man/tidy_zero_truncated_poisson.Rd
1 change: 1 addition & 0 deletions man/util_bernoulli_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_beta_aic.Rd
1 change: 1 addition & 0 deletions man/util_beta_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_binomial_aic.Rd
1 change: 1 addition & 0 deletions man/util_binomial_param_estimate.Rd
1 change: 1 addition & 0 deletions man/util_burr_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_cauchy_aic.Rd
1 change: 1 addition & 0 deletions man/util_cauchy_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_chisq_aic.Rd
1 change: 1 addition & 0 deletions man/util_chisquare_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_exponential_aic.Rd
1 change: 1 addition & 0 deletions man/util_exponential_param_estimate.Rd
3 changes: 2 additions & 1 deletion man/util_gamma_aic.Rd
1 change: 1 addition & 0 deletions man/util_gamma_param_estimate.Rd
