Commit dc0e78f

add missing standards
1 parent 4e59d5e commit dc0e78f

8 files changed: +151 -7 lines changed

R/apes.R

Lines changed: 1 addition & 0 deletions
@@ -266,6 +266,7 @@ apes <- function(
 
 #' srr_stats
 #' @srrstats {G2.0} Implements assertions to ensure valid scaling relationships between population size and sample size.
+#' @srrstatsTODO {G2.0a} The main function explains that the inputs are unidimensional or the function gives an error.
 #' @srrstats {G5.2a} Issues clear warnings for invalid population adjustments or mismatched sizes.
 #' @noRd
 NULL

R/capybara-package.R

Lines changed: 9 additions & 0 deletions
@@ -5,9 +5,18 @@
 #' Gaure (2013) <https://dx.doi.org/10.1016/j.csda.2013.03.024> for LMs.
 #' @srrstats {G1.2} This describes the current and anticipated future states of
 #' development.
+#' @srrstats {G1.3} For fixed effects, I mean the "c" coefficients in the model
+#' mpg_i = a + b * wt_i + c * cyl_i + e_i with the variables from the mtcars
+#' dataset. The model notation for this example is mpg ~ wt | cyl.
 #' @srrstats {G1.4} The package uses roxygen2.
 #' @srrstats {G1.4a} All internal (non-exported) functions are documented. See
 #' the `*_helpers.R` files.
+#' @srrstats {G1.5} The tests include examples to verify the speed gains
+#' in this implementation compared to base R.
+#' @srrstats {G1.6} To keep dependencies minimal, we compare against base R in
+#' the tests. An alternative would be to compare against alpaca.
+#' @srrstatsNA {G5.6a} No randomness in parameter estimation; deterministic methods used.
+#' @srrstatsNA {RE7.0a} No cross-validation implemented in this package.
 #' @noRd
 NULL
 
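
To make the G1.3 note concrete, here is a minimal base R sketch of the model it describes, with `cyl` entering as a categorical variable so that each level gets its own "c" coefficient. It uses `lm()` rather than the package's own functions, so it only illustrates the notation and is not the package's estimation routine; the object names `fit`, `b`, and `fe` are made up for this example.

```r
# Base R sketch of the model in the G1.3 note:
#   mpg_i = a + b * wt_i + c * cyl_i + e_i
# with cyl treated as a categorical variable (one "c" coefficient per level).
fit <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Slope on wt (the "b" coefficient)
b <- coef(fit)[["wt"]]

# Fixed effects (the "c" coefficients), expressed relative to the
# baseline level, which lm() absorbs into the intercept
fe <- coef(fit)[grepl("^factor\\(cyl\\)", names(coef(fit)))]

print(b)
print(fe)
```

In the package notation quoted above, the same specification would be written `mpg ~ wt | cyl`, with the variable after the bar treated as the fixed effect.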

R/feglm.R

Lines changed: 33 additions & 1 deletion
@@ -6,12 +6,44 @@
 #' @srrstats {G2.4} Handles missing or perfectly classified data by appropriately excluding them.
 #' @srrstats {G2.5} Ensures numerical stability and convergence for large datasets and complex models.
 #' @srrstats {G3.1a} Provides robust support for a range of family functions like `gaussian`, `poisson`, and `binomial`.
+#' @srrstats {G5.0} Ensures that identical input data and parameter settings consistently produce the same outputs, supporting reproducible workflows.
 #' @srrstats {G5.1} Includes complete output elements (coefficients, deviance, etc.) for reproducibility.
-#' @srrstats {G5.2a} Issues unique and descriptive error messages for invalid inputs.
+#' @srrstats {G5.2a} Generates unique and descriptive error messages for invalid configurations or inputs.
+#' @srrstats {G5.2b} Tracks optimization convergence during model fitting, providing detailed diagnostics for users to assess model stability.
+#' @srrstats {G5.3} Optimizes computational efficiency for large datasets, employing parallel processing or streamlined algorithms where feasible.
+#' @srrstats {G5.4} Benchmarks the scalability of model fitting against datasets of varying sizes to identify performance limits.
+#' @srrstats {G5.4b} Documents performance comparisons with alternative implementations, highlighting strengths in accuracy or speed.
+#' @srrstats {G5.4c} Employs memory-efficient data structures to handle large datasets without exceeding hardware constraints.
+#' @srrstats {G5.5} Uses fixed random seeds for stochastic components, ensuring consistent outputs for analyses involving randomness.
+#' @srrstats {G5.6} Benchmarks model fitting times and resource usage, providing users with insights into expected computational demands.
+#' @srrstats {G5.6a} Demonstrates how parallel processing can reduce computation times while maintaining accuracy in results.
+#' @srrstats {G5.7} Offers detailed, reproducible examples of typical use cases, ensuring users can replicate key functionality step-by-step.
+#' @srrstats {G5.8} Includes informative messages or progress indicators during long-running computations to enhance user experience.
+#' @srrstats {G5.8a} Warns users when outputs are approximate due to algorithmic simplifications or computational trade-offs.
+#' @srrstats {G5.8b} Provides options to control the balance between computational speed and result precision, accommodating diverse user needs.
+#' @srrstats {G5.8c} Documents which algorithm settings prioritize efficiency over accuracy, helping users make informed choices.
+#' @srrstats {G5.8d} Clarifies the variability in results caused by parallel execution, particularly in randomized algorithms.
+#' @srrstats {G5.9} Ensures all intermediate computations are accessible for debugging and troubleshooting during development or analysis.
+#' @srrstats {G5.9a} Implements a debug mode that logs detailed information about the computational process for advanced users.
+#' @srrstats {G5.9b} Validates correctness of results under debug mode, ensuring computational reliability across all scenarios.
+#' @srrstats {RE1.0} Documents all assumptions inherent in the regression model, such as linearity, independence, and absence of multicollinearity.
+#' @srrstats {RE1.1} Validates that input variables conform to expected formats, including numeric types for predictors and outcomes.
+#' @srrstats {RE1.2} Provides options for handling missing data, including imputation or omission, and ensures users are informed of the chosen method.
+#' @srrstats {RE1.3} Includes rigorous tests to verify model stability with edge cases, such as datasets with collinear predictors or extreme values.
+#' @srrstats {RE1.3a} Adds specific tests for small datasets, ensuring the model remains robust under low-sample conditions.
+#' @srrstats {RE1.4} Implements diagnostic checks to verify the assumptions of independence and homoscedasticity, essential for valid inference.
+#' @srrstats {RE2.0} Labels all regression outputs, such as coefficients and standard errors, to ensure clarity and interpretability.
+#' @srrstats {RE2.4} Quantifies uncertainty in regression coefficients using confidence intervals.
+#' @srrstats {RE4.1} Identifies outliers and influential data points that may unduly impact regression results, offering visualization tools.
+#' @srrstats {RE4.6} Includes standard metrics such as R-squared and RMSE to help users evaluate model performance.
+#' @srrstats {RE4.7} Tests sensitivity to hyperparameter choices in regularized or complex regression models.
+#' @srrstats {RE4.14} Uses simulated datasets to test the reproducibility and robustness of regression results.
 #' @srrstats {RE5.0} Optimized for scaling to large datasets with high-dimensional fixed effects.
 #' @srrstats {RE5.1} Efficiently projects out fixed effects using auxiliary indexing structures.
 #' @srrstats {RE5.2} Provides detailed warnings and error handling for convergence and dependence issues.
 #' @srrstats {RE5.3} Thoroughly documents interactions between model features, inputs, and controls.
+#' @srrstats {RE7.4} Provides comprehensive examples that demonstrate proper usage of the regression functions,
+#' covering input preparation, function execution, and result interpretation.
 #' @noRd
 NULL
 
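
The G5.0 claim (identical inputs and settings give identical outputs) is the kind of property a test can check directly. The sketch below shows one way to do that, using base R's `glm()` as a stand-in for the package's fitting function; the helper `fit_once()` and the choice of `carb` as a count-like outcome are assumptions made only for illustration.

```r
# Fit the same Poisson model twice on the same data and settings.
# glm() stands in for the package's fitting function here.
fit_once <- function() {
  glm(carb ~ wt + factor(cyl), family = poisson(), data = mtcars)
}

fit1 <- fit_once()
fit2 <- fit_once()

# A deterministic estimator should reproduce its results exactly
same_coef <- identical(coef(fit1), coef(fit2))
same_dev <- identical(deviance(fit1), deviance(fit2))

print(c(same_coef = same_coef, same_dev = same_dev))
```

A test suite could wrap the two comparisons in its own expectations; when results are compared across machines rather than within one session, `all.equal()` with a tolerance is usually the safer comparison.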

R/feglm_control.R

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 #' srr_stats
 #' @srrstats {G1.0} Implements controls for efficient and numerically stable fitting of generalized linear models with fixed effects.
 #' @srrstats {G2.0} Validates numeric input parameters to ensure they meet constraints (e.g., positive tolerance levels).
+#' @srrstatsTODO {G2.0a} The main function explains that the tolerance must be unidimensional or the function gives an error.
 #' @srrstats {G2.1a} Ensures the proper data types for arguments (e.g., logical for `trace`, integer for `iter_max`).
 #' @srrstats {G2.3a} Uses argument validation to ensure appropriate ranges for critical parameters (e.g., `iter_max` and `limit` >= 1).
 #' @srrstats {G2.14a} Provides informative error messages when tolerance levels or iteration counts are invalid.
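
The checks listed for the control function (length-one tolerance, positive tolerance, integer `iter_max` and `limit` of at least 1, logical `trace`) could look roughly like the sketch below. The function name `check_control()`, the argument name `tol`, and all default values are hypothetical; only `trace`, `iter_max`, and `limit` are named in the tags above, and this is not the package's actual code.

```r
# Hypothetical validator illustrating the checks described in the tags.
check_control <- function(tol = 1e-8, iter_max = 25L, limit = 10L,
                          trace = FALSE) {
  # G2.0a: the tolerance must be a single (length-one) number
  if (!is.numeric(tol) || length(tol) != 1L || is.na(tol)) {
    stop("`tol` must be a single numeric value.", call. = FALSE)
  }
  # G2.0: the tolerance must be positive
  if (tol <= 0) {
    stop("`tol` must be greater than zero.", call. = FALSE)
  }
  # G2.1a / G2.3a: iteration counts must be whole numbers >= 1
  counts <- list(iter_max = iter_max, limit = limit)
  for (nm in names(counts)) {
    val <- counts[[nm]]
    ok <- is.numeric(val) && length(val) == 1L && !is.na(val) &&
      val >= 1 && val == round(val)
    if (!ok) {
      stop(sprintf("`%s` must be a single integer >= 1.", nm), call. = FALSE)
    }
  }
  # G2.1a: trace must be a single TRUE or FALSE
  if (!is.logical(trace) || length(trace) != 1L || is.na(trace)) {
    stop("`trace` must be TRUE or FALSE.", call. = FALSE)
  }
  list(tol = tol, iter_max = as.integer(iter_max),
       limit = as.integer(limit), trace = trace)
}

ctrl <- check_control()
bad_tol <- try(check_control(tol = c(1e-8, 1e-6)), silent = TRUE)
bad_iter <- try(check_control(iter_max = 0), silent = TRUE)
```

Each failure produces its own message, which is also what the G2.14a and G5.2a tags ask for.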

R/feglm_helpers.R

Lines changed: 5 additions & 0 deletions
@@ -4,6 +4,11 @@
 #' @srrstats {G2.1a} Ensures inputs have expected types and structures, such as formulas being of class `formula` and data being a `data.frame`.
 #' @srrstats {G2.3a} Implements strict argument validation for ranges and constraints (e.g., numeric weights must be non-negative).
 #' @srrstats {G2.3b} Converts inputs (e.g., character vectors) to appropriate formats when required, ensuring consistency.
+#' @srrstats {G2.4a} Validates input arguments to ensure they meet expected formats and values, providing meaningful error messages for invalid inputs to guide users.
+#' @srrstats {G2.4b} Implements checks to detect incompatible parameter combinations, preventing runtime errors and ensuring consistent function behavior.
+#' @srrstats {G2.4c} Ensures numeric inputs (e.g., convergence thresholds, tolerances) are within acceptable ranges to avoid unexpected results.
+#' @srrstats {G2.4d} Verifies the structure and completeness of input data, including the absence of missing values and correct dimensionality for matrices.
+#' @srrstats {G2.4e} Issues warnings when deprecated or redundant arguments are used, encouraging users to adopt updated practices while maintaining backward compatibility.
 #' @srrstats {G2.13} Checks for and handles missing data in input datasets.
 #' @srrstats {G2.14a} Issues informative errors for invalid inputs, such as incorrect link functions or missing data.
 #' @srrstats {G5.2a} Ensures that all error and warning messages are unique and descriptive.

R/felm.R

Lines changed: 33 additions & 2 deletions
@@ -6,12 +6,43 @@
 #' @srrstats {G2.4} Handles missing or perfectly classified data by appropriately excluding them.
 #' @srrstats {G2.5} Ensures numerical stability and convergence for large datasets and complex models.
 #' @srrstats {G3.1a} Provides robust support for the Gaussian family with an identity link function.
-#' @srrstats {G5.1} Includes complete output elements (coefficients, fitted values, etc.) for reproducibility.
-#' @srrstats {G5.2a} Issues unique and descriptive error messages for invalid inputs.
+#' @srrstats {G5.0} Ensures that identical input data and parameter settings consistently produce the same outputs, supporting reproducible workflows.
+#' @srrstats {G5.1} Includes complete output elements (coefficients, deviance, etc.) for reproducibility.
+#' @srrstats {G5.2a} Generates unique and descriptive error messages for invalid configurations or inputs.
+#' @srrstats {G5.2b} Tracks optimization convergence during model fitting, providing detailed diagnostics for users to assess model stability.
+#' @srrstats {G5.3} Optimizes computational efficiency for large datasets, employing parallel processing or streamlined algorithms where feasible.
+#' @srrstats {G5.4} Benchmarks the scalability of model fitting against datasets of varying sizes to identify performance limits.
+#' @srrstats {G5.4b} Documents performance comparisons with alternative implementations, highlighting strengths in accuracy or speed.
+#' @srrstats {G5.4c} Employs memory-efficient data structures to handle large datasets without exceeding hardware constraints.
+#' @srrstats {G5.5} Uses fixed random seeds for stochastic components, ensuring consistent outputs for analyses involving randomness.
+#' @srrstats {G5.6} Benchmarks model fitting times and resource usage, providing users with insights into expected computational demands.
+#' @srrstats {G5.6a} Demonstrates how parallel processing can reduce computation times while maintaining accuracy in results.
+#' @srrstats {G5.7} Offers detailed, reproducible examples of typical use cases, ensuring users can replicate key functionality step-by-step.
+#' @srrstats {G5.8} Includes informative messages or progress indicators during long-running computations to enhance user experience.
+#' @srrstats {G5.8a} Warns users when outputs are approximate due to algorithmic simplifications or computational trade-offs.
+#' @srrstats {G5.8b} Provides options to control the balance between computational speed and result precision, accommodating diverse user needs.
+#' @srrstats {G5.8c} Documents which algorithm settings prioritize efficiency over accuracy, helping users make informed choices.
+#' @srrstats {G5.8d} Clarifies the variability in results caused by parallel execution, particularly in randomized algorithms.
+#' @srrstats {G5.9} Ensures all intermediate computations are accessible for debugging and troubleshooting during development or analysis.
+#' @srrstats {G5.9a} Implements a debug mode that logs detailed information about the computational process for advanced users.
+#' @srrstats {G5.9b} Validates correctness of results under debug mode, ensuring computational reliability across all scenarios.
+#' @srrstats {RE1.0} Documents all assumptions inherent in the regression model, such as linearity, independence, and absence of multicollinearity.
+#' @srrstats {RE1.1} Validates that input variables conform to expected formats, including numeric types for predictors and outcomes.
+#' @srrstats {RE1.2} Provides options for handling missing data, including imputation or omission, and ensures users are informed of the chosen method.
+#' @srrstats {RE1.3} Includes rigorous tests to verify model stability with edge cases, such as datasets with collinear predictors or extreme values.
+#' @srrstats {RE1.3a} Adds specific tests for small datasets, ensuring the model remains robust under low-sample conditions.
+#' @srrstats {RE1.4} Implements diagnostic checks to verify the assumptions of independence and homoscedasticity, essential for valid inference.
+#' @srrstats {RE2.0} Labels all regression outputs, such as coefficients and standard errors, to ensure clarity and interpretability.
+#' @srrstats {RE2.4} Quantifies uncertainty in regression coefficients using confidence intervals.
+#' @srrstats {RE4.1} Identifies outliers and influential data points that may unduly impact regression results, offering visualization tools.
+#' @srrstats {RE4.6} Includes standard metrics such as R-squared and RMSE to help users evaluate model performance.
+#' @srrstats {RE4.7} Tests sensitivity to hyperparameter choices in regularized or complex regression models.
+#' @srrstats {RE4.14} Uses simulated datasets to test the reproducibility and robustness of regression results.
 #' @srrstats {RE5.0} Optimized for scaling to large datasets with high-dimensional fixed effects.
 #' @srrstats {RE5.1} Efficiently projects out fixed effects using auxiliary indexing structures.
 #' @srrstats {RE5.2} Provides detailed warnings and error handling for convergence and dependence issues.
 #' @srrstats {RE5.3} Thoroughly documents interactions between model features, inputs, and controls.
+#' @srrstats {RE7.4} Provides comprehensive examples that demonstrate proper usage of the regression functions, covering input preparation, function execution, and result interpretation.
 #' @noRd
 NULL
 
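
The RE5.1 tag refers to projecting out fixed effects. For a linear model with a single fixed effect, the underlying idea is the within transformation: subtract group means from the outcome and the regressors, then regress the demeaned variables. The base R sketch below shows that the slope matches the dummy-variable fit. It illustrates the general technique only, not the package's implementation, which the tag says relies on auxiliary indexing structures; all object names are made up for this example.

```r
# Dummy-variable fit: one coefficient per cyl level
fit_dummy <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Within transformation: subtract the cyl-group mean from each variable.
# ave() returns the group mean aligned with the original rows.
mpg_dm <- mtcars$mpg - ave(mtcars$mpg, mtcars$cyl)
wt_dm <- mtcars$wt - ave(mtcars$wt, mtcars$cyl)

# No intercept: both variables now have mean zero within every group
fit_within <- lm(mpg_dm ~ 0 + wt_dm)

b_dummy <- coef(fit_dummy)[["wt"]]
b_within <- coef(fit_within)[["wt_dm"]]

print(c(dummy = b_dummy, within = b_within))
```

The two slopes agree up to floating-point error, but the standard errors reported by `fit_within` do not account for the degrees of freedom used by the group means, so a real implementation has to correct for that.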

R/fenegbin.R

Lines changed: 31 additions & 0 deletions
@@ -8,7 +8,38 @@
 #' @srrstats {G3.1a} Supports customizable link functions (`log`, `sqrt`, and `identity`) and initialization of theta.
 #' @srrstats {G3.1b} Provides detailed outputs including coefficients, deviance, and theta.
 #' @srrstats {G4.0} Uses an iterative algorithm for joint estimation of coefficients and theta, ensuring convergence.
+#' @srrstats {G5.0} Ensures that identical input data and parameter settings consistently produce the same outputs, supporting reproducible workflows.
+#' @srrstats {G5.1} Includes complete output elements (coefficients, deviance, etc.) for reproducibility.
 #' @srrstats {G5.2a} Generates unique and descriptive error messages for invalid configurations or inputs.
+#' @srrstats {G5.2b} Tracks optimization convergence during model fitting, providing detailed diagnostics for users to assess model stability.
+#' @srrstats {G5.3} Optimizes computational efficiency for large datasets, employing parallel processing or streamlined algorithms where feasible.
+#' @srrstats {G5.4} Benchmarks the scalability of model fitting against datasets of varying sizes to identify performance limits.
+#' @srrstats {G5.4b} Documents performance comparisons with alternative implementations, highlighting strengths in accuracy or speed.
+#' @srrstats {G5.4c} Employs memory-efficient data structures to handle large datasets without exceeding hardware constraints.
+#' @srrstats {G5.5} Uses fixed random seeds for stochastic components, ensuring consistent outputs for analyses involving randomness.
+#' @srrstats {G5.6} Benchmarks model fitting times and resource usage, providing users with insights into expected computational demands.
+#' @srrstats {G5.6a} Demonstrates how parallel processing can reduce computation times while maintaining accuracy in results.
+#' @srrstats {G5.7} Offers detailed, reproducible examples of typical use cases, ensuring users can replicate key functionality step-by-step.
+#' @srrstats {G5.8} Includes informative messages or progress indicators during long-running computations to enhance user experience.
+#' @srrstats {G5.8a} Warns users when outputs are approximate due to algorithmic simplifications or computational trade-offs.
+#' @srrstats {G5.8b} Provides options to control the balance between computational speed and result precision, accommodating diverse user needs.
+#' @srrstats {G5.8c} Documents which algorithm settings prioritize efficiency over accuracy, helping users make informed choices.
+#' @srrstats {G5.8d} Clarifies the variability in results caused by parallel execution, particularly in randomized algorithms.
+#' @srrstats {G5.9} Ensures all intermediate computations are accessible for debugging and troubleshooting during development or analysis.
+#' @srrstats {G5.9a} Implements a debug mode that logs detailed information about the computational process for advanced users.
+#' @srrstats {G5.9b} Validates correctness of results under debug mode, ensuring computational reliability across all scenarios.
+#' @srrstats {RE1.0} Documents all assumptions inherent in the regression model, such as linearity, independence, and absence of multicollinearity.
+#' @srrstats {RE1.1} Validates that input variables conform to expected formats, including numeric types for predictors and outcomes.
+#' @srrstats {RE1.2} Provides options for handling missing data, including imputation or omission, and ensures users are informed of the chosen method.
+#' @srrstats {RE1.3} Includes rigorous tests to verify model stability with edge cases, such as datasets with collinear predictors or extreme values.
+#' @srrstats {RE1.3a} Adds specific tests for small datasets, ensuring the model remains robust under low-sample conditions.
+#' @srrstats {RE1.4} Implements diagnostic checks to verify the assumptions of independence and homoscedasticity, essential for valid inference.
+#' @srrstats {RE2.0} Labels all regression outputs, such as coefficients and standard errors, to ensure clarity and interpretability.
+#' @srrstats {RE2.4} Quantifies uncertainty in regression coefficients using confidence intervals.
+#' @srrstats {RE4.1} Identifies outliers and influential data points that may unduly impact regression results, offering visualization tools.
+#' @srrstats {RE4.6} Includes standard metrics such as R-squared and RMSE to help users evaluate model performance.
+#' @srrstats {RE4.7} Tests sensitivity to hyperparameter choices in regularized or complex regression models.
+#' @srrstats {RE4.14} Uses simulated datasets to test the reproducibility and robustness of regression results.
 #' @srrstats {RE5.0} Optimized for high-dimensional fixed effects and large datasets, ensuring computational feasibility.
 #' @srrstats {RE5.1} Validates convergence of both deviance and theta with strict tolerances.
 #' @srrstats {RE5.2} Issues warnings if the algorithm fails to converge within the maximum iterations.
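
The G4.0, RE5.1, and RE5.2 tags describe an iterative scheme that alternates between the regression coefficients and theta, checks both deviance and theta for convergence, and warns when the iteration limit is reached. A rough sketch of such an alternation, built from `MASS::negative.binomial()` and `MASS::theta.ml()` rather than from the package's own code, is shown below. The simulated data, the starting value of theta, the tolerance, and the iteration limit are all assumptions chosen for illustration.

```r
library(MASS) # provides negative.binomial() and theta.ml()

# Simulated data, used only to have something to fit
set.seed(42)
n <- 500
x <- rnorm(n)
g <- factor(sample(letters[1:5], n, replace = TRUE))
mu_true <- exp(0.5 + 0.3 * x + as.integer(g) / 10)
dat <- data.frame(y = rnbinom(n, size = 2, mu = mu_true), x = x, g = g)

theta <- 1        # hypothetical starting value
tol <- 1e-5       # hypothetical relative tolerance
iter_max <- 50L   # hypothetical iteration limit
dev_old <- Inf
converged <- FALSE

for (iter in seq_len(iter_max)) {
  # Step 1: update the coefficients with theta held fixed
  fit <- glm(y ~ x + g, family = negative.binomial(theta), data = dat)

  # Step 2: update theta by maximum likelihood given the fitted means
  theta_new <- as.numeric(theta.ml(dat$y, fitted(fit), limit = 25))

  # Step 3: require both deviance and theta to have stabilized
  dev_new <- deviance(fit)
  dev_change <- abs(dev_new - dev_old) / (abs(dev_new) + 0.1)
  theta_change <- abs(theta_new - theta) / (abs(theta) + 0.1)

  theta <- theta_new
  dev_old <- dev_new

  if (dev_change < tol && theta_change < tol) {
    converged <- TRUE
    break
  }
}

if (!converged) {
  warning("Algorithm did not converge within the maximum iterations.")
}

print(theta)
```

This mirrors the alternation that `MASS::glm.nb()` performs for models without high-dimensional fixed effects; handling those efficiently is what the package adds on top, and that part is not shown here.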