diff --git a/.Rbuildignore b/.Rbuildignore index aab2a5bf..1517968c 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -17,3 +17,4 @@ dev_load.R CLAUDE.md .claude LICENSE +conda \ No newline at end of file diff --git a/.cursor/rules/python-adam-structure.mdc b/.cursor/rules/python-adam-structure.mdc deleted file mode 100644 index 89eb77f4..00000000 --- a/.cursor/rules/python-adam-structure.mdc +++ /dev/null @@ -1,162 +0,0 @@ ---- -description: -globs: -alwaysApply: false ---- -# Python ADAM Package Structure - Direct Translation from R - -## Overview - -The Python ADAM (Augmented Dynamic Adaptive Model) package in [python/](mdc:python/) is a direct translation of the R version found in [R/adam.R](mdc:R/adam.R). It implements advanced time series forecasting methods combining ETS (Error, Trend, Seasonal) and ARIMA models. - -## Package Architecture - -The Python implementation follows a modular design with clear separation of concerns: - -``` -python/smooth/ -├── __init__.py # Package initialization -└── adam_general/ - ├── __init__.py # Module initialization - ├── _adam_general.py # Low-level C++ bindings (adam_fitter, adam_forecaster) - └── core/ - ├── __init__.py # Core module initialization - ├── adam.py # Main ADAM class (user interface) - ├── checker.py # Parameter validation - ├── creator.py # Model structure creation - ├── estimator.py # Parameter estimation & model selection - ├── forecaster.py # Forecast generation - └── utils/ # Utility functions - ├── cost_functions.py # Optimization cost functions - ├── ic.py # Information criteria - ├── polynomials.py # ARIMA polynomial utilities - ├── utils.py # General utilities - └── var_covar.py # Variance-covariance calculations -``` - -## Core Components Translation - -### 1. 
Main Interface: [adam.py](mdc:python/smooth/adam_general/core/adam.py) -- **R Equivalent**: [R/adam.R](mdc:R/adam.R) main `adam()` function -- **Python Class**: `ADAM` class with scikit-learn style interface -- **Key Methods**: - - `__init__()`: Model configuration (equivalent to R function parameters) - - `fit(y, X)`: Model estimation (equivalent to R's main estimation logic) - - `predict(h, X)`: Point forecasting - - `predict_intervals()`: Prediction intervals - -### 2. Parameter Validation: [checker.py](mdc:python/smooth/adam_general/core/checker.py) -- **R Equivalent**: Parameter checking logic embedded in [R/adam.R](mdc:R/adam.R) -- **Main Function**: `parameters_checker()` - validates all input parameters -- **Validates**: Model specification, lags, persistence, initial values, ARIMA orders - -### 3. Model Creation: [creator.py](mdc:python/smooth/adam_general/core/creator.py) -- **R Equivalent**: Matrix creation logic in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `architector()`: Defines model architecture (components, lags, profiles) - - `creator()`: Builds state-space matrices (mat_wt, mat_f, vec_g, mat_vt) - - `initialiser()`: Prepares initial parameter vector and bounds for optimization - - `filler()`: Populates matrices with parameter values during optimization - -### 4. Parameter Estimation: [estimator.py](mdc:python/smooth/adam_general/core/estimator.py) -- **R Equivalent**: Estimation logic in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `estimator()`: Single model parameter estimation using NLopt - - `selector()`: Model selection from candidate pool - - Uses cost functions from [utils/cost_functions.py](mdc:python/smooth/adam_general/core/utils/cost_functions.py) - -### 5. 
Forecasting: [forecaster.py](mdc:python/smooth/adam_general/core/forecaster.py) -- **R Equivalent**: Forecasting logic in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `preparator()`: Prepares fitted model for forecasting - - `forecaster()`: Generates point forecasts and prediction intervals - -## Utility Functions Translation - -### Cost Functions: [utils/cost_functions.py](mdc:python/smooth/adam_general/core/utils/cost_functions.py) -- **R Equivalent**: Cost function logic in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `CF()`: Main cost function for optimization (likelihood, MSE, etc.) - - `log_Lik_ADAM()`: Log-likelihood calculation - -### Information Criteria: [utils/ic.py](mdc:python/smooth/adam_general/core/utils/ic.py) -- **R Equivalent**: IC calculations in [R/adam.R](mdc:R/adam.R) -- **Function**: `ic_function()` - calculates AIC, AICc, BIC, BICc - -### General Utilities: [utils/utils.py](mdc:python/smooth/adam_general/core/utils/utils.py) -- **R Equivalent**: Various utility functions in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `msdecompose()`: Multiple seasonal decomposition - - `calculate_likelihood()`: Distribution-specific likelihood calculations - - `scaler()`: Error scaling for different distributions - -### Variance-Covariance: [utils/var_covar.py](mdc:python/smooth/adam_general/core/utils/var_covar.py) -- **R Equivalent**: Variance calculations in [R/adam.R](mdc:R/adam.R) -- **Key Functions**: - - `sigma()`: Scale parameter calculation - - `covar_anal()` / `var_anal()`: Variance-covariance analysis - - `matrix_power_wrap()`: Matrix power calculations - -## Workflow Translation - -The Python implementation mirrors the R workflow: - -``` -R: adam(y, model="ZXZ", ...) -↓ -Python: ADAM(model="ZXZ", ...).fit(y).predict(h) -``` - -### Detailed Flow: -1. **Initialization**: `ADAM.__init__()` ← R function parameters -2. **Validation**: `parameters_checker()` ← R parameter validation -3. 
**Architecture**: `architector()` ← R architecture setup -4. **Creation**: `creator()` ← R matrix creation -5. **Estimation**: `estimator()` ← R optimization -6. **Forecasting**: `forecaster()` ← R forecasting logic - -## Key Differences from R - -1. **Object-Oriented**: Python uses class-based approach vs R's functional approach -2. **Scikit-learn Style**: `fit()` and `predict()` methods for consistency -3. **Type Hints**: Extensive type annotations for better IDE support -4. **Pandas Integration**: Native support for pandas Series/DataFrames -5. **NLopt Optimization**: Uses NLopt library instead of R's optim - -## Configuration Files - -- [pyproject.toml](mdc:python/pyproject.toml): Modern Python packaging configuration -- [setup.cfg](mdc:python/setup.cfg): Additional package metadata -- [CMakeLists.txt](mdc:python/CMakeLists.txt): C++ extension build configuration -- [Makefile](mdc:python/Makefile): Build automation - -## Testing and Documentation - -- [tests/](mdc:python/smooth/adam_general/tests/): Jupyter notebooks for testing -- [README.md](mdc:python/README.md): Package documentation -- [smooth_package_structure.md](mdc:python/smooth_package_structure.md): Detailed architecture documentation - -## Usage Pattern - -```python -# Python equivalent of R's adam() function -from smooth.adam_general.core.adam import ADAM - -# Initialize model (equivalent to R function parameters) -model = ADAM( - model="ANN", # R: model="ANN" - lags=[12], # R: lags=c(12) - distribution="dnorm", # R: distribution="dnorm" - ic="AICc" # R: ic="AICc" -) - -# Fit model (equivalent to R's main estimation) -model.fit(y, X) - -# Generate forecasts (equivalent to R's forecast.adam()) -forecasts = model.predict(h=12, X=X_future) -intervals = model.predict_intervals(h=12, X=X_future) -``` - -This Python implementation maintains full compatibility with the R version while providing a more modern, object-oriented interface suitable for Python ecosystems. 
- diff --git a/.cursor/rules/python-adam-technical.mdc b/.cursor/rules/python-adam-technical.mdc deleted file mode 100644 index ceff812a..00000000 --- a/.cursor/rules/python-adam-technical.mdc +++ /dev/null @@ -1,299 +0,0 @@ ---- -description: -globs: -alwaysApply: false ---- -# Python ADAM Technical Implementation Details - -## Parameter Mapping: R to Python - -### Core ADAM Function Parameters -| R Parameter | Python Parameter | Location | Notes | -|-------------|------------------|----------|-------| -| `model` | `model` | `ADAM.__init__()` | ETS model specification | -| `lags` | `lags` | `ADAM.__init__()` | Seasonal periods | -| `orders` | `ar_order`, `i_order`, `ma_order` | `ADAM.__init__()` | Split into separate parameters | -| `constant` | `constant` | `ADAM.__init__()` | Include constant term | -| `distribution` | `distribution` | `ADAM.__init__()` | Error distribution | -| `loss` | `loss` | `ADAM.__init__()` | Loss function | -| `ic` | `ic` | `ADAM.__init__()` | Information criterion | -| `bounds` | `bounds` | `ADAM.__init__()` | Parameter bounds | -| `occurrence` | `occurrence` | `ADAM.__init__()` | Intermittent data model | -| `persistence` | `persistence` | `ADAM.__init__()` | Fixed persistence parameters | -| `phi` | `phi` | `ADAM.__init__()` | Damping parameter | -| `initial` | `initial` | `ADAM.__init__()` | Initial states | -| `arma` | `arma` | `ADAM.__init__()` | ARMA parameters | -| `h` | `h` | `predict()` method | Forecast horizon | -| `holdout` | `holdout` | `ADAM.__init__()` | Use holdout sample | -| `silent` | `verbose` | `ADAM.__init__()` | Inverted logic | - -### Model Selection Parameters -| R Parameter | Python Parameter | Python Value | Notes | -|-------------|------------------|--------------|-------| -| `model="ZZZ"` | `model_do="select"` | `model="ZZZ"` | Automatic model selection | -| `model="XXX"` | `model_do="select"` | `model="XXX"` | Additive components only | -| `model="YYY"` | `model_do="select"` | `model="YYY"` | Multiplicative 
components only | -| `model="CCC"` | `model_do="combine"` | `model="CCC"` | Forecast combination | - -## Matrix Structure and Naming - -### State-Space Matrices -| Matrix | Python Name | R Equivalent | Purpose | -|--------|-------------|--------------|---------| -| State Vector | `mat_vt` | `matVt` | Contains all state components | -| Measurement Matrix | `mat_wt` | `matWt` | Maps states to observations | -| Transition Matrix | `mat_f` | `matF` | State evolution | -| Persistence Vector | `vec_g` | `vecG` | Smoothing parameters | - -### Matrix Dimensions -```python -# For ETS(A,A,A) with seasonal period 12: -mat_vt.shape = (components_all, lags_max + 1) # e.g., (14, 13) -mat_wt.shape = (obs_in_sample, components_all) # e.g., (100, 14) -mat_f.shape = (components_all, components_all) # e.g., (14, 14) -vec_g.shape = (components_all,) # e.g., (14,) -``` - -## Component Indexing System - -### ETS Components -```python -# Level component -mat_vt[0, :] = level_states -vec_g[0] = alpha # Level smoothing parameter - -# Trend component (if present) -mat_vt[1, :] = trend_states -vec_g[1] = beta # Trend smoothing parameter - -# Seasonal components (if present) -for i in range(seasonal_components): - mat_vt[2 + i, :] = seasonal_states[i] - vec_g[2 + i] = gamma[i] # Seasonal smoothing parameters -``` - -### ARIMA Components -```python -# ARIMA states follow ETS components -arima_start_idx = components_number_ets -for i in range(components_number_arima): - mat_vt[arima_start_idx + i, :] = arima_states[i] -``` - -## Parameter Vector (B) Structure - -The parameter vector B contains all estimable parameters in a specific order: - -```python -B = [ - # ETS persistence parameters - alpha, # Level smoothing (if estimated) - beta, # Trend smoothing (if estimated) - gamma_1, ..., # Seasonal smoothing (if estimated) - - # Damping parameter - phi, # Damping (if estimated) - - # Initial states - level_0, # Initial level (if estimated) - trend_0, # Initial trend (if estimated) - seasonal_0, 
..., # Initial seasonal states (if estimated) - - # ARIMA parameters - ar_1, ar_2, ..., # AR parameters (if estimated) - ma_1, ma_2, ..., # MA parameters (if estimated) - - # Regression parameters - beta_1, ..., # Regression coefficients (if estimated) - - # Constant - constant, # Constant term (if estimated) - - # Distribution parameters - other # Shape/scale parameters (if estimated) -] -``` - -## Distribution Implementation - -### Supported Distributions -| Distribution | Python String | R Equivalent | Parameters | -|-------------|---------------|--------------|------------| -| Normal | `"dnorm"` | `"dnorm"` | location, scale | -| Laplace | `"dlaplace"` | `"dlaplace"` | location, scale | -| S-distribution | `"ds"` | `"ds"` | location, scale | -| Generalized Normal | `"dgnorm"` | `"dgnorm"` | location, scale, shape | -| Log-Normal | `"dlnorm"` | `"dlnorm"` | meanlog, sdlog | -| Gamma | `"dgamma"` | `"dgamma"` | shape, scale | -| Inverse Gaussian | `"dinvgauss"` | `"dinvgauss"` | mean, shape | - -### Default Distribution Selection -```python -if distribution == "default": - if loss == "likelihood": - if error_type == "A": - distribution = "dnorm" - else: - distribution = "dgamma" - elif loss in ["MAE", "MAEh", "MACE"]: - distribution = "dlaplace" - elif loss in ["HAM", "HAMh", "CHAM"]: - distribution = "ds" - else: - distribution = "dnorm" -``` - -## Cost Function Implementation - -### Main Cost Function: [cost_functions.py](mdc:python/smooth/adam_general/core/utils/cost_functions.py) - -```python -def CF(B, ...): - # 1. Fill matrices with current parameters - adamElements = filler(B, ...) - - # 2. Check parameter bounds and constraints - if bounds == "usual": - # Apply strict bounds (return 1e100 if violated) - - # 3. Call C++ fitting routine - adam_fitted = adam_fitter( - matrixVt=mat_vt, - matrixWt=mat_wt, - matrixF=mat_f, - vectorG=vec_g, - ... - ) - - # 4. 
Calculate and return cost - return adam_fitted['cost'] -``` - -### Constraint Handling -```python -# ETS smoothing parameters: 0 ≤ α, β, γ ≤ 1 -if any(vec_g[:components_number_ets] > 1) or any(vec_g[:components_number_ets] < 0): - return 1e100 - -# Trend constraint: β ≤ α -if model_is_trendy and vec_g[1] > vec_g[0]: - return 1e100 - -# Seasonal constraint: γ ≤ 1 - α -if model_is_seasonal and any(seasonal_persistence > (1 - vec_g[0])): - return 1e100 - -# Damping constraint: 0 ≤ φ ≤ 1 -if phi_estimate and (phi > 1 or phi < 0): - return 1e100 -``` - -## Optimization Configuration - -### NLopt Settings -```python -# Algorithm selection -if explanatory_dict["xreg_model"]: - algorithm = nlopt.LN_NELDERMEAD # For regression models -else: - algorithm = nlopt.LN_SBPLX # For standard models - -# Tolerance settings (matching R) -opt.set_xtol_rel(1e-6) -opt.set_ftol_rel(1e-8) -opt.set_ftol_abs(0) -opt.set_xtol_abs(1e-8) - -# Maximum evaluations -maxeval = len(B) * 200 # Default scaling -``` - -### Parameter Bounds -```python -# Default bounds for different parameter types -bounds_dict = { - "persistence": (1e-16, 1 - 1e-16), # ETS smoothing parameters - "phi": (1e-16, 1 - 1e-16), # Damping parameter - "initial": (-1e100, 1e100), # Initial states - "arma": (-1 + 1e-16, 1 - 1e-16), # ARIMA parameters - "constant": (-1e100, 1e100), # Constant term - "other": (1e-16, 1e100) # Distribution parameters -} -``` - -## Error Handling and Robustness - -### Common Issues and Solutions -```python -# 1. Singular matrices -if np.linalg.cond(mat_f) > 1e12: - warnings.warn("Transition matrix is near-singular") - return 1e100 - -# 2. Explosive parameters -if any(np.abs(eigvals(mat_f)) > 1 + 1e-10): - return 1e100 - -# 3. 
Invalid initial states -if any(np.isnan(mat_vt)) or any(np.isinf(mat_vt)): - warnings.warn("Invalid initial states detected") - # Apply fallback initialization -``` - -### Numerical Stability -```python -# Safe logarithm calculation -def safe_log(x): - return np.log(np.maximum(x, 1e-100)) - -# Safe division -def safe_divide(a, b): - return np.divide(a, b, out=np.zeros_like(a), where=b!=0) -``` - -## C++ Integration Details - -### Function Signatures -```python -# From _adam_general.py (pybind11 bindings) -adam_fitter( - matrixVt: np.ndarray, # State matrix (Fortran order) - matrixWt: np.ndarray, # Measurement matrix (Fortran order) - matrixF: np.ndarray, # Transition matrix (Fortran order) - vectorG: np.ndarray, # Persistence vector - lags: np.ndarray, # Lag structure - indexLookupTable: np.ndarray, # Index lookup - profilesRecent: np.ndarray, # Recent profiles - E: str, # Error type ("A" or "M") - T: str, # Trend type ("N", "A", "Ad", "M", "Md") - S: str, # Season type ("N", "A", "M") - # ... other parameters -) -> Dict[str, Any] -``` - -### Memory Layout -```python -# Ensure Fortran (column-major) order for C++ compatibility -mat_vt = np.asfortranarray(mat_vt, dtype=np.float64) -mat_wt = np.asfortranarray(mat_wt, dtype=np.float64) -mat_f = np.asfortranarray(mat_f, dtype=np.float64) -vec_g = np.asfortranarray(vec_g, dtype=np.float64) -``` - -## Performance Considerations - -### Optimization Tips -1. **Matrix Preallocation**: Matrices are allocated once and reused -2. **Vectorized Operations**: Use NumPy vectorization where possible -3. **Memory Layout**: Ensure Fortran order for C++ compatibility -4. **Parameter Bounds**: Early return for constraint violations - -### Profiling Points -```python -# Key performance bottlenecks: -# 1. filler() function (called many times during optimization) -# 2. adam_fitter() C++ routine -# 3. Matrix operations in creator() functions -# 4. 
Parameter bound checking in CF() -``` - diff --git a/.cursor/rules/python-adam-workflow.mdc b/.cursor/rules/python-adam-workflow.mdc deleted file mode 100644 index 6895383b..00000000 --- a/.cursor/rules/python-adam-workflow.mdc +++ /dev/null @@ -1,216 +0,0 @@ ---- -description: -globs: -alwaysApply: false ---- -# Python ADAM Workflow and Function Relationships - -## Execution Flow Overview - -The Python ADAM implementation follows a structured pipeline that mirrors the R version in [R/adam.R](mdc:R/adam.R): - -``` -User Input → Validation → Architecture → Creation → Estimation → Forecasting -``` - -## Detailed Function Flow - -### 1. Model Initialization: [adam.py](mdc:python/smooth/adam_general/core/adam.py) - -```python -ADAM.__init__() -├── Store all configuration parameters -├── Set default values for unspecified parameters -└── Initialize timing and state tracking -``` - -**Key Parameters Stored:** -- `model`: ETS model specification (e.g., "ANN", "ZXZ") -- `lags`: Seasonal periods -- `ar_order`, `i_order`, `ma_order`: ARIMA components -- `distribution`: Error distribution -- `loss`: Loss function for estimation -- `ic`: Information criterion for model selection - -### 2. Model Fitting: [adam.py](mdc:python/smooth/adam_general/core/adam.py) → [checker.py](mdc:python/smooth/adam_general/core/checker.py) - -```python -ADAM.fit(y, X) -├── _check_parameters(ts) -│ └── parameters_checker() [checker.py] -│ ├── _check_model_composition() -│ ├── _process_observations() -│ ├── _check_lags() -│ ├── _check_persistence() -│ ├── _check_initial() -│ ├── _check_arima_orders() -│ └── _check_explanatory_vars() -├── _execute_estimation() / _execute_selection() -└── _prepare_results() -``` - -**Validation Process:** -- Converts input data to appropriate format -- Validates model specification strings -- Checks parameter bounds and consistency -- Processes missing values and outliers - -### 3. 
Model Architecture: [creator.py](mdc:python/smooth/adam_general/core/creator.py) - -```python -_execute_estimation() -├── architector() [creator.py] -│ ├── _setup_components() # Determine ETS/ARIMA component counts -│ ├── _setup_lags() # Finalize lag structure -│ └── _create_profiles() # Create lookup tables -└── creator() [creator.py] # Build state-space matrices - ├── _extract_model_parameters() - ├── _setup_matrices() # Initialize mat_vt, mat_wt, mat_f, vec_g - ├── _setup_measurement_vector() - ├── _setup_persistence_vector() - ├── _handle_polynomial_setup() - └── _initialize_states() -``` - -**Matrix Creation:** -- `mat_vt`: State vector (levels, trends, seasonals, ARIMA states) -- `mat_wt`: Measurement matrix (how states contribute to observations) -- `mat_f`: Transition matrix (how states evolve) -- `vec_g`: Persistence vector (smoothing parameters) - -### 4. Parameter Estimation: [estimator.py](mdc:python/smooth/adam_general/core/estimator.py) - -```python -estimator() [estimator.py] -├── initialiser() [creator.py] # Get initial parameter vector B and bounds -├── _create_objective_function() -│ └── CF() [utils/cost_functions.py] -│ └── filler() [creator.py] # Fill matrices with current B -├── _run_optimization() [nlopt] -├── _calculate_loglik() -├── _generate_forecasts() -└── _format_output() -``` - -**Optimization Process:** -- Uses NLopt library for parameter optimization -- Cost function `CF()` evaluates likelihood or other loss functions -- `filler()` updates matrices with current parameter values during optimization -- Applies parameter bounds and constraints - -### 5. 
Model Selection (Optional): [estimator.py](mdc:python/smooth/adam_general/core/estimator.py) - -```python -_execute_selection() -├── selector() [estimator.py] -│ ├── _form_model_pool() # Generate candidate models -│ ├── For each candidate: -│ │ ├── architector() [creator.py] -│ │ ├── creator() [creator.py] -│ │ └── estimator() [estimator.py] -│ └── _select_best_model() # Based on IC (AICc, BIC, etc.) -└── _execute_estimation(estimation=False) # Setup chosen model -``` - -**Selection Methods:** -- Branch and Bound for "Z" components -- Pool-based selection for specified model lists -- Information criteria comparison (AIC, AICc, BIC, BICc) - -### 6. Forecasting: [forecaster.py](mdc:python/smooth/adam_general/core/forecaster.py) - -```python -ADAM.predict(h, X) -├── _validate_prediction_inputs() -├── _prepare_prediction_data() -│ └── preparator() [forecaster.py] -│ ├── _fill_matrices_if_needed() # Uses filler() from creator.py -│ ├── _prepare_profiles_recent_table() -│ └── _initialize_fitted_series() -├── _execute_prediction() -│ └── forecaster() [forecaster.py] -│ ├── _prepare_forecast_index() -│ ├── _initialize_forecast_series() -│ ├── _prepare_lookup_table() -│ ├── _generate_point_forecasts() # Uses adam_forecaster from _adam_general.py -│ ├── _prepare_forecast_intervals() -│ └── _format_forecast_output() -└── return forecasts -``` - -## Key Function Relationships - -### Core Matrix Functions -- **`filler()`** in [creator.py](mdc:python/smooth/adam_general/core/creator.py): Central function that populates matrices with parameter values -- **Used by**: `CF()` cost function, `preparator()` for forecasting -- **Purpose**: Converts parameter vector B into structured matrices - -### Cost Function Chain -```python -CF() [cost_functions.py] -├── filler() [creator.py] # Fill matrices with parameters -├── adam_fitter() [_adam_general.py] # C++ fitting routine -└── Various penalty/constraint checks -``` - -### State Initialization Chain -```python -_initialize_states() 
[creator.py] -├── _initialize_ets_states() -├── _initialize_arima_states() -├── _initialize_xreg_states() -└── _initialize_constant() -``` - -## Data Flow Patterns - -### Parameter Vector (B) Flow -1. **Initial**: `initialiser()` creates initial B vector -2. **Optimization**: NLopt modifies B to minimize cost function -3. **Filling**: `filler()` converts B to matrices during each evaluation -4. **Storage**: Final B stored in model for forecasting - -### Matrix Flow -1. **Creation**: `creator()` builds empty matrices -2. **Filling**: `filler()` populates with parameters -3. **Fitting**: C++ `adam_fitter()` processes matrices -4. **Forecasting**: `adam_forecaster()` uses matrices for predictions - -### Error Handling -- Constraint violations return large penalty values (1e100) -- Parameter bounds enforced during optimization -- NaN/Inf values trigger warnings and fallback procedures - -## Low-Level Integration - -### C++ Bindings: [_adam_general.py](mdc:python/smooth/adam_general/_adam_general.py) -- **`adam_fitter()`**: Core fitting routine (equivalent to R's C++ code) -- **`adam_forecaster()`**: Core forecasting routine -- **Called by**: Cost functions and forecasting functions -- **Purpose**: High-performance matrix operations - -### Utility Integration -- **[utils/ic.py](mdc:python/smooth/adam_general/core/utils/ic.py)**: Information criteria calculations -- **[utils/var_covar.py](mdc:python/smooth/adam_general/core/utils/var_covar.py)**: Variance-covariance for intervals -- **[utils/utils.py](mdc:python/smooth/adam_general/core/utils/utils.py)**: General utilities (decomposition, likelihood) - -## Development Guidelines - -### Adding New Functionality -1. **Validation**: Add checks to [checker.py](mdc:python/smooth/adam_general/core/checker.py) -2. **Matrix Setup**: Extend [creator.py](mdc:python/smooth/adam_general/core/creator.py) functions -3. **Cost Function**: Update [cost_functions.py](mdc:python/smooth/adam_general/core/utils/cost_functions.py) -4. 
**Interface**: Expose through [adam.py](mdc:python/smooth/adam_general/core/adam.py) class - -### Testing Workflow -- Unit tests for individual functions -- Integration tests for complete workflows -- Comparison tests against R implementation -- Performance benchmarks for optimization - -### Debugging Tips -- Use `verbose` parameter in ADAM class for detailed output -- Check matrix dimensions in `creator()` functions -- Validate parameter bounds in cost functions -- Monitor convergence in optimization routines - diff --git a/.cursor/rules/smooth_package_structure.md b/.cursor/rules/smooth_package_structure.md deleted file mode 100644 index 34d48e51..00000000 --- a/.cursor/rules/smooth_package_structure.md +++ /dev/null @@ -1,342 +0,0 @@ -# SMOOTH Package Structure and Function Flow - -## Package Overview - -The SMOOTH forecasting package is a Python implementation of advanced time series forecasting methods, with ADAM (Augmented Dynamic Adaptive Model) as its central component. The package provides a flexible framework for forecasting that combines various models including ETS (Error, Trend, Seasonal), ARIMA (Autoregressive Integrated Moving Average), and their hybrid combinations. 
- -## Directory Structure - -``` -smooth/ -├── __init__.py -└── adam_general/ - ├── __init__.py - ├── _adam_general.py # Low-level implementation functions (e.g., adam_fitter, adam_forecaster) - └── core/ - ├── __init__.py - ├── adam.py # Main ADAM class interface - ├── checker.py # Parameter validation - ├── creator.py # Model matrix creation, optimization parameter initialization - ├── estimator.py # Parameter estimation & model selection - ├── forecaster.py # Forecast generation - └── utils/ # Utilities and helper functions - ├── __init__.py - ├── cost_functions.py # Cost functions for optimization (CF, log_Lik_ADAM) - ├── dump.py # (Currently empty) - ├── ic.py # Information criteria (AIC, BIC, AICc, BICc, ic_function) - ├── likelihood.py # (Currently empty) - ├── polynomials.py # ARIMA polynomial utilities (adam_polynomialiser) - ├── utils.py # General utilities (msdecompose, calculate_acf, calculate_pacf, calculate_likelihood, scaler, etc.) - └── var_covar.py # Variance-covariance utilities (sigma, covar_anal, var_anal, matrix_power_wrap) -``` - -## Core Components - -The package consists of five main components that work together to implement the ADAM forecasting framework: - -1. **ADAM Class (`adam.py`)**: The high-level interface for users, providing methods to configure, fit, and forecast with the ADAM model. - -2. **Parameter Checker (`checker.py`)**: Validates and processes user inputs via `parameters_checker()`, converting them into the appropriate format for model estimation. - -3. **Model Creator (`creator.py`)**: Contains functions to define the model structure. - * `architector()`: Defines the high-level architecture (number of ETS/ARIMA components, lags, profiles). - * `creator()`: Constructs the state-space matrices (`mat_wt`, `mat_f`, `vec_g`) and initializes the state vector (`mat_vt`). - * `initialiser()`: Prepares the initial parameter vector (`B`) and bounds (`Bl`, `Bu`) for optimization (called by `estimator.py`). 
- * `filler()`: Populates the model matrices with specific parameter values from vector `B` (called during optimization by cost functions and by `preparator()` in `forecaster.py`). - -4. **Parameter Estimator (`estimator.py`)**: Estimates optimal model parameters using `estimator()` based on provided data, or selects the best model using `selector()`. It utilizes cost functions (e.g., `CF` from `utils.cost_functions.py`) which internally use `filler()` from `creator.py`. - -5. **Forecaster (`forecaster.py`)**: Generates point forecasts and prediction intervals. - * `preparator()`: Prepares the fitted model and its matrices for forecasting. - * `forecaster()`: Produces the actual forecast values and intervals. - -## Function Flow - -The typical workflow follows this sequence: - -``` -User Input → ADAM.__init__() → ADAM.fit() → ADAM.predict() / ADAM.predict_intervals() -``` - -Let's look at each step in detail: - -### 1. Initialization Phase: `ADAM.__init__()` - -``` -ADAM.__init__() -├── Store configuration parameters (model, lags, orders, loss, ic, etc.) -└── Set up default values -``` - -The initialization phase sets up all model configuration parameters such as model type, seasonality, ARIMA orders, loss function, etc. This follows scikit-learn conventions by storing all model-related parameters during initialization. - -### 2. 
Parameter Validation and Data Processing: `ADAM.fit() → checker.py` - -``` -ADAM.fit(y, X) -├── _check_parameters(ts) // Calls parameters_checker from checker.py -│ └── parameters_checker() [checker.py] -│ ├── _check_model_composition() // and many other internal _check_* functions -│ ├── _process_observations() -│ ├── _check_lags() -│ ├── _check_persistence() -│ ├── _check_initial() // For how initial states/params are specified (optimal, provided) -│ ├── _check_arima_orders() -│ ├── _check_constant() -│ └── _check_explanatory_vars() -├── _execute_estimation() / _execute_selection() -└── _prepare_results() -``` - -During the fitting phase, the `parameters_checker()` function in `checker.py` validates and processes all input parameters and data. It checks the model specification, processes observations, validates lags, persistence parameters, initial state/parameter specifications, ARIMA orders, constants, and explanatory variables. The processed parameters are returned as a collection of dictionaries that will be used in subsequent steps. - -### 3. Model Structure Creation: `_execute_estimation() → creator.py (architector, creator)` - -When `ADAM.fit()` calls `_execute_estimation()`: -``` -_execute_estimation() -├── // ... 
(handle special cases like LASSO/RIDGE) -├── architector() [creator.py] // Defines model architecture -│ ├── _setup_components() // Determines number of ETS, ARIMA components -│ ├── _setup_lags() // Finalizes lags based on components -│ └── _create_profiles() // Creates profile matrices (uses adam_profile_creator) -└── creator() [creator.py] // Creates state-space matrices and initializes states - ├── _extract_model_parameters() - ├── _setup_matrices() // Initializes mat_vt, mat_wt, mat_f, vec_g - ├── _setup_measurement_vector()// Configures mat_wt, mat_f (phi) - ├── _setup_persistence_vector()// Configures mat_f, vec_g (fixed persistence) - ├── _handle_polynomial_setup() // For fixed ARIMA params - └── _initialize_states() // Sets initial values in mat_vt -``` -The `architector()` function first defines the counts of various components (ETS, ARIMA) and finalizes the lag structure. Then, `creator()` builds the actual state-space matrices (`mat_wt` for measurement, `mat_f` for transition, `vec_g` for persistence) and the initial state matrix (`mat_vt`), filling them based on the model specification and data characteristics. - - -### 4. Parameter Estimation: `estimator() [estimator.py]` - -If `estimation=True` (default for `_execute_estimation`, or after model selection): -``` -_execute_estimation() -└── estimator() [estimator.py] // Called if estimation=True - ├── initialiser() [creator.py] // Gets initial parameter vector B and bounds Bl, Bu - ├── _create_objective_function() // Wraps the cost function (e.g., CF from utils.cost_functions.py) - │ └── CF() [utils.cost_functions.py] - │ └── filler() [creator.py] // Fills matrices with current B during optimization - ├── _run_optimization() [nlopt] // Finds optimal B - ├── _calculate_loglik() // Calculates final log-likelihood, AIC, etc. 
- ├── _generate_forecasts() // Generates in-sample forecasts (fitted values) - ├── _format_output() // Prepares results (errors, scale, final parameters) -``` -The `estimator.py` module handles parameter estimation. -1. It first calls `initialiser()` (from `creator.py`) for the initial parameter guess, sets up and runs the optimization (using a cost function like `CF` from `utils.cost_functions.py`, which internally uses `filler()`), and then processes results (log-likelihood, fitted values, errors, scale). - -### 5. Model Selection (Optional): `_execute_selection() → selector() [estimator.py]` - -``` -_execute_selection() -├── selector() [estimator.py] -│ ├── _form_model_pool() / _build_models_pool_from_components() // Generates candidate models -│ ├── For each candidate model: -│ │ └── _estimate_model() // Calls estimator() for each model -│ │ ├── architector() [creator.py] -│ │ ├── creator() [creator.py] -│ │ └── estimator() [estimator.py] // (Simplified: actual estimation logic) -│ └── _select_best_model() // Based on IC (e.g., AICc) -└── // After best model is selected: - // _execute_estimation(estimation=False) is called to set up the chosen model's matrices - ├── architector() [creator.py] - └── creator() [creator.py] -``` -When using model selection (`model_do="select"`), the `selector()` function in `estimator.py` evaluates multiple candidate models. For each candidate, it typically goes through a simplified estimation process to get its information criterion value. The best model is then chosen. Finally, `_execute_estimation(estimation=False)` is called for the selected model to properly set up its matrices using `architector()` and `creator()`. - -### 6. 
Results Preparation: `ADAM._prepare_results()` - -``` -_prepare_results() // Called in ADAM.fit() after estimation/selection -├── _format_time_series_data() // Ensures y_in_sample, y_holdout are pandas Series -└── _select_distribution() // Determines final distribution if 'default' was used -``` -After model estimation or selection, the results are prepared for user consumption, including formatting time series data and selecting the appropriate distribution for prediction intervals. Fitted parameters are also set as attributes on the ADAM object (e.g., `model.persistence_level_`). - -### 7. Forecast Generation: `ADAM.predict() / ADAM.predict_intervals()` - -``` -ADAM.predict(h, X, ...) or ADAM.predict_intervals(h, X, ...) -├── _validate_prediction_inputs() -├── _prepare_prediction_data() -│ └── preparator() [forecaster.py] // Prepares model for forecasting -│ ├── _fill_matrices_if_needed() // Calls filler() from creator.py -│ ├── _prepare_profiles_recent_table() -│ ├── _prepare_fitter_inputs() // Uses adam_fitter from _adam_general.py for in-sample if needed -│ └── _initialize_fitted_series() -├── _execute_prediction() -│ └── forecaster() [forecaster.py] // Generates forecasts and intervals -│ ├── _prepare_forecast_index() -│ ├── _check_fitted_values() -│ ├── _initialize_forecast_series() -│ ├── _prepare_lookup_table() // Uses adam_profile_creator from creator.py -│ ├── _prepare_matrices_for_forecast() -│ ├── _generate_point_forecasts() // Uses adam_forecaster from _adam_general.py -│ ├── _handle_forecast_safety_checks() -│ ├── _process_occurrence_forecast() -│ ├── _prepare_forecast_intervals() // (for predict_intervals or if calculate_intervals=True) -│ │ └── (uses sigma, covar_anal/var_anal from utils.var_covar.py, or simulation) -│ └── _format_forecast_output() -└── return forecasts / {forecasts, lower, upper} -``` -The prediction phase: -1. Validates inputs. -2. Calls `preparator()` (from `forecaster.py`) which readies the fitted model for forecasting. 
This might involve filling matrices using `filler()` (from `creator.py`) with the estimated parameters if they weren't already in their final form. -3. Calls `forecaster()` (from `forecaster.py`) which generates point forecasts (potentially using `adam_forecaster` from `_adam_general.py`) and, if requested, prediction intervals. Interval calculation can be parametric (using variance calculations from `utils.var_covar.py`) or simulation-based. - -## Data Flow Diagram - -``` -┌───────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ ┌───────────┐ -│ User Input│────►│ Checker │────►│ Creator │────►│ Estimator │────►│ Forecaster│ -└───────────┘ └──────────┘ └──────────┘ └───────────┘ └───────────┘ - │ │ - │ │ - │ ┌───────────────────────┐ │ - └──────────────────── ADAM (Main Interface) ────────────────────────┘ - └───────────────────────┘ -``` - -## Main Functions and Their Responsibilities - -### ADAM (Main Interface) - `adam.py` - -The ADAM class provides the primary user interface, modeled after scikit-learn's API: - -- **`__init__(...)`**: Configure the model parameters. -- **`fit(y, X=None)`**: Fit the model to the data. This orchestrates calls to `parameters_checker`, `architector`, `creator`, and `estimator`/`selector`. -- **`predict(h, X=None, calculate_intervals=True, ...)`**: Generate point forecasts. Can also compute and store prediction intervals. -- **`predict_intervals(h, X=None, levels=[0.8, 0.95], ...)`**: Generate and return point forecasts and prediction intervals. - -### Checker (Parameter Validation) - `checker.py` - -The `parameters_checker()` function validates all user inputs and converts them into the format needed by the rest of the system: -- Validates model specification (ETS components, ARIMA orders). -- Processes observation data, handles occurrence models for intermittent data. -- Checks lags and seasonal periods. -- Validates persistence parameters (smoothing parameters) specifications. 
-- Checks initial state/parameter specifications (e.g., "optimal", "provided"). -- Validates constant terms and explanatory variables. - -### Creator (Model Structure and Optimization Initialization) - `creator.py` - -This module builds the state-space model structure and prepares for optimization: - -- **`architector()`**: Defines the high-level model architecture: number of ETS/ARIMA components, lag structure (calling `_setup_components`, `_setup_lags`), and forecasting profiles (calling `_create_profiles` which uses `adam_profile_creator`). -- **`creator()`**: Constructs the core state-space matrices (measurement `mat_wt`, transition `mat_f`, persistence `vec_g`) and initializes the actual state vector (`mat_vt`) based on data characteristics or provided initial values. -- **`initialiser()`**: **Called by `estimator.py`**. Prepares the initial parameter vector (`B`) for optimization, along with their lower (`Bl`) and upper (`Bu`) bounds, and parameter names. -- **`filler()`**: **Called by cost functions (e.g., `CF`) during optimization and by `preparator()` in `forecaster.py`**. Updates model matrices (`mat_vt`, `mat_wt`, `mat_f`, `vec_g`) based on a given parameter vector `B`. - -### Estimator (Parameter Estimation and Model Selection) - `estimator.py` - -This module handles parameter optimization and model selection: - -- **`estimator()`**: Manages the estimation process for a single model. It calls `initialiser()` (from `creator.py`) for the initial parameter guess, sets up and runs the optimization (using a cost function like `CF` from `utils.cost_functions.py`, which internally uses `filler()`), and then processes results (log-likelihood, fitted values, errors, scale). -- **`selector()`**: Manages the model selection process. It generates a pool of candidate models, estimates each (typically a simplified run of `estimator()`), and selects the best one based on an information criterion (e.g., AICc calculated via `ic_function` from `utils.ic.py`). 
-- **`CF()` (in `utils.cost_functions.py`)**: The main cost function used by `estimator()`. It takes a parameter vector `B`, uses `filler()` to update matrices, runs the `adam_fitter` (from `_adam_general.py`), and computes the loss (e.g., likelihood, MSE). -- **`log_Lik_ADAM()` (in `utils.cost_functions.py`)**: Calculates the log-likelihood of the ADAM model. - -### Forecaster (Forecast Generation) - `forecaster.py` - -This module generates forecasts and prediction intervals: - -- **`preparator()`**: Prepares the fitted model for forecasting. This involves setting up the final state vector, matrices (possibly calling `filler()` from `creator.py` with estimated parameters), and profiles. -- **`forecaster()`**: Generates point forecasts (using `adam_forecaster` from `_adam_general.py`) and, if requested, prediction intervals. Interval calculation can be parametric (using `sigma`, `covar_anal`/`var_anal` from `utils.var_covar.py`) or simulation-based. -- **`generate_prediction_interval()`**: A utility function for generating prediction intervals, likely used internally by `forecaster`. - - -## Class Attributes and Fitted Parameters - -After calling `fit()`, the ADAM class stores fitted parameters as attributes with trailing underscores (scikit-learn convention): - -- **`persistence_level_`**: Smoothing parameter for level. -- **`persistence_trend_`**: Smoothing parameter for trend. -- **`persistence_seasonal_`**: Smoothing parameters for seasonal components. -- **`persistence_xreg_`**: Smoothing parameters for exogenous regressors. -- **`phi_`**: Damping parameter. -- **`arma_parameters_`**: ARIMA parameters (coefficients for AR and MA terms). -- **`initial_states_`**: Initial state values used for the model. -- (`Other fitted parameters like scale, specific distribution parameters may also be stored`). 
- -## Examples of Usage - -Simple example with ETS model: - -```python -from smooth.adam_general.core.adam import ADAM -import numpy as np - -# Sample data -y_data = np.array([10, 12, 15, 13, 16, 18, 20, 19, 22, 25, 28, 30, - 11, 13, 16, 14, 17, 19, 21, 20, 23, 26, 29, 31]) - - -# Initialize the model -model = ADAM(model="ANN", lags=[1,12]) # Additive error, no trend, no seasonality, lags for level and seasonality - -# Fit the model to data -model.fit(y_data) - -# Generate forecasts -forecasts = model.predict(h=10) - -# Generate prediction intervals (these are also calculated by predict if calculate_intervals=True) -intervals = model.predict_intervals(h=10, levels=[0.8, 0.95]) -``` - -Example with ARIMA: - -```python -# Initialize an ARIMA(1,1,1) model with seasonality 12 for the AR/MA parts too. -# Assuming non-seasonal ARIMA part applied with lags=[1] -# and seasonal ARIMA part (if any) would need lags=[12] and orders specified per lag. -# For a simple ARIMA(p,d,q) on the deseasonalized series (if ETS part exists) -# or on the original series (if no ETS part), lags=[1] is typical for orders. -model_arima = ADAM(ar_order=[1], i_order=[1], ma_order=[1], lags=[1]) - - -# Fit and forecast -model_arima.fit(y_data) -forecasts_arima = model_arima.predict(h=10) -``` - -Example with exogenous variables: - -```python -# Sample exogenous data -X_data = np.random.rand(len(y_data), 2) -X_future = np.random.rand(10, 2) - - -# Initialize a model with exogenous variables -# Assuming "AAN" with lags=[1,12] for level and season -model_xreg = ADAM(model="AAN", lags=[1,12], regressors="use") - -# Fit with exogenous variables -model_xreg.fit(y=y_data, X=X_data) - -# Forecast with future exogenous variables -forecasts_xreg = model_xreg.predict(h=10, X=X_future) -``` - -## Performance Considerations - -- The package relies on numerical optimization (NLopt) for parameter estimation, which can be computationally intensive for complex models or large datasets. 
-- Models with high-frequency seasonality (e.g., hourly data with multiple seasonal lags) or high-order ARIMA components may require more computation time. -- The `fast=True` option in `ADAM` initialization can speed up estimation but might lead to less accurate results. - -## Summary of Refactoring Improvements - -The refactored package offers several improvements over the original translated code: - -1. **Improved Code Organization**: Functions are broken down into smaller, focused units following the single responsibility principle, organized into logical modules (`checker`, `creator`, `estimator`, `forecaster`, `utils`). -2. **Better Documentation**: Comprehensive docstrings and comments explain how the code works (ongoing effort). This markdown document aims to provide a high-level overview. -3. **Standardized Interface**: Follows scikit-learn conventions for a familiar API (`__init__`, `fit`, `predict`). -4. **Improved Type Hints**: Clear type annotations help prevent errors and improve code readability. -5. **Better Error Handling**: More descriptive error messages and robust validation are incorporated. - -These improvements make the package more maintainable, easier to understand, and more user-friendly, while preserving the original functionality and accuracy of the forecasting methods. \ No newline at end of file diff --git a/.github/ISSUE_TEMPLATE/bug_report_python.yml b/.github/ISSUE_TEMPLATE/bug_report_python.yml new file mode 100644 index 00000000..1ccd4a00 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report_python.yml @@ -0,0 +1,55 @@ +name: 🐛 Bug Report for Python +description: Report a bug in the smooth package for Python +labels: ["bug", "triage"] +body: + - type: markdown + attributes: + value: | + Thanks for reporting a bug! Please provide as much detail as possible to help us reproduce the issue. + - type: textarea + id: description + attributes: + label: Description + description: A clear and concise description of what the bug is. 
+ validations: + required: true + - type: textarea + id: reproduction + attributes: + label: Reproducible Example + description: | + Please provide a minimal, self-contained snippet of Python code. + Use synthetic data (e.g., `np.random`) or a time series + from the `fcompdata` package if necessary. + render: python + placeholder: | + # Your code here... + validations: + required: true + - type: dropdown + id: frequency + attributes: + label: Data Frequency + description: What is the frequency of your time series? + options: + - Hourly/Minutely + - Daily + - Weekly + - Monthly + - Yearly + - Irregular + validations: + required: true + - type: textarea + id: environment + attributes: + label: Environment Info + description: | + Run this and insert the output: + ```python + from smooth.utils import show_versions; show_versions() + ``` + placeholder: | + - OS: macOS/Windows/Linux + - Python version: 3.13 + - Package version: 1.0.0 diff --git a/.github/ISSUE_TEMPLATE/bug_report_r.yml b/.github/ISSUE_TEMPLATE/bug_report_r.yml new file mode 100644 index 00000000..384a2984 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report_r.yml @@ -0,0 +1,56 @@ +name: 🐛 Bug Report for R +description: Report a bug in the smooth package for R +labels: ["bug", "triage"] +body: + - type: markdown + attributes: + value: | + Thanks for reporting a bug! Please provide as much detail as possible to help us reproduce the issue. + - type: textarea + id: description + attributes: + label: Description + description: A clear and concise description of what the bug is. + validations: + required: true + - type: textarea + id: reproduction + attributes: + label: Reproducible Example + description: | + Please provide a minimal, self-contained snippet of R code. + Use synthetic data (e.g., `rnorm`) or a time series + from the `datasets` or `Mcomp` package if necessary. + render: R + placeholder: | + # Your code here... 
+ validations: + required: true + - type: dropdown + id: frequency + attributes: + label: Data Frequency + description: What is the frequency of your time series? + options: + - Hourly/Minutely + - Daily + - Weekly + - Monthly + - Yearly + - Irregular + validations: + required: true + - type: textarea + id: environment + attributes: + label: Environment Info + description: | + Run this and insert the output: + ```r + sessionInfo() + packageVersion("smooth") + ``` + placeholder: | + - OS: macOS/Windows/Linux + - R version: 4.5.1 + - Package version: 4.4.0 diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 00000000..79b38971 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,5 @@ +blank_issues_enabled: true +contact_links: + - name: 💬 Discussions + url: https://github.com/config-i1/smooth/discussions + about: Ask questions or discuss forecasting techniques here. diff --git a/.github/ISSUE_TEMPLATE/documentation_imp.yml b/.github/ISSUE_TEMPLATE/documentation_imp.yml new file mode 100644 index 00000000..19a15512 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/documentation_imp.yml @@ -0,0 +1,49 @@ +name: 📖 Documentation Improvement +description: Suggest a fix for a typo, a clearer explanation, or a new tutorial. +labels: ["documentation"] +body: + - type: markdown + attributes: + value: | + Strong documentation is the backbone of reliable forecasting! Thank you for helping us improve. + - type: dropdown + id: doc-type + attributes: + label: What kind of improvement is this? + options: + - Missing Documentation (e.g., a function has no docstring) + - Typo or Grammar Fix + - Mathematical Clarification (e.g., LaTeX formula is wrong) + - New Tutorial or Example (e.g., a Jupyter notebook guide) + - Outdated Information (e.g., API has changed) + validations: + required: true + - type: input + id: location + attributes: + label: Where is the improvement needed? 
+ description: Provide a link to the documentation page or the path to the file in the repo. + placeholder: "e.g., https://github.com/config-i1/smooth/wiki/ADAM or /R/adam.R" + validations: + required: true + - type: textarea + id: current-content + attributes: + label: Current Content + description: What does the documentation currently say? (Copy-paste the snippet if applicable). + - type: textarea + id: suggested-change + attributes: + label: Suggested Change + description: | + Please describe the change you'd like to see. + If this is a mathematical fix, feel free to use LaTeX notation. + placeholder: "The formula for the seasonal component should be..." + validations: + required: true + - type: checkboxes + id: contribute + attributes: + label: Would you like to help? + options: + - label: I would like to submit a Pull Request to fix this myself! diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml new file mode 100644 index 00000000..3da3c142 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature_request.yml @@ -0,0 +1,42 @@ +name: ✨ Feature Request +description: Suggest a new model, method or evaluation metric. +labels: ["enhancement"] +body: + - type: textarea + id: feature-description + attributes: + label: Is your feature request related to a problem? + description: A clear description of what the problem is (e.g., I'm frustrated because...). + validations: + required: true + - type: textarea + id: solution + attributes: + label: Describe the solution you'd like + description: | + Explain the new functionality. + If it's a new model (e.g., Theta Method), please link to the relevant paper or documentation. + validations: + required: true + - type: textarea + id: alternative + attributes: + label: Describe alternatives you've considered + description: Are there existing ways to achieve this in the package or via other libraries? 
+ - type: checkboxes + id: language + attributes: + label: Language + description: Which programming language is this for? + options: + - label: R + - label: Python + - label: Other + validations: + required: true + - type: checkboxes + id: contribution + attributes: + label: Would you be willing to contribute this feature? + options: + - label: Yes, I can submit a Pull Request. diff --git a/.github/workflows/publish-testpypi.yml b/.github/workflows/publish-testpypi.yml new file mode 100644 index 00000000..44d95f5a --- /dev/null +++ b/.github/workflows/publish-testpypi.yml @@ -0,0 +1,100 @@ +name: Publish to TestPyPI + +on: + workflow_dispatch: + +jobs: + build_wheels: + name: Build wheels on ${{ matrix.os }} + runs-on: ${{ matrix.os }} + strategy: + fail-fast: false + matrix: + os: [ubuntu-latest, windows-latest, macos-latest] + + steps: + - uses: actions/checkout@v4 + with: + submodules: recursive + fetch-depth: 0 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: '3.11' + + - name: Install cibuildwheel + run: python -m pip install cibuildwheel + + - name: Build wheels + run: python -m cibuildwheel python --output-dir wheelhouse + env: + CIBW_BUILD: cp310-* cp311-* cp312-* cp313-* + CIBW_SKIP: "*-win32 *-manylinux_i686 *-musllinux*" + CIBW_BEFORE_BUILD_LINUX: yum install -y openblas-devel lapack-devel + CIBW_BEFORE_BUILD_MACOS: brew install openblas || true + CIBW_BEFORE_BUILD_WINDOWS: pip install numpy + CIBW_TEST_REQUIRES: pytest numpy + CIBW_TEST_COMMAND: "pytest {project}/python/tests" + CIBW_ARCHS_MACOS: "x86_64 arm64" + CIBW_MANYLINUX_X86_64_IMAGE: manylinux2014 + CIBW_MANYLINUX_AARCH64_IMAGE: manylinux2014 + + - name: Upload wheels as artifacts + uses: actions/upload-artifact@v4 + with: + name: wheels-${{ matrix.os }} + path: ./wheelhouse/*.whl + retention-days: 5 + + build_sdist: + name: Build source distribution + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + with: + submodules: recursive + fetch-depth: 
0 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: '3.11' + + - name: Install build + run: python -m pip install build + + - name: Build sdist + run: python -m build --sdist python + + - name: Upload sdist as artifact + uses: actions/upload-artifact@v4 + with: + name: sdist + path: ./python/dist/*.tar.gz + retention-days: 5 + + publish: + name: Publish to TestPyPI + needs: [build_wheels, build_sdist] + runs-on: ubuntu-latest + environment: + name: testpypi + url: https://test.pypi.org/p/smooth + permissions: + id-token: write + + steps: + - name: Download all artifacts + uses: actions/download-artifact@v4 + with: + path: dist + merge-multiple: true + + - name: Display downloaded files + run: ls -R dist/ + + - name: Publish to TestPyPI + uses: pypa/gh-action-pypi-publish@release/v1 + with: + repository-url: https://test.pypi.org/legacy/ diff --git a/README.md b/README.md index fed1ca46..08c4f1ea 100644 --- a/README.md +++ b/README.md @@ -8,11 +8,13 @@ R: Python: +[![Python CI](https://github.com/config-i1/smooth/actions/workflows/python_ci.yml/badge.svg)](https://github.com/config-i1/smooth/actions/workflows/python_ci.yml) The **smooth** package implements Single Source of Error (SSOE) state-space models for forecasting and time series analysis, available for both R and Python. -![hex-sticker of the smooth package for R](https://github.com/config-i1/smooth/blob/master/man/figures/smooth-web.png?raw=true) +![hex-sticker of the smooth package for R](https://github.com/config-i1/smooth/blob/master/man/figures/smooth-web.png?raw=true) ![hex-sticker of the smooth package for Python](https://github.com/config-i1/smooth/blob/master/python/img/smooth-python-web.png?raw=true) + ## Installation diff --git a/conda/r-smooth/bld.bat b/conda/r-smooth/bld.bat new file mode 100644 index 00000000..01bc5ead --- /dev/null +++ b/conda/r-smooth/bld.bat @@ -0,0 +1,2 @@ +"%R%" CMD INSTALL --build . 
%R_ARGS% +IF %ERRORLEVEL% NEQ 0 exit /B 1 diff --git a/conda/r-smooth/build.sh b/conda/r-smooth/build.sh new file mode 100755 index 00000000..b8d26355 --- /dev/null +++ b/conda/r-smooth/build.sh @@ -0,0 +1,3 @@ +#!/bin/bash +export DISABLE_AUTOBREW=1 +${R} CMD INSTALL --build . ${R_ARGS} diff --git a/conda/r-smooth/meta.yaml b/conda/r-smooth/meta.yaml new file mode 100644 index 00000000..15129ae9 --- /dev/null +++ b/conda/r-smooth/meta.yaml @@ -0,0 +1,82 @@ +{% set version = "4.4.0" %} +{% set posix = 'm2-' if win else '' %} + +package: + name: r-smooth + version: {{ version|replace("-", "_") }} + +source: + url: + - {{ cran_mirror }}/src/contrib/smooth_{{ version }}.tar.gz + - {{ cran_mirror }}/src/contrib/Archive/smooth/smooth_{{ version }}.tar.gz + sha256: 872e24e0ad125c1f42b0347eb36edd0d708d2dda642f75e5840135ba2cb8f497 + +build: + number: 0 + missing_dso_whitelist: + - $RPATH/R.dll # [win] + - Library/bin/libgcc_s_seh-1.dll # [win] + - Library/bin/libstdc++-6.dll # [win] + rpaths: + - lib/R/lib/ + - lib/ + +requirements: + build: + - cross-r-base {{ r_base }} # [build_platform != target_platform] + - {{ compiler('c') }} + - {{ compiler('cxx') }} + - {{ stdlib('c') }} + - {{ posix }}make + - {{ posix }}coreutils # [win] + - {{ posix }}zip # [win] + host: + - r-base + - r-greybox >=2.0.2 + - r-rcpp >=0.12.3 + - r-rcpparmadillo >=0.8.100.0.0 + - r-generics >=0.1.2 + - r-pracma + - r-statmod + - r-mass + - r-nloptr + - r-xtable + - r-zoo + - libblas + - libcblas + - liblapack + run: + - r-base + - r-greybox >=2.0.2 + - r-rcpp >=0.12.3 + - r-rcpparmadillo >=0.8.100.0.0 + - r-generics >=0.1.2 + - r-pracma + - r-statmod + - r-mass + - r-nloptr + - r-xtable + - r-zoo + +test: + commands: + - $R -e "library('smooth')" # [not win] + - "\"%R%\" -e \"library('smooth')\"" # [win] + +about: + home: https://github.com/config-i1/smooth + license: LGPL-2.1-only + license_family: LGPL + license_file: + - {{ environ["PREFIX"] }}/lib/R/share/licenses/LGPL-2.1 + summary: 
Forecasting Using State Space Models + description: | + Functions implementing Single Source of Error state space models for + purposes of time series analysis and forecasting. The package includes + ADAM, Exponential Smoothing, SARIMA, Complex Exponential Smoothing, + Simple Moving Average, and several simulation functions. + dev_url: https://github.com/config-i1/smooth + +extra: + recipe-maintainers: + - config-i1 diff --git a/conda/smooth/bld.bat b/conda/smooth/bld.bat new file mode 100644 index 00000000..d448a214 --- /dev/null +++ b/conda/smooth/bld.bat @@ -0,0 +1,10 @@ +@echo off +cd "%SRC_DIR%\python" + +:: Replace FetchContent(pybind11) with find_package(pybind11) +:: conda provides pybind11, and network access is not available during builds +powershell -Command "(Get-Content CMakeLists.txt) -replace 'include\(FetchContent\)', '' | Set-Content CMakeLists.txt" +powershell -Command "$c = Get-Content CMakeLists.txt -Raw; $c = $c -replace '(?s)FetchContent_Declare\(.*?FetchContent_MakeAvailable\(pybind11\)', 'find_package(pybind11 REQUIRED)'; Set-Content CMakeLists.txt $c" + +%PYTHON% -m pip install . -vv --no-deps --no-build-isolation +if errorlevel 1 exit /B 1 diff --git a/conda/smooth/build.sh b/conda/smooth/build.sh new file mode 100755 index 00000000..5c568988 --- /dev/null +++ b/conda/smooth/build.sh @@ -0,0 +1,11 @@ +#!/bin/bash +set -ex + +cd "${SRC_DIR}/python" + +# Replace FetchContent(pybind11) with find_package(pybind11) +# conda provides pybind11, and network access is not available during builds +sed -i 's/include(FetchContent)//' CMakeLists.txt +sed -i '/FetchContent_Declare/,/FetchContent_MakeAvailable(pybind11)/c\find_package(pybind11 REQUIRED)' CMakeLists.txt + +${PYTHON} -m pip install . 
-vv --no-deps --no-build-isolation diff --git a/conda/smooth/meta.yaml b/conda/smooth/meta.yaml new file mode 100644 index 00000000..cdbe190d --- /dev/null +++ b/conda/smooth/meta.yaml @@ -0,0 +1,72 @@ +{% set name = "smooth" %} +{% set version = "1.0.0" %} +{% set carma_version = "0.6.7" %} + +package: + name: {{ name }} + version: {{ version }} + +source: + - url: https://github.com/config-i1/smooth/archive/refs/tags/v{{ version }}.tar.gz + sha256: REPLACE_WITH_ACTUAL_HASH + folder: . + - url: https://github.com/RUrlus/carma/archive/v{{ carma_version }}.tar.gz + sha256: abc4a9cf12a177f9ad100d80f27274809b8111807911745fd6ca148ae151ecc7 + folder: src/libs/carma + +build: + number: 0 + skip: true # [py<310] + script: bash ${RECIPE_DIR}/build.sh # [not win] + script: {{ RECIPE_DIR }}\bld.bat # [win] + +requirements: + build: + - {{ compiler('cxx') }} + - {{ stdlib('c') }} + - cmake >=3.25 + - ninja + - python # [build_platform != target_platform] + - cross-python_{{ target_platform }} # [build_platform != target_platform] + host: + - python + - pip + - scikit-build-core >=0.3.3 + - pybind11 >=2.6.0 + - pybind11-abi + - numpy + - libblas + - libcblas + - liblapack + - armadillo + run: + - python + - numpy >=1.14 + - pandas >=2.0 + - nlopt + - scipy >=1.11.0,<1.16.0 + - statsmodels + +test: + imports: + - smooth + - smooth.adam_general._adamCore + commands: + - python -c "from smooth.adam_general.core.adam import ADAM; import numpy as np; y = np.array([10,12,15,13,16,18,20,19,22,25,28,30]); m = ADAM(model='ANN', lags=[1]); m.fit(y); fc = m.predict(h=3); assert len(fc.mean) == 3" + +about: + home: https://github.com/config-i1/smooth + license: LGPL-2.1-only + license_family: LGPL + license_file: LICENSE + summary: Forecasting Using State Space Models (Python) + description: | + Python implementation of the smooth forecasting package. 
Provides ADAM + (Augmented Dynamic Adaptive Model) for time series forecasting using + state space models, including ETS, ARIMA, and regression in a unified + Single Source of Error framework. + dev_url: https://github.com/config-i1/smooth + +extra: + recipe-maintainers: + - config-i1 diff --git a/conda/submit.sh b/conda/submit.sh new file mode 100755 index 00000000..2f2e5c67 --- /dev/null +++ b/conda/submit.sh @@ -0,0 +1,131 @@ +#!/bin/bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)" +STAGING_DIR="${HOME}/conda-forge-staging" + +usage() { + echo "Usage: $0 " + echo "" + echo " tag Create v1.0.0 tag and compute SHA256 hash" + echo " python Submit Python smooth recipe (can run anytime after tag)" + echo " r-smooth Submit r-smooth recipe (run after r-greybox is merged)" + exit 1 +} + +[ $# -lt 1 ] && usage + +ensure_staging_repo() { + if [ -d "${STAGING_DIR}" ]; then + cd "${STAGING_DIR}" + git checkout main + git pull upstream main 2>/dev/null || git pull origin main + else + gh repo fork conda-forge/staged-recipes --clone -- "${STAGING_DIR}" + cd "${STAGING_DIR}" + fi +} + +case "$1" in + tag) + echo "=== Creating v1.0.0 tag ===" + cd "${REPO_ROOT}" + git tag v1.0.0 + git push origin v1.0.0 + echo "Waiting 5s for GitHub to process the tag..." + sleep 5 + + echo "=== Computing SHA256 ===" + HASH=$(curl -sL "https://github.com/config-i1/smooth/archive/refs/tags/v1.0.0.tar.gz" | sha256sum | awk '{print $1}') + echo "SHA256: ${HASH}" + + echo "=== Updating meta.yaml ===" + sed -i "s/REPLACE_WITH_ACTUAL_HASH/${HASH}/" "${SCRIPT_DIR}/smooth/meta.yaml" + echo "Updated conda/smooth/meta.yaml with hash ${HASH}" + ;; + + python) + # Verify hash has been set + if grep -q "REPLACE_WITH_ACTUAL_HASH" "${SCRIPT_DIR}/smooth/meta.yaml"; then + echo "ERROR: Run '$0 tag' first to set the source hash." 
+ exit 1 + fi + + echo "=== Submitting Python smooth recipe ===" + ensure_staging_repo + + git checkout -b smooth-python main 2>/dev/null || git checkout smooth-python + cp -r "${SCRIPT_DIR}/smooth" recipes/smooth + + git add recipes/smooth + git commit -m "Add smooth recipe (Python v1.0.0)" + git push -u origin smooth-python + + gh pr create \ + --repo conda-forge/staged-recipes \ + --title "Add smooth (Python)" \ + --body "$(cat <<'EOF' +## Summary +- Python package `smooth` v1.0.0 +- ADAM (Augmented Dynamic Adaptive Model) for time series forecasting +- State space models: ETS, ARIMA, and regression in a unified framework +- License: LGPL-2.1 +- Source: GitHub release + carma submodule (v0.6.7) + +## Build notes +- C++ extensions via pybind11 (conda-provided, FetchContent patched out) +- Requires BLAS/LAPACK and Armadillo +- Python >= 3.10 + +## Checklist +- [x] Dual source entries (main repo + carma submodule) +- [x] CMake FetchContent patched to use conda pybind11 +- [x] Build scripts for Linux/macOS and Windows +- [x] Test: import + fit/predict smoke test +EOF +)" + + echo "=== Done! Python smooth PR created. ===" + ;; + + r-smooth) + echo "=== Submitting r-smooth recipe ===" + echo "Make sure r-greybox has been merged on conda-forge first!" + read -rp "Has r-greybox been merged? 
[y/N] " confirm + [[ "${confirm}" =~ ^[Yy]$ ]] || { echo "Aborting."; exit 0; } + + ensure_staging_repo + + git checkout -b r-smooth main 2>/dev/null || git checkout r-smooth + cp -r "${SCRIPT_DIR}/r-smooth" recipes/r-smooth + + git add recipes/r-smooth + git commit -m "Add r-smooth recipe (v4.4.0 from CRAN)" + git push -u origin r-smooth + + gh pr create \ + --repo conda-forge/staged-recipes \ + --title "Add r-smooth" \ + --body "$(cat <<'EOF' +## Summary +- CRAN package `smooth` v4.4.0 +- Forecasting using Single Source of Error state space models +- License: LGPL-2.1 +- Depends on `r-greybox` (now available on conda-forge) + +## Checklist +- [x] Source from CRAN with archive fallback +- [x] All dependencies available on conda-forge (including r-greybox) +- [x] Build scripts for Linux/macOS and Windows +- [x] Test: `library('smooth')` +EOF +)" + + echo "=== Done! r-smooth PR created. ===" + ;; + + *) + usage + ;; +esac diff --git a/python/README.md b/python/README.md index 1fc64604..71062a49 100644 --- a/python/README.md +++ b/python/README.md @@ -6,31 +6,32 @@ Python implementation of the **smooth** package for time series forecasting usin ![hex-sticker of the smooth package for Python](https://github.com/config-i1/smooth/blob/master/python/img/smooth-python-web.png?raw=true) -**Status:** Work in progress ## Installation -**From GitHub:** +**From GitHub (source):** ```bash pip install "git+https://github.com/config-i1/smooth.git@master#subdirectory=python" ``` -**From source (development):** +**From wheels:** +Check the wheels for your system in [the latest release](https://github.com/config-i1/smooth/releases/tag/v4.4.0). 
+ +For example, for Windows with Python 3.13: ```bash -git clone https://github.com/config-i1/smooth.git -cd smooth/python -pip install -e ".[dev]" +pip install https://github.com/config-i1/smooth/releases/download/v4.4.0/smooth-1.0.0-cp313-cp313-win_amd64.whl ``` +See the [Installation Guide](https://github.com/config-i1/smooth/wiki/Installation) for platform-specific instructions. + + ## System Requirements -This package requires compilation of C++ extensions. Before installing, ensure you have: +If installing from source, this package requires compilation of C++ extensions. Before installing, ensure you have: - **C++ compiler** (g++, clang++, or MSVC) - **CMake** >= 3.25 - **Armadillo** linear algebra library -See the [Installation Guide](https://github.com/config-i1/smooth/wiki/Installation) for platform-specific instructions. - ## Quick Example ```python diff --git a/python/img/smooth-python-web.png b/python/img/smooth-python-web.png index ed5aa4be..5caaecc8 100644 Binary files a/python/img/smooth-python-web.png and b/python/img/smooth-python-web.png differ diff --git a/python/src/smooth/adam_general/core/utils/printing.py b/python/src/smooth/adam_general/core/utils/printing.py index 63920cf8..bd6c6940 100644 --- a/python/src/smooth/adam_general/core/utils/printing.py +++ b/python/src/smooth/adam_general/core/utils/printing.py @@ -726,6 +726,11 @@ def _format_phi(model: Any, digits: int) -> str: if not model._model_type.get("damped", False): return "" + # Skip if phi is not estimated (means model isn't actually damped) + if hasattr(model, "_phi_internal") and model._phi_internal: + if not model._phi_internal.get("phi_estimate", False): + return "" + phi_val = None if hasattr(model, "_phi_internal") and model._phi_internal: