Description
Describe the new API function requested
While sits_reduce() handles temporal combinations, it would be valuable to also include a model-driven feature selection method that:
- Complements existing reduction by selecting optimal bands/features for specific classification tasks
- Uses Random Forest's variable importance (e.g., Mean Decrease in Accuracy and Gini importance) to:
  - Rank features by predictive importance
  - Iteratively remove the least important features (backward elimination)
- Preserves the sits workflow by returning a modified sits tibble
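To make the request concrete, the backward-elimination idea in the list above could look roughly like the sketch below. It is only an illustration under assumptions: it uses the randomForest package on a plain data frame rather than a sits tibble, the helper name rf_backward_elimination() is hypothetical and not part of sits, out-of-bag accuracy stands in for whatever accuracy measure sits would use, and iris is placeholder data. The argument names mirror the signature proposed in this issue.

```r
# Hypothetical sketch, not sits code: backward feature elimination driven by
# Random Forest variable importance, using the randomForest package.
library(randomForest)

# Hypothetical helper; x is a data frame of features, y a factor of labels.
rf_backward_elimination <- function(x, y, importance_metric = "mda",
                                    n_iter = 20, accuracy_loss = 0.02,
                                    num_trees = 200) {
  # randomForest::importance(): type = 1 is mean decrease in accuracy,
  # type = 2 is mean decrease in node impurity (Gini).
  imp_type <- if (importance_metric == "mda") 1 else 2

  fit <- function(features) {
    randomForest(x = x[, features, drop = FALSE], y = y,
                 ntree = num_trees, importance = TRUE)
  }
  # Assumption: out-of-bag accuracy is used as the accuracy estimate.
  oob_accuracy <- function(model) mean(model$predicted == y, na.rm = TRUE)

  selected <- colnames(x)
  model <- fit(selected)
  baseline <- oob_accuracy(model)

  for (i in seq_len(n_iter)) {
    if (length(selected) <= 1) break
    # Find the least important feature of the current model and drop it.
    scores <- importance(model, type = imp_type)[, 1]
    candidate <- setdiff(selected, names(which.min(scores)))
    candidate_model <- fit(candidate)
    # Stop once the drop costs more accuracy than the allowed loss.
    if (baseline - oob_accuracy(candidate_model) > accuracy_loss) break
    selected <- candidate
    model <- candidate_model
  }
  selected
}

# Placeholder data: iris stands in for band/index features of samples.
set.seed(42)
kept <- rf_backward_elimination(iris[, 1:4], iris$Species)
print(kept)
```

In sits itself, the kept feature names would presumably map to bands, and the function would return the samples tibble restricted to those bands; how that fits the internals is for the maintainers to judge.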
Associated sits API function
sits_feature_selection(
samples, # sits tibble (time series)
bands = NULL, # Bands to evaluate (NULL = all)
importance_metric = "mda", # or "gini"
n_iter = 20, # Max iterations
accuracy_loss = 0.02, # Allowed accuracy drop (2%)
rf_params = list(), # Custom RF params (num_trees, etc)
multicores = 1 # Parallel processing
)
The returned tibble can then be used to generate a reduced sampled cube, continuing the standard pipeline with sits_train(), sits_classify(), etc.
Additional context
This function would be especially useful for reducing high-dimensional inputs built from multiple indices and texture metrics, which should improve performance and reduce overfitting. It focuses on band-level optimization and so complements the temporal reduction approach of sits_reduce().
Note: I’m a user and not familiar with the internal feasibility of this integration.
References
[1] Mahmood et al. (2025) - Demonstrated 80% feature reduction with <2% accuracy loss