Model feature influence by JGarciaCondado · Pull Request #73 · compneurobilbao/ageml

JGarciaCondado · 2025-04-02T21:08:29Z

Summary

We have added two new methods to the ageml processing pipeline. The new commands are:

model_feature_influence
age_model_vs_logistic_regression

Description of commands

`model_feature_influence`:

Takes as input a features file, a clinical file, and two groups.

Orders the N given features by their Mutual Information (MI) with age and, separately, by their MI with membership in the two clinical groups (their discriminative power).
Creates N models trained on nested feature sets ordered by MI with age: the first model uses only the top-ranked feature, the second adds the second-ranked feature, and so on.
Creates N models trained on nested feature sets ordered by discriminative MI, built the same way: the first model uses only the most discriminative feature, the second adds the next one, and so on.
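
The ordering and incremental training described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the ageml implementation: the data, the variable names, and the helper `mae_per_feature_count` are assumptions made for the example, and only the MAE side is shown (the AUC computed from the deltas is left out).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_subjects, n_features = 200, 5

# Synthetic data (illustrative only): age depends on features 0 and 1,
# group membership depends on feature 2.
X = rng.normal(size=(n_subjects, n_features))
age = 60 + 8 * X[:, 0] + 4 * X[:, 1] + rng.normal(scale=2, size=n_subjects)
group = (X[:, 2] + rng.normal(scale=0.5, size=n_subjects) > 0).astype(int)

# Rank features by MI with age and by MI with the group label (best first).
mi_age = mutual_info_regression(X, age, random_state=0)
mi_discr = mutual_info_classif(X, group, random_state=0)
order_age = np.argsort(mi_age)[::-1]
order_discr = np.argsort(mi_discr)[::-1]


def mae_per_feature_count(order):
    """Cross-validated MAE of an age model using the top-k features, k = 1..N."""
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    maes = []
    for k in range(1, len(order) + 1):
        # The scaler sits inside the pipeline, so it is refit on each training fold.
        model = make_pipeline(StandardScaler(), LinearRegression())
        pred = cross_val_predict(model, X[:, order[:k]], age, cv=cv)
        maes.append(mean_absolute_error(age, pred))
    return maes


maes_age_order = mae_per_feature_count(order_age)
maes_discr_order = mae_per_feature_count(order_discr)
```

Plotting `maes_age_order` and `maes_discr_order` against the number of features gives the kind of curve listed in the output below.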

Output:

Order of features according to MI with age, and MI with discriminative power
Graph of the features' MI with age on the x-axis and MI with discriminative power on the y-axis
Graph of MAE according to the number of features used.
Graph of AUC to classify the two groups according to the number of features used.
Graph of MAE and AUC for both the age-MI and discriminative-MI models.

`age_model_vs_logistic_regression`

Takes as input a features file, a clinical file, and two groups.

Repeats the process above, but trains three different BrainAge models: Linear Regressor, Ridge, and SVM.
Here we are only interested in the AUC for classifying the two groups.
Trains a logistic regressor directly on the features, then computes its AUC.
Also trains 4 logistic regressors using: only the features, [features + age], only the delta, [features + delta].

This pipeline gives you an idea of the benefit of computing the deltas to classify clinical groups, compared to just using the features.
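
A rough sketch of that comparison, assuming scikit-learn and synthetic data. It simplifies in several ways that should not be read as ageml's behaviour: the delta is taken as out-of-fold predicted age minus chronological age, the age model is fit on all subjects, and a single linear age model stands in for the three BrainAge models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import (
    KFold,
    StratifiedKFold,
    cross_val_predict,
    cross_val_score,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300

# Synthetic data (illustrative only): features track age, and subjects in
# group 1 look roughly 5 years "older" than their chronological age.
group = rng.integers(0, 2, size=n)
age = rng.uniform(50, 80, size=n)
weights = np.array([1.0, 0.5, 0.2])
X = (age + 5 * group)[:, None] * weights + rng.normal(scale=3, size=(n, 3))

# Out-of-fold age predictions, then delta = predicted age - chronological age.
cv_reg = KFold(n_splits=5, shuffle=True, random_state=0)
age_model = make_pipeline(StandardScaler(), LinearRegression())
predicted_age = cross_val_predict(age_model, X, age, cv=cv_reg)
delta = predicted_age - age

# The four input sets compared by the logistic regressors.
inputs = {
    "features": X,
    "features + age": np.column_stack([X, age]),
    "delta": delta.reshape(-1, 1),
    "features + delta": np.column_stack([X, delta]),
}

cv_clf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
aucs = {}
for name, data in inputs.items():
    clf = make_pipeline(StandardScaler(), LogisticRegression())
    scores = cross_val_score(clf, data, group, scoring="roc_auc", cv=cv_clf)
    aucs[name] = scores.mean()

for name, auc in aucs.items():
    print(f"{name}: AUC = {auc:.2f}")
```

In this toy setup the group difference is only visible once age is accounted for, so the delta-based inputs separate the groups better than the raw features alone. Real data need not behave this way.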

Output:

AUC graph for each of the 4 models (3 Brain Age models and 1 using features)
AUC values for the 4 logistic regressors.

itellaetxe

LGTM. Minor stuff:

Argparser nargs comments
Inheritance for ...FoldMetrics
Docstrings should all follow the same format; they are currently inconsistent.

itellaetxe · 2025-04-11T08:16:48Z

+        self.parser.add_argument(
+            "-m",
+            "--model",
+            nargs="*",


Why nargs="*"? (nargs doc)
Can you specify more than one model? Or do you just want to put the specified model string into a list? E.g. you specify "ridge" and it automatically gets put into ["ridge"].

If you only want one or zero arguments, you should instead change this to nargs="?"

So this is inherited from previous code above, see model_age. When building the models, the user can choose one of the available models (only one). However, they can specify several other arguments to set in the model. Example: -m linear_reg fit_intercept=False normalize=True. The user can input no arguments (the default is used), one argument (the model type, which will use the default settings for that model type), or multiple arguments, which would then require *. If it is more appropriate to use ?, we should change it above too.

Okay seems reasonable! nargs="*" is good for our use case, then. 👍
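
For reference, a small standalone sketch of how `nargs="*"` covers the zero, one, and several-token cases discussed in this thread. The helper `split_model_args` and its fallback to `linear_reg` are hypothetical and only illustrate one way to split the tokens; they are not taken from ageml.

```python
import argparse

parser = argparse.ArgumentParser()
# nargs="*" gathers zero or more tokens after -m into a list.
parser.add_argument("-m", "--model", nargs="*")


def split_model_args(tokens, default_model="linear_reg"):
    """Split ['ridge', 'alpha=0.5'] into ('ridge', {'alpha': '0.5'}).

    Hypothetical helper: values are kept as strings, and the default
    model name is an assumption made for this example.
    """
    if not tokens:  # flag absent (None) or given without values ([])
        return default_model, {}
    model_type, *options = tokens
    params = dict(option.split("=", 1) for option in options)
    return model_type, params


args = parser.parse_args(["-m", "linear_reg", "fit_intercept=False"])
print(split_model_args(args.model))
```

With `nargs="?"` the second token (`fit_intercept=False`) would be rejected as an unrecognized argument, which is why `*` fits the case described above.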

itellaetxe · 2025-04-11T08:18:17Z

+        self.parser.add_argument(
+            "-s",
+            "--scaler",
+            nargs="*",


See comment above about nargs. Same applies to --scaler argument here.

I have replied in the comment above, and it's the same case: it can have 0, 1, or multiple arguments.

itellaetxe · 2025-04-11T08:50:31Z

+            raise ValueError('task_type must be either "regression" or "classification"')
+        else:
+            self.task_type = task_type
+        self.train_metrics: List[Union[RegressionFoldMetrics, ClassificationFoldMetrics]] = []


To avoid this List and Union thing, use class inheritance.

Making RegressionFoldMetrics and ClassificationFoldMetrics children of the same abstract parent class, e.g. FoldMetrics, would be easier. This way you just have to check that the input is an instance of FoldMetrics.

I agree; this has been changed.
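
A minimal sketch of the suggested hierarchy, using only the standard library. The field names (`mae`, `r2`, `auc`, `accuracy`), the `as_dict` method, and the standalone `calculate_summary` function are assumptions for illustration; the actual classes in ageml may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


class FoldMetrics(ABC):
    """Common parent for the metrics of a single cross-validation fold."""

    @abstractmethod
    def as_dict(self) -> Dict[str, float]:
        """Return metric names mapped to their values."""


@dataclass
class RegressionFoldMetrics(FoldMetrics):
    mae: float
    r2: float

    def as_dict(self) -> Dict[str, float]:
        return {"mae": self.mae, "r2": self.r2}


@dataclass
class ClassificationFoldMetrics(FoldMetrics):
    auc: float
    accuracy: float

    def as_dict(self) -> Dict[str, float]:
        return {"auc": self.auc, "accuracy": self.accuracy}


def calculate_summary(metrics_list: List[FoldMetrics]) -> Dict[str, Dict[str, float]]:
    """Mean and population standard deviation of each metric across folds."""
    rows = [metrics.as_dict() for metrics in metrics_list]
    return {
        name: {
            "mean": mean(row[name] for row in rows),
            "std": pstdev(row[name] for row in rows),
        }
        for name in rows[0]
    }


folds = [RegressionFoldMetrics(mae=4.0, r2=0.5), RegressionFoldMetrics(mae=6.0, r2=0.7)]
summary = calculate_summary(folds)
```

The annotation `List[FoldMetrics]` replaces the `List[Union[...]]` form, and the summary code no longer needs to know which task type produced the folds.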

itellaetxe · 2025-04-11T08:51:02Z

+        self.test_metrics.append(fold_test)
+
+    def _calculate_summary(self, metrics_list: List[Union[RegressionFoldMetrics, ClassificationFoldMetrics]]) -> Dict[str, Dict[str, float]]:
+        # TODO - Automatically infer instead of using Union to have both types


See comment above about inheritance with FoldMetrics

itellaetxe · 2025-04-11T08:58:04Z

        plt.close()

+    def ordering(self, mi_age, mi_discr, feature_names, system_dict, title):
+        """Plot in the ssame figure the mutual information for age and discrimination."""


Incomplete docstring

I have now updated the docstring.

itellaetxe · 2025-04-11T08:58:14Z

+        plt.close()
+
+    def multiple_metrics_vs_num_features(self, metrics_age, metrics_discrimination, title):
+


Missing docstring

This has now been added

itellaetxe

LGTM. Merging. Ty for the work @JGarciaCondado 🤝

JGarciaCondado added 18 commits October 1, 2024 10:58

[ENH] Add command for model feature influence

6742277

[ENH] Add feature ordering with mutual info

2be3284

[ENH] Add prediction of deltas + auc classify

eb1a81b

[ENH] Add visualization to model feature analysis

e0e2689

[ENH] Add verbose wrapper to modelling funcs

14880f1

[ENH] Add CVHandler for metrics

283d2e6

[ENH] Add command for age models vs logistic regression

1215bf9

[ENH] Encaspulate simultaneous age prediction and classification

4b5852a

[ENH] Add visualization and different models

2a3bcfb

[ENH] Add classifcation analysis with all features

25b9220

[BUG] Change scaler to be within CV

f7c321d

[TST] Add tests for all the new functions and classes

29a8cdd

[ENH] Add CVHandler to Classifier

4131ffd

[ENH] Create a figure with both orders together

38282a6

[ENH] improve visualization of orders in age_model_vs_lr

01ce9ff

Add graph of MI for age and discrimination

093a12d

[ENH] Add system labelling in orderin visualization

d45ee65

[BUG] Fix correct labels in graphs produced

5697821

JGarciaCondado linked an issue Apr 2, 2025 that may be closed by this pull request

Model Feature Influence #53

Closed

JGarciaCondado requested a review from itellaetxe April 2, 2025 21:08

itellaetxe approved these changes Apr 11, 2025

Comment thread src/ageml/visualizer.py

[FIX} Fix issues from PR

5b53958

JGarciaCondado force-pushed the model-feature-influence branch 3 times, most recently from 33c2235 to 5b53958 on April 24, 2025 18:55

JGarciaCondado added 2 commits April 25, 2025 11:32

[FIX] Fix hyperparameter tunning integration from rebaseing

ea0adfa

[ENH] Updated ReadME with new pipeline

e6f63be

itellaetxe approved these changes Apr 28, 2025

itellaetxe merged commit 047cf21 into main Apr 28, 2025
3 checks passed
