List of tabular models update #1379

dmitryglhf · 2025-03-17T14:08:15Z

This is a 🙋 feature or enhancement.

Summary

To do:

update the set of candidate models
update the docs for the list of available models
run the benchmark with the composer enabled (1 fold, 1h8c) for the updated model candidates
adapt the tests to the updated candidates
run the benchmark with a limited set of operations: {'resample', 'scaling', 'pca', 'normalization', 'poly_features'}
check the optimization history
run the benchmark with models: {'lgbm', 'xgboost', 'catboost', 'rf', 'logit', 'ridge', 'knn', 'treg', 'linear', 'lasso', 'adareg'}

done:

new set of tabular candidate models: {'lgbm', 'xgboost', 'catboost', 'rf', 'logit', 'ridge', 'knn', 'treg'}
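For reference, the difference between the model set named in the benchmark item and the new candidate set can be computed directly. This is a minimal sketch using plain Python sets copied from this description; the variable names are illustrative and are not FEDOT identifiers.

```python
# Model set from the benchmark item in the list above.
benchmark_models = {'lgbm', 'xgboost', 'catboost', 'rf', 'logit', 'ridge',
                    'knn', 'treg', 'linear', 'lasso', 'adareg'}

# New set of tabular candidate models from the "done" item.
candidate_models = {'lgbm', 'xgboost', 'catboost', 'rf', 'logit', 'ridge',
                    'knn', 'treg'}

# Models that appear only in the benchmark set.
only_in_benchmark = sorted(benchmark_models - candidate_models)
print(only_in_benchmark)
```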

Context

Closes #1339

github-actions · 2025-03-17T14:09:15Z

All PEP8 errors have been fixed, thanks ❤️

Comment last updated at Fri, 28 Mar 2025 18:28:37

nicl-nno · 2025-03-19T12:41:03Z

> try to implement operations aggregations

I think this is better done in a separate PR.

codecov · 2025-03-19T15:39:57Z

Codecov Report

Attention: Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 80.21%. Comparing base (eae485e) to head (e5ceb44).

Files with missing lines	Patch %	Lines
...ot/core/composer/gp_composer/specific_operators.py	50.00%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1379   +/-   ##
=======================================
  Coverage   80.20%   80.21%           
=======================================
  Files         146      146           
  Lines       10597    10597           
=======================================
+ Hits         8499     8500    +1     
+ Misses       2098     2097    -1
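The percentages in the report follow from the hit and line counts in the diff above. The sketch below reproduces them; the patch size of 5 lines is inferred from "60% with 2 lines missing" and is not stated by the report itself, and the helper name is illustrative.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100.0 * hits / lines, 2)

# Project coverage from the Hits and Lines rows of the diff.
base_coverage = coverage_percent(8499, 10597)
head_coverage = coverage_percent(8500, 10597)

# Inferred patch size: 2 missing lines make up the uncovered 40% of the patch.
missing_lines = 2
patch_coverage_pct = 60
patch_lines = missing_lines * 100 // (100 - patch_coverage_pct)

print(base_coverage, head_coverage, patch_lines)
```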

dmitryglhf · 2025-03-22T10:53:29Z

Benchmark results (1 hour, 8 cores, 1 fold):

	Metric (mean)	Master	Models update	PR#1380-Operations update
0	auc	0.901339	0.901033	0.898024
1	acc	0.86848	0.865966	0.863784
2	balacc	0.854555	0.852153	0.8357
3	logloss	0.339388	0.340543	0.339235
4	training_duration	2101.31	1810.22	1930.7

The metrics dropped slightly.
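To put the size of the change in numbers, the mean values from the table above can be converted into relative differences against Master. This is a small sketch with the values copied from the table; the helper name is illustrative. Note that for logloss and training_duration a lower value is better.

```python
# Mean values from the table: (Master, Models update, PR#1380-Operations update).
mean_metrics = {
    'auc': (0.901339, 0.901033, 0.898024),
    'acc': (0.86848, 0.865966, 0.863784),
    'balacc': (0.854555, 0.852153, 0.8357),
    'logloss': (0.339388, 0.340543, 0.339235),
    'training_duration': (2101.31, 1810.22, 1930.7),
}

def relative_change_pct(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return 100.0 * (new - base) / base

# Relative change of each PR variant against Master.
changes = {
    name: (relative_change_pct(master, models),
           relative_change_pct(master, operations))
    for name, (master, models, operations) in mean_metrics.items()
}

for name, (models_pct, operations_pct) in changes.items():
    print(f'{name}: models {models_pct:+.2f}%, operations {operations_pct:+.2f}%')
```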

Dataset	Version	Pipeline	Accuracy	AUC	Balanced Accuracy	Log Loss
Australian	Master	logit, fast_ica, logit, pca, scaling, xgboost, lgbm, catboost, bernb	0.855072	0.941426	0.859508	0.449198
	PR-Models	rf, scaling, poly_features, scaling, normalization	0.869565	0.944822	0.872666	0.29653
	PR-Operations+Models	logit, xgboost, scaling, knn, knn, xgboost, knn	0.855072	0.94652	0.859508	0.331309
Blood Transfusion Service Center	Master	mlp, normalization, poly_features, normalization, pca, fast_ica, normalization	0.76	0.759747	0.595029	0.486645
	PR-Models	logit, poly_features, scaling, resample	0.773333	0.774366	0.69883	0.59641
	PR-Operations+Models	logit, logit, poly_features, normalization	0.746667	0.757797	0.52924	0.489928
car	Master	rf	0.803468		0.865695	0.386452
	PR-Models	catboost, logit	0.774566		0.750098	0.40911
	PR-Operations+Models	rf	0.797688		0.743699	0.387637
christine	Master	catboost, scaling	0.750923	0.83595	0.750923	0.505129
	PR-Models	catboost, scaling	0.750923	0.83595	0.750923	0.505129
	PR-Operations+Models	catboost, scaling	0.750923	0.83595	0.750923	0.505129
cnae-9	Master	logit, scaling	0.962963		0.962963	0.141408
	PR-Models	logit	0.962963		0.962963	0.138759
	PR-Operations+Models	catboost	0.935185		0.935185	0.191485
credit-g	Master	logit, poly_features, scaling, isolation_forest_class	0.82	0.850952	0.77619	0.449925
	PR-Models	logit, poly_features, isolation_forest_class, scaling, poly_features	0.81	0.82619	0.769048	0.465911
	PR-Operations+Models	logit, pca, scaling, poly_features, scaling	0.81	0.822381	0.75	0.469132
fabert	Master	logit	0.697816		0.66528	0.834174
	PR-Models	logit	0.697816		0.66528	0.834174
	PR-Operations+Models	logit	0.697816		0.66528	0.834174
jasmine	Master	rf, bernb, poly_features, isolation_forest_class	0.80602	0.874116	0.805638	0.402396
	PR-Models	logit, rf, normalization, fast_ica, isolation_forest_class, xgboost	0.819398	0.880761	0.81906	0.454436
	PR-Operations+Models	rf, normalization	0.826087	0.873333	0.825749	0.397765
kr-vs-kp	Master	logit, xgboost, scaling, catboost, lgbm, dt	0.99375	0.999961	0.993738	0.0173921
	PR-Models	logit, catboost, scaling, fast_ica, xgboost, lgbm	1.0	1.0	1.0	0.00739224
	PR-Operations+Models	logit, catboost, scaling, lgbm, xgboost, scaling	0.99375	0.999961	0.993464	0.0151288
phoneme	Master	mlp, logit, normalization, rf, scaling, poly_features	0.907579	0.964898	0.886825	0.409138
	PR-Models	rf, poly_features	0.903882	0.963878	0.8787	0.227145
	PR-Operations+Models	logit, catboost, poly_features, resample, poly_features, rf, rf, scaling, poly_features	0.911275	0.967796	0.876593	0.29887
segment	Master	catboost	0.991342		0.991342	0.0691071
	PR-Models	logit, catboost, rf, scaling, lgbm	0.982684		0.982684	0.0707873
	PR-Operations+Models	logit, knn, resample, logit, scaling, rf	0.978355		0.978355	0.0878775
vehicle	Master	mlp, scaling, fast_ica, normalization, resample	0.894118		0.895022	0.279058
	PR-Models	logit, scaling, normalization, poly_features, scaling	0.858824		0.86039	0.408728
	PR-Operations+Models	logit, poly_features, scaling, resample	0.870588		0.872294	0.415623

It looks like the metrics were affected not by the updated list of models but by the composer choosing data operations more often.
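One way to check this against the per-dataset table is to compute the share of data operations in each pipeline. The sketch below does that for the Australian rows only. Which names count as models is an assumption made here for illustration (the model sets from the PR description plus mlp, bernb, and dt); every other node is treated as a data operation.

```python
# Assumption: these names are models; any other node counts as a data operation.
MODEL_NAMES = {'lgbm', 'xgboost', 'catboost', 'rf', 'logit', 'ridge', 'knn',
               'treg', 'linear', 'lasso', 'adareg', 'mlp', 'bernb', 'dt'}

def data_operation_share(pipeline: str) -> float:
    """Fraction of pipeline nodes that are not in MODEL_NAMES."""
    nodes = [node.strip() for node in pipeline.split(',') if node.strip()]
    if not nodes:
        return 0.0
    data_nodes = [node for node in nodes if node not in MODEL_NAMES]
    return len(data_nodes) / len(nodes)

# Pipelines copied from the Australian rows of the table above.
australian = {
    'Master': 'logit, fast_ica, logit, pca, scaling, xgboost, lgbm, catboost, bernb',
    'PR-Models': 'rf, scaling, poly_features, scaling, normalization',
    'PR-Operations+Models': 'logit, xgboost, scaling, knn, knn, xgboost, knn',
}

shares = {version: data_operation_share(p) for version, p in australian.items()}
for version, share in shares.items():
    print(f'{version}: {share:.2f}')
```

Running the same function over every row would show whether the pattern holds across datasets; a single dataset is not enough to confirm it.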

nicl-nno · 2025-03-22T11:44:12Z

Has the optimization history for credit-g been saved?

nicl-nno · 2025-03-22T11:49:57Z

The car case is interesting too, of course. Isn't rf among the initial assumptions?

But this is indeed unrelated to the changes in this PR.

dmitryglhf · 2025-03-22T13:16:55Z

> Has the optimization history for credit-g been saved?

I can't find where to view the optimization history in the benchmark; I only see the regular logs and metadata. In any case, I fixed the seed for the experiments, so I'll try to reproduce the run locally.

> The car case is interesting too, of course. Isn't rf among the initial assumptions?

Yes, that turned out interesting: both rf and catboost are present in the initial assumptions, each separately. I'll look into that as well.

nicl-nno · 2025-03-24T09:39:54Z

> I can't find where to view the optimization history in the benchmark

As far as I remember, arbitrary artifacts can be saved there in addition to the mandatory ones.

dmitryglhf added 2 commits March 15, 2025 13:12

work: added deprecated tag

cb76021

work: set 'boosting' tag for gbm

96bce20

dmitryglhf self-assigned this Mar 17, 2025

dmitryglhf added enhancement in progress labels Mar 17, 2025

dmitryglhf added 2 commits March 18, 2025 13:23

docs: available tabular models

b36ce42

small fix

791c599

dmitryglhf added 3 commits March 19, 2025 17:27

work: tests adaptation, closed old todo

8101671

fix: test update

472420f

work: update specific operators

2012296

nicl-nno reviewed Mar 19, 2025

test/unit/optimizer/gp_operators/test_mutation.py

nicl-nno approved these changes Mar 20, 2025

dmitryglhf added 3 commits March 21, 2025 15:19

Merge branch 'master' of https://github.com/aimclub/FEDOT into list-o…

76c31fd

…f-tabular-models-update

Merge branch 'master' of https://github.com/aimclub/FEDOT into list-o…

73757b3

…f-tabular-models-update

fix: test models repository update

43bfdff

dmitryglhf added 6 commits March 26, 2025 18:56

chore: operation repository update

e81c823

Revert "chore: operation repository update"

c4b03ef

This reverts commit e81c823.

chore: test_result_reproducing params update

a8152c3

Merge branch 'master' of https://github.com/aimclub/FEDOT into list-o…

a965a75

…f-tabular-models-update

chore: test_presets update

5d940dc

docs: available models update

e5ceb44
