
Commit 6bad102

feat: Add additional data descriptors to benchmark results
- Updated `sklbench/benchmarks/custom_function.py` to include additional information (`memory_usage`, `feature_names`, and `class_distribution`) in the benchmark results.
- Ensured that the `result` dictionary is updated with the new fields and metrics.
- Modified `sklbench/report/implementation.py` to handle and process the new data descriptors in the benchmark results.

These changes enhance the benchmark results by providing more detailed information about the dataset and its characteristics.
1 parent 7c7f1f9 commit 6bad102

2 files changed: +9 -3 lines changed


Diff for: sklbench/benchmarks/custom_function.py

+6 -3
@@ -104,11 +104,14 @@ def main(bench_case: BenchCase, filters: List[BenchCase]):
         "function": function_name,
     }
     result = enrich_result(result, bench_case)
-    # TODO: replace `x_train` data_desc with more informative values
-    result.update(data_description["x_train"])
+    # Replace `x_train` data_desc with more informative values
+    result.update({
+        "memory_usage": x_train.nbytes,
+        "feature_names": list(x_train.columns) if isinstance(x_train, pd.DataFrame) else None,
+        "class_distribution": dict(pd.Series(y_train).value_counts()) if y_train is not None else None
+    })
     result.update(metrics)
     return [result]
 
-
 if __name__ == "__main__":
     main_template(main)
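
The hunk above computes three descriptors inline. The following is a minimal sketch of the same idea as a standalone helper, not the code from this commit. The names `describe_data` and `_to_builtin` are hypothetical. It assumes `x_train` is either a pandas-like object (with `memory_usage()` and `columns`) or a NumPy-like array (with `nbytes`), and that `y_train`, when given, is a one-dimensional sequence of labels. It uses duck typing so it does not depend on `pd` being imported, which the hunk does not show.

```python
from collections import Counter
from typing import Any, Dict, Optional


def _to_builtin(value: Any) -> Any:
    """Convert NumPy-style scalars to plain Python values; leave others as-is."""
    item = getattr(value, "item", None)
    return item() if callable(item) else value


def describe_data(x_train: Any, y_train: Optional[Any] = None) -> Dict[str, Any]:
    """Hypothetical helper: build the three descriptors named in this commit."""
    # pandas objects report memory through memory_usage();
    # NumPy arrays expose the byte count as the nbytes attribute.
    if hasattr(x_train, "memory_usage"):
        usage = x_train.memory_usage(deep=True)
        # DataFrame returns a per-column Series, Series returns a plain int.
        memory_usage = int(usage.sum()) if hasattr(usage, "sum") else int(usage)
    elif hasattr(x_train, "nbytes"):
        memory_usage = int(x_train.nbytes)
    else:
        memory_usage = None

    # Only table-like inputs carry column names.
    columns = getattr(x_train, "columns", None)
    feature_names = [str(c) for c in columns] if columns is not None else None

    # Count labels with plain Python keys and values so the result
    # can be serialized without NumPy-specific types.
    class_distribution = None
    if y_train is not None:
        class_distribution = dict(Counter(_to_builtin(label) for label in y_train))

    return {
        "memory_usage": memory_usage,
        "feature_names": feature_names,
        "class_distribution": class_distribution,
    }
```

Under these assumptions, the call site could become `result.update(describe_data(x_train, y_train))`. Whether the fields previously supplied by `data_description["x_train"]` should still be merged into `result` is a separate decision that this sketch does not address.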

Diff for: sklbench/report/implementation.py

+3
@@ -89,12 +89,15 @@
     "dataset",
     "samples",
     "features",
+    "feature_names",
     "format",
     "dtype",
     "order",
     "n_classes",
+    "class_distribution",
     "n_clusters",
     "batch_size",
+    "memory_usage",
 ]
 
 DIFFBY_COLUMNS = ["environment_name", "library", "format", "device"]
