Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop favouring numeric sort specialized algorithms #11

Open
dumblob opened this issue Jun 21, 2023 · 4 comments
Open

Stop favouring numeric sort specialized algorithms #11

dumblob opened this issue Jun 21, 2023 · 4 comments

Comments

@dumblob
Copy link

dumblob commented Jun 21, 2023

It seems the current benchmark sorts numbers (disregarding whether floats or integers) in an array/vector/list/...

Yet, this is fundamentally flawed in my opinion as many (most?) languages and compilers and libraries do specialization (in compile time or in run time or both) and switch to magnitudes faster numeric sorting algorithm ("non-comparison sort") which has totally different mathematical boundaries (almost O(n)) than what this benchmark wants to measure - i.e. the pair-wise comparison sorting algorithms ("comparison sort") with provably best possible O(n log n).

https://en.wikipedia.org/wiki/Sorting_algorithm#Non-comparison_sorts

Should we rather compare some complex structs of e.g. 19 bytes in size?

I would prefer no power of 2, not bigger than half of a typical cache line (i.e. max 32 bytes), no even number, and not "power of two plus 1" (to make struct compression difficult to avoid matching power of 2). Any implicit padding under the hood is fair IMHO.

@LilithHafner
Copy link
Owner

This repo is for evaluating different language's approaches to performing sorting dominated workflows. There are no requirements about the internals of the sorting algorithms, including whether they are comparison-based or non-comparison-based.

That said, there are lots of sorting tasks and ways to evaluate sorting algorithms and I agree that it would be nice to cover cases other than IEEE floating point. If I get the time and energy to do it, I might extend this repo to a wider scope. However, having to reproduce all the benchmarking code in 8 languages makes this somewhat time-consuming. which is why, as of now, according to the readme, "the benchmark covers only sorting of random floating point numbers into increasing order."

@dumblob
Copy link
Author

dumblob commented Jun 22, 2023

the benchmark covers only sorting of random floating point numbers into increasing order

Yep. IMHO this necessitates a big disclaimer in the readme about the comparison (O(n log n)) and non-comparison (O(n)) differences as this benchmark seems to be linked from different places AFAIK but all those places deal with comparison sorts and this benchmark is being understood as a comparison benchmark which is 100% false leading to flawed and very misleading assertions like - and I quote - "so the primary comparison of note there is that Julia's default sorting algorithm (which happens to be stable) is about twice as fast as glidesort".

@LilithHafner
Copy link
Owner

Please quote complete sentences when a partial sentence excerpt omits important limitations.

The purpose of InterLanguageSortingBenchmarks is to give a rough idea of performance across languages, not within them, so the primary comparison of note there is that Julia's default sorting algorithm (which happens to be stable) is about twice as fast as glidesort in that particular benchmark on that particular system.

orlp/glidesort#2 (comment)

@dumblob
Copy link
Author

dumblob commented Jun 25, 2023

Yes, that is exactly what I meant - "twice as fast" is by far not - and I quote - "a rough idea of performance" (in the context of sorting algorithms where the champions compete with rather small differences in performance).

That is why I find it very misleading and actually rather false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants