Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How come fromList is O(n) instead of O(n*log n)? #309

Open
sjakobi opened this issue Mar 29, 2021 · 4 comments · May be fixed by #498
Open

How come fromList is O(n) instead of O(n*log n)? #309

sjakobi opened this issue Mar 29, 2021 · 4 comments · May be fixed by #498

Comments

@sjakobi
Copy link
Member

sjakobi commented Mar 29, 2021

-- | /O(n)/ Construct a map with the supplied mappings. If the list
-- contains duplicate mappings, the later mappings take precedence.
fromList :: (Eq k, Hashable k) => [(k, v)] -> HashMap k v
fromList = L.foldl' (\ m (k, v) -> unsafeInsert k v m) empty
{-# INLINABLE fromList #-}

By contrast, fromListWith is documented to be O(n*log n).

The difference between these functions is that the former uses unsafeInsert while the latter uses unsafeInsertWith. Both are recursive though, so I don't understand how fromList gets away with linear complexity?!

@sjakobi sjakobi changed the title How come fromList is O(n) instead of O(n*log n) How come fromList is O(n) instead of O(n*log n)? Mar 29, 2021
@jberryman
Copy link
Contributor

I think the log n is "log n but not reaaaaaally log n since the base is so large", so there's just an inconsistency here

@treeowl
Copy link
Collaborator

treeowl commented Sep 29, 2021

It's also not really log n because it's only expected to be a balanced tree with a perfect hash function. For instances like Hashable Int, that's very far from the case.

@Qqwy
Copy link

Qqwy commented Feb 24, 2025

I stumbled across this today. What is very curious, is that it is documented as O(n * log(n)) in the strict module, but as O(n) in the lazy and internal modules. (Thanks @Melvar for finding this)

The implementation between the strict and lazy versions are the same, except for an extra bang pattern on insert of each element. That cannot cause a difference in big-O notation as we're not counting 'time to evaluate a thunk' here.

fromListWith is implemented similarly (using unsafeInsertWith instead of unafeInsert) and it is listed as O(n * log(n)) in all modules.

Looking at when the documentation of the Strict module changed, points to commit fc031a3 which is aptly named "Documentation improvements".

This leads to the conclusion that it is a documentation mistake in the Internal and Lazy modules, and that the O(n * log(n)) version is correct.

@treeowl
Copy link
Collaborator

treeowl commented Feb 24, 2025

Agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants