Skip to content

Files

Latest commit

 

History

History
124 lines (107 loc) · 8.63 KB

complexity_analysis.md

File metadata and controls

124 lines (107 loc) · 8.63 KB

Time complexity analysis

Amortized analysis

  • Amortized Analysis is used for algorithms where an occasional operation is very slow, but most of the other operations are faster.
  • The example data structures whose operations are analyzed using Amortized Analysis are Hash Tables, Disjoint Sets and Splay Trees.
  • The complexity analysis using worst case gives a loose upper bound. Amortized analysis gives a tighter upper bound. In Amortized Analysis, we analyze a sequence of operations and guarantee a worst case average time which is lower than the worst case time of a particular expensive operation.

List operations

  • Inner working: Python lists are internally represented as arrays. The idea used is similar to implementation of vectors in C++ or ArrayList in Java. The costly operations are inserting and deleting items near the beginning (as everything has to be moved). Insert at the end also becomes costly if preallocated space becomes full.
Operation Examples Amortized complexity
Append l.append(item) O(1)
Clear l.clear() O(1)
Contain item in/not in l O(N)
Copy l.copy() O(N)
Delete del l[i] O(N)
Extend l.extend(…) O(N)
Equality l1==l2, l1!=l2 O(N)
Index l[i] O(1)
Iteration for item in l: O(N)
Length len(l) O(1)
Multiply k*l O(k*N)
Min, Max min(l), max(l) O(N)
Pop from end l.pop(-1) O(1)
Pop intermediate l.pop(item) O(N)
Reverse l.reverse() O(N)
Slice l[x:y] O(y-x)
Sort l.sort() O(N*log(N))
Store l[i]=item O(1)

Deque

  • Deque is implemented using a doubly linked list inside. If you need to add/remove at both ends, consider using a collections.deque instead of list.
Operation Examples Amortized complexity
Copy l.copy() O(1)
append l.append() O(1)
appendLeft l.appendleft O(1)
pop l.pop() O(1)
popleft l.popleft() O(1)
extend l.extend(…) O(K)
extendleft l.extendleft(…) O(K)
remove l.remove(…) O(n)

Dictionary operations

Operation Examples Amortized complexity
Clear d.clear() O(1)
Construction dict(…) O(len(d))
Delete del d[k] O(1)
Get d.get() O(1)
Iteration(key, value, item) for item in d: O(N)
Length len(d) O(1)
Pop d.pop(item) O(1)
Pop Item d.popitem() O(1)
Returning Views d.keys() O(1)
Fromkeys d.fromkeys(seq) O(len(seq))

Set operations

  • Inner working: Set in Python can be defined as the collection of items. In Python, these are basically used to include membership testing and eliminating duplicate entries. The data structure used in this is Hashing, a popular technique to perform insertion, deletion and traversal in O(1) on average. The operations on Hash Table are some what similar to Linked List. Sets in python are unordered list with duplicate elements removed.
Operation Examples Amortized complexity
Add s.add(item) O(1)
Clear s.clear() O(1)
Copy s.copy() O(N)
Containment item in/not in s O(1)
Creation set(…) O(len(s))
Discard s.discard(item) O(1)
Difference s1-s2 O(len(s1))
Difference Update s1.difference_update(s2) O(len(s2))
Equality s1==s2, s1!=s2 O(min(len(s1), len(s2)))
Intersection s1 & s2 O(min(len(s1), len(s2)))
Iteration for item in s: O(N)
Is Subset s1<=s2 O(len(s1))
Is Superset s1>=s2 O(len(s2))
Pop s.pop() O(1)
Union s1 s2
Symmetric Difference s1^s2 len(s1)

Sorted container inner working

  • Inner implementation: http://www.grantjenks.com/docs/sortedcontainers/implementation.html
  • First impl intuition:
    • The first is that Python’s list is fast, really fast. Lists have great characteristics for memory management and random access.
    • The second is that bisect.insort is fast. This is somewhat counter-intuitive since it involves shifting a series of items in a list
  • Second impl intuition: But using only one list and bisect.insort would produce sluggish behavior for lengths exceeding ten thousand. So the implementation of Sorted List uses a list of lists to store elements. In this way, inserting or deleting is most often performed on a short list. Only rarely does a new list need to be added or deleted.

Compare with tree based impl

  • Traditional tree-based designs have better big-O notation but that ignores the realities of today’s software and hardware. For a more in-depth analysis, read Performance at Scale. Compared to tree-based implementations, using lists of lists has a few advantages based on memory usage.
    1. Most insertion/deletion doesn’t require allocating or freeing memory. This can be a big win as it takes a lot of strain off the garbage collector and memory system.
    2. Pointers to elements are packed densely. A traditional tree-based implementation would require two pointers (left/right) to child nodes. Lists have no such overhead. This benefits the hardware’s memory architecture and more efficiently utilizies caches.
    3. The memory overhead per item is effectively a pointer to the item. Binary tree implementations must add at least two more pointers per item.
    4. Iteration is extremely fast as sequentially indexing lists is a strength of modern processors.

SortedList operations

SortedSet operations

SortedDict operations

References