Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comparison to SortedContainers.SortedList #1

Open
grantjenks opened this issue Jul 6, 2015 · 5 comments
Open

Comparison to SortedContainers.SortedList #1

grantjenks opened this issue Jul 6, 2015 · 5 comments

Comments

@grantjenks
Copy link

Have you seen the Python SortedContainers module? It's a fast, pure-Python implementation of SortedList, SortedSet, and SortedDict data types. It also has an extensive performance comparison with modules related to yours.

I was interested in your performance claims and I wanted to benchmark SortedContainers against your skip lists. But a SortedList is not quite the same as a SkipList. So I wrote a little wrapper and did a benchmark comparison (using your perf_skiplist.py. Here's my results:

Test PySkipList SortedContainers Compare
skiplist_index_throughput_1000 152326.28 602629.89 3.95617808037 times faster
skiplist_index_throughput_10000 134544.94 485423.76 3.60789309505 times faster
skiplist_index_throughput_100000 100671.55 382475.61 3.79924228841 times faster
skiplist_insert_throughput_1000 77543.06 368406.15 4.75098803168 times faster
skiplist_insert_throughput_10000 57264.03 321809.49 5.61974925621 times faster
skiplist_insert_throughput_100000 42304.85 257105.53 6.07744809401 times faster
skiplist_remove_throughput_1000 88881.20 171370.95 1.92808996728 times faster
skiplist_remove_throughput_10000 66682.10 140854.81 2.11233314488 times faster
skiplist_remove_throughput_100000 50299.01 109368.50 2.17436685136 times faster
skiplist_search_throughput_1000 226229.99 346636.69 1.53223138099 times faster
skiplist_search_throughput_10000 159439.83 241837.23 1.51679307485 times faster
skiplist_search_throughput_100000 103455.39 181585.36 1.75520444126 times faster

I'm not sure if you published this because the SkipList data structure is awesome (which it is) or because you need this in a production environment where performance matters. In the latter case, I wanted to provide this wrapper for you and get your thoughts. Here's the wrapper source (Apache2 License):

from operator import itemgetter
from sortedcontainers import SortedListWithKey

class PySkipList(SortedListWithKey):
    def __init__(self):
        self._list = SortedListWithKey(key=itemgetter(0))

    def insert(self, key, value):
        self._list.add((key, value))

    def replace(self, key, value):
        if not len(self):
            self._list.add((key, value))

        pos = self._list.bisect_key_left(key)
        pair = self._list[pos]

        if key == pair[0]:
            self._list[pos] = (key, value)
        else:
            self._list.add((key, value))

    def clear(self):
        self._list.clear()

    def __len__(self):
        return len(self._list)

    def __iter__(self, start=None, stop=None):
        return self._list.irange_key(
            min_key=start, max_key=stop, inclusive=(False, False)
        )

    items = __iter__

    def keys(self, start=None, stop=None):
        return (pair[0] for pair in self.items(start, stop))

    def values(self, start=None, stop=None):
        return (pair[1] for pair in self.items(start, stop))

    def popitem(self):
        if len(self):
            return self._list.pop(0)
        else:
            raise KeyError

    def search(self, key, default=None):
        if not len(self):
            return default

        pos = self._list.bisect_key_left(key)
        pair = self._list[pos]

        if key == pair[0]:
            return pair
        else:
            return default

    def remove(self, key):
        if not len(self):
            raise KeyError

        pos = self._list.bisect_key_left(key)
        pair = self._list[pos]

        if key == pair[0]:
            return self._list.pop(pos)
        else:
            raise KeyError

    __not_set = object()

    def pop(self, key, default=__not_set):
        if not len(self):
            if default is __not_set:
                raise KeyError
            else:
                return default

        pos = self._list.bisect_key_left(key)
        pair = self._list[pos]

        if key == pair[0]:
            return self._list.pop(pos)[1]
        else:
            if default is __not_set:
                raise KeyError
            else:
                return default

    def __contains__(self, key):
        if not len(self):
            return False

        pos = self._list.bisect_key_left(key)

        return key == self._list[pos][0]

    def index(self, key, default=__not_set):
        if not len(self):
            if default is __not_set:
                raise KeyError
            else:
                return default

        pos = self._list.bisect_key_left(key)
        pair = self._list[pos]

        if key == pair[0]:
            return pos
        else:
            if default is __not_set:
                raise KeyError
            else:
                return default

    def count(self, key):
        start = self._list.bisect_key_left(key)
        end = self._list.bisect_key_right(key)
        return end - start

    def __getitem__(self, pos):
        if isinstance(pos, slice):
            return self._list.islice(pos.start, pos.stop)
        else:
            return self._list[pos]

    def __delitem__(self, pos):
        del self._list[pos]

    def __setitem__(self, pos, value):
        pair = self._list.pop(pos)
        self._list[pos] = (pair[0], value)

If you're interested in making this even faster, let me know. There's a few things that could be done.

@geertj
Copy link
Owner

geertj commented Jul 11, 2015

Thanks!

I wasn't aware of sortedcontainers. The skiplist code originated in 2013 or so. Had I known about this module, I might have used it.

Your implementation is in essence a 2-level B-tree with a leaf page size of 1000, correct? And you maintain a separate index for positional access (it wasn't immediately clear to me how that worked).

About performance... I was mostly interested in Big-O performance. Having a small constant factor between my and your implementation doesn't bother me that much.

That said, I don't want to back down from a nice challenge.. .:) The reason sortedcontainers is fast is because your inner loops are in essence bisect, list.insert and list.remove, which have fast C-level implementations in the standard lib. The inner loops of PySkipList are not common so they aren't in the standard library.

I did a small test where I implemented the inner loop for search() (which is _find_lt()) in C. The results are here:

Test PySkipList SortedContainers Compare
skiplist_search_throughput_1000 515587.46 252061.54 2.05 times slower
skiplist_search_throughput_10000 413028.46 167220.33 2.47 times slower
skiplist_search_throughput_100000 241382.81 123530.10 1.95 times slower

As you can see SortedContainers is about 2x slower here. Also on PyPy I would expect PySkipList to be very fast without a C extension, as I expect the JIT to specialize and inline most the inner loops.

C implementation is below. Right now it only contains the _find_lt method, which speeds up the search operations. To speed up insertion and removal I'd have to implement _insert() and _remove as well.

I may add this to PySkipList, but I'm not sure yet. The inner loops for insertion and removal are a bit more messy.

/*
 * This file is part of PySkiplist. PySkiplist is Copyright (c) 2012-2015 by
 * the PySkiplist authors.
 *
 * PySkiplist is free software available under the MIT license. See the file
 * named LICENSE distributed with this file for the exact licensing terms.
 */

#include <Python.h>

#if PY_MAJOR_VERSION != 3
#  error "This module is for Python 3 only."
#endif

static PyObject *
_skiplist_find_lt(PyObject *self, PyObject *args)
{
    PyObject *Phead, *Ptail, *Ppath, *Pdistance, *Pkey, *Pnode, *Pnnode;
    long i, level, distance;

    if (!PyArg_ParseTuple(args, "O!O!O!O!lO:find_lt",
                &PyList_Type, &Phead, &PyList_Type, &Ptail,
                &PyList_Type, &Ppath, &PyList_Type, &Pdistance,
                &level, &Pkey))
        return NULL;

    Pnode = Phead;
    distance = 0;

    for (i=level-1; i>=0; i--) {
        Pnnode = PyList_GET_ITEM(Pnode, 2+i);
        while ((Pnnode != Ptail) &&
                    PyObject_RichCompareBool(PyList_GET_ITEM(Pnnode, 0), Pkey, Py_LT)) {
            Pnode = Pnnode;
            Pnnode = PyList_GET_ITEM(Pnnode, 2+i);
            distance += i ? PyLong_AS_LONG(PyList_GET_ITEM(Pnode, Py_SIZE(Pnode)-1)) : 1;
        }
        Py_INCREF(Pnode);
        if (PyList_SetItem(Ppath, i, Pnode) < 0)
            return NULL;
        if (PyList_SetItem(Pdistance, i, PyLong_FromLong(distance)) < 0)
            return NULL;
    }

    Py_INCREF(Py_None);
    return Py_None;
}

static PyMethodDef _skiplist_methods[] =
{
    {"find_lt", (PyCFunction) _skiplist_find_lt, METH_VARARGS},
    {NULL, NULL}
};

PyDoc_STRVAR(_skiplist_doc, "PySkipList accelerator methods");

static struct PyModuleDef _skiplist_module =  {
    PyModuleDef_HEAD_INIT,
    "_skiplist",
    _skiplist_doc,
    -1,
    _skiplist_methods
};

PyMODINIT_FUNC PyInit__skiplist(void)
{
    return PyModule_Create(&_skiplist_module);
}

To use it, you need this patch:

diff --git a/pyskiplist/skiplist.py b/pyskiplist/skiplist.py
index 5285ef8..5d396d5 100644
--- a/pyskiplist/skiplist.py
+++ b/pyskiplist/skiplist.py
@@ -13,6 +13,8 @@ import sys
 import math
 import random

+from . import _skiplist
+
 __all__ = ['SkipList']


@@ -362,7 +364,8 @@ class SkipList(object):
         If the key was not found, return *default*. If no default was provided,
         return ``None``. This method never raises a ``KeyError``.
         """
-        self._find_lt(key)
+        _skiplist.find_lt(self._head, self._tail, self._path, self._distance,
+                          self._level, key)
         node = self._path[0][2]
         if node is self._tail or key < node[0]:
             return default
diff --git a/setup.cfg b/setup.cfg
index 8bd5075..70a0349 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,3 +1,6 @@
+[build_ext]
+inplace=1
+
 [bdist_wheel]
 universal=1

diff --git a/setup.py b/setup.py
index 484130c..510bcdc 100644
--- a/setup.py
+++ b/setup.py
@@ -7,7 +7,7 @@

 from __future__ import absolute_import, print_function

-from setuptools import setup
+from setuptools import setup, Extension


 version_info = {
@@ -32,7 +32,11 @@ version_info = {


 def main():
-    setup(packages=['pyskiplist'], **version_info)
+    setup(
+        packages=['pyskiplist'],
+        ext_modules=[Extension('pyskiplist._skiplist', ['pyskiplist/_skiplist.c'])],
+        **version_info
+    )


 if __name__ == '__main__':

@grantjenks
Copy link
Author

Wow. So great to get a positive response and a counter-challenge.

That's right. SortedContainers uses a 2-level B-tree (which is really a half-hearted B-tree) with leaf size of 1,000. The indexing is hard to explain. I probably left the best notes in the source at:

Can you try benchmarking the SortedContainers version again but with this search method instead:

    def search(self, key, default=None):
        _list = self._list
        _maxes = _list._maxes

        if not _maxes:
            return default

        pos = bisect_left(_maxes, key)

        if pos == len(_maxes):
            return default

        idx = bisect_left(_list._keys[pos], key)

        pair = _list._lists[pos][idx]

        if key == pair[0]:
            return pair
        else:
            return default

This method uses some knowledge of the SortedList internals and skips the numeric indexing. It also inlines the bisect_key_left call. Function calls can be surprisingly slow in Python. Here's the performance numbers compared to the old version:

Test old_soco_search new_soco_search Compare
skiplist_search_throughput_1000 271388.16 644781.55 2.37586470242
skiplist_search_throughput_10000 238774.00 783103.81 3.27968627237
skiplist_search_throughput_100000 189808.98 552685.68 2.9117994312

I think that would put it ahead of your C implementation. Let me know your local results.

I also wouldn't put so much emphasis on big-O as it can be deceiving.

I'd bet SortedContainers does still better on PyPy mainly because of memory utilization. Your skip-list adds a node for every inserted item which you implement as a list. SortedContainers just adds a pointer to the object. I think that's why SortedContainers has an increasing lead on the insert benchmark. Let me know if you run the numbers. SortedContainers has a runtime comparison that may interest you.

@geertj
Copy link
Owner

geertj commented Jul 11, 2015

Latest results. I did some more optimization on the skiplist side. I also ran the benchmark up to 10m nodes to see what would happen.

Test PySkipList SortedContainers Compare
skiplist_search_throughput_1000 611218.89 636555.93 1.04 times faster
skiplist_search_throughput_10000 474677.83 572452.82 1.21 times faster
skiplist_search_throughput_100000 265458.48 432803.68 1.63 times faster
skiplist_search_throughput_1000000 166906.62 313530.94 1.87 times faster
skiplist_search_throughput_10000000 110138.05 219230.91 1.99 times faster

It's quite interesting to me how well SortedContainers holds up to 10m nodes, even if at that point the load factor of 1,000 is too small.

SortedContaniers is a fine implementation indeed!

I think your insight is correct: modern processors can do a memmove() much faster than they can do random memory access. This means that B-tree type algorithms become more efficient than structures like binary trees or linked lists where each node is represented by its own allocated structure. And the second benefit is that because not every node is its own allocation, you save memory. (And the third, Python-specific benefit would be that you don't have to write C code but instead just use list).

Essentially, memory has become like rotating storage again :)

I'll put a note on the pyskiplist page. I think that at this point users should definitely consider using SortedCollections instead.

The only (small) thing for me is that when storing key/value tuples in a SortedListWithKey, that I get the per-item overhead of the tuple again. I can use a SortedDict instead but only if there are no duplicate keys. Did you ever consider a variant that stores a separate value in-line with the key?

When SortedContainers reaches 1.0, I think you should consider submitting it for inclusion in the standard library. Sorted data types are obviously missing. And as this thread shows, it is not immediately obvious what the best solution is.

@grantjenks
Copy link
Author

The performance of SortedContainers at scale was a pleasant surprise to me too. That it also does so in pure-Python is one of those lessons about "programming in Python" versus "programming the interpreter". The bisect module exposes a surprisingly powerful and fast set of interpreter commands.

Thanks for being willing to put a note on the pyskiplist page. I've been trying to build support around SortedContainers and these steps are just the ones toward inclusion in the standard library. You're support joins Mark Summerfield and his sorteddict project.

You raise a good point about the design of SortedListWithKey. I had originally wanted to implement your pyskiplist using a SortedDict but colliding keys would break indexing. The SortedListWithKey actually already stores the keys in a parallel list of lists as the values. You wouldn't have to override too many methods to get your version working. For example, add currently looks like this:

    def add(self, val):
        """Add the element *val* to the list."""
        _maxes, _lists, _keys = self._maxes, self._lists, self._keys

        key = self._key(val)

        if _maxes:
            pos = bisect_right(_maxes, key)

            if pos == len(_maxes):
                pos -= 1
                _maxes[pos] = key
                _keys[pos].append(key)
                _lists[pos].append(val)
            else:
                idx = bisect_right(_keys[pos], key)
                _keys[pos].insert(idx, key)
                _lists[pos].insert(idx, val)

            self._expand(pos)
        else:
            _maxes.append(key)
            _keys.append([key])
            _lists.append([val])

        self._len += 1

But it could instead work like:

    def add(self, key, val):
        """Add the element *val* to the list with *key*."""
        _maxes, _lists, _keys = self._maxes, self._lists, self._keys

        if _maxes:
            pos = bisect_right(_maxes, key)

            if pos == len(_maxes):
                pos -= 1
                _maxes[pos] = key
                _keys[pos].append(key)
                _lists[pos].append(val)
            else:
                idx = bisect_right(_keys[pos], key)
                _keys[pos].insert(idx, key)
                _lists[pos].insert(idx, val)

            self._expand(pos)
        else:
            _maxes.append(key)
            _keys.append([key])
            _lists.append([val])

        self._len += 1

The only difference there is the addition of the key parameter and removal of the self._key call. Other methods are more complex (e.g. update) but not impossible. For your particular purposes, it would certainly be faster to avoid the tuple overhead.

I'm not eager to add a bunch of add_with_key and update_with_key methods but if you want to open a bug report, I'd be glad to see how many other people have a similar need. I suppose an alternate implementation would have SortedListWithKey require the key manually supplied and the current SortedListWithKey could inherit and be called SortedListWithKeyFunc or something.

The diversity of the problem space is one reason Python doesn't have sorted container types. The current api originated from the blist module and adapts the sorted(list, key=...) built-in.

Thanks for your dialogue on this topic.

@grantjenks
Copy link
Author

Thanks for adding the SortedContainers reference to your README.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants