Documentation revamp to stress the new compute engine more
FrancescAlted committed Dec 12, 2024
1 parent 256313f commit 1e64ebd
Showing 4 changed files with 164 additions and 119 deletions.
2 changes: 1 addition & 1 deletion ANNOUNCE.rst
@@ -34,7 +34,7 @@ On top of C-Blosc2 we built Python-Blosc2, a Python wrapper that exposes the
C-Blosc2 API, plus many extensions that allow it to work transparently with
NumPy arrays, while performing advanced computations on compressed data that
can be stored either in-memory, on-disk or on the network (via the
`Caterva2 library <https://github.com/ironArray/Caterva2>`_).

Python-Blosc2 leverages both NumPy and numexpr for achieving great performance,
but with a twist. Among the main differences between the new computing engine
137 changes: 60 additions & 77 deletions README.rst
@@ -2,8 +2,8 @@
Python-Blosc2
=============

A fast & compressed ndarray library with a flexible compute engine
==================================================================

:Author: The Blosc development team
:Contact: [email protected]
@@ -26,58 +26,46 @@ A fast & compressed ndarray library with a flexible compute engine
What it is
==========

Python-Blosc2 is a high-performance compressed ndarray library with a flexible
compute engine. It uses the C-Blosc2 library as the compression backend.
`C-Blosc2 <https://github.com/Blosc/c-blosc2>`_ is the next generation of
Blosc, an `award-winning <https://www.blosc.org/posts/prize-push-Blosc2/>`_
library that has been around for more than a decade, and that is used
by many projects, including `PyTables <https://www.pytables.org/>`_ and
`Zarr <https://zarr.readthedocs.io/en/stable/>`_.

Python-Blosc2 is a Python wrapper that exposes the C-Blosc2 API, *plus* a
compute engine that allows it to work transparently with NumPy arrays,
while performing advanced computations on compressed data that
can be stored either in-memory, on-disk or on the network (via the
`Caterva2 library <https://github.com/ironArray/Caterva2>`_).

Python-Blosc2 places special emphasis on interacting well with existing
libraries and tools. In particular, it provides:

* Support for n-dim arrays that are compressed in-memory, on-disk or on the
network.
* High-performance compression codecs for integer, floating point, complex,
  boolean, string and structured data.
* Support for the NumPy `universal functions mechanism <https://numpy.org/doc/2.1/reference/ufuncs.html>`_,
  allowing you to mix and match NumPy and Blosc2 compute engines.
* Excellent integration with Numba and Cython via
`User Defined Functions <https://www.blosc.org/python-blosc2/getting_started/tutorials/03.lazyarray-udf.html>`_.
* Lazy expressions that are computed only when needed, and that can be stored
for later use.

Python-Blosc2 leverages both `NumPy <https://numpy.org>`_ and
`NumExpr <https://numexpr.readthedocs.io/en/latest/>`_ for achieving great
performance, but with a twist. Among the main differences between the new
computing engine and NumPy or numexpr, you can find:

* Support for ndarrays that can be compressed and stored in-memory, on-disk
or `on the network <https://github.com/ironArray/Caterva2>`_.
* Support for many kinds of math expressions, including reductions, indexing,
  filters and more.
* Support for the NumPy ufunc mechanism, allowing you to mix and match NumPy
  and Blosc2 computations.
* Excellent integration with Numba and Cython via User Defined Functions.
* Support for broadcasting operations, which allow performing operations on
  arrays of different shapes.
* Much better adherence to the NumPy casting rules than numexpr.
* Lazy expressions that are computed only when needed, and can be stored for
later use.
* Persistent reductions, where resulting ndarrays can be updated incrementally.
* Support for proxies that allow working with compressed data on local or
  remote machines.

You can read some of our tutorials on how to perform advanced computations at:

https://www.blosc.org/python-blosc2/getting_started/tutorials

As well as the full documentation at:

https://www.blosc.org/python-blosc2

Finally, Python-Blosc2 aims to leverage the full C-Blosc2 functionality to
support a wide range of compression and decompression needs, including
metadata, serialization and other bells and whistles.

**Note:** Blosc2 is meant to be backward compatible with Blosc(1) data.
That means that it can read data generated with Blosc, but the opposite
is not true (i.e. there is no *forward* compatibility).

NDArray: an N-Dimensional store
===============================

@@ -132,21 +120,19 @@ Here is a simple example:
As you can see, the ``NDArray`` instances are very similar to NumPy arrays,
but behind the scenes, they store compressed data that can be processed
efficiently using the new computing engine included in Python-Blosc2.
[Although not exercised above, broadcasting and reductions also work, as well as
filtering, indexing and sorting operations for structured arrays (tables).]

To whet your appetite, here is the performance (measured on a modern desktop machine)
that you can achieve when the operands in the expression above fit comfortably in memory
(20_000 x 20_000):

.. image:: https://github.com/Blosc/python-blosc2/blob/main/images/lazyarray-expr.png?raw=true
:width: 90%
:alt: Performance when operands fit in-memory

In this case, the performance is somewhat below that of top-tier libraries like
Numexpr, but still quite good, especially when compared with plain NumPy. For
these short benchmarks, Numba normally loses because its relatively large
compilation overhead cannot be amortized.

One important point is that the memory consumption when using the ``LazyArray.compute()``
method is pretty low (does not exceed 100 MB) because the output is an ``NDArray`` object,
@@ -159,26 +145,29 @@ Another point is that, when using the Blosc2 engine, computation with compression is
actually faster than without it (not by a large margin, but still). To understand why,
you may want to read `this paper <https://www.blosc.org/docs/StarvingCPUs-CISE-2010.pdf>`_.

And here is the performance when the operands and result (50_000 x 50_000) barely fit in memory
(a machine with 64 GB of RAM, for a working set of 60 GB):

.. image:: https://github.com/Blosc/python-blosc2/blob/main/images/lazyarray-expr-large.png?raw=true
:width: 90%
:alt: Performance when operands do not fit well in-memory

In this latter case, the memory consumption figures do not seem extreme; this
is because the displayed values represent *actual* memory consumption *during*
the computation, and not virtual memory; in addition, the resulting array is
boolean, so it does not take too much space to store (just 2.4 GB uncompressed).

In this latter scenario, the performance compared to Numexpr or Numba is quite
competitive, and actually faster than those. This is because the Blosc2
compute engine is able to stream the computation over the compressed chunks
and blocks, making better use of the memory and CPU caches.

You can find the notebooks for these benchmarks at:

https://github.com/Blosc/python-blosc2/blob/main/bench/ndarray/lazyarray-expr.ipynb

https://github.com/Blosc/python-blosc2/blob/main/bench/ndarray/lazyarray-expr-large.ipynb

Feel free to run them on your own machine and compare the results.

Installing
==========

@@ -189,12 +178,17 @@ You can install the binary packages from PyPI using ``pip``:
   pip install blosc2
If you want to install the latest release, you can do it with pip:

.. code-block:: console

   pip install blosc2 --upgrade

For conda users, you can install the package from the conda-forge channel:

.. code-block:: console

   conda install -c conda-forge blosc2

Documentation
=============
@@ -209,7 +203,7 @@ https://github.com/Blosc/python-blosc2/tree/main/examples

Finally, we taught a tutorial at the `PyData Global 2024 <https://pydata.org/global2024/>`_
that you can find at: https://github.com/Blosc/Python-Blosc2-3.0-tutorial. There you will
find different Jupyter notebooks that explain the main features of Python-Blosc2.

Building from sources
=====================
@@ -233,18 +227,7 @@ correctly by running the tests:
.. code-block:: console

   pip install .[test]
   pytest   # add -v for verbose mode

License
=======
@@ -287,11 +270,11 @@ to the core development of the Blosc2 library:
- Ivan Vilata i Balaguer
- Oumaima Ech.Chdig

In addition, other people have participated in the project in different
aspects:

- Jan Sellner, who contributed the mmap support for NDArray/SChunk objects.
- Dimitri Papadopoulos, who contributed a large bunch of improvements in
  many aspects of the project. His attention to detail is remarkable.
- And many others that have contributed with bug reports, suggestions and
improvements.
@@ -319,4 +302,4 @@ organization, which is a non-profit that supports many open-source projects.
Thank you!


**Compress Better, Compute Bigger**