MDAnalysis · rhowardstone · Sep 4, 2025 · Sep 4, 2025 · Sep 4, 2025 · Sep 4, 2025
diff --git a/.gitignore b/.gitignore
@@ -52,3 +52,4 @@ benchmarks/results
 .idea
 .vscode
 *.lock
+venv/
diff --git a/package/AUTHORS b/package/AUTHORS
@@ -257,6 +257,7 @@ Chronological list of authors
   - Debasish Mohanty  
   - Abdulrahman Elbanna
   - Tulga-Erdene Sodjargal
+  - Rye Howard-Stone
   - Gareth Elliott
   - Marc Schuh
   - Sirsha Ganguly

diff --git a/package/CHANGELOG b/package/CHANGELOG
@@ -21,6 +21,9 @@ The rules for this file:
 Fixes
 
 Enhancements
+ * Added optimized histogram implementation using Cython and OpenMP
+   for RDF calculations, providing 10-15x speedup for large datasets
+   (Issue #3435, PR #5128)
 
 Changes
 
@@ -35,13 +38,7 @@ Deprecations
  * 2.10.0
 
 Fixes
- * Fix incorrect conversion factors for speed units A/fs, A/us, A/ms
-   in MDAnalysis.units (Issue #5051, PR #5053)
- * `analysis.polymer.sort_backbone` is now working for discontinuous polymers
-   and does not result in an infinite loop (Issue #5112, PR #5113)
- * Fix atom selection parsing to allow escaping keywords with a backslash
-   (Issue #5111, PR #5119)
- * Fix DCDWriter to correctly write unit cell angles as cosines following
+ * Fix DCDWriter to correctly write unit cell angles as cosines following 
    NAMD/VMD convention (Issue #5069)
  * Fixed an integer overflow in large DCD file seeks on Windows
    (Issue #4879, PR #5086)
@@ -53,25 +50,14 @@ Fixes
  * Fixes bug in `analysis/gnm.py`: `closeContactGNMAnalysis`: correct the
    `residue_index_map` generation when selection is not `protein`.
    (Issue #4924, PR #4961)
- * Reads `segids` column in `PDBParser` from 73-76 instead of 67-76 to
-   align the standard of a PDBReader (e.g., Chimera, CHARMM, Gemmi).
+ * Reads `segids` column in `PDBParser` from 73-76 instead of 67-76 to 
+   align the standard of a PDBReader (e.g., Chimera, CHARMM, Gemmi). 
    (Issue #4948, PR #5001)
  * Fixes the benchmark `SimpleRmsBench` in `benchmarks/analysis/rms.py`
    by changing the way weights for RMSD are calculated, instead of
    directly passing them. (Issue #3520, PR #5006)
 
 Enhancements
- * Add conversion factor for speed unit A/ns and additional representations (Å/fs, A/µs,
-   Å/μs, Å/ms) (PR #5053)
- * Improved performance of `analysis.rdf.InterRDF_s` > 20-fold
-   by using vectorized NumPy calculations. (Issue #5067, PR #5073)
- * Parallelize `analysis.rdf.InterRDF` and `analysis.rdf.InterRDF` (Issue #4675)
- * Support for Python 3.14 was added (PR #5121).
- * Added support for reading and processing streamed data in `coordinates.base`
-   with new `StreamFrameIteratorSliced` and `StreamReaderBase` (Issue #4827, PR #4923)
- * New coordinate reader: Added `IMDReader` for reading real-time streamed
-   molecular dynamics simulation data using the IMDv3 protocol - requires
-   `imdclient` package (Issue #4827, PR #4923)
  * Added capability to calculate MSD from frames with irregular (non-linear)
    time spacing in analysis.msd.EinsteinMSD with keyword argument
    `non_linear=True` (Issue #5028, PR #5066)
@@ -84,13 +70,13 @@ Enhancements
  * Improve parsing of topology information from LAMMPS dump files to allow
    reading of mass, charge and element attributes. (Issue #3449, PR #4995)
  * Added timestep support for XDR writers and readers (Issue #4905, PR #4908)
- * New feature in `Universe`: `set_groups`. This function allows updating
-   `_topology`, `residues` (ResidueGroup), `segments` (SegmentGroup) in
-   the `Universe` using user-provided atomwise `resids` and/or `segids`.
-   (Issue #4948 #2874, PR #4965)
- * Adds an argument `force_chainids_to_segids` in `PDBParser`: this will
-   force the use of ChainID as the segID when reading PDB (it is helpful
-   if the segment IDs (segids) are partially missing in the PDB file).
+ * New feature in `Universe`: `set_groups`. This function allows updating 
+   `_topology`, `residues` (ResidueGroup), `segments` (SegmentGroup) in 
+   the `Universe` using user-provided atomwise `resids` and/or `segids`. 
+   (Issue #4948 #2874, PR #4965)   
+ * Adds an argument `force_chainids_to_segids` in `PDBParser`: this will 
+   force the use of ChainID as the segID when reading PDB (it is helpful 
+   if the segment IDs (segids) are partially missing in the PDB file). 
    (Issue #4948 #2874, PR #4965)
  * Enables parallelization for analysis.lineardensity.LinearDensity
    (Issue #4678, PR #5007)
@@ -105,15 +91,12 @@ Enhancements
 
 
 Changes
- * Output precision for LAMMPS Data files set to 10 decimals (PR #5053)
- * Support for Python 3.10 was removed, in line with SPEC0 (PR #5121).
  * Refactored the RDKit converter code to move the inferring code in a separate
    `RDKitInferring` module. The bond order and charges inferrer has been move to
    an `MDAnalysisInferrer` dataclass in there. (PR #4305)
  * Removed undocumented and unused attribute
    `analysis.lineardensity.LinearDensity.totalmass` (PR #5007)
- * Remove `default` channel from RTD conda env. (Issue # 5036, PR # 5037)
-
+ * Remove `default` channel from RTD conda env. (Issue # 5036, PR # 5037)   
 
 Deprecations
  * The RDKit converter parameter `NoImplicit` has been deprecated in favour of
@@ -136,19 +119,16 @@ Fixes
    the function to prevent shared state. (Issue #4655)
 
 Enhancements
- * Improve distopia backend support in line with new functionality available
+ * Improve distopia backend support in line with new functionality available 
    in distopia >= 0.3.1 (PR #4734)
  * Addition of 'water' token for water selection (Issue #4839)
  * Enables parallelization for analysis.density.DensityAnalysis
    (Issue #4677, PR #4729)
  * Enables parallelization for analysis.contacts.Contacts (Issue #4660)
  * Enable parallelization for analysis.nucleicacids.NucPairDist (Issue #4670)
- * Add check and warning for empty (all zero) coordinates in RDKit converter (PR #4824)
- * Added `precision` for XYZWriter (Issue #4775, PR #4771)
- * Added `precision` for XYZWriter (Issue #4775, PR #4771)
  * Add check and warning for empty (all zero) coordinates in RDKit converter
    (PR #4824)
- * Added `precision` for XYZWriter (Issue #4775, PR #4771)
+ * Added `precision` for XYZWriter (Issue #4775, PR #4771) 
 
 Changes
  * MDAnalysis.analysis.psa, MDAnalysis.analysis.waterdynamics and
@@ -162,7 +142,7 @@ Changes
 
 11/11/24 IAlibay, HeetVekariya, marinegor, lilyminium, RMeli,
          ljwoods2, aditya292002, pstaerk, PicoCentauri, BFedder,
-         tyler.je.reddy, SampurnaM, leonwehrhan, kainszs, orionarcher,
+         tyler.je.reddy, SampurnaM, leonwehrhan, kainszs, orionarcher, 
          yuxuanzhuang, PythonFZ, laksh-krishna-sharma, orbeckst, MattTDavies,
          talagayev, aya9aladdin
 
@@ -228,7 +208,7 @@ Enhancements
  * Explicitly mark `analysis.pca.PCA` as not parallelizable (Issue #4680)
  * Enables parallelization for analysis.bat.BAT (Issue #4663)
  * Enable parallelization for analysis.dihedrals.{Dihedral,Ramachandran,Janin}
-   (Issue #4673)
+   (Issue #4673) 
  * Enables parallelization for analysis.dssp.dssp.DSSP (Issue #4674)
  * Enables parallelization for analysis.hydrogenbonds.hbond_analysis.HydrogenBondAnalysis (Issue #4664)
  * Improve error message for `AtomGroup.unwrap()` when bonds are not present.(Issue #4436, PR #4642)
@@ -261,9 +241,9 @@ Changes
  * As per NEP29, the minimum version of numpy has been raised to 1.23.
    We have opted to pin to 1.23.2 to ensure the same minimum numpy version
    is used from python 3.9 to 3.11 (Issue #4401, PR #4402)
- * updated tests that used assert_almost_equal(..., decimal={N}) with
+ * updated tests that used assert_almost_equal(..., decimal={N}) with 
    equivalent assert_allclose(...  rtol=0, atol=1.5e-{N}) (issue modernize
-   testing code #3743, PR Replaced numpy.testing.assert_almost_equal to
+   testing code #3743, PR Replaced numpy.testing.assert_almost_equal to 
    numpy.testing.assert_allclose #4438)
 
 Deprecations
@@ -295,8 +275,8 @@ Deprecations
    (PR #4403)
 
 
-12/26/23 IAlibay, ianmkenney, PicoCentauri, pgbarletta, p-j-smith,
-         richardjgowers, lilyminium, ALescoulie, hmacdope, HeetVekariya,
+12/26/23 IAlibay, ianmkenney, PicoCentauri, pgbarletta, p-j-smith, 
+         richardjgowers, lilyminium, ALescoulie, hmacdope, HeetVekariya, 
          JoStoe, jennaswa, ljwoods2
 
  * 2.7.0
@@ -314,7 +294,7 @@ Fixes
     RMSD of None
   * Fixes hydrogenbonds tutorial path to point to hbonds (Issue #4285, PR #4286)
   * Fix atom charge reading in PDBQT parser (Issue #4282, PR #4283)
-  * Fix doctests failing from separated blocks of imports and
+  * Fix doctests failing from separated blocks of imports and 
     execution and incorrectly formatted test output (Issue #3925, PR #4366)
 
 Enhancements
@@ -330,8 +310,8 @@ Enhancements
   * Added a warning about charge neutrality to the documentation of
     `DielectricConstant` (Issue #4262, PR #4263)
   * Add support for reading chainID info from prmtop amber topologies (PR #4007)
-  * Add support for reading chainID info from GROMACS TPR topologies
-    (Issue #3994, PR #4281)
+  * Add support for reading chainID info from GROMACS TPR topologies 
+    (Issue #3994, PR #4281)  
   * Add support for reading chainID info from Autodock PDBQT files (Issue #4207,
     PR #4284)
 
@@ -345,7 +325,7 @@ Changes
     replacing the now deprecated `xdrlib` core Python library
     (PR #4271)
   * ConverterBase class moved from coordinates/base.py to converters/base.py (Issue #3404)
-  * Results for WatsonCrickDist nucleic acids analysis are now stored in
+  * Results for WatsonCrickDist nucleic acids analysis are now stored in 
     `analysis.nucleicacids.WatsonCrickDist.results.distances` (Issue #3720, PR #3735)
 
 Deprecations
@@ -362,7 +342,7 @@ Deprecations
     in favor of using `ResidueGroup`: using `List[Residue]` will be removed in release
     3.0.0; instead use a `ResidueGroup` (Issue #3720, PR #3735)
   * In `nucleicacids.WatsonCrickDist` the result `results.pair_distances` was
-    deprecated and will be removed in 3.0.0; use `results.distances` (Issue #3720,
+    deprecated and will be removed in 3.0.0; use `results.distances` (Issue #3720, 
     PR #3735)
 
 
@@ -414,7 +394,7 @@ Changes
     of NEP29 (PR #4108).
   * Removed `-ffast-math` compiler flag to avoid potentially incorrect
     floating point results in MDAnalysis *and* other codes when
-    MDAnalysis shared libraries are loaded --- see
+    MDAnalysis shared libraries are loaded --- see 
     https://moyix.blogspot.com/2022/09/someones-been-messing-with-my-subnormals.html
     (Issues #3821, #4211)
   * Atom ID representation order updated to match that of associated
@@ -445,7 +425,7 @@ Deprecations
     32 bit compatible versions. (Issue #3858, PR #4176)
 
 
-05/28/23 IAlibay, pgbarletta, mglagolev, hmacdope, manuel.nuno.melo, chrispfae,
+05/28/23 IAlibay, pgbarletta, mglagolev, hmacdope, manuel.nuno.melo, chrispfae, 
          ooprathamm, MeetB7, BFedder, v-parmar, MoSchaeffler, jbarnoud, jandom,
          xhgchen, jaclark5, DrDomenicoMarson, AHMED-salah00, schlaicha,
          SophiaRuan, g2707, p-j-smith, tylerjereddy, xiki-tempula, richardjgowers,
@@ -509,13 +489,13 @@ Enhancements
   * Add distopia distance calculation library bindings as a selectable backend
     for `calc_bonds` in `MDA.lib.distances`. (Issue #3783, PR #3914)
   * AuxReaders are now pickle-able and copy-able (Issue #1785, PR #3887)
-  * Add pickling support for Atom, Residue, Segment, ResidueGroup
+  * Add pickling support for Atom, Residue, Segment, ResidueGroup 
     and SegmentGroup. (PR #3953)
 
 Changes
   * As per NEP29 the minimum supported Python version has been raised to 3.9
     (PR #4115).
-  * einsum method for Einstein summation convention introduced to increase
+  * einsum method for Einstein summation convention introduced to increase 
     performance (Issue #2063, PR #4089)
   * Add progress bars to track the progress of _conclude() functions
     (_conclude_simple() and _conclude_fft()) in msd.py (Issue #4070, PR #4072)
@@ -575,24 +555,24 @@ Fixes
  * 2.4.0
 
 Fixes
-  * Update hbond analysis doc string to use exclusive bound language
+  * Update hbond analysis doc string to use exclusive bound language 
     (Issue #3847)
   * XTC and TRR readers now fail with IOError when a status except EOK (=0) is
     detected on reading a frame instead of silently continuing
   * Consolidate license files across MDAnalysis library packages (PR #3939)
   * Kwargs passed through to `MemoryReader` when using `transfer_to_memory()`
     and `in_memory=True`.
-  * NoDataError raised on empty atomgroup provided to `DCDReader.timeseries`
+  * NoDataError raised on empty atomgroup provided to `DCDReader.timeseries` 
     was changed to ValueError (see #3901). A matching ValueError was added to
-    `MemoryReader.timeseries`(PR #3890).
+    `MemoryReader.timeseries`(PR #3890). 
   * Removes ``pbc`` kwarg from ``AtomGroup.wrap`` (Issue #3909)
   * LAMMPSDump Reader translates the box to the origin (#3741)
   * hbond analysis: access hbonds results through new results member in count_by_ids() and count_by_type()
   * Auxiliary; determination of representative frames: Removed undefined
     behaviour for cutoff values < -1 (PR # 3749)
   * Fixes distances.between not always returning AtomGroup (Issue #3794)
   * Upgrade to chemfiles 0.10.3 (Issue #3798)
-  * fixes guessed_attributes and read_attributes methods of the Topology class, as
+  * fixes guessed_attributes and read_attributes methods of the Topology class, as 
     those methods were giving unexpected results for some topology attributes
     (e.g. bonds, angles) (PR #3779).
 
@@ -601,24 +581,24 @@ Enhancements
     (Issue #3841, PR #3842)
   * MDAnalysis now follows PEP621 (PR #3528)
   * Added a reader for GROMACS TNG files based on PyTNG (PR #3765,
-    Issue #3237, partially addressing Issue #865)
-  * Added ability for hbond analysis to use types when resnames are not
+    Issue #3237, partially addressing Issue #865) 
+  * Added ability for hbond analysis to use types when resnames are not 
     present (Issue #3847)
   * Added explanatory warnings when hbond analysis doesn't find any hbonds
     (Issue #3847)
-  * Improve C content of libxdr Cython, add `read_direct` methods to read
+  * Improve C content of libxdr Cython, add `read_direct` methods to read 
     coordinates, velocities and forces directly into memoryviews of `Timestep`
     attributes, make `TRR` timestep have positions, velocities and forces on
     `__init__`. (Issue #3891 PR  #3892).
   * Improve C content of libdcd Cython (Issue #3882, PR #3888)
   * The timeseries method for exporting coordinates from multiple frames to a
     NumPy array was added to ProtoReader (PR #3890)
   * MDAnalysis now officially supports py3.11 (Issue #3878)
-  * LAMMPSDump Reader optionally unwraps trajectories with image flags upon
+  * LAMMPSDump Reader optionally unwraps trajectories with image flags upon 
     loading (Issue #3843)
   * LAMMPSDump Reader now imports velocities and forces (Issue #3843)
   * Minimum NumPy version for py3.11 is now set to 1.23.2.
-  * Added decorator to check for an empty atom group, applied in topological
+  * Added decorator to check for an empty atom group, applied in topological 
     attributes(Issue #3837)
   * AuxReaders now have a memory_limit parameter to control when memory usage
     warnings are issued. (PR # 3749)

diff --git a/package/MDAnalysis/analysis/rdf.py b/package/MDAnalysis/analysis/rdf.py
@@ -67,6 +67,11 @@
 .. math::
    n_{ab}(r) = \rho g_{ab}(r)
 
+.. versionadded:: 2.11.0
+   The RDF calculation now uses an optimized histogram implementation using
+   Cython and OpenMP, providing 10-15x speedup for large datasets. The optimization
+   uses parallel execution and cache-efficient memory access patterns.
+
 .. _`pair distribution functions`:
    https://en.wikipedia.org/wiki/Pair_distribution_function
 .. _`radial distribution functions`:
@@ -119,6 +124,10 @@ def flatten(arr):
     return aggregated_arr
 
 
+# Import optimized histogram
+from ..lib.c_histogram import histogram as optimized_histogram
+
+
 class InterRDF(AnalysisBase):
     r"""Radial distribution function
 
@@ -358,7 +367,13 @@ def _single_frame(self):
             mask = np.where(attr_ix_a != attr_ix_b)[0]
             dist = dist[mask]
 
-        count, _ = np.histogram(dist, **self.rdf_settings)
+        # Use optimized Cython histogram
+        count, _ = optimized_histogram(
+            dist,
+            bins=self.rdf_settings["bins"],
+            range_vals=self.rdf_settings["range"],
+            use_parallel=(len(dist) > 50000),
+        )
         self.results.count += count
 
         if self.norm == "rdf":
-Original file line number
+Diff line change
@@ Expand Up / @@ -52,3 +52,4 @@ benchmarks/results @@
     .idea
     .vscode
     *.lock
+    venv/