diff --git a/doc/specs/index.md b/doc/specs/index.md index 098f2ce21..0883ea07c 100644 --- a/doc/specs/index.md +++ b/doc/specs/index.md @@ -19,6 +19,7 @@ This is and index/directory of the specifications (specs) for each new module/fe - [logger](./stdlib_logger.html) - Runtime logging system - [optval](./stdlib_optval.html) - Fallback value for optional arguments - [quadrature](./stdlib_quadrature.html) - Numerical integration + - [sorting](.stdlib_sorting.html) - Sorting of rank one arrays - [stats](./stdlib_stats.html) - Descriptive Statistics - [stats_distribution_PRNG](./stdlib_stats_distribution_PRNG.html) - Probability Distributions random number generator - [string\_type](./stdlib_string_type.html) - Basic string support diff --git a/doc/specs/stdlib_sorting.md b/doc/specs/stdlib_sorting.md new file mode 100644 index 000000000..83429166d --- /dev/null +++ b/doc/specs/stdlib_sorting.md @@ -0,0 +1,644 @@ +--- +title: Sorting Procedures +--- + +# The `stdlib_sorting` module + +(TOC) + +## Overview of sorting + +The sorting of collections of data is useful in the analysis of those +collections. +With its absence of generics and limited polymorphism, it is +impractical, in current Fortran, to provide sorting routines for +arbitrary collections of arbitrary types of data. +However Fortran's arrays are by far its most widely used collection, +and arrays of arbitrary types of data can often be sorted in terms of +a single component of intrinsic type. +The Fortran Standard Library therefore provides a module, +`stdlib_sorting`, with procedures to sort arrays of simple intrinsic +numeric types, i.e. the different kinds of integers and reals, the +default assumed length character, and the `stdlib_string_type` +module's `string_type` type. + +## Overview of the module + +The module `stdlib_sorting` defines several public entities one +default integer parameter, `int_size`, and three overloaded +subroutines: `ORD_SORT`, `SORT`, and `SORT_INDEX`. The +overloaded subroutines also each have seven specific names for +versions corresponding to different types of array arguments. + +### The `int_size` parameter + +The `int_size` parameter is used to specify the kind of integer used +in indexing the various arrays. Currently the module sets `int_size` +to the value of `int64` from the `stdlib_kinds` module. + +### The module subroutines + +The `stdlib_sorting` module provides three different overloaded +subroutines intended to sort three different kinds of arrays of +data: +* `ORD_SORT` is intended to sort simple arrays of intrinsic data + that have significant sections that were partially ordered before + the sort; +* `SORT_INDEX` is based on ORD_SORT, but in addition to sorting the + input array, it returns indices that map the original array to its + sorted version. This enables related arrays to be re-ordered in the + same way; and +* `SORT` is intended to sort simple arrays of intrinsic data + that are effectively unordered before the sort. + +#### Licensing + +The Fortran Standard Library is distributed under the MIT +License. However components of the library may be based on code with +additional licensing restriction. In particular `ORD_SORT`, +`SORT_INDEX`, and `SORT` are translations of codes with their +own distribution restrictions. + +The `ORD_SORT` and `SORT_INDEX` subroutines are essentially +translations to Fortran 2008 of the `"rust" sort` of the Rust Language +distributed as part of +[`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs). +The header of the `slice.rs` file has as its licensing requirements: + + Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT + file at the top-level directory of this distribution and at + http://rust-lang.org/COPYRIGHT. + + Licensed under the Apache License, Version 2.0 or the MIT license + , at your + option. This file may not be copied, modified, or distributed + except according to those terms. + +so the license for the `slice.rs` code is compatible with the use of +modified versions of the code in the Fortran Standard Library under +the MIT license. + +The `SORT` subroutine is essentially a translation to Fortran +2008 of the +[`introsort`]((http://www.cs.rpi.edu/~musser/gp/introsort.ps) of David +Musser. David Musser has given permission to include a variant of +`introsort` in the Fortran Standard Library under the MIT license +provided we cite: + + Musser, D.R., “Introspective Sorting and Selection Algorithms,” + Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997). + +as the official source of the algorithm. + + +#### The `ORD_SORT` subroutine + +`ORD_SORT` is a translation of the `"Rust" sort` sorting algorithm +contained in [`slice.rs`] +(https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs). +`"Rust" sort`, in turn, is inspired by the [`timsort` algorithm] +(http://svn.python.org/projects/python/trunk/Objects/listsort.txt) +that Tim Peters created for the Python Language. +`ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`, +and `insertion sort`. It has always at worst O(N Ln(N)) runtime +performance in sorting random data, having a performance about 15-25% +slower than `SORT` on such data. However it has much better +performance than `SORT` on partially sorted data, having O(N) +performance on uniformly increasing or decreasing data. + + +`ORD_SORT` begins by traversing the array starting in its tail +attempting to identify `runs` in the array, where a run is either a +uniformly decreasing sequence, `ARRAY(i-1) > ARRAY(i)`, or a +non-decreasing, `ARRAY(i-1) <= ARRAY(i)`, sequence. Once delimitated +decreasing sequences are reversed in their order. Then, if the +sequence has less than `MIN_RUN` elements, previous elements in the +array are added to the run using `insertion sort` until the run +contains `MIN_RUN` elements or the array is completely processed. As +each run is identified the start and length of the run +is then pushed onto a stack and the stack is then processed using +`merge` until it obeys the stack invariants: + +1. len(i-2) > len(i-1) + len(i) +2. len(i-1) > len(i) + +ensuring that processing the stack is, at worst, of order `O(N +Ln(N))`. However, because of the identification of decreasing and +non-decreasing runs, processing of structured data can be much faster, +with processing of uniformly decreasing or non-decreasing arrays being +of order O(N). The result in our tests is that `ORD_SORT` is about +25% slower than `SORT` on purely random data, depending on +the compiler, but can be `Ln(N)` faster than `SORT` on highly +structured data. As a modified `merge sort`, `ORD_SORT` requires the +use of a "scratch" array, that may be provided as an optional `work` +argument or allocated internally on the stack. + +#### The `SORT_INDEX` subroutine + +The `SORT` and `ORD_SORT` subroutines can sort rank 1 isolated +arrays of intrinsic types, but do nothing for the coordinated sorting +of related data, e.g., multiple related rank 1 arrays, higher rank +arrays, or arrays of derived types. For such related data, what is +useful is an array of indices that maps a rank 1 array to its sorted +form. For such a sort, a stable sort is useful, therefore the module +provides a subroutine, `SORT_INDEX`, that generates such an array of +indices based on the `ORD_SORT` algorithm, in addition to sorting +the input array. + +The logic of `SORT_INDEX` parallels that of `ORD_SORT`, with +additional housekeeping to keep the array of indices consistent with +the sorted positions of the input array. Because of this additional +housekeeping it has slower runtime performance than `ORD_SORT`. +`SORT_INDEX` requires the use of two "scratch" arrays, that may be +provided as optional `work` and `iwork` arguments or allocated +internally on the stack. + +#### The `SORT` subroutine + +`SORT` uses the `introsort` sorting algorithm of David Musser. +`introsort` is a hybrid unstable comparison algorithm combining +`quicksort`, `insertion sort`, and `heap sort`. While this algorithm's +runtime performance is always O(N Ln(N)), it is relatively fast on +randomly ordered data, but does not show the improvement in +performance on partly sorted data found for `ORD_SORT`. + +As with `introsort`, `SORT` is an unstable hybrid algorithm. +First it examines the array and estimates the depth of recursion a +quick sort would require for ideal (random) data, `D = +Ceiling(Ln(N)/Ln(2))`. It then defines a limit to the number of +`quicksort` recursions to be allowed in processing, +`D_limit = factor * D`, where factor is currently 2, and +calls `introsort` proper. `introsort` proper then: + +1. Examines the number of elements remaining to be sorted, and, if + they are less than 16, sorts them using insertion sort and returns; +2. If they are not less than 16, checks whether the current depth of + recursion exceeds `D_limit` and, if it does, processes the remaining + elements with heap sort and returns; +3. If the current depth of recursion does not exceed `D_limit`, then + in effect does a `quicksort` step: + + * Partitions the remaining array using a median of three, + * Calls `introsort` proper on the leftmost partition, + * Calls `introsort` proper on the rightmost partition, and then + returns. + +The resulting algorithm is of order O(N Ln(N)) run time performance +for all inputs. Because it relies on `quicksort`, the coefficient of +the O(N Ln(N)) behavior is typically small compared to other sorting +algorithms on random data. On partially sorted data it can show either +slower `heap sort` performance, or enhanced performance by up to a +factor of six. Still, even when it shows enhanced performance, its +performance on partially sorted data is typically an order of +magnitude slower than `ORD_SORT`. Its memory requirements are also +low, being of order O(Ln(N)), while the memory requirements of +`ORD_SORT` and `SORT_INDEX` are of order O(N). + +### Tentative specifications of the `stdlib_sorting` procedures + +#### `ord_sort` - sorts an input array + +##### Status + +Experimental + +##### Description + +Returns an input `array` with the elements sorted in order of +increasing value. + +##### Syntax + +`call [[stdlib_sorting(module):ord_sort(subroutine)]]ord_sort ( array[, work ] )` + +##### Class + +Generic subroutine. + +##### Arguments + +`array` : shall be a rank one array of any of the types: +`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`, +`real(sp)`, `real(dp)`, `real(qp)`, `character(*)`, or +`type(string_type)`. It is an `intent(inout)` argument. On input it is +the array to be sorted. If both the type of `array` is real and at +least one of the elements is a `NaN`, then the ordering of the result +is undefined. Otherwise on return its elements will be sorted in order +of non-decreasing value. + +`work` (optional): shall be a rank one array of the same type as +array, and shall have at least `size(array)/2` elements. It is an +`intent(inout)` argument. It is intended to be used as "scratch" +memory for internal record keeping. If associated with an array in +static storage, its use can significantly reduce the stack memory +requirements for the code. Its contents on return are undefined. + +##### Notes + +`ORD_SORT` implements a hybrid sorting algorithm combining +`merge sort`, and `insertion sort`. For most purposes it behaves like +a `merge sort`, providing worst case `O(N Ln(N))` run time performance +for most random arrays, that is typically slower than `SORT`. +However, if the array has significant runs of decreasing or +non-decreasing values, performance can be much better than +`SORT`, with `O(N)` behavior on uniformly decreasing, or +non-decreasing arrays. The optional `work` array replaces "scratch" +memory that would otherwise be allocated on the stack. If `array` is of +any type `REAL` the order of its elements on return undefined if any +element of `array` is a `NaN`. Sorting of `CHARACTER(*)` and +`STRING_TYPE` arrays are based on the operator `>`, and not on the +function `LGT`. + + +##### Example + +```fortran + ... + ! Read arrays from sorted files + call read_sorted_file( 'dummy_file1', array1 ) + call read_sorted_file( 'dummy_file2', array2 ) + ! Concatenate the arrays + array = [ array1, array2 ] + ! Sort the resulting array + call ord_sort( array, work ) + ! Process the sorted array + call array_search( array, values ) + ... +``` + +#### `sort` - sorts an input array + +##### Status + +Experimental + +##### Description + +Returns an input array with the elements sorted in order of increasing +value. + +##### Syntax + +`call [[stdlib_sorting(module):sort(subroutine)]]sort ( array )` + +##### Class + +Pure generic subroutine. + +##### Arguments + +`array` : shall be a rank one array of any of the types: +`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`, +`real(sp)`, `real(dp)`, `real(qp)`. `character(*)`, or +`type(string_type)`. It is an `intent(inout)` argument. On return its +input elements will be sorted in order of non-decreasing value. + +##### Notes + +`SORT` implements a hybrid sorting algorithm combining +`quicksort`, `merge sort`, and `insertion sort`. For most purposes it +behaves like a `quicksort` with a median of three partition, providing +good, `O(N Ln(N))`, run time performance for most random arrays, but +defaulting to `merge sort` if the structure of the array results in +the `quicksort` not converging as rapidly as expected. If `array` is of +any type `REAL`, the behavior of the sorting is undefined if any +element of `array` is a `NaN`. Sorting of `CHARACTER(*)` and +`STRING_TYPE` arrays are based on the operators `<`, `<=`, `>`, and +`>=`, and not on the functions `LLT`, `LLE`, `LGT`, or `LGE`. + +##### Example + +```fortran + ... + ! Read random data from a file + call read_file( 'dummy_file', array ) + ! Sort the random data + call sort( array ) + ! Process the sorted data + call array_search( array, values ) + ... +``` + +#### `sort_index` - creates an array of sorting indices for an input array, while also sorting the array. + +##### Status + +Experimental + +##### Description + +Returns the input `array` sorted in the direction requested while +retaining order stability, and an integer array whose elements would +sort the input `array` to produce the output `array`. + +##### Syntax + +`call [[stdlib_sorting(module):sort_index(subroutine)]]sort_index ( array, index[, work, iwork, reverse ] )` + +##### Class + +Generic subroutine. + +##### Arguments + +`array`: shall be a rank one array of any of the types: +`integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`, +`real(sp)`, `real(dp)`, `real(qp)`, `character(*)`, or +`type(string_type)`. It is an `intent(inout)` argument. On input it +will be an array whose sorting indices are to be determined. On return +it will be the sorted array. + +`index`: shall be a rank one integer array of kind `int_size` and of +the size of `array`. It is an `intent(out)` argument. On return it +shall have values that are the indices needed to sort the original +array in the desired direction. + +`work` (optional): shall be a rank one array of any of the same type as +`array`, and shall have at least `size(array)/2` elements. It is an +`intent(inout)` argument. It is intended to be used as "scratch" +memory for internal record keeping. If associated with an array in +static storage, its use can significantly reduce the stack memory +requirements for the code. Its contents on return are undefined. + +`iwork` (optional): shall be a rank one integer array of kind +`int_size`, and shall have at least `size(array)/2` elements. It +is an `intent(inout)` argument. It is intended to be used as "scratch" +memory for internal record keeping. If associated with an array in +static storage, its use can significantly reduce the stack memory +requirements for the code. Its contents on return are undefined. + +`reverse` (optional): shall be a scalar of type default logical. It +is an `intent(in)` argument. If present with a value of `.true.` then +`index` will sort `array` in order of non-increasing values in stable +order. Otherwise index will sort `array` in order of non-decreasing +values in stable order. + +##### Notes + +`SORT_INDEX` implements the hybrid sorting algorithm of `ORD_SORT`, +keeping the values of `index` consistent with the elements of `array` +as it is sorted. As a `merge sort` based algorithm, it is a stable +sorting comparison algorithm. The optional `work` and `iwork` arrays +replace "scratch" memory that would otherwise be allocated on the +stack. If `array` is of any kind of `REAL` the order of the elements in +`index` and `array` on return are undefined if any element of `array` +is a `NaN`. Sorting of `CHARACTER(*)` and `STRING_TYPE` arrays are +based on the operator `>`, and not on the function `LGT`. + +It should be emphasized that the order of `array` will typically be +different on return + + +##### Examples + +Sorting a related rank one array: + +```Fortran + subroutine sort_related_data( a, b, work, index, iwork ) + ! Sort `a`, and also sort `b` to be reorderd the same way as `a` + integer, intent(inout) :: a(:) + integer(int32), intent(inout) :: b(:) ! The same size as a + integer(int32), intent(inout) :: work(:) + integer(int_size), intent(inout) :: index(:) + integer(int_size), intent(inout) :: iwork(:) + ! Find the indices to sort a + call sort_index(a, index(1:size(a)),& + work(1:size(a)/2), iwork(1:size(a)/2)) + ! Sort b based on the sorting of a + b(:) = b( index(1:size(a)) ) + end subroutine sort_related_data +``` + +Sorting a rank 2 array based on the data in a column + +```Fortran + subroutine sort_related_data( array, column, work, index, iwork ) + ! Reorder rows of `array` such that `array(:, column)` is sorted + integer, intent(inout) :: array(:,:) + integer(int32), intent(in) :: column + integer(int32), intent(inout) :: work(:) + integer(int_size), intent(inout) :: index(:) + integer(int_size), intent(inout) :: iwork(:) + integer, allocatable :: dummy(:) + integer :: i + allocate(dummy(size(array, dim=1))) + ! Extract a column of `array` + dummy(:) = array(:, column) + ! Find the indices to sort the column + call sort_index(dummy, index(1:size(dummy)),& + work(1:size(dummy)/2), iwork(1:size(dummy)/2)) + ! Sort a based on the sorting of its column + do i=1, size(array, dim=2) + array(:, i) = array(index(1:size(array, dim=1)), i) + end do + end subroutine sort_related_data +``` + +Sorting an array of a derived type based on the data in one component + +```fortran + subroutine sort_a_data( a_data, a, work, index, iwork ) + ! Sort `a_data` in terms or its component `a` + type(a_type), intent(inout) :: a_data(:) + integer(int32), intent(inout) :: a(:) + integer(int32), intent(inout) :: work(:) + integer(int_size), intent(inout) :: index(:) + integer(int_size), intent(inout) :: iwork(:) + ! Extract a component of `a_data` + a(1:size(a_data)) = a_data(:) % a + ! Find the indices to sort the component + call sort_index(a(1:size(a_data)), index(1:size(a_data)),& + work(1:size(a_data)/2), iwork(1:size(a_data)/2)) + ! Sort a_data based on the sorting of that component + a_data(:) = a_data( index(1:size(a_data)) ) + end subroutine sort_a_data +``` + +#### Specific procedures + +Usually the name of a generic procedure is the most convenient way of +invoking it. However sometimes it is necessary to pass a procedure as +an argument to another procedure. In that case it is usually necessary +to know the name of the specific procedure desired. The following +table lists the specific subroutines and the corresponding types of +their `array` arguments. + +| Generic Subroutine | Specific Subroutine | Array type | +|--------------------|---------------------|-----------------| +| `ORD_SORT` | `INT8_ORD_SORT` | `INTEGER(INT8)` | +| | `INT16_ORD_SORT` | `INTEGER(INT16)` | +| | `INT32_ORD_SORT` | `INTEGER(INT32)` | +| | `INT64_ORD_SORT` | `INTEGER(INT64)` | +| | `SP_ORD_SORT` | `REAL(SP)` | +| | `DP_ORD_SORT` | `REAL(DP)` | +| | `QP_ORD_SORT` | `REAL(QP)` | +| | `CHAR_ORD_SORT` | `CHARACTER(*)` | +| | `STRING_ORD_SORT` | `STRING_TYPE` | +| `SORT` | `INT8_SORT` | `INTEGER(INT8)` | +| | `INT16_SORT` | `INTEGER(INT16)` | +| | `INT32_SORT` | `INTEGER(INT32)` | +| | `INT64_SORT` | `INTEGER(INT64)` | +| | `SP_SORT` | `REAL(SP)` | +| | `DP_SORT` | `REAL(DP)` | +| | `QP_SORT` | `REAL(QP)` | +| | `CHAR_SORT` | `CHARACTER(*)` | +| | `STRING_SORT` | `STRING_TYPE` | +| `SORT_INDEX` | `INT8_SORT_INDEX` | `INTEGER(INT8)` | +| | `INT16_SORT_INDEX` | `INTEGER(INT16)` | +| | `INT32_SORT_INDEX` | `INTEGER(INT32)` | +| | `INT64_SORT_INDEX` | `INTEGER(INT64)` | +| | `SP_SORT_INDEX` | `REAL(SP)` | +| | `DP_SORT_INDEX` | `REAL(DP)` | +| | `QP_SORT_INDEX` | `REAL(QP)` | +| | `CHAR_SORT_INDEX` | `CHARACTER(*)` | +| | `STRING_SORT_INDEX` | `STRING_TYPE` | + + +### Performance benchmarks + +We have performed benchmarks of the procedures on nine different +integer arrays each of size `2**20`: + +* Blocks - the array is divided into six blocks, each of distinct + uniformly increasing integers. +* Decreasing - values decrease uniformly from `2**20-1` to `0`. +* Identical - all integers have the same value of 10. +* Increasing - values increase uniformly from `0` to `2**20-1`. +* Random dense - the integers are generated randomly from a set of + values from `0` to `2**18-1` so duplicates are dense. +* Random order - a set of integers from `0` to `2**20 - 1` in random + order. +* Random sparse - the integers are generated randomly from a set of + values from `0` to `2**22-1` so duplicates are sparse. +* Random-3 - the increasing array has 3 random exchanges of individual + elements. +* Random-10 - the final ten elements of the increasing array are + replaced by random values. + +On three different default character arrays, each of length 4 and of +size `26**4`: + +* Char. Decreasing - values decrease uniformly from `"zzzz"` to + `"aaaa"`. +* Char. Increasing - values decrease uniformly from `"aaaa"` to + `"zzzz"`. +* Char. Random - the set of strings from `"aaaa"` to `"zzzz"` in + random order. + +On three different `string_type` arrays, each of length 4 elements and +of size `26**3`: + +* String Decreasing - values decrease uniformly from `"zzz"` to + `"aaa"`. +* String Increasing - values decrease uniformly from `"aaa"` to + `"zzz"`. +* String Random - the set of strings from `"aaa"` to `"zzz"` in + random order. + +These benchmarks have been performed on two different compilers, both +on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8 +GB 2133 MHz LPDDR3 memory. The first compiler was GNU Fortran +(GCC) 10.2.0, with the following results: + +| Type | Elements | Array Name | Method | Time (s) | +|-------------|----------|-----------------|-------------|-----------| +| Integer | 1048576 | Blocks | Ord_Sort | 0.00738 | +| Integer | 1048576 | Decreasing | Ord_Sort | 0.00380 | +| Integer | 1048576 | Identical | Ord_Sort | 0.00220 | +| Integer | 1048576 | Increasing | Ord_Sort | 0.00209 | +| Integer | 1048576 | Random dense | Ord_Sort | 0.17972 | +| Integer | 1048576 | Random order | Ord_Sort | 0.17503 | +| Integer | 1048576 | Random sparse | Ord_Sort | 0.17340 | +| Integer | 1048576 | Random 3 | Ord_Sort | 0.00847 | +| Integer | 1048576 | Random 10 | Ord_Sort | 0.00484 | +| Character | 456976 | Char. Decrease | Ord_Sort | 0.00763 | +| Character | 456976 | Char. Increase | Ord_Sort | 0.00414 | +| Character | 456976 | Char. Random | Ord_Sort | 0.23746 | +| String_type | 17576 | String Decrease | Ord_Sort | 0.00543 | +| String_type | 17576 | String Increase | Ord_Sort | 0.00347 | +| String_type | 17576 | String Random | Ord_Sort | 0.09461 | +| Integer | 1048576 | Blocks | Sort | 0.10556 | +| Integer | 1048576 | Decreasing | Sort | 0.13348 | +| Integer | 1048576 | Identical | Sort | 0.15719 | +| Integer | 1048576 | Increasing | Sort | 0.05316 | +| Integer | 1048576 | Random dense | Sort | 0.15047 | +| Integer | 1048576 | Random order | Sort | 0.15176 | +| Integer | 1048576 | Random sparse | Sort | 0.15767 | +| Integer | 1048576 | Random 3 | Sort | 0.19907 | +| Integer | 1048576 | Random 10 | Sort | 0.34244 | +| Character | 456976 | Char. Decrease | Sort | 0.30723 | +| Character | 456976 | Char. Increase | Sort | 0.10984 | +| Character | 456976 | Char. Random | Sort | 0.20642 | +| String_type | 17576 | String Decrease | Sort | 0.15101 | +| String_type | 17576 | String Increase | Sort | 0.05569 | +| String_type | 17576 | String Random | Sort | 0.08499 | +| Integer | 1048576 | Blocks | Sort_Index | 0.01163 | +| Integer | 1048576 | Decreasing | Sort_Index | 0.00720 | +| Integer | 1048576 | Identical | Sort_Index | 0.00451 | +| Integer | 1048576 | Increasing | Sort_Index | 0.00452 | +| Integer | 1048576 | Random dense | Sort_Index | 0.20295 | +| Integer | 1048576 | Random order | Sort_Index | 0.20190 | +| Integer | 1048576 | Random sparse | Sort_Index | 0.20221 | +| Integer | 1048576 | Random 3 | Sort_Index | 0.01406 | +| Integer | 1048576 | Random 10 | Sort_Index | 0.00765 | +| Character | 456976 | Char. Decrease | Sort_Index | 0.00912 | +| Character | 456976 | Char. Increase | Sort_Index | 0.00515 | +| Character | 456976 | Char. Random | Sort_Index | 0.24693 | +| String_type | 17576 | String Decrease | Sort_Index | 0.00528 | +| String_type | 17576 | String Increase | Sort_Index | 0.00341 | +| String_type | 17576 | String Random | Sort_Index | 0.09554 | + +The second compiler was Intel(R) Fortran Intel(R) 64 Compiler Classic +for applications running on Intel(R) 64, Version 2021.2.0 Build +20210228_000000, with the following results: + +| Type | Elements | Array Name | Method | Time (s) | +|-------------|----------|-----------------|-------------|-----------| +| Integer | 1048576 | Blocks | Ord_Sort | 0.00320 | +| Integer | 1048576 | Decreasing | Ord_Sort | 0.00142 | +| Integer | 1048576 | Identical | Ord_Sort | 0.00102 | +| Integer | 1048576 | Increasing | Ord_Sort | 0.00158 | +| Integer | 1048576 | Random dense | Ord_Sort | 0.09859 | +| Integer | 1048576 | Random order | Ord_Sort | 0.09704 | +| Integer | 1048576 | Random sparse | Ord_Sort | 0.09599 | +| Integer | 1048576 | Random 3 | Ord_Sort | 0.00396 | +| Integer | 1048576 | Random 10 | Ord_Sort | 0.00183 | +| Character | 456976 | Char. Decrease | Ord_Sort | 0.00763 | +| Character | 456976 | Char. Increase | Ord_Sort | 0.00341 | +| Character | 456976 | Char. Random | Ord_Sort | 0.21991 | +| String_type | 17576 | String Decrease | Ord_Sort | 0.01957 | +| String_type | 17576 | String Increase | Ord_Sort | 0.00573 | +| String_type | 17576 | String Random | Ord_Sort | 0.37850 | +| Integer | 1048576 | Blocks | Sort | 0.03668 | +| Integer | 1048576 | Decreasing | Sort | 0.04073 | +| Integer | 1048576 | Identical | Sort | 0.03884 | +| Integer | 1048576 | Increasing | Sort | 0.01279 | +| Integer | 1048576 | Random dense | Sort | 0.06945 | +| Integer | 1048576 | Random order | Sort | 0.07151 | +| Integer | 1048576 | Random sparse | Sort | 0.07224 | +| Integer | 1048576 | Random 3 | Sort | 0.07954 | +| Integer | 1048576 | Random 10 | Sort | 0.14395 | +| Character | 456976 | Char. Decrease | Sort | 0.30367 | +| Character | 456976 | Char. Increase | Sort | 0.11316 | +| Character | 456976 | Char. Random | Sort | 0.20233 | +| String_type | 17576 | String Decrease | Sort | 0.64479 | +| String_type | 17576 | String Increase | Sort | 0.23737 | +| String_type | 17576 | String Random | Sort | 0.31361 | +| Integer | 1048576 | Blocks | Sort_Index | 0.00643 | +| Integer | 1048576 | Decreasing | Sort_Index | 0.00219 | +| Integer | 1048576 | Identical | Sort_Index | 0.00126 | +| Integer | 1048576 | Increasing | Sort_Index | 0.00130 | +| Integer | 1048576 | Random dense | Sort_Index | 0.12911 | +| Integer | 1048576 | Random order | Sort_Index | 0.13024 | +| Integer | 1048576 | Random sparse | Sort_Index | 0.12956 | +| Integer | 1048576 | Random 3 | Sort_Index | 0.00781 | +| Integer | 1048576 | Random 10 | Sort_Index | 0.00281 | +| Character | 456976 | Char. Decrease | Sort_Index | 0.00779 | +| Character | 456976 | Char. Increase | Sort_Index | 0.00393 | +| Character | 456976 | Char. Random | Sort_Index | 0.22561 | +| String_type | 17576 | String Decrease | Sort_Index | 0.01878 | +| String_type | 17576 | String Increase | Sort_Index | 0.00543 | +| String_type | 17576 | String Random | Sort_Index | 0.37748 | + + diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 96a6ebcce..92f06fc04 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -9,6 +9,10 @@ set(fppFiles stdlib_linalg.fypp stdlib_linalg_diag.fypp stdlib_optval.fypp + stdlib_sorting.fypp + stdlib_sorting_ord_sort.fypp + stdlib_sorting_sort.fypp + stdlib_sorting_sort_index.fypp stdlib_stats.fypp stdlib_stats_corr.fypp stdlib_stats_cov.fypp diff --git a/src/Makefile.manual b/src/Makefile.manual index 85541e038..fb176a235 100644 --- a/src/Makefile.manual +++ b/src/Makefile.manual @@ -9,6 +9,10 @@ SRCFYPP =\ stdlib_quadrature.fypp \ stdlib_quadrature_trapz.fypp \ stdlib_quadrature_simps.fypp \ + stdlib_sorting.fypp \ + stdlib_sorting_ord_sort.fypp \ + stdlib_sorting_sort.fypp \ + stdlib_sorting_sort_index.fypp \ stdlib_stats.fypp \ stdlib_stats_corr.fypp \ stdlib_stats_cov.fypp \ @@ -79,6 +83,15 @@ stdlib_quadrature_trapz.o: \ stdlib_quadrature.o \ stdlib_error.o \ stdlib_kinds.o +stdlib_sorting.o: \ + stdlib_kinds.o \ + stdlib_string_type.o +stdlib_sorting_ord_sort.o: \ + stdlib_sorting.o +stdlib_sorting_sort.o: \ + stdlib_sorting.o +stdlib_sorting_sort_index.o: \ + stdlib_sorting.o stdlib_stats.o: \ stdlib_kinds.o stdlib_stats_corr.o: \ diff --git a/src/stdlib_sorting.fypp b/src/stdlib_sorting.fypp new file mode 100644 index 000000000..8908ccfb7 --- /dev/null +++ b/src/stdlib_sorting.fypp @@ -0,0 +1,573 @@ +#:include "common.fypp" + +!! Licensing: +!! +!! This file is subjec† both to the Fortran Standard Library license, and +!! to additional licensing requirements as it contains translations of +!! other software. +!! +!! The Fortran Standard Library, including this file, is distributed under +!! the MIT license that should be included with the library's distribution. +!! +!! Copyright (c) 2021 Fortran stdlib developers +!! +!! Permission is hereby granted, free of charge, to any person obtaining a +!! copy of this software and associated documentation files (the +!! "Software"), to deal in the Software without restriction, including +!! without limitation the rights to use, copy, modify, merge, publish, +!! distribute, sublicense, and/or sellcopies of the Software, and to permit +!! persons to whom the Software is furnished to do so, subject to the +!! following conditions: +!! +!! The above copyright notice and this permission notice shall be included +!! in all copies or substantial portions of the Software. +!! +!! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +!! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +!! MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +!! IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +!! CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +!! TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +!! SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +!! +!! Two of the generic subroutines, `ORD_SORT` and `SORT_INDEX`, are +!! substantially translations to Fortran 2008 of the `"Rust" sort` sorting +!! routines in +!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs) +!! The `rust sort` implementation is distributed with the header: +!! +!! Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT +!! file at the top-level directory of this distribution and at +!! http://rust-lang.org/COPYRIGHT. +!! +!! Licensed under the Apache License, Version 2.0 or the MIT license +!! , at your +!! option. This file may not be copied, modified, or distributed +!! except according to those terms. +!! +!! so the license for the original`slice.rs` code is compatible with the use +!! of modified versions of the code in the Fortran Standard Library under +!! the MIT license. +!! +!! One of the generic subroutines, `SORT`, is substantially a +!! translation to Fortran 2008, of the `introsort` of David Musser. +!! David Musser has given permission to include a variant of `introsort` +!! in the Fortran Standard Library under the MIT license provided +!! we cite: +!! +!! Musser, D.R., “Introspective Sorting and Selection Algorithms,” +!! Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997). +!! +!! as the official source of the algorithm. + +module stdlib_sorting +!! This module implements overloaded sorting subroutines named `ORD_SORT`, +!! `SORT_INDEX`, and `SORT`, that each can be used to sort four kinds +!! of `INTEGER` arrays and three kinds of `REAL` arrays. By default, sorting +!! is in order of increasing value, though `SORT_INDEX` has the option of +!! sorting in order of decresasing value. All the subroutines have worst +!! case run time performance of `O(N Ln(N))`, but on largely sorted data +!! `ORD_SORT` and `SORT_INDEX` can have a run time performance of `O(N)`. +!! +!! `ORD_SORT` is a translation of the `"Rust" sort` sorting algorithm in +!! `slice.rs`: +!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs +!! which in turn is inspired by the `timsort` algorithm of Tim Peters, +!! http://svn.python.org/projects/python/trunk/Objects/listsort.txt. +!! `ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`, +!! and `insertion sort`. It is always at worst O(N Ln(N)) in sorting random +!! data, having a performance about 25% slower than `SORT` on such +!! data, but has much better performance than `SORT` on partially +!! sorted data, having O(N) performance on uniformly non-increasing or +!! non-decreasing data. +!! +!! `SORT_INDEX` is a modification of `ORD_SORT` so that in addition to +!! sorting the input array, it returns the indices that map to a +!! stable sort of the original array. These indices are +!! intended to be used to sort data that is correlated with the input +!! array, e.g., different arrays in a database, different columns of a +!! rank 2 array, different elements of a derived type. It is less +!! efficient than `ORD_SORT` at sorting a simple array. +!! +!! `SORT` uses the `INTROSORT` sorting algorithm of David Musser, +!! http://www.cs.rpi.edu/~musser/gp/introsort.ps. `introsort` is a hybrid +!! unstable comparison algorithm combining `quicksort`, `insertion sort`, and +!! `heap sort`. While this algorithm is always O(N Ln(N)) it is relatively +!! fast on randomly ordered data, but inconsistent in performance on partly +!! sorted data, sometimes having `merge sort` performance, sometimes having +!! better than `quicksort` performance. `UNORD_SOORT` is about 25% +!! more efficient than `ORD_SORT` at sorting purely random data, but af an +!! order of `Ln(N)` less efficient at sorting partially sorted data. + + use stdlib_kinds, only: & + int8, & + int16, & + int32, & + int64, & + sp, & + dp, & + qp + + use stdlib_string_type, only: string_type, assignment(=), operator(>), & + operator(>=), operator(<), operator(<=) + + implicit none + private + + public :: int_size + + integer, parameter :: int_size = int64 !! Integer kind for indexing + +! Constants for use by tim_sort + integer, parameter :: & +! The maximum number of entries in a run stack, good for an array of +! 2**64 elements see +! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + max_merge_stack = int( ceiling( log( 2._dp**64 ) / & + log(1.6180339887_dp) ) ) + + type run_type +!! Version: experimental +!! +!! Used to pass state around in a stack among helper functions for the +!! `ORD_SORT` and `SORT_INDEX` algorithms + integer(int_size) :: base = 0 + integer(int_size) :: len = 0 + end type run_type + + public ord_sort +!! Version: experimental +!! +!! The generic subroutine implementing the `ORD_SORT` algorithm to return +!! an input array with its elements sorted in order of non-decreasing +!! value. Its use has the syntax: +!! +!! call ord_sort( array[, work] ) +!! +!! with the arguments: +!! +!! * array: the rank 1 array to be sorted. It is an `intent(inout)` +!! argument of any of the types `integer(int8)`, `integer(int16)`, +!! `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`, +!! `real(real128)`, or `character(*)`. If both the type of `array` is +!! real and at least one of the elements is a `NaN`, then the ordering +!! of the result is undefined. Otherwise it is defined to be the +!! original elements in non-decreasing order. +!! +!! * work (optional): shall be a rank 1 array of the same type as +!! `array`, and shall have at least `size(array)/2` elements. It is an +!! `intent(inout)` argument to be used as "scratch" memory +!! for internal record keeping. If associated with an array in static +!! storage, its use can significantly reduce the stack memory requirements +!! for the code. Its value on return is undefined. +!! +!!#### Example +!! +!!```fortran +!! ... +!! ! Read arrays from sorted files +!! call read_sorted_file( 'dummy_file1', array1 ) +!! call read_sorted_file( 'dummy_file2', array2 ) +!! ! Concatenate the arrays +!! allocate( array( size(array1) + size(array2) ) ) +!! array( 1:size(array1) ) = array1(:) +!! array( size(array1)+1:size(array1)+size(array2) ) = array2(:) +!! ! Sort the resulting array +!! call ord_sort( array, work ) +!! ! Process the sorted array +!! call array_search( array, values ) +!! ... +!!``` + + public sort +!! Version: experimental +!! +!! The generic subroutine implementing the `SORT` algorithm to return +!! an input array with its elements sorted in order of non-decreasing +!! value. Its use has the syntax: +!! +!! call sort( array ) +!! +!! with the arguments: +!! +!! * array: the rank 1 array to be sorted. It is an `intent(inout)` +!! argument of any of the types `integer(int8)`, `integer(int16)`, +!! `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`, +!! `real(real128)`, or `character(*)`. If both the type of `array` is +!! real and at least one of the elements is a `NaN`, then the ordering +!! of the result is undefined. Otherwise it is defined to be the +!! original elements in non-decreasing order. +!! +!!#### Example +!! +!!```fortran +!! ... +!! ! Read random data from a file +!! call read_file( 'dummy_file', array ) +!! ! Sort the random data +!! call sort( array ) +!! ! Process the sorted data +!! call array_search( array, values ) +!! ... +!!``` + + public sort_index +!! Version: experimental +!! +!! The generic subroutine implementing the `SORT_INDEX` algorithm to +!! return an index array whose elements would sort the input array in the +!! desired direction. It is primarily intended to be used to sort a +!! derived type array based on the values of a component of the array. +!! Its use has the syntax: +!! +!! call sort_index( array, index[, work, iwork, reverse ] ) +!! +!! with the arguments: +!! +!! * array: the rank 1 array to be sorted. It is an `intent(inout)` +!! argument of any of the types `integer(int8)`, `integer(int16)`, +!! `integer(int32)`, `integer(int64)`, `real(real32)`, `real(real64)`, +!! `real(real128)`, or `character(*)`. If both the type of `array` is +!! real and at least one of the elements is a `NaN`, then the ordering +!! of the `array` and `index` results is undefined. Otherwise it is +!! defined to be as specified by reverse. +!! +!! * index: a rank 1 array of sorting indices. It is an `intent(inout)` +!! argument of the type `integer(int_size)`. Its size shall be the +!! same as `array`. On return, if defined, its elements would +!! sort the input `array` in the direction specified by `reverse`. +!! +!! * work (optional): shall be a rank 1 array of the same type as +!! `array`, and shall have at least `size(array)/2` elements. It is an +!! `intent(inout)` argument to be used as "scratch" memory +!! for internal record keeping. If associated with an array in static +!! storage, its use can significantly reduce the stack memory requirements +!! for the code. Its value on return is undefined. +!! +!! * iwork (optional): shall be a rank 1 integer array of kind `int_size`, +!! and shall have at least `size(array)/2` elements. It is an +!! `intent(inout)` argument to be used as "scratch" memory +!! for internal record keeping. If associated with an array in static +!! storage, its use can significantly reduce the stack memory requirements +!! for the code. Its value on return is undefined. +!! +!! * `reverse` (optional): shall be a scalar of type default logical. It +!! is an `intent(in)` argument. If present with a value of `.true.` then +!! `index` will sort `array` in order of non-increasing values in stable +!! order. Otherwise index will sort `array` in order of non-decreasing +!! values in stable order. +!! +!!#### Examples +!! +!! Sorting a related rank one array: +!! +!!```Fortran +!! subroutine sort_related_data( a, b, work, index, iwork ) +!! ! Sort `b` in terms or its related array `a` +!! integer, intent(inout) :: a(:) +!! integer(int32), intent(inout) :: b(:) ! The same size as a +!! integer(int32), intent(inout) :: work(:) +!! integer(int_size), intent(inout) :: index(:) +!! integer(int_size), intent(inout) :: iwork(:) +!! ! Find the indices to sort a +!! call sort_index(a, index(1:size(a)),& +!! work(1:size(a)/2), iwork(1:size(a)/2)) +!! ! Sort b based on the sorting of a +!! b(:) = b( index(1:size(a)) ) +!! end subroutine sort_related_data +!!``` +!! +!! Sorting a rank 2 array based on the data in a column +!! +!!```Fortran +!! subroutine sort_related_data( array, column, work, index, iwork ) +!! ! Sort `a_data` in terms or its component `a` +!! integer, intent(inout) :: a(:,:) +!! integer(int32), intent(in) :: column +!! integer(int32), intent(inout) :: work(:) +!! integer(int_size), intent(inout) :: index(:) +!! integer(int_size), intent(inout) :: iwork(:) +!! integer, allocatable :: dummy(:) +!! integer :: i +!! allocate(dummy(size(a, dim=1))) +!! ! Extract a component of `a_data` +!! dummy(:) = a(:, column) +!! ! Find the indices to sort the column +!! call sort_index(dummy, index(1:size(dummy)),& +!! work(1:size(dummy)/2), iwork(1:size(dummy)/2)) +!! ! Sort a based on the sorting of its column +!! do i=1, size(a, dim=2) +!! a(:, i) = a(index(1:size(a, dim=1)), i) +!! end do +!! end subroutine sort_related_data +!!``` +!! +!! Sorting an array of a derived type based on the dsta in one component +!!```fortran +!! subroutine sort_a_data( a_data, a, work, index, iwork ) +!! ! Sort `a_data` in terms or its component `a` +!! type(a_type), intent(inout) :: a_data(:) +!! integer(int32), intent(inout) :: a(:) +!! integer(int32), intent(inout) :: work(:) +!! integer(int_size), intent(inout) :: index(:) +!! integer(int_size), intent(inout) :: iwork(:) +!! ! Extract a component of `a_data` +!! a(1:size(a_data)) = a_data(:) % a +!! ! Find the indices to sort the component +!! call sort_index(a(1:size(a_data)), index(1:size(a_data)),& +!! work(1:size(a_data)/2), iwork(1:size(a_data)/2)) +!! ! Sort a_data based on the sorting of that component +!! a_data(:) = a_data( index(1:size(a_data)) ) +!! end subroutine sort_a_data +!!``` + +#:for k1 in INT_KINDS + public ${k1}$_ord_sort +!! Version: experimental +!! +!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array` +!! argument of type `integer(${k1}$)`. + + public ${k1}$_sort +!! Version: experimental +!! +!! `${k1}$_sort` is the specific `SORT` subroutine for an `array` +!! argument of type `integer(${k1}$)`. + + public ${k1}$_sort_index +!! Version: experimental +!! +!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array` +!! argument of type `integer(${k1}$)`. + +#:endfor + +#:for k1 in REAL_KINDS + public ${k1}$_ord_sort +!! Version: experimental +!! +!! `${k1}$_ord_sort` is the specific `ORD_SORT` subroutine for an `array` +!! argument of type `real(${k1}$)`. + + public ${k1}$_sort +!! Version: experimental +!! +!! `${k1}$_sort` is the specific `SORT` subroutine for an `array` +!! argument of type `real(${k1}$)`. + + public ${k1}$_sort_index +!! Version: experimental +!! +!! `${k1}$_sort_index` is the specific `SORT_INDEX` subroutine for an `array` +!! argument of type `real(${k1}$)`. + +#:endfor + + public char_ord_sort + + public char_sort + + public char_sort_index + + public string_ord_sort + + public string_sort + + public string_sort_index + + interface ord_sort +!! Version: experimental +!! +!! The generic subroutine interface implementing the `ORD_SORT` algorithm, +!! a translation to Fortran 2008, of the `"Rust" sort` algorithm found in +!! `slice.rs` +!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +!! `ORD_SORT` is a hybrid stable comparison algorithm combining `merge sort`, +!! and `insertion sort`. It is always at worst O(N Ln(N)) in sorting random +!! data, having a performance about 25% slower than `SORT` on such +!! data, but has much better performance than `SORT` on partially +!! sorted data, having O(N) performance on uniformly non-increasing or +!! non-decreasing data. + +#:for k1 in INT_KINDS + module subroutine ${k1}$_ord_sort( array, work ) +!! Version: experimental +!! +!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` + integer(${k1}$), intent(inout) :: array(0:) + integer(${k1}$), intent(inout), optional :: work(0:) + end subroutine ${k1}$_ord_sort + +#:endfor + +#:for k1 in REAL_KINDS + module subroutine ${k1}$_ord_sort( array, work ) + !! Version: experimental +!! +!! `${k1}$_ord_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` + real(${k1}$), intent(inout) :: array(0:) + real(${k1}$), intent(inout), optional :: work(0:) + end subroutine ${k1}$_ord_sort + +#:endfor + module subroutine char_ord_sort( array, work ) +!! Version: experimental +!! +!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` + character(len=*), intent(inout) :: array(0:) + character(len=len(array)), intent(inout), optional :: work(0:) + end subroutine char_ord_sort + + module subroutine string_ord_sort( array, work ) +!! Version: experimental +!! +!! `char_ord_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` + type(string_type), intent(inout) :: array(0:) + type(string_type), intent(inout), optional :: work(0:) + end subroutine string_ord_sort + + end interface ord_sort + + interface sort +!! Version: experimental +!! +!! The generic subroutine interface implementing the `SORT` algorithm, based +!! on the `introsort` of David Musser. + +#:for k1 in INT_KINDS + pure module subroutine ${k1}$_sort( array ) +!! Version: experimental +!! +!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)` +!! using a hybrid sort based on the `introsort` of David Musser. +!! The algorithm is of order O(N Ln(N)) for all inputs. +!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +!! behavior is small for random data compared to other sorting algorithms. + integer(${k1}$), intent(inout) :: array(0:) + end subroutine ${k1}$_sort + +#:endfor + +#:for k1 in REAL_KINDS + pure module subroutine ${k1}$_sort( array ) +!! Version: experimental +!! +!! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)` +!! using a hybrid sort based on the `introsort` of David Musser. +!! The algorithm is of order O(N Ln(N)) for all inputs. +!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +!! behavior is small for random data compared to other sorting algorithms. + real(${k1}$), intent(inout) :: array(0:) + end subroutine ${k1}$_sort + +#:endfor + + pure module subroutine char_sort( array ) +!! Version: experimental +!! +!! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)` +!! using a hybrid sort based on the `introsort` of David Musser. +!! The algorithm is of order O(N Ln(N)) for all inputs. +!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +!! behavior is small for random data compared to other sorting algorithms. + character(len=*), intent(inout) :: array(0:) + end subroutine char_sort + + pure module subroutine string_sort( array ) +!! Version: experimental +!! +!! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TYPE` +!! using a hybrid sort based on the `introsort` of David Musser. +!! The algorithm is of order O(N Ln(N)) for all inputs. +!! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +!! behavior is small for random data compared to other sorting algorithms. + type(string_type), intent(inout) :: array(0:) + end subroutine string_sort + + end interface sort + + interface sort_index +!! Version: experimental +!! +!! The generic subroutine interface implementing the `SORT_INDEX` algorithm, +!! based on the `"Rust" sort` algorithm found in `slice.rs` +!! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +!! but modified to return an array of indices that would provide a stable +!! sort of the rank one `ARRAY` input. The indices by default correspond to a +!! non-decreasing sort, but if the optional argument `REVERSE` is present +!! with a value of `.TRUE.` the indices correspond to a non-increasing sort. + +#:for k1 in INT_KINDS + module subroutine ${k1}$_sort_index( array, index, work, iwork, & + reverse ) +!! Version: experimental +!! +!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `INTEGER(${k1}$)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` +!! and returns the sorted `ARRAY` and an array `INDEX of indices in the +!! order that would sort the input `ARRAY` in the desired direction. + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + integer(${k1}$), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + end subroutine ${k1}$_sort_index + +#:endfor + +#:for k1 in REAL_KINDS + module subroutine ${k1}$_sort_index( array, index, work, iwork, & + reverse ) +!! Version: experimental +!! +!! `${k1}$_sort_index( array )` sorts an input `ARRAY` of type `REAL(${k1}$)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` +!! and returns the sorted `ARRAY` and an array `INDEX of indices in the +!! order that would sort the input `ARRAY` in the desired direction. + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + real(${k1}$), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + end subroutine ${k1}$_sort_index + +#:endfor + module subroutine char_sort_index( array, index, work, iwork, & + reverse ) +!! Version: experimental +!! +!! `char_sort_index( array )` sorts an input `ARRAY` of type `CHARACTER(*)` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` +!! and returns the sorted `ARRAY` and an array `INDEX of indices in the +!! order that would sort the input `ARRAY` in the desired direction. + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + character(len=len(array)), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + end subroutine char_sort_index + + module subroutine string_sort_index( array, index, work, iwork, & + reverse ) +!! Version: experimental +!! +!! `string_sort_index( array )` sorts an input `ARRAY` of type `STRING_TYPE` +!! using a hybrid sort based on the `'Rust" sort` algorithm found in `slice.rs` +!! and returns the sorted `ARRAY` and an array `INDEX of indices in the +!! order that would sort the input `ARRAY` in the desired direction. + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + type(string_type), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + end subroutine string_sort_index + + end interface sort_index + + +end module stdlib_sorting diff --git a/src/stdlib_sorting_ord_sort.fypp b/src/stdlib_sorting_ord_sort.fypp new file mode 100644 index 000000000..eb0e58480 --- /dev/null +++ b/src/stdlib_sorting_ord_sort.fypp @@ -0,0 +1,1419 @@ +#! Integer kinds to be considered during templating +#:set INT_KINDS = ["int8", "int16", "int32", "int64"] + +#! Real kinds to be considered during templating +#:set REAL_KINDS = ["sp", "dp", "qp"] + +!! Licensing: +!! +!! This file is subjec† both to the Fortran Standard Library license, and +!! to additional licensing requirements as it contains translations of +!! other software. +!! +!! The Fortran Standard Library, including this file, is distributed under +!! the MIT license that should be included with the library's distribution. +!! +!! Copyright (c) 2021 Fortran stdlib developers +!! +!! Permission is hereby granted, free of charge, to any person obtaining a +!! copy of this software and associated documentation files (the +!! "Software"), to deal in the Software without restriction, including +!! without limitation the rights to use, copy, modify, merge, publish, +!! distribute, sublicense, and/or sellcopies of the Software, and to permit +!! persons to whom the Software is furnished to do so, subject to the +!! following conditions: +!! +!! The above copyright notice and this permission notice shall be included +!! in all copies or substantial portions of the Software. +!! +!! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +!! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +!! MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +!! IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +!! CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +!! TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +!! SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +!! +!! The generic subroutine, `ORD_SORT`, is substantially a translation to +!! Fortran 2008 of the `"Rust" sort` sorting routines in +!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs) +!! The `rust sort` implementation is distributed with the header: +!! +!! Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT +!! file at the top-level directory of this distribution and at +!! http://rust-lang.org/COPYRIGHT. +!! +!! Licensed under the Apache License, Version 2.0 or the MIT license +!! , at your +!! option. This file may not be copied, modified, or distributed +!! except according to those terms. +!! +!! so the license for the original`slice.rs` code is compatible with the use +!! of modified versions of the code in the Fortran Standard Library under +!! the MIT license. + +submodule(stdlib_sorting) stdlib_sorting_ord_sort + + implicit none + +contains + +#:for k1 in INT_KINDS + + module subroutine ${k1}$_ord_sort( array, work ) +! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in +! `slice.rs` +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version in turn is a simplification of the Timsort algorithm +! described in +! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size and +! initially processed by an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original `listsort.txt`, and an optional `work` array to be used as +! scratch memory. + integer(${k1}$), intent(inout) :: array(0:) + integer(${k1}$), intent(inout), optional :: work(0:) + + integer(${k1}$), allocatable :: buf(:) + integer(int_size) :: array_size + integer :: stat + + if ( present(work) ) then +! Use the work array as scratch memory + call merge_sort( array, work ) + else +! Allocate a buffer to use as scratch memory. + array_size = size( array, kind=int_size ) + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of buffer failed." + call merge_sort( array, buf ) + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array ) +! Sorts `ARRAY` using an insertion sort. + integer(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, j + integer(${k1}$) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. + + integer(${k1}$), intent(inout) :: array(0:) + + integer(${k1}$) :: tmp + integer(int_size) :: i + + tmp = array(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + end do find_hole + array(i-1) = tmp + + end subroutine insert_head + + + subroutine merge_sort( array, buf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. + + integer(${k1}$), intent(inout) :: array(0:) + integer(${k1}$), intent(inout) :: buf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least +! min_run elements. Slices of up to this length are sorted using insertion +! sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array ) + return + end if + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + integer(${k1}$), intent(inout) :: buf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter ! check that it is stable + buf(0:array_len-mid-1) = array(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array ) +! Reverse a segment of an array in place + integer(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: lo, hi + integer(${k1}$) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine ${k1}$_ord_sort + +#:endfor + + +#:for k1 in REAL_KINDS + + module subroutine ${k1}$_ord_sort( array, work ) +! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in +! `slice.rs` +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version in turn is a simplification of the Timsort algorithm +! described in +! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size and +! initially processed by an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original `listsort.txt`, and an optional `work` array to be used as +! scratch memory. + real(${k1}$), intent(inout) :: array(0:) + real(${k1}$), intent(inout), optional :: work(0:) + + real(${k1}$), allocatable :: buf(:) + integer(int_size) :: array_size + integer :: stat + + if ( present(work) ) then +! Use the work array as scratch memory + call merge_sort( array, work ) + else +! Allocate a buffer to use as scratch memory. + array_size = size( array, kind=int_size ) + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of buffer failed." + call merge_sort( array, buf ) + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array ) +! Sorts `ARRAY` using an insertion sort. + real(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, j + real(${k1}$) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. + + real(${k1}$), intent(inout) :: array(0:) + + real(${k1}$) :: tmp + integer(int_size) :: i + + tmp = array(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + end do find_hole + array(i-1) = tmp + + end subroutine insert_head + + + subroutine merge_sort( array, buf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. + + real(${k1}$), intent(inout) :: array(0:) + real(${k1}$), intent(inout) :: buf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least +! min_run elements. Slices of up to this length are sorted using insertion +! sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array ) + return + end if + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + real(${k1}$), intent(inout) :: buf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter ! check that it is stable + buf(0:array_len-mid-1) = array(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array ) +! Reverse a segment of an array in place + real(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: lo, hi + real(${k1}$) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine ${k1}$_ord_sort + +#:endfor + + + module subroutine char_ord_sort( array, work ) +! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in +! `slice.rs` +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version in turn is a simplification of the Timsort algorithm +! described in +! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size and +! initially processed by an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original `listsort.txt`, and an optional `work` array to be used as +! scratch memory. + character(len=*), intent(inout) :: array(0:) + character(len=len(array)), intent(inout), optional :: work(0:) + + character(len=:), allocatable :: buf(:) + integer(int_size) :: array_size + integer :: stat + + if ( present(work) ) then +! Use the work array as scratch memory + call merge_sort( array, work ) + else +! Allocate a buffer to use as scratch memory. + array_size = size( array, kind=int_size ) + allocate( character(len=len(array)) :: buf(0:array_size/2-1), & + stat=stat ) + if ( stat /= 0 ) error stop "Allocation of buffer failed." + call merge_sort( array, buf ) + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array ) +! Sorts `ARRAY` using an insertion sort. + character(len=*), intent(inout) :: array(0:) + + integer(int_size) :: i, j + character(len=len(array)) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. + + character(len=*), intent(inout) :: array(0:) + + character(len=len(array)) :: tmp + integer(int_size) :: i + + tmp = array(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + end do find_hole + array(i-1) = tmp + + end subroutine insert_head + + + subroutine merge_sort( array, buf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. + + character(len=*), intent(inout) :: array(0:) + character(len=len(array)), intent(inout) :: buf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least +! min_run elements. Slices of up to this length are sorted using insertion +! sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array ) + return + end if + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + character(len=len(array)), intent(inout) :: buf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter ! check that it is stable + buf(0:array_len-mid-1) = array(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array ) +! Reverse a segment of an array in place + character(len=*), intent(inout) :: array(0:) + + integer(int_size) :: lo, hi + character(len=len(array)) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine char_ord_sort + + + module subroutine string_ord_sort( array, work ) +! A translation to Fortran 2008, of the `"Rust" sort` algorithm found in +! `slice.rs` +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version in turn is a simplification of the Timsort algorithm +! described in +! https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size and +! initially processed by an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original `listsort.txt`, and an optional `work` array to be used as +! scratch memory. + type(string_type), intent(inout) :: array(0:) + type(string_type), intent(inout), optional :: work(0:) + + type(string_type), allocatable :: buf(:) + integer(int_size) :: array_size + integer :: stat + + if ( present(work) ) then +! Use the work array as scratch memory + call merge_sort( array, work ) + else +! Allocate a buffer to use as scratch memory. + array_size = size( array, kind=int_size ) + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of buffer failed." + call merge_sort( array, buf ) + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array ) +! Sorts `ARRAY` using an insertion sort. + type(string_type), intent(inout) :: array(0:) + + integer(int_size) :: i, j + type(string_type) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. + + type(string_type), intent(inout) :: array(0:) + + type(string_type) :: tmp + integer(int_size) :: i + + tmp = array(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + end do find_hole + array(i-1) = tmp + + end subroutine insert_head + + + subroutine merge_sort( array, buf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. + + type(string_type), intent(inout) :: array(0:) + type(string_type), intent(inout) :: buf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least +! min_run elements. Slices of up to this length are sorted using insertion +! sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array ) + return + end if + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + type(string_type), intent(inout) :: buf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter ! check that it is stable + buf(0:array_len-mid-1) = array(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array ) +! Reverse a segment of an array in place + type(string_type), intent(inout) :: array(0:) + + integer(int_size) :: lo, hi + type(string_type) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine string_ord_sort + +end submodule stdlib_sorting_ord_sort + diff --git a/src/stdlib_sorting_sort.fypp b/src/stdlib_sorting_sort.fypp new file mode 100644 index 000000000..d0b803bf9 --- /dev/null +++ b/src/stdlib_sorting_sort.fypp @@ -0,0 +1,740 @@ +#! Integer kinds to be considered during templating +#:set INT_KINDS = ["int8", "int16", "int32", "int64"] + +#! Real kinds to be considered during templating +#:set REAL_KINDS = ["sp", "dp", "qp"] + +!! Licensing: +!! +!! This file is subjec† both to the Fortran Standard Library license, and +!! to additional licensing requirements as it contains translations of +!! other software. +!! +!! The Fortran Standard Library, including this file, is distributed under +!! the MIT license that should be included with the library's distribution. +!! +!! Copyright (c) 2021 Fortran stdlib developers +!! +!! Permission is hereby granted, free of charge, to any person obtaining a +!! copy of this software and associated documentation files (the +!! "Software"), to deal in the Software without restriction, including +!! without limitation the rights to use, copy, modify, merge, publish, +!! distribute, sublicense, and/or sellcopies of the Software, and to permit +!! persons to whom the Software is furnished to do so, subject to the +!! following conditions: +!! +!! The above copyright notice and this permission notice shall be included +!! in all copies or substantial portions of the Software. +!! +!! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +!! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +!! MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +!! IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +!! CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +!! TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +!! SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +!! +!! The generic subroutine, `SORT`, is substantially a +!! translation to Fortran 2008, of the `introsort` of David Musser. +!! David Musser has given permission to include a variant of `introsort` +!! in the Fortran Standard Library under the MIT license provided +!! we cite: +!! +!! Musser, D.R., “Introspective Sorting and Selection Algorithms,” +!! Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997). +!! +!! as the official source of the algorithm. + +submodule(stdlib_sorting) stdlib_sorting_sort +!! This submodule implements the overloaded sorting subroutine `SORT` +!! that can be used to sort four kinds of `INTEGER` arrays and three kinds +!! of `REAL` arrays. Sorting is in order of increasing value, with the worst +!! case run time performance of `O(N Ln(N))`. +!! +!! `SORT` uses the `INTROSORT` sorting algorithm of David Musser, +!! http://www.cs.rpi.edu/~musser/gp/introsort.ps. `introsort` is a hybrid +!! unstable comparison algorithm combining `quicksort`, `insertion sort`, and +!! `heap sort`. While this algorithm is always O(N Ln(N)) it is relatively +!! fast on randomly ordered data, but inconsistent in performance on partly +!! sorted data, sometimes having `merge sort` performance, sometimes having +!! better than `quicksort` performance. + + implicit none + +contains + + +#:for k1 in INT_KINDS + + pure module subroutine ${k1}$_sort( array ) +! `${k1}$_sort( array )` sorts the input `ARRAY` of type `INTEGER(${k1}$)` +! using a hybrid sort based on the `introsort` of David Musser. As with +! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison +! algorithm using `quicksort` for the main body of the sort tree, +! supplemented by `insertion sort` for the outer brances, but if +! `quicksort` is converging too slowly the algorithm resorts +! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs. +! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +! behavior is typically small compared to other sorting algorithms. + + integer(${k1}$), intent(inout) :: array(0:) + + integer(int32) :: depth_limit + + depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ), & + kind=dp) ) / log(2.0_dp) ), & + kind=int32 ) + call introsort(array, depth_limit) + + contains + + pure recursive subroutine introsort( array, depth_limit ) +! It devolves to `insertionsort` if the remaining number of elements +! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion +! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`, +! otherwise sorting is done by a `quicksort`. + integer(${k1}$), intent(inout) :: array(0:) + integer(int32), intent(in) :: depth_limit + + integer(int_size), parameter :: insert_size = 16_int_size + integer(int_size) :: index + + if ( size(array, kind=int_size) <= insert_size ) then + ! May be best at the end of SORT processing the whole array + ! See Musser, D.R., “Introspective Sorting and Selection + ! Algorithms,” Software—Practice and Experience, Vol. 27(8), + ! 983–993 (August 1997). + + call insertion_sort( array ) + else if ( depth_limit == 0 ) then + call heap_sort( array ) + else + call partition( array, index ) + call introsort( array(0:index-1), depth_limit-1 ) + call introsort( array(index+1:), depth_limit-1 ) + end if + + end subroutine introsort + + + pure subroutine partition( array, index ) +! quicksort partition using median of three. + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(out) :: index + + integer(${k1}$) :: u, v, w, x, y + integer(int_size) :: i, j + +! Determine median of three and exchange it with the end. + u = array( 0 ) + v = array( size(array, kind=int_size)/2-1 ) + w = array( size(array, kind=int_size)-1 ) + if ( (u > v) .neqv. (u > w) ) then + x = u + y = array(0) + array(0) = array( size( array, kind=int_size ) - 1 ) + array( size( array, kind=int_size ) - 1 ) = y + else if ( (v < u) .neqv. (v < w) ) then + x = v + y = array(size( array, kind=int_size )/2-1) + array( size( array, kind=int_size )/2-1 ) = & + array( size( array, kind=int_size )-1 ) + array( size( array, kind=int_size )-1 ) = y + else + x = w + end if +! Partition the array. + i = -1_int_size + do j = 0_int_size, size(array, kind=int_size)-2 + if ( array(j) <= x ) then + i = i + 1 + y = array(i) + array(i) = array(j) + array(j) = y + end if + end do + y = array(i+1) + array(i+1) = array(size(array, kind=int_size)-1) + array(size(array, kind=int_size)-1) = y + index = i + 1 + + end subroutine partition + + pure subroutine insertion_sort( array ) +! Bog standard insertion sort. + integer(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, j + integer(${k1}$) :: key + + do j=1_int_size, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 ) + if ( array(i) <= key ) exit + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + pure subroutine heap_sort( array ) +! A bog standard heap sort + integer(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, heap_size + integer(${k1}$) :: y + + heap_size = size( array, kind=int_size ) +! Build the max heap + do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size + call max_heapify( array, i, heap_size ) + end do + do i = heap_size-1, 1_int_size, -1_int_size +! Swap the first element with the current final element + y = array(0) + array(0) = array(i) + array(i) = y +! Sift down using max_heapify + call max_heapify( array, 0_int_size, i ) + end do + + end subroutine heap_sort + + pure recursive subroutine max_heapify( array, i, heap_size ) +! Transform the array into a max heap + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: i, heap_size + + integer(int_size) :: l, r, largest + integer(${k1}$) :: y + + largest = i + l = 2_int_size * i + 1_int_size + r = l + 1_int_size + if ( l < heap_size ) then + if ( array(l) > array(largest) ) largest = l + end if + if ( r < heap_size ) then + if ( array(r) > array(largest) ) largest = r + end if + if ( largest /= i ) then + y = array(i) + array(i) = array(largest) + array(largest) = y + call max_heapify( array, largest, heap_size ) + end if + + end subroutine max_heapify + + end subroutine ${k1}$_sort + +#:endfor + + +#:for k1 in REAL_KINDS + + pure module subroutine ${k1}$_sort( array ) + +! `${k1}$_sort( array )` sorts the input `ARRAY` of type `REAL(${k1}$)` +! using a hybrid sort based on the `introsort` of David Musser. As with +! `introsort`, `${k1}$_sort( array )` is an unstable hybrid comparison +! algorithm using `quicksort` for the main body of the sort tree, +! supplemented by `insertion sort` for the outer brances, but if +! `quicksort` is converging too slowly the algorithm resorts +! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs. +! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +! behavior is typically small compared to other sorting algorithms. + + real(${k1}$), intent(inout) :: array(0:) + integer(int32) :: depth_limit + + depth_limit = & + 2 * int( floor( log( real( size( array, kind=int_size ), & + kind=dp) ) / log(2.0_dp) ), & + kind=int32 ) + call introsort(array, depth_limit) + + contains + + pure recursive subroutine introsort( array, depth_limit ) +! It devolves to `insertionsort` if the remaining number of elements +! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion of +! the quicksort is too slow as estimated from `DEPTH_LIMIT`, otherwise +! sorting is done by a `quicksort`. + real(${k1}$), intent(inout) :: array(0:) + integer(int32), intent(in) :: depth_limit + + integer(int_size), parameter :: insert_size = 16_int_size + integer(int_size) :: index + + if ( size(array, kind=int_size) <= insert_size ) then + ! May be best at the end of SORT processing the whole array + ! See Musser, D.R., “Introspective Sorting and Selection + ! Algorithms,” Software—Practice and Experience, Vol. 27(8), + ! 983–993 (August 1997). + + call insertion_sort( array ) + + else if ( depth_limit == 0 ) then + call heap_sort( array ) + + else + call partition( array, index ) + call introsort( array(0:index-1), depth_limit-1 ) + call introsort( array(index+1:), depth_limit-1 ) + end if + + end subroutine introsort + + + pure subroutine partition( array, index ) +! quicksort partition using median of three. + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(out) :: index + + real(${k1}$) :: u, v, w, x, y + integer(int_size) :: i, j + +! Determine median of three and exchange it with the end. + u = array( 0 ) + v = array( size(array, kind=int_size)/2-1 ) + w = array( size(array, kind=int_size)-1 ) + if ( (u > v) .neqv. (u > w) ) then + x = u + y = array(0) + array(0) = array( size( array, kind=int_size ) - 1 ) + array( size( array, kind=int_size ) - 1 ) = y + else if ( (v < u) .neqv. (v < w) ) then + x = v + y = array(size( array, kind=int_size )/2-1) + array(size( array, kind=int_size )/2-1) = & + array(size( array, kind=int_size )-1) + array( size( array, kind=int_size )-1 ) = y + else + x = w + end if +! Partition the array + i = -1_int_size + do j = 0_int_size, size(array, kind=int_size)-2 + if ( array(j) <= x ) then + i = i + 1 + y = array(i) + array(i) = array(j) + array(j) = y + end if + end do + y = array(i+1) + array(i+1) = array(size(array, kind=int_size)-1) + array(size(array, kind=int_size)-1) = y + index = i + 1 + + end subroutine partition + + pure subroutine insertion_sort( array ) +! Bog standard insertion sort. + real(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, j + real(${k1}$) :: key + + do j=1_int_size, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 ) + if ( array(i) <= key ) exit + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + pure subroutine heap_sort( array ) +! A bog standard heap sort + real(${k1}$), intent(inout) :: array(0:) + + integer(int_size) :: i, heap_size + real(${k1}$) :: y + + heap_size = size( array, kind=int_size ) +! Build the max heap + do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size + call max_heapify( array, i, heap_size ) + end do + do i = heap_size-1, 1_int_size, -1_int_size +! Swap the first element with the current final element + y = array(0) + array(0) = array(i) + array(i) = y +! Sift down using max_heapify + call max_heapify( array, 0_int_size, i ) + end do + + end subroutine heap_sort + + pure recursive subroutine max_heapify( array, i, heap_size ) +! Transform the array into a max heap + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: i, heap_size + + integer(int_size) :: l, r, largest + real(${k1}$) :: y + + largest = i + l = 2_int_size * i + 1_int_size + r = l + 1_int_size + if ( l < heap_size ) then + if ( array(l) > array(largest) ) largest = l + end if + if ( r < heap_size ) then + if ( array(r) > array(largest) ) largest = r + end if + if ( largest /= i ) then + y = array(i) + array(i) = array(largest) + array(largest) = y + call max_heapify( array, largest, heap_size ) + end if + + end subroutine max_heapify + + end subroutine ${k1}$_sort + +#:endfor + + + pure module subroutine char_sort( array ) +! `char_sort( array )` sorts the input `ARRAY` of type `CHARACTER(*)` +! using a hybrid sort based on the `introsort` of David Musser. As with +! `introsort`, `char_sort( array )` is an unstable hybrid comparison +! algorithm using `quicksort` for the main body of the sort tree, +! supplemented by `insertion sort` for the outer brances, but if +! `quicksort` is converging too slowly the algorithm resorts +! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs. +! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +! behavior is typically small compared to other sorting algorithms. + + character(len=*), intent(inout) :: array(0:) + + integer(int32) :: depth_limit + + depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ), & + kind=dp) ) / log(2.0_dp) ), & + kind=int32 ) + call introsort(array, depth_limit) + + contains + + pure recursive subroutine introsort( array, depth_limit ) +! It devolves to `insertionsort` if the remaining number of elements +! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion +! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`, +! otherwise sorting is done by a `quicksort`. + character(len=*), intent(inout) :: array(0:) + integer(int32), intent(in) :: depth_limit + + integer(int_size), parameter :: insert_size = 16_int_size + integer(int_size) :: index + + if ( size(array, kind=int_size) <= insert_size ) then + ! May be best at the end of SORT processing the whole array + ! See Musser, D.R., “Introspective Sorting and Selection + ! Algorithms,” Software—Practice and Experience, Vol. 27(8), + ! 983–993 (August 1997). + + call insertion_sort( array ) + else if ( depth_limit == 0 ) then + call heap_sort( array ) + else + call partition( array, index ) + call introsort( array(0:index-1), depth_limit-1 ) + call introsort( array(index+1:), depth_limit-1 ) + end if + + end subroutine introsort + + + pure subroutine partition( array, index ) +! quicksort partition using median of three. + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(out) :: index + + integer(int_size) :: i, j + character(len=len(array)) :: u, v, w, x, y + +! Determine median of three and exchange it with the end. + u = array( 0 ) + v = array( size(array, kind=int_size)/2-1 ) + w = array( size(array, kind=int_size)-1 ) + if ( (u > v) .neqv. (u > w) ) then + x = u + y = array(0) + array(0) = array( size( array, kind=int_size ) - 1 ) + array( size( array, kind=int_size ) - 1 ) = y + else if ( (v < u) .neqv. (v < w) ) then + x = v + y = array(size( array, kind=int_size )/2-1) + array( size( array, kind=int_size )/2-1 ) = & + array( size( array, kind=int_size )-1 ) + array( size( array, kind=int_size )-1 ) = y + else + x = w + end if +! Partition the array. + i = -1_int_size + do j = 0_int_size, size(array, kind=int_size)-2 + if ( array(j) <= x ) then + i = i + 1 + y = array(i) + array(i) = array(j) + array(j) = y + end if + end do + y = array(i+1) + array(i+1) = array(size(array, kind=int_size)-1) + array(size(array, kind=int_size)-1) = y + index = i + 1 + + end subroutine partition + + pure subroutine insertion_sort( array ) +! Bog standard insertion sort. + character(len=*), intent(inout) :: array(0:) + + integer(int_size) :: i, j + character(len=len(array)) :: key + + do j=1_int_size, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 ) + if ( array(i) <= key ) exit + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + pure subroutine heap_sort( array ) +! A bog standard heap sort + character(len=*), intent(inout) :: array(0:) + + integer(int_size) :: i, heap_size + character(len=len(array)) :: y + + heap_size = size( array, kind=int_size ) +! Build the max heap + do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size + call max_heapify( array, i, heap_size ) + end do + do i = heap_size-1, 1_int_size, -1_int_size +! Swap the first element with the current final element + y = array(0) + array(0) = array(i) + array(i) = y +! Sift down using max_heapify + call max_heapify( array, 0_int_size, i ) + end do + + end subroutine heap_sort + + pure recursive subroutine max_heapify( array, i, heap_size ) +! Transform the array into a max heap + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(in) :: i, heap_size + + integer(int_size) :: l, r, largest + character(len=len(array)) :: y + + largest = i + l = 2_int_size * i + 1_int_size + r = l + 1_int_size + if ( l < heap_size ) then + if ( array(l) > array(largest) ) largest = l + end if + if ( r < heap_size ) then + if ( array(r) > array(largest) ) largest = r + end if + if ( largest /= i ) then + y = array(i) + array(i) = array(largest) + array(largest) = y + call max_heapify( array, largest, heap_size ) + end if + + end subroutine max_heapify + + end subroutine char_sort + + pure module subroutine string_sort( array ) +! `string_sort( array )` sorts the input `ARRAY` of type `STRING_TyPE` +! using a hybrid sort based on the `introsort` of David Musser. As with +! `introsort`, `string_sort( array )` is an unstable hybrid comparison +! algorithm using `quicksort` for the main body of the sort tree, +! supplemented by `insertion sort` for the outer brances, but if +! `quicksort` is converging too slowly the algorithm resorts +! to `heapsort`. The algorithm is of order O(N Ln(N)) for all inputs. +! Because it relies on `quicksort`, the coefficient of the O(N Ln(N)) +! behavior is typically small compared to other sorting algorithms. + + type(string_type), intent(inout) :: array(0:) + + integer(int32) :: depth_limit + + depth_limit = 2 * int( floor( log( real( size( array, kind=int64 ), & + kind=dp) ) / log(2.0_dp) ), & + kind=int32 ) + call introsort(array, depth_limit) + + contains + + pure recursive subroutine introsort( array, depth_limit ) +! It devolves to `insertionsort` if the remaining number of elements +! is fewer than or equal to `INSERT_SIZE`, `heapsort` if the completion +! of the `quicksort` is too slow as estimated from `DEPTH_LIMIT`, +! otherwise sorting is done by a `quicksort`. + type(string_type), intent(inout) :: array(0:) + integer(int32), intent(in) :: depth_limit + + integer(int_size), parameter :: insert_size = 16_int_size + integer(int_size) :: index + + if ( size(array, kind=int_size) <= insert_size ) then + ! May be best at the end of SORT processing the whole array + ! See Musser, D.R., “Introspective Sorting and Selection + ! Algorithms,” Software—Practice and Experience, Vol. 27(8), + ! 983–993 (August 1997). + + call insertion_sort( array ) + else if ( depth_limit == 0 ) then + call heap_sort( array ) + else + call partition( array, index ) + call introsort( array(0:index-1), depth_limit-1 ) + call introsort( array(index+1:), depth_limit-1 ) + end if + + end subroutine introsort + + + pure subroutine partition( array, index ) +! quicksort partition using median of three. + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(out) :: index + + integer(int_size) :: i, j + type(string_type) :: u, v, w, x, y + +! Determine median of three and exchange it with the end. + u = array( 0 ) + v = array( size(array, kind=int_size)/2-1 ) + w = array( size(array, kind=int_size)-1 ) + if ( (u > v) .neqv. (u > w) ) then + x = u + y = array(0) + array(0) = array( size( array, kind=int_size ) - 1 ) + array( size( array, kind=int_size ) - 1 ) = y + else if ( (v < u) .neqv. (v < w) ) then + x = v + y = array(size( array, kind=int_size )/2-1) + array( size( array, kind=int_size )/2-1 ) = & + array( size( array, kind=int_size )-1 ) + array( size( array, kind=int_size )-1 ) = y + else + x = w + end if +! Partition the array. + i = -1_int_size + do j = 0_int_size, size(array, kind=int_size)-2 + if ( array(j) <= x ) then + i = i + 1 + y = array(i) + array(i) = array(j) + array(j) = y + end if + end do + y = array(i+1) + array(i+1) = array(size(array, kind=int_size)-1) + array(size(array, kind=int_size)-1) = y + index = i + 1 + + end subroutine partition + + pure subroutine insertion_sort( array ) +! Bog standard insertion sort. + type(string_type), intent(inout) :: array(0:) + + integer(int_size) :: i, j + type(string_type) :: key + + do j=1_int_size, size(array, kind=int_size)-1 + key = array(j) + i = j - 1 + do while( i >= 0 ) + if ( array(i) <= key ) exit + array(i+1) = array(i) + i = i - 1 + end do + array(i+1) = key + end do + + end subroutine insertion_sort + + pure subroutine heap_sort( array ) +! A bog standard heap sort + type(string_type), intent(inout) :: array(0:) + + integer(int_size) :: i, heap_size + type(string_type) :: y + + heap_size = size( array, kind=int_size ) +! Build the max heap + do i = (heap_size-2)/2_int_size, 0_int_size, -1_int_size + call max_heapify( array, i, heap_size ) + end do + do i = heap_size-1, 1_int_size, -1_int_size +! Swap the first element with the current final element + y = array(0) + array(0) = array(i) + array(i) = y +! Sift down using max_heapify + call max_heapify( array, 0_int_size, i ) + end do + + end subroutine heap_sort + + pure recursive subroutine max_heapify( array, i, heap_size ) +! Transform the array into a max heap + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(in) :: i, heap_size + + integer(int_size) :: l, r, largest + type(string_type) :: y + + largest = i + l = 2_int_size * i + 1_int_size + r = l + 1_int_size + if ( l < heap_size ) then + if ( array(l) > array(largest) ) largest = l + end if + if ( r < heap_size ) then + if ( array(r) > array(largest) ) largest = r + end if + if ( largest /= i ) then + y = array(i) + array(i) = array(largest) + array(largest) = y + call max_heapify( array, largest, heap_size ) + end if + + end subroutine max_heapify + + end subroutine string_sort + +end submodule stdlib_sorting_sort diff --git a/src/stdlib_sorting_sort_index.fypp b/src/stdlib_sorting_sort_index.fypp new file mode 100644 index 000000000..222c2cd97 --- /dev/null +++ b/src/stdlib_sorting_sort_index.fypp @@ -0,0 +1,1688 @@ +#! Integer kinds to be considered during templating +#:set INT_KINDS = ["int8", "int16", "int32", "int64"] + +#! Real kinds to be considered during templating +#:set REAL_KINDS = ["sp", "dp", "qp"] + +!! Licensing: +!! +!! This file is subjec† both to the Fortran Standard Library license, and +!! to additional licensing requirements as it contains translations of +!! other software. +!! +!! The Fortran Standard Library, including this file, is distributed under +!! the MIT license that should be included with the library's distribution. +!! +!! Copyright (c) 2021 Fortran stdlib developers +!! +!! Permission is hereby granted, free of charge, to any person obtaining a +!! copy of this software and associated documentation files (the +!! "Software"), to deal in the Software without restriction, including +!! without limitation the rights to use, copy, modify, merge, publish, +!! distribute, sublicense, and/or sellcopies of the Software, and to permit +!! persons to whom the Software is furnished to do so, subject to the +!! following conditions: +!! +!! The above copyright notice and this permission notice shall be included +!! in all copies or substantial portions of the Software. +!! +!! THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +!! OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +!! MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +!! IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +!! CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +!! TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +!! SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +!! +!! The generic subroutine, `SORT_INDEX`, is substantially a translation to +!! Fortran 2008 of the `"Rust" sort` sorting routines in +!! [`slice.rs`](https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs) +!! The `rust sort` implementation is distributed with the header: +!! +!! Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT +!! file at the top-level directory of this distribution and at +!! http://rust-lang.org/COPYRIGHT. +!! +!! Licensed under the Apache License, Version 2.0 or the MIT license +!! , at your +!! option. This file may not be copied, modified, or distributed +!! except according to those terms. +!! +!! so the license for the original`slice.rs` code is compatible with the use +!! of modified versions of the code in the Fortran Standard Library under +!! the MIT license. + +submodule(stdlib_sorting) stdlib_sorting_sort_index + + implicit none + +contains + +#:for k1 in INT_KINDS + + module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse ) +! A modification of `${k1}$_ord_sort` to return an array of indices that +! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY` +! as desired. The indices by default +! correspond to a non-decreasing sort, but if the optional argument +! `REVERSE` is present with a value of `.TRUE.` the indices correspond to +! a non-increasing sort. The logic of the determination of indexing largely +! follows the `"Rust" sort` found in `slice.rs`: +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version is a simplification of the Timsort algorithm described +! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size, with +! an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original listsort.txt, and the optional `work` and `iwork` arraya to be +! used as scratch memory. + + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + integer(${k1}$), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + + integer(int_size) :: array_size, i, stat + integer(${k1}$), allocatable :: buf(:) + integer(int_size), allocatable :: ibuf(:) + + array_size = size(array, kind=int_size) + + do i = 0, array_size-1 + index(i) = i+1 + end do + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + +! If necessary allocate buffers to serve as scratch memory. + if ( present(work) ) then + if ( present(iwork) ) then + call merge_sort( array, index, work, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, work, ibuf ) + end if + else + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of array buffer failed." + if ( present(iwork) ) then + call merge_sort( array, index, buf, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, buf, ibuf ) + end if + end if + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array, index ) +! Sorts `ARRAY` using an insertion sort, while maintaining consistency in +! location of the indices in `INDEX` to the elements of `ARRAY`. + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: i, j, key_index + integer(${k1}$) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + key_index = index(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + index(i+1) = index(i) + i = i - 1 + end do + array(i+1) = key + index(i+1) = key_index + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array, index ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. +! Consistency of the indices in `index` with the elements of `array` +! are maintained. + + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(${k1}$) :: tmp + integer(int_size) :: i, tmp_index + + tmp = array(0) + tmp_index = index(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + index(i-1) = index(i) + end do find_hole + array(i-1) = tmp + index(i-1) = tmp_index + + end subroutine insert_head + + + subroutine merge_sort( array, index, buf, ibuf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. Consistency of the indices in `index` with the elements of +! `array` are maintained. + + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + integer(${k1}$), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least this +! many elements. Slices of up to this length are sorted using insertion sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array, index ) + return + end if + + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish), & + index(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish), index(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf, & + index( left % base: & + right % base + right % len - 1 ), ibuf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf, index, ibuf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + integer(${k1}$), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: index(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + ibuf(0:mid-1) = index(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + index(k) = ibuf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + index(k) = index(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + index(k+1:) = ibuf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter + buf(0:array_len-mid-1) = array(mid:array_len-1) + ibuf(0:array_len-mid-1) = index(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + index(k) = ibuf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + index(k) = index(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + index(0:k-1) = ibuf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array, index ) +! Reverse a segment of an array in place + integer(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: itemp, lo, hi + integer(${k1}$) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + itemp = index(lo) + index(lo) = index(hi) + index(hi) = itemp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine ${k1}$_sort_index + +#:endfor + + +#:for k1 in REAL_KINDS + + module subroutine ${k1}$_sort_index( array, index, work, iwork, reverse ) +! A modification of `${k1}$_ord_sort` to return an array of indices that +! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY` +! as desired. The indices by default +! correspond to a non-decreasing sort, but if the optional argument +! `REVERSE` is present with a value of `.TRUE.` the indices correspond to +! a non-increasing sort. The logic of the determination of indexing largely +! follows the `"Rust" sort` found in `slice.rs`: +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version is a simplification of the Timsort algorithm described +! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size, with +! an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original listsort.txt, and the optional `work` and `iwork` arraya to be +! used as scratch memory. + + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + real(${k1}$), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + + integer(int_size) :: array_size, i, stat + real(${k1}$), allocatable :: buf(:) + integer(int_size), allocatable :: ibuf(:) + + array_size = size(array, kind=int_size) + + do i = 0, array_size-1 + index(i) = i+1 + end do + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + +! If necessary allocate buffers to serve as scratch memory. + if ( present(work) ) then + if ( present(iwork) ) then + call merge_sort( array, index, work, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, work, ibuf ) + end if + else + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of array buffer failed." + if ( present(iwork) ) then + call merge_sort( array, index, buf, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, buf, ibuf ) + end if + end if + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array, index ) +! Sorts `ARRAY` using an insertion sort, while maintaining consistency in +! location of the indices in `INDEX` to the elements of `ARRAY`. + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: i, j, key_index + real(${k1}$) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + key_index = index(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + index(i+1) = index(i) + i = i - 1 + end do + array(i+1) = key + index(i+1) = key_index + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array, index ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. +! Consistency of the indices in `index` with the elements of `array` +! are maintained. + + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + real(${k1}$) :: tmp + integer(int_size) :: i, tmp_index + + tmp = array(0) + tmp_index = index(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + index(i-1) = index(i) + end do find_hole + array(i-1) = tmp + index(i-1) = tmp_index + + end subroutine insert_head + + + subroutine merge_sort( array, index, buf, ibuf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. Consistency of the indices in `index` with the elements of +! `array` are maintained. + + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + real(${k1}$), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least this +! many elements. Slices of up to this length are sorted using insertion sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array, index ) + return + end if + + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish), & + index(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish), index(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf, & + index( left % base: & + right % base + right % len - 1 ), ibuf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf, index, ibuf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + real(${k1}$), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: index(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + ibuf(0:mid-1) = index(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + index(k) = ibuf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + index(k) = index(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + index(k+1:) = ibuf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter + buf(0:array_len-mid-1) = array(mid:array_len-1) + ibuf(0:array_len-mid-1) = index(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + index(k) = ibuf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + index(k) = index(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + index(0:k-1) = ibuf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array, index ) +! Reverse a segment of an array in place + real(${k1}$), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: itemp, lo, hi + real(${k1}$) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + itemp = index(lo) + index(lo) = index(hi) + index(hi) = itemp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine ${k1}$_sort_index + +#:endfor + + module subroutine char_sort_index( array, index, work, iwork, reverse ) +! A modification of `char_ord_sort` to return an array of indices that +! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY` +! as desired. The indices by default +! correspond to a non-decreasing sort, but if the optional argument +! `REVERSE` is present with a value of `.TRUE.` the indices correspond to +! a non-increasing sort. The logic of the determination of indexing largely +! follows the `"Rust" sort` found in `slice.rs`: +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version is a simplification of the Timsort algorithm described +! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size, with +! an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original listsort.txt, and the optional `work` and `iwork` arraya to be +! used as scratch memory. + + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + character(len=len(array)), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + + integer(int_size) :: array_size, i, stat + character(len=:), allocatable :: buf(:) + integer(int_size), allocatable :: ibuf(:) + + array_size = size(array, kind=int_size) + + do i = 0, array_size-1 + index(i) = i+1 + end do + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + +! If necessary allocate buffers to serve as scratch memory. + if ( present(work) ) then + if ( present(iwork) ) then + call merge_sort( array, index, work, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, work, ibuf ) + end if + else + allocate( character(len=len(array)) :: buf(0:array_size/2-1), & + stat=stat ) + if ( stat /= 0 ) error stop "Allocation of array buffer failed." + if ( present(iwork) ) then + call merge_sort( array, index, buf, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, buf, ibuf ) + end if + end if + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array, index ) +! Sorts `ARRAY` using an insertion sort, while maintaining consistency in +! location of the indices in `INDEX` to the elements of `ARRAY`. + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: i, j, key_index + character(len=len(array)) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + key_index = index(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + index(i+1) = index(i) + i = i - 1 + end do + array(i+1) = key + index(i+1) = key_index + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array, index ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. +! Consistency of the indices in `index` with the elements of `array` +! are maintained. + + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + character(len=len(array)) :: tmp + integer(int_size) :: i, tmp_index + + tmp = array(0) + tmp_index = index(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + index(i-1) = index(i) + end do find_hole + array(i-1) = tmp + index(i-1) = tmp_index + + end subroutine insert_head + + + subroutine merge_sort( array, index, buf, ibuf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. Consistency of the indices in `index` with the elements of +! `array` are maintained. + + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + character(len=len(array)), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least this +! many elements. Slices of up to this length are sorted using insertion sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array, index ) + return + end if + + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish), & + index(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish), index(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf, & + index( left % base: & + right % base + right % len - 1 ), ibuf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf, index, ibuf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + character(len=len(array)), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: index(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + ibuf(0:mid-1) = index(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + index(k) = ibuf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + index(k) = index(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + index(k+1:) = ibuf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter + buf(0:array_len-mid-1) = array(mid:array_len-1) + ibuf(0:array_len-mid-1) = index(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + index(k) = ibuf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + index(k) = index(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + index(0:k-1) = ibuf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array, index ) +! Reverse a segment of an array in place + character(len=*), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: itemp, lo, hi + character(len=len(array)) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + itemp = index(lo) + index(lo) = index(hi) + index(hi) = itemp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine char_sort_index + + module subroutine string_sort_index( array, index, work, iwork, reverse ) +! A modification of `string_ord_sort` to return an array of indices that +! would perform a stable sort of the `ARRAY` as input, and also sort `ARRAY` +! as desired. The indices by default +! correspond to a non-decreasing sort, but if the optional argument +! `REVERSE` is present with a value of `.TRUE.` the indices correspond to +! a non-increasing sort. The logic of the determination of indexing largely +! follows the `"Rust" sort` found in `slice.rs`: +! https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs#L2159 +! The Rust version is a simplification of the Timsort algorithm described +! in https://svn.python.org/projects/python/trunk/Objects/listsort.txt, as +! it drops both the use of 'galloping' to identify bounds of regions to be +! sorted and the estimation of the optimal `run size`. However it remains +! a hybrid sorting algorithm combining an iterative Merge sort controlled +! by a stack of `RUNS` identified by regions of uniformly decreasing or +! non-decreasing sequences that may be expanded to a minimum run size, with +! an insertion sort. +! +! Note the Fortran implementation simplifies the logic as it only has to +! deal with Fortran arrays of intrinsic types and not the full generality +! of Rust's arrays and lists for arbitrary types. It also adds the +! estimation of the optimal `run size` as suggested in Tim Peters' +! original listsort.txt, and the optional `work` and `iwork` arraya to be +! used as scratch memory. + + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + type(string_type), intent(inout), optional :: work(0:) + integer(int_size), intent(inout), optional :: iwork(0:) + logical, intent(in), optional :: reverse + + integer(int_size) :: array_size, i, stat + type(string_type), allocatable :: buf(:) + integer(int_size), allocatable :: ibuf(:) + + array_size = size(array, kind=int_size) + + do i = 0, array_size-1 + index(i) = i+1 + end do + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + +! If necessary allocate buffers to serve as scratch memory. + if ( present(work) ) then + if ( present(iwork) ) then + call merge_sort( array, index, work, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, work, ibuf ) + end if + else + allocate( buf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of array buffer failed." + if ( present(iwork) ) then + call merge_sort( array, index, buf, iwork ) + else + allocate( ibuf(0:array_size/2-1), stat=stat ) + if ( stat /= 0 ) error stop "Allocation of index buffer failed." + call merge_sort( array, index, buf, ibuf ) + end if + end if + + if ( present(reverse) ) then + if ( reverse ) then + call reverse_segment( array, index ) + end if + end if + + contains + + pure function calc_min_run( n ) result(min_run) +!! Returns the minimum length of a run from 32-63 so that N/MIN_RUN is +!! less than or equal to a power of two. See +!! https://svn.python.org/projects/python/trunk/Objects/listsort.txt + integer(int_size) :: min_run + integer(int_size), intent(in) :: n + + integer(int_size) :: num, r + + num = n + r = 0_int_size + + do while( num >= 64 ) + r = ior( r, iand(num, 1_int_size) ) + num = ishft(num, -1_int_size) + end do + min_run = num + r + + end function calc_min_run + + + pure subroutine insertion_sort( array, index ) +! Sorts `ARRAY` using an insertion sort, while maintaining consistency in +! location of the indices in `INDEX` to the elements of `ARRAY`. + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: i, j, key_index + type(string_type) :: key + + do j=1, size(array, kind=int_size)-1 + key = array(j) + key_index = index(j) + i = j - 1 + do while( i >= 0 .and. array(i) > key ) + array(i+1) = array(i) + index(i+1) = index(i) + i = i - 1 + end do + array(i+1) = key + index(i+1) = key_index + end do + + end subroutine insertion_sort + + + pure function collapse( runs ) result ( r ) +! Examine the stack of runs waiting to be merged, identifying adjacent runs +! to be merged until the stack invariants are restablished: +! +! 1. len(-3) > len(-2) + len(-1) +! 2. len(-2) > len(-1) + + integer(int_size) :: r + type(run_type), intent(in), target :: runs(0:) + + integer(int_size) :: n + logical :: test + + n = size(runs, kind=int_size) + test = .false. + if (n >= 2) then + if ( runs( n-1 ) % base == 0 .or. & + runs( n-2 ) % len <= runs(n-1) % len ) then + test = .true. + else if ( n >= 3 ) then ! X exists + if ( runs(n-3) % len <= & + runs(n-2) % len + runs(n-1) % len ) then + test = .true. +! |X| <= |Y| + |Z| => will need to merge due to rho1 or rho2 + else if( n >= 4 ) then + if ( runs(n-4) % len <= & + runs(n-3) % len + runs(n-2) % len ) then + test = .true. +! |W| <= |X| + |Y| => will need to merge due to rho1 or rho3 + end if + end if + end if + end if + if ( test ) then +! By default merge Y & Z, rho2 or rho3 + if ( n >= 3 ) then + if ( runs(n-3) % len < runs(n-1) % len ) then + r = n - 3 +! |X| < |Z| => merge X & Y, rho1 + return + end if + end if + r = n - 2 +! |Y| <= |Z| => merge Y & Z, rho4 + return + else + r = -1 + end if + + end function collapse + + + pure subroutine insert_head( array, index ) +! Inserts `array(0)` into the pre-sorted sequence `array(1:)` so that the +! whole `array(0:)` becomes sorted, copying the first element into +! a temporary variable, iterating until the right place for it is found. +! copying every traversed element into the slot preceding it, and finally, +! copying data from the temporary variable into the resulting hole. +! Consistency of the indices in `index` with the elements of `array` +! are maintained. + + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + type(string_type) :: tmp + integer(int_size) :: i, tmp_index + + tmp = array(0) + tmp_index = index(0) + find_hole: do i=1, size(array, kind=int_size)-1 + if ( array(i) >= tmp ) exit find_hole + array(i-1) = array(i) + index(i-1) = index(i) + end do find_hole + array(i-1) = tmp + index(i-1) = tmp_index + + end subroutine insert_head + + + subroutine merge_sort( array, index, buf, ibuf ) +! The Rust merge sort borrows some (but not all) of the ideas from TimSort, +! which is described in detail at +! (http://svn.python.org/projects/python/trunk/Objects/listsort.txt). +! +! The algorithm identifies strictly descending and non-descending +! subsequences, which are called natural runs. Where these runs are less +! than a minimum run size they are padded by adding additional samples +! using an insertion sort. The merge process is driven by a stack of +! pending unmerged runs. Each newly found run is pushed onto the stack, +! and then pairs of adjacentd runs are merged until these two invariants +! are satisfied: +! +! 1. for every `i` in `1..size(runs)-1`: `runs(i - 1)%len > runs(i)%len` +! 2. for every `i` in `2..size(runs)-1`: `runs(i - 2)%len > +! runs(i - 1)%len + runs(i)%len` +! +! The invariants ensure that the total running time is `O(n log n)` +! worst-case. Consistency of the indices in `index` with the elements of +! `array` are maintained. + + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + type(string_type), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_size, finish, min_run, r, r_count, & + start + type(run_type) :: runs(0:max_merge_stack-1), left, right + + array_size = size(array, kind=int_size) + +! Very short runs are extended using insertion sort to span at least this +! many elements. Slices of up to this length are sorted using insertion sort. + min_run = calc_min_run( array_size ) + + if ( array_size <= min_run ) then + if ( array_size >= 2 ) call insertion_sort( array, index ) + return + end if + + +! Following Rust sort, natural runs in `array` are identified by traversing +! it backwards. By traversing it backward, merges more often go in the +! opposite direction (forwards). According to developers of Rust sort, +! merging forwards is slightly faster than merging backwards. Therefore +! identifying runs by traversing backwards should improve performance. + r_count = 0 + finish = array_size - 1 + do while ( finish >= 0 ) +! Find the next natural run, and reverse it if it's strictly descending. + start = finish + if ( start > 0 ) then + start = start - 1 + if ( array(start+1) < array(start) ) then + Descending: do while ( start > 0 ) + if ( array(start) >= array(start-1) ) & + exit Descending + start = start - 1 + end do Descending + call reverse_segment( array(start:finish), & + index(start:finish) ) + else + Ascending: do while( start > 0 ) + if ( array(start) < array(start-1) ) exit Ascending + start = start - 1 + end do Ascending + end if + end if + +! If the run is too short insert some more elements using an insertion sort. + Insert: do while ( start > 0 ) + if ( finish - start >= min_run - 1 ) exit Insert + start = start - 1 + call insert_head( array(start:finish), index(start:finish) ) + end do Insert + if ( start == 0 .and. finish == array_size - 1 ) return + + runs(r_count) = run_type( base = start, & + len = finish - start + 1 ) + finish = start-1 + r_count = r_count + 1 + +! Determine whether pairs of adjacent runs need to be merged to satisfy +! the invariants, and, if so, merge them. + Merge_loop: do + r = collapse( runs(0:r_count - 1) ) + if ( r < 0 .or. r_count <= 1 ) exit Merge_loop + left = runs( r + 1 ) + right = runs( r ) + call merge( array( left % base: & + right % base + right % len - 1 ), & + left % len, buf, & + index( left % base: & + right % base + right % len - 1 ), ibuf ) + + runs(r) = run_type( base = left % base, & + len = left % len + right % len ) + if ( r == r_count - 3 ) runs(r+1) = runs(r+2) + r_count = r_count - 1 + + end do Merge_loop + end do + if ( r_count /= 1 ) & + error stop "MERGE_SORT completed without RUN COUNT == 1." + + end subroutine merge_sort + + + pure subroutine merge( array, mid, buf, index, ibuf ) +! Merges the two non-decreasing runs `ARRAY(0:MID-1)` and `ARRAY(MID:)` +! using `BUF` as temporary storage, and stores the merged runs into +! `ARRAY(0:)`. `MID` must be > 0, and < `SIZE(ARRAY)-1`. Buffer `BUF` +! must be long enough to hold the shorter of the two runs. + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(in) :: mid + type(string_type), intent(inout) :: buf(0:) + integer(int_size), intent(inout) :: index(0:) + integer(int_size), intent(inout) :: ibuf(0:) + + integer(int_size) :: array_len, i, j, k + + array_len = size(array, kind=int_size) + +! Merge first copies the shorter run into `buf`. Then, depending on which +! run was shorter, it traces the copied run and the longer run forwards +! (or backwards), comparing their next unprocessed elements and then +! copying the lesser (or greater) one into `array`. + + if ( mid <= array_len - mid ) then ! The left run is shorter. + buf(0:mid-1) = array(0:mid-1) + ibuf(0:mid-1) = index(0:mid-1) + i = 0 + j = mid + merge_lower: do k = 0, array_len-1 + if ( buf(i) <= array(j) ) then + array(k) = buf(i) + index(k) = ibuf(i) + i = i + 1 + if ( i >= mid ) exit merge_lower + else + array(k) = array(j) + index(k) = index(j) + j = j + 1 + if ( j >= array_len ) then + array(k+1:) = buf(i:mid-1) + index(k+1:) = ibuf(i:mid-1) + exit merge_lower + end if + end if + end do merge_lower + else ! The right run is shorter + buf(0:array_len-mid-1) = array(mid:array_len-1) + ibuf(0:array_len-mid-1) = index(mid:array_len-1) + i = mid - 1 + j = array_len - mid -1 + merge_upper: do k = array_len-1, 0, -1 + if ( buf(j) >= array(i) ) then + array(k) = buf(j) + index(k) = ibuf(j) + j = j - 1 + if ( j < 0 ) exit merge_upper + else + array(k) = array(i) + index(k) = index(i) + i = i - 1 + if ( i < 0 ) then + array(0:k-1) = buf(0:j) + index(0:k-1) = ibuf(0:j) + exit merge_upper + end if + end if + end do merge_upper + end if + end subroutine merge + + + pure subroutine reverse_segment( array, index ) +! Reverse a segment of an array in place + type(string_type), intent(inout) :: array(0:) + integer(int_size), intent(inout) :: index(0:) + + integer(int_size) :: itemp, lo, hi + type(string_type) :: temp + + lo = 0 + hi = size( array, kind=int_size ) - 1 + do while( lo < hi ) + temp = array(lo) + array(lo) = array(hi) + array(hi) = temp + itemp = index(lo) + index(lo) = index(hi) + index(hi) = itemp + lo = lo + 1 + hi = hi - 1 + end do + + end subroutine reverse_segment + + end subroutine string_sort_index + +end submodule stdlib_sorting_sort_index diff --git a/src/tests/CMakeLists.txt b/src/tests/CMakeLists.txt index 288445de9..381226222 100644 --- a/src/tests/CMakeLists.txt +++ b/src/tests/CMakeLists.txt @@ -12,6 +12,7 @@ add_subdirectory(io) add_subdirectory(linalg) add_subdirectory(logger) add_subdirectory(optval) +add_subdirectory(sorting) add_subdirectory(stats) add_subdirectory(string) add_subdirectory(system) diff --git a/src/tests/Makefile.manual b/src/tests/Makefile.manual index 773066b1c..57f663bef 100644 --- a/src/tests/Makefile.manual +++ b/src/tests/Makefile.manual @@ -6,6 +6,7 @@ all test clean: $(MAKE) -f Makefile.manual --directory=io $@ $(MAKE) -f Makefile.manual --directory=logger $@ $(MAKE) -f Makefile.manual --directory=optval $@ + $(MAKE) -f Makefile.manual --directory=sorting $@ $(MAKE) -f Makefile.manual --directory=quadrature $@ $(MAKE) -f Makefile.manual --directory=stats $@ $(MAKE) -f Makefile.manual --directory=string $@ diff --git a/src/tests/sorting/CMakeLists.txt b/src/tests/sorting/CMakeLists.txt new file mode 100644 index 000000000..bb4b27896 --- /dev/null +++ b/src/tests/sorting/CMakeLists.txt @@ -0,0 +1,2 @@ +ADDTEST(sorting) + diff --git a/src/tests/sorting/Makefile.manual b/src/tests/sorting/Makefile.manual new file mode 100644 index 000000000..39657c8f3 --- /dev/null +++ b/src/tests/sorting/Makefile.manual @@ -0,0 +1,3 @@ +PROGS_SRC = test_sorting.f90 + +include ../Makefile.manual.test.mk diff --git a/src/tests/sorting/test_sorting.f90 b/src/tests/sorting/test_sorting.f90 new file mode 100644 index 000000000..4025aef42 --- /dev/null +++ b/src/tests/sorting/test_sorting.f90 @@ -0,0 +1,625 @@ +program test_sorting + + use, intrinsic :: iso_fortran_env, only: & + compiler_version + use stdlib_kinds, only: int32, int64, dp, sp + use stdlib_sorting + use stdlib_string_type, only: string_type, assignment(=), operator(>), & + write(formatted) + + implicit none + + integer(int32), parameter :: test_size = 2_int32**20 + integer(int32), parameter :: char_size = 26**4 + integer(int32), parameter :: string_size = 26**3 + integer(int32), parameter :: block_size = test_size/6 + integer, parameter :: repeat = 8 + + integer(int32) :: & + blocks(0:test_size-1), & + decrease(0:test_size-1), & + identical(0:test_size-1), & + increase(0:test_size-1), & + rand0(0:test_size-1), & + rand1(0:test_size-1), & + rand2(0:test_size-1), & + rand3(0:test_size-1), & + rand10(0:test_size-1) + character(len=4) :: & + char_decrease(0:char_size-1), & + char_increase(0:char_size-1), & + char_rand(0:char_size-1) + type(string_type) :: & + string_decrease(0:string_size-1), & + string_increase(0:string_size-1), & + string_rand(0:string_size-1) + + integer(int32) :: dummy(0:test_size-1) + character(len=4) :: char_dummy(0:char_size-1) + type(string_type) :: string_dummy(0:string_size-1) + integer(int_size) :: index(0:test_size-1) + integer(int32) :: work(0:test_size/2-1) + character(len=4) :: char_work(0:char_size/2-1) + type(string_type) :: string_work(0:string_size/2-1) + integer(int_size) :: iwork(0:test_size/2-1) + integer :: count, i, index1, index2, j, k, l, temp + real(sp) :: arand, brand + character(*), parameter :: filename = 'test_sorting.txt' + integer :: lun + character(len=4) :: char_temp + type(string_type) :: string_temp + +! Create the test arrays + identical(:) = 10 + do i=0, test_size-1 + increase(i) = i + decrease(i) = test_size - 1 - i + call random_number( arand ) + rand0(i) = int( floor( 4 * arand * test_size ), kind=int32 ) + rand1(i) = int( floor( arand * test_size / 4 ), kind=int32 ) + end do + blocks(:) = increase(:) + blocks(0:block_size-1) = increase(4*block_size:5*block_size-1) + blocks(block_size:2*block_size-1) = increase(0:block_size-1) + blocks(2*block_size:3*block_size-1) = increase(2*block_size:3*block_size-1) + blocks(3*block_size:4*block_size-1) = increase(block_size:2*block_size-1) + blocks(4*block_size:5*block_size-1) = increase(3*block_size:4*block_size-1) + rand2(:) = increase(:) + do i=0, test_size-1 + call random_number( arand ) + index1 = int( floor( arand * test_size ), kind=int32 ) + temp = rand2(i) + rand2(i) = rand2(index1) + rand2(index1) = temp + end do + rand3(:) = increase(:) + do i=0, 2 + call random_number( arand ) + call random_number( brand ) + index1 = int( floor( arand * test_size ), kind=int32 ) + index2 = int( floor( brand * test_size ), kind=int32 ) + temp = rand3(index1) + rand3(index1) = rand3(index2) + rand3(index2) = temp + end do + rand10(:) = increase(:) + do i=test_size-10, test_size-1 + call random_number( arand ) + rand10(i) = int( floor( arand * test_size ), kind=int32 ) + end do + + count = 0 + do i=0, 25 + do j=0, 25 + do k=0, 25 + do l=0, 25 + char_increase(count) = achar(97+i) // achar(97+j) // & + achar(97+k) // achar(97+l) + count = count + 1 + end do + end do + end do + end do + + do i=0, char_size-1 + char_decrease(char_size-1-i) = char_increase(i) + end do + + char_rand(:) = char_increase(:) + do i=0, char_size-1 + call random_number( arand ) + index1 = int( floor( arand * char_size ), kind=int32 ) + char_temp = char_rand(i) + char_rand(i) = char_rand(index1) + char_rand(index1) = char_temp + end do + + count = 0 + do i=0, 25 + do j=0, 25 + do k=0, 25 + string_increase(count) = achar(97+i) // achar(97+j) // & + achar(97+k) + count = count + 1 + end do + end do + end do + + do i=0, string_size-1 + string_decrease(string_size - 1 - i) = char_increase(i) + end do + + string_rand(:) = string_increase(:) + do i=0, string_size-1 + call random_number( arand ) + index1 = int( floor( arand * string_size ), kind=int32 ) + string_temp = string_rand(i) + string_rand(i) = string_rand(index1) + string_rand(index1) = string_temp + end do + +! Create and intialize file to report the results of the sortings + open( newunit=lun, file=filename, access='sequential', action='write', & + form='formatted', status='replace' ) + write( lun, '(a)' ) trim(compiler_version()) + write( lun, * ) + write( lun, '("| Type | Elements | Array Name | Method ' // & + ' | Time (s) |")' ) + write( lun, '("|-------------|----------|-----------------|-----------' // & + '--|-----------|")' ) + +! test the sorting routines on the test arrays + call test_int_ord_sorts( ) + + call test_char_ord_sorts( ) + + call test_string_ord_sorts( ) + + call test_int_sorts( ) + + call test_char_sorts( ) + + call test_string_sorts( ) + + call test_int_sort_indexes( ) + + call test_char_sort_indexes( ) + + call test_string_sort_indexes( ) + +contains + + subroutine test_int_ord_sorts( ) + + call test_int_ord_sort( blocks, "Blocks" ) + call test_int_ord_sort( decrease, "Decreasing" ) + call test_int_ord_sort( identical, "Identical" ) + call test_int_ord_sort( increase, "Increasing" ) + call test_int_ord_sort( rand1, "Random dense" ) + call test_int_ord_sort( rand2, "Random order" ) + call test_int_ord_sort( rand0, "Random sparse" ) + call test_int_ord_sort( rand3, "Random 3" ) + call test_int_ord_sort( rand10, "Random 10" ) + + end subroutine test_int_ord_sorts + + + subroutine test_int_ord_sort( a, a_name ) + integer(int32), intent(in) :: a(:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + dummy = a + call system_clock( t0, rate ) + call ord_sort( dummy, work ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_sort( dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "ORD_SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i) + end if + write( lun, '("| Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + test_size, a_name, "Ord_Sort", tdiff/rate + + end subroutine test_int_ord_sort + + subroutine test_char_ord_sorts( ) + + call test_char_ord_sort( char_decrease, "Char. Decrease" ) + call test_char_ord_sort( char_increase, "Char. Increase" ) + call test_char_ord_sort( char_rand, "Char. Random" ) + + end subroutine test_char_ord_sorts + + subroutine test_char_ord_sort( a, a_name ) + character(len=4), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + char_dummy = a + call system_clock( t0, rate ) + call ord_sort( char_dummy, char_work ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_char_sort( char_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "ORD_SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i) + end if + write( lun, '("| Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + char_size, a_name, "Ord_Sort", tdiff/rate + + end subroutine test_char_ord_sort + + subroutine test_string_ord_sorts( ) + + call test_string_ord_sort( string_decrease, "String Decrease" ) + call test_string_ord_sort( string_increase, "String Increase" ) + call test_string_ord_sort( string_rand, "String Random" ) + + end subroutine test_string_ord_sorts + + subroutine test_string_ord_sort( a, a_name ) + type(string_type), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + string_dummy = a + call system_clock( t0, rate ) + call ord_sort( string_dummy, string_work ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_string_sort( string_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "ORD_SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', & + string_dummy(i-1:i) + end if + write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + string_size, a_name, "Ord_Sort", tdiff/rate + + end subroutine test_string_ord_sort + + + subroutine test_int_sorts( ) + + call test_int_sort( blocks, "Blocks" ) + call test_int_sort( decrease, "Decreasing" ) + call test_int_sort( identical, "Identical" ) + call test_int_sort( increase, "Increasing" ) + call test_int_sort( rand1, "Random dense" ) + call test_int_sort( rand2, "Random order" ) + call test_int_sort( rand0, "Random sparse" ) + call test_int_sort( rand3, "Random 3" ) + call test_int_sort( rand10, "Random 10" ) + + end subroutine test_int_sorts + + subroutine test_int_sort( a, a_name ) + integer(int32), intent(in) :: a(:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + dummy = a + call system_clock( t0, rate ) + call sort( dummy ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_sort( dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a12, 2i7)') 'dummy(i-1:i) = ', dummy(i-1:i) + end if + write( lun, '("| Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + test_size, a_name, "Sort", tdiff/rate + + end subroutine test_int_sort + + subroutine test_char_sorts( ) + + call test_char_sort( char_decrease, "Char. Decrease" ) + call test_char_sort( char_increase, "Char. Increase" ) + call test_char_sort( char_rand, "Char. Random" ) + + end subroutine test_char_sorts + + subroutine test_char_sort( a, a_name ) + character(len=4), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + char_dummy = a + call system_clock( t0, rate ) + call sort( char_dummy ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_char_sort( char_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i) + end if + write( lun, '("| Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + char_size, a_name, "Sort", tdiff/rate + + end subroutine test_char_sort + + subroutine test_string_sorts( ) + + call test_string_sort( string_decrease, "String Decrease" ) + call test_string_sort( string_increase, "String Increase" ) + call test_string_sort( string_rand, "String Random" ) + + end subroutine test_string_sorts + + subroutine test_string_sort( a, a_name ) + type(string_type), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + string_dummy = a + call system_clock( t0, rate ) + call sort( string_dummy ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_string_sort( string_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', & + string_dummy(i-1:i) + end if + write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + string_size, a_name, "Sort", tdiff/rate + + end subroutine test_string_sort + + subroutine test_int_sort_indexes( ) + + call test_int_sort_index( blocks, "Blocks" ) + call test_int_sort_index( decrease, "Decreasing" ) + call test_int_sort_index( identical, "Identical" ) + call test_int_sort_index( increase, "Increasing" ) + call test_int_sort_index( rand1, "Random dense" ) + call test_int_sort_index( rand2, "Random order" ) + call test_int_sort_index( rand0, "Random sparse" ) + call test_int_sort_index( rand3, "Random 3" ) + call test_int_sort_index( rand10, "Random 10" ) + + end subroutine test_int_sort_indexes + + subroutine test_int_sort_index( a, a_name ) + integer(int32), intent(inout) :: a(:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + dummy = a + call system_clock( t0, rate ) + call sort_index( dummy, index, work, iwork ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + dummy = a(index) + call verify_sort( dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT_INDEX did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a18, 2i7)') 'a(index(i-1:i)) = ', a(index(i-1:i)) + end if + write( lun, '("| Integer |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + test_size, a_name, "Sort_Index", tdiff/rate + + dummy = a + call sort_index( dummy, index, work, iwork, reverse=.true. ) + dummy = a(index) + call verify_reverse_sort( dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT_INDEX did not reverse sort " // & + a_name // "." + write(*,*) 'i = ', i + write(*,'(a18, 2i7)') 'a(index(i-1:i)) = ', a(index(i-1:i)) + end if + + end subroutine test_int_sort_index + + subroutine test_char_sort_indexes( ) + + call test_char_sort_index( char_decrease, "Char. Decrease" ) + call test_char_sort_index( char_increase, "Char. Increase" ) + call test_char_sort_index( char_rand, "Char. Random" ) + + end subroutine test_char_sort_indexes + + subroutine test_char_sort_index( a, a_name ) + character(len=4), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + char_dummy = a + call system_clock( t0, rate ) + call sort_index( char_dummy, index, char_work, iwork ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_char_sort( char_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT_INDEX did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'char_dummy(i-1:i) = ', char_dummy(i-1:i) + end if + write( lun, '("| Character |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + char_size, a_name, "Sort_Index", tdiff/rate + + end subroutine test_char_sort_index + + subroutine test_string_sort_indexes( ) + + call test_string_sort_index( string_decrease, "String Decrease" ) + call test_string_sort_index( string_increase, "String Increase" ) + call test_string_sort_index( string_rand, "String Random" ) + + end subroutine test_string_sort_indexes + + subroutine test_string_sort_index( a, a_name ) + type(string_type), intent(in) :: a(0:) + character(*), intent(in) :: a_name + + integer(int64) :: t0, t1, tdiff + real(dp) :: rate + integer(int64) :: i + logical :: valid + + tdiff = 0 + do i = 1, repeat + string_dummy = a + call system_clock( t0, rate ) + call sort_index( string_dummy, index, string_work, iwork ) + call system_clock( t1, rate ) + tdiff = tdiff + t1 - t0 + end do + tdiff = tdiff/repeat + + call verify_string_sort( string_dummy, valid, i ) + if ( .not. valid ) then + write( *, * ) "SORT_INDEX did not sort " // a_name // "." + write(*,*) 'i = ', i + write(*,'(a17, 2(1x,a4))') 'string_dummy(i-1:i) = ', & + string_dummy(i-1:i) + end if + write( lun, '("| String_type |", 1x, i7, 2x, "|", 1x, a15, " |", ' // & + 'a12, " |", F10.5, " |" )' ) & + string_size, a_name, "Sort_Index", tdiff/rate + + end subroutine test_string_sort_index + + + subroutine verify_sort( a, valid, i ) + integer(int32), intent(in) :: a(0:) + logical, intent(out) :: valid + integer(int64), intent(out) :: i + + integer(int64) :: n + + n = size( a, kind=int64 ) + valid = .false. + do i=1, n-1 + if ( a(i-1) > a(i) ) return + end do + valid = .true. + + end subroutine verify_sort + + + subroutine verify_string_sort( a, valid, i ) + type(string_type), intent(in) :: a(0:) + logical, intent(out) :: valid + integer(int64), intent(out) :: i + + integer(int64) :: n + + n = size( a, kind=int64 ) + valid = .false. + do i=1, n-1 + if ( a(i-1) > a(i) ) return + end do + valid = .true. + + end subroutine verify_string_sort + + subroutine verify_char_sort( a, valid, i ) + character(len=4), intent(in) :: a(0:) + logical, intent(out) :: valid + integer(int64), intent(out) :: i + + integer(int64) :: n + + n = size( a, kind=int64 ) + valid = .false. + do i=1, n-1 + if ( a(i-1) > a(i) ) return + end do + valid = .true. + + end subroutine verify_char_sort + + + subroutine verify_reverse_sort( a, valid, i ) + integer(int32), intent(in) :: a(0:) + logical, intent(out) :: valid + integer(int64), intent(out) :: i + + integer(int64) :: n + + n = size( a, kind=int64 ) + valid = .false. + do i=1, n-1 + if ( a(i-1) < a(i) ) return + end do + valid = .true. + + end subroutine verify_reverse_sort + +end program test_sorting