Commit adc27a4

Implement unique function that returns only the unique values in a vector (Issue fortran-lang#940)
1 parent 2bdc50e commit adc27a4

8 files changed (+598, -5 lines)

doc/specs/stdlib_sorting_unique.md

+176
@@ -0,0 +1,176 @@
---
title: unique function
---

# The `unique` function

[TOC]

## Introduction

This function returns an array containing only the unique values extracted from an input array. This is useful for removing duplicates from datasets and finding the distinct elements in a collection.

## Status

The `unique` function is currently in **experimental** status.

## Version History

|Version|Change|
|---|---|
|v0.1.0|Initial functionality in experimental status|

## Requirements

This function has been designed to handle arrays of different types, including intrinsic numeric types, character arrays, and `string_type` arrays. The function should be efficient while maintaining an easy-to-use interface.

## Usage

```fortran
integer :: x(5) = [1, 2, 3, 3, 4]
integer, allocatable :: y(:)
real :: a(8) = [3.1, 2.5, 7.2, 3.1, 2.5, 8.0, 7.2, 9.5]
real, allocatable :: b(:)

! Get unique values from an integer array
y = unique(x)  ! y will be [1, 2, 3, 4]

! Get sorted unique values from a real array
b = unique(a, sorted=.true.)  ! b will be [2.5, 3.1, 7.2, 8.0, 9.5]
```

## API

### `unique` - Returns unique values from an array

#### Interface

```fortran
pure function unique(array, sorted) result(unique_values)
    <type>, intent(in) :: array(:)
    logical, intent(in), optional :: sorted
    <type>, allocatable :: unique_values(:)
end function unique
```

where `<type>` can be any of:
* `integer(int8)`, `integer(int16)`, `integer(int32)`, `integer(int64)`
* `real(sp)`, `real(dp)`, `real(xdp)`, `real(qp)`
* `complex(sp)`, `complex(dp)`, `complex(xdp)`, `complex(qp)`
* `character(len=*)`
* `type(string_type)`

#### Arguments

`array`: Array whose unique values need to be extracted.

`sorted` (optional): Whether the output vector needs to be sorted or not. Default is `.false.`.

#### Result

The function returns an allocatable array containing only the unique values from the input array.

If `sorted` is `.true.`, the returned array will be sorted in order of non-decreasing values.

If `sorted` is `.false.` (the default), the order of elements is unspecified but generally reflects the order of first appearance of each unique value in the input array.

## Examples

### Example 1: Basic usage with integers

```fortran
program example_unique_integers
    use stdlib_sorting, only: unique
    implicit none

    integer :: data(10) = [1, 2, 3, 3, 4, 5, 5, 6, 6, 6]
    integer, allocatable :: unique_values(:)

    ! Get unique values
    unique_values = unique(data)

    ! Print the results
    print *, "Original array: ", data
    print *, "Unique values: ", unique_values

end program example_unique_integers
```

Expected output:
```
Original array: 1 2 3 3 4 5 5 6 6 6
Unique values: 1 2 3 4 5 6
```

### Example 2: Using the sorted option with real values

```fortran
program example_unique_reals
    use stdlib_kinds, only: sp
    use stdlib_sorting, only: unique
    implicit none

    real(sp) :: data(8) = [3.1, 2.5, 7.2, 3.1, 2.5, 8.0, 7.2, 9.5]
    real(sp), allocatable :: unique_values(:)

    ! Get unique values in sorted order
    unique_values = unique(data, sorted=.true.)

    ! Print the results
    print *, "Original array: ", data
    print *, "Sorted unique values: ", unique_values

end program example_unique_reals
```

Expected output (list-directed formatting of reals is compiler-dependent, so the printed digits and spacing may differ):
```
Original array: 3.1 2.5 7.2 3.1 2.5 8.0 7.2 9.5
Sorted unique values: 2.5 3.1 7.2 8.0 9.5
```

### Example 3: Working with character arrays

```fortran
program example_unique_strings
    use stdlib_sorting, only: unique
    implicit none

    ! The type spec in the constructor pads every element to len=6;
    ! without it, all literals would need exactly the same length.
    character(len=6) :: data(7) = [character(len=6) :: &
        "apple", "banana", "cherry", "apple", "date", "banana", "cherry"]
    character(len=6), allocatable :: unique_values(:)
    integer :: i

    ! Get unique values
    unique_values = unique(data)

    ! Print the results
    print *, "Original array:"
    do i = 1, size(data)
        print *, data(i)
    end do

    print *, "Unique values:"
    do i = 1, size(unique_values)
        print *, unique_values(i)
    end do

end program example_unique_strings
```

## Implementation Notes

The implementation uses a sorting-based approach to identify unique elements efficiently. When `sorted=.true.`, the algorithm sorts the input array and then identifies adjacent duplicate elements. When `sorted=.false.`, the function still uses sorting internally but ensures that the order of first appearance is preserved.
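
The approach described above can be illustrated with a minimal sketch for default integers. This is not the code added by this commit: the module name `unique_sketch_mod`, the function `unique_sketch`, and the use of a simple stable insertion sort on an index array are all assumptions made for illustration. The sketch sorts indices by value, keeps the first element of each run of equal values, and then uses `pack` either on the sorted view or on the original array to restore first-appearance order.

```fortran
module unique_sketch_mod
    implicit none
    private
    public :: unique_sketch   ! hypothetical name, not part of stdlib

contains

    !> Illustrative sketch only: sort-based deduplication of an integer array.
    pure function unique_sketch(array, sorted) result(unique_values)
        integer, intent(in) :: array(:)
        logical, intent(in), optional :: sorted
        integer, allocatable :: unique_values(:)

        integer :: idx(size(array))    ! permutation that sorts `array`
        logical :: keep(size(array))   ! keep(k): array(k) is a first occurrence
        logical :: want_sorted
        integer :: i, j, tmp, n

        n = size(array)
        want_sorted = .false.
        if (present(sorted)) want_sorted = sorted

        ! Stable insertion sort of the index array by value (O(n**2),
        ! chosen for brevity; a real implementation would use a faster sort).
        idx = [(i, i = 1, n)]
        do i = 2, n
            tmp = idx(i)
            j = i - 1
            do while (j >= 1)
                if (array(idx(j)) <= array(tmp)) exit
                idx(j + 1) = idx(j)
                j = j - 1
            end do
            idx(j + 1) = tmp
        end do

        ! In sorted order equal values are adjacent. Keep the first element
        ! of each run; stability makes it the earliest occurrence in `array`.
        keep = .false.
        do i = 1, n
            if (i == 1) then
                keep(idx(i)) = .true.
            else if (array(idx(i)) /= array(idx(i - 1))) then
                keep(idx(i)) = .true.
            end if
        end do

        if (want_sorted) then
            ! Walk the array in sorted order.
            unique_values = pack(array(idx), keep(idx))
        else
            ! Walk the array in original order: first-appearance order.
            unique_values = pack(array, keep)
        end if
    end function unique_sketch

end module unique_sketch_mod
```

With this sketch, `unique_sketch([3, 1, 3, 2, 1, 5, 2])` yields `[3, 1, 2, 5]`, and adding `sorted=.true.` yields `[1, 2, 3, 5]`. Marking first occurrences by original position is one way to support both orderings from a single sort.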

## Future Enhancements

Future versions might include additional features:

1. Return the indices of the first occurrence of each unique element
2. Return indices that can reconstruct the original array from the unique elements
3. Support for multi-dimensional arrays
4. Tolerance parameter for floating-point comparisons

## Related Functions

* `sort` - Sorts an array in ascending or descending order
* `sort_index` - Creates an index array that would sort an array
* `ord_sort` - Performs a stable sort on an array

example/sorting/CMakeLists.txt

+1
@@ -3,3 +3,4 @@ ADD_EXAMPLE(sort)
 ADD_EXAMPLE(sort_index)
 ADD_EXAMPLE(radix_sort)
 ADD_EXAMPLE(sort_bitset)
+ADD_EXAMPLE(unique)

example/sorting/example_unique.f90

+64
@@ -0,0 +1,64 @@
program example_unique
    use stdlib_kinds, only: dp, sp
    use stdlib_sorting, only: unique
    ! assignment(=) and write(formatted) are needed for the string_type
    ! assignments and list-directed prints below
    use stdlib_string_type, only: string_type, assignment(=), write(formatted)
    implicit none

    ! Example with integer array
    integer :: int_array(10) = [1, 2, 3, 3, 4, 5, 5, 6, 6, 6]
    integer, allocatable :: int_unique(:)

    ! Example with real array
    real(sp) :: real_array(8) = [3.1, 2.5, 7.2, 3.1, 2.5, 8.0, 7.2, 9.5]
    real(sp), allocatable :: real_unique(:)

    ! Example with character array (the type spec pads every element to len=6)
    character(len=6) :: char_array(7) = [character(len=6) :: &
        "apple", "banana", "cherry", "apple", "date", "banana", "cherry"]
    character(len=6), allocatable :: char_unique(:)

    ! Example with string_type array
    type(string_type) :: string_array(8), string_unique_sorted(4)
    type(string_type), allocatable :: string_unique(:)

    integer :: i

    ! Setup string array
    string_array(1) = "apple"
    string_array(2) = "banana"
    string_array(3) = "cherry"
    string_array(4) = "apple"
    string_array(5) = "date"
    string_array(6) = "banana"
    string_array(7) = "cherry"
    string_array(8) = "apple"

    ! Get unique integer values
    int_unique = unique(int_array)
    print *, "Unique integers:", int_unique

    ! Get sorted unique integer values
    int_unique = unique(int_array, sorted=.true.)
    print *, "Sorted unique integers:", int_unique

    ! Get unique real values
    real_unique = unique(real_array)
    print *, "Unique reals:", real_unique

    ! Get sorted unique real values
    real_unique = unique(real_array, sorted=.true.)
    print *, "Sorted unique reals:", real_unique

    ! Get unique character values
    char_unique = unique(char_array)
    print *, "Unique strings:"
    do i = 1, size(char_unique)
        print *, char_unique(i)
    end do

    ! Get unique string_type values (sorted)
    string_unique = unique(string_array, sorted=.true.)
    print *, "Sorted unique string_type values:"
    do i = 1, size(string_unique)
        print *, string_unique(i)
    end do
end program example_unique

src/CMakeLists.txt

+2
@@ -48,6 +48,8 @@ set(fppFiles
 stdlib_sorting_ord_sort.fypp
 stdlib_sorting_sort.fypp
 stdlib_sorting_sort_index.fypp
+stdlib_sorting_unique.fypp
+stdlib_sorting_unique_impl.fypp
 stdlib_sparse_constants.fypp
 stdlib_sparse_conversion.fypp
 stdlib_sparse_kinds.fypp

src/stdlib_sorting.fypp

+6-3
@@ -70,7 +70,7 @@
 !! in the Fortran Standard Library under the MIT license provided
 !! we cite:
 !!
-!! Musser, D.R., Introspective Sorting and Selection Algorithms,
+!! Musser, D.R., "Introspective Sorting and Selection Algorithms,"
 !! Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
 !!
 !! as the official source of the algorithm.
@@ -135,13 +135,16 @@ module stdlib_sorting
 
 use stdlib_bitsets, only: bitset_64, bitset_large, &
 assignment(=), operator(>), operator(>=), operator(<), operator(<=)
-
+
+use stdlib_sorting_unique, only: unique
+
 implicit none
 private
 
 integer, parameter, public :: int_index = int64 !! Integer kind for indexing
 integer, parameter, public :: int_index_low = int32 !! Integer kind for indexing using less than `huge(1_int32)` values
-
+
+public :: unique
 
 ! Constants for use by tim_sort
 integer, parameter :: &
