Commit 5d4c8e7

Adding joss paper.md
1 parent 9c9fb19 commit 5d4c8e7

2 files changed: +118 -0 lines changed

docs/paper/paper.bib

+46
@@ -0,0 +1,46 @@
@article{dalcin18,
  author = {Lisandro Dalc{\'{\i}}n and
            Mikael Mortensen and
            David E. Keyes},
  title = {Fast parallel multidimensional {FFT} using advanced {MPI}},
  journal = {CoRR},
  volume = {abs/1804.09536},
  year = {2018},
  url = {http://arxiv.org/abs/1804.09536},
  archivePrefix = {arXiv},
  eprint = {1804.09536}
}
@article{mortensen_joss,
  author = {Mortensen, Mikael},
  year = {2018},
  title = {Shenfun: High performance spectral {G}alerkin computing platform},
  journal = {Journal of Open Source Software},
  volume = {3},
  number = {31},
  pages = {1071},
  doi = {10.21105/joss.01071}
}

@inproceedings{mortensen17,
  author = {Mortensen, Mikael},
  booktitle = {MekIT'17 - Ninth national conference on Computational Mechanics},
  isbn = {978-84-947311-1-2},
  pages = {273--298},
  publisher = {International Center for Numerical Methods in Engineering (CIMNE)},
  title = {Shenfun - automating the spectral {G}alerkin method},
  editor = {Skallerud, Bjorn Helge and Andersson, Helge Ingolf},
  year = {2017}
}

@article{strang,
  issn = {00030996},
  url = {http://www.jstor.org/stable/29775194},
  author = {Gilbert Strang},
  journal = {American Scientist},
  number = {3},
  pages = {250--255},
  publisher = {Sigma Xi, The Scientific Research Society},
  title = {Wavelets},
  volume = {82},
  year = {1994}
}

docs/paper/paper.md

+72
@@ -0,0 +1,72 @@
---
title: 'Mpi4py-fft'
tags:
  - Fast Fourier transforms
  - Fast Chebyshev transforms
  - MPI
  - Python
authors:
  - name: Mikael Mortensen
    orcid: 0000-0002-3293-7573
    affiliation: "1"
  - name: Lisandro Dalcin
    orcid: 0000-0001-8086-0155
    affiliation: "2"
affiliations:
  - name: University of Oslo, Department of Mathematics
    index: 1
  - name: King Abdullah University of Science and Technology, Extreme Computing Research Center
    index: 2
date: 7 November 2018
bibliography: paper.bib
---

# Summary

The fast Fourier transform (FFT) is an algorithm that efficiently computes the
discrete Fourier transform. The FFT is one of the most important algorithms
utilized throughout science and society, and it has been named *the most
important numerical algorithm of our time* by Prof. Gilbert Strang [@strang].

``Mpi4py-fft`` (https://bitbucket.org/mpi4py/mpi4py-fft) is an open-source
Python package for computing (in parallel) FFTs of possibly very large and
distributed multidimensional arrays.
A multidimensional FFT is computed sequentially over all axes, one axis at a time.
A problem with parallel FFTs is that, to fit in the memory of multiple processors,
a multidimensional array will be distributed along some, but not all, of its axes.
Consequently, parallel FFTs are computed as sequential (serial) transforms over
non-distributed axes, combined with global redistributions (using MPI) that
realign the arrays for further serial transforms. A parallel FFT is, in other
words, computed as a combination of serial FFTs and global redistributions.
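
As a minimal sketch of this idea (not the ``mpi4py-fft`` implementation), the
following NumPy code mimics a slab-decomposed two-dimensional FFT on a single
process. The function name `simulated_slab_fft2`, the parameter `nprocs` and
the use of list entries as stand-ins for MPI ranks are assumptions made for
illustration. The global redistribution, which would be an MPI exchange in a
real parallel run, is imitated here by concatenating and re-splitting the array.

```python
import numpy as np


def simulated_slab_fft2(u, nprocs):
    """2D FFT of `u`, computed the way a slab-decomposed parallel FFT would be.

    The "ranks" are plain list entries; no MPI is involved (illustrative only).
    """
    # Distribute axis 0: each simulated rank owns a slab of complete rows.
    slabs = np.array_split(u, nprocs, axis=0)

    # Serial transforms over the non-distributed axis 1, one rank at a time.
    slabs = [np.fft.fft(s, axis=1) for s in slabs]

    # Stand-in for the global redistribution: realign the data so that
    # axis 0 becomes local and axis 1 becomes the distributed axis.
    full = np.concatenate(slabs, axis=0)
    slabs = np.array_split(full, nprocs, axis=1)

    # Serial transforms over axis 0, which is now non-distributed.
    slabs = [np.fft.fft(s, axis=0) for s in slabs]

    # Gather the pieces, only so the result can be compared with a reference.
    return np.concatenate(slabs, axis=1)


rng = np.random.default_rng(0)
u = rng.standard_normal((8, 6)) + 1j * rng.standard_normal((8, 6))
u_hat = simulated_slab_fft2(u, nprocs=3)
```

The result agrees with ``numpy.fft.fft2(u)`` because the one-dimensional
transforms along different axes commute; only the data layout changes between
the two serial steps.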

For global redistribution ``mpi4py-fft`` makes use of a new and completely
generic algorithm [@dalcin18] that allows for any index sets of a
multidimensional array to be distributed. We can distribute just one index
set (a slab decomposition), two index sets (a pencil decomposition) or even more
for higher-dimensional arrays. The required MPI communications are always handled
under the hood by MPI for Python. For serial transforms
``mpi4py-fft`` wraps most of the FFTW library using Cython, making it callable
from Python. We include wrappers for complex-to-complex, real-to-complex,
complex-to-real and real-to-real transforms.
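
To illustrate the difference between the two decompositions, the sketch below
computes the local block shape that one process would own under a balanced
block distribution. The helper `local_shape`, its `grid` and `coords`
parameters, and the rule that the first processes along an axis absorb any
remainder are assumptions made for this example; they are not taken from the
``mpi4py-fft`` source.

```python
def local_shape(global_shape, grid, coords):
    """Shape of the block owned by the process at `coords` (hypothetical helper).

    grid[i] is the number of processes along axis i (1 means the axis is
    not distributed). Any remainder goes to the first processes on that axis.
    """
    shape = []
    for n, p, c in zip(global_shape, grid, coords):
        base, extra = divmod(n, p)
        shape.append(base + (1 if c < extra else 0))
    return tuple(shape)


N = (128, 128, 128)

# Slab decomposition: four processes, only axis 0 is distributed.
slab = local_shape(N, grid=(4, 1, 1), coords=(0, 0, 0))

# Pencil decomposition: four processes, axes 0 and 1 are distributed.
pencil = local_shape(N, grid=(2, 2, 1), coords=(0, 0, 0))

print(slab)    # (32, 128, 128)
print(pencil)  # (64, 64, 128)
```

With a slab decomposition the number of processes cannot exceed the length of
the single distributed axis (128 here), whereas a pencil decomposition can use
up to the product of the two distributed axis lengths.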

``Mpi4py-fft`` is highly configurable in how it distributes and redistributes
arrays. Large arrays may be globally redistributed for alignment
along any given axis, whenever needed by the user. This
flexibility has enabled the development of ``shenfun``
[@mortensen_joss; @mortensen17], which is a computing platform
for solving partial differential equations (PDEs) by the spectral Galerkin method.
In ``shenfun`` it is possible to solve PDEs of any given dimensionality by creating
tensor product bases as outer products of one-dimensional bases. This leads to
large multidimensional arrays that are distributed effortlessly using ``mpi4py-fft``.

``Mpi4py-fft`` can be utilized by anyone who needs to perform FFTs on large
multidimensional arrays. It is installable from ``pypi`` and conda-forge, and
released under a permissive 2-clause BSD license, in the hope that it will be
useful.

# Acknowledgements

M Mortensen acknowledges support from the 4DSpace Strategic Research Initiative at the
University of Oslo.

# References
