diff --git a/_data/learning.yml b/_data/learning.yml index def5dd289..46abe662f 100644 --- a/_data/learning.yml +++ b/_data/learning.yml @@ -53,6 +53,23 @@ books: - link: /learn/os_setup/ides - link: /learn/os_setup/tips + - title: Fortran Best Practices + description: This tutorial collects a modern canonical way of doing things in Fortran. + category: Getting started + link: /learn/best_practices + pages: + - link: /learn/best_practices/style_guide + - link: /learn/best_practices/floating_point + - link: /learn/best_practices/integer_division + - link: /learn/best_practices/modules_programs + - link: /learn/best_practices/arrays + - link: /learn/best_practices/multidim_arrays + - link: /learn/best_practices/element_operations + - link: /learn/best_practices/allocatable_arrays + - link: /learn/best_practices/file_io + - link: /learn/best_practices/callbacks + - link: /learn/best_practices/type_casting + # Web links listed at the bottom of the 'Learn' landing page # reference-links: diff --git a/learn/best_practices.md b/learn/best_practices.md deleted file mode 100644 index 7ecfc63ef..000000000 --- a/learn/best_practices.md +++ /dev/null @@ -1,5 +0,0 @@ ---- -layout: page -title: Fortran best practices -permalink: /learn/best_practices ---- \ No newline at end of file diff --git a/learn/best_practices/allocatable_arrays.md b/learn/best_practices/allocatable_arrays.md new file mode 100644 index 000000000..7bf5e9296 --- /dev/null +++ b/learn/best_practices/allocatable_arrays.md @@ -0,0 +1,131 @@ +--- +layout: book +title: Allocatable Arrays +permalink: /learn/best_practices/allocatable_arrays +--- + +The ``allocatable`` attribute provides safe way for memory handling. +In comparison to variables with ``pointer`` attribute the memory is managed +automatically and will be deallocated automatically once the variable goes +out-of-scope. Using ``allocatable`` variables removes the possibility to +create memory leaks in an application. + +They can be used in subroutines to create scratch or work arrays, where +automatic arrays would become too large to fit on the stack. + +```fortran +real(dp), allocatable :: temp(:) +allocate(temp(10)) +``` + +The allocation status can be checked using the ``allocated`` intrinsic +to avoid uninitialized access + +```fortran +subroutine show_arr(arr) + integer, allocatable, intent(in) :: arr(:) + + if (allocated(arr)) then + print *, arr + end if +end subroutine show_arr +``` + +To allocate variables inside a procedure the dummy argument has to carry +the ``allocatable`` attribute. Using it in combination with ``intent(out)`` +will deallocate previous allocations before entering the procedure: + +```fortran +subroutine foo(lam) + real(dp), allocatable, intent(out) :: lam(:) + allocate(lam(5)) +end subroutine foo +``` + +The allocated array can be used afterwards like a normal array + +```fortran +real(dp), allocatable :: lam(:) +call foo(lam) +``` + +An already allocated array cannot be allocated again without prior deallocation. +Similarly, deallocation can only be invoked for allocated arrays. To reallocate +an array use + +```fortran +if (allocated(lam)) deallocate(lam) +allocate(lam(10)) +``` + +Passing allocated arrays to procedures does not require the ``allocatable`` attribute +for the dummy arguments anymore. + +```fortran +subroutine show_arr(arr) + integer, intent(in) :: arr(:) + + print *, arr +end subroutine show_arr + +subroutine proc + integer :: i + integer, allocatable :: arr + + allocate(arr(5)) + + do i = 1, size(arr) + arr(i) = 2*i + 1 + end do + call show_arr(arr) +end subroutine proc +``` + +Passing an unallocated array in this context will lead to an invalid memory access. +Allocatable arrays can be passed to ``optional`` dummy arguments, if they are unallocated +the argument will not be present. The ``allocatable`` attribute is not limited to +arrays and can also be associated with scalars, which can be useful in combination +with ``optional`` dummy arguments. + +Allocations can be moved between different arrays with ``allocatable`` attribute +using the ``move_alloc`` intrinsic subroutine. + +```fortran +subroutine resize(var, n) + real(wp), allocatable, intent(inout) :: var(:) + integer, intent(in), optional :: n + integer :: this_size, new_size + integer, parameter :: inital_size = 16 + + if (allocated(var)) then + this_size = size(var, 1) + call move_alloc(var, tmp) + else + this_size = initial_size + end if + + if (present(n)) then + new_size = n + else + new_size = this_size + this_size/2 + 1 + end if + + allocate(var(new_size)) + + if (allocated(tmp)) then + this_size = min(size(tmp, 1), size(var, 1)) + var(:this_size) = tmp(:this_size) + end if +end subroutine resize +``` + +Finally, allocations do not initialize the array, the content of the uninitialized +array is most likely just the bytes of whatever was previously at the respective address. +The allocation supports initialization using the source attribute: + +```fortran +real(dp), allocatable :: arr(:) +allocate(arr(10), source=0.0_dp) +``` + +The ``source`` keyword supports scalar and array valued variables and constants. diff --git a/learn/best_practices/arrays.md b/learn/best_practices/arrays.md new file mode 100644 index 000000000..05d302154 --- /dev/null +++ b/learn/best_practices/arrays.md @@ -0,0 +1,200 @@ +--- +layout: book +title: Arrays +permalink: /learn/best_practices/arrays +--- + +Arrays are a central object in Fortran. The creation of dynamic sized arrays +is discussed in the [allocatable arrays arrays](./allocatable_arrays.html). + +To pass arrays to procedures four ways are available + +1. *assumed-shape* arrays +4. *assumed-rank* arrays +2. *explicit-shape* arrays +3. *assumed-size* arrays + +The preferred way to pass arrays to procedures is as *assumed-shape* arrays + +```fortran +subroutine f(r) + real(dp), intent(out) :: r(:) + integer :: n, i + n = size(r) + do i = 1, n + r(i) = 1.0_dp / i**2 + end do +end subroutine f +``` + +Higher dimensional arrays can be passed in a similar way. + +```fortran +subroutine g(A) + real(dp), intent(in) :: A(:, :) + ... +end subroutine g +``` + +The array is simply passed by + +```fortran +real(dp) :: r(5) +call f(r) +``` + +In this case no array copy is done, which has the advantage that the shape and size +information is automatically passed along and checked at compile and optionally at +runtime. +Similarly, array strides can be passed without requiring a copy of the array but as +*assumed-shape* discriptor: + +```fortran +real(dp) :: r(10) +call f(r(1:10:2)) +call f(r(2:10:2)) +``` + +This should always be your default way of passing arrays in and out of subroutines. +Avoid passing arrays as whole slices, as it obfuscates the actual intent of the code: + +```fortran +real(dp) :: r(10) +call f(r(:)) +``` + +In case more general arrays should be passed to a procedure the *assumed-rank* +functionality introduced in the Fortran 2018 standard can be used + +```fortran +subroutine h(r) + real(dp), intent(in) :: r(..) + select rank(r) + rank(1) + ! ... + rank(2) + ! ... + end select +end subroutine h +``` + +The actual rank can be queried at runtime using the ``select rank`` construct. +This easily allows to create more generic functions that have to deal with +differnet array ranks. + +*Explicit-shape* arrays can be useful for returning data from functions. +Most of their functionality can be provided by *assumed-shape* and *assumed-rank* +arrays but they find frequent use for interfacing with C or in legacy Fortran +procedures, therefore they will be discussed briefly here. + +To use *explicit-shape* arrays, the dimension has to be passed explicitly as dummy +argument like in the example below + +``` fortran +subroutine f(n, r) + integer, intent(in) :: n + real(dp), intent(out) :: r(n) + integer :: i + do i = 1, n + r(i) = 1.0_dp / i**2 + end do +end subroutine +``` + +For high-dimensional arrays additional indices have to be passed. + +``` fortran +subroutine g(m, n, A) + integer, intent(in) :: m, n + real(dp), intent(in) :: A(m, n) + ... +end subroutine +``` + +The routines can be invoked by + +``` fortran +real(dp) :: r(5), s(3, 4) +call f(size(r), r) +call g(size(s, 1), size(s, 2), s) +``` + +Note that the shape is not checked, therefore the following would be valid code +with will potentially yield incorrect results: + +```fortran +real(dp) :: s(3, 4) +call g(size(s), 1, s) ! s(12, 1) in g +call g(size(s, 2), size(s, 1), s) ! s(4, 3) in g +``` + +In this case the memory layout is preserved but the shape is changed. +Also, *explicit-shape* arrays require contiguous memory and will create temporary +arrays in case non-contiguous array strides are passed. + +To return an array from a function with *explicit-shape* use + +``` fortran +function f(n) result(r) + integer, intent(in) :: n + real(dp) :: r(n) + integer :: i + do i = 1, n + r(i) = 1.0_dp / i**2 + end do +end function +``` + +Finally, there are *assumed-size* arrays, which provide the least compile and runtime +checking and can be found be found frequently in legacy code, they should be avoided +in favour of *assumed-shape* or *assumed-rank* arrays. +An *assumed-size* array dummy argument is identified by an asterisk as the last dimension, +this disables the usage of this array with many intrinsic functions, like ``size`` or +``shape``. + +To check for the correct size and shape of an *assumed-shape* array the ``size`` and +``shape`` intrinsic functions can be used to query for those properties + +```fortran +if (size(r) /= 4) error stop "Incorrect size of 'r'" +if (any(shape(r) /= [2, 2])) error stop "Incorrect shape of 'r'" +``` + +Note that ``size`` returns the total size of all dimensions, to obtain the shape of +a specific dimension add it as second argument to the function. + +Arrays can be initialized by using an array constructor + +```fortran +integer :: r(5) +r = [1, 2, 3, 4, 5] +``` + +The array constructor can be annoted with the type of the constructed array + +```fortran +real(dp) :: r(5) +r = [real(dp) :: 1, 2, 3, 4, 5] +``` + +Implicit do loops can be used inside an array constructor as well + +```fortran +integer :: i +real(dp) :: r(5) +r = [(real(i**2, dp), i = 1, size(r))] +``` + +In order for the array to start with different index than 1, do: + +```fortran +subroutine print_eigenvalues(kappa_min, lam) + integer, intent(in) :: kappa_min + real(dp), intent(in) :: lam(kappa_min:) + + integer :: kappa + do kappa = kappa_min, ubound(lam, 1) + print *, kappa, lam(kappa) + end do +end subroutine print_eigenvalues +``` diff --git a/learn/best_practices/callbacks.md b/learn/best_practices/callbacks.md new file mode 100644 index 000000000..be7121e35 --- /dev/null +++ b/learn/best_practices/callbacks.md @@ -0,0 +1,93 @@ +--- +layout: book +title: Callbacks +permalink: /learn/best_practices/callbacks +--- + +A callback is a function that is passed as an argument to another function. + +The preferred way of creating such a callback is to provide an *abstract interface* +declaring the signature of the callback. This allows to use compile time checks +for the passed callback. + +```fortran +module integrals + use types, only: dp + implicit none + private + public :: simpson, integratable_function + + abstract interface + function integratable_function(x) result(func) + import :: dp + real(dp), intent(in) :: x + real(dp) :: func + end function + end interface + +contains + + function simpson(f, a, b) result(s) + real(dp), intent(in) :: a, b + procedure(integratable_function) :: f + real(dp) :: s + + s = (b-a) / 6 * (f(a) + 4*f((a+b)/2) + f(b)) + end function simpson + +end module integrals +``` + +The function can than be used with a callback by importing the module +as shown in the following example + +```fortran +module demo_functions + use types, only: dp + implicit none + private + public :: test_integral + +contains + + subroutine test_integral(a, k) + real(dp), intent(in) :: a, k + + print *, simpson(f, 0._dp, pi) + print *, simpson(f, 0._dp, 2*pi) + contains + + function f(x) result(y) + real(dp), intent(in) :: x + real(dp) :: y + y = a*sin(k*x) + end function f + end subroutine test_integral + +end module demo_functions +``` + +Exporting the abstract interface allows to create procedure pointers with the +correct signature and also to extend the callback further like shown here + +```fortran +module demo_integrals + use types, only: dp + use integrals, only: simpson, integratable_function + implicit none + private + public :: simpson2, integratable_function + +contains + + function simpson2(f, a, b) result(s) + real(dp), intent(in) :: a, b + procedure(integratable_function) :: f + real(dp) :: s + real(dp) :: mid + mid = (a + b)/2 + s = simpson(f, a, mid) + simpson(f, mid, b) + end function simpson2 + +end module demo_integrals +``` diff --git a/learn/best_practices/element_operations.md b/learn/best_practices/element_operations.md new file mode 100644 index 000000000..241657412 --- /dev/null +++ b/learn/best_practices/element_operations.md @@ -0,0 +1,128 @@ +--- +layout: book +title: Element-wise Operations on Arrays +permalink: /learn/best_practices/element_operations +--- + +There are three approaches to perform element-wise operations on arrays when using subroutines and functions: + +- `elemental` procedures +- *explicit-shape* arrays +- implementing the operation for vectors and write simple wrapper + subroutines (that use `reshape` internally) for each array shape + +In the first approach, one uses the `elemental` keyword to create a +function like this: + +``` fortran +real(dp) elemental function nroot(n, x) result(y) + integer, intent(in) :: n + real(dp), intent(in) :: x + y = x**(1._dp / n) +end function +``` + +All arguments (in and out) must be scalars. You can then use this +function with arrays of any (compatible) shape, for example: + +``` fortran +print *, nroot(2, 9._dp) +print *, nroot(2, [1._dp, 4._dp, 9._dp, 10._dp]) +print *, nroot(2, reshape([1._dp, 4._dp, 9._dp, 10._dp], [2, 2])) +print *, nroot([2, 3, 4, 5], [1._dp, 4._dp, 9._dp, 10._dp]) +print *, nroot([2, 3, 4, 5], 4._dp) +``` + +The output will be: + +``` fortran +3.0000000000000000 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +1.0000000000000000 1.5874010519681994 1.7320508075688772 1.5848931924611136 +2.0000000000000000 1.5874010519681994 1.4142135623730951 1.3195079107728942 +``` + +In the above, typically `n` is a parameter and `x` is the array of an +arbitrary shape, but as you can see, Fortran does not care as long as +the final operation makes sense (if one argument is an array, then the +other arguments must be either arrays of the same shape or scalars). If +it does not, you will get a compiler error. + +The `elemental` keyword implies the `pure` keyword, so the procedure +must be pure. It results that `elemental procedures` can only use `pure` procedures and have no side effects. + +If the elemental procedure algorithm can be made faster using array +operations inside, or if for some reasons the arguments must be arrays of +incompatible shapes, then one should use the other two approaches. One +can make `nroot` operate on a vector and write a simple wrapper for +other array shapes, e.g.: + +``` fortran +function nroot(n, x) result(y) + integer, intent(in) :: n + real(dp), intent(in) :: x(:) + real(dp) :: y(size(x)) + y = x**(1._dp / n) +end function + +function nroot_0d(n, x) result(y) + integer, intent(in) :: n + real(dp), intent(in) :: x + real(dp) :: y + real(dp) :: tmp(1) + tmp = nroot(n, [x]) + y = tmp(1) +end function + +function nroot_2d(n, x) result(y) + integer, intent(in) :: n + real(dp), intent(in) :: x(:, :) + real(dp) :: y(size(x, 1), size(x, 2)) + y = reshape(nroot(n, reshape(x, [size(x)])), [size(x, 1), size(x, 2)]) +end function +``` + +And use as follows: + +``` fortran +print *, nroot_0d(2, 9._dp) +print *, nroot(2, [1._dp, 4._dp, 9._dp, 10._dp]) +print *, nroot_2d(2, reshape([1._dp, 4._dp, 9._dp, 10._dp], [2, 2])) +``` + +This will print: + +``` fortran +3.0000000000000000 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +``` + +Or one can use *explicit-shape* arrays as +follows: + +``` fortran +function nroot(n, k, x) result(y) + integer, intent(in) :: n, k + real(dp), intent(in) :: x(k) + real(dp) :: y(k) + y = x**(1._dp / n) +end function +``` + +Use as follows: + +``` fortran +print *, nroot(2, 1, [9._dp]) +print *, nroot(2, 4, [1._dp, 4._dp, 9._dp, 10._dp]) +print *, nroot(2, 4, reshape([1._dp, 4._dp, 9._dp, 10._dp], [2, 2])) +``` + +The output is the same as before: + +``` fortran +3.0000000000000000 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +1.0000000000000000 2.0000000000000000 3.0000000000000000 3.1622776601683795 +``` diff --git a/learn/best_practices/file_io.md b/learn/best_practices/file_io.md new file mode 100644 index 000000000..00c893160 --- /dev/null +++ b/learn/best_practices/file_io.md @@ -0,0 +1,101 @@ +--- +layout: book +title: File Input/Output +permalink: /learn/best_practices/file_io +--- + +In Fortran files are managed by unit identifiers. Interaction with the filesystem +mainly happens through the ``open`` and ``inquire`` built-in procedures. +Generally, the workflow is to open a file to a unit identifier, read and/or write +to it and close it again. + +```fortran +integer :: io +open(newunit=io, file="log.txt") +! ... +close(io) +``` + +By default the file will be created if it is not existing already and opened for +both reading and writing. Writing to an existing file will start in the first +record (line) and therefore overwrite the file by default. + +To create a read-only access to a file the ``status`` and ``action`` have to be +specified with + +```fortran +integer :: io +open(newunit=io, file="log.txt", status="old", action="read") +read(io, *) a, b +close(io) +``` + +In case the file is not present a runtime error will occur. To check for the existence +of a file prior to opening it the ``inquire`` function can be used + +```fortran +logical :: exists +inquire(file="log.txt", exist=exists) +if (exists) then + ! ... +end if +``` + +Alternatively, the ``open`` procedure can return an optional *iostat* and *iomsg*: + +```fortran +integer :: io, stat +character(len=512) :: msg +open(newunit=io, file="log.txt", status="old", action="read", & + iostat=stat, iomsg=msg) +if (stat /= 0) then + print *, trim(msg) +end if +``` + +Note that *iomsg* requires a fixed-length character variable with sufficent storage +size to hold the error message. + +Similarly, writing to a file happens by using the *status* and *action* keyword. +To create a new file use + +```fortran +integer :: io +open(newunit=io, file="log.txt", status="new", action="write") +write(io, *) a, b +close(io) +``` + +Alternatively, ``status="replace"`` can be used to overwrite an existing file, +it is highly recommended to first check for the existence of a file before deciding +on the *status* to use. +To append to an output file the *position* keyword can be specified explicitly with + +```fortran +integer :: io +open(newunit=io, file="log.txt", position="append", & + & status="old", action="write") +write(io, *) size(v) +write(io, *) v(:) +close(io) +``` + +To reset the position in a file the built-in procedures ``rewind`` and ``backspace`` +can be used. ``rewind`` will reset to the first record (line), while ``backspace`` will +return to the previous record (line). + +Finally, to delete a file the file has to be opened and can be deleted after closing +with + +```fortran +logical :: exists +integer :: io, stat +inquire(file="log.txt", exist=exists) +if (exists) then + open(file="log.txt", newunit=io, iostat=stat) + if (stat == 0) close(io, status="delete", iostat=stat) +end if +``` + +A useful IO feature are scratch files, which can be opened with ``status="scratch"``. +They are automatically deleted after closing the unit identifier. diff --git a/learn/best_practices/floating_point.md b/learn/best_practices/floating_point.md new file mode 100644 index 000000000..2ecd9dbc0 --- /dev/null +++ b/learn/best_practices/floating_point.md @@ -0,0 +1,87 @@ +--- +layout: book +title: Floating Point Numbers +permalink: /learn/best_practices/floating_point +--- + +The default representation of floating point numbers is using single precision +(usually 32 bits / 4 bytes). For most application a higher precision is required. +For this purpose a custom kind parameter can be defined. +The recommended way of defining kind parameters is to use + +```fortran +integer, parameter :: dp = selected_real_kind(15) +``` + +For many purposes it also suffices to directly infer the kind parameter from +a literal like here + +```fortran +integer, parameter :: dp = kind(0.0d0) +``` + +or to rename the imported kind parameter from the ``iso_fortran_env`` module + +```fortran +use, intrinsic :: iso_fortran_env, only : dp => real64 +``` + +For some insightful thoughts on kind parameters see +Doctor Fortran in it takes all KINDs. + +It is recommended to have a central module to define kind parameters and include +them with use as necessary. An example for such a module is given with + +``` +!> Numerical storage size parameters for real and integer values +module kind_parameter + implicit none + public + + !> Single precision real numbers, 6 digits, range 10⁻³⁷ to 10³⁷-1; 32 bits + integer, parameter :: sp = selected_real_kind(6, 37) + !> Double precision real numbers, 15 digits, range 10⁻³⁰⁷ to 10³⁰⁷-1; 64 bits + integer, parameter :: dp = selected_real_kind(15, 307) + !> Quadruple precision real numbers, 33 digits, range 10⁻⁴⁹³¹ to 10⁴⁹³¹-1; 128 bits + integer, parameter :: qp = selected_real_kind(33, 4931) + + !> Char length for integers, range -2⁷ to 2⁷-1; 8 bits + integer, parameter :: i1 = selected_int_kind(2) + !> Short length for integers, range -2¹⁵ to 2¹⁵-1; 16 bits + integer, parameter :: i2 = selected_int_kind(4) + !> Length of default integers, range -2³¹ to 2³¹-1; 32 bits + integer, parameter :: i4 = selected_int_kind(9) + !> Long length for integers, range -2⁶³ to 2⁶³-1; 64 bits + integer, parameter :: i8 = selected_int_kind(18) + +end module kind_parameter +``` + +Floating point constants should always be declared including a kind parameter suffix: + +```fortran +real(dp) :: a, b, c +a = 1.0_dp +b = 3.5_dp +c = 1.34e8_dp +``` + +It is safe to assign integers to floating point numbers without losing accuracy: + +```fortran +real(dp) :: a +a = 3 +``` + +In order to impose floating point division (as opposed to integer +division `1/2` equal to `0`), one can convert the integer to a floating +point number by: + +```fortran +real(dp) :: a +a = real(1, dp) / 2 ! 'a' is equal to 0.5_dp +``` + +To print floating point numbers without losing precision use the unlimited +format specificer ``(g0)`` or the exponential representation ``(es24.16e3)``, +which will give you 17 significant digits of printout. diff --git a/learn/best_practices/index.md b/learn/best_practices/index.md new file mode 100644 index 000000000..2abed3bce --- /dev/null +++ b/learn/best_practices/index.md @@ -0,0 +1,12 @@ +--- +layout: book +title: Fortran Best Practices +permalink: /learn/best_practices +author: Ondřej Čertík, John Pask, Jed Brown, Matthew Emmett, Juan Luis Cano Rodríguez, Neil Carlson, Andrea Vigliotti, Pierre Haessig, Vincent Magnin, Sebastian Ehlert, Jeremie Vandenplas +--- + +This mini-book collects a modern canonical way of doing things in Fortran. +It serves as a style guide and best practice recommendation for popular topics +and common tasks. Generally, a cononical solution or pattern is presented and +discussed. It is meant for programmers with basic familarity of the Fortran syntax +and programming in general. diff --git a/learn/best_practices/integer_division.md b/learn/best_practices/integer_division.md new file mode 100644 index 000000000..b3fb26f6c --- /dev/null +++ b/learn/best_practices/integer_division.md @@ -0,0 +1,32 @@ +--- +layout: book +title: Integer Division +permalink: /learn/best_practices/integer_division +--- + +Fortran distinguishs between floating point and integer arithmetic. +It is important to note that division for integers is always using +integer arithmetic. +Consider the following example for integer division of an odd number: + +```fortran +integer :: n +n = 3 +print *, n / 2 ! prints 1 +print *, n*(n + 1)/2 ! prints 6 +print *, n/2*(n + 1) ! prints 4 +n = -3 +print *, n / 2 ! prints -1 +``` + +Be careful about whether you want to actually use integer arithmetic +in this context. If you want to use floating point arithmetic instead +make sure to cast to reals before using the division operator: + +```fortran +integer :: n +n = 3 +print *, real(n, dp) / 2 ! prints 1.5 +n = -3 +print *, real(n, dp) / 2 ! prints -1.5 +``` diff --git a/learn/best_practices/modules_programs.md b/learn/best_practices/modules_programs.md new file mode 100644 index 000000000..a8a702ee5 --- /dev/null +++ b/learn/best_practices/modules_programs.md @@ -0,0 +1,148 @@ +--- +layout: book +title: Modules and Programs +permalink: /learn/best_practices/modules_programs +--- + +Modules are the preferred way create modern Fortran libraries and applications. +As a convention, one source file should always contain only one module, while +the module name should match the filepath to allow easy navigation in larger +projects. It is also recommended to prefix module names with the library name +to avoid nameclashes when used as dependency in other projects. + +An example for such a module file is given here + +``` +!> Interface to TOML processing library. +!> +!> ... +module fpm_toml + use fpm_error, only : error_t, fatal_error, file_not_found_error + use fpm_strings, only : string_t + use tomlf, only : toml_table, toml_array, toml_key, toml_stat, get_value, & + & set_value, toml_parse, toml_error, new_table, add_table, add_array, & + & toml_serializer, len + implicit none + private + + public :: read_package_file + public :: toml_table, toml_array, toml_key, toml_stat, get_value, set_value + public :: new_table, add_table, add_array, len + public :: toml_error, toml_serializer, toml_parse + +contains + + !> Process the configuration file to a TOML data structure + subroutine read_package_file(table, manifest, error) + !> TOML data structure + type(toml_table), allocatable, intent(out) :: table + !> Name of the package configuration file + character(len=*), intent(in) :: manifest + !> Error status of the operation + type(error_t), allocatable, intent(out) :: error + ! ... + end subroutine read_package_file + +end module fpm_toml +``` + +There are a few things in this example module to highlight. First, every module +starts with comments documenting the purpose and content of the module, +similarly, every procedure starts with a comment briefly describing its +purse and the intent of the dummy arguments. Documentation is one of the most +important parts of creating long-living software, regardless of language. + +Second, imports (*use*) and exports (*public*) are explicitly given, this +allows on a glance at the module source to check the used and available +procedures, constants and derived types. The imports are usually limited +to the module scope rather than reimported in every procedure or interface +scope. Similarly, exports are made explicitly by adding a *private* statement +one a single line and explicitly listing all exported symbols in *public* +statements. + +Finally, the `implicit none` statement works for the whole module and there +is no need to repeat it within each procedure. + +Variables inside a module are static (*implicitly saved*), it is highly +recommended to limit the usage of module variables to constant expressions, +like parameters or enumerators only or export them as *protected* rather +than *public*. + +Submodules can be used to break long dependency chains and shorten recompilation +cascades in Fortran programs. They also offer the possibility to provide specialized +and optimized implementations without requiring the use of preprocessor. + +An example from the Fortran standard library is the quadrature module, which +only defines interfaces to module procedures, but no implementations + +```fortran +!> Numerical integration +!> +!> ... +module stdlib_quadrature + use stdlib_kinds, only: sp, dp, qp + implicit none + private + + public :: trapz + ! ... + + !> Integrates sampled values using trapezoidal rule + interface trapz + pure module function trapz_dx_dp(y, dx) result(integral) + real(dp), intent(in) :: y(:) + real(dp), intent(in) :: dx + real(dp) :: integral + end function trapz_dx_dp + module function trapz_x_dp(y, x) result(integral) + real(dp), intent(in) :: y(:) + real(dp), intent(in) :: x(:) + real(dp) :: integral + end function trapz_x_dp + end interface trapz + + ! ... +end module stdlib_quadrature +``` + +While the implementation is provided in separate submodules like the one for the +trapezoidal integration rule given here. + +```fortran +!> Actual implementation of the trapezoidal integration rule +!> +!> ... +submodule (stdlib_quadrature) stdlib_quadrature_trapz + use stdlib_error, only: check + implicit none + +contains + + pure module function trapz_dx_dp(y, dx) result(integral) + real(dp), intent(in) :: y(:) + real(dp), intent(in) :: dx + real(dp) :: integral + integer :: n + + n = size(y) + select case (n) + case (0:1) + integral = 0.0_dp + case (2) + integral = 0.5_dp*dx*(y(1) + y(2)) + case default + integral = dx*(sum(y(2:n-1)) + 0.5_dp*(y(1) + y(n))) + end select + end function trapz_dx_dp + + ! ... +end submodule stdlib_quadrature_trapz +``` + +Note that the module procedures do not have to be implemented in the same submodule. +Several submodules can be used to reduce the compilation load for huge modules. + +Finally, when setting up a program, it is recommended to keep the actual implementations +in the program body at minimum. Reusing implementations from modules allows to write +reusable code and focus in the program unit only on translating user input to the +respective library functions and objects. diff --git a/learn/best_practices/multidim_arrays.md b/learn/best_practices/multidim_arrays.md new file mode 100644 index 000000000..0d64ee0cf --- /dev/null +++ b/learn/best_practices/multidim_arrays.md @@ -0,0 +1,70 @@ +--- +layout: book +title: Multidimensional Arrays +permalink: /learn/best_practices/multidim_arrays +--- + +Multidimensional arrays are stored in column-major order. This means the +left-most (inner-most) index addresses elements contiguously. +From a practical point this means that the array slice ``V(:, 1)`` is +contiguous, while the stride between elements in the slice ``V(1, :)`` +is the dimension of the columns. This is important when passing array +slices to procedures which expect to work on contiguous data. + +The locality of the memory is important to consider depending on +your application, usually when performing operations on a multidimensional +the sequential access should always advance in unity strides. + +In the following example the inverse distance between two sets of points +is evaluated, note that the points are stored contiguously in the arrays +``xyz1``/``xyz2``, while the inner-most loop is advancing the left-most +index of the matrix ``a``. + +```fortran +subroutine coulomb_matrix(xyz1, xyz2, a) + real(dp), intent(in) :: xyz1(:, :) + real(dp), intent(in) :: xyz2(:, :) + real(dp), intent(out) :: a(:, :) + integer :: i, j + do i = 1, size(a, 2) + do j = 1, size(a, 1) + a(j, i) = 1.0_dp/norm2(xyz1(:, j) - xyz2(:, i)) + end do + end do +end subroutine coulomb_matrix +``` + +Another example would be the contraction of the third dimension of a rank +three array: + +```fortran +do i = 1, size(amat, 3) + do j = 1, size(amat, 2) + do k = 1, size(amat, 1) + cmat(k, j) = cmat(k, j) + amat(k, j, i) * bvec(i) + end do + end do +end do +``` + +Contiguous array slices can be used in array-bound remapping to allow usage +of higher rank arrays as lower rank arrays without requiring to reshape +and potentially create a temporary copy of the array. + +For example this can be used to contract the third dimension of a rank +three array using a matrix-vector operation: + +```fortran +subroutine matmul312(amat, bvec, cmat) + real(dp), contiguous, intent(in) :: amat(:, :, :) + real(dp), intent(in) :: bvec(:) + real(dp), contiguous, intent(out) :: cmat(:, :) + real(dp), pointer :: aptr(:, :) + real(dp), pointer :: cptr(:) + + aptr(1:size(amat, 1)*size(amat, 2), 1:size(amat, 3)) => amat + cptr(1:size(cmat)) => cmat + + cptr = matmul(aptr, bvec) +end subroutine matmul312 +``` diff --git a/learn/best_practices/style_guide.md b/learn/best_practices/style_guide.md new file mode 100644 index 000000000..b871714e7 --- /dev/null +++ b/learn/best_practices/style_guide.md @@ -0,0 +1,58 @@ +--- +layout: book +title: Fortran Style Guide +permalink: /learn/best_practices/style_guide +--- + +Naming Convention +----------------- + +Ultimately this is a matter of preference. Here is a style guide that we +like and that seems to be prevalent in most scientific codes (as well as +the Fortran standard library) and you are welcome to follow it. + +1. Use lowercase for all Fortran constructs (`do`, `subroutine`, + `module`, ...). +2. Follow short mathematical notation for mathematical + variables/functions (`Ylm`, `Gamma`, `gamma`, `Enl`, `Rnl`, ...). +3. For other names use all lowercase: try to keep names to one or two + syllables; if more are required, use underscores to clarify + (`sortpair`, `whitechar`, `meshexp`, `numstrings`, `linspace`, + `meshgrid`, `argsort`, `spline`, `spline_interp`, + `spline_interpolate`, `stoperr`, `stop_error`, `meshexp_der`). + +For example "spline interpolation" can be shortened to +`spline_interpolation`, `spline_interpolate`, `spline_interp`, `spline`, +but not to `splineint` ("int" could mean integration, integer, etc. --- +too much ambiguity, even in the clear context of a computational code). +This is in contrast to `get_argument()` where `getarg()` is perfectly +clean and clear. + +The above are general guidelines. In general, choosing the right name +certainly depends on the word being truncated as to whether the first +syllable is sufficient. Usually it is but clearly not always. Thus some +thought should go into step "try to keep names to 2 syllables or less" +since it can really affect the indicativeness and simplicity. Simple +consistent naming rules are a real help in this regard -- for both +collaboration and for one's own sanity when going back to some old code +you haven't seen in while. + +Indentation +----------- + +Use a consistent indentation to make your code readable. +The amount of indentation is a matter of preference, the most common choices +are two, three or four spaces. + +Comparison to Other Languages +----------------------------- + +On the other hand, in most of the rest of the programming world, where +the main focus is, in one form or another, on defining and using large +sets of complex objects, with tons of properties and behaviors, known +only in the code in which they are defined (as opposed to defined by the +same notation throughout the literature), it makes more sense to use +longer, more descriptive names. The naming conventions one sees used in +more general-purpose languages such as C++ and Python, therefore, are +perfectly consistent with their more general-purpose missions. But +Fortran has a different mission (numerical scientific computing). diff --git a/learn/best_practices/type_casting.md b/learn/best_practices/type_casting.md new file mode 100644 index 000000000..c53544240 --- /dev/null +++ b/learn/best_practices/type_casting.md @@ -0,0 +1,494 @@ +--- +layout: book +title: Type Casting in Callbacks +permalink: /learn/best_practices/type_casting +--- + +There are essentially five different ways to do type casting, each with its own +advantages and disadvantages. + +The methods I, II and V can be used both in C and Fortran. The methods +III and IV are only available in Fortran. The method VI is obsolete and +should not be used. + +Work Arrays +----------- + +Pass a "work array" which is packed with everything needed by +the caller and unpacked by the called routine. This is the old way -- +e.g., how LAPACK does it. + +Integrator: + +``` fortran +module integrals + use types, only: dp + implicit none + private + public simpson + +contains + +real(dp) function simpson(f, a, b, data) result(s) + real(dp), intent(in) :: a, b + interface + real(dp) function func(x, data) + use types, only: dp + implicit none + real(dp), intent(in) :: x + real(dp), intent(inout) :: data(:) + end function + end interface + procedure(func) :: f + real(dp), intent(inout) :: data(:) + s = (b-a) / 6 * (f(a, data) + 4*f((a+b)/2, data) + f(b, data)) +end function + +end module +``` + +Usage: + +``` fortran +module test + use types, only: dp + use integrals, only: simpson + implicit none + private + public foo + +contains + +real(dp) function f(x, data) result(y) + real(dp), intent(in) :: x + real(dp), intent(inout) :: data(:) + real(dp) :: a, k + a = data(1) + k = data(2) + y = a*sin(k*x) +end function + +subroutine foo(a, k) + real(dp) :: a, k + real(dp) :: data(2) + data(1) = a + data(2) = k + print *, simpson(f, 0._dp, pi, data) + print *, simpson(f, 0._dp, 2*pi, data) +end subroutine + +end module +``` + +General Structure +----------------- + +Define a general structure which encompass the variations you +actually need (or are even remotely likely to need going forward). This +single structure type can then change if needed as future +needs/ideas permit but won't likely need to change from passing, say, +real numbers to, say, and instantiation of a text editor. + +Integrator: + +``` fortran +module integrals + use types, only: dp + implicit none + private + public simpson, context + + type context + ! This would be adjusted according to the problem to be solved. + ! For example: + real(dp) :: a, b, c, d + integer :: i, j, k, l + real(dp), pointer :: x(:), y(:) + integer, pointer :: z(:) + end type + +contains + +real(dp) function simpson(f, a, b, data) result(s) + real(dp), intent(in) :: a, b + interface + real(dp) function func(x, data) + use types, only: dp + implicit none + real(dp), intent(in) :: x + type(context), intent(inout) :: data + end function + end interface + procedure(func) :: f + type(context), intent(inout) :: data + s = (b-a) / 6 * (f(a, data) + 4*f((a+b)/2, data) + f(b, data)) +end function + +end module +``` + +Usage: + +``` fortran +module test + use types, only: dp + use integrals, only: simpson, context + implicit none + private + public foo + +contains + +real(dp) function f(x, data) result(y) + real(dp), intent(in) :: x + type(context), intent(inout) :: data + real(dp) :: a, k + a = data%a + k = data%b + y = a*sin(k*x) +end function + +subroutine foo(a, k) + real(dp) :: a, k + type(context) :: data + data%a = a + data%b = k + print *, simpson(f, 0._dp, pi, data) + print *, simpson(f, 0._dp, 2*pi, data) +end subroutine + +end module +``` + +There is only so much flexibility really needed. For example, you could +define two structure types for this purpose, one for Schroedinger and +one for Dirac. Each would then be sufficiently general and contain all +the needed pieces with all the right labels. + +Point is: it needn't be "one abstract type to encompass all" or bust. +There are natural and viable options between "all" and "none". + +Private Module Variables +------------------------ + +Hide the variable arguments completely by passing in module variables. + +Integrator: + +``` fortran +module integrals + use types, only: dp + implicit none + private + public simpson + +contains + +real(dp) function simpson(f, a, b) result(s) + real(dp), intent(in) :: a, b + interface + real(dp) function func(x) + use types, only: dp + implicit none + real(dp), intent(in) :: x + end function + end interface + procedure(func) :: f + s = (b-a) / 6 * (f(a) + 4*f((a+b)/2) + f(b)) +end function + +end module +``` + +Usage: + +``` fortran +module test + use types, only: dp + use integrals, only: simpson + implicit none + private + public foo + + real(dp) :: global_a, global_k + +contains + +real(dp) function f(x) result(y) + real(dp), intent(in) :: x + y = global_a*sin(global_k*x) +end function + +subroutine foo(a, k) + real(dp) :: a, k + global_a = a + global_k = k + print *, simpson(f, 0._dp, pi) + print *, simpson(f, 0._dp, 2*pi) +end subroutine + +end module +``` + +However it is best to avoid such global variables -- even though really +just semi-global -- if possible. But sometimes it may be the simplest +cleanest way. However, with a bit of thought, usually there is a better, +safer, more explicit way along the lines of II or IV. + +Nested functions +---------------- + +Integrator: + +``` fortran +module integrals + use types, only: dp + implicit none + private + public simpson + +contains + +real(dp) function simpson(f, a, b) result(s) + real(dp), intent(in) :: a, b + interface + real(dp) function func(x) + use types, only: dp + implicit none + real(dp), intent(in) :: x + end function + end interface + procedure(func) :: f + s = (b-a) / 6 * (f(a) + 4*f((a+b)/2) + f(b)) +end function + +end module +``` + +Usage: + +``` fortran +subroutine foo(a, k) +use integrals, only: simpson +real(dp) :: a, k +print *, simpson(f, 0._dp, pi) +print *, simpson(f, 0._dp, 2*pi) + +contains + +real(dp) function f(x) result(y) +real(dp), intent(in) :: x +y = a*sin(k*x) +end function f + +end subroutine foo +``` + +Using type(c\_ptr) Pointer +-------------------------- + +In C, one would use the `void *` pointer. In Fortran, one can use +`type(c_ptr)` for exactly the same purpose. + +Integrator: + +``` fortran +module integrals + use types, only: dp + use iso_c_binding, only: c_ptr + implicit none + private + public simpson + +contains + +real(dp) function simpson(f, a, b, data) result(s) + real(dp), intent(in) :: a, b + interface + real(dp) function func(x, data) + use types, only: dp + implicit none + real(dp), intent(in) :: x + type(c_ptr), intent(in) :: data + end function + end interface + procedure(func) :: f + type(c_ptr), intent(in) :: data + s = (b-a) / 6 * (f(a, data) + 4*f((a+b)/2, data) + f(b, data)) +end function + +end module +``` + +Usage: + +``` fortran +module test + use types, only: dp + use integrals, only: simpson + use iso_c_binding, only: c_ptr, c_loc, c_f_pointer + implicit none + private + public foo + + type f_data + ! Only contains data that we need for our particular callback. + real(dp) :: a, k + end type + +contains + +real(dp) function f(x, data) result(y) + real(dp), intent(in) :: x + type(c_ptr), intent(in) :: data + type(f_data), pointer :: d + call c_f_pointer(data, d) + y = d%a * sin(d%k * x) +end function + +subroutine foo(a, k) + real(dp) :: a, k + type(f_data), target :: data + data%a = a + data%k = k + print *, simpson(f, 0._dp, pi, c_loc(data)) + print *, simpson(f, 0._dp, 2*pi, c_loc(data)) +end subroutine + +end module +``` + +As always, with the advantages of such re-casting, as Fortran lets you +do if you really want to, come also the disadvantages that fewer +compile- and run-time checks are possible to catch errors; and with +that, inevitably more leaky, bug-prone code. So one always has to +balance the costs and benefits. + +Usually, in the context of scientific programming, where the main thrust +is to represent and solve precise mathematical formulations (as opposed +to create a GUI with some untold number of buttons, drop-downs, and +other interface elements), simplest, least bug-prone, and fastest is to +use one of the previous approaches. + +transfer() Intrinsic Function +----------------------------- + +Before Fortran 2003, the only way to do type casting was using the +`transfer` intrinsic function. It is functionally equivalent to the +method V, but more verbose and more error prone. It is now obsolete and +one should use the method V instead. + +Examples: + + + + + + + +Object Oriented Approach +------------------------ + +The module: + +``` fortran +module integrals + + use types, only: dp + implicit none + private + + public :: integrand, simpson + + ! User extends this type + type, abstract :: integrand + contains + procedure(func), deferred :: eval + end type + + abstract interface + function func(this, x) result(fx) + import :: integrand, dp + class(integrand) :: this + real(dp), intent(in) :: x + real(dp) :: fx + end function + end interface + +contains + +real(dp) function simpson(f, a, b) result(s) + class(integrand) :: f + real(dp), intent(in) :: a, b + s = ((b-a)/6) * (f%eval(a) + 4*f%eval((a+b)/2) + f%eval(b)) +end function + +end module +``` + +The abstract type prescribes exactly what the integration routine needs, +namely a method to evaluate the function, but imposes nothing else on +the user. The user extends this type, providing a concrete +implementation of the eval type bound procedure and adding necessary +context data as components of the extended type. + +Usage: + +``` fortran +module example_usage + + use types, only: dp + use integrals, only: integrand, simpson + implicit none + private + + public :: foo + + type, extends(integrand) :: my_integrand + real(dp) :: a, k + contains + procedure :: eval => f + end type + +contains + +function f(this, x) result(fx) + class(my_integrand) :: this + real(dp), intent(in) :: x + real(dp) :: fx + fx = this%a*sin(this%k*x) +end function + +subroutine foo(a, k) + real(dp) :: a, k + type(my_integrand) :: my_f + my_f%a = a + my_f%k = k + print *, simpson(my_f, 0.0_dp, 1.0_dp) + print *, simpson(my_f, 0.0_dp, 2.0_dp) +end subroutine + +end module +``` + +Complete Example of void \* vs type(c\_ptr) and transfer() +---------------------------------------------------------- + +Here are three equivalent codes: one in C using `void *` and two codes +in Fortran using `type(c_ptr)` and `transfer()`: + +| Language   | Method | Link | +|-----------------|----------------------|-----------------------------------| +| C | `void *` | | +| Fortran | `type(c_ptr)`   | | +| Fortran | `transfer()` | | + +The C code uses the standard C approach for writing extensible libraries +that accept callbacks and contexts. The two Fortran codes show how to do +the same. The `type(c_ptr)` method is equivalent to the C version and +that is the approach that should be used. + +The `transfer()` method is here for completeness only (before Fortran +2003, it was the only way) and it is a little cumbersome, because the +user needs to create auxiliary conversion functions for each of his +types. As such, the `type(c_ptr)` method should be used instead.