Added 3 exercises + some text
cgeroux committed Apr 23, 2024
1 parent 640a7c5 commit 4daf5ad
Showing 7 changed files with 173 additions and 16 deletions.
6 changes: 4 additions & 2 deletions _episodes/derived_types.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ keypoints:
- "Access modifiers can be applied within derived types as well as within modules to restrict access to members of that derived type."
---

While modules allow you to package variables together in such a way that you can directly refer to those variables, derived types allow you to package together variables in such a way as to form a new compound variable. With this new compound variable you can refer to it as a group rather than only by the individual components.
While modules allow you to package variables together in such a way that you can directly refer to those variables, derived types allow you to package together variables in such a way as to form a new compound variable. With this new compound variable you can refer to it as a group rather than only by the individual components. This can greatly simplify passing a group of related variables to a procedure as only one variable of a given derived type would need to be passed.

To create a new derived type you use the following format.
~~~
Expand All @@ -29,7 +29,7 @@ You can then declare new variables of this derived type as shown.
type(<type name>):: my_variable
~~~
{: .fortran}
Finally individual elements or members of a derived type variable, or **object**, can be accessed using the `%` operator.
Individual elements or members of a derived type variable, or **object**, can be accessed using the `%` operator.
~~~
my_variable%member1
~~~
Expand Down Expand Up @@ -80,6 +80,8 @@ $ ./derived_types
~~~
{: .output}

In the rest of this workshop we will build on this `t_vector` derived type adding functionality as we go. We will create a set of procedures and operators which will allow us to do common operations one might want to perform on a vector abstracting away the details, such as memory management, allowing us to write code at a higher level.
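
As a rough illustration of passing a group of related variables as a single object, here is a minimal sketch. The member names `num_elements` and `elements` are assumed from the output shown in this episode, while the module name `m_vector_sketch` and the function `sum_elements` are hypothetical names used only for this example.

~~~
module m_vector_sketch
  implicit none

  ! Assumed layout: an element count plus an allocatable array of values
  type t_vector
    integer:: num_elements=0
    real,dimension(:),allocatable:: elements
  end type

contains

  ! Hypothetical helper: one argument carries both members of the vector
  real function sum_elements(vec)
    implicit none
    type(t_vector),intent(in):: vec
    integer:: i

    sum_elements=0.0
    do i=1,vec%num_elements
      sum_elements=sum_elements+vec%elements(i)
    enddo
  end function

end module

program derived_type_sketch
  use m_vector_sketch
  implicit none
  type(t_vector):: my_vector

  ! Members are set through the % operator
  my_vector%num_elements=3
  allocate(my_vector%elements(my_vector%num_elements))
  my_vector%elements=[1.0,2.0,3.0]

  ! Only the single compound variable is passed to the procedure
  print*,"sum of elements=",sum_elements(my_vector)
end program
~~~
{: .fortran}

Without the derived type, `sum_elements` would need the count and the array passed as two separate arguments, and every caller would have to keep them consistent.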

### Creating new objects
Creating new vectors is a pretty common thing that we want to do. Let's add some functions to create vectors to reduce the amount of repeated code. Let's create one to make empty vectors, `create_empty_vector`, and one to create a vector of a given size, allocating the required memory to hold all the elements of the vector, `create_sized_vector`.

Expand Down
71 changes: 69 additions & 2 deletions _episodes/destructors.md
@@ -1,7 +1,7 @@
---
title: "Destructors"
teaching: 10
exercises: 0
exercises: 5
questions:
- "What is a destructor?"
objectives:
Expand All @@ -10,7 +10,7 @@ keypoints:
- "A destructor is used to perform clean up when an object goes out of scope."
- "To create a destructor use the **final** keyword when declaring a type-bound procedure instead of the procedure keyword."
---
One aspect I have been ignoring until now is that the memory we allocate for our vectors is never explicitly freed by our program. So far our program has been simple enough that this is not a serious issue. We have only created a few objects that allocate memory within our main program. When the program execution has completed, that memory is returned to the operating system. However, if we had a long running loop inside our program that created new objects with allocated memory and we never deallocated that memory we would have a problem as our program would steadily increase its memory usage. This is referred to as a memory leak as was mentioned in the first half of this workshop. We can manually deallocate memory as we did with allocating memory before we created a function to create new `t_vector` and `t_vector_3` objects, however there is a way to create a new special type bound procedure that is automatically called when the object goes out of scope to deallocate this memory for us. To do this we use the **final** keyword within the type definition.
One aspect I have been ignoring until now is that the memory we allocate for our vectors is never explicitly freed by our program. So far our program has been simple enough that this is not a serious issue. We have only created a few objects that allocate memory within our main program. When the program execution has completed, that memory is returned to the operating system. However, if we had a long-running loop inside our program that created new objects with allocated memory and we never deallocated that memory, we would have a problem as our program would steadily increase its memory usage. This is referred to as a memory leak, as was mentioned in the first half of this workshop. We can manually deallocate memory, just as we manually allocated it. However, there is also a way to create a special type-bound procedure that is automatically called when the object goes out of scope to deallocate this memory for us. To do this we use the **final** keyword within the type definition.

~~~
type <type-name>
Expand Down Expand Up @@ -148,5 +148,72 @@ $ ./destructor
0.00000000
~~~
{: .output}
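
To make the timing of the destructor call easier to see, here is a minimal, self-contained sketch. It assumes a simplified `t_vector` with an allocatable `elements` member. The counter `num_finalized` and the subroutine `use_temporary_vector` are hypothetical names added only for this illustration and are not part of the workshop code. Note that standard Fortran already releases allocatable components when a local object goes out of scope, so here the destructor mainly shows when it is invoked.

~~~
module m_destructor_sketch
  implicit none

  type t_vector
    integer:: num_elements=0
    real,dimension(:),allocatable:: elements
  contains
    final:: destructor_vector
  end type

  ! Hypothetical counter used only to observe destructor calls
  integer:: num_finalized=0

contains

  subroutine destructor_vector(self)
    implicit none
    type(t_vector):: self

    ! Guard the deallocate so objects that never allocated are safe to clean up
    if(allocated(self%elements)) then
      deallocate(self%elements)
    endif
    num_finalized=num_finalized+1
  end subroutine

  subroutine use_temporary_vector()
    implicit none
    type(t_vector):: vec

    vec%num_elements=4
    allocate(vec%elements(vec%num_elements))
    vec%elements=0.0
    ! vec goes out of scope at the end of this subroutine,
    ! which is when destructor_vector is called for it
  end subroutine

end module

program destructor_sketch
  use m_destructor_sketch
  implicit none

  call use_temporary_vector()
  print*,"destructor calls so far=",num_finalized
end program
~~~
{: .fortran}

The local variable lives inside a subroutine on purpose: variables declared in the main program are not finalized when the program ends, so a destructor would not be seen running for them there.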

> ## No allocated check?
> What happens if we don't check that memory is allocated before de-allocating it in our destructor? Let's copy our last code and comment out those lines and see.
> ~~~
> $ cp destructor.f90 destructor_no_check.f90
> $ nano destructor_no_check.f90
> ~~~
> {: .bash}
> ~~~
> ...
> subroutine destructor_vector(self)
> implicit none
> type(t_vector):: self
>
> !if(allocated(self%elements)) then
> deallocate(self%elements)
> !endif
> end subroutine
>
> subroutine destructor_vector_3(self)
> implicit none
> type(t_vector_3):: self
>
> !if(allocated(self%elements)) then
> deallocate(self%elements)
> !endif
> end subroutine
> ...
> ~~~
> {: .fortran}
> ~~~
> $ gfortran -g destructor_no_check.f90 -o destructor_no_check
> $ ./destructor_no_check
> ~~~
> {: .bash}
> What happens and why?<br/> Note: the `-g` option provides extra debugging information, such as file and line numbers in the backtrace.
> > ## Solution
> > ~~~
> > t_vector:
> > num_elements= 0
> > elements=
> > t_vector:
> > num_elements= 4
> > elements=
> > 2.00000000
> > 0.00000000
> > 0.00000000
> > 0.00000000
> > At line 37 of file destructor.f90
> > Fortran runtime error: Attempt to DEALLOCATE unallocated 'self'
> >
> > Error termination. Backtrace:
> > #0 0x7fd37fc21730 in ???
> > #1 0x7fd37fc22289 in ???
> > #2 0x7fd37fc22906 in ???
> > #3 0x401929 in __m_vector_MOD_destructor_vector
> > at /home/user100/fortran_oop/destructor.f90:37
> > #4 0x40100d in __m_vector_MOD___final_m_vector_T_vector
> > at /home/user100/fortran_oop/destructor.f90:86
> > #5 0x401d86 in MAIN__
> > at /home/user100/fortran_oop/destructor.f90:101
> > #6 0x401e15 in main
> > at /home/user100/fortran_oop/destructor.f90:89
> > ~~~
> > {: .output}
> {: .solution}
{: .challenge}
{% include links.md %}
42 changes: 40 additions & 2 deletions _episodes/extending_types.md
@@ -1,7 +1,7 @@
---
title: "Extended Types"
teaching: 10
exercises: 0
exercises: 5
questions:
- "How do you extend a type?"
- "Why can it be useful to extend a type?"
Expand All @@ -14,7 +14,7 @@ keypoints:

It is pretty common to use vectors to represent positions in 3D space. Let's create a new derived type which always has only three components. However, it would be really nice if we could reuse our more general vector type to represent one of these specific 3 component vectors. You can do this by using type extension. Type extension allows you to add new members (or not) to an existing type to create a new derived type.

To create a new extended derived type has the following format.
To create a new extended derived type use the following format.
~~~
type,extends(<parent type name>):: <child type name>
<member variable declarations>
Expand Down Expand Up @@ -106,4 +106,42 @@ $ ./type_extension
~~~
{: .output}
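
The following sketch condenses the idea of type extension into one small program. It assumes a simplified `t_vector` with `num_elements` and `elements` members. The module name `m_extension_sketch` and the subroutine `display_vector` are hypothetical and stand in for whatever display code you already have.

~~~
module m_extension_sketch
  implicit none

  type t_vector
    integer:: num_elements=0
    real,dimension(:),allocatable:: elements
  end type

  ! Adds no new members; num_elements and elements are inherited
  type,extends(t_vector):: t_vector_3
  end type

contains

  ! Hypothetical helper written only for the parent type
  subroutine display_vector(vec)
    implicit none
    type(t_vector),intent(in):: vec

    print*,"num_elements=",vec%num_elements
    print*,"elements=",vec%elements
  end subroutine

end module

program type_extension_sketch
  use m_extension_sketch
  implicit none
  type(t_vector_3):: position

  ! Inherited members are used as if they were declared in t_vector_3
  position%num_elements=3
  allocate(position%elements(position%num_elements))
  position%elements=[1.0,2.0,3.0]

  ! The parent part of the object is a component named after the parent type
  call display_vector(position%t_vector)
end program
~~~
{: .fortran}

Passing `position%t_vector` lets a procedure written for the parent type operate on the child object without any polymorphism.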

> ## Which type is being extended?
> In the following code snippet which type is being extended?
> ~~~
> ...
> type, extends(B):: A
> end type
> ...
> type(C) function D()
> implicit none
> D%thing=1.0
> end function
> ...
> ~~~
> {: .fortran}
> <ol type="a">
> <li markdown="1">`A`
> </li>
> <li markdown="1">`B`
> </li>
> <li markdown="1">`C`
> </li>
> <li markdown="1">`D`
> </li>
> </ol>
> > ## Solution
> > <ol type="a">
> > <li markdown="1">**No**: close, but `A` is the new derived type which extends the existing `B` derived type.
> > </li>
> > <li markdown="1">**Yes**: the existing derived type `B` is being extended to create a new derived type `A`.
> > </li>
> > <li markdown="1">**No**: `C` is a derived type, but from this code snippet it is impossible to tell if it has been extended to a new derived type somewhere else in the code.
> > </li>
> > <li markdown="1">**No**: `D` is actually a function name, not a derived type at all.
> > </li>
> > </ol>
> {: .solution}
{: .challenge}
{% include links.md %}
48 changes: 44 additions & 4 deletions _episodes/interfaces.md
@@ -1,7 +1,7 @@
---
title: "Interface Blocks"
teaching: 10
exercises: 0
exercises: 5
questions:
- "What is an interface block?"
- "How can one be used to create a constructor for a derived type?"
Expand All @@ -12,7 +12,7 @@ keypoints:
- "Procedures that are part of the same generic interface block must be distinguishable from each other based on the number, order, and type of arguments passed."
---

As mentioned previously it is a very common task to create new objects of a derived type and perform some initialization of the members. We have already created some functions to do this. One of the functions, `create_empty_vector` takes no arguments and creates a new empty vector, while `create_sized_vector` creates a new vector of a specific size passed to the function. Both of these functions do the same thing, create a new `t_vector` object and initialize it. One might imagine many different such initialization routines, perhaps ones that take another `t_vectors` or `t_vector_3` objects to use to initialize a new `t_vector` object as a copy of the passed vector. All of these creation, or initialization, functions do basically the same thing but in a slightly different way depending on the arguments passed to them. It starts to get a bit tedious to have to remember all the names of these different initialization functions. If the compiler could somehow distinguish these functions automatically based on the number and type of arguments rather than the procedure name so that we could call the same generic procedure name and it would pick the correct procedure implementation based on the arguments we passed it.
As mentioned previously, it is a very common task to create new objects of a derived type and perform some initialization of the members. We have already created some functions to do this. One of the functions, `create_empty_vector`, takes no arguments and creates a new empty vector, while `create_sized_vector` creates a new vector of a specific size passed to the function. Both of these functions do the same thing: create a new `t_vector` object and initialize it. One might imagine many different such initialization routines, perhaps ones that take another `t_vector` or `t_vector_3` object and use it to initialize a new `t_vector` object as a copy of the passed vector. All of these creation, or initialization, functions do basically the same thing but in a slightly different way depending on the arguments passed to them. It starts to get a bit tedious to have to remember all the names of these different initialization functions. It would be convenient if the compiler could distinguish these functions automatically based on the number and type of arguments rather than the procedure name, so that we could call the same generic procedure name and have it pick the correct procedure implementation based on the arguments we passed.

It turns out there is a feature, added in Fortran 90, called **interface blocks** which allows multiple procedures to be mapped to one name. The basic syntax of an interface block is as follows.
~~~
Expand All @@ -24,9 +24,9 @@ end interface
~~~
{: .fortran}

This allows one to call the procedure `<new-procedure-name>` and will be mapped onto different procedure implementations `<existing-procedure-name1>`, `<existing-procedure-name-2>`, etc. based on the type, number, and order of arguments passed to the procedure when calling it. Since the number, type, and order of arguments is the only way for the compiler to know which procedure to call, all procedures listed in the interface block must have different types and or number of arguments.
This allows one to call the procedure `<new-procedure-name>` and it will be mapped onto different procedure implementations `<existing-procedure-name1>`, `<existing-procedure-name-2>`, etc. based on the type, number, and order of arguments passed to the procedure when calling it. Since the number, type, and order of arguments is the only way for the compiler to know which procedure to call, all procedures listed in the interface block must differ in the type, number, or order of their arguments.

Lets use interface blocks to group our creation functions for `t_vector` and `t_vector_3` into one procedure name to initialize each of the derived types. It is common to use the name of the derived type as the name of creation function which returns a new object of that type. Functions defined in this way are referred to as **constructors** as they *construct* new objects of the derived type.
Let's use interface blocks to group our creation functions for `t_vector` and `t_vector_3` into one procedure name to initialize each of the derived types. It is common to use the name of the derived type as the name of the creation function which returns a new object of that type. Functions defined in this way are referred to as **constructors** as they *construct* new objects of the derived type.
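
Before editing the full source, it may help to see the constructor pattern on its own. This is a minimal sketch under the assumption of a simplified `t_vector`. The function bodies and the argument name `vec_size` are guesses for illustration and may differ from the workshop's actual implementations.

~~~
module m_constructor_sketch
  implicit none

  type t_vector
    integer:: num_elements=0
    real,dimension(:),allocatable:: elements
  end type

  ! The generic name matches the type name, so calls read like constructors
  interface t_vector
    procedure:: create_empty_vector
    procedure:: create_sized_vector
  end interface

contains

  ! No arguments: returns a vector with no allocated elements
  type(t_vector) function create_empty_vector()
    implicit none
    create_empty_vector%num_elements=0
  end function

  ! One integer argument: returns a zero-filled vector of that size
  type(t_vector) function create_sized_vector(vec_size)
    implicit none
    integer,intent(in):: vec_size

    create_sized_vector%num_elements=vec_size
    allocate(create_sized_vector%elements(vec_size))
    create_sized_vector%elements=0.0
  end function

end module

program constructor_sketch
  use m_constructor_sketch
  implicit none
  type(t_vector):: empty_vec,sized_vec

  empty_vec=t_vector()   ! no arguments, so create_empty_vector is chosen
  sized_vec=t_vector(4)  ! one integer, so create_sized_vector is chosen

  print*,"empty num_elements=",empty_vec%num_elements
  print*,"sized num_elements=",sized_vec%num_elements
end program
~~~
{: .fortran}

The two functions can share one generic name because their argument lists differ: one takes nothing and the other takes a single integer.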

~~~
$ cp type_extension.f90 interface_blocks.f90
Expand Down Expand Up @@ -114,5 +114,45 @@ The way we have used interface blocks above is what is called a **generic interf

There is another way to use interface blocks without specifying a generic procedure name to map the listed procedures to, referred to as **explicit interfaces**. Explicit interface blocks can be used to define a procedure without actually listing the implementation of it. These are useful when using procedures declared in different compilation units which will be linked into the final program later. This is a bit like a forward declaration or a function prototype in C/C++. If your procedures are declared inside a module, as we have been doing, these explicit interfaces are created for you.
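
As a small sketch of an interface block without a generic name, the program below describes an external function before calling it. The function `scale_value` and its arguments are hypothetical, and it is placed in the same file only to keep the example self-contained. In practice it would typically live in a separate compilation unit.

~~~
program explicit_interface_sketch
  implicit none

  ! Describes the arguments and result of the external function,
  ! much like a function prototype in C/C++
  interface
    real function scale_value(x,factor)
      implicit none
      real,intent(in):: x
      real,intent(in):: factor
    end function
  end interface

  print*,"scaled value=",scale_value(2.0,3.0)
end program

! External procedure: not inside a module, so the compiler would not
! otherwise know its interface at the call site
real function scale_value(x,factor)
  implicit none
  real,intent(in):: x
  real,intent(in):: factor

  scale_value=x*factor
end function
~~~
{: .fortran}

With the interface block in place the compiler can check the number and types of the arguments at the call, which it cannot do for an external procedure whose interface is unknown.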

> ## Add a procedure to an interface
> What happens if we add the `create_size_3_vector` to the `t_vector` generic interface? Make a copy of our last source file and add the line `procedure:: create_size_3_vector` to the `t_vector` interface as shown below.
> ~~~
> $ cp interface_blocks.f90 interface_test.f90
> $ nano interface_test.f90
> ~~~
> {: .bash}
> ~~~
> module m_vector
> implicit none
>
> type t_vector
> ...
> end type
>
> interface t_vector
> procedure:: create_empty_vector
> procedure:: create_sized_vector
> procedure:: create_size_3_vector
> end interface
> ...
> ~~~
> {: .fortran}
> ~~~
> $ gfortran -o interface_test interface_test.f90
> ~~~
> {: .bash}
> What happens when you try to compile and run it and why?
> > ## Solution
> > During compilation the following error message is printed out
> > ~~~
> > ...
> > Error: Ambiguous interfaces in generic interface 't_vector' for ‘create_empty_vector’ at (1) and ‘create_size_3_vector’ at (2)
> > ...
> > ~~~
> > {: .output}
> > This is because the `create_empty_vector` and the `create_size_3_vector` functions both have no arguments. Since the compiler uses the number and type of arguments to decide which function to call there is no way for the compiler to know which function should be called when no arguments are given.
> {: .solution}
{: .challenge}
{% include links.md %}
11 changes: 8 additions & 3 deletions _episodes/introduction.md
Expand Up @@ -19,18 +19,23 @@ There are some basic principles of OOP:
* **Abstraction**: wrap up complex actions into simple verbs.
* **Encapsulation**: keep state and logic internal.
* **Inheritance**: new types can inherit properties and functions from existing types and modify and extend those.
* **Polylmorphism**: different types of objects can have the same methods that are internally handled different depending on the type.
* **Polymorphism**: different types of objects can have the same methods that are internally handled differently depending on the type.

Many features have been added to Fortran over the years which have allowed Fortran programs to be written with increasingly object-oriented designs if desired.

However, I should note that care should be taken while designing programs to make use of OOP practices. If not carefully designed it can result in very bad outcomes for performance. It has been stated that premature optimization is the root of all evil, and to some degree that is true. In that you shouldn't worry too much about optimizing code until you know what is important to optimize, e.g. what takes most of the time. However, when using OOP designs some consideration needs to be given up front to performance otherwise significant re-writing will have to happen to allow for later optimizations. For example, it is often a bad idea to have an array of objects and one should instead favor a object containing arrays.
However, I should note that care should be taken while designing programs to make use of OOP practices. If not carefully designed, OOP can result in very bad outcomes for performance. It has been stated that premature optimization is the root of all evil, and to some degree that is true, in that you shouldn't worry too much about optimizing code until you know what is important to optimize, e.g. what takes most of the time. However, when using OOP designs some consideration needs to be given up front to performance, otherwise significant re-writing will have to happen to allow for later optimizations. For example, it is often a bad idea to have an array of objects and one should instead favor an object containing arrays.

The above principles of OOP should be taken with a grain of salt rather than perfect rules to always follow.
Here is an example: as a grad student I was writing a hydrodynamics code for a course. I made every cell in my grid an object with members like density, pressure, and velocities. To create my grid I made an array of cell objects, and for every cell I called a constructor to initialize it. Moreover, when updating a property such as density over the whole grid, all the other cell properties had to be loaded into the smaller and faster caches along with it. This would increase the likelihood of cache misses and cause data to be loaded into and evicted from the caches more often than otherwise needed. I got very poor performance compared to other students who didn't use an OOP style. That isn't to say I couldn't have used an OOP style and gotten comparable performance if, for example, I had made the grid an object which contained arrays of the basic properties of the fluid. So blindly applying the ideas of OOP can have some serious performance implications and require some major structural changes to code if left to later optimization stages. Some up-front thought about performance isn't always a bad thing. The trick is not to worry about details too early, but rather to limit it to larger code design decisions.
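
The two layouts described above can be sketched as follows. The type names `t_cell` and `t_grid`, their members, and the grid size are all hypothetical and chosen only to mirror the story. The sketch shows the memory layouts, not a performance measurement.

~~~
module m_grid_sketch
  implicit none

  ! Layout 1: one object per cell, later used as an array of objects
  type t_cell
    real:: density=0.0
    real:: pressure=0.0
    real:: velocity=0.0
  end type

  ! Layout 2: one grid object containing one array per property
  type t_grid
    real,dimension(:),allocatable:: density
    real,dimension(:),allocatable:: pressure
    real,dimension(:),allocatable:: velocity
  end type

end module

program grid_layout_sketch
  use m_grid_sketch
  implicit none
  integer,parameter:: n=1000
  type(t_cell),dimension(:),allocatable:: cells
  type(t_grid):: grid

  ! Array of objects: each density sits next to that cell's pressure and
  ! velocity, so a density-only update strides through memory
  allocate(cells(n))
  cells%density=1.0
  cells%density=2.0*cells%density

  ! Object containing arrays: all densities are contiguous, so the same
  ! update touches only the data it needs
  allocate(grid%density(n),grid%pressure(n),grid%velocity(n))
  grid%density=1.0
  grid%density=2.0*grid%density

  print*,"array of objects total density=",sum(cells%density)
  print*,"object of arrays total density=",sum(grid%density)
end program
~~~
{: .fortran}

Both versions compute the same result. The difference is only in how the data is arranged in memory, which is why this choice is hard to change late in a project.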

There are of course benefits to the basic OOP principles. In particular, they help to reduce the mental load when dealing with large complex code bases by allowing the person reading the code to get a high level understanding of what is going on by abstracting away details. They also reduce the need for duplicating code by using inheritance and polymorphism to allow the same code to operate on different data types. This reduces the amount of code that needs to be debugged and maintained.

The above principles of OOP should be considered as general guidelines rather than perfect rules to always follow.

* Fortran 90 introduced:
* modules
* derived data types
* interface blocks
* operator overloading
* Fortran 2003 introduced:
* type extension
* type-bound procedures
Expand Down