Commit 6215da3

add information on documenting functions using docstrings
1 parent 27beb98 commit 6215da3

1 file changed

+96
-7
lines changed

_episodes/06-functions.md

Lines changed: 96 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ keypoints:
1717

1818
Most code is organized into blocks of code which perform a particular task. These code blocks are called *functions*. A commercial software package likely has hundreds of thousands or millions of functions. Functions break up our code into smaller, more easily understandable statements, and also allow our code to be more *modular*, meaning we can take pieces and reuse them. Functions also make your code easier to test, which we will see in a later lesson.
1919

20-
In general, each function should perform only one computational task.
20+
**In general, each function should perform only one computational task.**
2121

2222
## Defining and running a function
2323

@@ -35,7 +35,7 @@ Functions are defined using the `def` keyword, followed by the name of the funct
3535
## Writing functions into our geometry analysis project
3636

3737
Let's go back and consider a possible solution for the geometry analysis project.
38-
```
38+
~~~
3939
import numpy
4040
import os
4141
@@ -54,31 +54,31 @@ for num1 in range(0,num_atoms):
5454
            bond_length_12 = numpy.sqrt(x_distance**2+y_distance**2+z_distance**2)
5555
            if bond_length_12 > 0 and bond_length_12 <= 1.5:
5656
                print(F'{symbols[num1]} to {symbols[num2]} : {bond_length_12:.3f}')
57-
```
57+
~~~
5858
{: .language-python}
5959

6060
To think about where we should write functions in this code, let's think about parts we may want to use again or in other places. One of the first places we might think of is the bond distance calculation. Perhaps we'd want to calculate a bond distance in some other script. We can reduce the likelihood of errors in our code by defining this in a function, so that if we wanted to change our bond calculation, we would only have to do it in one place.
6161

6262
Let's change this code so that we write a function to calculate the bond distance. As explained above, to define a function, you start with the word `def` and then give the name of the function. The inputs of the function go in parentheses, followed by a colon. The statements the function is going to execute are indented on the following lines. For this function, we will `return` a value. The `return` statement on the last line gives the function's output, which we can store in a variable. Let's write a function to calculate the distance between atoms.
63-
```
63+
~~~
6464
def calculate_distance(atom1_coord, atom2_coord):
6565
    x_distance = atom1_coord[0] - atom2_coord[0]
6666
    y_distance = atom1_coord[1] - atom2_coord[1]
6767
    z_distance = atom1_coord[2] - atom2_coord[2]
6868
    bond_length_12 = numpy.sqrt(x_distance**2+y_distance**2+z_distance**2)
6969
    return bond_length_12
70-
```
70+
~~~
7171
{: .language-python}
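
One way to check a function like this is to call it on coordinates whose distance is known in advance. The sketch below is only illustrative: the two points are made up, and it uses `math.sqrt` from the standard library in place of `numpy.sqrt` so that it runs without extra packages.

```python
import math

def calculate_distance(atom1_coord, atom2_coord):
    # Same arithmetic as the lesson's function; math.sqrt stands in for numpy.sqrt.
    x_distance = atom1_coord[0] - atom2_coord[0]
    y_distance = atom1_coord[1] - atom2_coord[1]
    z_distance = atom1_coord[2] - atom2_coord[2]
    return math.sqrt(x_distance**2 + y_distance**2 + z_distance**2)

# Hypothetical coordinates forming a 3-4-5 right triangle in the xy-plane.
origin = [0.0, 0.0, 0.0]
point = [3.0, 4.0, 0.0]

distance = calculate_distance(origin, point)
print(f'{distance:.3f}')  # prints 5.000
```

Swapping the two arguments should give the same result, since each difference is squared before summing.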
7272

7373
Now we can change our `for` loop to just call the distance function we wrote above.
74-
```
74+
~~~
7575
for num1 in range(0,num_atoms):
7676
    for num2 in range(0,num_atoms):
7777
        if num1<num2:
7878
            bond_length_12 = calculate_distance(coordinates[num1], coordinates[num2])
7979
            if bond_length_12 > 0 and bond_length_12 <= 1.5:
8080
                print(F'{symbols[num1]} to {symbols[num2]} : {bond_length_12:.3f}')
81-
```
81+
~~~
8282
{: .language-python}
8383

8484
Next, let's write another function that checks whether a particular distance represents a bond. This function will be called `bond_check`, and will return `True` if the distance is within certain bounds (at first, between 0 and 1.5 angstroms).
@@ -109,11 +109,89 @@ This is great! Our function will currently return `True` if our bond distance is
109109
> {: .solution}
110110
{: .challenge}
111111
112+
## Function Documentation
113+
Recall from our work with tabular data that we were able to use `help` to see a help message on a function. As a reminder, we used it like this.
114+
115+
~~~
116+
help(numpy.genfromtxt)
117+
~~~
118+
{: .language-python}
119+
120+
~~~
121+
Help on function genfromtxt in module numpy.lib.npyio:
122+
123+
genfromtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None, encoding='bytes')
124+
Load data from a text file, with missing values handled as specified.
125+
126+
Each line past the first `skip_header` lines is split at the `delimiter`
127+
character, and characters following the `comments` character are discarded.
128+
~~~
129+
{: .output}
130+
131+
Let's try the same thing on our function.
132+
133+
~~~
134+
help(calculate_distance)
135+
~~~
136+
{: .language-python}
137+
138+
~~~
139+
Help on function calculate_distance in module __main__:
140+
141+
calculate_distance(atom1_coord, atom2_coord)
142+
~~~
143+
{: .output}
144+
145+
There is no help for our function! That is because you haven't written it yet. In Python, we can document our functions using something called `docstrings`. When you call help on something, it will display the docstring you have written. In fact, most Python libraries use docstrings and other automated tools to pull the docstrings out of the code to make online documentation. For example, see the [documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html) for the `genfromtxt` function online.
146+
147+
To add a docstring to our function, we simply add a string directly underneath the function definition. We write it the same way we type any other string, except that we use three quotation marks to open and close it instead of one.
148+
149+
~~~
150+
def calculate_distance(atom1_coord, atom2_coord):
151+
    """Calculate the distance between two three-dimensional points."""
152+
153+
    x_distance = atom1_coord[0] - atom2_coord[0]
154+
    y_distance = atom1_coord[1] - atom2_coord[1]
155+
    z_distance = atom1_coord[2] - atom2_coord[2]
156+
    bond_length_12 = numpy.sqrt(x_distance**2+y_distance**2+z_distance**2)
157+
    return bond_length_12
158+
~~~
159+
{: .language-python}
160+
161+
We are using a very simple docstring in this example, though there are many formats for docstrings. Now you should see a message when you call `help` on this function.
162+
163+
~~~
164+
help(calculate_distance)
165+
~~~
166+
{: .language-python}
167+
168+
~~~
169+
Help on function calculate_distance in module __main__:
170+
171+
calculate_distance(atom1_coord, atom2_coord)
172+
    Calculate the distance between two three-dimensional points.
173+
~~~
174+
{: .output}
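
The docstring is stored on the function itself, in its `__doc__` attribute, and that is where `help` gets its text. The following is a minimal sketch of reading it directly; `inspect.getdoc` is a standard-library helper that returns the same text with indentation cleaned up. As an assumption for portability, `math.sqrt` replaces `numpy.sqrt` here.

```python
import inspect
import math

def calculate_distance(atom1_coord, atom2_coord):
    """Calculate the distance between two three-dimensional points."""

    x_distance = atom1_coord[0] - atom2_coord[0]
    y_distance = atom1_coord[1] - atom2_coord[1]
    z_distance = atom1_coord[2] - atom2_coord[2]
    # math.sqrt stands in for numpy.sqrt so the sketch needs no extra packages.
    return math.sqrt(x_distance**2 + y_distance**2 + z_distance**2)

# The raw docstring, exactly as written in the source.
raw_docstring = calculate_distance.__doc__

# The same docstring with leading indentation and blank edges removed.
cleaned_docstring = inspect.getdoc(calculate_distance)

print(cleaned_docstring)
```

For a one-line docstring the two values are identical; they differ only when the docstring spans several indented lines.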
175+
176+
If you use a well-known format, you can use software to extract the docstring and make a webpage with your documentation. MolSSI recommends using numpy style docstrings. You can learn more about this in our [Python Package Development Best Practices Workshop](https://molssi-education.github.io/python-package-best-practices/).
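
As a rough illustration of what a numpy style docstring could look like for our function, consider the sketch below. The section names (`Parameters`, `Returns`) follow the numpydoc convention; the wording of each description is our own assumption, and `math.sqrt` is used in place of `numpy.sqrt` so the example runs on its own.

```python
import math

def calculate_distance(atom1_coord, atom2_coord):
    """Calculate the distance between two three-dimensional points.

    Parameters
    ----------
    atom1_coord : sequence of float
        The x, y, and z coordinates of the first point.
    atom2_coord : sequence of float
        The x, y, and z coordinates of the second point.

    Returns
    -------
    bond_length_12 : float
        The distance between the two points.
    """
    x_distance = atom1_coord[0] - atom2_coord[0]
    y_distance = atom1_coord[1] - atom2_coord[1]
    z_distance = atom1_coord[2] - atom2_coord[2]
    # math.sqrt stands in for numpy.sqrt so the sketch needs no extra packages.
    bond_length_12 = math.sqrt(x_distance**2 + y_distance**2 + z_distance**2)
    return bond_length_12

# help() would display the full docstring, including both sections.
help(calculate_distance)
```

The first line stays a short summary, so a one-line description like the one used earlier can grow into this format without being rewritten.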
177+
178+
> ## Help vs Online Documentation
179+
> Many Python libraries we have used, such as numpy and matplotlib, have extensive online documentation. It is a good idea to use online documentation if it is available. Typically, documentation for functions will be pulled from docstrings in the code, but additional information the code developers have provided will also be available through online documentation.
180+
>
181+
> However, if you are offline or using a library without online documentation, you can check for documentation using the `help` function.
182+
{: .callout}
183+
184+
Remember, help for your code only exists if you write it! Every time you write a function, you should take some time to also write a docstring describing what the function does.
185+
112186
### Function Default arguments
113187
When there are parameters in a function definition, we can set these parameters to default values. This way, if the user does not input values, the default values can be used instead of the code just not working. For example, if we want the default values in bond check to be 0 and 1.5, we can change the function definition to the following:
114188
115189
~~~
116190
def bond_check(atom_distance, minimum_length=0, maximum_length=1.5):
191+
    """
192+
    Check if a distance is a bond based on a minimum and maximum bond length.
193+
    """
194+
117195
    if atom_distance > minimum_length and atom_distance <= maximum_length:
118196
        return True
119197
    else:
@@ -229,26 +307,37 @@ import numpy
229307
import os
230308

231309
def calculate_distance(atom1_coord, atom2_coord):
310+
    """
311+
    Calculate the distance between two three-dimensional points.
312+
    """
232313
    x_distance = atom1_coord[0] - atom2_coord[0]
233314
    y_distance = atom1_coord[1] - atom2_coord[1]
234315
    z_distance = atom1_coord[2] - atom2_coord[2]
235316
    bond_length_12 = numpy.sqrt(x_distance**2+y_distance**2+z_distance**2)
236317
    return bond_length_12
237318

238319
def bond_check(atom_distance, minimum_length=0, maximum_length=1.5):
320+
    """Check if a distance is a bond based on a minimum and maximum bond length"""
321+
239322
    if atom_distance > minimum_length and atom_distance <= maximum_length:
240323
        return True
241324
    else:
242325
        return False
243326

244327
def open_xyz(filename):
328+
    """
329+
    Open and read an xyz file. Returns tuple of symbols and coordinates.
330+
    """
245331
    xyz_file = numpy.genfromtxt(fname=filename, skip_header=2, dtype='unicode')
246332
    symbols = xyz_file[:,0]
247333
    coord = (xyz_file[:,1:])
248334
    coord = coord.astype(numpy.float)
249335
    return symbols, coord
250336

251337
def print_bonds(atom_symbols, atom_coordinates):
338+
    """
339+
    Prints atom symbols and bond length for a set of atoms.
340+
    """
252341
    num_atoms = len(atom_symbols)
253342
    for num1 in range(0,num_atoms):
254343
        for num2 in range(0, num_atoms):
