Commit 3c0c41d

update episode 05
1 parent 48aa62c commit 3c0c41d

1 file changed

+66
-210
lines changed

docs/05-writing-scripts.md

Lines changed: 66 additions & 210 deletions
Original file line number | Diff line number | Diff line change
@@ -13,39 +13,20 @@
1313
- How can we automate a commonly used set of commands?
1414

1515

16-
<script language="javascript" type="text/javascript">
17-
function set_page_view_defaults() {
18-
document.getElementById('div_win').style.display = 'block';
19-
document.getElementById('div_unix').style.display = 'none';
20-
};
21-
22-
function change_content_by_platform(form_control){
23-
if (!form_control || document.getElementById(form_control).value == 'win') {
24-
set_page_view_defaults();
25-
} else if (document.getElementById(form_control).value == 'unix') {
26-
document.getElementById('div_win').style.display = 'none';
27-
document.getElementById('div_unix').style.display = 'block';
28-
} else {
29-
alert("Error: Missing platform value for 'change_content_by_platform()' script!");
30-
}
31-
}
32-
33-
window.onload = set_page_view_defaults;
34-
</script>
3516

3617
## Writing files
3718

3819
We've been able to do a lot of work with files that already exist, but what if we want to write our own files? We're not going to type in a FASTA file, but as we'll see in other tutorials, there are many reasons to write a new file or edit an existing one.
3920

40-
To add text to files, we're going to use a text editor called Nano. We're going to create a file to take notes about what we've been doing with the data files in `~/obss_2023/commandline/shell_data/untrimmed_fastq`.
21+
To add text to files, we're going to use a text editor called Nano. We're going to create a file to take notes about what we've been doing with the data files in `~/shell_data/untrimmed_fastq`.
4122

4223
This is good practice when working in bioinformatics. We can create a file called `README.txt` that describes the data files in the directory or documents how the files in that directory were generated. As the name suggests, it's a file that we or others should read to understand the information in that directory.
4324

44-
Let's change our working directory to `~/obss_2023/commandline/shell_data/untrimmed_fastq` using `cd`,
25+
Let's change our working directory to `~/shell_data/untrimmed_fastq` using `cd`,
4526
then run `nano` to create a file called `README.txt`:
4627

4728
```bash
48-
$ cd ~/obss_2023/commandline/shell_data/untrimmed_fastq
29+
$ cd ~/shell_data/untrimmed_fastq
4930
$ nano README.txt
5031
```
5132

@@ -55,7 +36,7 @@ You should see something like this:
5536

5637
The text at the bottom of the screen shows the keyboard shortcuts for performing various tasks in `nano`. We will talk more about how to interpret this information soon.
5738

58-
::::::::::::::::::::::::::::::::::::::::: callout
39+
5940

6041
## Which Editor?
6142

@@ -80,7 +61,7 @@ your computer's start menu, the editor may want to save files in your desktop or
8061
documents directory instead. You can change this by navigating to
8162
another directory the first time you "Save As..."
8263

83-
::::::::::::::::::::::::::::::::::::::::::::::::::
64+
8465

8566
Let's type in a few lines of text. Describe what the files in this
8667
directory are or what you've been doing with them.
@@ -91,7 +72,7 @@ press <kbd>Return</kbd> to accept the suggested default of `README.txt`.
9172
Once our file is saved, we can use <kbd>Ctrl</kbd>\-<kbd>X</kbd> to quit the `nano` editor and
9273
return to the shell.
9374

94-
::::::::::::::::::::::::::::::::::::::::: callout
75+
9576

9677
## Control, Ctrl, or ^ Key
9778

@@ -111,26 +92,19 @@ In `nano`, along the bottom of the screen you'll see `^G Get Help ^O WriteOut`.
11192
This means that you can use <kbd>Ctrl</kbd>\-<kbd>G</kbd> to get help and <kbd>Ctrl</kbd>\-<kbd>O</kbd> to save your
11293
file.
11394

114-
::::::::::::::::::::::::::::::::::::::::::::::::::
115-
116-
Now you've written a file. You can take a look at it with `less` or `cat`, or open it up again and edit it with `nano`.
117-
118-
::::::::::::::::::::::::::::::::::::::: challenge
11995

120-
## Exercise
12196

122-
Open `README.txt` and add the date to the top of the file and save the file.
97+
Now you've written a file. You can take a look at it with `less` or `cat`, or open it up again and edit it with `nano`.
12398

124-
::::::::::::::: solution
12599

126-
## Solution
100+
!!! dumbbell "Exercise"
127101

128-
Use `nano README.txt` to open the file.
129-
Add today's date and then use <kbd>Ctrl</kbd>\-<kbd>X</kbd> followed by `y` and <kbd>Enter</kbd> to save.
102+
Open `README.txt` and add the date to the top of the file and save the file.
130103

131-
:::::::::::::::::::::::::
104+
??? success "Solution"
132105

133-
::::::::::::::::::::::::::::::::::::::::::::::::::
106+
Use `nano README.txt` to open the file.
107+
Add today's date and then use <kbd>Ctrl</kbd>\-<kbd>X</kbd> followed by `y` and <kbd>Enter</kbd> to save.
134108

135109
## Writing scripts
136110

@@ -140,92 +114,97 @@ One thing we will commonly want to do with sequencing results is pull out bad re
140114

141115
We're going to create a new file to put this command in. We'll call it `bad-reads-script.sh`. The `sh` isn't required, but using that extension tells us that it's a shell script.
142116

143-
```bash
144-
$ nano bad-reads-script.sh
145-
```
117+
!!! terminal "code"
118+
119+
```bash
120+
$ nano bad-reads-script.sh
121+
```
146122

147123
Bad reads have a lot of N's, so we're going to look for `NNNNNNNNNN` with `grep`. We want the whole FASTQ record, so we're also going to get the one line above the sequence and the two lines below. We also want to look in all the files that end with `.fastq`, so we're going to use the `*` wildcard.
148124

149-
```bash
150-
grep -B1 -A2 -h NNNNNNNNNN *.fastq | grep -v '^--' > scripted_bad_reads.txt
151-
```
125+
!!! terminal "code"
126+
127+
```bash
128+
grep -B1 -A2 -h NNNNNNNNNN *.fastq | grep -v '^--' > scripted_bad_reads.txt
129+
```
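To see what each option in this pipeline contributes, here is a minimal sketch that runs the same `grep` command on made-up data in a temporary directory. The file names (`a.fastq`, `b.fastq`, `demo_bad_reads.txt`) and the reads themselves are hypothetical and exist only for illustration.

```bash
# Work in a throwaway directory so the real data files are untouched.
workdir=$(mktemp -d)
cd "$workdir"

# a.fastq: one bad read (a run of N's) followed by one good read.
printf '%s\n' \
  '@read1' 'ACNNNNNNNNNNNNGT' '+' '!!!!!!!!!!!!!!!!' \
  '@read2' 'ACGTACGTACGTACGT' '+' 'IIIIIIIIIIIIIIII' > a.fastq

# b.fastq: a single bad read.
printf '%s\n' \
  '@read3' 'NNNNNNNNNNNNACGT' '+' '!!!!!!!!!!!!!!!!' > b.fastq

# -B1 keeps the header line above each match, -A2 keeps the two lines below,
# -h drops the file-name prefix, and grep -v '^--' removes the separator
# lines that grep places between groups of context lines.
grep -B1 -A2 -h NNNNNNNNNN *.fastq | grep -v '^--' > demo_bad_reads.txt

cat demo_bad_reads.txt
```

Only the records for `@read1` and `@read3` should end up in the output file, each as a complete four-line FASTQ record.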
130+
152131

153-
::::::::::::::::::::::::::::::::::::::::: callout
154132

155133
## Custom `grep` control
156134

157135
We introduced the `-v` option in [the previous episode](04-redirection.md). Now we
158136
are using `-h` to "Suppress the prefixing of file names on output" according to the documentation shown by `man grep`.
159137

160-
::::::::::::::::::::::::::::::::::::::::::::::::::
138+
161139

162140
Type your `grep` command into the file and save it as before. Be careful not to include the `$` at the beginning of the line.
163141

164142
Now comes the neat part. We can run this script. Type:
165143

166-
```bash
167-
$ bash bad-reads-script.sh
168-
```
169-
170-
It will look like nothing happened, but now if you look at `scripted_bad_reads.txt`, you can see that there are now reads in the file.
144+
!!! terminal "code"
171145

172-
::::::::::::::::::::::::::::::::::::::: challenge
146+
```bash
147+
$ bash bad-reads-script.sh
148+
```
173149

174-
## Exercise
150+
It will look like nothing happened, but now if you look at `scripted_bad_reads.txt`, you can see that there are now reads in the file.
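Because the script prints nothing, it can help to confirm that it really wrote some reads. The sketch below is one way to do that, again using made-up data in a temporary directory; the names `demo.fastq`, `demo-bad-reads-script.sh` and `demo_scripted_bad_reads.txt` are hypothetical stand-ins for the real files.

```bash
# Work in a throwaway directory with one hypothetical bad read.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' '@read1' 'NNNNNNNNNNNNACGT' '+' '!!!!!!!!!!!!!!!!' > demo.fastq

# Write a one-line script containing the same grep pipeline as the lesson.
cat > demo-bad-reads-script.sh <<'EOF'
grep -B1 -A2 -h NNNNNNNNNN *.fastq | grep -v '^--' > demo_scripted_bad_reads.txt
EOF

# Run the script by passing it to bash, as in the lesson.
bash demo-bad-reads-script.sh

# [ -s FILE ] is true when the file exists and is not empty.
if [ -s demo_scripted_bad_reads.txt ]; then
    lines=$(wc -l < demo_scripted_bad_reads.txt)
    # Each FASTQ record is four lines long.
    echo "bad reads found: $((lines / 4))"
else
    echo "no bad reads were written"
fi
```

On the real data, `wc -l scripted_bad_reads.txt` or `head -n 4 scripted_bad_reads.txt` gives the same kind of quick check.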
175151

176-
We want the script to tell us when it's done.
177152

178-
1. Open `bad-reads-script.sh` and add the line `echo "Script finished!"` after the `grep` command and save the file.
179-
2. Run the updated script.
153+
!!! dumbbell "Exercise"
180154

181-
::::::::::::::: solution
155+
We want the script to tell us when it's done.
156+
157+
1. Open `bad-reads-script.sh` and add the line `echo "Script finished!"` after the `grep` command and save the file.
158+
2. Run the updated script.
182159

183-
## Solution
160+
??? success "Solution"
184161

185-
```
186-
$ bash bad-reads-script.sh
187-
Script finished!
188-
```
162+
```
163+
$ bash bad-reads-script.sh
164+
Script finished!
165+
```
189166

190-
:::::::::::::::::::::::::
191167

192-
::::::::::::::::::::::::::::::::::::::::::::::::::
193168

194169
## Making the script into a program
195170

196171
We had to type `bash` because we needed to tell the computer what program to use to run this script. Instead, we can turn this script into its own program. We need to tell the computer that this script is a program by making the script file executable. We can do this by changing the file permissions. We talked about permissions in [an earlier episode](03-working-with-files.md).
197172

198-
First, let's look at the current permissions.
173+
!!! terminal-2 "First, let's look at the current permissions."
199174

200-
```bash
201-
$ ls -l bad-reads-script.sh
202-
```
175+
```bash
176+
$ ls -l bad-reads-script.sh
177+
```
203178

204-
```output
205-
-rw-rw-r-- 1 dcuser dcuser 0 Oct 25 21:46 bad-reads-script.sh
206-
```
179+
```output
180+
-rw-rw-r-- 1 dcuser dcuser 0 Oct 25 21:46 bad-reads-script.sh
181+
```
207182

208183
We see that it says `-rw-rw-r--`. This shows that the file can be read by any user and written to by the file owner (you) and by members of the file's group. We want to change these permissions so that the file can be executed as a program. We use the command `chmod` like we did earlier when we removed write permissions. Here we are adding (`+`) executable permissions (`x`), written together as `+x`.
209184

210-
```bash
211-
$ chmod +x bad-reads-script.sh
212-
```
185+
!!! terminal "code"
213186

214-
Now let's look at the permissions again.
187+
```bash
188+
$ chmod +x bad-reads-script.sh
189+
```
215190

216-
```bash
217-
$ ls -l bad-reads-script.sh
218-
```
191+
!!! terminal-2 "Now let's look at the permissions again."
219192

220-
```output
221-
-rwxrwxr-x 1 dcuser dcuser 0 Oct 25 21:46 bad-reads-script.sh
222-
```
193+
```bash
194+
$ ls -l bad-reads-script.sh
195+
```
196+
197+
```output
198+
-rwxrwxr-x 1 dcuser dcuser 0 Oct 25 21:46 bad-reads-script.sh
199+
```
223200

224201
Now we see that it says `-rwxrwxr-x`. The `x`'s that are there now tell us we can run it as a program. So, let's try it! We'll need to put `./` at the beginning so the computer knows to look here in this directory for the program.
225202

226-
```bash
227-
$ ./bad-reads-script.sh
228-
```
203+
!!! terminal "code"
204+
205+
```bash
206+
$ ./bad-reads-script.sh
207+
```
229208

230209
The script should run the same way as before, but now we've created our very own computer program!
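The whole sequence (check permissions, add execute permission, run with `./`) can be tried safely on a tiny stand-in script. This is a minimal sketch: `demo-script.sh` is a hypothetical file created in a temporary directory, and it assumes that directory allows programs to be executed.

```bash
# Work in a throwaway directory with a one-line stand-in script.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'echo "Script finished!"' > demo-script.sh

# [ -x FILE ] is true when the file is executable by the current user.
if [ -x demo-script.sh ]; then
    echo "already executable"
else
    echo "not executable yet"
fi

# Add execute permission, then look at the permission string again.
chmod +x demo-script.sh
ls -l demo-script.sh

# ./ tells the shell to look for the program in the current directory.
result=$(./demo-script.sh)
echo "$result"
```

After `chmod +x`, the `ls -l` listing should show `x` in the permission string, and the script runs without typing `bash` first.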
231210

@@ -314,133 +293,10 @@ command line belongs to. So, if you are logged into AWS on the command line and
314293
the `curl` command above in the AWS terminal, the file will be downloaded to your AWS
315294
machine, not your local one.
316295

317-
### Moving files between your laptop and NeSI with Jupyterhub
318-
319-
With Jupyterhub on NeSI, one of the easiest ways to move small-to-medium sized files is to use the upload option on the file explorer panel
320-
321-
![](fig/nesi_images/upload.png)
322-
323-
And to download a file, you can right click on it in the explorer panel and select "Download"
324-
325-
![](fig/nesi_images/download.png)
326-
327-
::::::::::::::::::::::::::::::::::::::::: callout
328-
329-
**Original instructions for downloading data from AWS**
330-
331-
### Moving files between your laptop and your instance - AWS
332-
333-
What if the data you need is on your local computer, but you need to get it _into_ the
334-
cloud? There are also several ways to do this, but it's _always_ easier
335-
to start the transfer locally. **This means if you're typing into a terminal, the terminal
336-
should not be logged into your instance, it should be showing your local computer. If you're
337-
using a transfer program, it needs to be installed on your local machine, not your instance.**
338-
339-
## Transferring Data Between your Local Machine and the Cloud
340-
341-
If you're using Windows with PuTTY instead of Git Bash, please select the alternative option here:
342-
<select id="id_platform" name="platformlist" onchange="change_content_by_platform('id_platform');return false;">
343-
344-
<option value="unix" id="id_unix" selected> Linux, Mac OS, Git Bash </option>
345-
<option value="win" id="id_win"> PuTTY </option>
346-
</select>
347-
348-
<div id="div_unix" style="display:block" markdown="1">
349-
350-
### Uploading Data to your Virtual Machine with scp
351-
352-
`scp` stands for 'secure copy protocol', and is a widely used UNIX tool for moving files
353-
between computers. The simplest way to use `scp` is to run it in your local terminal,
354-
and use it to copy a single file:
355-
356-
```bash
357-
scp <file I want to move> <where I want to move it>
358-
```
359-
360-
Note that you are always running `scp` locally, but that _doesn't_ mean that
361-
you can only move files from your local computer. In order to move a file from your local computer to an AWS instance, the command would look like this:
362-
363-
```bash
364-
$ scp <local file> <AWS instance>
365-
```
366-
367-
To move it back to your local computer, you re-order the `to` and `from` fields:
368-
369-
```bash
370-
$ scp <AWS instance> <local file>
371-
```
372-
373-
#### Uploading Data to your Virtual Machine with scp
374-
375-
Open the terminal and use the `scp` command to upload a file (e.g. local_file.txt) to the dcuser home directory:
376-
377-
```bash
378-
$ scp local_file.txt [email protected]:/home/dcuser/
379-
```
380-
381-
#### Downloading Data from your Virtual Machine with scp
382-
383-
Let's download a text file from our remote machine. You should have a file that contains bad reads called `~/shell_data/untrimmed_fastq/scripted_bad_reads.txt`.
384-
385-
**Tip:** If you are looking for another (or any really) text file in your home directory to use instead, try:
386-
387-
```bash
388-
$ find ~ -name '*.txt'
389-
```
390-
391-
Download the bad reads file in `~/shell_data/untrimmed_fastq/scripted_bad_reads.txt` to your local `~/Downloads` directory using the following command **(make sure you substitute [[email protected]](mailto:[email protected]) with your remote login credentials)**:
392-
393-
```bash
394-
$ scp [email protected]:/home/dcuser/shell_data/untrimmed_fastq/scripted_bad_reads.txt ~/Downloads
395-
```
396-
397-
Remember that in both instances, the command is run from your local machine, we've just flipped the order of the to and from parts of the command.
398-
399-
</div>
400-
401-
<div id="div_win" style="display:block" markdown="1">
402-
403-
### Uploading Data to your Virtual Machine with PSCP
404-
405-
If you're using a Windows PC without Git Bash, we recommend you use the _PSCP_ program.
406-
This program is from the same suite of tools as the PuTTY program we have been using to connect.
407-
408-
1. If you haven't done so, download pscp from [http://the.earth.li/~sgtatham/putty/latest/x86/pscp.exe](https://the.earth.li/~sgtatham/putty/latest/x86/pscp.exe)
409-
2. Make sure the _PSCP_ program is somewhere you know on your computer. In this case,
410-
your Downloads folder is appropriate.
411-
3. Open the windows [PowerShell](https://en.wikipedia.org/wiki/Windows_PowerShell);
412-
go to your start menu/search enter the term **'cmd'**; you will be able to start the shell
413-
(the shell should start from C:\\Users\\your-pc-username>).
414-
4. Change to the Downloads directory:
415-
416-
```bash
417-
> cd Downloads
418-
```
419-
420-
5. Locate a file on your computer that you wish to upload (be sure you know the path). Then upload it to your remote machine **(you will need to know your AMI instance address (which starts with ec2), and login credentials)**. You will be prompted to enter a password, and then your upload will begin. **(make sure you substitute 'your-pc-username' for your actual pc username and 'ec2-54-88-126-85.compute-1.amazonaws.com' with your AMI instance address)**
421-
422-
```bash
423-
C:\Users\your-pc-username\Downloads> pscp.exe local_file.txt [email protected]:/home/dcuser/
424-
```
425-
426-
### Downloading Data from your Virtual Machine with PSCP
427-
428-
1. Follow the instructions in the Upload section to download (if needed) and access the _PSCP_ program (steps 1-3)
429-
2. Download the text file to your current working directory (represented by a .) using the following command **(make sure you substitute 'your-pc-username' for your actual pc username and 'ec2-54-88-126-85.compute-1.amazonaws.com' with your AMI instance address)**
430-
431-
```bash
432-
C:\Users\your-pc-username\Downloads> pscp.exe [email protected]:/home/dcuser/shell_data/untrimmed_fastq/scripted_bad_reads.txt .
433-
434-
C:\Users\your-pc-username\Downloads
435-
```
436-
437-
</div>
438296

439-
::::::::::::::::::::::::::::::::::::::::::::::::::
297+
!!! graduation-cap "keypoints"
440298

441-
:::::::::::::::::::::::::::::::::::::::: keypoints
299+
- Scripts are a collection of commands executed together.
300+
- Transferring information to and from virtual and local computers.
442301

443-
- Scripts are a collection of commands executed together.
444-
- Transferring information to and from virtual and local computers.
445302

446-
::::::::::::::::::::::::::::::::::::::::::::::::::
