|
| 1 | +<!doctype html> |
| 2 | +<html> |
| 3 | +<head> |
| 4 | + <title>Submarine</title> |
| 5 | + <link rel="stylesheet" type="text/css" href="../static/css/style.css" /> |
| 6 | + <link href='http://fonts.googleapis.com/css?family=Source+Sans+Pro:400,600,700' rel='stylesheet' type='text/css'> |
| 7 | + <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0"> |
| 8 | + <link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.5/styles/default.min.css"> |
| 9 | +</head> |
| 10 | +<body> |
| 11 | + <div class="wrapper"> |
| 12 | + |
| 13 | + <section class="content"> |
| 14 | + <p><a href="index.html">table of contents</a></p> |
| 15 | + <h2 id="27-working-with-columns">27: Working with columns</h2> |
| 16 | +<p>On your "command line adventures", you will encounter many files that are divided into columns, such as "csv" or "tsv" files.</p> |
| 17 | +<p>Fortunately, Unix has many tools to handle and manipulate this type of file.</p> |
| 18 | +<p>First, let's download a test file and look at its contents:</p> |
| 19 | +<pre><code class="lang-bash">cd |
| 20 | +curl https://raw.githubusercontent.com/Blahah/command_line_bootcamp/master/testfiles/grades.txt > grades.txt |
| 21 | +less grades.txt |
| 22 | +</code></pre> |
| 23 | +<p><code>curl</code> will download the contents of any URL you provide and print it to STDOUT. Since we want our test file on the filesystem, we redirect the output of <code>curl</code> to the file "grades.txt".</p> |
| 24 | +<p>As you can see, this file contains hypothetical grades for hypothetical characters. First of all, one character stands out: "Spock", as he aces every class. Let's extract his information:</p> |
| 25 | +<pre><code class="lang-bash">cut -f 5 grades.txt |
| 26 | +</code></pre> |
| 27 | +<p>This command prints every row of column 5 (<code>-f 5</code>), which contains the grades for "Spock", to STDOUT. By default, <code>cut</code> treats tabs as the column separator. Neat, huh?</p> |
| 28 | +<p>What else stands out here? "Luke" has a value of 150 where the maximum is 100. He's probably "forcing" that grade, and that's cheating. Speaking of cheaters, Malcom is a known cheater, and his scores of 50 on everything raise suspicions. Let's remove both these students from our file.</p> |
| 29 | +<pre><code class="lang-bash">cut -f -2,4-7,9 grades.txt > grades_no_cheaters.txt |
| 30 | +</code></pre> |
| 31 | +<p>Ok, there is a lot to take in here. First, the syntax of the <code>-f</code> argument: a "-" means "everything up to" when it is the first character, "everything between" when it sits between two values, and "everything after" when it is the last character. Individual columns and ranges are separated with ",". Here, <code>-2,4-7,9</code> keeps columns 1 to 2, 4 to 7, and 9, which drops columns 3 and 8. The <code>> grades_no_cheaters.txt</code> redirects the output into a new file.</p> |
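<p>The range forms accepted by <code>-f</code> can be tried out on a small throwaway file. This is only a sketch: the sample data and the variable names (<code>sample</code>, <code>up_to</code>, <code>between</code>, <code>after</code>, <code>combined</code>) are made up for illustration and are not part of "grades.txt".</p>

```shell
# Hypothetical tab-separated sample: a header row and one data row.
sample=$(mktemp)
printf 'a\tb\tc\td\te\n1\t2\t3\t4\t5\n' > "$sample"

# "-2": everything up to column 2.
up_to=$(cut -f -2 "$sample")
# "2-4": everything between columns 2 and 4, inclusive.
between=$(cut -f 2-4 "$sample")
# "4-": column 4 and everything after it.
after=$(cut -f 4- "$sample")
# Pieces joined with "," are combined into one selection.
combined=$(cut -f -2,5 "$sample")

printf '%s\n' "$up_to" "$between" "$after" "$combined"
rm -f "$sample"
```

<p>Note that <code>cut</code> always prints the selected columns in their original file order, no matter how they are listed after <code>-f</code>.</p>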
| 32 | +<p>Ok, so now let's add back the cheaters as the last columns of our grades file.</p> |
| 33 | +<pre><code class="lang-bash">cut -f 3,8 grades.txt | paste grades_no_cheaters.txt - > sorted_grades.txt |
| 34 | +</code></pre> |
| 35 | +<p>Easy, wasn't it? We cut the cheaters' columns out of the original file and "piped" them to <code>paste</code>, which placed them at the end of the grades file. The "-" here means "read from STDIN"; we could name another file instead to merge the contents of two files.</p> |
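<p>To see that "-" and a file name are interchangeable for <code>paste</code>, here is a minimal sketch using two made-up temporary files. The file contents and the variable names (<code>left</code>, <code>right</code>, <code>from_stdin</code>, <code>from_file</code>) are assumptions for illustration only.</p>

```shell
# Two hypothetical inputs: names with one grade, and an extra grade column.
left=$(mktemp)
right=$(mktemp)
printf 'kirk\t80\nspock\t100\n' > "$left"
printf '70\n100\n' > "$right"

# Second input arrives on STDIN, which paste reads through "-".
from_stdin=$(cut -f 1 "$right" | paste "$left" -)
# Same merge with the second file named directly.
from_file=$(paste "$left" "$right")

printf '%s\n' "$from_stdin"
rm -f "$left" "$right"
```

<p>Both forms join the inputs line by line with a tab between them, so the two results should be identical.</p>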
| 36 | +<p>There you have it. Now all you have to do is read <code>sorted_grades.txt</code> and figure out what to do with the cheating students.</p> |
| 37 | + |
| 38 | + <div class="navlinks"> |
| 39 | + <a class="nextp button" href="28_combining_commands.html">OK, next! »</a> |
| 40 | + <a class="lastp navlink" href="26_matching_lines.html">« back</a> |
| 41 | + </div> |
| 42 | + </section> |
| 43 | + </div> |
| 44 | + <script src="//code.jquery.com/jquery-1.11.3.min.js"></script> |
| 45 | + <script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/8.5/highlight.min.js"></script> |
| 46 | + <script> |
| 47 | + function neaten(str) { |
| 48 | + var pieces = str.split('_'); |
| 49 | + var s = pieces[1]; |
| 50 | + s = s.charAt(0).toUpperCase() + s.slice(1); |
| 51 | + pieces[1] = s; |
| 52 | + return pieces.join(' '); |
| 53 | + } |
| 54 | + $('.toc-list li a').text(function() { |
| 55 | + return neaten($(this).text()); |
| 56 | + }); |
| 57 | + </script> |
| 58 | +</body> |
| 59 | +</html> |