Merge branch 'master' of github.com:cs231n/cs231n.github.io

subhasis256 · subhasis256 · commit 512d2747d83c · 2016-01-13T23:36:50.000-08:00
diff --git a/assignments2016/assignment1.md b/assignments2016/assignment1.md
@@ -17,18 +17,20 @@ In this assignment you will practice putting together a simple image classificat
 - get a basic understanding of performance improvements from using **higher-level representations** than raw pixels (e.g. color histograms, Histogram of Gradient (HOG) features)
 
 ## Setup
-You can work on the assignment in one of two ways: locally on your own machine, or on a virtual machine
-through [Terminal](https://www.terminal.com/).
+You can work on the assignment in one of two ways: locally on your own machine, or on a virtual machine through Terminal.com. 
+
+### Working in the cloud on Terminal
+
+Terminal has created a separate subdomain to serve our class, [www.stanfordterminalcloud.com](https://www.stanfordterminalcloud.com). Register your account there. The Assignment 1 snapshot can then be found [here](https://www.stanfordterminalcloud.com/snapshot/49f5a1ea15dc424aec19155b3398784d57c55045435315ce4f8b96b62819ef65). If you're registered in the class you can contact the TA (see Piazza for more information) to request Terminal credits for use on the assignment. Once you boot up the snapshot everything will be installed for you, and you'll be ready to start on your assignment right away. We've written a small tutorial on Terminal [here](/terminal-tutorial).
 
 ### Working locally
-Get the code [here](http://vision.stanford.edu/teaching/cs231n/winter1516_assignment1.zip)
+Get the code as a zip file [here](http://vision.stanford.edu/teaching/cs231n/winter1516_assignment1.zip). As for the dependencies:
 
-**[Optional] virtual environment:**
-Once you have unzipped the starter code, you might want to create a
-[virtual environment](http://docs.python-guide.org/en/latest/dev/virtualenvs/)
-for the project. If you choose not to use a virtual environment, it is up to you
-to make sure that all dependencies for the code are installed on your machine.
-To set up a virtual environment, run the following:
+**[Option 1] Use Anaconda:**
+The preferred approach for installing all the assignment dependencies is to use [Anaconda](https://www.continuum.io/downloads), which is a Python distribution that includes many of the most popular Python packages for science, math, engineering and data analysis. Once you install it you can skip all mentions of requirements and you're ready to go directly to working on the assignment.
+
+**[Option 2] Manual install, virtual environment:**
+If you'd rather take a more manual (and riskier) installation route instead of using Anaconda, you will likely want to create a [virtual environment](http://docs.python-guide.org/en/latest/dev/virtualenvs/) for the project. If you choose not to use a virtual environment, it is up to you to make sure that all dependencies for the code are installed globally on your machine. To set up a virtual environment, run the following:
 
 ```bash
 cd assignment1
@@ -57,14 +59,11 @@ After you have the CIFAR-10 data, you should start the IPython notebook server f
 **NOTE:** If you are working in a virtual environment on OSX, you may encounter
 errors with matplotlib due to the [issues described here](http://matplotlib.org/faq/virtualenv_faq.html). You can work around this issue by starting the IPython server using the `start_ipython_osx.sh` script from the `assignment1` directory; the script assumes that your virtual environment is named `.env`.
 
-### Working on Terminal
-We will create a Terminal snapshot that is preconfigured for this assignment. Terminal allows you to work on the assignment from your browser. You can find a tutorial on how to use it [here](/terminal-tutorial).
-
 ### Submitting your work:
 Whether you work on the assignment locally or using Terminal, once you are done
 working run the `collectSubmission.sh` script; this will produce a file called
 `assignment1.zip`. Upload this file to your dropbox on
-[the coursework](https://coursework.stanford.edu/portal/site/W15-CS-231N-01/)
+[the coursework](https://coursework.stanford.edu/portal/site/W16-CS-231N-01/)
 page for the course.
 
 ### Q1: k-Nearest Neighbor classifier (20 points)
diff --git a/index.html b/index.html
@@ -22,6 +22,7 @@
       </a>
     </div>
 
+    <!--
     <div class="module-header">Winter 2015 Assignments</div>
 
     <div class="materials-item">
@@ -41,6 +42,7 @@
         Assignment #3: ConvNets II, Transfer Learning, Visualization
       </a>
     </div>
+  -->
 
     <div class="module-header">Module 0: Preparation</div>
 
diff --git a/linear-classify.md b/linear-classify.md
@@ -115,19 +115,19 @@ For example, going back to the example image of a cat and its scores for the cla
 
 There are several ways to define the details of the loss function. As a first example we will develop a commonly used loss called the **Multiclass Support Vector Machine** (SVM) loss. The SVM loss is set up so that the SVM "wants" the correct class for each image to have a score higher than the incorrect classes by some fixed margin \\(\Delta\\). Notice that it's sometimes helpful to anthropomorphise the loss functions as we did above: The SVM "wants" a certain outcome in the sense that the outcome would yield a lower loss (which is good).
 
-Let's now get more precise. Recall that for the i-th example we are given the pixels of image \\( x\_i \\) and the label \\( y\_i \\) that specifies the index of the correct class. The score function takes the pixels and computes the vector \\( f(x\_i, W) \\) of class scores.  For example, the score for the j-th class is the j-th element: \\( f(x\_i, W)\_j \\). The Multiclass SVM loss for the i-th example is then formalized as follows:
+Let's now get more precise. Recall that for the i-th example we are given the pixels of image \\( x\_i \\) and the label \\( y\_i \\) that specifies the index of the correct class. The score function takes the pixels and computes the vector \\( f(x\_i, W) \\) of class scores, which we will abbreviate to \\(s\\) (short for scores).  For example, the score for the j-th class is the j-th element: \\( s\_j = f(x\_i, W)\_j \\). The Multiclass SVM loss for the i-th example is then formalized as follows:
 
 $$
-L\_i = \sum\_{j\neq y\_i} \max(0, f(x\_i, W)\_j - f(x\_i, W)\_{y\_i} + \Delta)
+L\_i = \sum\_{j\neq y\_i} \max(0, s\_j - s\_{y\_i} + \Delta)
 $$
 
-**Example.** This expression may seem daunting if you're seeing it for the first time, so lets unpack it with an example to see how it works. Suppose that we have three classes that receive the scores \\(f(x\_i, W) = [13, -7, 11]\\), and that the first class is the true class (i.e. \\(y\_i = 0\\)). Also assume that \\(\Delta\\) (a hyperparameter we will go into more detail about soon) is 10. The expression above sums over all incorrect classes (\\(j \neq y\_i\\)), so we get two terms:
+**Example.** Let's unpack this with an example to see how it works. Suppose that we have three classes that receive the scores \\( s = [13, -7, 11]\\), and that the first class is the true class (i.e. \\(y\_i = 0\\)). Also assume that \\(\Delta\\) (a hyperparameter we will go into more detail about soon) is 10. The expression above sums over all incorrect classes (\\(j \neq y\_i\\)), so we get two terms:
 
 $$
 L\_i = \max(0, -7 - 13 + 10) + \max(0, 11 - 13 + 10)
 $$
 
-You can see that the first term gives zero since [-7 - 13 + 10] gives a negative number, which is then thresholded to zero with the \\(max(0,-)\\) function. We get zero loss for this pair because the correct class score (13) was greater than the incorrect class score (-7) by at least the margin 10. In fact the difference was 20, which is much greater than 10 but the SVM only cares that the difference is at least 10; Any additional difference above the margin is clamped at zero with the max operation. The second term computes [11 - 13 + 10] which gives 8. That is, even though the correct class had a higher score than the incorrect class (13 > 11), it was not greater by the desired margin of 10. The difference was only 2, which is why the loss comes out to 8 (i.e. how much higher the difference would have to be to meet the margin). In summary, the SVM loss function wants the score of the correct class \\(y\_i\\) to be larger than the incorrect class scores by at least by \\(\Delta\\) (delta). If this is not the case, we will accumulate loss (and that's bad).
+You can see that the first term gives zero since [-7 - 13 + 10] gives a negative number, which is then thresholded to zero with the \\(\max(0,-)\\) function. We get zero loss for this pair because the correct class score (13) was greater than the incorrect class score (-7) by at least the margin 10. In fact the difference was 20, which is much greater than 10, but the SVM only cares that the difference is at least 10; any additional difference above the margin is clamped at zero with the max operation. The second term computes [11 - 13 + 10] which gives 8. That is, even though the correct class had a higher score than the incorrect class (13 > 11), it was not greater by the desired margin of 10. The difference was only 2, which is why the loss comes out to 8 (i.e. how much higher the difference would have to be to meet the margin). In summary, the SVM loss function wants the score of the correct class \\(y\_i\\) to be larger than the incorrect class scores by at least \\(\Delta\\) (delta). If this is not the case, we will accumulate loss.
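
The worked example above can be checked with a few lines of plain Python. This is only a minimal sketch: the function name `svm_loss_single` and its signature are illustrative assumptions, not part of the course code, and it handles a single example with an explicit loop rather than a vectorized implementation.

```python
def svm_loss_single(scores, correct_class, delta=10.0):
    """Multiclass SVM loss L_i for one example.

    scores        -- list of class scores s (hypothetical input format)
    correct_class -- index y_i of the true class
    delta         -- the margin; 10 here only to match the example above
    """
    correct_score = scores[correct_class]
    loss = 0.0
    for j, s_j in enumerate(scores):
        if j == correct_class:
            continue  # the sum runs over incorrect classes only (j != y_i)
        # Each incorrect class contributes max(0, s_j - s_{y_i} + delta).
        loss += max(0.0, s_j - correct_score + delta)
    return loss


# Scores from the example: s = [13, -7, 11], true class y_i = 0, delta = 10.
loss = svm_loss_single([13, -7, 11], correct_class=0, delta=10.0)
print(loss)  # 8.0
```

The first incorrect class contributes 0 and the second contributes 8, matching the two terms computed by hand.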
 
 Note that in this particular module we are working with linear score functions ( \\( f(x\_i; W) =  W x\_i \\) ), so we can also rewrite the loss function in this equivalent form: