| 1 | +%% Machine Learning Online Class |
| 2 | +% Exercise 7 | Principal Component Analysis and K-Means Clustering |
| 3 | +% |
| 4 | +% Instructions |
| 5 | +% ------------ |
| 6 | +% |
| 7 | +% This file contains code that helps you get started on the |
| 8 | +% exercise. You will need to complete the following functions: |
| 9 | +% |
| 10 | +% pca.m |
| 11 | +% projectData.m |
| 12 | +% recoverData.m |
| 13 | +% computeCentroids.m |
| 14 | +% findClosestCentroids.m |
| 15 | +% kMeansInitCentroids.m |
| 16 | +% |
| 17 | +% For this exercise, you will not need to change any code in this file, |
| 18 | +% or any other files other than those mentioned above. |
| 19 | +% |
| 20 | + |
| 21 | +%% Initialization |
| 22 | +clear ; close all; clc |
| 23 | + |
| 24 | +%% ================= Part 1: Find Closest Centroids ==================== |
| 25 | +% To help you implement K-Means, we have divided the learning algorithm |
| 26 | +% into two functions -- findClosestCentroids and computeCentroids. In this |
| 27 | +% part, you should complete the code in the findClosestCentroids function. |
| 28 | +% |
| 29 | +fprintf('Finding closest centroids.\n\n'); |
| 30 | + |
| 31 | +% Load an example dataset that we will be using |
| 32 | +load('ex7data2.mat'); |
| 33 | + |
| 34 | +% Select an initial set of centroids |
| 35 | +K = 3; % 3 Centroids |
| 36 | +initial_centroids = [3 3; 6 2; 8 5]; |
| 37 | + |
| 38 | +% Find the closest centroids for the examples using the |
| 39 | +% initial_centroids |
| 40 | +idx = findClosestCentroids(X, initial_centroids); |
| 41 | + |
| 42 | +fprintf('Closest centroids for the first 3 examples: \n') |
| 43 | +fprintf(' %d', idx(1:3)); |
| 44 | +fprintf('\n(the closest centroids should be 1, 3, 2 respectively)\n'); |
| 45 | + |
| 46 | +fprintf('Program paused. Press enter to continue.\n'); |
| 47 | +pause; |
| 48 | + |
| 49 | +%% ===================== Part 2: Compute Means ========================= |
| 50 | +% After implementing the closest centroids function, you should now |
| 51 | +% complete the computeCentroids function. |
| 52 | +% |
| 53 | +fprintf('\nComputing centroid means.\n\n'); |
| 54 | + |
| 55 | +% Compute means based on the closest centroids found in the previous part. |
| 56 | +centroids = computeCentroids(X, idx, K); |
| 57 | + |
| 58 | +fprintf('Centroids computed after initial finding of closest centroids: \n') |
| 59 | +fprintf(' %f %f \n' , centroids'); |
| 60 | +fprintf('\n(the centroids should be\n'); |
| 61 | +fprintf(' [ 2.428301 3.157924 ]\n'); |
| 62 | +fprintf(' [ 5.813503 2.633656 ]\n'); |
| 63 | +fprintf(' [ 7.119387 3.616684 ])\n\n'); |
| 64 | + |
| 65 | +fprintf('Program paused. Press enter to continue.\n'); |
| 66 | +pause; |
| 67 | + |
| 68 | + |
| 69 | +%% =================== Part 3: K-Means Clustering ====================== |
| 70 | +% After you have completed the two functions computeCentroids and |
| 71 | +% findClosestCentroids, you have all the necessary pieces to run the |
| 72 | +% kMeans algorithm. In this part, you will run the K-Means algorithm on |
| 73 | +% the example dataset we have provided. |
| 74 | +% |
| 75 | +fprintf('\nRunning K-Means clustering on example dataset.\n\n'); |
| 76 | + |
| 77 | +% Load an example dataset |
| 78 | +load('ex7data2.mat'); |
| 79 | + |
| 80 | +% Settings for running K-Means |
| 81 | +K = 3; |
| 82 | +max_iters = 10; |
| 83 | + |
| 84 | +% For consistency, we set the centroids to specific values here, |
| 85 | +% but in practice you would generate them automatically, for example by |
| 86 | +% setting them to random examples (as can be seen in |
| 87 | +% kMeansInitCentroids). |
| 88 | +initial_centroids = [3 3; 6 2; 8 5]; |
| 89 | + |
| 90 | +% Run K-Means algorithm. The 'true' at the end tells our function to plot |
| 91 | +% the progress of K-Means |
| 92 | +[centroids, idx] = runkMeans(X, initial_centroids, max_iters, true); |
| 93 | +fprintf('\nK-Means Done.\n\n'); |
| 94 | + |
| 95 | +fprintf('Program paused. Press enter to continue.\n'); |
| 96 | +pause; |
| 97 | + |
| 98 | +%% ============= Part 4: K-Means Clustering on Pixels =============== |
| 99 | +% In this exercise, you will use K-Means to compress an image. To do this, |
| 100 | +% you will first run K-Means on the colors of the pixels in the image and |
| 101 | +% then you will map each pixel onto its closest centroid. |
| 102 | +% |
| 103 | +% You should now complete the code in kMeansInitCentroids.m |
| 104 | +% |
| 105 | + |
| 106 | +fprintf('\nRunning K-Means clustering on pixels from an image.\n\n'); |
| 107 | + |
| 108 | +% Load an image of a bird |
| 109 | +A = double(imread('bird_small.png')); |
| 110 | + |
| 111 | +% If imread does not work for you, you can try instead |
| 112 | +% load ('bird_small.mat'); |
| 113 | + |
| 114 | +A = A / 255; % Divide by 255 so that all values are in the range 0 - 1 |
| 115 | + |
| 116 | +% Size of the image |
| 117 | +img_size = size(A); |
| 118 | + |
| 119 | +% Reshape the image into an Nx3 matrix where N = number of pixels. |
| 120 | +% Each row will contain the Red, Green and Blue pixel values |
| 121 | +% This gives us our dataset matrix X that we will use K-Means on. |
| 122 | +X = reshape(A, img_size(1) * img_size(2), 3); |
| 123 | + |
| 124 | +% Run your K-Means algorithm on this data |
| 125 | +% You should try different values of K and max_iters here |
| 126 | +K = 16; |
| 127 | +max_iters = 10; |
| 128 | + |
| 129 | +% When using K-Means, it is important to initialize the centroids |
| 130 | +% randomly. |
| 131 | +% You should complete the code in kMeansInitCentroids.m before proceeding |
| 132 | +initial_centroids = kMeansInitCentroids(X, K); |
| 133 | + |
| 134 | +% Run K-Means |
| 135 | +[centroids, idx] = runkMeans(X, initial_centroids, max_iters); |
| 136 | + |
| 137 | +fprintf('Program paused. Press enter to continue.\n'); |
| 138 | +pause; |
| 139 | + |
| 140 | + |
| 141 | +%% ================= Part 5: Image Compression ====================== |
| 142 | +% In this part of the exercise, you will use the clusters of K-Means to |
| 143 | +% compress an image. To do this, we first find the closest centroid for |
| 144 | +% each example. After that, we map each pixel to the value of that centroid. |
| 145 | + |
| 146 | +fprintf('\nApplying K-Means to compress an image.\n\n'); |
| 147 | + |
| 148 | +% Find closest cluster members |
| 149 | +idx = findClosestCentroids(X, centroids); |
| 150 | + |
| 151 | +% Essentially, we have now represented the image X in terms of the |
| 152 | +% indices in idx. |
| 153 | + |
| 154 | +% We can now recover the image from the indices (idx) by mapping each pixel |
| 155 | +% (specified by its index in idx) to the centroid value |
| 156 | +X_recovered = centroids(idx,:); |
| 157 | + |
| 158 | +% Reshape the recovered image into proper dimensions |
| 159 | +X_recovered = reshape(X_recovered, img_size(1), img_size(2), 3); |
| 160 | + |
| 161 | +% Display the original image |
| 162 | +subplot(1, 2, 1); |
| 163 | +imagesc(A); |
| 164 | +title('Original'); |
| 165 | + |
| 166 | +% Display compressed image side by side |
| 167 | +subplot(1, 2, 2); |
| 168 | +imagesc(X_recovered) |
| 169 | +title(sprintf('Compressed, with %d colors.', K)); |
| 170 | + |
| 171 | + |
| 172 | +fprintf('Program paused. Press enter to continue.\n'); |
| 173 | +pause; |
| 174 | + |
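
The script leaves findClosestCentroids.m, computeCentroids.m and kMeansInitCentroids.m to be completed, and describes K-Means as an assignment step followed by a mean update, started from randomly chosen examples. The following is a minimal sketch of one such iteration on toy data, not the reference solution for the exercise. The data values, the variable names `random_centroids` and `new_centroids`, and the choice to keep the previous centroid when a cluster is empty are all assumptions made for illustration.

```matlab
% Toy data for illustration only (not ex7data2.mat): 5 points in 2-D.
X = [1 1; 1 2; 9 9; 8 9; 2 1];
K = 2;

% Random initialization: pick K distinct rows of X as starting centroids.
perm = randperm(size(X, 1));
random_centroids = X(perm(1:K), :);

% For a reproducible walk-through, continue from fixed centroids instead.
centroids = [1 1.5; 8.5 9];

% Assignment step: index of the nearest centroid for every example.
m = size(X, 1);
idx = zeros(m, 1);
for i = 1:m
    diffs = bsxfun(@minus, centroids, X(i, :));  % K x n differences
    [~, idx(i)] = min(sum(diffs .^ 2, 2));       % squared Euclidean distance
end

% Update step: each centroid becomes the mean of the examples assigned to it.
% Assumption: a cluster with no members keeps its previous centroid.
new_centroids = centroids;
for k = 1:K
    members = X(idx == k, :);
    if ~isempty(members)
        new_centroids(k, :) = mean(members, 1);
    end
end
```

Repeating the assignment and update steps for `max_iters` rounds is presumably what `runkMeans` does with the two completed functions. `bsxfun` is used rather than implicit expansion so the sketch also runs on older MATLAB releases.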
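
Part 5 rebuilds the image with `centroids(idx,:)` followed by `reshape`. The sketch below shows that indexing pattern on a hypothetical 2x2 image with a two-color palette; the palette and index values are made up for illustration and do not come from bird_small.png.

```matlab
% Hypothetical 2x2 image compressed to two colors (black and white).
centroids = [0 0 0; 1 1 1];   % K x 3 palette, one RGB row per centroid
idx = [1; 2; 2; 1];           % one centroid index per pixel, column-major order
img_size = [2 2 3];

% Indexing the palette with idx expands the indices back to N x 3 colors.
X_recovered = centroids(idx, :);

% reshape restores the rows x cols x 3 layout expected by imagesc.
img = reshape(X_recovered, img_size(1), img_size(2), 3);
disp(img(:, :, 1));   % red channel of the recovered image
```

Because `reshape` fills columns first, the pixel order in `idx` has to match the order produced when the image was flattened into X, which is the case in the script since both directions use `reshape`.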