@@ -18,7 +18,7 @@ public static class PermutationFeatureImportanceExtensions
18
18
/// <remarks>
19
19
/// <para>
20
20
/// Permutation feature importance (PFI) is a technique to determine the global importance of features in a trained
21
- /// machine learning model. PFI is a simple yet powerul technique motivated by Breiman in his Random Forest paper, section 10
21
+ /// machine learning model. PFI is a simple yet powerful technique motivated by Breiman in his Random Forest paper, section 10
22
22
/// (Breiman. <a href='https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf'>"Random Forests."</a> Machine Learning, 2001.)
23
23
/// The advantage of the PFI method is that it is model agnostic -- it works with any model that can be
24
24
/// evaluated -- and it can use any dataset, not just the training set, to compute feature importance metrics.
@@ -33,7 +33,7 @@ public static class PermutationFeatureImportanceExtensions
33
33
/// </para>
34
34
/// <para>
35
35
/// In this implementation, PFI computes the change in all possible regression evaluation metrics for each feature, and an
36
- /// <code>ImmutableArray</code> of <code>RegressionEvaluator.Result </code> objects is returned. See the sample below for an
36
+ /// <code>ImmutableArray</code> of <code>RegressionMetrics </code> objects is returned. See the sample below for an
37
37
/// example of working with these results to analyze the feature importance of a model.
38
38
/// </para>
39
39
/// </remarks>
@@ -85,10 +85,37 @@ private static RegressionMetrics RegressionDelta(
85
85
}
86
86
87
87
/// <summary>
88
- /// Permutation Feature Importance is a technique that calculates how much each feature 'matters' to the predictions.
89
- /// Namely, how much the model's predictions will change if we randomly permute the values of one feature across the evaluation set.
90
- /// If the quality doesn't change much, this feature is not very important. If the quality drops drastically, this was a really important feature.
88
+ /// Permutation Feature Importance (PFI) for Binary Classification
91
89
/// </summary>
90
+ /// <remarks>
91
+ /// <para>
92
+ /// Permutation feature importance (PFI) is a technique to determine the global importance of features in a trained
93
+ /// machine learning model. PFI is a simple yet powerful technique motivated by Breiman in his Random Forest paper, section 10
94
+ /// (Breiman. <a href='https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf'>"Random Forests."</a> Machine Learning, 2001.)
95
+ /// The advantage of the PFI method is that it is model agnostic -- it works with any model that can be
96
+ /// evaluated -- and it can use any dataset, not just the training set, to compute feature importance metrics.
97
+ /// </para>
98
+ /// <para>
99
+ /// PFI works by taking a labeled dataset, choosing a feature, and permuting the values
100
+ /// for that feature across all the examples, so that each example now has a random value for the feature and
101
+ /// the original values for all other features. The evaluation metric (e.g. AUC or R-squared) is then calculated
102
+ /// for this modified dataset, and the change in the evaluation metric from the original dataset is computed.
103
+ /// The larger the change in the evaluation metric, the more important the feature is to the model.
104
+ /// PFI works by performing this permutation analysis across all the features of a model, one after another.
105
+ /// </para>
106
+ /// <para>
107
+ /// In this implementation, PFI computes the change in all possible binary classification evaluation metrics for each feature, and an
108
+ /// <code>ImmutableArray</code> of <code>BinaryClassificationMetrics</code> objects is returned. See the sample below for an
109
+ /// example of working with these results to analyze the feature importance of a model.
110
+ /// </para>
111
+ /// </remarks>
112
+ /// <example>
113
+ /// <format type="text/markdown">
114
+ /// <![CDATA[
116
+ /// ]]>
117
+ /// </format>
118
+ /// </example>
92
119
/// <param name="ctx">The binary classification context.</param>
93
120
/// <param name="model">The model to evaluate.</param>
94
121
/// <param name="data">The evaluation data set.</param>
0 commit comments