From 7bba201fe142220ef0e4352424e5bd1994865290 Mon Sep 17 00:00:00 2001
From: Brad Miller
Date: Thu, 10 Oct 2024 16:46:37 -0700
Subject: [PATCH] header fixes

---
 documentation/under-the-hood/ranking-notes.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/documentation/under-the-hood/ranking-notes.md b/documentation/under-the-hood/ranking-notes.md
index 43aaba4a..4fc62d3e 100644
--- a/documentation/under-the-hood/ranking-notes.md
+++ b/documentation/under-the-hood/ranking-notes.md
@@ -90,13 +90,13 @@ Additionally, because the matrix factorization is re-trained from scratch every
 
 While the matrix factorization approach above has many nice properties, it doesn't give us a natural built-in way to estimate the uncertainty of its parameters. We take two approaches to model uncertainty:
 
-#### Pseudo-rating sensitivity analysis
+**Pseudo-rating sensitivity analysis**
 
 One approach that we use to help quantify the uncertainty in our parameter estimates is to add "extreme" ratings from "pseudo-raters" and measure the maximum and minimum possible values that each note's intercept and factor parameters take on after all possible pseudo-ratings are added. We add both helpful and not-helpful ratings, from pseudo-raters with the max and min possible rater intercepts, and with the max and min possible factors (as well as 0, since 0-factor raters can often have an outsized impact on note intercepts). This approach is similar in spirit to the idea of pseudocounts in Bayesian modeling, or to Shapley values.
 
 We currently assign notes a "Not Helpful" status if the max (upper confidence bound) of their intercept is less than -0.04, in addition to the rules on the raw intercept values defined in the previous section.
 
-#### Supervised confidence modeling
+**Supervised confidence modeling**
 
 We also employ a supervised model to detect low-confidence matrix factorization results. If the model predicts that a note will lose Helpful status, then the note will remain in Needs More Ratings status for an additional 30 minutes to allow it to gather a larger set of ratings.
 
@@ -327,7 +327,7 @@ For not-helpful notes:
 
 ## Complete Algorithm Steps:
 
-### Prescoring
+**Prescoring**
 
 1. Pre-filter the data: to address sparsity issues, only raters with at least 10 ratings and notes with at least 5 ratings are included (although we don’t recursively filter until convergence). Also, coalesce ratings made by raters with high post-selection-similarity.
 2. For each scorer (Core, Expansion, ExpansionPlus, and multiple Group and Topic scorers):
@@ -335,7 +335,7 @@ For not-helpful notes:
   - Compute Author and Rater Helpfulness Scores based on the results of the first matrix factorization, then filter out raters with low helpfulness scores from the ratings data as described in [Filtering Ratings Based on Helpfulness Scores](./contributor-scores.md).
   - Fit the harassment-abuse tag-consensus matrix factorization model on the helpfulness-score filtered ratings, then update Author and Rater Helpfulness scores using the output of the tag-consensus model.
 
-### Scoring
+**Scoring**
 
 1. Load the output of step 2 above from prescoring, but re-run step 1 on the newest available notes and ratings data.
 2. For each scorer (Core, Expansion, ExpansionPlus, and multiple Group and Topic scorers):
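
The pseudo-rating sensitivity analysis touched by the first hunk lends itself to a short illustration. The sketch below is a minimal, hypothetical rendering of the idea, not the actual Community Notes scorer: `refit_note_intercept`, the rating encoding, and the parameter ranges are all assumed stand-ins.

```python
import itertools

def pseudo_rating_bounds(ratings, note_id, rater_intercepts, rater_factors,
                         refit_note_intercept):
    """Bound a note's intercept by refitting with one extreme pseudo-rating at a time.

    `ratings` is a list of rating dicts; `refit_note_intercept(ratings)` is a
    hypothetical stand-in that re-runs the matrix factorization and returns
    the intercept of `note_id`.
    """
    intercepts = []
    # Cross helpful/not-helpful pseudo-ratings with the max and min possible
    # rater intercepts and factors. Factor 0 is included because, per the doc,
    # 0-factor raters can often have an outsized impact on note intercepts.
    for helpful, r_int, r_fac in itertools.product(
            (1.0, 0.0),
            (min(rater_intercepts), max(rater_intercepts)),
            (min(rater_factors), 0.0, max(rater_factors))):
        pseudo = {"noteId": note_id, "helpful": helpful,
                  "raterIntercept": r_int, "raterFactor": r_fac}
        intercepts.append(refit_note_intercept(ratings + [pseudo]))
    return min(intercepts), max(intercepts)

# The status rule quoted in the hunk: a note gets "Not Helpful" status when
# even the upper confidence bound of its intercept stays below -0.04.
def fails_upper_bound_rule(max_intercept):
    return max_intercept < -0.04
```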
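Prescoring step 1 (the sparsity pre-filter) can likewise be sketched in a few lines of pandas. The column names and the single-pass structure are assumptions for illustration; the authoritative logic lives in the scorer code, not here.

```python
import pandas as pd

MIN_RATER_RATINGS = 10  # raters need at least 10 ratings to be included
MIN_NOTE_RATINGS = 5    # notes need at least 5 ratings to be included

def prefilter_ratings(ratings: pd.DataFrame) -> pd.DataFrame:
    """Single-pass sparsity filter; "raterId"/"noteId" columns are assumed.

    One pass matches the doc's remark that filtering is not applied
    recursively until convergence, so some notes may fall back below
    5 ratings after sparse raters are dropped.
    """
    rater_ok = ratings.groupby("raterId")["noteId"].transform("count") >= MIN_RATER_RATINGS
    ratings = ratings[rater_ok]
    note_ok = ratings.groupby("noteId")["raterId"].transform("count") >= MIN_NOTE_RATINGS
    return ratings[note_ok]
```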
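The supervised confidence rule in the first hunk reduces to a small status check. This is a hedged sketch under assumed status names and return conventions, not the production rule.

```python
from datetime import timedelta

CONFIDENCE_DELAY = timedelta(minutes=30)  # extra time to gather ratings

def apply_confidence_delay(proposed_status, predicted_to_lose_helpful):
    """If the supervised model predicts the note would lose Helpful status,
    hold it in Needs More Ratings for an additional 30 minutes.

    Status strings and the (status, delay) return shape are assumptions.
    """
    if proposed_status == "HELPFUL" and predicted_to_lose_helpful:
        return "NEEDS_MORE_RATINGS", CONFIDENCE_DELAY
    return proposed_status, timedelta(0)
```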