Separate “Other” into “None of the above” and “Algorithm can’t decide” #10

Open
jucor opened this issue Jan 21, 2025 · 3 comments


@jucor
Collaborator

jucor commented Jan 21, 2025

Dear Jigsaw team

As discussed by email, it would be very useful if the topic analysis part of the Sensemaking tools could distinguish between “Other” (in the sense of “None of the above”) and “The Algorithm Can’t Decide”. See the rationale at compdemocracy/polis#1878.

I believe this can be achieved in src/tasks/categorization.ts, in the function assignDefaultCategory ( https://github.com/Jigsaw-Code/sensemaking-tools/blame/8eb482e35c44d2399ab68d684a12d51a74472ad4/src/tasks/categorization.ts#L381 ), which is called by the function categorizeWithRetry.

I do note that SenseMaker does in some cases provide a form of distinction by using the Uncategorized subtopic of the category Other. However, this:

  • does not work for conversations which do not need subtopics, i.e. when the argument includeSubtopics of categorizeWithRetry is set to false, due to the line
    includeSubtopics
      ? ({ name: "Other", subtopics: [{ name: "Uncategorized" }] } as NestedTopic)
      : ({ name: "Other" } as FlatTopic),
    In that case, there is no distinction.
  • is not documented (but yay open source, it can be found in the code :) ),
  • is thus easy to overlook even when includeSubtopics is true,
  • and because of that, can very easily confuse the users and the interpretation of the different topics, be it for debugging purposes (“Why is ‘Other’ so big?”) or for actual use.

It would thus be terrific to move Uncategorized up to a top-level topic rather than a subtopic.
Thanks!
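
As a minimal sketch of the requested change: the type shapes and both function names below are assumptions made for illustration, not the actual code of assignDefaultCategory in categorization.ts. The first function mirrors the line quoted above; the second returns a dedicated top-level topic whether or not subtopics are enabled.

```typescript
// Hypothetical, simplified shapes; the real FlatTopic/NestedTopic types
// live in the sensemaking-tools source and may differ.
interface FlatTopic {
  name: string;
}
interface NestedTopic {
  name: string;
  subtopics: FlatTopic[];
}
type Topic = FlatTopic | NestedTopic;

// Current behaviour, following the quoted line: the failure marker exists
// only as a subtopic, so in flat mode it collapses into plain "Other".
function currentDefaultTopic(includeSubtopics: boolean): Topic {
  return includeSubtopics
    ? { name: "Other", subtopics: [{ name: "Uncategorized" }] }
    : { name: "Other" };
}

// Proposed behaviour (sketch): "Uncategorized" is its own top-level topic,
// so "the algorithm can't decide" stays separate from "none of the above".
const UNCATEGORIZED = "Uncategorized";

function proposedDefaultTopic(includeSubtopics: boolean): Topic {
  return includeSubtopics
    ? { name: UNCATEGORIZED, subtopics: [] }
    : { name: UNCATEGORIZED };
}
```

With something along these lines, a comment that failed categorization would never share a top-level topic with comments the model deliberately placed in "Other".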

@jucor
Collaborator Author

jucor commented Jan 21, 2025

PS: it's neat that the Uncategorized behaviour, when includeSubtopics is enabled, has its own unit test:

    it('should assign "Other" topic and "Uncategorized" subtopic to comments that failed categorization after max retries', async () => {
      const comments: Comment[] = [
        { id: "1", text: "Comment 1" },
        { id: "2", text: "Comment 2" },
        { id: "3", text: "Comment 3" },
      ];
      const topics = '[{"name": "Topic 1", "subtopics": []}]';
      const instructions = "Categorize the comments based on these topics: " + topics;
      const includeSubtopics = true;
      const topicsJson = [{ name: "Topic 1", subtopics: [] }];
      // Mock the model to always return an empty response. This simulates a
      // categorization failure.
      mockGenerateData.mockReturnValue(Promise.resolve([]));
      const commentRecords = await categorizeWithRetry(
        new VertexModel("project", "location", "gemini-1000"),
        instructions,
        comments,
        includeSubtopics,
        topicsJson
      );
      expect(mockGenerateData).toHaveBeenCalledTimes(3);
      const expected = [
        {
          id: "1",
          text: "Comment 1",
          topics: [{ name: "Other", subtopics: [{ name: "Uncategorized" }] }],
        },
        {
          id: "2",
          text: "Comment 2",
          topics: [{ name: "Other", subtopics: [{ name: "Uncategorized" }] }],
        },
        {
          id: "3",
          text: "Comment 3",
          topics: [{ name: "Other", subtopics: [{ name: "Uncategorized" }] }],
        },
      ];
      expect(commentRecords).toEqual(expected);
    });
:)

@cianbrassilg
Collaborator

@jucor This is a good callout, and an interesting suggested approach! Thanks for sharing. Clarifying question: from your testing so far, do you see the primary benefit of the 'Algorithm can't decide' category as being cases where the underlying comment is too ambiguous? The linked issue also mentions categorization failures, but I'm curious how much of an issue this has been in your experience. We can test this out and see how it impacts the resulting category numbers.

@jucor
Collaborator Author

jucor commented Jan 27, 2025

Thanks @cianbrassilg, great to move the convo here. As mentioned in the other thread (compdemocracy/polis#1876 (comment)), yes: for example, when I ran the BG2018-short "2018 BG with vote tallies (filtered) - comments-with-votes-small" example spreadsheet provided by @metasoarous, more than 70% of the comments ended up in "algorithm could not determine". I suspect (but did not verify) that this is because the spreadsheet had a lot of comments that were not filtered out but whose content had been deleted.

I also remember @DZNarayanan mentioning that "Other" is often pretty big, and discussing that it's often the biggest category. So as we investigate why, I think ruling out "Algorithm could not determine" would be the first thing to check (and since doing it automatically is just a code change, that would be easier than doing it manually :) ).
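
For illustration, that first check could be sketched roughly as follows. It assumes the output records look like the ones in the unit test quoted earlier in this thread; the interfaces and function names are hypothetical and not part of sensemaking-tools. It also relies on the "Other" > "Uncategorized" marker, so it only works when includeSubtopics is true, which is the limitation this issue is about.

```typescript
// Hypothetical, simplified record shapes modelled on the unit test in this thread.
interface TopicRef {
  name: string;
  subtopics?: { name: string }[];
}
interface CategorizedComment {
  id: string;
  text: string;
  topics: TopicRef[];
}

interface OtherBreakdown {
  failed: number; // "Other" > "Uncategorized": the algorithm could not decide
  other: number; // "Other" without the failure marker: none of the above
  categorized: number; // everything else
}

// True when a comment carries the default topic assigned after failed retries.
function isFailedCategorization(comment: CategorizedComment): boolean {
  return comment.topics.some(
    (t) =>
      t.name === "Other" &&
      (t.subtopics ?? []).some((s) => s.name === "Uncategorized")
  );
}

// Splits comments into failed / genuine "Other" / categorized counts.
function breakDownOther(comments: CategorizedComment[]): OtherBreakdown {
  const counts: OtherBreakdown = { failed: 0, other: 0, categorized: 0 };
  for (const comment of comments) {
    if (isFailedCategorization(comment)) counts.failed += 1;
    else if (comment.topics.some((t) => t.name === "Other")) counts.other += 1;
    else counts.categorized += 1;
  }
  return counts;
}
```

Dividing failed by the total number of comments gives the share of "Other" that is really "algorithm could not determine".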


2 participants