Commit 02ce8cb

Add CrossEncoder to the Sentence Transformers "How to Use" snippet & patch SetFit (#1337)
Hello!

## Pull Request overview
* Add CrossEncoder to the Sentence Transformers "How to Use" snippet
* Patch "How to Use" for SetFit models (that are also tagged as Sentence Transformers)

## Details
This PR adds the `CrossEncoder` class to the "How to Use" snippets. I'm working on feature parity of these models with the more common `SentenceTransformer` models, and the recent Sentence Transformers v4.0 release was a big step in that. We're already at 150 models: https://huggingface.co/models?pipeline_tag=text-ranking&library=sentence-transformers

Beyond that, this ensures that `[widgetExample.source_sentence, ...widgetExample.sentences]` doesn't crash, apparently that currently happens for some SetFit models, e.g.: https://huggingface.co/faodl/setfit-paraphrase-mpnet-base-v2-5ClassesDesc-multilabel-augmented?library=sentence-transformers

- Tom Aarsen
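
A minimal sketch of the guard described above, for readers who want to see why the old check could fail. It assumes the widget data of a SetFit model lacks `source_sentence` and `sentences`, so spreading `sentences` would throw at runtime. The `WidgetExample` interface, the function name `getWidgetExamples`, and `DEFAULT_SENTENCES` are hypothetical stand-ins, not the library's actual definitions; the two default sentences are the ones visible in the diff.

```typescript
// Hypothetical, simplified stand-in for WidgetExampleSentenceSimilarityInput.
// Both fields are optional here to model widget data of a different shape.
interface WidgetExample {
	source_sentence?: string;
	sentences?: string[];
}

// Same condition as the patched check, written with locals so the
// type checker can narrow both values before the spread.
function getWidgetExamples(widgetExample: WidgetExample | undefined): string[] | undefined {
	const source = widgetExample?.source_sentence;
	const sentences = widgetExample?.sentences;
	if (source && sentences?.length) {
		return [source, ...sentences];
	}
	// Missing or empty fields: signal "no usable widget example".
	return undefined;
}

// Assumed fallback for illustration, using the two sentences shown in the diff.
const DEFAULT_SENTENCES: string[] = ["The weather is lovely today.", "It's so sunny outside!"];

// A widget example without sentence-similarity fields falls back to the defaults.
const setfitLike: WidgetExample = {};
const examples: string[] = getWidgetExamples(setfitLike) ?? DEFAULT_SENTENCES;
console.log(examples);
```

With the previous `if (widgetExample)` check, the same input would reach the spread with `sentences` undefined, which is not iterable.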
1 parent 9b3e2cf commit 02ce8cb

1 file changed: +20 -1 lines changed

Diff for: packages/tasks/src/model-libraries-snippets.ts

@@ -920,13 +920,32 @@ export const sampleFactory = (model: ModelData): string[] => [
 
 function get_widget_examples_from_st_model(model: ModelData): string[] | undefined {
 	const widgetExample = model.widgetData?.[0] as WidgetExampleSentenceSimilarityInput | undefined;
-	if (widgetExample) {
+	if (widgetExample?.source_sentence && widgetExample?.sentences?.length) {
 		return [widgetExample.source_sentence, ...widgetExample.sentences];
 	}
 }
 
 export const sentenceTransformers = (model: ModelData): string[] => {
 	const remote_code_snippet = model.tags.includes(TAG_CUSTOM_CODE) ? ", trust_remote_code=True" : "";
+	if (model.tags.includes("cross-encoder") || model.pipeline_tag == "text-ranking") {
+		return [
+			`from sentence_transformers import CrossEncoder
+
+model = CrossEncoder("${model.id}"${remote_code_snippet})
+
+query = "Which planet is known as the Red Planet?"
+passages = [
+	"Venus is often called Earth's twin because of its similar size and proximity.",
+	"Mars, known for its reddish appearance, is often referred to as the Red Planet.",
+	"Jupiter, the largest planet in our solar system, has a prominent red spot.",
+	"Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
+]
+
+scores = model.predict([(query, passage) for passage in passages])
+print(scores)`,
+		];
+	}
+
 	const exampleSentences = get_widget_examples_from_st_model(model) ?? [
 		"The weather is lovely today.",
 		"It's so sunny outside!",