ᯤpaper-sonar

README_ja

  • A tool that suggests papers similar to the input keywords

Demo on Hugging Face Spaces

  • You can try it from here

Target

  • Accepted papers from ICLR2025

Note

This can be repurposed for any conference whose paper titles, abstracts, and URLs are available, simply by replacing the target files.

Process

Note

The embedding model was selected based on the MTEB leaderboard.
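The retrieval idea behind the process can be sketched as follows: embed the query and every paper, then rank papers by cosine similarity. This is only an illustration, not the project's code. The `embed` function here is a toy word-count stand-in for a real embedding model, the `suggest` helper is hypothetical, and the titles are made up for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest(query, titles, k=2):
    """Return the k titles most similar to the query (hypothetical helper)."""
    q = embed(query)
    ranked = sorted(titles, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


# Made-up titles, used only to exercise the ranking.
titles = [
    "Calibration of language model confidence",
    "Graph neural networks for molecules",
    "Confidence scores in question answering",
]
top = suggest("language model confidence calibration", titles, k=2)
print(top)
```

A learned embedding model would replace `embed`, and a vector store would replace the linear scan in `suggest`, but the ranking principle stays the same.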

Implementation

  1. Download accepted papers from this link
  • Place in data directory
  2. Convert to txt file
     python src/build_db/json2txt.py
  3. Convert to vectorstore
     python src/build_db
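As a rough illustration of the "Convert to txt file" step, the sketch below turns a list of paper records into plain text blocks. It is not the contents of `json2txt.py`. The field names `title`, `abstract`, and `url`, the `records_to_text` helper, and the sample records are all assumptions made for the example; the real JSON layout may differ.

```python
import json


def records_to_text(records):
    """Render each paper record as one block of plain text."""
    blocks = []
    for rec in records:
        # Field names are assumed for illustration; adjust to the real JSON.
        title = rec.get("title", "").strip()
        abstract = rec.get("abstract", "").strip()
        url = rec.get("url", "").strip()
        blocks.append(f"Title: {title}\nAbstract: {abstract}\nURL: {url}")
    # A blank line separates papers so each block can be split out later.
    return "\n\n".join(blocks)


# Made-up sample input standing in for the downloaded file.
sample_json = """[
  {"title": "Paper A", "abstract": "About A.", "url": "https://example.org/a"},
  {"title": "Paper B", "abstract": "About B.", "url": "https://example.org/b"}
]"""

papers = json.loads(sample_json)
text = records_to_text(papers)
print(text)
```

Keeping the title, abstract, and URL together in one block means a retrieved match can be shown to the user with its link without a second lookup.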

Hugging Face demo

  1. Create a new Space in your Hugging Face account
  2. Move the following files to the created Space:
  • app.py
  • the vectorstore files created in the data/db folder
  • requirements.txt

Example

query: β-calibration of Language Model Confidence Scores for Generative QA

suggestion:

suggestion