- A tool that suggests papers similar to the input keywords
- You can try it from here
- Accepted papers from ICLR2025
Note
This can be repurposed for any conference whose paper titles, abstracts, and URLs are available, simply by replacing the target files.
- Concatenate the title, the first 600 characters of the abstract, and the URL into plain text
- Convert the text to a vectorstore using an embedding model
- Output the n most similar papers using approximate nearest neighbor search
Note
The embedding model was selected based on the MTEB leaderboard.
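The retrieval idea in the steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_embed` is a hypothetical bag-of-words stand-in for the real embedding model, and the search is exact cosine similarity rather than approximate nearest neighbor search. The function names and sample texts are made up for this sketch and are not taken from the repository.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: sparse word counts (NOT the real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_n(query: str, docs: list, n: int = 3) -> list:
    """Return (score, document) pairs for the n documents closest to the query."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(d)), d) for d in docs]
    # Exact search over all documents; an ANN index would replace this step.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:n]


# Hypothetical sample entries standing in for the concatenated paper texts.
docs = [
    "Calibration of language model confidence scores",
    "Graph neural networks for molecule generation",
    "Diffusion models for image synthesis",
]
results = best_n("language model confidence calibration", docs, n=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

With a real embedding model, semantically related papers would score highly even without shared words, which this word-count stand-in cannot do.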
- Download accepted papers from this link
- Place them in the data directory
- Convert to txt file
python src/build_db/json2txt.py
- Convert to vectorstore
python src/build_db
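For readers adapting this to another conference, the text conversion step could look roughly like the sketch below. It is not the content of `json2txt.py`: the field names `title`, `abstract`, and `url`, the newline separator, and the sample record are all assumptions made for illustration. Only the 600-character abstract limit comes from the description above.

```python
import json

ABSTRACT_LIMIT = 600  # first 600 characters of the abstract, as described above


def paper_to_text(paper: dict) -> str:
    """Join title, truncated abstract, and URL into one plain-text entry."""
    abstract = paper.get("abstract", "")[:ABSTRACT_LIMIT]
    # Assumed layout: one field per line; the real script may differ.
    return "\n".join([paper.get("title", ""), abstract, paper.get("url", "")])


# Hypothetical record; real field names depend on the downloaded file.
raw = json.dumps(
    [
        {
            "title": "Example Paper",
            "abstract": "x" * 1000,
            "url": "https://example.org/paper",
        }
    ]
)

entries = [paper_to_text(p) for p in json.loads(raw)]
print(entries[0][:80])
```

Truncating the abstract keeps each entry short, which helps when the embedding model has a limited input length.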
- Create a new Space on your Hugging Face account
- Move the following files to the created space:
  - app.py
  - vectorstore files created in the data/db folder
  - requirements.txt
query: β-calibration of Language Model Confidence Scores for Generative QA
suggestion: