AI701-project

BERT

The BERT training experiments were carried out in the following order; each of the notebooks listed below was run on Kaggle. We have made the notebooks public on Kaggle (for the duration of grading) so you can run them if needed. A minimal fine-tuning sketch follows the list.

  1. Hugging Face BERT is fine-tuned on our dataset and the best model is saved - huggingface-bert.ipynb
  2. The saved model is used for inference - bert-mcq.ipynb
  3. K-fold cross-validation is performed: Hugging Face BERT is fine-tuned on our dataset and the best model across all folds is saved - kfolds-on-bert.ipynb
  4. The saved model (from step 3) is used for inference - bert-mcq.ipynb
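The fine-tuning step can be sketched as follows. This is only an illustration of BERT with the standard Hugging Face multiple-choice head; the example question, column handling, and hyperparameters are assumptions and may differ from huggingface-bert.ipynb.

```python
# Hedged sketch of BERT for multiple-choice MCQ answering; the model name,
# example data, and setup are illustrative, not copied from the notebook.
import torch
from transformers import AutoTokenizer, AutoModelForMultipleChoice

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased")

question = "Which planet is known as the Red Planet?"
options = ["Venus", "Mars", "Jupiter", "Saturn"]
label = torch.tensor(1)  # index of the correct option

# Encode (question, option) pairs; the model scores each pair jointly.
enc = tokenizer([question] * len(options), options,
                padding=True, truncation=True, return_tensors="pt")
inputs = {k: v.unsqueeze(0) for k, v in enc.items()}  # (batch=1, num_choices, seq_len)

outputs = model(**inputs, labels=label.unsqueeze(0))
outputs.loss.backward()                      # one fine-tuning step would follow with an optimizer
prediction = outputs.logits.argmax(dim=-1)   # predicted option index
```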

BERT with RAG

RAG takes an input question and retrieves a set of relevant/supporting documents from a source (e.g., Wikipedia); the retrieved text is then added to the model input as context. A minimal retrieval sketch follows the notebook list below.

  1. Hugging Face BERT is fine-tuned on our dataset plus the retrieved context and the best model is saved - huggingface-bert-with-wikipedia-rag.ipynb
  2. The saved model is used for inference - bert-with-wikipedia-rag.ipynb
  3. K-fold cross-validation is performed: Hugging Face BERT is fine-tuned on our dataset plus the retrieved context and the best model across all folds is saved - kfolds-of-huggingface-bert-with-wikipedia-rag.ipynb
  4. The saved model (from step 3) is used for inference - bert-with-wikipedia-rag.ipynb
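A hedged sketch of the retrieval step is below. The embedding model (all-MiniLM-L6-v2 via sentence-transformers) and the small in-memory passage list are illustrative assumptions; the notebooks retrieve from Wikipedia and may use a different retriever or index.

```python
# Hedged retrieval sketch: embed the question and candidate passages, keep the
# most similar passages, and prepend them to the question as context.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

passages = [
    "Mars is often called the Red Planet because of iron oxide on its surface.",
    "Venus is the second planet from the Sun.",
]
question = "Which planet is known as the Red Planet?"

passage_emb = encoder.encode(passages, convert_to_tensor=True)
question_emb = encoder.encode(question, convert_to_tensor=True)

# Rank passages by cosine similarity and keep the top-k as context.
hits = util.semantic_search(question_emb, passage_emb, top_k=1)[0]
context = " ".join(passages[h["corpus_id"]] for h in hits)
augmented_input = context + " " + question  # fed to BERT instead of the bare question
```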

RoBERTa

These notebooks were also run on Kaggle.

  1. Hugging Face RoBERTa fine-tuning and inference with the fine-tuned model (comment/uncomment the relevant parts) - roberta-llm-exam.ipynb
  2. K-fold cross-validation of the Hugging Face RoBERTa fine-tuning and inference (comment/uncomment the relevant parts) - roberta-kfold.ipynb
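The K-fold procedure used here (and for BERT above) can be sketched as follows. The number of folds, the metric, and the train_one_fold/evaluate helpers are hypothetical placeholders for the fine-tuning and inference code in the notebooks.

```python
# Hedged K-fold sketch: train one model per fold and keep the best one by
# validation accuracy across all folds. train_one_fold() and evaluate() are
# hypothetical stand-ins for the Hugging Face fine-tuning/inference code.
import numpy as np
from sklearn.model_selection import KFold

def run_kfold(dataset, n_splits=5, seed=42):
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best_acc, best_model = -np.inf, None
    for fold, (train_idx, val_idx) in enumerate(kfold.split(dataset)):
        model = train_one_fold(dataset, train_idx)   # fine-tune on this fold's train split
        acc = evaluate(model, dataset, val_idx)      # validation accuracy on the held-out fold
        print(f"fold {fold}: accuracy = {acc:.4f}")
        if acc > best_acc:
            best_acc, best_model = acc, model        # keep the best model across folds
    return best_model, best_acc
```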

RoBERTa with RAG

The process for retrieving supporting information is the same as for BERT and the other models in our project.

  1. Hugging Face RoBERTa fine-tuning and inference with the fine-tuned model (comment/uncomment the relevant parts), using the retrieved context - roberta-RAG.ipynb
  2. K-fold cross-validation of the Hugging Face RoBERTa fine-tuning and inference (comment/uncomment the relevant parts), using the retrieved context - roberta-RAG-kfold.ipynb

GPT-3.5

The GPT-3.5 API was used; the predictions are in gpt3.5_pred.csv.
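A hedged sketch of what a GPT-3.5 call for one question could look like, assuming the OpenAI chat completions client; the actual prompt wording and decoding parameters behind gpt3.5_pred.csv are not reproduced here.

```python
# Hedged sketch of querying GPT-3.5 for an MCQ answer; the prompt template and
# parameters are illustrative, not the exact ones used in the project.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_mcq(question, options):
    prompt = question + "\n" + "\n".join(
        f"{letter}. {opt}" for letter, opt in zip("ABCD", options)
    ) + "\nAnswer with a single letter."
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(answer_mcq("Which planet is known as the Red Planet?",
                 ["Venus", "Mars", "Jupiter", "Saturn"]))
```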

Llama 2 7b chat

The Llama 2 7B chat API was used. The inference notebooks are llama-api.ipynb and llama-api-with-context.ipynb, for inference without and with RAG context respectively. The predictions are in llama-2-7b-answers.csv and llama-2-7b-answers-context.csv.
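The notebooks call a hosted API; purely as an illustration of the same inference, here is a hedged sketch that prompts Llama 2 7B chat locally through Hugging Face transformers. The prompt format, generation settings, and local loading are assumptions, not the notebooks' setup.

```python
# Hedged sketch of Llama 2 7B chat MCQ inference via transformers (the project
# used a hosted API instead). The model is gated and requires access approval.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

question = "Which planet is known as the Red Planet?"
options = ["Venus", "Mars", "Jupiter", "Saturn"]
context = ""  # with RAG, the retrieved Wikipedia passages would go here

prompt = (f"{context}\n{question}\n"
          + "\n".join(f"{l}. {o}" for l, o in zip("ABCD", options))
          + "\nAnswer with a single letter:")
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```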

LSTM

  1. LSTM multiclass classification without RAG - [lstm-without-rag.ipynb](https://www.kaggle.com/code/fathinahizzati/lstm-1?scriptVersionId=151435997)
  2. LSTM multiclass classification with RAG - [lstm-with-rag.ipynb](https://www.kaggle.com/fat2321321/lstm-3)
  3. LSTM for next-word inference - [lstm-next-token-pred.ipynb](https://www.kaggle.com/tinaaaaaaaaa/lstm-2-2)
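A minimal sketch of the multiclass LSTM classifier, assuming a Keras-style setup; the vocabulary size, sequence length, and layer sizes are illustrative and may differ from the notebooks.

```python
# Hedged sketch of the LSTM multiclass classifier: it reads the tokenized
# question (plus retrieved context in the RAG variant) and predicts which of
# the answer options (here 4 classes) is correct. Sizes are illustrative.
import tensorflow as tf

vocab_size, max_len, num_options = 20000, 256, 4

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(vocab_size, 128),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_options, activation="softmax"),  # one class per option
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
```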

Platypus

  1. Platypus inference without Wikipedia RAG - [platypus2-70b-without-wikipedia-rag.ipynb](https://www.kaggle.com/code/tinaaaaaaaaa/platypus2-70b-without-wikipedia-rag)
  2. Platypus inference with Wikipedia RAG - [platypus2-70b-with-wikipedia-rag.ipynb](https://www.kaggle.com/code/fathinahizzati/platypus2-70b-with-wikipedia-rag)

Bag-of-Words

  1. Bag-of-words + cosine similarity, implemented without sklearn - bow-without-sklearn.ipynb
  2. Bag-of-words + cosine similarity with sklearn - bow-with-sklearn.ipynb
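A hedged sketch of the sklearn variant: score each option by the cosine similarity between its bag-of-words vector and the question's vector, and pick the best-scoring option. Whether the notebooks compare options against the question or against retrieved context, and the exact vectorizer settings, are assumptions.

```python
# Hedged bag-of-words baseline: vectorize the question and each option,
# then pick the option whose vector is most similar to the question's (cosine).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def bow_answer(question, options):
    vectorizer = CountVectorizer()
    vectors = vectorizer.fit_transform([question] + options)
    sims = cosine_similarity(vectors[0], vectors[1:])[0]  # question vs. each option
    return int(sims.argmax())  # index of the best-matching option

idx = bow_answer("Which planet is known as the Red Planet, famous for iron oxide dust?",
                 ["Venus is covered in clouds",
                  "Mars appears red due to iron oxide",
                  "Jupiter is a gas giant",
                  "Saturn has rings"])
print(idx)  # expected: 1
```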
