This repository provides a benchmarking framework for easily evaluating LLMs' capacity to predict argument relations at the micro scale, i.e., predicting only the relation between pairs of arguments. The framework is intended to be used with a preprocessed sample of a dataset from the IBM Debater project.
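To illustrate the task, here is a minimal sketch of what a micro-scale prediction looks like: the model receives a pair of arguments and must label their relation. The `complete` stub, the prompt wording, and the label set are hypothetical illustrations, not this framework's actual API.

```python
# Minimal sketch of micro-scale argument relation prediction.
# NOTE: `complete`, the prompt wording, and the label set are
# assumptions for illustration, not this framework's actual API.

def complete(prompt: str) -> str:
    """Stand-in for an LLM completion call (assumed interface)."""
    return "attack"  # placeholder response for demonstration

def predict_relation(parent: str, child: str) -> str:
    """Ask the model how the second argument relates to the first."""
    prompt = (
        "Given the following two arguments, does the second one "
        "support or attack the first? Answer with one word.\n"
        f"Argument 1: {parent}\n"
        f"Argument 2: {child}\n"
        "Answer:"
    )
    answer = complete(prompt).strip().lower()
    return answer if answer in {"support", "attack"} else "unknown"

if __name__ == "__main__":
    print(predict_relation(
        "Social media amplifies misinformation.",
        "Platforms now label false claims, limiting their spread.",
    ))  # -> "attack"
```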
You can find my own results here.
As mentioned earlier, this work is part of an academic project for the validation of my Master's degree at Heriot-Watt University, which prevents me from accepting any contributions until the final release of my project. Thank you for your understanding.
This work is part of a collection of works whose ultimate goal is to deliver a framework that automatically analyzes social media content (e.g., X, Reddit) to extract its argumentative value and predict the relations between arguments, leveraging the abilities of Large Language Models (LLMs):
- liaisons (the developed client for social media content analysis)
- liaisons-preprocess (the preprocessing of the original IBM dataset)
- liaisons-claim-stance-sample (the preprocessed sample used with this benchmarking framework)
- liaisons-experiments-results (the results obtained with this benchmarking framework)
- mantis-shrimp (the configuration-as-code used to set up my workstation for this project)
This project is solely conducted by me, Guilhem Santé. I am a postgraduate student pursuing the MSc in Artificial Intelligence at Heriot-Watt University in Edinburgh.
I would like to credit Andrew Ireland, my supervisor for this project.