GitHub

project_name	description
LDGNet-Project	LDGNet: LLMs Debate-Guided Network for Multimodal Sarcasm Detection – Official PyTorch Implementation. This framework introduces an innovative multimodal debate mechanism, leveraging large language models (LLMs) to utilize open-world knowledge for enhanced sarcasm detection in multimodal datasets.

Installation

Follow these steps to set up the environment:

conda create -n sarcasm python=3.8
conda activate sarcasm
pip install -r requirements.txt

Data Preparation

Download and prepare the datasets from the following sources:

HFM Dataset: A widely used dataset for multimodal sarcasm detection, available on Mendeley Data.
DMSD-CL Dataset: Designed to evaluate model generalization, more details can be found in the paper.
Data Processing Framework: Inspired by HKEmodel’s preprocessing approach, available on GitHub.

Training

Use the following scripts to train the LDGNet model. Select the configuration that fits your setup.

Configuration 1: ResNet and BERT

cd /path/to/main
export DATA="/path/to/dataset"
export LOG="/path/to/save/logs"

CUDA_VISIBLE_DEVICES=0,1 python run.py with data_root=$DATA \
    num_gpus=2 num_nodes=1 per_gpu_batchsize=4 batch_size=16 \
    image_size=224 vit_randaug max_text_len=512 \
    tokenizer="bert-base" vit="resnet" seed=42 \
    log_dir=$LOG precision=32 max_epoch=15 learning_rate=1e-4

Configuration 2: CLIP and T5

export DATA="/path/to/twitter/dataset"
export LOG="/path/to/checkpoints"

rm -rf $LOG
mkdir $LOG

CUDA_VISIBLE_DEVICES=0 python run.py with data_root=$DATA \
    num_gpus=1 num_nodes=1 per_gpu_batchsize=6 batch_size=6 \
    clip32_base224 text_t5_base image_size=224 vit_randaug \
    max_text_len=512 seed=42 \
    log_dir=$LOG precision=32 max_epoch=5 learning_rate=1.5e-4

Inference

Load the trained model and use the following commands for evaluation and testing.

Configuration 1: ResNet and BERT

cd /path/to/main
export DATA="/path/to/test/data"
export LOG="/path/to/logs"

CUDA_VISIBLE_DEVICES=0,1 python run.py with data_root=$DATA \
    num_gpus=2 num_nodes=1 per_gpu_batchsize=4 batch_size=16 test_only=True \
    image_size=224 vit_randaug tokenizer="bert-base" vit="resnet" \
    log_dir=$LOG precision=32 max_text_len=512 \
    load_path="/path/to/checkpoints/model.ckpt"

Configuration 2: CLIP and T5

export DATA="/path/to/test/data"
export LOG="/path/to/logs"

CUDA_VISIBLE_DEVICES=0 python run.py with data_root=$DATA \
    num_gpus=1 num_nodes=1 per_gpu_batchsize=6 batch_size=6 test_only=True \
    clip32_base224 text_t5_base image_size=224 vit_randaug \
    log_dir=$LOG precision=32 max_text_len=512 \
    load_path="/path/to/checkpoints/model.ckpt"

Key Features

Innovative Approach: Simulates debates among LLMs to identify conflicting sarcastic rationales.
Enhanced Multimodal Understanding: Combines textual and visual information for improved sarcasm detection.
Open-World Knowledge Utilization: Integrates contextual knowledge from diverse domains to ensure robust sarcasm interpretation.
Comprehensive Framework: Includes a debate module to generate sentiment rationales and a judge module for nuanced sentiment classification.

Citation

If you use this framework in your research, please cite:

@misc{zhou2024ldgnet,
    title={LDGNet: LLMs Debate-Guided Network for Multimodal Sarcasm Detection},
    url={https://github.com/LDGNet-project/LDGNet},
    author={Zhou, Hengyang and Yan, Jinwu and Chen, Yaqing and Hong, Rongman and Zuo, Wenbo and Jin, Keyan},
    month={November},
    year={2024}
}

Acknowledgements

This project builds on the strengths of existing multimodal frameworks and advances sarcasm detection using a unique debate-based methodology.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.idea		.idea
data		data
src		src
LICENSE		LICENSE
README.md		README.md
image_models.py		image_models.py
mask_generator.py		mask_generator.py
requirements.txt		requirements.txt
run.py		run.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Installation

Data Preparation

Training

Configuration 1: ResNet and BERT

Configuration 2: CLIP and T5

Inference

Configuration 1: ResNet and BERT

Configuration 2: CLIP and T5

Key Features

Citation

Acknowledgements

About

Releases

Packages

Languages

License

zhyhome/LDGNet

Folders and files

Latest commit

History

Repository files navigation

Installation

Data Preparation

Training

Configuration 1: ResNet and BERT

Configuration 2: CLIP and T5

Inference

Configuration 1: ResNet and BERT

Configuration 2: CLIP and T5

Key Features

Citation

Acknowledgements

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages