MT-TexTableX: A Multitask Learning Framework for Table-to-Text Generation

MT-TexTableX is a lightweight multitask learning framework for table-to-text (T2T) generation, designed to improve content selection, structural alignment, and factual consistency. Built on a T5-base encoder–decoder, MT-TexTableX jointly learns five tasks:

  • Table-to-Text Generation
  • Content Selection
  • Table Reconstruction
  • Text-to-Table Generation
  • Fact Verification
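
To make the joint setup concrete, below is a minimal sketch of how five seq2seq tasks can share one T5 backbone through textual task prefixes and a weighted sum of per-task losses. The prefixes, weights, and function names here are illustrative assumptions, not the exact recipe used in this repository:

    import torch
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Hypothetical task prefixes and weights -- the repository's exact
    # formatting and values may differ.
    TASK_PREFIXES = {
        "table_to_text": "generate text: ",
        "content_selection": "select content: ",
        "table_reconstruction": "reconstruct table: ",
        "text_to_table": "generate table: ",
        "fact_verification": "verify: ",
    }
    TASK_WEIGHTS = {task: 1.0 for task in TASK_PREFIXES}

    tokenizer = T5Tokenizer.from_pretrained("t5-base")
    model = T5ForConditionalGeneration.from_pretrained("t5-base")

    def multitask_loss(batches):
        """batches maps task name -> (list of sources, list of targets)."""
        total = torch.tensor(0.0)
        for task, (sources, targets) in batches.items():
            enc = tokenizer([TASK_PREFIXES[task] + s for s in sources],
                            return_tensors="pt", padding=True, truncation=True)
            labels = tokenizer(targets, return_tensors="pt",
                               padding=True, truncation=True).input_ids
            labels[labels == tokenizer.pad_token_id] = -100  # mask padding in the loss
            total = total + TASK_WEIGHTS[task] * model(**enc, labels=labels).loss
        return total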

🔧 Setup

  1. Clone the repository

    git clone https://github.com/yourusername/mt-textablex.git
    cd mt-textablex
  2. Install dependencies. We recommend using Python 3.10+ and a virtual environment.

    pip install -r requirements.txt
  3. Download a tokenizer. You can use the default t5-base tokenizer, or train your own if needed:

    from transformers import T5Tokenizer
    tokenizer = T5Tokenizer.from_pretrained('t5-base')
    tokenizer.save_pretrained('mtl_t5_tokenizer')

📊 Preprocessing

Individual scripts are available under preprocessing/ for each task if you want fine-grained control (a sample invocation follows the list):

  • preprocess_totto_table_to_text.py
  • preprocess_totto_content_selection.py
  • preprocess_totto_table_reconstruction.py
  • preprocess_totto_text_to_table.py
  • preprocess_tabfact.py
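
Each script defines its own command-line interface; the invocation below is only a hypothetical example (the --input and --output flags are assumptions, so check the script headers for the actual arguments):

    python preprocessing/preprocess_totto_table_to_text.py \
        --input data/totto/totto_train_data.jsonl \
        --output data/processed/table_to_text_train.jsonl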

🚀 Training

Once preprocessing is complete:

python model/train.py \
    --batch_size 4 \
    --epochs 5 \
    --accumulation_steps 2 \
    --save_dir checkpoints/

You can modify task weights and other hyperparameters directly in train.py.
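
With --batch_size 4 and --accumulation_steps 2, the effective batch size is 4 × 2 = 8, assuming accumulation multiplies the per-step batch in the usual way. The task-weight setting you would edit might look like the hypothetical sketch below; the names and values are assumptions, not the actual contents of train.py:

    # Hypothetical illustration -- names and values in train.py may differ.
    task_weights = {
        "table_to_text": 1.0,
        "content_selection": 0.5,
        "table_reconstruction": 0.5,
        "text_to_table": 0.5,
        "fact_verification": 0.5,
    }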


📈 Evaluation

We use the official ToTTo evaluation suite to compute BLEU, PARENT, and BLEURT scores; our model's outputs are compatible with the format it expects.

See the ToTTo GitHub evaluation instructions for more details.
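
As a sketch, running the suite typically looks like the commands below; the script name and flags follow the ToTTo repository's published instructions, but verify them against the current version of that repository:

    git clone https://github.com/google-research/language.git
    cd language
    bash language/totto/totto_eval.sh \
        --prediction_path predictions.txt \
        --target_path totto_dev_data.jsonl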


πŸ“ Citation

If you use MT-TexTableX or our code in your work, please cite:

@article{mohammadalizadeh2025mttextablex,
  title = {MT-TexTableX: a multitask learning approach to table-to-text generation},
  author = {Parman Mohammadalizadeh and Leila Safari},
  journal = {Expert Systems with Applications},
  year = {2025},
  issn = {0957-4174},
  doi = {10.1016/j.eswa.2025.129060},
  url = {https://www.sciencedirect.com/science/article/pii/S0957417425026776}
}

📄 Pre-proof article

The pre-proof article is available via the DOI and URL given in the citation above.


📌 Highlights

  • Outperforms GPT‑3.5, GPT‑4.1, LLaMA‑3, and Phi‑3 on ToTTo
  • Achieves BLEU 40.9, PARENT 54.8, BLEURT 0.1465 on ToTTo test set
  • Fully supervised, no prompt tuning, no in-context learning
  • Jointly optimized multi-task framework using only T5-base
  • Human evaluation confirms improvements in fluency and faithfulness
