A repository for the publication of code and data for SALT-KG: a dataset that augments [SALT (Sales Autocompletion Linked Business Tables)](https://arxiv.org/abs/2501.03413) with operational business data from the Metadata KG, covering elements that are publicly accessible and published on the Business Accelerator Hub (BAH).

SAP-samples/salt-kg

SALT-KG: A Benchmark for Semantics-Aware Learning on Enterprise Tables

Description

This repository contains the dataset from our paper SALT-KG: A Benchmark for Semantics-Aware Learning on Enterprise Tables, presented at the EurIPS'25 Table Representation Workshop.

Abstract

Building upon the SALT benchmark for relational prediction, we introduce SALT-KG, a benchmark for semantics-aware learning on enterprise tables. SALT-KG extends SALT by linking its multi-table transactional data with a structured Operational Business Knowledge represented in a Metadata Knowledge Graph (OBKG) that captures field-level descriptions, relational dependencies, and business object-types. This extension enables evaluation of models that jointly reason over tabular evidence and contextual semantics—an increasingly critical capability for foundation models on structured data. Empirical analysis reveals that while metadata-derived features yield modest improvements in classical prediction metrics, these metadata features consistently highlight gaps in models’ ability to leverage semantics in relational context. By reframing tabular prediction as semantics-conditioned reasoning, SALT-KG establishes a benchmark to advance tabular FMs grounded in declarative knowledge, providing the first empirical step toward semantically linked tables in structured data at enterprise scale.

Why SALT-KG

Motivation for Semantics with Tabular Data

There is growing research on Tabular Foundation Models. Table representation learning (TRL) models are typically trained and evaluated on benchmarks that represent relational structure but lack explicit semantic grounding or declarative context. Meanwhile, the knowledge graph (KG) and data integration communities have explored connecting tables to semantic graphs through systems such as JENTAB. We can bridge this gap by enriching enterprise relational data with an explicit semantic layer that links tables, fields, and business objects through declarative knowledge in a KG.

How was SALT-KG Created

For every relation (table) in the underlying SALT dataset, we find a matching node in the KG (a View). We then extract the triples related to these Views, which include:

  • Fields: data abstraction nodes with associated fields, labels, associations, data classes, reference fields, and other elements.
  • ObjectNodeTypes: further semantic metadata through technical definitions and business object descriptions.
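The matching and extraction step described above can be illustrated with a minimal sketch. It is not the pipeline used to build SALT-KG: the triples, node identifiers, predicate names, and the `TABLE_TO_VIEW` mapping are all hypothetical placeholders, and the real OBKG schema may differ.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not real OBKG content.
TRIPLES = [
    ("view:SalesDocumentView", "hasField", "field:SalesOrganization"),
    ("field:SalesOrganization", "label", "Sales Organization"),
    ("view:SalesDocumentView", "objectNodeType", "ont:SalesOrder"),
    ("ont:SalesOrder", "description", "A customer order for goods"),
    ("view:OtherView", "hasField", "field:Unrelated"),
]

# Hypothetical mapping from a relational table name to its matching View node.
TABLE_TO_VIEW = {"salesdocument": "view:SalesDocumentView"}


def extract_view_triples(table, triples, table_to_view):
    """Collect triples about a table's View and about nodes it links to directly."""
    view = table_to_view[table]

    # Index triples by subject so lookups do not rescan the whole list.
    by_subject = defaultdict(list)
    for subject, predicate, obj in triples:
        by_subject[subject].append((subject, predicate, obj))

    # Start with triples whose subject is the View itself.
    selected = list(by_subject[view])

    # Add one hop: triples describing the fields and object node types of the View.
    for _, _, obj in by_subject[view]:
        selected.extend(by_subject.get(obj, []))
    return selected


if __name__ == "__main__":
    for triple in extract_view_triples("salesdocument", TRIPLES, TABLE_TO_VIEW):
        print(triple)
```

The one-hop expansion is a simplification chosen for the example; a real extraction would likely follow specific predicates rather than every outgoing edge.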

Dataset Overview

The SALT-KG dataset consists of 4 tables from the SALT benchmark, enriched with semantic metadata from an Operational Business Knowledge Graph (OBKG). The dataset includes:

  • 4 relational tables with transactional data
  • Metadata Knowledge Graph (OBKG) with field-level descriptions, relational dependencies, and business object types
  • Train, validation, and test splits for each table
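As a rough illustration of how field-level descriptions might be paired with table rows, the sketch below serializes a row together with whatever description is available for each column. The column names, descriptions, and the `describe_row` helper are hypothetical and do not reflect the actual file layout or schema of this dataset.

```python
# Hypothetical field-level descriptions keyed by column name; not taken from the OBKG.
FIELD_DESCRIPTIONS = {
    "SALESORGANIZATION": "Organizational unit responsible for the sale",
    "SHIPPINGCONDITION": "General shipping strategy for the delivery",
}


def describe_row(row, descriptions):
    """Serialize a row as text, attaching a description to each column that has one."""
    parts = []
    for column, value in row.items():
        description = descriptions.get(column)
        if description is None:
            # No metadata available: fall back to the bare column name.
            parts.append(f"{column}={value}")
        else:
            parts.append(f"{column} ({description})={value}")
    return "; ".join(parts)


if __name__ == "__main__":
    example_row = {"SALESORGANIZATION": "0001", "SHIPPINGCONDITION": "01", "PLANT": "1010"}
    print(describe_row(example_row, FIELD_DESCRIPTIONS))
```

Text serialization is only one way to expose metadata to a model; the descriptions could equally be embedded and used as additional features.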

Requirements

N/A

Known Issues

No known issues

Authors

Citations

If you use this dataset in your research, please cite the following paper:

@inproceedings{mulang2025saltkg,
  title={SALT-KG: A Benchmark for Semantics-Aware Learning on Enterprise Tables},
  author={Mulang', Isaiah Onando and Sasaki, Felix and Klein, Tassilo and Kolk, Jonas and Grechanov, Nikolay and Hoffart, Johannes},
  booktitle={Proceedings of the NeurIPS 2025 Table Representation Learning Workshop},
  year={2025}
}

How to obtain support

Create an issue in this repository if you find a bug or have questions about the content.

For additional support, ask a question in SAP Community.

Contributing

If you wish to contribute code or offer fixes or improvements, please send a pull request. For legal reasons, contributors will be asked to accept a DCO when they create their first pull request to this project. This happens in an automated fashion during the submission process. SAP uses the standard DCO text of the Linux Foundation.

License

Copyright (c) 2025 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the CC-BY-NC-SA-4.0 except as noted otherwise in the LICENSE file.
